Recently I attended our local Mayor’s Breakfast where the Mayor gave an update on the Town Council’s plan for the upcoming years. During that discussion, the Mayor brought up some statistics about the area. Namely, she brought up the average household income.
Now, the area where I live is quite diverse in households. We are located just north of a very large metropolitan area and thus many of the communities in our town are bedroom communities. We also have a very large rural aspect, with farms raising a variety of livestock and crops. Lastly, we also have many homes that belong to the very wealthy. In fact, I have been told our town is the location of one of the most expensive houses in Canada.
To get back to the average household income: according to our Mayor, this is about $113 000. That sounds pretty good and she used that number to discuss the type of industry, education and retail requirements of the town. Now, from a stats perspective, I’d like to show you four different charts. Each chart represents 100 data points, with an average of approximately $100 000.
Chart 1: Average is $100K and the distribution is tight; everyone is very close to $100K.
Chart 2: Average is $100K and the distribution is looser; some households have no income and some have an income closer to $300K.
Chart 3: Average is again approximately $100K, but more households have an income of less than $100K, and they are offset by those with incomes greater than $300K.
Chart 4: Okay, the average is closer to $125K, but here you can see that over 75 of the 100 data points have an income less than $100K and nobody has an income between $100K and $350K. Yet those who have an income greater than $350K offset those lower incomes dramatically.
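To make the point concrete, here is a minimal Python sketch of two of these scenarios. The income figures are invented for illustration (they are not the town's data), and the names `tight`, `skewed` and `summarize` are my own. The first list is shaped like Chart 1, the second like Chart 4.

```python
from statistics import mean, median

# Hypothetical household incomes, made up for illustration only.
tight = [99_000, 101_000] * 50            # Chart 1 style: everyone near $100K
skewed = [50_000] * 80 + [425_000] * 20   # Chart 4 style: most low, a few very high

def summarize(incomes):
    """Return the mean, the median, and the share of households below the mean."""
    avg = mean(incomes)
    below = sum(1 for x in incomes if x < avg) / len(incomes)
    return {"mean": avg, "median": median(incomes), "share_below_mean": below}

for name, data in [("tight", tight), ("skewed", skewed)]:
    print(name, summarize(data))
```

In the tight list the mean and the median are both $100K. In the skewed list the mean is $125K, yet the median is only $50K and 80% of the households sit below the average. Reporting the median next to the average is one simple way to tell these cases apart.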
Now which one of these scenarios most accurately represents my local area? I would imagine somewhat of a blend between Chart 2 and Chart 3.
Population statistics are not the only place where data sets with the same average can vary this much. The same thing happens with machine performance: on average you may fill 250ml into your carton, but how many cartons are underweight and how many are overweight? Are you filling well above 250ml to make sure your underweights stay within government regulations? How much profit are you losing to make sure your range of weights doesn’t go below T1 or T2?
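The same check can be sketched for a filling line. Everything below is assumed for illustration: the fill volumes are invented, and `LOWER_LIMIT_ML` is a placeholder of my own, not an actual T1 or T2 value from any regulation.

```python
from statistics import mean

# Hypothetical fill volumes in ml; both machines average exactly 250 ml.
steady = [249, 250, 251, 250] * 25
erratic = [238, 262, 244, 256] * 25

# Placeholder lower limit, NOT a real regulatory T1/T2 figure.
LOWER_LIMIT_ML = 245

def count_below(fills, limit):
    """Count cartons filled below the given limit."""
    return sum(1 for f in fills if f < limit)

def shift_needed(fills, limit):
    """Smallest upward shift (ml) of the whole run so no carton falls below the limit."""
    return max(0, limit - min(fills))

for name, fills in [("steady", steady), ("erratic", erratic)]:
    print(name, mean(fills), count_below(fills, LOWER_LIMIT_ML),
          shift_needed(fills, LOWER_LIMIT_ML))
```

Both machines average 250ml, but the erratic one puts half of its cartons under the placeholder limit. To clear that limit, it would have to shift every fill up by 7ml, and that extra product in every carton is the profit being given away.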