Here in Canada, our upcoming federal election is sparking a veritable barrage of polls and statistics. Oftentimes the figures just don't seem to agree. What are we to make of all this?
Epidemic or Random Clustering?
Examine the location of the Aces in a recently shuffled deck of cards. Are they spaced a similar number of cards apart, or are they all within close proximity of each other? If they turned up clustered together, would you simply accept that as a perfectly normal consequence of randomization, or would you suspect that cheating might be involved?
The same question becomes much more charged when people’s lives are involved. Every year, we hear about small towns that have an unusually high rate of a disease such as cancer. In many reported cases, the apparent cluster is due to the “bull’s-eye effect”, which is something akin to drawing a target on the wall after the darts have been thrown: the boundary is drawn around the cases only after they have been spotted.
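To get a feel for how often pure chance produces a cluster, the card example can be simulated. The sketch below is only an illustration, not part of the original article: it assumes a standard 52-card deck with 4 Aces, treats "clustered" as meaning at least two Aces sitting next to each other, and uses hypothetical function names chosen for this example.

```python
import random
from math import comb


def has_adjacent_aces(rng):
    """Shuffle a 52-card deck and report whether any two Aces are neighbours."""
    deck = ["A"] * 4 + ["x"] * 48  # only Ace vs. non-Ace matters here
    rng.shuffle(deck)
    return any(a == "A" and b == "A" for a, b in zip(deck, deck[1:]))


def simulate_adjacent_aces(trials=20000, seed=1):
    """Estimate the chance of adjacent Aces by repeated shuffling."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(has_adjacent_aces(rng) for _ in range(trials))
    return hits / trials


def exact_adjacent_aces():
    """Exact chance: 1 minus the share of Ace placements with no two adjacent."""
    return 1 - comb(49, 4) / comb(52, 4)


if __name__ == "__main__":
    print(f"simulated: {simulate_adjacent_aces():.3f}")
    print(f"exact:     {exact_adjacent_aces():.3f}")
```

Under these assumptions, roughly one honest shuffle in five puts at least two Aces side by side, so a small cluster on its own is weak evidence of cheating.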
Sampling Gone Bad
Polling data is obtained by taking a sample from a larger group, in the hope that the sample has the same characteristics as the group as a whole. For example, if pollsters were to ask 100 people who they are going to vote for in the next election, and 45 of them say they will vote for Johnson, we might extrapolate that about 45% of all the voters will vote for Johnson.
Sampling provides many benefits, but it’s not without some important limitations.
The first issue is that the pollsters could have just happened to talk to an unusually large percentage of Johnson supporters by blind luck. This is the problem of sample size. The smaller the sample, the greater the influence of luck on the results we get. For a population of millions, a sample of one hundred participants is far too small to give a precise estimate.
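The effect of sample size can be put into rough numbers with the usual margin-of-error formula for a proportion. This is a minimal sketch under assumptions the article does not state: a simple random sample, the normal approximation, and a 95% confidence level (z = 1.96). The function name is hypothetical.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: observed share (e.g. 0.45 for 45%)
    n: number of people polled
    z: z-score for the confidence level (1.96 is the assumed 95% level)
    """
    return z * sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Same 45% result, different sample sizes.
    for n in (100, 1000, 10000):
        moe = margin_of_error(0.45, n)
        print(f"n={n:>5}: 45% plus or minus {moe * 100:.1f} points")
```

With 100 respondents, the 45% figure for Johnson comes with a margin of nearly 10 percentage points either way; it takes about 1,000 respondents to bring that down to roughly 3 points.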
Recognizing that the statistics people present to us are frequently flawed doesn't imply that statistics are useless. Just be aware that the burden of examining the figures for relevance, validity and authority falls squarely on your shoulders.
Read the full article at Webopedia.
See all articles by Rob Gravelle