Data in Everyday Life – Surveys Part 2: Survey Harder

by Kaetlyn Phillips, Data Services Librarian

Cool Librarians don’t look at explosions
Image: Unflinching walk. Image from I Am Number Four (2011).

Last month we looked at quick polls, and I promised a second part on public opinion polls and Statistics Canada surveys. So here’s the sequel, where we’ll explore sampling in surveys. Like most sequels, it’s going to be bigger and more intense.

Okay, so there will be no explosions, but this next level of surveys is more intense than a quick poll. There’s even more potential for bias, and accurate sampling and representation are needed to draw good conclusions. So let’s dive in.

With these types of surveys, more attention is paid to the targeted population and the size of the sample. Survey designers use a formula to determine their sample size by considering the size of the targeted population, the confidence level (usually 95%), and the margin of error (usually 5%). Once the ideal sample size is decided, the survey designers determine how to choose their sample.
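If you’re curious what that calculation can look like, here is a minimal sketch. It assumes Cochran’s formula with a finite population correction, which is one common choice and not necessarily the formula any particular survey designer uses. The function name and the worst-case proportion p = 0.5 are my own illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin_of_error=0.05, p=0.5):
    """Estimate the sample size needed for a simple random sample.

    Assumes Cochran's formula with a finite population correction.
    p = 0.5 is an assumed worst-case proportion (it gives the largest sample).
    """
    # z-score for a two-sided confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size if the population were very large
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Shrink the estimate for a finite population
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(1_000))   # 278
print(sample_size(10_000))  # 370
```

Notice that making the population ten times bigger adds fewer than a hundred people to the sample in this sketch.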

Ideally, samples should be selected using probability sampling, but limited time and funding often push survey designers toward non-probability sampling. Probability sampling uses random selection to choose participants for the survey. For example, in the past, landline phone numbers were frequently used to randomly select participants for national surveys. The randomness of probability sampling reduces selection bias, response bias, and undercoverage bias. However, probability sampling costs more, takes more time, and can be challenging because randomly selected participants may not want to participate. As a result, non-probability sampling is becoming more common.
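Random selection is easy to picture in code. The sketch below draws a simple random sample, the most basic kind of probability sampling, from a hypothetical list of 10,000 people. The list, the sample size of 370, and the seed are all made up for illustration.

```python
import random

# Hypothetical sampling frame: one ID per person in the targeted population
sampling_frame = [f"person_{i}" for i in range(1, 10_001)]

# A fixed seed is used only so the example gives the same result each run
rng = random.Random(42)

# Draw 370 participants without replacement; every person has an equal chance
participants = rng.sample(sampling_frame, k=370)

print(len(participants))       # 370
print(len(set(participants)))  # 370, so nobody was picked twice
```

In a real survey the hard part is not the random draw. It is building a complete sampling frame and getting the selected people to respond.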

Even if non-probability sampling is used, it’s still vital that the sample be representative of the targeted population and that selection bias be reduced. If the sample is not representative, then the results and conclusions have reduced validity. For example, if the targeted population is Saskatchewan residents, you couldn’t select your sample just from Regina, because the city doesn’t represent all of Saskatchewan.
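One way to keep a sample from leaning on a single city is to split it across groups in proportion to their share of the population. The sketch below shows that idea, often called proportional allocation. The region names and counts are invented for illustration and are not real Saskatchewan figures, and the function name is hypothetical.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across groups in proportion to each group's size."""
    total = sum(strata_sizes.values())
    # Exact (possibly fractional) share of the sample for each group
    exact = {name: sample_size * size / total for name, size in strata_sizes.items()}
    # Start by rounding every share down
    allocation = {name: int(value) for name, value in exact.items()}
    # Hand out the spots lost to rounding, largest fractional part first
    leftover = sample_size - sum(allocation.values())
    by_fraction = sorted(
        exact, key=lambda name: exact[name] - allocation[name], reverse=True
    )
    for name in by_fraction[:leftover]:
        allocation[name] += 1
    return allocation


# Made-up population counts, for illustration only
population_by_region = {
    "Regina": 2_300,
    "Saskatoon": 2_700,
    "Smaller cities": 1_800,
    "Rural and northern": 3_200,
}

allocation = proportional_allocation(population_by_region, 370)
print(allocation)
```

With these invented numbers, Regina supplies only part of the sample, roughly in line with its share of the made-up population.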

Public opinion polls are a great example of how representative samples can be selected. If you look into the fine print of a public opinion poll, you’ll find a breakdown of the sample by age, gender (usually binary), geographic location, and sometimes political ideology. For example, Angus Reid Institute recently conducted a public opinion poll on how Canadians feel about the monarchy. The main report provides a breakdown of the sample for reporting purposes, and at the bottom you have the option to view a full breakdown of the sample by multiple factors. This shows how Angus Reid has strived to gather a representative sample.

Image: Meme. Image from Harry Potter and the Philosopher’s Stone (2001).

We now have to consider how selection bias could play a role. Angus Reid Institute clearly states on their website how they select participants for their surveys. There are two things to note. First, participants are selected from the Angus Reid Forum. While Angus Reid Institute states they are non-partisan, media analysis tends to identify them as having a slight conservative leaning on the political spectrum, meaning their forum members could be more likely to identify as conservative, which could create a bias. Second, participants may be entered into draws or paid to complete surveys, which, as we discussed last month, could create a bias. It is impossible to remove all bias from sampling and surveys, and the practices of Angus Reid Institute are common in public opinion polling. This level of analysis does not discredit their polls; rather, it demonstrates how all public opinion polls are subject to these issues.

So here we are, we survived the sequel! But wait, where was the section on Statistics Canada surveys? Well, it looks like this series is a trilogy.