by Kaetlyn Phillips
After a summer hiatus, I’m back for more blogs on data in everyday life. This month we’ll cover Statistics Canada Surveys. If you were a reader of my previous blogs, think of this as part 3 of a series on survey data collection. If you are new to the blog, you don’t need to read the previous entries, but you’re always welcome to do so.
Statistics Canada collects data at the national level, which are used to inform the public and the government. Within reason, most of these data become publicly available once they have been analyzed, usually in the form of data news (The Daily), aggregated data tables, analytical reports, Public Use Microdata Files (PUMFs), and other data tools and resources. Because the data are intended to be representative of the entire nation, Statistics Canada follows much stricter sampling guidelines than national opinion polls (for more information click here!).
With a Statistics Canada survey, the target population the survey wants to measure will be well defined. The definition can specify the nature of the units; geographic locations; time periods; and socio-demographic characteristics such as age groups or industry. For example, we might define our target population as Canadian post-secondary students who were enrolled in a graduate level program from September 2021 to September 2022. Other steps needed to design the study and sample include choosing a survey time frame, defining survey units, determining sample size, and choosing an appropriate sampling method. Ideally, the survey will use probability sampling, meaning the sample is selected by chance and every unit in the target population has a known chance of being picked. There are many forms of probability sampling and if you want to do a deeper dive, click here. Random sampling reduces selection bias and makes it possible to estimate sampling error, but it is more costly and time-consuming than non-random methods.
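If you like to see ideas in code, here is a minimal sketch of the simplest form of probability sampling, a simple random sample drawn without replacement. Everything in it is an assumption made for illustration: the sampling frame is made-up data, and the names `build_frame` and `simple_random_sample` are hypothetical helpers. It is not how Statistics Canada draws its samples, which typically involves more elaborate designs.

```python
import random


def build_frame(n_students, seed=0):
    """Build a made-up sampling frame: one record per student in the target population."""
    rng = random.Random(seed)
    provinces = ["BC", "AB", "SK", "MB", "ON", "QC"]  # illustrative subset only
    return [
        {"student_id": i, "province": rng.choice(provinces)}
        for i in range(n_students)
    ]


def simple_random_sample(frame, sample_size, seed=None):
    """Select units by chance, without replacement.

    Every unit in the frame has the same probability of selection,
    which is sample_size / len(frame).
    """
    if sample_size > len(frame):
        raise ValueError("sample_size cannot exceed the size of the frame")
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


# Hypothetical numbers: 10,000 students in the frame, 500 selected.
frame = build_frame(10_000)
sample = simple_random_sample(frame, 500, seed=42)

print(len(sample))               # number of students selected
print(len(sample) / len(frame))  # each student's chance of being selected
```

The key point is that chance, not the researcher, decides who ends up in the sample. Fixing the `seed` only makes the draw repeatable for demonstration purposes.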
The release of data to the public depends on the survey. Most surveys run on a cycle, which is usually annual, but not always. The level of detail, or granularity, of the data being released also affects when they are released. For example, brief statistical breakdowns like the infographics found in The Daily can be released much more quickly than PUMFs. The more granular the data get, the greater the need to protect the anonymity of respondents. In some cases, survey data that are collected are not made available to the public.
While Statistics Canada is considered a trusted source of data, that doesn’t mean we shouldn’t thoughtfully evaluate what they release. A quick way to evaluate data is to ask “Who was counted? How were they counted?” In other words, we should look at the sample size and consider whether it represents the target population, and we should look at how the survey was conducted, which we’ll tackle next month!