Data in Everyday Life – Invisible Data: Part 1

– by Kaetlyn Phillips

As previously discussed, when collecting data, especially survey data, it’s important to make sure the sample is representative of the population. So what happens when certain groups are excluded from the data?


To start, ask yourself: what are the symptoms of a heart attack?

If you said chest pain, shortness of breath, and arm or shoulder pain, you’d only be partially correct. Those are the symptoms of a heart attack for men. For women, the symptoms are similar but different enough that many women could miss that they are having a heart attack. There are also data to suggest that doctors are more likely to misdiagnose heart attacks in women because of the different symptoms. Personally, I didn’t know there was a difference until 2012 (Thank you, Elizabeth Banks!).

So how is this relevant to data? Well, this is an example of invisible data. We live in a society where the default for data collection is still cisgender (majority white) men, so a lot of our data are biased. There are numerous examples of how this data bias has real-world effects.

Data drive our society, and we are constantly looking for data to make evidence-based decisions. Using data to make decisions is not a bad thing! Using data from only one segment of the population, however, is concerning. When we use data from only one section of society, the results can be misinformed and our decisions can be flawed. It’s important to note that this is not always intentional or malicious; most missing data are due to male dominance in data collection fields. If we don’t see and include representation, we don’t consider other perspectives. This is even more relevant as we shift to more AI-driven data collection. The majority of research and work in AI and machine learning is conducted by men, potentially creating more gender-based data bias.

If you’d like to learn more about invisible data and data bias based on gender, my main source for this post was Invisible Women: Data Bias in a World Designed for Men by Caroline Criado Perez. HOWEVER, please note that while this book has numerous examples of gender-based invisible data, there is a glaring omission. Invisible Women only looks at data bias from a cisgender perspective, so the book has its own problem with missing data from transgender and non-binary folx. With that in mind, next month I will be looking at the issues of invisible data from 2SLGBTQIA+ communities.