Protecting Participants with Confidential and Anonymous Data
By Sarah Dunifon
Originally produced for EvaluATE - the National Science Foundation’s Advanced Technological Education (ATE) Evaluation Resource Hub
As an evaluator specializing in informal STEM education, I’ve had to collect sensitive participant data. When collecting data on human subjects, it’s important to understand what kind of data you need. In this piece, I’ll explain the difference between confidential data and anonymous data, and why you might consider collecting each.
What’s the difference between confidentiality and anonymity? Why is it important?
Anonymous data is collected in a way that makes each participant’s identity impossible to detect. For example, the instruments may not collect any identifying information at all.
Confidential data includes identifying information but is de-identified before it is shared or reported. Program staff and/or evaluators have access to the original data and may know “who said what,” but participants’ identities are safeguarded (by removing identifying information or replacing names with pseudonyms) before the data is shared more broadly.
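To make the de-identification step concrete, here is a minimal Python sketch of one way it could be done. The field names (name, email, feedback), the sample records, and the deidentify function are all hypothetical and invented for illustration; they do not describe any particular evaluation or tool.

```python
import secrets

# Hypothetical set of columns that directly identify a participant.
IDENTIFYING_FIELDS = {"name", "email"}


def deidentify(records, id_field="name"):
    """Return (shareable_records, key).

    shareable_records drop the identifying fields and carry a pseudonym.
    key maps each real identifier to its pseudonym and should stay with
    the evaluator, stored separately from the shareable data.
    """
    key = {}
    shareable = []
    for record in records:
        person = record[id_field]
        if person not in key:
            # Random codes avoid hinting at sign-up or response order.
            code = f"P-{secrets.token_hex(4)}"
            while code in key.values():
                code = f"P-{secrets.token_hex(4)}"
            key[person] = code
        cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        cleaned["pseudonym"] = key[person]
        shareable.append(cleaned)
    return shareable, key


# Made-up example data.
records = [
    {"name": "Ana", "email": "ana@example.org", "feedback": "Loved the labs."},
    {"name": "Ben", "email": "ben@example.org", "feedback": "Too fast."},
    {"name": "Ana", "email": "ana@example.org", "feedback": "More field trips."},
]
shareable, key = deidentify(records)
```

Here, shareable is what would go into a report or be passed to program staff, while key stays with whoever is responsible for the original data. Removing name and email columns is only a first step: free-text answers can still reveal who wrote them, so they are worth reading through before sharing.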
When working with confidential data, it is important to think about who has access to the data and how that is communicated to participants. For example, a program manager who works directly with participants may not be the best person to handle this data if it contains sensitive information or information that may affect how the manager delivers the program. Similarly, participants may feel uncomfortable sharing their true thoughts and feelings if they know this person will have access to the information.
Why might you want to collect confidential or anonymous data?
There are plenty of reasons why protecting participants’ identities may be of concern, such as:
If you’re working with sensitive information (e.g., household socioeconomic data, citizenship data, or health data)
If it might negatively impact a participant to have their data available to other participants or to program staff
If you want to alleviate participant concerns about how honest they can be in their feedback
If individual identity does not matter and you’re only interested in aggregate data
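For the last case, where only aggregate data matters, the summary can be built without ever touching who gave which answer. The sketch below is a minimal Python illustration under assumed conditions: the rating field, the 1-to-5 scale, and the sample responses are hypothetical.

```python
from collections import Counter


def summarize(responses, field="rating"):
    """Count how often each answer was given, ignoring who gave it."""
    counts = Counter(r[field] for r in responses)
    total = sum(counts.values())
    return {
        value: {"count": n, "percent": round(100 * n / total, 1)}
        for value, n in sorted(counts.items())
    }


# Made-up anonymous survey responses: no names or IDs are collected.
responses = [
    {"rating": 4},
    {"rating": 4},
    {"rating": 5},
    {"rating": 3},
]
summary = summarize(responses)
```

With this sample, summary reports one response of 3, two of 4, and one of 5. Because the instrument never asks for a name or ID, there is nothing to de-identify later.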
When deciding between confidential and anonymous approaches, consider:
Who is your audience?
What information will you be collecting?
How might that information impact participants if it were somehow made public to other participants or staff?
Are you working with an IRB or other entity that has its own rules and regulations for how you collect and manage data?
Who is managing the data?
When designing your approach, consider:
Informed consent: how will you let your participants know how the data is being managed?
If you’re choosing to collect confidential data, how will you de-identify the data for further use?
How will you store your data? For example, will paper documents be stored in a locked file cabinet in a locked storage room? Will electronic documents be kept on Box or otherwise protected by passwords and access controls?
Who will be involved in the evaluation process, and who will have access to the data?
Will those involved in the design and management of the data have human subjects research training?
No matter the type of data you’re collecting, it’s worthwhile to think about how you can best protect your participants. An understanding of the difference between confidentiality and anonymity is an important tool to have in your toolbox.
For more information on the difference between confidentiality and anonymity, check out this resource from Seattle University.