Spooky Sample Sizes: Choosing “The Right” Number of Research Participants
It’s easy to feel intimidated when people question your sample size. The trick (or treat) is choosing the right method and backing up your data with additional research.
Despite my background in quantitative research, I never felt comfortable with surveys and numbers. I love words. Words make sense to me. But numbers...are terrifying.
There is no subjectivity and much less of a grey area. However, even as qualitative researchers, we must incorporate some quantitative research into our practice. One of the most fundamental concepts within quantitative research is the sample size.
For a long time, the phrase "sample size" was enough to strike fear in me. I dreaded the conversations that started with, "But you only spoke to ten people..."
I scrambled to explain that words were not the same as numbers or that it was okay to only speak with a small number of people. But my arguments felt weak, especially in companies that highly valued data-driven decisions.
Over the years, I’ve learned how to choose the best sample size for a given study and how to properly back it up. But I still see people struggling with the same questions I did. Are five users enough? Can I send a survey to ten people?
Sample size is critical to retaining validity and reliability in your studies and to helping teams make the most informed, data-driven decisions. So, how do we decide on the right sample size?
Consider the type of research method
The type of research method is the first indicator of what kind of sample size you need. I usually bucket methods into approaches, the most common being:
Generative research
In this approach, you are trying to understand mental models and people's thought processes. You are generating a new level of knowledge about these people or generating new ideas. Standard methods under generative research include:
- 1x1 interviews
- Diary studies
- Contextual inquiry
- Mental model interviews
- Participatory design
Evaluative research
With evaluative research, you are evaluating something that exists and how usable that thing is. Typical evaluative research methods include:
- Usability testing (moderated and unmoderated)
- Concept testing (moderated and unmoderated)
- Card sorting (moderated and unmoderated)
- Benchmarking
Survey research
Although I would technically include surveys in the evaluative category, they are tricky enough to deserve their own space.
If you are having trouble picking a bucket or approach for your study, check out this article to get you started.
Sample sizes for generative research
Generative research allows a deep understanding of our users (inside and outside of a product/service). We can learn what they experience in their everyday lives. It allows us to see users as human, beyond their interaction with a product/service.
Of course, being my favorite approach, I am biased, but I believe generative research unlocks so many rich insights that can ensure a product's success.
However, this type of qualitative research usually summons the phrase I mentioned above, "But you only spoke to so few people," which is difficult to hear constantly.
The critical thing to remember about qualitative research is that we are not trying to generalize findings from fifteen people to an entire population. With qualitative research, we are looking to reach saturation, not generalization.
Consider segmentation
We typically speak with a small number of users in user research, which means we cannot generalize to a larger population. Ideally, when conducting user research, you are segmenting your user base.
You can segment in many different ways, but here are some typical segmentations:
- Demographic information (age, gender, education)
- Psychographic segmentation (people's activities, interests, lifestyle)
- Behavioral data (how people act on a product/service)
- Geographic segmentation (where people live, work, or travel)
- Firmographic information, which describes a business (number of employees, company size, role)
For example, when I was doing generative research at a B2B company to create proto-personas, I didn't recruit an arbitrary sample of people.
Instead, I went through several steps:
- I spoke with account managers, sales, customer support, and marketing to understand the different potential customer segments we had
- We decided to focus on particular roles (ex: social media managers, brand managers) since they were the roles using our platform most frequently
- We also looked at roles within larger companies, as they brought in the most revenue and were the most loyal customers
For this project, we recruited social media managers and brand managers in large companies. In that way, we could then better generalize our findings to that specific population.
General sample size guidelines
- 1x1 interviews: For these interviews, I always recommend 15-25 participants per segment. For personas or other similar deliverables, aim for 20 participants per segment
- Diary studies: At least 10 participants per segment, but 15 is closer to the ideal
- Contextual inquiry: 10-12 participants, per segment
- Mental model interviews: Ideally, 15-20 interviews, per segment
- Participatory design: 10-12 participants, per segment
Keep in mind that a survey is required alongside each of these methods if you want to generalize the results to larger populations. Surveys such as the opportunity gap survey can help you determine if your findings are applicable beyond the group you spoke with.
So, when a stakeholder mentions the small sample size, always explain you are looking for saturation, not generalization. If you need to generalize, you will send out a larger-scale survey to validate your findings.
Evaluative research sample sizes
Evaluative research is about assessing how a product/service works when placed in front of a user. It isn't merely about functionality but also about the effectiveness, efficiency, and satisfaction of that product/service.
A widespread idea when it comes to evaluative research is testing with five people. When I review research plans, I often see five participants as the number for an evaluative study.
As correct as this sometimes might be, it isn't a hard and fast rule. As with generative research, segmentation matters: if you pick five random people off the street to test your product, you likely won't find the often-cited 85% of usability issues. The fine print behind five people per usability study is that it means five users per segment.
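The 85% figure is usually traced to a simple problem-discovery model, often attributed to Nielsen and Landauer, in which each participant has the same chance of running into any given issue. Here is a minimal sketch of that model. The 31% detection rate is the average commonly quoted for it, not a number from this article, and the function name is purely illustrative.

```python
def share_of_issues_found(n_users: int, p_detect: float = 0.31) -> float:
    """Expected share of issues seen at least once by n_users.

    Assumes every user has the same, independent chance (p_detect) of
    hitting any given issue. Real studies rarely behave this neatly.
    """
    return 1 - (1 - p_detect) ** n_users


if __name__ == "__main__":
    for n in (1, 3, 5, 10, 15):
        print(f"{n:>2} users: {share_of_issues_found(n):.0%} of issues")
```

With the default 31% detection rate, five users land at roughly 84%, close to the 85% usually quoted. If the issues are harder to hit (say a 10% chance per user, which is what can happen when participants don't match the segment), five users surface only about 41%.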
Here are the general guidelines for the top evaluative methods:
- Usability testing
  - Moderated: At least five participants, per segment
  - Unmoderated: At least 15 participants, per segment, in case you get unclean data
- Concept testing
  - Moderated: At least eight participants, per segment
  - Unmoderated: At least 15 participants, per segment, in case you get messy data
- Card sorting is ideally 20-30 participants, per segment
- Benchmarking requires 25 or more participants per segment since we are looking at quantitative data
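Because every guideline above is per segment, the total you need to recruit grows quickly as you add segments. The sketch below turns the minimums from the list into a small lookup. The `dropout_rate` padding is my own assumption to cover no-shows, not part of the guidelines, and the method keys are made up for the example.

```python
from math import ceil

# Minimum participants per segment, taken from the guidelines above.
MIN_PER_SEGMENT = {
    "usability_moderated": 5,
    "usability_unmoderated": 15,
    "concept_moderated": 8,
    "concept_unmoderated": 15,
    "card_sorting": 20,
    "benchmarking": 25,
}


def participants_to_recruit(method: str, segments: int, dropout_rate: float = 0.0) -> int:
    """Total recruits so each segment still meets its minimum after no-shows.

    dropout_rate is a hypothetical planning input (0.2 means 20% no-shows).
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be between 0 and 1")
    per_segment = ceil(MIN_PER_SEGMENT[method] / (1 - dropout_rate))
    return per_segment * segments


if __name__ == "__main__":
    print(participants_to_recruit("usability_moderated", segments=2))
    print(participants_to_recruit("benchmarking", segments=3, dropout_rate=0.2))
```

A moderated usability test across two segments needs at least 10 people, while benchmarking three segments with a 20% no-show allowance means recruiting 96.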
With evaluative approaches, however, we can also look at something else called confidence intervals. A confidence interval is a range of values that is likely to contain the true population value, at a stated level of confidence.
So, for example, if I wanted to measure the average height of women, I could report a range for that average along with a confidence percentage, and the more women I measured, the narrower that range would become.
If you want to be 90% confident in a task completion rate and keep the range around it narrow, you will need to test with more people. There are plenty of calculators that help you determine confidence intervals.
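To make this concrete, here is a rough sketch of what such a calculator might do for a task completion rate. It uses the adjusted-Wald (Agresti-Coull) interval, which is one common choice for small usability samples. That choice is mine, not something the calculators mentioned above necessarily use, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def completion_rate_interval(successes: int, n: int, confidence: float = 0.90):
    """Adjusted-Wald (Agresti-Coull) interval for a task completion rate.

    Returns (low, high) as proportions between 0 and 1.
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


if __name__ == "__main__":
    for successes, n in ((4, 5), (16, 20)):
        low, high = completion_rate_interval(successes, n)
        print(f"{successes}/{n} completed: {low:.0%} to {high:.0%} at 90% confidence")
```

Both examples have an 80% observed completion rate, but with five participants the 90% interval runs from roughly 42% to 97%, while with twenty it tightens to roughly 62% to 91%. That is the practical reason a more confident claim needs more people.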
Survey research sample sizes
Now, if we haven't already veered into the messiness of sample size, surveys are where it gets interesting. As mentioned, surveys are critical when it comes to validating qualitative research across larger populations.
So, I often get asked, "How many people should I send my survey to?" The most straightforward answer I can supply is more than 25 people per segment. However, unfortunately, it isn't always that simple.
When you are creating a survey to send, you need to look at a few different metrics:
Population size
How many people are you trying to generalize your findings to? You can segment your users into smaller groups and send them surveys, but know that your conclusions will only apply to that group. It's okay to have a range, not an exact number.
If you are working on a product that doesn't exist, look toward your competitors to estimate population size.
Confidence intervals/margin of error
This concept popped up again! What margin of error do you feel comfortable with? I usually set this to 5% or 10%.
Confidence level
How confident do you want to be that your findings fall within that margin of error? Usually, you see this set at 95% or 90% confidence. I usually set this to 80-85% when dealing with fast-paced product teams but have dipped as low as 70%.
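Behind the scenes, calculators typically turn the confidence level into a z-score, and that z-score is what drives the required sample size. A small sketch, assuming the usual two-sided normal approximation (the helper name is illustrative):

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided z-score for a confidence level given as a proportion."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


if __name__ == "__main__":
    for level in (0.70, 0.80, 0.85, 0.90, 0.95):
        print(f"{level:.0%} confidence -> z = {z_score(level):.3f}")
```

The z-score drops from about 1.96 at 95% confidence to about 1.04 at 70%, which is why accepting a lower confidence level lets you get away with a smaller sample.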
Standard deviation
Standard deviation looks at the amount of variation in your data. Think of the famous bell curve (normal distribution) in statistics. A low standard deviation means most data points will be close to the mean (the middle), and a high standard deviation means the data will be spread out across the bell curve.
Don't worry too much about this, as a safe choice for the standard deviation of a survey is 0.5.
If you are feeling overwhelmed by all of these steps and terms, don't worry. There are plenty of online calculators that can help you step-by-step. One of my favorites is the one from Qualtrics.
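Why is 0.5 the safe choice? Assuming the survey question boils down to a yes/no proportion p (which is what most sample size calculators assume), the standard deviation of a single answer can never be larger than 0.5:

```latex
\sigma = \sqrt{p(1-p)} \;\le\; \sqrt{0.5 \times 0.5} = 0.5
```

Picking 0.5 therefore plans for the most spread-out case, so the resulting sample size is, if anything, a little larger than you strictly need.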
Let's say I ran a qualitative study on people who travel. In my 15 1x1 interviews, I found that traveling for business was less stressful than traveling for leisure. I want to know if this is true across a broader population that uses my app.
I would then plug in the above:
- Population size (my user base) is around 50,000 people
- I want to have a margin of error of 7%
- I want to be 90% confident of my findings
- My standard deviation would be 0.5
If I plug these into a calculator, I would see my ideal sample size is 138 completed responses, so I need to send the survey to more than 138 people to allow for those who don't respond.
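If you are curious what the calculator is doing, the sketch below reproduces the example using the standard sample size formula for a proportion (often called Cochran's formula) with a finite population correction. I am assuming this is roughly what online calculators implement. The `response_rate` helper is a hypothetical extra for planning how many invitations to send.

```python
from math import ceil
from statistics import NormalDist


def survey_sample_size(population: int, margin_of_error: float,
                       confidence: float, p: float = 0.5) -> int:
    """Completed responses needed, with a finite population correction.

    p = 0.5 corresponds to the "standard deviation of 0.5" default.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for a very large population.
    n_large = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Shrink slightly because the population is finite.
    n = n_large / (1 + (n_large - 1) / population)
    return ceil(n)


def invitations_needed(responses: int, response_rate: float) -> int:
    """Hypothetical helper: invitations to send for a given response rate."""
    return ceil(responses / response_rate)


if __name__ == "__main__":
    needed = survey_sample_size(50_000, margin_of_error=0.07, confidence=0.90)
    print(needed)                            # 138
    print(invitations_needed(needed, 0.20))  # 690
```

With a population of 50,000, a 7% margin of error, and 90% confidence, this returns 138, matching the example. If only one in five people responds (an assumed rate, so check your own history), that means sending about 690 invitations.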
Keep in mind
These numbers above are ideals. I have done research that doesn't always get the perfect number of participants for various reasons, such as lack of budget, last-minute drop-outs, or a niche group.
Not reaching these numbers doesn't mean you can't do research. Instead, it means you must be very aware that the findings might not be fully representative of your population. In this case, you may need to conduct follow-up research before drawing firm conclusions.
Strive for these sample sizes, as they will get you the best data for each approach, but don't feel overwhelmed by them. We all have to make difficult decisions and work with tricky timelines or budgets.
But, if you cannot hit these numbers, always keep that in mind and take your findings with a grain of salt until you can further generalize them.
Nikki Anderson-Stanier is the founder of User Research Academy and a qualitative researcher with 9 years in the field. She loves solving human problems and petting all the dogs.
To get even more UXR nuggets, check out her user research membership, follow her on LinkedIn, or subscribe to her Substack.