Words by Nikki Anderson-Stanier, Visuals by Allison Corr
Usability testing is easy, they said. Just write some tasks, they said.
I oversimplified usability tests for a long time. While it is an easier methodology to learn and conduct, it isn't without challenges. It took me some time to perfect the craft of usability testing, and that included some very interesting sessions with participants, where I was just as confused as they were about tasks.
Additionally, because usability testing is so popular and is usually one of the first methods researchers learn, I defaulted to it all the time, even when it wasn't relevant. This mismatch led to confusing results and broke my confidence in usability testing to the point where I avoided it.
Whether you want a refresher or are new to usability testing, this guide can help you learn or re-experience usability testing.
What is usability testing?
A stakeholder once asked me this question, and I was tongue-tied. I wanted to say, "Well, it's usability testing." Instead, I told them that usability testing was "testing the usability of a product."
And while that definition certainly is true, there is more to usability testing. For instance, what is usability? What are we testing, and how do we know what to test?
What is usability?
Usability is the ability for someone to:
- Access your product or service
- Complete the tasks they need
- Achieve whatever goal or outcome they expect
For example, with a clothing company, a user would go to the website with the expectation of being able to purchase clothing. But just because a user can buy clothing doesn't mean the experience is easy or satisfying.
So we can break usability down further into three areas:
- Effectiveness: Whether a user can accurately complete tasks and an overarching goal
- Efficiency: How much effort and time it takes for the user to complete tasks and an overarching goal accurately
- Satisfaction: How comfortable and satisfied a user is with completing the tasks and goal
Usability testing, whether you use metrics or run a qualitative usability test (more on that later), looks at these three factors to determine whether or not a product or service is usable.
So, what about those who browse e-commerce out of boredom (and no purchase intent)?
This leads to the next question...
What are we testing? And how do we know?
As I mentioned above, there are many aspects you could test (even with a simple product!). But the point of a usability test is to ensure that users can complete their most common tasks to achieve their goals.
Again, one of the main goals for a clothing website would be to purchase clothing. You can have smaller goals within that larger goal, such as comparing clothing.
Then, you can break this down into the tasks people have to do to achieve those goals. For instance:
- Searching for specific types of clothing with keywords
- Filtering through colors, brands, sizes
- Sorting by reviews, prices
- Opening multiple windows to compare different options
- Saving clothing to a favorites list
- Reading (and understanding) the size and fit of clothes
- Adding a piece of clothing to a basket
- Checking out and paying for the clothing
- Receiving a confirmation of purchase
- Receiving the clothing
These are all tasks associated with the larger goal. With usability testing, we ask people to do these critical tasks to assess whether or not they can achieve them and the larger goal in an efficient, effective, and satisfactory way.
If someone can do these tasks, they can get to their expected outcome. However, if efficiency, effectiveness, or satisfaction suffer during this process, they may get frustrated and give up or go to a different website.
We've all encountered this—an infuriating user experience that made us rage-click, throw our phones (against a soft surface, of course), and give up on a product or service.
Usability testing goals
Now, when is usability testing an appropriate method? There are particular goals I always associate with usability testing, but before we dive in, let's understand what usability testing can help us with:
- Whether or not a user can use a product for its intended function
- Whether or not a product allows a user to reach their goals
- How a user uses a product, separate from how we think a user should use a product
- How a product functions when placed in front of a human
- Where bugs and complicated user experiences lie within a product
On the flip side, usability testing will NOT tell us:
Now, let's dive into some goals for usability testing because the best thing you can do is start with goals and a research plan.
These are the most common goals I use for usability tests:
- Learn about people's current pain points, frustrations, and barriers about [current process/current tools] and how they would improve it
- Uncover the current tools people use to [achieve goal] and their experience with those tools. Uncover how they would improve those tools
- Evaluate how people are using a [product/website/app/service]
- OR Evaluate how people are currently interacting with a [product/ website/app/service]
Recruitment and sample size
Once you've written your goals, it is time to think about recruitment and sample size.
Recruiting the right people is essential for a good user research study. Trust me, it's challenging to fill a 60-minute usability test with the wrong participants, plus it's awkward.
For usability tests, I ask myself the following questions, with some examples from above, to understand who my target users would be:
What are the particular behaviors I am looking for?
- Have purchased clothing online in the past month
Have they needed to use the product? And in what timeframe?
- Have used our product in the past month to purchase clothing
What goals are important to our users?
- Getting the right size and fit of clothing without returns
What habits might they have?
- Visiting our website frequently (ex: at least once every other week)
You can create an effective screener survey when you know what you’re looking for.
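The questions above translate fairly directly into screener logic. The sketch below is only an illustration: the field names and cut-offs are assumptions based on the clothing examples, not part of any real survey tool.

```python
from dataclasses import dataclass


@dataclass
class ScreenerResponse:
    """One respondent's answers; every field name here is hypothetical."""
    name: str
    days_since_online_clothing_purchase: int  # behavior
    days_since_purchase_on_our_site: int      # product use and timeframe
    cares_about_fit_without_returns: bool     # goal
    visits_per_month: int                     # habit


def qualifies(r: ScreenerResponse) -> bool:
    """Apply the example criteria: recent purchases, fit goal, frequent visits."""
    return (
        r.days_since_online_clothing_purchase <= 30
        and r.days_since_purchase_on_our_site <= 30
        and r.cares_about_fit_without_returns
        and r.visits_per_month >= 2  # roughly once every other week
    )


responses = [
    ScreenerResponse("A", 10, 12, True, 4),
    ScreenerResponse("B", 90, 200, True, 1),
    ScreenerResponse("C", 5, 5, False, 6),
]
shortlist = [r.name for r in responses if qualifies(r)]
print(shortlist)
```

Writing the criteria down this explicitly, even on paper, makes it easier to spot a screener question you forgot to ask.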
One note about the sample size for usability testing: A common rule of thumb for evaluative research is testing with five people. When I review research plans, I often see five participants as the number for an evaluative study.
While this could be correct, it isn't a hard and fast rule. The number comes from the often-cited claim that five participants uncover roughly 85% of usability issues. But, as with recruitment above, the participants have to be the right ones: if you pick five random people off the street to test your product, you likely won't find 85% of the issues.
The fine print behind five people per usability study is that it means five users per segment.
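If you're curious where the 85% figure comes from, it is usually explained with a simple problem-discovery model: if each participant has a probability p of running into a given issue, then n participants are expected to surface 1 - (1 - p)^n of the issues. Here is a rough sketch, assuming the commonly quoted p = 0.31 (an average from older studies, not a property of your product) and participants from a single segment:

```python
def share_of_issues_found(participants: int, p_per_participant: float = 0.31) -> float:
    """Expected share of issues seen at least once: 1 - (1 - p)^n."""
    return 1 - (1 - p_per_participant) ** participants


for n in (1, 3, 5, 8):
    print(n, round(share_of_issues_found(n), 2))

# Same five sessions, but with poorly matched participants (assumed p = 0.10).
print(round(share_of_issues_found(5, p_per_participant=0.10), 2))
```

With p = 0.31, five participants land at roughly 0.84, which is the "85%" people quote. With the lower, purely illustrative p = 0.10, the same five sessions cover only about 41% of the issues, which is why the right participants matter more than the number.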
Here are the general guidelines for the top evaluative methods:
1. Usability testing
- Moderated: At least five participants per segment
- Unmoderated: At least 15 participants per segment, in case you get messy data
2. Concept testing
- Moderated: At least eight participants per segment
- Unmoderated: At least 15 participants per segment, in case you get messy data
3. Card sorting
- Ideally 20-30 participants per segment
- Requires 25 or more participants per segment, since we’re looking at quantitative data
Task writing is the next step after recruitment and sorting out sample sizes. Task writing was my least favorite part for a while, because it’s more complicated than it seems on the surface.
Throughout the years, I have honed this skill through a lot of practice. If you’re looking to write great usability tasks, that would be my first piece of advice—practice, practice, practice.
These are the steps I go through when constructing my usability testing tasks:
1. Start with a goal
Start with what you want the user's end goal to be, not the goal of the task. What does the user need (or want) to accomplish at the end of the action? What is their goal for using this product?
2. Include some context
Instead of throwing participants into action with no relevant information, give them context on why they need to use the product. You can also consider the context and background information for why they would use the product in the real world.
3. Give them relevant information
Since you are recording metrics, you don't want to be vague in your instructions. If users need to input dates, locations, or particular data in a form, give them that information. You don't want the user to guess what you want them to do, resulting in skewed data.
4. Ensure there is an end the user can reach
If you’re trying to get someone to accomplish a task, make sure they can. There should be a reachable "end," which satisfies the participant and helps you record if the participants could complete the task.
5. Write your task scenario(s)*
Once you have brainstormed this information, write your task scenario. I have included some examples at the end of this article.
6. Conduct a dry run (or two!)
After writing down your task scenarios, perform a dry run with either internal employees, friends, family, or anyone that will give you their time. This dry run will allow you to practice the test flow, make sure the tasks make sense, and indicate whether there are too many.
*A common question is how many tasks should be in one usability test. It depends on the complexity of the tasks and how much time you have with the participant.
For a 45-60 minute session, I generally include five to seven tasks. By including a dry run in your process, you can know how many tasks you can fit into the session.
A potential task example for the above clothing company might look like this:
Winter is coming up, and you’re looking for a new winter coat to keep you warm that is under £150.
Usability tests can be either qualitative or more on the quantitative side if you incorporate usability metrics. However, I always recommend feeling comfortable with observational and qualitative usability testing before jumping in to track specific metrics. Also, if you are looking for qualitative feedback, usability metrics won't be a good fit.
When it comes to usability testing, we go back to our three cornerstones of usability:
- Effectiveness: Whether or not a user can accurately complete a task that allows them to achieve their goals. Can a user complete a task? Can a user complete a task without making errors?
- Efficiency: The amount of cognitive resources it takes for a user to complete tasks. How long does it take a user to complete a task? Do users have to expend a lot of mental energy when completing a task?
- Satisfaction: The comfort and acceptability of a given website/app/product. Is the customer satisfied with the task?
Combining these metrics can help you highlight high-priority problem areas. For example, suppose participants respond confidently that they completed a task, yet most fail.
In that case, there is a large gap between what participants believe happened and what actually happened, which points to a serious problem worth prioritizing. Let’s break up the metrics by area of usability testing:
- Task Success: This simple metric tells you if a user could complete a given task (0=Fail, 1=Pass). You can get fancier with this by assigning more numbers that denote the difficulty users had with the task, but you need to determine the levels with your team before the study.
- The number of errors: This metric gives you the number of errors a user committed while trying to complete a task. You can also gain insight into common mistakes users run into while attempting to complete the task. If several users try to complete a task in a way you didn't design for, the same errors will tend to show up repeatedly.
- Single Ease Question (SEQ): The SEQ is one question (on a seven-point scale) measuring the participant's perceived task ease. Ask the SEQ after each completed (or failed) task.
- Confidence: Confidence is a seven-point scale that asks users to rate how confident they were that they completed the task successfully.
- Time on Task: This metric measures how long it takes participants to complete or fail a given task. It gives you a few different options to report on: average task completion time, average task failure time, or overall average task time (of both completed and failed tasks).
- Subjective Mental Effort Question (SMEQ): The SMEQ allows the users to rate how mentally difficult a task was to complete.
- System Usability Scale (SUS): The SUS has become an industry standard and measures the perceived usability of a product. Because of its popularity, you can reference published statistics (for example, the average SUS score is 68).
- Single Usability Metric (SUM): This measurement enables you to take completion rates, ease, and time on task and combine them into a single metric that describes the usability and experience of a task.
- Standardized User Experience Percentile Rank Questionnaire (SUPR-Q): This questionnaire is ideal for benchmarking a product's user experience. It allows participants to rate the overall quality of a product's user experience based on four factors: usability, trust/credibility, appearance, and loyalty.
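To make the task-level metrics above more concrete, here is a small sketch of how task success, errors, time on task, and confidence might be summarized for a single task. The data is invented, and the cut-offs used to flag the "confident but failing" pattern are arbitrary assumptions you would agree on with your team.

```python
from statistics import mean

# Hypothetical results for one task: (participant, success 0/1, errors,
# seconds on task, confidence rating on a 1-7 scale).
results = [
    ("P1", 1, 0, 95, 7),
    ("P2", 0, 3, 210, 6),
    ("P3", 0, 2, 180, 6),
    ("P4", 1, 1, 120, 5),
    ("P5", 0, 4, 240, 7),
]

success_rate = mean(r[1] for r in results)
avg_errors = mean(r[2] for r in results)
avg_time_completed = mean(r[3] for r in results if r[1] == 1)
avg_time_failed = mean(r[3] for r in results if r[1] == 0)
avg_time_overall = mean(r[3] for r in results)
avg_confidence = mean(r[4] for r in results)

# Flag the discrepancy between confidence and success.
# The 0.5 and 5.0 thresholds are illustrative assumptions only.
overconfident = success_rate < 0.5 and avg_confidence >= 5.0

print(f"success rate: {success_rate:.0%}")
print(f"average errors: {avg_errors}")
print(f"time on task (completed / failed / overall): "
      f"{avg_time_completed}s / {avg_time_failed}s / {avg_time_overall}s")
print(f"average confidence: {avg_confidence}")
print(f"confident but failing: {overconfident}")
```

In this made-up data, most participants fail the task while still rating their confidence highly, so the task would be flagged as a high-priority problem area.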
How do I know when to use metrics?
Metrics won't be helpful if you’re looking for qualitative feedback and want to talk to participants about their experiences as they try tasks. It's hard to get in-the-moment qualitative feedback when you incorporate metrics into usability tests because you would be interrupting participants.
For instance, if you track task success or time on task, asking participants how they are feeling and what they are thinking during tasks can distract them and skew your data.
If you want numbers to assess the usability of your product, these metrics will be super helpful in bringing a quantitative twist to your user research. In addition, gathering this data and then testing again after improvements can also help toward benchmarking and proving the ROI of user research.
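As one benchmarking example, the SUS has a standard scoring procedure: ten answers on a 1-5 scale, where odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a score from 0 to 100. Here is a minimal sketch of scoring a before and after round; the participant answers are invented for illustration.

```python
from statistics import mean

SUS_AVERAGE = 68  # published average SUS score, used here as a reference point


def sus_score(answers: list[int]) -> float:
    """Score one participant's ten SUS answers (each 1-5) on the 0-100 scale."""
    if len(answers) != 10 or any(a < 1 or a > 5 for a in answers):
        raise ValueError("SUS needs ten answers between 1 and 5")
    total = 0
    for item_number, answer in enumerate(answers, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item_number % 2 == 1 else (5 - answer)
    return total * 2.5


# Hypothetical answers from three participants per round.
before = [
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    [4, 2, 3, 3, 4, 2, 3, 3, 3, 2],
    [2, 4, 3, 3, 2, 4, 3, 3, 2, 3],
]
after = [
    [5, 1, 4, 2, 5, 1, 4, 2, 4, 2],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [4, 2, 3, 2, 4, 3, 4, 2, 3, 2],
]

mean_before = mean(sus_score(a) for a in before)
mean_after = mean(sus_score(a) for a in after)
print(f"before: {mean_before:.1f}, after: {mean_after:.1f}, reference: {SUS_AVERAGE}")
```

Three participants is far too few for a real SUS benchmark; the point is only to show the scoring and the before/after comparison.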
When using metrics, make sure your test evaluates effectiveness, efficiency, and satisfaction with data. If you want to add qualitative components, steer away from metrics or save some time for a qualitative portion after the metric-based usability test.
Once you conduct the usability test, it's time for the exciting process of unwinding your data into patterns and trends you can act on.
Qualitative usability testing
I usually focus on global tags and affinity diagramming when I synthesize qualitative usability testing. The most common global tags I use for usability testing are:
- Goals: What the person is trying to accomplish as an outcome.
- Need: Something a person needs to fulfill a goal.
- Task: Something a person does to achieve a goal.
- Pain point: A barrier or difficulty towards accomplishing a goal.
After each usability session, I do a quick debrief where I split the global tags into four quadrants and note what happened during that session concerning each quadrant.
So, for example, if we were usability testing the clothing website, a session debrief (one participant) might look like this:
Goals
- Purchase a new weatherproof winter jacket to keep warm during snowstorms and in below-freezing weather
- Get the highest quality for the lowest price by comparing different products on the website
Needs
- Weatherproof and durable jacket, knowing this is the case through a description
- Lasts for more than five years
- Warm enough for below-freezing weather
- Under £250
Tasks
- Searching for winter jackets using keywords
- Filtering by weatherproof or durability
- Reading the description to understand more about the product
- Sorting by price
Pain points
- Not knowing if a coat is waterproof or weatherproof for the necessary conditions
- Understanding fit and size with additional layers
- Knowing the length of the coat
- Understanding other people's experiences with the coat in similar situations
Completing this small synthesis session makes for easier work at the end of the study. Once you complete all the sessions and each debrief, bring all participants together to assess patterns and trends (the same thing three or more people are saying).
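If your debrief notes live in a spreadsheet or a script rather than on sticky notes, the "three or more people" rule is easy to check. This sketch assumes you have already grouped similar observations under identical wording (the affinity-diagramming step); the notes themselves are invented.

```python
from collections import defaultdict

# Hypothetical debrief notes: participant -> list of (tag, observation).
debriefs = {
    "P1": [("pain point", "unsure if coat is waterproof"), ("task", "sorting by price")],
    "P2": [("pain point", "unsure if coat is waterproof"), ("pain point", "unclear fit with layers")],
    "P3": [("pain point", "unsure if coat is waterproof"), ("task", "sorting by price")],
    "P4": [("task", "sorting by price"), ("pain point", "unclear fit with layers")],
}

# Collect the set of participants behind each observation, so one person
# repeating themselves never counts twice.
participants_per_note = defaultdict(set)
for participant, notes in debriefs.items():
    for tag, note in notes:
        participants_per_note[(tag, note)].add(participant)

# A pattern is anything three or more participants said or did.
patterns = {
    key: len(people)
    for key, people in participants_per_note.items()
    if len(people) >= 3
}
for (tag, note), count in patterns.items():
    print(f"{tag}: {note} ({count} participants)")
```

Observations below the threshold are not thrown away; they simply stay as individual notes rather than patterns.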
Quantitative usability testing (with metrics)
My go-to for reporting quantitative usability testing is a stoplight report.
A stoplight report:
- Conveys whether or not a user has passed a usability testing task
- Includes how severe the problem is in a task
- Shows the amount of time spent on a task, by task and participant
- Highlights, on average, how many participants passed/failed a given task
- Summarizes how the usability test went through visuals
The most valuable part of the stoplight approach is how visual it is. It can quickly provide a stakeholder with a holistic overview of how the usability test went.
A stoplight chart includes the following components (but you don't have to include all of them!):
- Each participant has a column, with a participant summary at the bottom
- Each task has a row, with an average task summary
- The three colors indicate:
  - Whether a participant succeeded (green)
  - Whether a participant struggled with a task (orange)
  - Whether a participant failed the task (red)
- The time for each task is recorded within the task participant bubble and averaged per task
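The components above can be mocked up in a few lines before you build anything visual. This is a plain-text sketch with invented data; the pass/struggle/fail coding and the task names are assumptions, and a real report would use colored cells instead of words.

```python
from statistics import mean

# Hypothetical results: task -> {participant: (status, seconds on task)}.
results = {
    "Find a winter coat": {"P1": ("pass", 80), "P2": ("struggle", 150), "P3": ("fail", 250)},
    "Check out": {"P1": ("pass", 60), "P2": ("pass", 75), "P3": ("struggle", 135)},
}
COLORS = {"pass": "green", "struggle": "orange", "fail": "red"}
participants = ["P1", "P2", "P3"]


def task_summary(task: str) -> tuple[float, int]:
    """Return (average seconds, number of participants who passed) for a task."""
    cells = results[task]
    avg_seconds = mean(seconds for _, seconds in cells.values())
    passed = sum(status == "pass" for status, _ in cells.values())
    return avg_seconds, passed


def participant_summary(participant: str) -> int:
    """Return how many tasks this participant passed."""
    return sum(cells[participant][0] == "pass" for cells in results.values())


# One row per task, one column per participant, summaries at the edges.
print("task".ljust(20), *(p.ljust(14) for p in participants), "avg time / passed")
for task, cells in results.items():
    avg_seconds, passed = task_summary(task)
    row = [f"{COLORS[cells[p][0]]} {cells[p][1]}s".ljust(14) for p in participants]
    print(task.ljust(20), *row, f"{avg_seconds:.0f}s / {passed} of {len(participants)}")
print("tasks passed".ljust(20), *(str(participant_summary(p)).ljust(14) for p in participants))
```

Even in this rough form, a stakeholder can see at a glance which task caused the most trouble and which participant struggled throughout.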
Want to test it out? Use this template for reporting your next quantitative usability study!
Go forth and test usability!
When you review basics and practice (as I said, practice is critical), you will naturally feel more comfortable with the different components of usability testing. Although it is a more straightforward method, it is still worth mastering—the right way.
If you're getting ready for your next usability test and want an even more in-depth checklist, this article (+ awesome template) is for you.
Nikki Anderson-Stanier is the founder of User Research Academy and a qualitative researcher with 9 years in the field. She loves solving human problems and petting all the dogs.