What is usability testing?

Usability testing is the practice of putting a product or prototype in front of real users and watching what happens. Participants are given tasks to complete, and researchers observe, without helping or explaining, where users succeed, where they hesitate, and where they fail or give up.

The method rests on a straightforward premise: the people who design and build a product are not representative of the people who use it. They know too much. They understand how the product works internally, what each label is meant to convey, and where everything lives. Users approach the same interface with different mental models, different expectations, and none of that context. Usability testing closes the gap by generating direct evidence of how real users experience the product, rather than how the team imagines they do.

Usability testing can be applied to anything from a paper sketch to a fully shipped product. It's most valuable when done early and repeatedly rather than as a single checkpoint before launch.

How does usability testing work?

A usability test session typically involves a moderator, a participant, and a set of tasks. The moderator introduces the session, explains that the product, not the participant, is being evaluated, and gives the participant a task to attempt. The participant completes, or attempts to complete, the task while thinking aloud to share what they're noticing and deciding. The moderator observes and takes notes without guiding the participant toward the correct path.

After the session, researchers review notes and recordings, identify patterns across participants, and document usability problems with enough detail to inform design decisions. A finding like "three of five participants could not locate the account settings because they expected it in the navigation rather than under the profile icon" is specific enough to act on directly.

Sessions typically last between 30 and 60 minutes, depending on the scope and complexity of the tasks. Five to eight participants is a commonly cited range for moderated qualitative testing, on the reasoning that most serious problems surface within the first few sessions and additional participants yield diminishing returns. For quantitative testing aimed at measuring task completion rates or time-on-task with statistical confidence, larger sample sizes are needed.
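To see why quantitative testing needs larger samples, it helps to look at the confidence interval around a measured completion rate. The sketch below (an illustration, not part of any particular testing tool) uses the standard Wilson score interval to show how wide the uncertainty is at five participants versus fifty, for the same observed 80% completion rate:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for an observed task completion rate."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - margin, center + margin

# 4 of 5 participants complete the task: the interval is very wide,
# roughly "anywhere from well under half to nearly everyone".
lo, hi = wilson_interval(4, 5)
print(f"n=5:  {lo:.0%} to {hi:.0%}")

# 40 of 50 participants at the same 80% rate: the interval narrows
# enough to support a quantitative claim about the completion rate.
lo, hi = wilson_interval(40, 50)
print(f"n=50: {lo:.0%} to {hi:.0%}")
```

Five participants are enough to notice that a task is failing, which is the point of qualitative testing, but as the output shows, they are nowhere near enough to pin down a completion rate as a metric.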

What are the different types of usability testing?

Usability testing varies along several dimensions, and the right approach depends on what the team is trying to learn.

  • Moderated testing involves a researcher actively facilitating the session in real time. The moderator can probe, ask follow-up questions, and adjust the session based on what the participant is doing. This produces rich qualitative data and the ability to explore unexpected findings, but it requires scheduling and coordination with each participant.
  • Unmoderated testing uses software to give participants tasks and record their sessions without a researcher present. Participants complete the test at their own convenience, which makes it faster and cheaper to run at scale. The trade-off is losing the ability to follow unexpected threads or ask clarifying questions.
  • Remote testing, whether moderated or unmoderated, allows researchers to work with participants regardless of geography. It has become the default for most teams. In-person testing remains valuable for products where the physical environment matters, like hardware or point-of-sale systems, or when observing subtle body language and context is important.
  • Guerrilla testing is an informal version where researchers approach people in public spaces and ask for a few minutes of their time to try a product. It's fast and cheap, useful for quick directional feedback, and not suited for research questions that require specific user profiles.

Why does usability testing matter for product outcomes?

The practical argument for usability testing is economic. Finding and fixing a design problem in a prototype costs a fraction of what it costs to find and fix the same problem after development is complete, and less still compared to discovering it after launch through support tickets, poor reviews, or declining retention metrics.

Beyond cost, usability testing creates shared understanding within a product team. Watching a recording of a user struggle with something the team thought was obvious is often more persuasive than a researcher's written summary of the same finding. It builds genuine empathy for users in ways that data alone rarely achieves, and it tends to sharpen prioritization by making abstract usability issues concrete and specific.

For product managers, usability findings provide evidence that can anchor conversations about design trade-offs and feature priority. "Users consistently failed to find this feature because it's buried under a secondary navigation label" is a more actionable input than "users might find this hard to discover."

How has usability testing changed with AI?

AI-powered analysis tools have significantly reduced the time required to process and synthesize session recordings. Platforms like Maze AI can analyze user sessions at scale, identifying friction points, measuring task completion rates, and surfacing patterns across large numbers of participants much faster than manual review. Tools like Dovetail automate the tagging and clustering of observations from recordings and notes.

For moderated testing, AI transcription has made it practical to generate accurate transcripts from session recordings almost immediately, reducing the manual work of notetaking and review. Researchers can search transcripts, tag moments, and build synthesis artifacts faster than was possible with manual methods.

Remote unmoderated testing has matured significantly as a format. The tooling has improved, participant recruitment platforms have grown, and teams have become more comfortable treating unmoderated tests as a standard part of their research toolkit rather than a compromise. The ability to run a round of testing in days rather than weeks has changed how frequently teams integrate usability feedback into their process.