Competitive and benchmark usability evaluations are a type of task-based usability test with an emphasis on metrics (summative evaluation). A benchmark quantifies the usability of an experience using a combination of task-based and study-based metrics that describe both what users do and what they think about the experience.
Establishing a benchmark before changes are made allows design teams to understand whether future changes improve the experience, and by how much. Adding competitors to a benchmark study provides additional context for the metrics. Most benchmark and competitive studies are conducted using an unmoderated testing approach.
Benchmark studies are most effective when conducted at regular intervals (e.g. annually) or after key milestones, such as after major changes to the interface.
MeasuringU provides full services for each phase of benchmark and competitive usability evaluations.
- Refining research questions, hypotheses, competitors, and metrics. We often recommend a mixed-methods approach for task-based usability testing that involves (where possible) a large unmoderated study and a moderated in-depth qualitative component. In a competitive study we advise on the pros and cons of using a within- versus between-subjects approach.
- Defining the tasks, writing the scenarios, and recommending pre- and post-study questions.
- Hosting the testing environment for unmoderated studies or setting up software and hardware.
- Recommending the right sample size. There isn’t a one-size-fits-all number. For standalone benchmarks, the right sample size is a function of how precise we need to be. Most unmoderated benchmarks have between 100 and 200 participants, which provides a margin of error of around ±5% to ±7%. In competitive studies, the sample size needed is a function of how large a difference exists between the products and is usually twice that of a standalone study (depending on the number of competitors). A within-subjects study allows for smaller sample sizes than a between-subjects study.
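The precision tradeoff described above can be sketched with a normal-approximation (Wald) margin of error for a proportion such as a completion rate. This is a minimal illustration, not MeasuringU's planning method: the 0.85 completion rate and the 95% confidence level (z = 1.96) are illustrative assumptions.

```python
import math

def margin_of_error(n, p=0.85, z=1.96):
    """Approximate margin of error for an observed proportion
    (e.g., a completion rate of p) with n participants, using a
    normal-approximation (Wald) interval at ~95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 150, 200):
    # At p = 0.85, n = 100 gives roughly ±7.0% and n = 200 roughly ±4.9%
    print(f"n={n}: ±{margin_of_error(n):.1%}")
```

Note that the margin widens as the proportion approaches 0.5, and for small samples an adjusted-Wald interval is more accurate than the plain Wald interval shown here.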
MeasuringU can manage and recruit qualified participants for in-person and unmoderated benchmark testing from around the world. We assist in developing screeners, scheduling participants, and managing participant honorariums and communication.
In benchmark and competitive studies we emphasize the task and overall metrics but still provide descriptions of problems and insights:
- Metrics: The most common metrics are completion rates, task times, and task- and test-level perception questions. For websites we recommend the SUPR-Q to provide a standardized measure of UX quality and a relative comparison to 150 other website scores.
- Problems: We provide descriptions of problems encountered by users and use screenshots, quotes, and video highlights to illustrate problems for development teams to fix.
- Insights: Usability tests aren’t just long lists of problems. We also highlight areas that worked well for users, including compliments and quotes from favorable interactions.
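As a minimal sketch of how the task-level metrics listed above might be summarized, the snippet below computes a completion rate and a geometric mean task time (often preferred to the arithmetic mean because task-time data is typically positively skewed). The participant data is hypothetical.

```python
import math
from statistics import mean

# Hypothetical results for one task: 1 = completed, 0 = failed
completed = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
# Hypothetical task times (in seconds) for the successful attempts
times = [45, 62, 38, 51, 120, 47, 55, 49]

completion_rate = mean(completed)
# Geometric mean: exponentiate the mean of the log-transformed times,
# which damps the influence of the one slow (120 s) attempt
geo_mean_time = math.exp(mean(math.log(t) for t in times))

print(f"Completion rate: {completion_rate:.0%}")
print(f"Geometric mean time: {geo_mean_time:.1f} s")
```

Python 3.8+ also provides `statistics.geometric_mean`, which computes the same quantity directly.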