One of the first questions with any metric is “What’s a good score?” As in sports, a good score depends on the metric and the context.
Here are 10 benchmarks with some context to help make your metrics more manageable.
1. Average Task Completion Rate is 78%: The fundamental usability metric is task completion. If users cannot complete what they came to do on a website or in software, then not much else matters. While a “good” completion rate always depends on context, we’ve found that across more than 1,100 tasks the average task completion rate is 78%.
2. Consumer Software Average Net Promoter Score (NPS) is 21%: The Net Promoter Score has become the default metric for many companies for measuring word-of-mouth (positive and negative). In examining 1,000 users across several popular consumer software products, we found the average NPS was 21%.
3. Website Average Net Promoter Score is -14%: We also maintain a large database of Net Promoter Scores for websites. The negative Net Promoter Score shows that there are more detractors than promoters. This suggests that users are less loyal to websites than to consumer software and, therefore, less likely to recommend them. It could be that the bulk of users on any one website are new and are therefore less inclined to recommend something they are unfamiliar with.
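To make the sign of the scores in #2 and #3 concrete, here is a minimal sketch of how an NPS is typically computed. It assumes the standard scoring convention, which this article doesn't spell out: ratings on a 0 to 10 likelihood-to-recommend scale, with 9 and 10 counted as promoters and 0 through 6 as detractors. The function name and the sample ratings are made up for illustration.

```python
def net_promoter_score(ratings):
    """Return NPS in percentage points from 0-10 likelihood-to-recommend ratings.

    Assumes the conventional cutoffs: promoters rate 9-10, detractors rate 0-6.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical ratings: 3 promoters, 5 detractors, 2 passives.
sample = [10, 9, 8, 7, 6, 5, 3, 9, 2, 6]
print(net_promoter_score(sample))  # -20.0
```

A negative result, like the website average, simply means detractors outnumber promoters in the sample.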
4. Average System Usability Scale (SUS) Score is 68: SUS is the most popular questionnaire for measuring the perception of usability. Its 10 items have been administered thousands of times. SUS scores range from 0 to 100. Across the 500 datasets we examined, the average score was 68. The table below shows the percentile ranks for a range of scores, how to associate a letter grade with the SUS score, and the typical completion rates we see (also see #5).
| Grade | SUS | Percentile Rank | Est. Completion Rate |
Table 1: Raw System Usability Scale (SUS) scores, associated percentile ranks, completion rates and letter grades. Adapted from A Practical Guide to SUS and updated by Jim Lewis 2012.
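The 0 to 100 range in #4 comes from how the 10 items are scored. The article doesn't describe the scoring, so the sketch below assumes the standard SUS rule: responses are on a 1 to 5 scale, odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The function name and the sample responses are illustrative.

```python
def sus_score(responses):
    """Return a 0-100 SUS score from ten 1-5 responses, in item order."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


# Hypothetical respondent: agrees with positive items, disagrees with negative ones.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

Answering every item with the neutral midpoint of 3 yields a 50, which is below the average of 68 reported above.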
5. High task completion is associated with SUS scores above 80: While task completion is the fundamental metric, high or even perfect task completion doesn’t mean you have perfect usability. The table of SUS scores above shows that, across the 122 studies, average task completion rates of 100% can be associated with good SUS scores (80) or great SUS scores (90+). Associating completion rates with SUS scores is another way of making the scores more meaningful to stakeholders who are less familiar with the questionnaire.
6. Average Task Difficulty using the Single Ease Question (SEQ) is 5.5: The SEQ is a single question that has users rate how difficult they found a task on a 7-point scale where 1 = very difficult and 7 = very easy. Across 200+ tasks we’ve found the average task-difficulty rating is 5.5, higher than the nominal midpoint of 4, but consistent with other 7-point scales. (Note: This average SEQ score was updated to 5.5 in May 2022 from 4.8, reflecting the much larger set of data analyzed in the subsequent 10 years.)
7. Average Single Usability Metric (SUM) score is 65%: The SUM is the average of task metrics: completion rates, task times, and task-difficulty ratings. As such, it is affected by completion rates, which are context-dependent (see #1 above), and by task times, which fluctuate with the complexity of the task. Despite this context sensitivity, I’ve seen that across 100 tasks on websites and consumer software the average SUM score is 65%. This is for 3-metric SUM scores. It will be higher for 4-metric scores, which include errors. However, most of the datasets I have used are only 3-metric SUM scores. The table below shows SUM scores and their percentile ranks from the 100 tasks. For example, a SUM score above 87% puts a task in the 95th percentile.
| SUM % | Percentile Rank |
Table 2: SUM percent scores from 100 website and consumer software tasks and their percentile ranks. For example, a SUM % score (from averaging completion rates, task time, and task difficulty) of 55 was at the 25th percentile, meaning it was worse than 75% of all tasks.
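As a rough illustration of the averaging step in #7, here is a minimal sketch of a 3-metric SUM. It assumes each metric has already been converted to a 0 to 100 percentage; how task times and difficulty ratings get converted is not covered in this article, so the function simply takes the converted values as inputs. The function name and the input values are hypothetical.

```python
def sum_score(completion_pct, time_pct, ease_pct):
    """Average three task metrics that are already on a 0-100 percentage scale."""
    metrics = (completion_pct, time_pct, ease_pct)
    if any(not 0 <= m <= 100 for m in metrics):
        raise ValueError("each metric must be a percentage between 0 and 100")
    return sum(metrics) / len(metrics)


# Hypothetical task: 78% completion, 50% on the time metric, 67% on the ease metric.
print(sum_score(78, 50, 67))  # 65.0
```

Because the result is a plain average, one weak metric (here, time) pulls the whole task score down even when completion is at the benchmark.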
8. The average SUPR-Q score is 50%: The Standardized Universal Percentile Rank Questionnaire (SUPR-Q) comprises 13 items and is backed by a rolling database of 200 websites. It measures perceptions of usability, credibility and trust, loyalty, and appearance. A score of 50% means half the websites in the database score higher and half score lower than your site’s SUPR-Q score.
9. Usability problems in business software impact about 37% of users: In examining both published and private datasets, we found that the average usability problem in products like enterprise accounting and HR software affects more than one in three users. While that’s bad for a usable experience, it means a small sample size of five users will uncover most usability issues that occur this frequently.
10. The average number of errors per task is 0.7: Across 719 tasks of mostly consumer and business software, counting slips and mistakes, we found that about two out of every three users (2/3) made an error. Only 10% of all tasks we’ve observed are error-free. In other words, to err is human.
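The sample-size claim in #9 can be checked with a short calculation. The sketch below assumes each user independently has the same chance of encountering a given problem (a simple binomial model, which is an assumption added here, not something stated in the article). Under that model, the chance that a problem is seen at least once among n users is 1 - (1 - p)^n. The function name is illustrative.

```python
def chance_seen_at_least_once(problem_rate, num_users):
    """Probability that at least one of num_users hits a problem.

    Assumes each user independently encounters the problem with
    probability problem_rate (a simplifying assumption).
    """
    if not 0 <= problem_rate <= 1:
        raise ValueError("problem_rate must be between 0 and 1")
    if num_users < 0:
        raise ValueError("num_users must be non-negative")
    return 1 - (1 - problem_rate) ** num_users


# A problem affecting 37% of users, tested with five users.
print(round(chance_seen_at_least_once(0.37, 5), 2))  # 0.9
```

So with five users there is roughly a 90% chance of observing a problem that affects 37% of users, which is why a small sample uncovers most issues at this frequency. Rarer problems need larger samples under the same model.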