
Letter grades in school might rekindle some bad memories, and if you have kids, you know the pain crosses generations. But letter grades haven’t always been around. They are a 20th-century adaptation to help with interpreting performance.
Letter grades certainly have shortcomings when judging student performance, but because they are close to being universally understood, they have endured. They have also been adapted to business performance and are a great way of interpreting UX metrics. Letter grades were first applied to the System Usability Scale (SUS) by Bangor, Kortum, and Miller (2008).
In this article, we discuss standard and curved grading scales for the UX-Lite® and when to use them.
Standard and Curved Grading Scales
There are different ways to build grading scales. The two most popular methods are standard (absolute, criterion-referenced) and curved (relative, norm-referenced).
A common standard grading scale ranges from 0 to 100 and assigns an A to the range of 90–100, a B to 80–89, a C to 70–79, a D to 60–69, and an F to 0–59. This was the approach Bangor et al. (2008) took for their SUS grading scale. Grade ranges can be further divided into minus and plus grades (e.g., 90–92 for A− and 97–100 for A+).
Curved grading scales are a little more complicated because grades are determined by comparison with a reference group. In a classroom, the reference group is the class itself. For a given test, the instructor might give an A to the top 10% of students, a B to the next 20%, a C to the next 40%, a D to the next 20%, and an F to the bottom 10%.
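To make the norm-referenced idea concrete, here is a minimal Python sketch (not part of our calculator) that assigns curved grades to a hypothetical class of test scores using the cutoffs described above; the scores and the function name are ours for illustration only.

```python
import numpy as np

def curve_grades(scores):
    """Assign norm-referenced grades: top 10% A, next 20% B,
    next 40% C, next 20% D, bottom 10% F."""
    scores = np.asarray(scores, dtype=float)
    # Percentile rank of each score within the class (0-100).
    ranks = np.array([np.mean(scores <= s) * 100 for s in scores])
    # Cumulative cutoffs from the bottom: F (10%), D (30%), C (70%), B (90%), A (100%).
    cutoffs = [(10, "F"), (30, "D"), (70, "C"), (90, "B"), (100, "A")]
    grades = []
    for r in ranks:
        for cutoff, grade in cutoffs:
            if r <= cutoff:
                grades.append(grade)
                break
    return grades

# Hypothetical class of ten test scores.
class_scores = [95, 91, 88, 84, 80, 77, 73, 70, 62, 55]
print(list(zip(class_scores, curve_grades(class_scores))))
```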
For a standardized questionnaire, the basis for norms is data collected from a reference group with a sample size large enough to establish percentiles. For a metric in which a low score is poorer than a high score, an observed score that is at the 5th percentile can be interpreted as markedly poorer than one that is at the 95th percentile. Standardized UX questionnaires backed by normative data are of greater value to practitioners than questionnaires lacking such data. On the other hand, even when a questionnaire has norms, there is always a risk that the characteristics of a specific sample don’t match the normative sample, so it’s important to understand where a questionnaire’s norms come from.
Standard and curved grading scales have different pros and cons. With a standard grading scale, it’s possible for everyone to get an A and also possible for everyone to get an F. When the scale is curved, there will be a designated percentage of grades from F to A, even if the scores are all terrible or all terrific.
The UX-Lite
We’ve written numerous articles and some peer-reviewed papers about the UX-Lite (e.g., “Effect of Perceived Ease of Use and Usefulness on UX and Behavioral Outcomes”). The UX-Lite has its roots in the UMUX-LITE, which in turn was derived in 2013 from the UMUX (2010). It’s a two-item questionnaire that is essentially a miniature version of the Technology Acceptance Model, used to assess the perceived ease of use and perceived usefulness of products and services (Figure 1).
Figure 1: The current version of the UX-Lite (created with the MUiQ® platform).
2025 UX-Lite Reference Group Updates
Our reference group for the UX-Lite norms is based on product-level means from retrospective surveys of business software products (e.g., GitHub, Slack), consumer software products (e.g., Duolingo, Spotify), and established websites from diverse sectors (e.g., fitness, vacation rental, video streaming, mass merchant, banking). These surveys were conducted over the past five years with respondents from online panels. We also use the website data for our periodic updates of the SUPR-Q® norms.
Since developing our first reference group in 2021, we have made a major update to the reference group for our UX-Lite calculator roughly every two years, after completing our biennial business and consumer software studies. For the 2025 update, we removed data from 2019 and added data collected in the second half of 2024 and the first quarter of 2025. Because about 80% of the data remains the same with each update, the means of the primary groups (all data, business software, consumer software, websites) typically change very little—usually less than 1% and rarely more than 2%.
The current composition of the reference group is 48 business software products, 81 consumer software products, and 160 websites (data from 13,768 respondents). Some data in the reference group are from the same products or websites but from different years, depending on which sectors we replicated in the reference time period.
The Two UX-Lite Grading Scales
Based on our combined reference group (business software, consumer software, and established websites), we can create standard and curved grading scales for interpreting UX-Lite scores at the product level (averaged across individual ratings for each of the 289 products/websites collected in retrospective surveys). Table 1 shows the maximum, median, and minimum UX-Lite scores for the combined reference group, along with the product/website that received the score.
| Score Level | UX-Lite Score | Product |
|---|---|---|
| Maximum | 89.5 | Pottery Barn 2024 |
| Median | 78.8 | Walmart 2021 |
| Minimum | 60.2 | Adobe Illustrator 2022 |
Table 1: Maximum, median, and minimum UX-Lite scores from the combined reference group.
Standard Grading Scale
Table 2 shows our standard grading scale for the UX-Lite for this reference group. Because the highest UX-Lite score in the reference group is less than 90, we expanded the range for A grades to 85–100 rather than the typical 90–100. We matched that by allowing 15 points for D, and as we’ve done in other grading scales for UX metrics, we defined minus and plus grades for A, B, and C (but not for D or F).
| Score Range | Grade | Grade Point |
|---|---|---|
| 95–100 | A+ | 4.0 |
| 90–94.9 | A | 4.0 |
| 85–89.9 | A− | 3.7 |
| 82–84.9 | B+ | 3.3 |
| 78–81.9 | B | 3.0 |
| 75–77.9 | B− | 2.7 |
| 72–74.9 | C+ | 2.3 |
| 68–71.9 | C | 2.0 |
| 65–67.9 | C− | 1.7 |
| 50–64.9 | D | 1.0 |
| 0–49.9 | F | 0.0 |
Table 2: A standard grading scale for the UX-Lite.
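For readers who want to automate the lookup, here is a short Python sketch that applies the Table 2 cutoffs to a mean UX-Lite score; the cutoffs and grade points come directly from the table, but the function name and structure are ours for illustration.

```python
# Cutoffs from Table 2: (minimum score, grade, grade point), highest first.
STANDARD_SCALE = [
    (95.0, "A+", 4.0), (90.0, "A", 4.0), (85.0, "A-", 3.7),
    (82.0, "B+", 3.3), (78.0, "B", 3.0), (75.0, "B-", 2.7),
    (72.0, "C+", 2.3), (68.0, "C", 2.0), (65.0, "C-", 1.7),
    (50.0, "D", 1.0), (0.0, "F", 0.0),
]

def standard_grade(ux_lite_mean):
    """Return (grade, grade point) for a mean UX-Lite score (0-100)."""
    for floor, grade, points in STANDARD_SCALE:
        if ux_lite_mean >= floor:
            return grade, points
    return "F", 0.0

# The maximum, median, and minimum reference-group scores from Table 1.
for name, score in [("Pottery Barn 2024", 89.5),
                    ("Walmart 2021", 78.8),
                    ("Adobe Illustrator 2022", 60.2)]:
    print(name, score, standard_grade(score))
```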
Applying this grading scale to the maximum, median, and minimum scores from our reference group, Pottery Barn 2024 would get an A− (just missing an A), Walmart 2021 would get a B, and Adobe Illustrator 2022 would get a D. This seems reasonable given the criteria we use to select products for our reference group—reasonably well-known products and websites.
In previous research, we’ve consistently found that UX-Lite scores are about three points higher than concurrently collected SUS scores. We also know that the historical average SUS score is 68. As shown in Table 1, the median UX-Lite of our current reference group is 78.8, a little more than 10 points over the historical SUS average. If our UX-Lite reference group matched the historical SUS reference group, we would expect its median to be only about three points higher rather than 10 points.
This strongly suggests a difference due to the composition of the reference groups. The historical reference group for the SUS was based on 446 sources (over 5,000 individual SUS responses) that included a mix of surveys and usability tests (both hardware and software), some of which had been previously published and others that had been anonymized and donated from the UX community. This is very different from a select group of well-known websites and software products. We know it includes some poorer user experiences because about 15% of SUS means in the database were less than 50.
What this means for UX practitioners is that unless their data match the composition of our reference group, they should focus on their actual UX-Lite scores instead of percentiles and interpret mean UX-Lite scores with the standard grading scale.
Curved Grading Scale
On the other hand, UX practitioners who conduct retrospective research on business software, consumer software, or established websites with participants from online panels could benefit from comparing their results with the curved grading scale shown in Table 3.
| Score Range | Grade | Grade Point | %ile Range |
|---|---|---|---|
| 87.4–100 | A+ | 4.0 | 96–100 |
| 85.3–87.3 | A | 4.0 | 90–95 |
| 84.0–85.2 | A− | 3.7 | 85–89 |
| 83.1–83.9 | B+ | 3.3 | 80–84 |
| 81.3–83.0 | B | 3.0 | 70–79 |
| 80.6–81.2 | B− | 2.7 | 65–69 |
| 79.8–80.5 | C+ | 2.3 | 60–64 |
| 77.0–79.7 | C | 2.0 | 41–59 |
| 76.0–76.9 | C− | 1.7 | 35–40 |
| 71.8–75.9 | D | 1.0 | 15–34 |
| 0.0–71.7 | F | 0.0 | 0–14 |
Table 3: A curved grading scale for the UX-Lite.
Applying this grading scale to the maximum, median, and minimum scores from our reference group, Pottery Barn 2024 would get an A+, Walmart 2021 would get a C, and Adobe Illustrator 2022 would get an F (as would any website or product with a mean UX-Lite less than 71.8).
Technical note: Like the SUS, the UX-Lite has a pronounced negative skew (most values are relatively high, with a small tail of lower values), so we use a logarithmic transformation on reflected data to help normalize the distribution before assigning percentiles to scores.
One consequence is that the D-to-A+ portion of the curved grading scale is compressed, spanning 71.8 to 100 rather than the 50 to 100 of our standard grading scale. This allows better discrimination among UX-Lite scores for this reference group.
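To illustrate roughly how a score-to-percentile conversion of this kind can work, here is a Python sketch. It is an approximation, not the exact procedure in our UX-Lite Calculator Package: the reflection constant (101, so the log is defined at a score of 100), the normal approximation, and the synthetic reference data are all our assumptions. After estimating a percentile, it maps the result to a curved grade using the percentile ranges in Table 3.

```python
import numpy as np
from scipy import stats

def uxlite_percentile(score, reference_scores):
    """Estimate the percentile of a UX-Lite mean against a reference group.

    Reflects scores (101 - x) so the negative skew becomes positive, applies a
    log transform, then uses a normal approximation on the transformed data.
    """
    ref = np.log(101.0 - np.asarray(reference_scores, dtype=float))
    z = (np.log(101.0 - score) - ref.mean()) / ref.std(ddof=1)
    # Reflection reverses the order, so a low transformed value means a high score.
    return 100.0 * stats.norm.sf(z)

# Percentile ranges from Table 3, highest first: (minimum percentile, grade).
CURVED_SCALE = [(96, "A+"), (90, "A"), (85, "A-"), (80, "B+"), (70, "B"),
                (65, "B-"), (60, "C+"), (41, "C"), (35, "C-"), (15, "D"), (0, "F")]

def curved_grade(percentile):
    # Table 3 lists integer ranges; this lookup treats each minimum as a floor.
    for floor, grade in CURVED_SCALE:
        if percentile >= floor:
            return grade
    return "F"

# Synthetic reference data for demonstration (our norms are not reproduced here).
reference = np.random.default_rng(1).normal(78.8, 5.5, 289).clip(60, 90)
for score in (60.2, 78.8, 89.5):
    p = uxlite_percentile(score, reference)
    print(score, round(p), curved_grade(p))
```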
The S curve in Figure 2 illustrates how the percentiles in the curved grading scale magnify the differences in UX-Lite scores (especially the steep slope in the curve from 70 to 90).
Figure 2: S curve of the relationship between UX-Lite scores and percentiles for our combined reference group (created with our UX-Lite® Calculator Package) showing the locations of six products (online meeting products plus the minimum, median, and maximum products from Table 1).
For example, before the steep rise of the curve, the difference of 9.6 UX-Lite score points between Adobe Illustrator and WebEx corresponds to a difference of 9 percentile points. In contrast, the difference of just 1.9 UX-Lite score points between Walmart and Zoom corresponds to a difference of 13 percentile points.
How to Use the Grading Scales
The curved grading scale is only appropriate for assessing UX-Lite ratings of software products or websites collected in retrospective surveys with participants from online panels. The standard grading scale can be used more generally to assess any UX-Lite ratings. If you have your own specialized data (e.g., UX-Lite ratings from usability tests of medical devices that have a common set of tasks), then this would be the most meaningful reference group even if there aren’t enough historical data to create a custom curved grading scale. When possible, practitioners should use a combination of comparison with norms and competitive evaluation when assessing the quality of their products.
The grading scales presented in this article provide good general guidance for interpreting UX-Lite means. Several lines of SUS research have shown, however, that different types of products and interfaces differ significantly in perceived usability. For example, Kortum and Bangor (2013) published SUS ratings of overall experience for a set of 14 everyday products from a survey of more than 1,000 users. SUS means ranged from lower to higher perceived usability: 56.5 for Excel, 76.2 for Microsoft Word, 81.8 for Amazon, and 92.7 for Google Search.
A pragmatic interpretation of these findings is that expectations for specific product types can refine more general norms. For example, it shouldn’t be surprising that a complex spreadsheet program has lower perceived usability than a well-designed search box. For many projects, setting a UX-Lite benchmark of 80 (middle of the B range on the standard scale, high side of C+ on the curved scale) is reasonable (above average) and achievable. If, however, the project is to develop a competitive spreadsheet application, a UX-Lite benchmark of 80 is probably unrealistically high (the UX-Lite means for Excel from our 2022 and 2025 consumer software surveys were, respectively, 65.5 and 69.1).
Within our reference group, the subgroup means differ: 72.6 for business software (sd = 4.23), 77.7 for consumer software (sd = 6.42), and 79.7 for websites (sd = 5.38). If there is a specialized research need for percentiles for any of these subgroups or for the UX-Lite subscales of perceived ease and perceived usefulness, they are available in our UX-Lite Calculator Package.
Summary and Discussion
The UX-Lite is a two-item UX questionnaire that assesses perceived ease of use and perceived usefulness, two of the key drivers of ratings of overall experience and behavioral intentions (e.g., likelihood to use, likelihood to recommend). Since 2021, we have analyzed UX-Lite data collected with retrospective surveys of business software products, consumer software products, and websites from multiple consumer sectors with participants from online panels to enable interpretation of UX-Lite means with percentiles for that reference group. Our most recent update of the reference group has five years of UX-Lite data collected from 1Q 2020 through 1Q 2025 (48 business software products, 81 consumer software products, and 160 websites with ratings from 13,768 respondents).
We’ve developed two grading scales for interpreting UX-Lite scores, one standard and one curved. A standard grading scale (absolute, criterion-referenced) has set ranges for its grades. Grade ranges for a curved grading scale (relative, norm-referenced) depend on the distribution of data in a reference group.
The standard grading scale is appropriate for a general assessment of UX-Lite means. Practitioners can use the standard grading scale for the UX-Lite across a wide range of product types and experimental designs.
The curved grading scale is appropriate when new UX-Lite data match the reference group. When the method used to collect new UX-Lite data is consistent with the reference group, practitioners can use the curved grading scale to assess where the new data fall relative to the reference group.
When possible, practitioners should also collect UX-Lite data from key competitors. Having data from key competitors will allow better assessment of where a product or website is in its competitive space. UX-Lite means from this type of benchmark study can be interpreted with either the standard or curved grading scale, depending on the type of product and the method of data collection.




