Measuring Usability: From the SUS to the UMUX-Lite

Jeff Sauro, PhD

Many researchers are familiar with the SUS, and for good reason.

It’s the most commonly used and widely cited questionnaire for assessing the perception of the ease of using a system (software, website, or interface).

Despite being short (10 items), the SUS has a fair amount of redundancy, given that it measures only one construct (perceived usability).

While some redundancy is good to improve reliability, shorter questionnaires are preferred when time in a usability study is limited or when a measure of usability is needed as part of a larger survey (which may already be too long).

UMUX

In response to the need for a shorter questionnaire, Finstad introduced the Usability Metric for User Experience (UMUX) in 2010. It’s intended to be similar to the SUS but is shorter and targeted toward the ISO 9241 definition of usability (effectiveness, efficiency, and satisfaction). It contains two positive and two negative items with a 7-point response scale. The four items are:

[This system’s] capabilities meet my requirements.

Using [this system] is a frustrating experience.

[This system] is easy to use.

I have to spend too much time correcting things with [this system].

While reducing length is a good thing, it’s not the only thing to be concerned about when developing an effective questionnaire. Subsequent analyses by Lewis et al. and Borsci et al. found some shortcomings with the items in the UMUX. Most notably, Lewis et al. found that the positively and negatively worded items created an artificial two-factor structure instead of the intended one-factor (unidimensional) structure. This was the same issue Jim Lewis and I found with the SUS.

UMUX-Lite

To improve the UMUX, Lewis et al. proposed a shorter all-positive questionnaire called the UMUX-Lite using the same 7-point scale with the following two items:

[This system’s] capabilities meet my requirements.

[This system] is easy to use.

This shorter version nicely mimics the Technology Acceptance Model (TAM) in content, which provides a measure of perceived usefulness and usability. The TAM was developed around the same time as the SUS and is intended to predict whether participants will use or adopt a new product or technology. Essentially, something needs to be both useful and usable to ensure adoption and continued use.

Lewis et al. later validated the UMUX-Lite in a follow-up study with additional data and found it continued to show high internal reliability, similar to the SUS (UMUX-Lite alpha = .86 vs. SUS alpha = .91).
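For readers who want to check reliability on their own data, the alpha values above refer to Cronbach's alpha. The sketch below uses the standard formula with Python's standard library. The function name and the response data are made up for illustration and do not come from the Lewis et al. study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` holds one list of responses per questionnaire item, with
    respondents in the same order in every list.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent (sum across items).
    totals = [sum(row) for row in zip(*items)]
    sum_item_variances = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical 7-point responses from six respondents (illustration only).
capabilities = [6, 5, 7, 4, 6, 3]
ease_of_use = [6, 4, 7, 5, 5, 3]
alpha = cronbach_alpha([capabilities, ease_of_use])
```

With only two items, alpha depends entirely on how strongly the two items covary, so it will be undefined if every respondent has the same total score.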

The UMUX-Lite also correlated highly with the SUS (r = .83) and the Net Promoter Score (r = .72), similar to our findings. Lewis et al. provided a regression equation to predict SUS scores from the two items (a version they call the UMUX-LITEr) and found that it could predict SUS scores with about 99% accuracy. The regression equation is:

SUS Score = 0.65 × ((UMUX-Lite Item 1 + UMUX-Lite Item 2 − 2) × (100/12)) + 22.9
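As a minimal sketch, the equation above can be written as two small Python functions for the 7-point version. The function names and the range check are our own additions for illustration; the constants come directly from the equation.

```python
def umux_lite_raw(item1, item2):
    """Rescale the two 7-point UMUX-Lite items to a 0-100 score."""
    for value in (item1, item2):
        if not 1 <= value <= 7:
            raise ValueError("responses must be on the 1-7 scale")
    # Subtract 2 so the lowest possible sum is 0, then stretch 0-12 to 0-100.
    return (item1 + item2 - 2) * (100 / 12)


def predicted_sus(item1, item2):
    """Regression-adjusted UMUX-Lite score (predicted SUS score)."""
    return 0.65 * umux_lite_raw(item1, item2) + 22.9


# item1: "capabilities meet my requirements", item2: "easy to use"
score = predicted_sus(6, 5)
```

One consequence of the equation is that predicted scores can only fall between 22.9 (both items rated 1) and 87.9 (both items rated 7), so the adjusted score cannot reach the extremes of the 0 to 100 SUS range.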

This relationship was confirmed again with a separate data set, and Lewis et al. and Borsci et al. again found only a 1% error (99% accuracy) between the UMUX-Lite scores and the SUS. Berkman & Karahoca reported a similar finding for the unadjusted version.

UMUX-Lite (5-point version)

Intrigued by these results, we began to collect data on the SUS and UMUX-Lite ourselves as part of our ongoing benchmark efforts. This was efficient to do because the second item in the UMUX-Lite, “[This system] is easy to use,” is common in UX questionnaires. It appears in the SUS, SUPR-Q, and TAM, and is similar to the SEQ.

While there is some advantage in using the two 7-point scales Finstad and Lewis used, we wanted to use a 5-point scale for two reasons:

  1. To leverage existing norms on the SUS. (We have over 10k SUS scores and values on the 5-point ease of use item.)
  2. Because many questionnaires use the common 5-point Likert format (including the SUS), it’s easier to insert the additional item(s).

How to Use the UMUX-Lite

To use the UMUX-Lite, administer it just like the SUS on a 5- or 7-point agree/disagree scale. Like the SUS, it can be administered after a usability test or as part of a larger survey. We’ve found the UMUX-Lite particularly helpful for benchmarking many internal IT products—it’s short and easily fits into surveys.

How to Score and Report

There are a few ways to present the UMUX-Lite.

  1. As a mean for each item separately, or as a single average of the two items (figure below).
  2. As a SUS-equivalent score, by converting the average of the two items using a variation of the Lewis regression equation.
  3. As percentile ranks for each item. We’re building a normalized database similar to the one for the SUS so the UMUX-Lite can be displayed as a percentile rank. For now, we’re using the common threshold of 4 and converting the raw scores to z-scores and then percentile ranks. These percentages can then be displayed in a 2×2 usability × usefulness quadrant like the one shown in the figure below from the consumer software report. This quadrant shows that Microsoft Word is considered above average in both usability and usefulness. In contrast, iCloud was rated low in both usability and usefulness.
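The first and third reporting options can be sketched in Python as follows. The exact z-score procedure is not spelled out above, so this sketch assumes z = (item mean − 4) / sample standard deviation, converted to a percentile with the standard normal distribution. The function name and the response data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def item_summary(responses, benchmark=4.0):
    """Return (mean, z-score, percentile rank) for one 5-point item.

    Assumption: the z-score compares the sample mean to the benchmark
    in units of the sample standard deviation.
    """
    m = mean(responses)
    z = (m - benchmark) / stdev(responses)
    percentile = NormalDist().cdf(z) * 100
    return m, z, percentile


# Hypothetical 5-point responses from eight respondents (illustration only).
capabilities = [5, 4, 4, 3, 5, 4, 5, 4]
ease_of_use = [4, 4, 3, 3, 5, 4, 4, 2]

usefulness_mean, usefulness_z, usefulness_pct = item_summary(capabilities)
usability_mean, usability_z, usability_pct = item_summary(ease_of_use)

# Option 1: a single aggregate, the average of the two item means.
umux_lite_mean = mean([usefulness_mean, usability_mean])
```

The two percentile ranks are the coordinates a product would get in the usability by usefulness quadrant. An item mean exactly at the benchmark of 4 lands at the 50th percentile under this assumption.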

Ongoing & Future Research

The UMUX-Lite is continually being evaluated and more research is needed to confirm its validity and reliability (especially if it becomes an alternative to the SUS) and to refine the regression equation for predicting the SUS. One open question is how flexible the wording can be on Item 1 (capabilities meet my requirements).

For example, one client has asked us whether “features” or “functions” can be substituted for “capabilities”. Slight changes in wording to the SUS (e.g., “cumbersome” to “awkward”) add clarity without sacrificing reliability, and we’ll see if the same holds for the UMUX-Lite. The UMUX-Lite shows promise as a shorter instrument that is similarly effective to the SUS for evaluating perceptions of the user experience.
