The UMUX-Lite Usefulness Item: Assessing a “Useful” Alternate

Jim Lewis, PhD • Jeff Sauro, PhD

When Kraig Finstad (2010) developed the Usability Metric for User Experience (UMUX), his goal was to replace the ten-item System Usability Scale (SUS, a popular measure of perceived usability) with a shorter questionnaire that would (1) correlate highly with the SUS and (2) have item content related to the ISO 9241 Part 11 international standard, which defines usability as having components of effectiveness, efficiency, and satisfaction. The questionnaire Finstad published had four seven-point agreement items (two positive-tone, two negative-tone):

  • Effectiveness: {System}’s capabilities meet my requirements.
  • Satisfaction: Using {System} is a frustrating experience.
  • Overall: {System} is easy to use.
  • Efficiency: I have to spend too much time correcting things with {System}.

While attempting to replicate Finstad’s findings, Lewis et al. (2013) discovered that using just the positive-tone items provided a metric that strongly corresponded to concurrently collected SUS scores. They named this new metric the UMUX-Lite. They also noted that although Finstad drew his item content from ISO 9241 Part 11, the two items of the UMUX-Lite—“{System}’s capabilities meet my requirements” and “{System} is easy to use”—are also related to the two major components of the Technology Acceptance Model (TAM), Perceived Usefulness and Perceived Ease-of-Use.

Later research (Lah et al., 2020) demonstrated a strong structural relationship between the items of the UMUX-Lite and the components of the TAM when predicting concurrently collected ratings of overall experience and likelihood-to-recommend (LTR).

After using the UMUX-Lite in our practice (and in response to client questions on the topic), we began to wonder whether we could use simpler versions of the Usefulness item without changing the fundamental measurement properties of the UMUX-Lite. To date, we’ve found that the following alternates produce essentially equivalent overall UMUX-Lite scores relative to the standard Usefulness item in combination with the Ease item:

  • {Product}’s functionality meets my needs.
  • {Product}’s features meet my needs.
  • {Product}’s functions meet my needs.
  • {Product} does what I need it to do.

An obvious additional alternate version would be to rate the simple statement “{Product} is useful.” After all, we assume that this construct is what the original UMUX-Lite and our alternates are measuring—so we ran a study to find out.

The Study

We included the standard version of the UMUX-Lite and a version using the alternate Usefulness item (“{Product} is useful”—see Figure 1) as part of a survey of two types of online sellers (Mass Merchants: Amazon, Walmart, Target, Walgreens, CVS, Wayfair; Seller Marketplaces: eBay, Facebook Marketplace, Etsy, Craigslist) collected from March 4–5, 2021, using an online U.S.-based panel (n = 201).

Figure 1: The alternate version of the UMUX-Lite used in this study.

Respondents were randomly assigned to rate two companies using both versions of the UMUX-Lite in one of the following orders:

  • Order 1: Mass Merchant with standard version then Seller Marketplace with alternate version
  • Order 2: Mass Merchant with alternate version then Seller Marketplace with standard version
  • Order 3: Seller Marketplace with standard version then Mass Merchant with alternate version
  • Order 4: Seller Marketplace with alternate version then Mass Merchant with standard version

With this Greco-Latin experimental design, we could focus on a within-subjects comparison of the standard and alternate versions of the UMUX-Lite.
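The four orders above can be sketched as a simple assignment routine. This is only an illustration: the article says respondents were randomly assigned but does not say how, so the balanced shuffle, the `ORDERS` table, and the `assign_orders` helper are all assumptions.

```python
import random
from collections import Counter

# The four presentation orders described above. Each order is a pair of
# (seller type, UMUX-Lite version) blocks, shown first and second.
ORDERS = [
    (("Mass Merchant", "standard"), ("Seller Marketplace", "alternate")),
    (("Mass Merchant", "alternate"), ("Seller Marketplace", "standard")),
    (("Seller Marketplace", "standard"), ("Mass Merchant", "alternate")),
    (("Seller Marketplace", "alternate"), ("Mass Merchant", "standard")),
]


def assign_orders(n_respondents, seed=None):
    """Return one order index (0-3) per respondent, balanced as evenly as possible.

    Assumption: balanced random assignment. The study may have used a
    different randomization scheme.
    """
    rng = random.Random(seed)
    # Cycle through 0..3 so group sizes differ by at most one, then shuffle.
    pool = [i % len(ORDERS) for i in range(n_respondents)]
    rng.shuffle(pool)
    return pool


assignments = assign_orders(201, seed=1)
group_sizes = Counter(assignments)
print(sorted(group_sizes.values()))
```

Because every respondent sees both versions (one per seller type), each person contributes one standard and one alternate score, which is what makes the within-subjects comparison possible.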

Results: Comparison of Means

Figure 2 shows the mean UMUX-Lite scores from the study. There was no significant difference between the standard and alternate UMUX-Lite scores (t(200) = −.89, p = .38).

Figure 2: Mean UMUX-Lite scores (with 95% confidence intervals).
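For readers who want to see how two item ratings become a single 0–100 UMUX-Lite score, here is a minimal sketch of the usual linear rescaling. It assumes five response options per item (the article's discussion of top-box scores suggests this, but the format is not stated outright); the `umux_lite` function name and the `points` parameter are illustrative.

```python
def umux_lite(ease, useful, points=5):
    """Rescale the Ease and Usefulness ratings to a 0-100 score.

    Each rating runs from 1 to `points`. Subtracting 1 from each item
    puts it on a 0-based scale; dividing by the maximum possible sum
    and multiplying by 100 gives the 0-100 score.
    """
    for rating in (ease, useful):
        if not 1 <= rating <= points:
            raise ValueError(f"rating {rating} is outside 1..{points}")
    return (ease + useful - 2) / (2 * (points - 1)) * 100


print(umux_lite(4, 5))            # five-point items
print(umux_lite(4, 4, points=7))  # seven-point items, as in Finstad's UMUX
```

The same function scores either version of the questionnaire, because only the wording of the Usefulness item changes, not the response scale.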

The mean difference was −1.43, with a 95% confidence interval from −4.6 to 1.7. The confidence interval contains 0, so it’s plausible that there is no real difference in means. If there is a difference, the confidence interval shows us that it’s small, unlikely to exceed 4.6 points on the 0–100-point UMUX-Lite scale.
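The sketch below shows one way to compute a paired t statistic and confidence interval, and then checks that the reported numbers hang together. The raw data are not available here, so the second half works backward from the reported mean difference and t value. The `paired_t` helper is illustrative, and the critical value 1.972 (two-sided 95%, 200 degrees of freedom) is hard-coded to avoid a dependency on a statistics library.

```python
import math
import statistics


def paired_t(first, second, t_crit):
    """Paired t-test summary: mean difference, t statistic, and CI.

    `first` and `second` are per-respondent scores in the same order.
    `t_crit` is the two-sided critical t value for n - 1 degrees of freedom.
    """
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    half_width = t_crit * se
    return mean_diff, mean_diff / se, (mean_diff - half_width, mean_diff + half_width)


# Consistency check using only the values reported in the article.
T_CRIT_DF200 = 1.972          # t(0.975, df = 200)
reported_mean_diff = -1.43
reported_t = -0.89
implied_se = reported_mean_diff / reported_t   # approximate: t is rounded
lo = reported_mean_diff - T_CRIT_DF200 * implied_se
hi = reported_mean_diff + T_CRIT_DF200 * implied_se
print(round(lo, 1), round(hi, 1))  # -4.6 1.7
```

The back-calculated interval matches the reported one, which is what we would expect if the interval came from the same paired analysis as the t-test.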

Results: Comparison of Distribution of Response Options

Figure 3 shows the distributions of response options for the standard and alternate versions of the Usefulness item. The response patterns were similar, and for each response option, there was a substantial overlap of 95% confidence intervals.

In addition to the mean, it’s common in UX research to assess data collected using multipoint scales with top-box scores. When there are five response options, both top-one (percentage of 5s) and top-two (percentage of 4s and 5s) box scores are commonly used.
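As a quick illustration of the top-box definitions above, the following sketch computes top-one and top-two box percentages from a list of ratings. The `top_box` function and the sample ratings are made up for illustration and are not data from the study.

```python
def top_box(ratings, points=5, boxes=1):
    """Percentage of ratings falling in the top `boxes` response options."""
    threshold = points - boxes + 1   # e.g. 5 for top-one, 4 for top-two
    hits = sum(1 for r in ratings if r >= threshold)
    return 100 * hits / len(ratings)


# Hypothetical ratings on a five-point scale (not study data).
sample = [5, 4, 4, 3, 5, 2, 4, 5]
print(top_box(sample, boxes=1))  # 37.5
print(top_box(sample, boxes=2))  # 75.0
```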

Top-one-box scores were nearly identical in these data (Standard: 27.9%; Alternate: 28.4%). As expected, a McNemar test of the difference between these percentages was not statistically significant (p = 1.0).

There was a greater difference between the top-two-box scores (Standard: 82.1%; Alternate: 87.1%), but again, the McNemar test was not significant (p = .21). Despite this nonsignificant result, we advise researchers to be cautious about using this alternate if planning to report top-two-box scores. We will replicate this investigation in a future study to test the reliability of this difference.
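For readers unfamiliar with the McNemar test, here is a minimal sketch of its exact (binomial) form. The test uses only the discordant pairs: respondents who were in the top box on one version but not the other. The article does not report those counts or say which variant of the test was used, so the function below and the counts passed to it are purely illustrative.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: respondents in the top box for version A only
    c: respondents in the top box for version B only
    Under the null hypothesis, each discordant pair is equally likely
    to fall either way, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts (not study data).
print(mcnemar_exact(5, 5))
print(mcnemar_exact(1, 9))
```

Note that two versions can have similar top-box percentages and still differ in which respondents land in the top box; the test is sensitive only to the imbalance between the two kinds of discordant pairs.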

Figure 3: Distributions of Usefulness response options from the online sellers survey with 95% confidence intervals.

Results: Scale Reliabilities

Scale reliability, measured with coefficient alpha, was .87 for the standard version and .76 for the alternate version. These results exceed the typical reliability criterion of > .70 for research metrics and support the use of either version.
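Coefficient alpha can be computed directly from item-level ratings. The sketch below applies the standard formula to a two-item scale; the `coefficient_alpha` function and the ratings are hypothetical and only show the mechanics.

```python
import statistics


def coefficient_alpha(items):
    """Coefficient (Cronbach's) alpha.

    `items` is a list of rating lists, one per item, with respondents
    in the same order in every list.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    item_variances = [statistics.variance(ratings) for ratings in items]
    totals = [sum(values) for values in zip(*items)]
    return k / (k - 1) * (1 - sum(item_variances) / statistics.variance(totals))


# Hypothetical Ease and Usefulness ratings from five respondents.
ease = [5, 4, 3, 4, 2]
useful = [4, 4, 3, 5, 2]
alpha = coefficient_alpha([ease, useful])
print(round(alpha, 2))
```

With only two items, alpha depends entirely on how strongly the Ease and Usefulness ratings covary, so a lower alpha for the alternate version indicates a somewhat weaker relationship between “{Product} is useful” and the Ease item.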

Summary and Takeaways

The two items of the original UMUX-Lite questionnaire measure (1) Perceived Ease-of-Use: “The system is easy to use” and (2) Perceived Usefulness: “The system’s capabilities meet my requirements.”

We have investigated several alternate versions of the Usefulness item to assess whether a variety of simply worded items have measurement properties similar to the standard version. In this study, we tested another variant with wording directly taken from the hypothesized construct being measured—“{Product} is useful.”

The results of these studies have demonstrated the practical measurement equivalence of five alternative forms of the UMUX-Lite Perceived Usefulness item:

  • {Product}’s functionality meets my needs.
  • {Product}’s features meet my needs.
  • {Product}’s functions meet my needs.
  • {Product} does what I need it to do.
  • {Product} is useful.

UX researchers and practitioners can use any of them in place of the standard version when computing means. Pending planned replication, researchers should be aware of the possibility of small deviations when computing top-one-box scores with “{Product}’s functions meet my needs” and top-two-box scores with “{Product} is useful.”