“Does What I Need It to Do”: Assessing an Alternate Usefulness Item

Jim Lewis, PhD and Jeff Sauro, PhD

The UMUX-Lite is a two-item standardized questionnaire that, since its publication in 2013, has been increasingly adopted by researchers who need a concise UX metric. Figure 1 shows the standard version with its Perceived Ease-of-Use (“{Product} is easy to use”) and Perceived Usefulness (“{Product}’s capabilities meet my requirements”) items.

 

Figure 1: Standard version of the UMUX-Lite.
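The two items are combined into a single 0–100 metric. The following is a minimal sketch of that rescaling, assuming the 5-point response format shown in Figure 1 (the function name is ours, not from the questionnaire's authors):

```python
def umux_lite_score(ease, useful):
    """Combine two 5-point items (each 1-5) into a 0-100 UMUX-Lite score.

    The item sum ranges from 2 to 10; subtracting 2 and dividing by 8
    linearly rescales it to 0-1, then multiplying by 100 gives 0-100.
    """
    return (ease + useful - 2) / 8 * 100
```

For example, a respondent who selects 4 on both items scores 75; selecting 5 on both yields the maximum of 100.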

As we continue to use the UMUX-Lite in our practice, we see opportunities to expand its reach by validating alternate forms of its Usefulness item. To date, we’ve found that the following alternates, in combination with the Ease item, produce essentially equivalent overall UMUX-Lite scores relative to the standard Usefulness item:

  • {Product}’s functionality meets my needs.
  • {Product}’s features meet my needs.
  • {Product}’s functions meet my needs.

In this iteration of our ongoing study of alternate forms, we tested “{Product} does what I need it to do.” We became interested in this version because it has a different sentence structure from “meets my requirements/needs” while expressing a similar sentiment in simpler language. We wondered whether this structural change would lead to significant measurement differences or whether, like many other relatively minor wording and format manipulations we’ve studied, it would behave like the previous alternates.

The Study

We included the standard version of the UMUX-Lite and a version using the alternate Usefulness item (“{Product} does what I need it to do”) as part of a survey of two types of online sellers (Mass Merchants: Amazon, Walmart, Target, Walgreens, CVS, Wayfair; Seller Marketplaces: eBay, Facebook Marketplace, Etsy, Craigslist). We conducted the survey from March through April 2021 using an online U.S.-based panel (n = 260). Respondents were randomly assigned to rate two companies using both versions of the UMUX-Lite in one of the following orders:

  • Order 1: Mass Merchant with standard version then Seller Marketplace with alternate version.
  • Order 2: Mass Merchant with alternate version then Seller Marketplace with standard version.
  • Order 3: Seller Marketplace with standard version then Mass Merchant with alternate version.
  • Order 4: Seller Marketplace with alternate version then Mass Merchant with standard version.

We used this Greco-Latin square experimental design so we could focus on within-subjects comparisons of the standard and alternate versions of the UMUX-Lite.

Results: Comparison of Means

Figure 2 shows the mean UMUX-Lite scores from the study. There was no significant difference between the standard and alternate UMUX-Lite scores (t(259) = .53; p = .60).

Figure 2: Mean UMUX-Lite scores (with 95% confidence intervals).

The mean difference was .72 (less than 1% of the maximum possible difference of 100), with a 95% confidence interval from -2.0 to 3.4. The confidence interval contains 0, so it’s plausible that there is no real difference in means. If there is a real difference, the limits of the confidence interval show that it’s small, unlikely to exceed 3.4 points on the 0–100 UMUX-Lite scale.
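Because each respondent rated products with both versions, the comparison reduces to a paired t-test on the difference scores. Here is an illustrative sketch (a hypothetical helper, not the study’s analysis code; the 1.96 critical value is a normal approximation that is very close to the t critical value for n = 260):

```python
import math

def paired_t(standard, alternate):
    """Paired t-test on two lists of within-subjects scores (0-100 scale).

    Returns (t statistic, degrees of freedom, mean difference, 95% CI).
    The CI uses a normal-approximation critical value of 1.96, which is
    accurate for large samples such as n = 260.
    """
    n = len(standard)
    diffs = [a - s for s, a in zip(standard, alternate)]
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(var_d / n)          # standard error of the mean difference
    t = mean_d / se
    ci = (mean_d - 1.96 * se, mean_d + 1.96 * se)
    return t, n - 1, mean_d, ci
```

As in the analysis above, a confidence interval that contains 0 indicates that any real difference between versions is plausibly absent or small.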

Results: Comparison of Distribution of Response Options

Figure 3 shows the distributions of response options for the standard and alternate versions of the Usefulness item. The response patterns were similar, and there was a substantial overlap of 95% confidence intervals for each response option.

In addition to the mean, it’s common in UX research to assess data collected from multipoint scales with top-box scores. When there are five response options, both top-one (percentage of 5s) and top-two (percentage of 4s and 5s) box scores are commonly used.

For these data, the top-two-box scores were 83.5% for both versions. As expected, a McNemar test of the difference between these percentages was not statistically significant (p = 1.0). The top-one-box scores, as shown in Figure 3, were 33.5% for the standard version and 31.9% for the alternate. Again, the McNemar test was not significant (p = .70).
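The McNemar test uses only the discordant pairs: respondents whose top-box classification differs between the standard and alternate versions. The following is a sketch of the exact (binomial) form of the test, with hypothetical helper names:

```python
from math import comb

def top_box(scores, threshold=4):
    """Proportion of responses at or above threshold.

    With 5-point items, threshold=4 gives the top-two-box score
    (percentage of 4s and 5s); threshold=5 gives the top-one-box score.
    """
    return sum(s >= threshold for s in scores) / len(scores)

def mcnemar_exact(pairs, threshold=4):
    """Exact McNemar test on paired top-box classifications.

    pairs: list of (standard, alternate) ratings from the same respondent.
    Only discordant pairs (top-box on one version but not the other)
    contribute; under the null they split 50/50, so the p-value comes
    from a binomial distribution.
    """
    b = sum(1 for s, a in pairs if s >= threshold and a < threshold)
    c = sum(1 for s, a in pairs if s < threshold and a >= threshold)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    p = 2 * sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(p, 1.0)  # two-sided p, capped at 1
```

When the discordant pairs are evenly split, as with the identical top-two-box percentages reported above, the p-value is 1.0.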

Figure 3: Distributions of Usefulness response options from the online seller survey with 95% confidence intervals.

Results: Scale Reliabilities

Scale reliability, measured with coefficient alpha, was .84 for the standard version and .90 for the alternate version. These exceed the typical reliability criterion of > .70 for research metrics, supporting the use of either version.
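For a two-item scale, coefficient alpha can be computed directly from the item variances and the variance of the total scores: alpha = k/(k − 1) × (1 − Σ item variances / total-score variance). A small illustrative sketch (hypothetical helpers, not the study code):

```python
def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """Coefficient alpha for a list of per-item response lists.

    For the UMUX-Lite, items would hold two lists (Ease and Usefulness),
    each with one rating per respondent.
    """
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]  # per-respondent total score
    item_var = sum(variance(it) for it in items)
    return k / (k - 1) * (1 - item_var / variance(totals))
```

Perfectly correlated items yield an alpha of 1.0; weaker inter-item correlation pulls alpha toward 0.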

Summary and Takeaways

There are two items in the standard UMUX-Lite questionnaire: (1) Perceived Ease-of-Use: “The system is easy to use” and (2) Perceived Usefulness: “The system’s capabilities meet my requirements.”

The standard wording of the Perceived Ease-of-Use item is simple and straightforward, but the standard wording of the Perceived Usefulness item, while grammatically simple, contains two infrequently used multisyllabic words: “capabilities” and “requirements.” In this and three previous studies, we simplified the wording of this item. In earlier studies, we directly replaced the multisyllabic words with simpler words; in the current study, we investigated the measurement properties of a simple phrase (“does what I need it to do”) with a sentence structure that differs from previous alternates (e.g., “meets my needs”).

These results demonstrate the practical measurement equivalence of four alternative forms of the UMUX-Lite Perceived Usefulness item:

  • {Product}’s functionality meets my needs.
  • {Product}’s features meet my needs.
  • {Product}’s functions meet my needs.
  • {Product} does what I need it to do.

UX researchers and practitioners can use any of these alternates in place of the standard version when computing means or top-two-box scores. Pending planned replication, however, researchers should be aware of possible deviations when computing top-one-box scores with “{Product}’s functions meet my needs.”