What Metrics Has MeasuringU Created?

Jeff Sauro, PhD • Jim Lewis, PhD

At MeasuringU®, we don’t just use UX metrics—we create them.

But what have we created, and what have we just used or extended?

Across our combined careers, we (Jeff and Jim) have published 16 psychometrically qualified UX metrics (some original questionnaires, some modifications of existing ones), plus a method for combining prototypical usability metrics, and we have made major contributions to a popular standardized UX questionnaire that we did not create, the System Usability Scale (SUS).

In this article, we briefly describe each of these metrics (presented in roughly reverse chronological order by decade) and provide key links to more information about them (so you won’t need to ask ChatGPT and risk hallucinated references).

2020–2025

From 2020 to 2025, we developed and published four standardized UX questionnaires: UX-Lite®, SUPR-Qm® V2, TAC-10™, and PWCQ.

UX-Lite®

The UX-Lite has its roots in the UMUX-LITE (more on the UMUX-LITE below). It’s a two-item questionnaire, essentially a miniature version of the Technology Acceptance Model (TAM), that assesses the perceived ease of use and perceived usefulness of products and services with two five-point scales. It is an increasingly popular metric in UX research and practice.
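In practice, responses to the two five-point items are usually rescaled to a 0–100 scale. The sketch below assumes the common approach of shifting each rating to a 0–4 range and rescaling the sum; treat it as an illustration of that rescaling rather than a definitive specification of the UX-Lite scoring rules.

```python
def ux_lite_score(ease: int, usefulness: int) -> float:
    """Rescale two five-point UX-Lite-style ratings (1-5) to 0-100."""
    for rating in (ease, usefulness):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    # Each item contributes 0-4 points after shifting, so 8 is the maximum.
    return (ease - 1 + usefulness - 1) / 8 * 100
```

For example, ratings of 4 (ease) and 3 (usefulness) rescale to 62.5 on the 0–100 scale.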

From 2020 to 2024, we published 15 articles on the UX-Lite, many of which explored different ways to phrase the “usefulness” item because its original wording was overly complex. This research demonstrated the reliability and validity of the UX-Lite and showed that it is useful in regression and structural equation modeling of higher-level outcome metrics such as ratings of overall experience, behavioral intentions (e.g., likelihood to recommend, likelihood to reuse), and actual user behaviors.

Key Characteristics

  • Measures: Perceived ease of use and perceived usefulness
  • Number of items: 2
  • Reliability: 0.75 (coefficient alpha unless otherwise specified)
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 2 (single-item scales)
  • Interpretative norms: Yes
  • Development method: Classical test theory

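The reliability values reported throughout this article are coefficient (Cronbach’s) alpha unless otherwise noted. For reference, a minimal computation of alpha from a respondents-by-items matrix looks like this (a textbook implementation, not MeasuringU’s analysis code):

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Coefficient alpha from rows of respondent ratings.

    item_scores: one list per respondent, each containing that
    respondent's rating for every item on the questionnaire.
    """
    k = len(item_scores[0])  # number of items
    # Variance of each item's ratings across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of respondents' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When items are perfectly correlated (every respondent gives the same rating to both items), alpha is 1; as item responses become independent, alpha falls toward 0.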
Key Links & Publications

SUPR-Qm® V2

The mobile app experience is a unique and defining aspect of our interactions with our devices. While the experience shares many characteristics with using software and websites on a traditional monitor, the mobility, screen size, and interaction style make the experience distinct. Consequently, we developed a questionnaire, the SUPR-Qm, to measure attitudes toward the mobile app user experience. In 2025, we published the second version of the SUPR-Qm, reducing the number of items from the original 16 to five.

Key Characteristics

  • Measures: Intensity of the UX of mobile apps
  • Number of items: 5
  • Reliability: 0.83
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 0
  • Interpretative norms: Yes
  • Development method: Rasch scaling

Key Links & Publications

TAC-10™

We based the TAC-10 on research conducted at MeasuringU from 2015 through 2023 and presented it at UXPA 2024. The TAC-10 is a select-all-that-apply checklist of ten different technical activities. We published six blog articles in 2023 detailing its development, including why there was a need for a measure of tech savviness in UX research (to enable discrimination of interface and participant characteristics when analyzing UX data) and how to use the TAC-10 to classify participants into different levels of tech savviness.
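Because the TAC-10 is a select-all-that-apply checklist, classification reduces to counting checked activities and bucketing the count. The cutoffs below are hypothetical placeholders for illustration only; the published TAC-10 classification uses its own empirically derived thresholds.

```python
def tech_savviness_level(checked: list) -> str:
    """Bucket a TAC-10-style checklist into a savviness level.

    checked: ten booleans, one per technical activity.
    NOTE: the cutoffs (3 and 6) are illustrative assumptions,
    not the published TAC-10 thresholds.
    """
    if len(checked) != 10:
        raise ValueError("TAC-10 has exactly 10 activities")
    score = sum(checked)  # number of activities selected
    if score <= 3:
        return "low"
    if score <= 6:
        return "medium"
    return "high"
```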

Key Characteristics

  • Measures: Level of tech savviness
  • Number of items: 10
  • Reliability: 0.67 (Spearman–Brown for dichotomous data)
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 0
  • Interpretative norms: Yes
  • Development method: Rasch scaling

Key Links & Publications

PWCQ

In our UX research practice, we frequently encounter users and designers who criticize website interfaces for being cluttered and stakeholders who worry about the experiential and business consequences of a cluttered website. But what exactly does it mean for a website to appear cluttered? To answer this question, we developed the Perceived Website Clutter Questionnaire (PWCQ), a five-item questionnaire with two subscales: Content Clutter and Design Clutter.

Key Characteristics

  • Measures: The perceived clutter of websites
  • Number of items: 5
  • Reliability: 0.90
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 2
    • Content Clutter: Reliability = 0.91
    • Design Clutter: Reliability = 0.88
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

2010–2019

From 2010 through 2019, we (Jeff and Jim) both collaborated and worked separately on the creation and publication of seven standardized UX questionnaires, plus the publication of books, papers, and numerous articles on how to use and interpret the SUS.

SUPR-Q®

At MeasuringU, we originally benchmarked websites using the SUS. But we knew that the quality of the website user experience was more than just usability, so we developed the Standardized User Experience Percentile Rank Questionnaire (SUPR-Q) in 2011 and published our findings in 2015. The SUPR-Q is a short (eight-item) questionnaire that measures perceptions of Usability, Trust, Appearance, and Loyalty for websites. The combined score provides an overall measure of the quality of the website user experience. The normative percentile database contains responses from more than 10,000 participants and 150 websites (updated on an ongoing basis, about once per quarter).

Key Characteristics

  • Measures: Perceptions of the quality of UX with websites
  • Number of items: 8
  • Reliability: 0.90
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 4
    • Usability: Reliability = 0.88
    • Trust: Reliability = 0.87
    • Appearance: Reliability = 0.80
    • Loyalty: Reliability = 0.73
  • Interpretative norms: Yes
  • Development method: Classical test theory

Key Links & Publications

SUPR-Qm®

Our original version of the mobile app questionnaire had 16 items selected from a larger set using Rasch scaling. We list this here for historical purposes, but our current practice is to use the SUPR-Qm V2 (see above).

Key Characteristics

  • Measures: Intensity of the UX of mobile apps
  • Number of items: 16
  • Reliability: 0.94
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 0
  • Interpretative norms: Yes
  • Development method: Rasch scaling

Key Links & Publications

UMUX-LITE

The UMUX-LITE is a mini-TAM with two seven-point items, assessing perceived ease of use and perceived usefulness. It was derived from the four-item UMUX (Usability Metric for User Experience) when Jim was at IBM (in collaboration with Brian Utesch and Deb Maher) and is the predecessor to the UX-Lite. At MeasuringU, we prefer the UX-Lite (described above) due to its enhanced flexibility, but the UMUX-LITE is also used in current UX research and practice.

Key Characteristics

  • Measures: Perceived ease of use and usefulness
  • Number of items: 2
  • Reliability: 0.83
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 2 (single-item scales)
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

MOS-X2

As part of his work on speech systems at IBM, Jim and his collaborators developed variants of the Mean Opinion Scale (MOS), which others had first published in the 1990s. The MOS-X2 is the culmination of that research: a four-item questionnaire that assesses four key characteristics of user experiences with synthetic voices: Intelligibility, Naturalness, Prosody, and Social Impression.

Key Characteristics

  • Measures: The perceived intelligibility, naturalness, prosody, and social impression of synthetic voices
  • Number of items: 4
  • Reliability: 0.85
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 4 (single-item scales)
  • Interpretative norms: Yes
  • Development method: Classical test theory

Key Links & Publications

SUISQ-R

The original version of the Speech User Interface Service Quality (SUISQ) questionnaire was developed at IBM and published by Melanie Polkosky in 2008. During its development, participants rated the quality of recorded interactions rather than interactions in which they participated, leaving open the question of how well the findings would generalize from observed to personal interactions. Collaborating at State Farm, Jim and Mary Hardzinski collected SUISQ data in a large-sample usability study, (1) replicated the factor structure of the original, and (2) used item analysis to reduce the questionnaire from 25 to 14 items (yielding the SUISQ-R) while still adequately measuring its four subscales: User Goal Orientation, Customer Service Behaviors, Speech Characteristics, and Verbosity.

Key Characteristics

  • Measures: Service quality of speech applications
  • Number of items: 14
  • Reliability: 0.88
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 4
    • User Goal Orientation: Reliability = 0.91
    • Customer Service Behavior: Reliability = 0.88
    • Speech Characteristics: Reliability = 0.80
    • Verbosity: Reliability = 0.67
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

EMO

The Emotional Metric Outcomes (EMO) questionnaire was also developed while Jim was consulting at State Farm. His collaborators at State Farm wanted a standardized questionnaire for assessing the emotional consequences of interaction with a company. They published the EMO in both long (16 items) and short (8 items) versions, measuring four subscales: Positive Relationship Affect, Negative Relationship Affect, Positive Personal Affect, and Negative Personal Affect. The key characteristics below are for the more efficient short version.

Key Characteristics

  • Measures: Emotional consequence of interacting with a company
  • Number of items: 8
  • Reliability: 0.88
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 4
    • Positive Relationship Affect: 0.89
    • Negative Relationship Affect: 0.72
    • Positive Personal Affect: 0.83
    • Negative Personal Affect: 0.82
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

mTAM

The mTAM is a modified version of the TAM (Technology Acceptance Model), a questionnaire developed in the 1990s to assess the drivers of technology acceptance. In its original version, the TAM had 12 items measuring two subscales, Perceived Ease-of-Use and Perceived Usefulness, with items worded to focus on potential future use. For the mTAM, the only modification was to change the focus to ratings of actual use. Note that we do not use this as a practical UX questionnaire, but we have used it when exploring how other standardized metrics work within the Technology Acceptance Model.

Key Characteristics

  • Measures: Perceived ease of use and perceived usefulness
  • Number of items: 12
  • Reliability: 0.95
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 2
    • Perceived Ease of Use: Reliability = 0.95
    • Perceived Usefulness: Reliability = 0.95
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

SUS

No, we didn’t develop the System Usability Scale (SUS)—that honor goes to John Brooke—but between 2010 and 2019, we conducted extensive research to improve its interpretability and flexibility, with publications listed in the Key Links below.
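The SUS is simple to score. Under Brooke’s standard procedure, each of the ten items is rated 1–5; odd-numbered (positive-tone) items contribute the rating minus 1, even-numbered (negative-tone) items contribute 5 minus the rating, and the sum is multiplied by 2.5 to yield a 0–100 score:

```python
def sus_score(responses) -> float:
    """Compute a SUS score (0-100) from ten 1-5 ratings, Brooke's scoring."""
    if len(responses) != 10:
        raise ValueError("The SUS has exactly 10 items")
    total = 0
    for i, rating in enumerate(responses, start=1):
        # Odd items: rating - 1; even items: 5 - rating.
        total += (rating - 1) if i % 2 == 1 else (5 - rating)
    return total * 2.5
```

A respondent who strongly agrees with every positive item and strongly disagrees with every negative item scores 100; a neutral 3 on every item scores 50.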

Key Characteristics

  • Measures: Perceived usability
  • Number of items: 10
  • Reliability: 0.91
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 0
  • Interpretative norms: Yes
  • Development method: Classical test theory

Key Links & Publications

2000–2009

In this decade, Jeff introduced the Single Ease Question (SEQ) and a method for combining multiple UX metrics into a Single Usability Metric (SUM). Jim and colleagues at IBM investigated and published enhancements to the Mean Opinion Scale (MOS), most notably the MOS-X.

SEQ®

The concepts of ease of use and usability are deeply intertwined. No one knows who the first person was to ask someone to rate the ease of completing a task in a usability study, but in 2009, Jeff and Joe Dumas were the first to publish a version of the item that is now known as the Single Ease Question (SEQ). Since its initial publication, it has undergone some cosmetic changes, and research has established good norms for its interpretation.

Key Characteristics

  • Measures: Perceived ease of completing a task in a usability study
  • Number of items: 1
  • Reliability: 0.80 (test-retest)
  • Types of Validity: Content, concurrent
  • Number of subscales: 0
  • Interpretative norms: Yes
  • Development method: Classical test theory

Key Links & Publications

SUM

The Single Usability Metric (SUM) is not a standardized questionnaire, so there is no list of key characteristics in this section. Instead, it is a standardized method for combining prototypical usability metrics such as completion rates, completion times, and subjective ratings (e.g., satisfaction or ease), an important step toward a unified measure of usability that we continue to use in benchmark studies.
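The core idea of combining metrics on different scales is to convert each one to a common unit before averaging. The sketch below is a simplified illustration of that idea under a normality assumption: continuous metrics are compared to a specification limit via a z-score and converted to an estimated proportion of users meeting the spec. The spec limits (`time_spec`, `rating_spec`) and the equal-weight average are assumptions for this sketch, not the exact published SUM procedure.

```python
from statistics import NormalDist, mean, stdev

def metric_to_percent(values, spec, higher_is_better=True):
    """Estimate the proportion of users meeting a spec limit,
    assuming the metric is roughly normally distributed."""
    z = (mean(values) - spec) / stdev(values)
    if not higher_is_better:
        z = -z  # e.g., for task times, lower is better
    return NormalDist().cdf(z)

def single_usability_metric(completion_rate, times, ratings,
                            time_spec, rating_spec):
    """Illustrative SUM-style combination: average the success rate
    with spec-based estimates for time and satisfaction."""
    time_pct = metric_to_percent(times, time_spec, higher_is_better=False)
    sat_pct = metric_to_percent(ratings, rating_spec, higher_is_better=True)
    return mean([completion_rate, time_pct, sat_pct])
```

With a 0.8 completion rate and time and satisfaction data centered exactly on their spec limits (so each contributes 0.5), the combined score is 0.6.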

Key Links & Publications

MOS-X

The Mean Opinion Scale-Expanded (MOS-X) is a 15-item questionnaire developed at IBM to obtain listeners’ subjective assessments of synthetic speech on four dimensions: Intelligibility, Naturalness, Prosody, and Social Impression. In current practice, we have replaced this questionnaire with the four-item MOS-X2 (described above).

Key Characteristics

  • Measures: The perceived intelligibility, naturalness, prosody, and social impression of synthetic voices
  • Number of items: 15
  • Reliability: 0.93
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 4
    • Intelligibility: 0.88
    • Naturalness: 0.86
    • Prosody: 0.86
    • Social Impression: 0.86
  • Interpretative norms: Yes
  • Development method: Classical test theory

Key Links & Publications

1990–1999

This was the decade in which Jim published his first three standardized UX questionnaires at IBM: ASQ, PSSUQ, and CSUQ. They continue to be used in UX research and practice, but we don’t use them at MeasuringU because they’re a bit antiquated in style and content (included in this article for historical completeness).

ASQ

The After-Scenario Questionnaire (ASQ) was an early attempt to develop a concise but comprehensive questionnaire to administer after tasks in usability studies; its three items collect ratings of satisfaction with ease of completion, completion time, and support information.

Key Characteristics

  • Measures: Task-level satisfaction
  • Number of items: 3
  • Reliability: 0.93
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 0
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

PSSUQ

The Post-Study System Usability Questionnaire (PSSUQ) was an early standardized usability questionnaire administered at the end of a usability study. It contains three subscales: System Usefulness, Information Quality, and Interface Quality (most recently revised slightly as Version 3).

Key Characteristics

  • Measures: Perceived usability
  • Number of items: 16
  • Reliability: 0.96
  • Types of Validity: Content, construct, concurrent
  • Number of subscales: 3
    • System Usefulness: Reliability = 0.96
    • Information Quality: Reliability = 0.92
    • Interface Quality: Reliability = 0.83
  • Interpretative norms: No
  • Development method: Classical test theory

Key Links & Publications

CSUQ

The Computer System Usability Questionnaire (CSUQ) is a version of the PSSUQ modified for use as a general standardized UX questionnaire outside of the confines of a usability test (achieved primarily by changing references to “tasks and scenarios” to “work”). Its key characteristics are very close to those found for the PSSUQ.

Key Links & Publications

Summary

From 1990 to 2025, we developed and published 16 standardized UX questionnaires, ranging from general measurement of perceived usability to specialized measurement of the UX of websites and speech applications. Table 1 lists those questionnaires and their key characteristics in reverse chronological order.

| Questionnaire | Measures | Number of Items | Reliability | Number of Subscales | Interpretative Norms | Development Method |
|---|---|---|---|---|---|---|
| UX-Lite | Perceived ease and usefulness | 2 | 0.75 | 2 | Yes | Classical Test Theory |
| SUPR-Qm V2 | Intensity of UX of mobile apps | 5 | 0.83 | 0 | Yes | Rasch Scaling |
| TAC-10 | Level of tech savviness | 10 | 0.67 | 0 | Yes | Rasch Scaling |
| PWCQ | Perceived website clutter | 5 | 0.90 | 2 | No | Classical Test Theory |
| SUPR-Q | Quality of UX of websites | 8 | 0.90 | 4 | Yes | Classical Test Theory |
| SUPR-Qm | Intensity of UX of mobile apps | 16 | 0.94 | 0 | Yes | Rasch Scaling |
| UMUX-LITE | Perceived ease and usefulness | 2 | 0.83 | 2 | No | Classical Test Theory |
| MOS-X2 | UX of synthetic voices | 4 | 0.85 | 4 | Yes | Classical Test Theory |
| SUISQ-R | Service quality of speech apps | 14 | 0.88 | 4 | No | Classical Test Theory |
| EMO | Emotional interaction | 8 | 0.88 | 4 | No | Classical Test Theory |
| mTAM | Perceived ease and usefulness | 12 | 0.95 | 2 | No | Classical Test Theory |
| SEQ | Perceived task ease | 1 | 0.80 | 0 | Yes | Classical Test Theory |
| MOS-X | UX of synthetic voices | 15 | 0.93 | 4 | Yes | Classical Test Theory |
| ASQ | Task-level usability | 3 | 0.93 | 0 | No | Classical Test Theory |
| PSSUQ | Study-level usability | 16 | 0.96 | 3 | No | Classical Test Theory |
| CSUQ | Computer usability | 16 | 0.97 | 3 | No | Classical Test Theory |

Table 1: Summary of standardized questionnaires created by MeasuringU researchers (all questionnaires have published evidence of content and concurrent validity; all except the SEQ have construct validity).

In addition to these questionnaires, the SUM, a method for combining prototypical usability metrics, was created at MeasuringU.

And even though we did not create the SUS, we have published numerous studies on making it more flexible and interpretable (e.g., curved grading scale and item benchmarks).
