SUM is a standardized, summated and single usability metric. It was developed to represent the majority of variation in four common usability metrics used in summative usability tests: task completion rates, task time, satisfaction and error counts. The theoretical foundations of SUM are based on a paper presented at CHI 2005, “A Method to Standardize Usability Metrics into a Single Score,” by Sauro and Kindlund.
Usability ScoreCard (added June 2007)
The UsabilityScorecard web application takes raw usability metrics (completion, time, satisfaction, errors and clicks), calculates confidence intervals and graphs the results automatically. You can also combine any two, three or four of the metrics into a single combined score. Data can be imported from Excel (.csv) and exported to Word (.rtf).
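This page does not say which interval method the scorecard uses. As a rough sketch only, assuming an adjusted-Wald (Agresti-Coull) interval for a task completion rate, the calculation could look like the following. The function name and the 95% default are illustrative choices, not taken from the tool.

```python
import math


def adjusted_wald_interval(successes, trials, z=1.96):
    """Adjusted-Wald (Agresti-Coull) interval for a completion rate.

    Assumption: this is one common choice for small-sample binomial data;
    the UsabilityScorecard itself may use a different method.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    # Add z^2/2 successes and z^2 trials before computing a Wald interval.
    n_adj = trials + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Example: 9 of 12 participants completed the task.
low, high = adjusted_wald_interval(9, 12)
print(f"completion rate 95% interval: {low:.3f} to {high:.3f}")
```

With small samples the interval is wide, which is one reason to report it alongside the observed completion rate rather than the rate alone.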
The SUM calculator takes raw usability metrics and converts them into a SUM score with confidence intervals. The analyst needs to provide the raw metrics on a task-by-task basis and know the opportunity for errors in each task. SUM will automatically calculate the maximum acceptable task time, or it can be provided. This calculator is an Excel-based version of the UsabilityScorecard, except that it can only combine four measures (time, errors, satisfaction and completion) and does not graph the results.
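The page does not spell out the calculation, so the following is only a minimal sketch of how four task-level metrics might be standardized and averaged into one score. It assumes that completion and errors are treated as proportions, that time and satisfaction are converted to z-scores against a target and mapped through the normal distribution, and that the four results are weighted equally. The function name, parameter names and the example targets are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def sum_score(completed, times, max_time, satisfaction, sat_target,
              errors, error_opportunities):
    """Combine four metrics for one task into a single 0-1 score.

    Assumptions (not confirmed by the page): equal weights, normal
    approximation for time and satisfaction, and at least two
    non-identical observations so the standard deviations are nonzero.
    """
    # Proportion of participants who completed the task (1 = success).
    completion = mean(completed)
    # Proportion of error opportunities that did not produce an error.
    error_free = 1 - sum(errors) / (len(errors) * error_opportunities)
    # Estimated proportion of times at or below the maximum acceptable time.
    time_ok = NormalDist().cdf((max_time - mean(times)) / stdev(times))
    # Estimated proportion of ratings at or above the satisfaction target.
    sat_ok = NormalDist().cdf(
        (mean(satisfaction) - sat_target) / stdev(satisfaction))
    return mean([completion, error_free, time_ok, sat_ok])


# Example: four participants on one task with five error opportunities.
score = sum_score(
    completed=[1, 1, 1, 0],
    times=[40, 50, 60, 70], max_time=55,
    satisfaction=[4, 5, 3, 4], sat_target=4,
    errors=[0, 1, 0, 1], error_opportunities=5,
)
print(f"combined score: {score:.4f}")
```

Here the maximum acceptable time is passed in by the analyst; deriving it automatically, as the calculator can, is left out of the sketch.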
- Why would you want a single usability metric?
If you agree that to truly know the usability of a product means measuring its usability, then you must necessarily ask: how do you measure usability? For us, that’s SUM–a composite of multiple measures that all attempt to measure usability.
The bulk of usability activities are formative, that is, their major intent is to uncover and fix usability problems in a user interface. A single score would not be as beneficial here. It is most beneficial during a benchmarking test or summative assessment when you want to know how usable the application is, as opposed to what the usability problems are. Before you can say how usable something is, you need to be able to measure usability.
Why would you want to measure usability? If you cannot measure the construct of usability, or agree on what accounts for usability in some measurable way, then how will you know whether any of the UE activities actually make the product more usable? This is a larger philosophical question that has been asked in many contexts. Here are some examples:
Measurement is at the heart of our scientific method. “Numerical precision is the very soul of science.” D’Arcy Wentworth Thompson, On Growth and Form (1917)
“If you can’t measure it, you can’t manage it.” (Old Management Saying)
“When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of Science.” Lord Kelvin (William Thomson), pioneer in thermodynamics and electricity (1891).