All Blogs

How many users will complete the task and how long will it take them? If you need to benchmark an interface, a summative usability test is one way to answer these questions. Summative tests are the gold standard for usability measurement. But just how precise are the metrics? Just as a presidential poll uses a sample to estimate outcomes for the entire population, usability tests

Read More

Nielsen derives his "five users is enough" formula from a paper he and Tom Landauer published in 1993. Before Nielsen and Landauer, James Lewis of IBM proposed a very similar problem-detection formula in 1982 based on the binomial probability formula.[4] Lewis stated that: The binomial probability theorem can be used to determine the probability that a problem of probability p will occur r times

Read More
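The binomial model the excerpt above refers to can be sketched in a few lines. The function name is ours, and p = 0.31 is the average problem probability often cited from Nielsen and Landauer's work; treat it as an illustrative assumption, not a universal constant.

```python
# Sketch of the binomial problem-discovery model: the chance that a
# problem affecting a proportion p of users is seen at least once
# in a sample of n users.
def p_detected_at_least_once(p, n):
    """Return 1 minus the probability the problem is missed by all n users."""
    return 1 - (1 - p) ** n

# With p = 0.31 (an assumed average problem probability), five users
# catch a given problem roughly 84% of the time.
print(round(p_detected_at_least_once(0.31, 5), 2))  # -> 0.84
```

This is the arithmetic behind the "five users" claim: it holds only to the extent the assumed p holds for your product and tasks.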

It is common to think of time-on-task data as gathered only during summative evaluations because, during a formative evaluation, the focus is on finding and fixing problems, or at least finding the problems and delivering a report. For a variety of reasons, time-on-task measures often get left out of the mix. In this article, I show that time-on-task can be a valuable diagnostic and comparative tool

Read More

SUM is a standardized, summated, and single usability metric. It was developed to represent the majority of variation in four common usability metrics used in summative usability tests: task completion rates, task time, satisfaction, and error counts. The theoretical foundations of SUM are based on a paper by Sauro and Kindlund presented at CHI 2005, entitled "A Method to Standardize Usability Metrics into a Single Score."

Read More
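The standardize-and-average intuition behind a summated metric can be sketched as follows. This is not the published SUM procedure (which works against specification limits); it is a minimal illustration with made-up data, standardizing each metric to z-scores so higher always means better and then averaging.

```python
from statistics import mean, pstdev

# Illustrative sketch only: z-score each metric, flip sign where
# lower raw values are better, then average per participant.
def standardize(values, higher_is_better=True):
    m, s = mean(values), pstdev(values)
    zs = [(v - m) / s for v in values]
    return zs if higher_is_better else [-z for z in zs]

completion = [1, 1, 0, 1, 1]             # 1 = task success (made-up data)
times = [42.0, 55.0, 38.0, 61.0, 47.0]   # task time in seconds (lower is better)

z_completion = standardize(completion)
z_time = standardize(times, higher_is_better=False)
sum_scores = [mean(pair) for pair in zip(z_completion, z_time)]
```

A real SUM score would also fold in satisfaction and errors, standardized the same way.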

Adding confidence intervals to completion rates in usability tests will temper both excessive skepticism and overstated usability findings. Confidence intervals make testing more efficient by quickly revealing unusable tasks with very small samples. Examples are detailed, and downloadable calculators are available. Are You Lacking Confidence? You just finished a usability test. You had 5 participants attempt a task in a new version of

Read More
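One interval that behaves well at the small sample sizes mentioned above is the adjusted-Wald method; the excerpt doesn't say which method its calculators use, so treat this as an illustrative sketch rather than the article's procedure.

```python
from math import sqrt

# Adjusted-Wald interval: add z^2/2 successes and z^2 trials before
# computing the ordinary Wald interval. Works well for small n.
def adjusted_wald(successes, n, z=1.96):
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)

# 4 of 5 participants completed the task: the 95% interval is wide,
# roughly 36% to 98%, which is exactly the point of reporting it.
low, high = adjusted_wald(4, 5)
```

Even with 5 users, an interval whose lower bound sits below a target (say, 70% completion) is a quick signal that the task may be unusable.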

One of the biggest, and usually the first, concerns leveled against any statistical measure of usability is that the number of users required to obtain "statistically significant" data is prohibitive. People reason that one cannot, with any reasonable level of confidence, employ quantitative methods to determine product usability. The reasoning continues something like this: "I have a population of 2500 software users. I plug my population

Read More

We already saw how a manageable sample of users can provide meaningful data for discrete-binary data like task completion. With continuous data like task times, the sample size can be even smaller. The continuous calculation is a bit more complicated and involves somewhat of a Catch-22. Most want to determine the sample size ahead of time, then perform the testing based on the results of

Read More
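The continuous-data calculation sketched above can be illustrated with the normal (z) approximation; the exact version uses the t-distribution and requires the iteration the excerpt alludes to, since t's critical value itself depends on n. The sd of 20 seconds and margin of 10 seconds are made-up planning values, which is the Catch-22: you need a variability estimate before you test.

```python
from statistics import NormalDist
from math import ceil

# Sample size needed to estimate a mean task time to within a given
# margin of error, using the z approximation (t adds a few more).
def sample_size_for_mean(sd, margin, confidence=0.95):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sd / margin) ** 2)

# Assumed planning values: sd = 20 s (e.g., from a pilot), margin = 10 s.
n = sample_size_for_mean(20, 10)  # -> 16
```

Halving the margin of error quadruples the required sample, which is why precision goals matter more than population size here.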

One of the most often reported measures of usability is task success. Success rates can be converted into a sigma value by using the discrete-binary defect calculation: Proportion Unsuccessful = Defects / Opportunities, where opportunities are the total number of tasks and defects are the total number of unsuccessful tasks. This calculation provides a proportion that is equivalent to a success or failure rate. For example, if 143 total tasks

Read More
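The defect calculation above can be carried through to a sigma value with the inverse normal. The 1.5-sigma shift is the usual Six Sigma convention for converting long-term to short-term sigma, and the 12-of-200 numbers are illustrative, not the article's example.

```python
from statistics import NormalDist

# Convert a proportion of unsuccessful tasks into a sigma value.
# shift=1.5 is the conventional Six Sigma long-term adjustment.
def sigma_from_defects(defects, opportunities, shift=1.5):
    p_unsuccessful = defects / opportunities
    return NormalDist().inv_cdf(1 - p_unsuccessful) + shift

# 12 failed tasks out of 200 opportunities = 6% failure rate,
# which works out to roughly a 3-sigma process.
sigma = sigma_from_defects(12, 200)
```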

It’s a big step when User-Centered Design methods are employed in a company to improve the usability of a product. It shouldn’t be the last step, but oftentimes it is. Many popular usability testing techniques are the right method for gathering user data; however, their results alone will only scratch the surface of the true state of usability. Often their results can be

Read More