Let’s imagine you are testing five users as part of an iterative testing process to find and fix problems. During the test only one user encounters a problem with logging in. Fixing this particular problem would take a lot of effort, and the small sample size is met with skepticism from the overburdened and overcommitted development team. They say “We really don’t know whether this problem will affect 1 out of 5 users (20%) or 1 out of 100 users (1%) and don’t have the resources to fix all edge cases.” They send you packing to ponder the merits of larger sample sizes or changing careers.
While many people have come to accept testing with small sample sizes in usability testing, there is still a lot of discomfort when extrapolating the results to the entire user population. It turns out the developers are right: we don’t know whether it will affect 1%, 20% or another percentage. But consider these uncertainties:
- Insurance companies don’t know if you’re going to die next year when they issue you a policy.
- Drug companies don’t know for sure if that drug will work or cause a fatal side-effect.
- You don’t know that you’ll win a hand of blackjack with your two queens.
Life is full of uncertainty and usability testing is no exception. Small sample sizes don’t preclude us from using the same method that insurance companies, drug companies and compulsive gamblers use to understand the uncertainty and make informed decisions.
For this particular question we can use probability to help decide. Doing so is what makes a usability evaluation quantitative. We can use the binomial probability mass function to compare how likely it is to see 1 out of 5 users encounter a problem when the actual occurrence in the user population is 20% versus 1%.
If the actual occurrence of the problem is 20%, the probability of seeing it with exactly 1 out of 5 users is about 41%. You can use the Excel formula =BINOMDIST(1,5,0.2,FALSE) to get the answer. If the actual occurrence is 1%, that probability drops to 4.8%. Use the formula =BINOMDIST(1,5,0.01,FALSE). I put together a calculator which will do the calculations for you.
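If you would rather not open Excel, the same two numbers can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` and the variable names are my own choices for illustration, not part of any calculator mentioned here.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k occurrences in n independent trials,
    each with occurrence probability p (same idea as BINOMDIST(k, n, p, FALSE))."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of seeing the problem with exactly 1 of 5 users
# under each assumed rate of occurrence in the user population.
likelihood_20 = binomial_pmf(1, 5, 0.20)
likelihood_01 = binomial_pmf(1, 5, 0.01)

# How much better the 20% assumption explains the observation than the 1% one.
ratio = likelihood_20 / likelihood_01

print(f"20% occurrence: {likelihood_20:.3f}")
print(f"1% occurrence:  {likelihood_01:.3f}")
print(f"ratio:          {ratio:.1f}")
```

Changing the three arguments lets you run the same comparison for other sample sizes or other candidate rates.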
Therefore, the result you saw in your test of five users is about 8.5 times more likely if the problem affects 20% of users than if it affects 1% of users (41% divided by 4.8%). Of course, the problem is unlikely to affect exactly 1% or exactly 20% of users, but we can generate a binomial confidence interval around the observed occurrence and be 95% confident the problem will affect between 2% and 64% of users.
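The text above does not say which method produced the 2% to 64% interval. As an assumption on my part, the sketch below uses the adjusted-Wald (Agresti-Coull) interval, a common choice for small samples, which gives bounds that round to the same figures for 1 out of 5. The function name `adjusted_wald_interval` is hypothetical, and other interval methods (exact or Wilson, for example) would give somewhat different bounds.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(x: int, n: int, confidence: float = 0.95):
    """Adjusted-Wald (Agresti-Coull) interval for x occurrences among n users.

    Assumption: this is one reasonable small-sample method, not necessarily
    the one used to produce the figures quoted in the text.
    """
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add z^2/2 successes and z^2 trials before applying the usual Wald formula.
    n_adj = n + z**2
    p_adj = (x + z**2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


low, high = adjusted_wald_interval(1, 5)
print(f"{low:.0%} to {high:.0%}")
```

The width of the interval is the honest price of testing only five users: it rules out a very rare problem, but it cannot pin the rate down tightly.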
If you don’t like the thought of using formulas you can think of it conceptually. With small sample sizes of around 5 users, you are more likely to observe frequently occurring problems and miss the infrequent ones. If the problem didn’t occur frequently then you probably wouldn’t have seen it in a test. Of course there is a chance that you just happened to see a rare problem with five users, but it is not very likely. In other words, when you see a problem from a small sample of users, put your money on it affecting more than a tiny percentage of users. You could be wrong, just like drug companies, insurance companies and gamblers get it wrong, but using probability as part of a quantitative testing strategy means you at least know the risk you’re taking.
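The conceptual argument can also be checked with a little arithmetic. Assuming each user independently encounters the problem with the same probability, the chance of seeing it at least once among n users is 1 - (1 - p)^n. The sketch below is only an illustration under that independence assumption, and the name `chance_of_seeing` is made up for this example.

```python
def chance_of_seeing(p: float, n: int) -> float:
    """Chance that a problem affecting a proportion p of users shows up
    at least once in a test with n users (assumes independent users)."""
    return 1 - (1 - p) ** n


# Compare a frequent problem with a rare one in a five-user test.
for rate in (0.20, 0.01):
    print(f"{rate:.0%} problem: {chance_of_seeing(rate, 5):.1%} chance of being seen")
```

A problem that affects 20% of users turns up in roughly two out of three five-user tests, while one that affects 1% turns up in about one out of twenty, which is why an observed problem is more likely to be a common one.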