{"id":553,"date":"2019-09-11T03:45:52","date_gmt":"2019-09-11T03:45:52","guid":{"rendered":"http:\/\/measuringu.com\/sample-size-reality\/"},"modified":"2022-03-21T17:15:08","modified_gmt":"2022-03-21T23:15:08","slug":"sample-size-reality","status":"publish","type":"post","link":"https:\/\/measuringu.com\/sample-size-reality\/","title":{"rendered":"Sample Size in Usability Studies: How Well Does the Math Match Reality?"},"content":{"rendered":"
We\u2019ve written extensively<\/a> about how to determine the right sample size for UX studies.<\/p>\n There isn\u2019t one sample size that will work for all studies.<\/p>\n The optimal sample size is based on the type of study<\/a>, which can be classified into three groups:<\/p>\n And it\u2019s the sample size needed for the problem discovery study type (the classic usability study) that gives rise to one of the more enduring and misunderstood controversies in UX research: the magic number five<\/a>.<\/p>\n There are times when five users<\/a> will be sufficient, but many times they will fall far short. In usability studies where the goal is to uncover usability issues, five users can be enough. You can use a mathematical model<\/a>\u00a0to help plan how many users you should test based on the likelihood of problem occurrences.<\/p>\n Jim Lewis and I wrote extensively about the math in Chapter 7 of Quantifying the User Experience<\/a>. But it\u2019s easy to get lost in the math and qualifications of problem occurrence rates and likelihood-of-detection rates. In this article, I\u2019ll use some real usability test data to see what happens with just five users.<\/p>\n The plain-language way to think about what five users can do for you is this: with five users, you\u2019ll see MOST of the COMMON problems. The two key words are most and common. It\u2019s wrong to think that five users will uncover all common problems\u2014or even worse, all problems.<\/p>\n To give some idea of what it means to see most of the common problems with five users, I pulled together usability data<\/a>\u00a0from seven larger-sample usability studies, some of which we conducted in the last year.<\/p>\n The studies were a mix of in-person moderated studies and unmoderated studies<\/a> with videos. All studies had one or two researchers coding the usability issues. 
Across the seven studies there were four different researchers (two conducted multiple studies).<\/p>\n The interfaces tested included three consumer websites: one featured real-estate listings, and two were rental car websites (Budget and Enterprise<\/a>, which we tested in 2012). Two interfaces were quite technical web-based applications for IT engineers. There was one mobile website for a consumer credit card. For the final interface, study data was sent to us anonymized with only the problem matrix; we knew few details about the problems or interface other than it was a B2B<\/a> application. The sample sizes were relatively large compared to typical formative studies<\/a>, with the smallest having 18 participants and the largest 50.<\/p>\n The datasets contain a mix of usability problems<\/a> (e.g., users struggle to find \u201cAdd-ons\u201d when renting a car) and insights (users suggested some new features but didn\u2019t encounter any problems) that were collected to fulfill specific research requests for each project. These datasets provide a reasonable range of the usability issues and insights typically reported in usability studies and offer a good mix of problem coding types (including both granular and broader issues) and facilitation styles<\/a>.<\/p>\n Table 1 shows an overview of the seven datasets, including the type of interface, the sample size, the number of problems, and the average (unadjusted) problem occurrence. The problem occurrence is unadjusted in that it\u2019s just the average of all problems across the number of users. 
For example, if there are three problems, one that affected 10 out of 30 users (33%), another that affected 20 out of 30 (67%), and another that affected only 1 out of 30 (3%), the average problem occurrence is the average of all three, or 34%.<\/p>\n This is different from the adjusted problem occurrence, which takes into account the number of unique issues and may be a better estimate of problem occurrence, especially when sample sizes are smaller than the ones in this analysis. Jim describes this in detail in Chapter 7<\/a> of Quantifying the User Experience.<\/p>\n\n\n
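The unadjusted average can be sketched in a few lines of code. This is a minimal illustration using the hypothetical three-problem tally from the example above (10, 20, and 1 affected participants out of 30); the function name is mine, not from the article:

```python
# Unadjusted average problem occurrence: the plain mean of each
# problem's occurrence proportion, with no correction for unique
# issues (the adjusted estimate discussed in Chapter 7 differs).

def unadjusted_occurrence(problem_counts, n_participants):
    """Mean of the per-problem occurrence proportions."""
    rates = [count / n_participants for count in problem_counts]
    return sum(rates) / len(rates)

# Three problems affecting 10, 20, and 1 of 30 participants,
# as in the worked example above.
avg = unadjusted_occurrence([10, 20, 1], 30)
print(f"{avg:.0%}")  # -> 34%
```

Note that this simple mean weights a rare problem (1/30) the same as a common one (20/30), which is one reason the adjusted estimate can be preferable at small sample sizes.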
Five Users Reveal Most of the Common Issues<\/h2>\n
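Before looking at the data, here is a sketch of the standard binomial problem-discovery model discussed in Chapter 7 of Quantifying the User Experience. The specific occurrence rates used below (0.31, the often-cited average from the early literature, and 0.10 for a rarer problem) are illustrative assumptions, not values from these seven datasets:

```python
# Binomial problem-discovery model: the chance that a problem
# affecting proportion p of users is observed at least once
# in a sample of n users.

def p_detect(p, n):
    """Probability of seeing the problem at least once in n users."""
    return 1 - (1 - p) ** n

# With an average occurrence rate of p = 0.31, five users catch a
# given problem most of the time; a rarer problem (p = 0.10) is
# seen less than half the time.
print(f"{p_detect(0.31, 5):.0%}")  # -> 84%
print(f"{p_detect(0.10, 5):.0%}")  # -> 41%
```

This asymmetry is the "most of the common problems" caveat in miniature: detection is high only when the problem's occurrence rate is high.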