But rarely can we talk to all users in the population we’re studying, whether they be consumers, lawyers, doctors or iPhone users.
Instead, we need to select a subset of the population and use this sample of users to make general inferences about the unknown total user population.
A sample can be used to estimate attitudes toward products, average completion rates, median task times and likelihood to recommend, or to discover usability problems.
The “right” sample largely depends on the research question you’re addressing. Here are seven sampling strategies to consider:
- Simple Random Sampling: This is the ideal method of sampling, but one of the most difficult to achieve. Ideally you can pick users at random and have them answer your survey or participate in a usability test. Rarely does this happen, but don’t let that discourage you. This is the same issue faced in all of the behavioral sciences and even medical research. You can’t randomly assign people cancer, and much of what we know in psychology comes from sample groups made up of college freshmen earning extra credit (neither random nor representative!).
- Starbucks Sampling: Usually called convenience sampling (but that doesn’t start with an S). Getting the average person on the street to attempt some tasks on a website usually reveals the obvious problems. This works best if you’re looking for generic consumers, students or anyone with a pulse. As your website becomes more specialized, bothering people at Starbucks will reveal fewer of the problems that users well-versed in a domain will have, and may well uncover some false positives, so use the double-skim-no-fat-latte option with care.
- Stratified Sampling: Sometimes it’s essential to subdivide a population of users. For one software application I worked on, we had teachers and students that we sampled in similar proportions. You can get carried away with stratification, such as stratifying by age, gender, income and geography. I’ll often see a client put out a long wish list of sampling attributes. While each of these can have a major impact on sensitive or political issues, when you ask people to attempt tasks or have them reflect on their experience, usually the major differentiators are product and domain experience.
- Snowball Sampling: In snowball sampling, you find a participant for your study and ask them if they know any friends or acquaintances who would also like to participate (think snowballs rolling downhill collecting more snow). This isn’t a random sampling process, but I’ve found it helpful when you have difficult-to-recruit users (think hardware engineers, dentists or physicians). This was the technique used in the famous (and famously criticized) Kinsey Reports on human sexuality.
- Spot Sampling: Want to know what problems users are trying to solve during their workday? Instead of tracking their every minute, randomly sample times through the day and week to get a better idea about actual usage. This can easily be done using emails or through text messages.
- Sequential Sampling: Not sure how long the study should continue or how many emails you need to send out for your survey? Start with a reasonable sample size, then plan on stopping or increasing your sample as you review responses and compute the confidence interval.
- Serial Sampling: Sometimes all you have are serial numbers from products, such as from the competition, or a bad record-keeping system for your product, or receipts from purchases. If you want to estimate the total number of products, you can use the information in the serial numbers, which are sequential.
You can estimate the total number bought, produced, sold, shipped, used, etc. by using the formula: N = m(1 + 1/k) – 1, where m is the largest serial number observed and k is your sample size.
For example, suppose you had product serial numbers of 1001, 1050, 2012 and 2079 and you wanted to know about how many were used. The maximum serial number is 2079 and the total sample is just 4:
N = m(1 + 1/k) – 1 = 2079(1 + 1/4) – 1 = 2597.75
So your best guess is about 2598 total products. This strategy was successfully used to estimate the number of German tanks made during WWII.
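If you'd rather not do the arithmetic by hand, here is a minimal Python sketch of the serial-number estimate. The function name `estimate_total_from_serials` is my own invention for illustration, and the sketch assumes the serial numbers start at 1, run sequentially without gaps, and that each observed serial is distinct.

```python
def estimate_total_from_serials(serials):
    """Estimate the total count as m * (1 + 1/k) - 1.

    m is the largest serial number observed and k is the sample size.
    Assumes serials start at 1 and are issued sequentially with no gaps.
    """
    if not serials:
        raise ValueError("need at least one observed serial number")
    m = max(serials)   # largest serial number observed
    k = len(serials)   # sample size
    return m * (1 + 1 / k) - 1


# The four serial numbers from the example in the text.
observed = [1001, 1050, 2012, 2079]
estimate = estimate_total_from_serials(observed)
print(estimate)         # 2597.75
print(round(estimate))  # 2598
```

If your serial numbers start somewhere other than 1, or the manufacturer skips blocks of numbers, the estimate will be off, so treat the result as a rough guess rather than a precise count.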