Percentages are popular.
Executives like them, the media reports them, and we hear them every day.
We use percentages all the time in UX research.
They can be based on demographics:
- Percent women
- Percent millennials
- Percent who are current customers
Or based on attitudes and actions:
- Percent completing a task
- Percent recommending a product
- Percent who click on a button
Percentages are popular for at least a few reasons:
- They are familiar. We learn early to convert fractions into proportions. 3/4 becomes .75 and moving the decimal over two places gets us the familiar 75%.
- They are bound from 0% to 100% (in most cases).
- They work for any sample size. Percentage literally means per one hundred, so 75% can come from 3 out of 4 or from 75 out of 100. You can apply confidence intervals to both percentages to understand their precision. Percentages can also express probabilities, as in: the probability a user will complete a task is 75%.
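To make the precision point concrete, here is a minimal sketch that puts an interval around 3 out of 4 and 75 out of 100. It assumes a Wilson score interval at roughly 95% confidence; the article does not say which interval method it has in mind, and the function name `wilson_interval` is only illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the Wilson method is used here for illustration only;
    the article does not name a specific interval method.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

small = wilson_interval(3, 4)      # 75% from 4 users
large = wilson_interval(75, 100)   # 75% from 100 users

for label, (low, high) in [("3 of 4", small), ("75 of 100", large)]:
    print(f"{label}: 75% with interval {low:.0%} to {high:.0%}")
```

Both samples give the same 75%, but the interval from 4 users is far wider than the one from 100 users.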
Converting to Percentages
Percentages are so popular that even measures that are more continuous often get converted into them. For example, you can report how long it takes users to book a rental car online in seconds, or as the percentage of users who took less than a minute.
Marketing departments often turn rating scale data into a percentage (top-box score) by dividing the number of people who picked the most favorable response (e.g. strongly agree) by the total number of responses. The idea is that a percentage is easier to understand than a mean, especially when you aren’t familiar with the historical data or the underlying scale.
This is also one of the reasons the Net Promoter Score is converted from an 11-point continuum to two percentages (percent promoting minus percent detracting).
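The three conversions described in this section can be sketched in a few lines. All of the data below is hypothetical and exists only to exercise the functions. The promoter (9 to 10) and detractor (0 to 6) cutoffs are the standard Net Promoter definitions rather than something stated in this article.

```python
def percent_under(times_sec, threshold_sec=60):
    """Percent of task times that fall below a threshold (in seconds)."""
    return 100 * sum(t < threshold_sec for t in times_sec) / len(times_sec)

def top_box(ratings, top_value):
    """Percent of respondents who picked the most favorable response."""
    return 100 * sum(r == top_value for r in ratings) / len(ratings)

def net_promoter_score(ratings):
    """Percent promoters (9-10) minus percent detractors (0-6) on a 0-10 scale."""
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical data, invented for illustration only.
booking_times = [42, 75, 58, 130, 49, 66, 38, 91]   # seconds to book a car
agreement = [5, 4, 5, 3, 5, 2, 4, 5]                # 5 = strongly agree
likelihood = [10, 9, 8, 7, 6, 10, 3, 9, 5, 10]      # 0-10 likelihood to recommend

print(percent_under(booking_times))    # share finishing in under a minute
print(top_box(agreement, top_value=5)) # top-box score
print(net_promoter_score(likelihood))  # Net Promoter Score
```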
While percentages can make raw data more interpretable, comparing percentages is often what leads to insight.
For example, we were working on a segmentation study with a client to understand their prospective customers and attitudes about a new online product. We included the usual array of demographic variables and an assortment of attitudinal data to assess intent and interest for around 1,000 respondents.
We found that the age of the participant in our study was a good predictor of interest. While we found there was generally strong stated interest across age groups, older participants (50+) were less interested (63% expressed strong interest) compared to those under 50 (86% expressed strong interest).
With another variable (2nd home ownership), 20% of the interested group owned a second home, compared to only 10% of the less interested group.
The 86% of younger participants is clearly more than 63% (by 23 percentage points), and 20% is greater than 10% (by 10 percentage points). We can test the difference between proportions for statistical significance, and we’d see they are statistically different (p < .05). But in this case, we want to know more than whether they’re different: we want to know whether the size of the difference is large enough to differentiate between interested and not interested segments.
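As a rough illustration of the significance check, the sketch below runs a two-proportion z-test on the age comparison. The article reports only about 1,000 respondents overall, so the split of 500 per age group is an assumption made purely so the example can run. The pooled z-test is one common choice, not necessarily the test used in the study.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# ASSUMED group sizes: 500 under 50 (86% interested), 500 aged 50+ (63% interested).
z, p_value = two_proportion_z_test(430, 500, 315, 500)
print(f"z = {z:.2f}, significant at .05: {p_value < 0.05}")
```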
The first way to express the difference in interest is to create a ratio of the two percentages. For the difference in age groups, this becomes 86/63 = 1.37. To communicate this, we subtract 1 and convert the remainder to a percentage (1.37 - 1 = .37, or 37%).
We can then say participants under 50 are 37% more likely to be interested than the older cohort of respondents.
For the difference in owning a second home, this becomes 20/10 = 2. We’d say that participants who own a second home are twice as likely to be strongly interested in the product. After subtracting the 1 we could also say 2nd home owners are 100% more likely to be interested, but once you reach 100%, it becomes easier to understand if you say twice.
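The arithmetic for both comparisons can be sketched as follows. The function names are illustrative, and the wording rule (switch to "times as likely" once the ratio reaches 2) simply mirrors the suggestion above. The sketch assumes the larger percentage is passed first so the ratio is at least 1.

```python
def relative_ratio(pct_a, pct_b):
    """Ratio of two percentages (often called relative risk)."""
    return pct_a / pct_b

def describe(ratio):
    """Turn a ratio of at least 1 into a plain-language phrase."""
    if ratio >= 2:
        return f"{ratio:.1f} times as likely"
    # Subtract 1 and express the remainder as a percentage.
    return f"{(ratio - 1) * 100:.0f}% more likely"

age = relative_ratio(86, 63)          # under 50 vs. 50+
second_home = relative_ratio(20, 10)  # interested vs. less interested

print(describe(age))
print(describe(second_home))
```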
This ratio is usually called relative risk because it often appears in epidemiology or medical journals, where the percentage represents the percent of people with a medical condition. Because we’re usually concerned about attitudes or actions (not life and death), I often call it the relative interest ratio. However, sometimes it’s good to think about this ratio as risk in the way we think of symptoms of a disease, as in there’s more risk of lung cancer for people who smoke. The relative interest ratio is computed from a sample, so we can also add confidence intervals to understand the low and high end.
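A confidence interval for this ratio needs the underlying counts, which the article does not report. The sketch below therefore assumes 700 interested respondents (140 owning a second home, or 20%) and 300 less interested respondents (30 owning one, or 10%). It uses the common log-based interval for a relative risk, which is one standard approach and not necessarily the one used in the study.

```python
import math

def relative_ratio_ci(x1, n1, x2, n2, z=1.96):
    """Relative risk with a log-based confidence interval (z=1.96 is roughly 95%)."""
    ratio = (x1 / n1) / (x2 / n2)
    se_log = math.sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    low = math.exp(math.log(ratio) - z * se_log)
    high = math.exp(math.log(ratio) + z * se_log)
    return ratio, low, high

# ASSUMED counts: 140 of 700 interested and 30 of 300 less interested own a second home.
ratio, low, high = relative_ratio_ci(140, 700, 30, 300)
print(f"ratio = {ratio:.2f}, interval {low:.2f} to {high:.2f}")
```

If the low end of the interval stays above 1, the data is consistent with a real difference between the groups at that confidence level.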
You may be tempted to say the odds someone is interested are twice as high for second homeowners. However, odds are actually something different, and the term is often misused in the media (for example, in coverage of the recent record Powerball lottery) and even in medical journals.
The odds are the likelihood something will happen divided by the likelihood that same thing will NOT happen. For example, according to the gamblers in Las Vegas, the likelihood the Denver Broncos will win Super Bowl 50 (as of this writing) is about 25%. The odds AGAINST them winning are then 3 to 1 (.75/.25).
So in the segmentation example, the odds are the percentages of interested participants divided by the percentage NOT interested for each of the variables we looked at.
For second homeowners, this is 20/80 = .25, which is usually communicated as fractional odds of 1 to 4. Therefore, we’d say the odds an interested participant owns a second home are 1 to 4. It also works in reverse (80/20 = 4), meaning the odds an interested participant does NOT own a second home are 4 to 1. In other words, we’d expect four times as many interested people NOT to own a second home as to own one; most would not.
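Converting a percentage to odds is a one-line calculation. The sketch below uses Python's `fractions.Fraction` so the result reads directly as "1 to 4"; the function names are illustrative.

```python
from fractions import Fraction

def odds_in_favor(pct):
    """Odds = percent for whom it happens / percent for whom it does not."""
    return Fraction(pct, 100 - pct)

def odds_against(pct):
    """The same odds expressed in reverse."""
    return Fraction(100 - pct, pct)

interested = odds_in_favor(20)        # 20% of the interested group own a second home
less_interested = odds_in_favor(10)   # 10% of the less interested group do

print(f"{interested.numerator} to {interested.denominator}")   # odds of owning
print(f"{odds_against(20)} to 1")                              # odds of NOT owning
print(f"{float(less_interested):.2f}")                         # decimal odds
```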
However, while most don’t own a second home, what we want to know is whether owning a second home is a lot more likely in the interested group than in the disinterested group. And it’s the relative comparison that often has more meaning, especially when segmenting your customers. To do this with odds, we use the odds ratio.
We’d then compare the odds an interested person owns a second home (20/80 = .25) to the odds a disinterested participant owns a second home (10/90 = .11) as a ratio of odds:
$$\frac{20/80}{10/90} = \frac{.25}{.11} = 2.27$$
The odds that an interested participant owns a second home are 2.27 times the odds that a disinterested participant owns one. The odds ratio also has a confidence interval and can be tested for statistical significance.
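Here is a minimal sketch of the odds ratio and a log-based confidence interval. The counts are assumed (140 of 700 interested and 30 of 300 less interested respondents owning a second home), because the article gives only the percentages. Without rounding 10/90 to .11 first, the ratio works out to 2.25 rather than 2.27.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table with a log-based interval (z=1.96 is roughly 95%).

    a, b: group 1 with / without the attribute
    c, d: group 2 with / without the attribute
    """
    ratio = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(ratio) - z * se_log)
    high = math.exp(math.log(ratio) + z * se_log)
    return ratio, low, high

# ASSUMED counts: interested 140 own / 560 do not; less interested 30 own / 270 do not.
ratio, low, high = odds_ratio_ci(140, 560, 30, 270)
print(f"odds ratio = {ratio:.2f}, interval {low:.2f} to {high:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant odds ratio at that confidence level.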
In this case, the odds ratio and relative interest ratio are close (2.27 versus 2) and convey a similar message that people who own a second home are more interested in the product. But they aren’t always so close.
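The age comparison from earlier in the article shows how far apart the two measures can drift when the percentages are high. The sketch below computes both ratios directly from the reported percentages; the function names are illustrative.

```python
def relative_ratio(pct_a, pct_b):
    """Ratio of two percentages."""
    return pct_a / pct_b

def odds_ratio(pct_a, pct_b):
    """Ratio of the odds implied by two percentages."""
    odds_a = pct_a / (100 - pct_a)
    odds_b = pct_b / (100 - pct_b)
    return odds_a / odds_b

for label, a, b in [("second home (20% vs 10%)", 20, 10),
                    ("age group (86% vs 63%)", 86, 63)]:
    print(f"{label}: relative ratio {relative_ratio(a, b):.2f}, "
          f"odds ratio {odds_ratio(a, b):.2f}")
```

For second home ownership the two ratios are close, but for the age groups the odds ratio is well over twice the relative ratio, so reading an odds ratio as "times as likely" would overstate the difference.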
What to Use?
The term “odds,” like the term “significant,” has a specific statistical meaning but is also widely used in everyday language.
Odds and the odds ratio are used in a number of statistical procedures (for example, logistic regression) because they have advantageous properties. However, I’ve found most people have a hard time deciphering what odds really mean and end up treating them like the relative interest ratio. For that reason, when you have a choice, I find it best to stick with the relative interest ratio when communicating the results.
- Percentages are popular because even when people know little about the underlying measure, they can more easily interpret a percentage: percentages work for any sample size and are bounded from 0% to 100%.
- Even more continuous measures like task time and rating scale data can be converted to percentages to make them more digestible.
- An effective way to compare the magnitude of the differences in percentages is to use the relative interest ratio (often called relative risk). It’s simply the ratio of two percentages and can be computed along with a confidence interval.
- The odds are the percentage of interest divided by the percentage of non-interest for the same measure.
- While the term odds is in general use, it’s not the same thing as relative interest.
- The odds ratio tells you the relative difference in the odds and can sometimes generate a similar ratio as the relative interest ratio.
- The relative interest ratio is easier to understand than odds but is easily confused with the odds ratio (by both general and technical audiences).