Confidence intervals are your frenemies.
They are one of the most useful statistical techniques you can apply to customer data. At the same time they can be perplexing and cumbersome.
But confidence intervals provide an essential understanding of how much faith we can have in our sample estimates, whatever the sample size, from 2 to 2 million. They provide the most likely range for the unknown population value, such as the average across all customers (if we could somehow measure them all).
Confidence intervals push the comfort threshold of both user researchers and managers. People aren’t used to seeing them in reports, not because they aren’t useful but because there’s confusion about both how to compute them and how to interpret them. It will probably take time to appreciate and use confidence intervals, but let me assure you it’s worth the pain. Here is a peek behind the statistical curtain to show you that it’s not black magic or quantum mechanics that provides the insights.
To compute a confidence interval, you first need to determine whether your data is continuous or discrete binary. Continuous data are metrics like rating scales, task time, revenue, weight, height or temperature. Discrete binary data takes only two values (pass/fail, yes/no, agree/disagree) and is coded as 1 (pass) or 0 (fail).
To compute a 95% confidence interval, you need three pieces of data:
- The mean (for continuous data) or proportion (for binary data)
- The standard deviation, which describes how dispersed the data is around the average
- The sample size
Continuous data example
Imagine you asked 50 customers how satisfied they were with their recent experience with your product on a 7-point scale, with 1 = not at all satisfied and 7 = extremely satisfied.
- Find the mean by adding up the scores for each of the 50 customers and dividing by the total number of responses (which is 50). If you have Excel, you can use the function =AVERAGE() for this step. For the purpose of this example, I have an average response of 5.96 (about 6).
- Compute the standard deviation. You can use the Excel formula =STDEV() for all 50 values or the online calculator. I have a sample standard deviation of 1.2.
- Compute the standard error by dividing the standard deviation by the square root of the sample size: 1.2/ √(50) = .17.
- Compute the margin of error by multiplying the standard error by 2: .17 x 2 = .34.
- Compute the confidence interval by adding the margin of error to the mean from Step 1 and then subtracting the margin of error from the mean:
5.96+.34=6.3
5.96-.34=5.6
We now have a 95% confidence interval of 5.6 to 6.3. Our best estimate of the entire customer population’s average satisfaction is between 5.6 and 6.3.
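The five steps above can be sketched in a few lines of Python. This is a minimal sketch, not the online calculator's method: the function name `mean_confidence_interval` and its `multiplier` parameter are made up for illustration, and it uses the shortcut multiplier of 2 from the steps above.

```python
import math

def mean_confidence_interval(mean, sd, n, multiplier=2.0):
    """Approximate 95% confidence interval for a mean from summary statistics."""
    standard_error = sd / math.sqrt(n)          # standard deviation / sqrt(sample size)
    margin_of_error = multiplier * standard_error
    return mean - margin_of_error, mean + margin_of_error

# Satisfaction example: mean 5.96, standard deviation 1.2, 50 customers
low, high = mean_confidence_interval(5.96, 1.2, 50)
print(f"{low:.1f} to {high:.1f}")   # prints "5.6 to 6.3"
```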
If you have a smaller sample, you need to use a multiple slightly greater than 2. You can find what multiple you need by using the online calculator. Note: There is also a special calculator when dealing with task-times.
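To give a feel for what "slightly greater than 2" means, here is a hedged sketch that swaps the 2 for a two-sided 95% critical value of Student's t, the usual source of that multiple. The lookup table holds standard t-table values for a handful of sample sizes only, and the names `T_CRITICAL_95` and `t_confidence_interval` are illustrative rather than anything the online calculator exposes. The n = 5 case reuses the same mean and standard deviation purely for illustration.

```python
import math

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom (n - 1).
# Standard table values rounded to three decimals; only a few sample sizes are covered.
T_CRITICAL_95 = {4: 2.776, 9: 2.262, 13: 2.160, 19: 2.093, 29: 2.045, 49: 2.010}

def t_confidence_interval(mean, sd, n):
    """95% confidence interval for a mean using a t multiplier instead of 2."""
    multiplier = T_CRITICAL_95[n - 1]   # raises KeyError for sample sizes not in the table
    margin_of_error = multiplier * sd / math.sqrt(n)
    return mean - margin_of_error, mean + margin_of_error

# With 50 responses the multiplier (2.010) is almost exactly 2;
# with only 5 responses it grows to 2.776 and the interval widens accordingly.
ci_50 = t_confidence_interval(5.96, 1.2, 50)
ci_5 = t_confidence_interval(5.96, 1.2, 5)
print(ci_50)
print(ci_5)
```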
Now try two more examples from data we’ve collected.
Example 1
Fourteen users attempted to add a channel on their cable TV to a list of favorites. After the task they rated the difficulty on the 7-point Single Ease Question (SEQ). Compute the 95% confidence interval. The responses are shown below:
2, 6, 4, 1, 7, 3, 6, 1, 7, 1, 6, 5, 1, 1
- Find the mean: 3.64
- Compute the standard deviation: 2.47
- Compute the standard error by dividing the standard deviation by the square root of the sample size: 2.47/ √(14) = .66
- Compute the margin of error by multiplying the standard error by 2. .66 x 2 = 1.3
- Compute the confidence interval by adding the margin of error to the mean from Step 1 and then subtracting the margin of error from the mean:
3.64+1.3 = 4.94
3.64-1.3 = 2.3
The 95% confidence interval is 2.3 to 4.94. Across several hundred tasks, the average SEQ score is around 5.2. This confidence interval tells us that we can be fairly confident that this task is harder than average, because the upper boundary of the confidence interval (4.94) is still below the historical average of 5.2.
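For readers who would rather not do Example 1 by hand, the same steps can be run on the raw responses. This is a minimal sketch using Python's standard `statistics` module, with illustrative variable names. Because nothing is rounded along the way, the bounds come out a hair different from the hand-rounded figures (about 2.32 to 4.96).

```python
import math
import statistics

# SEQ responses from the 14 users in Example 1
responses = [2, 6, 4, 1, 7, 3, 6, 1, 7, 1, 6, 5, 1, 1]

mean = statistics.mean(responses)            # about 3.64
sd = statistics.stdev(responses)             # sample standard deviation, about 2.47
standard_error = sd / math.sqrt(len(responses))
margin_of_error = 2 * standard_error         # shortcut multiplier of 2

lower, upper = mean - margin_of_error, mean + margin_of_error
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```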
Example 2
The brand favorability rating of LinkedIn on a five-point scale from 62 participants was 4.32, with a standard deviation of .845. What is the 95% confidence interval?
- Find the mean: 4.32
- Compute the standard deviation: .845
- Compute the standard error by dividing the standard deviation by the square root of the sample size: .845/ √(62) = .11
- Compute the margin of error by multiplying the standard error by 2. .11 x 2 = .22
- Compute the confidence interval by adding the margin of error to the mean from Step 1 and then subtracting the margin of error from the mean:
4.32+.22 = 4.54
4.32-.22 = 4.10
The 95% confidence interval is 4.10 to 4.54. We don’t have any historical data using this 5-point branding scale. Historically, however, scores above 80% of the maximum value (4 out of 5 on a 5-point scale) tend to be above average. Therefore we can be fairly confident that the brand favorability toward LinkedIn is above the average threshold of 4, because the lower end of the confidence interval exceeds 4.
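Both examples end by comparing the interval with a benchmark: the whole interval sits below 5.2 in Example 1 and above 4 in Example 2. That decision rule is simple enough to write down. The helper below is a hypothetical sketch with a made-up name, not part of any calculator mentioned here.

```python
def compare_to_benchmark(lower, upper, benchmark):
    """Say where a confidence interval falls relative to a benchmark value."""
    if upper < benchmark:
        return "below benchmark"
    if lower > benchmark:
        return "above benchmark"
    return "inconclusive"   # the interval contains the benchmark

seq_result = compare_to_benchmark(2.3, 4.94, 5.2)      # Example 1: SEQ vs. historical 5.2
brand_result = compare_to_benchmark(4.10, 4.54, 4.0)   # Example 2: favorability vs. 4 out of 5
print(seq_result)     # below benchmark
print(brand_result)   # above benchmark
```

When the interval straddles the benchmark, the data simply can't tell you which side the population value is on, and a larger sample is the usual remedy.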
Discrete binary data example
Imagine you asked 50 customers if they are going to repurchase your service in the future. Using a dummy variable you can code yes = 1 and no = 0. If 40 out of 50 reported their intent to repurchase, you can use the Adjusted Wald technique to find your confidence interval:
- Find the average by adding all the 1’s and dividing by the number of responses. 40/50=.8
- Adjust the proportion to make it more accurate by adding 2 to the numerator (the number of 1s) and 4 to the denominator (the total number of responses), then dividing.
40+2 = 42
50+4 = 54 (this is the adjusted sample size)
42/54 = .78 (this is your adjusted proportion)
- Compute the standard error for proportion data.
  - Multiply the adjusted proportion by 1 – the adjusted proportion: .78 * (1-.78) = .17
  - Divide that result by the adjusted sample size: .17/54 = .0032
  - Take the square root of that value: √(.0032) = .056
- Compute the margin of error by multiplying the standard error by 2: .056×2 = .11
- Compute the confidence interval by adding the margin of error to the adjusted proportion and then subtracting the margin of error from the adjusted proportion.
.78+.11 = .89
.78-.11 = .67
The 95% confidence interval is .67 to .89. Our best estimate of the entire customer population’s intent to repurchase is between 67% and 89%.
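The Adjusted Wald steps translate directly into code. The sketch below assumes the interval is centered on the adjusted proportion and uses the shortcut multiplier of 2; the function name `adjusted_wald_interval` is made up for illustration. Without intermediate rounding, 40 out of 50 comes out at roughly .66 to .89.

```python
import math

def adjusted_wald_interval(successes, n, multiplier=2.0):
    """Approximate 95% adjusted Wald interval: add 2 successes and 4 responses."""
    adjusted_n = n + 4
    adjusted_p = (successes + 2) / adjusted_n
    standard_error = math.sqrt(adjusted_p * (1 - adjusted_p) / adjusted_n)
    margin_of_error = multiplier * standard_error
    # Clamp to the 0-1 range, since a proportion cannot fall outside it
    return max(0.0, adjusted_p - margin_of_error), min(1.0, adjusted_p + margin_of_error)

# 40 of 50 customers said they intend to repurchase
lower, upper = adjusted_wald_interval(40, 50)
print(f"{lower:.2f} to {upper:.2f}")
```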
Note: I’ve rounded the values to keep the steps simple. If you want a more precise confidence interval, use the online calculator, and feel free to read the mathematical foundation for this interval in Chapter 3 of our book, Quantifying the User Experience.
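If you are curious what "more precise" might look like, one option is to stop rounding the constants. In the general adjusted Wald (Agresti-Coull) interval, the 2 and 4 used in the steps above are roundings of z²/2 and z², where z is about 1.96 for 95% confidence. The sketch below is an assumption about how to sharpen the hand calculation, not a description of what the online calculator does, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def adjusted_wald_exact_z(successes, n, confidence=0.95):
    """Adjusted Wald (Agresti-Coull) interval using the exact normal critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    adjusted_n = n + z ** 2
    adjusted_p = (successes + z ** 2 / 2) / adjusted_n
    margin_of_error = z * math.sqrt(adjusted_p * (1 - adjusted_p) / adjusted_n)
    return max(0.0, adjusted_p - margin_of_error), min(1.0, adjusted_p + margin_of_error)

# Same repurchase example: 40 of 50
lower, upper = adjusted_wald_exact_z(40, 50)
print(f"{lower:.3f} to {upper:.3f}")
```

Passing a different `confidence` (say 0.90) changes z and therefore both the adjustment and the width, which the fixed 2-and-4 shortcut cannot do.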
Now try some examples yourself from actual data we’ve collected.
Example 1:
If 6 out of 8 participants have a problem installing a printer from the printed installation instructions, what’s the best estimate for the minimum percentage of customers who would also have a problem?
- Find the proportion: 6/8=.75
- Adjust the proportion to make it more accurate by adding 2 to the numerator (the number of 1s) and 4 to the denominator (the total number of responses), then dividing.
6+2 = 8
8+4 = 12 (this is the adjusted sample size)
8/12 = .667 (this is your adjusted proportion)
- Compute the standard error for proportion data.
  - Multiply the adjusted proportion by 1 – the adjusted proportion: .667 * (1-.667) = .22
  - Divide that result by the adjusted sample size: .22/12 = .019
  - Take the square root of that value: √(.019) = .14
- Compute the margin of error by multiplying the standard error by 2: .14×2 = .28
- Compute the confidence interval by adding the margin of error to the adjusted proportion and then subtracting the margin of error from the adjusted proportion.
.667+.28 = .95
.667-.28 = .39
The 95% confidence interval is .39 to .95. That means we’re pretty sure that at least 39% of customers would install the printer incorrectly and likely call customer support or return the printer (true story).
Example 2:
If 5 out of 16 participants in a study mention they don’t pay credit card bills online because they fear their credit card information will be stolen, what’s the best estimate for the percent of all customers who feel this way?
- Find the proportion: 5/16=.31
- Adjust the proportion to make it more accurate by adding 2 to the numerator (the number of 1s) and 4 to the denominator (the total number of responses), then dividing.
5+2 = 7
16+4 = 20 (this is the adjusted sample size)
7/20 = .35 (this is your adjusted proportion)
- Compute the standard error for proportion data.
  - Multiply the adjusted proportion by 1 – the adjusted proportion: .35 * (1-.35) = .23
  - Divide that result by the adjusted sample size: .23/20 = .011
  - Take the square root of that value: √(.011) = .11
- Compute the margin of error by multiplying the standard error by 2: .11×2 = .22
- Compute the confidence interval by adding the margin of error to the adjusted proportion and then subtracting the margin of error from the adjusted proportion.
.35+.22 = .57
.35-.22 = .13
The 95% confidence interval is .13 to .57. That means we’re pretty sure that at least 13% of customers have security as a major reason why they don’t pay their credit card bills online (also a true story).
Example 3:
If 3 out of 11 website visitors had a problem downloading and installing AutoCAD because they picked the wrong operating system on the download screen, what is our best estimate for the total percentage of website visitors who will also encounter this problem?
- Find the proportion: 3/11=.27
- Adjust the proportion to make it more accurate by adding 2 to the numerator (the number of 1s) and 4 to the denominator (the total number of responses), then dividing.
3+2 = 5
11+4 = 15 (this is the adjusted sample size)
5/15 = .333 (this is your adjusted proportion)
- Compute the standard error for proportion data.
  - Multiply the adjusted proportion by 1 – the adjusted proportion: .333 * (1-.333) = .22
  - Divide that result by the adjusted sample size: .22/15 = .015
  - Take the square root of that value: √(.015) = .12
- Compute the margin of error by multiplying the standard error by 2: .12×2 = .24
- Compute the confidence interval by adding the margin of error to the adjusted proportion and then subtracting the margin of error from the adjusted proportion.
.333+.24 = .57
.333-.24 = .09
The 95% confidence interval is .09 to .57. That means we’re pretty sure that at least 9% of prospective customers will likely have problems selecting the correct operating system during the installation process (yes, also a true story).
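To check your answers to the three practice problems in one go, the same steps can be looped over the counts. This is a sketch with illustrative names. It keeps full precision, so expect the bounds to differ from the hand-rounded figures by a point or so in the second decimal.

```python
import math

def adjusted_wald(successes, n):
    """Adjusted proportion plus or minus 2 standard errors (add 2 successes, 4 trials)."""
    p = (successes + 2) / (n + 4)
    margin = 2 * math.sqrt(p * (1 - p) / (n + 4))
    return max(0.0, p - margin), min(1.0, p + margin)

# (label, participants with the issue, total participants)
practice = [
    ("printer installation", 6, 8),
    ("online bill-pay security", 5, 16),
    ("wrong operating system", 3, 11),
]

results = {label: adjusted_wald(x, n) for label, x, n in practice}
for label, (lower, upper) in results.items():
    print(f"{label}: at least {lower:.0%}, at most {upper:.0%}")
```

The lower bound is the number to quote when you want a conservative "at least this many customers" figure, which is how each practice example is interpreted.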