There are a number of variables that affect UX metrics.
In most cases, though, you’ll simply want to measure the user experience and not these other “nuisance variables” that may mask the experience users have with an interface.
This is especially the case when making comparisons. In a comparative analysis you use multiple measures to determine which website or product is superior.
Three of the most common variables that can affect UX metrics and mask differences in designs are:
- Prior experience
- Domain knowledge and skills
- Brand attitudes
In an earlier post I discussed the effects of prior experience and how it can be measured and controlled for. Here I’ll describe the more nebulous nature of brand attitudes and how best to measure and control for them in UX studies.
Attitudes toward People and Brands
A user’s attitude toward a company or product can heavily influence (for better or worse) their attitude toward an experience. People treat companies and brands like they treat people. Think of someone in your life who you generally don’t like (a colleague, manager, or friend of a friend). This person may annoy you, or you may disagree with them on many issues.
If I asked you to rate your favorability toward them on a seven-point scale, chances are, it’d be on the low side. This unfavorable attitude would then likely carry over to how you judge the quality of his or her work and ideas. In contrast, think of someone you like, admire, and respect. When you judge this person’s work and ideas you’re likely to cast a more positive “halo” on their efforts.
The same concept applies to companies and their brands. If you don’t like a company (for example, Coca-Cola or Walmart) or one of their brands (Odwalla or Sam’s Club), you’re probably not going to like their website and will in turn rate the experience less favorably.
Controlling for Brand
While it’s realistic to have a mix of brand lovers and haters in a UX study, you’ll often want to understand how much the experience differs despite these variations in participants’ brand attitudes.
For example, I have data from a comparative website benchmark study we conducted with well-known consumer retail brands. The study was between-subjects with 592 participants and four websites. At the beginning of the study we asked participants to rate their favorability toward the brands used in the study. At the end of the study, participants answered the 8-item SUPR-Q—a reliable measure of the quality of the website user experience. You can see the scores by website in Figure 1 below.
Figure 1: SUPR-Q scores for four websites (uncorrected). Error bars are 95% confidence intervals.
In Figure 1, website A had the highest SUPR-Q score and separated itself from the other websites. Running an Analysis of Variance (ANOVA), we find that at least one website differs significantly from the others, F(3,589) = 6.31; p < .001. You can also see this difference visually from the lack of overlap in the confidence intervals between website A and the other sites.
But Figure 2 below shows the same participants’ brand favorability ratings on a 7-point scale (1 = very unfavorable and 7 = favorable).
Figure 2: Brand favorability score for the four websites.
We see the same pattern as the SUPR-Q scores in Figure 1. Participants in particular had a strong affinity toward Brand A. In fact, the correlation between brand favorability and SUPR-Q scores in this study is reasonably large (r = .52). But when measuring the website user experience, we usually want to measure how the website affected attitudes about usability and not just a preexisting brand attitude.
In other words, what we want to know is how much of the difference in SUPR-Q scores is due to the website and not brand favorability. To find out, we can use one of two approaches: run an ANCOVA or create subgroups by splitting the data.
The Analysis of Covariance (called ANCOVA) will “partial” out the correlated effects of brand attitudes to tell us if the website experiences continue to differ. Both SPSS and R have ANCOVA procedures, and both allow you to save the corrected scores as a new variable that you can then analyze and graph. Figure 3 below shows the new “corrected” means from the ANCOVA along with the original SUPR-Q mean scores by website.
Figure 3: Raw and corrected SUPR-Q scores for brand favorability for four websites.
After taking into account brand favorability, the differences between websites are no longer statistically significant, F(3,593) = 1.790; p = .148. You can also see that website A’s error bars now overlap with those of the other websites, which also shows the loss of statistical significance. In other words, after accounting for brand favorability, attitudes toward the quality of the website user experience are no longer substantially different. Accounting for brand favorability in this study changes our conclusions: the sites aren’t statistically different given the tasks and types of users.
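To make the idea of "corrected" scores concrete, here is a minimal sketch of how covariate-adjusted means can be computed by hand for a single covariate. It is not the SPSS or R procedure used for the figures. The function name `adjusted_means` and the ratings are hypothetical and invented purely for illustration, and the sketch covers only the adjusted means, not the F-test.

```python
from statistics import mean

def adjusted_means(groups):
    """Return covariate-adjusted group means (one-covariate ANCOVA).

    groups maps a site label to a list of
    (brand_favorability, supr_q) pairs, one pair per participant.
    """
    grand_x = mean(x for rows in groups.values() for x, _ in rows)

    # Pooled within-group slope of SUPR-Q on brand favorability.
    sxy = sxx = 0.0
    for rows in groups.values():
        mx = mean(x for x, _ in rows)
        my = mean(y for _, y in rows)
        sxy += sum((x - mx) * (y - my) for x, y in rows)
        sxx += sum((x - mx) ** 2 for x, _ in rows)
    slope = sxy / sxx

    # Shift each raw mean to the value expected if the group's average
    # brand favorability equaled the grand average.
    return {
        site: mean(y for _, y in rows)
        - slope * (mean(x for x, _ in rows) - grand_x)
        for site, rows in groups.items()
    }

# Hypothetical ratings, invented for illustration only.
data = {
    "A": [(7, 4.5), (6, 4.0), (7, 4.4), (5, 3.7)],
    "B": [(4, 3.4), (5, 3.8), (3, 3.0), (4, 3.5)],
}
raw = {site: mean(y for _, y in rows) for site, rows in data.items()}
adj = adjusted_means(data)

for site in data:
    print(f"{site}: raw = {raw[site]:.2f}, adjusted = {adj[site]:.2f}")
```

In this made-up data, site A's participants like the brand more, so its raw advantage shrinks once both sites are compared at the same level of brand favorability. A statistics package would additionally test whether the remaining gap is significant.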
Compare Brand Subgroups
If you aren’t able to run an ANCOVA on your data, you can also take a simpler approach by segmenting your data into High versus Low Brand Favorability groups and then running the comparisons again. For example, I created a new segment of high brand favorability (responses of 7 on the brand favorability scale) and a lower brand favorability segment (1 to 6 on the same scale) and then graphed the SUPR-Q scores.
You can see in Figure 4 below that while there’s a noticeable difference in scores between the high and low brand groups, there’s much less differentiation within each group. The confidence intervals overlap, and the results of the ANOVAs are not significant for the low brand group, F(3,160) = 1.89; p = .13, or the high brand group, F(3,195) = 1.57; p = .20.
Figure 4: SUPR-Q scores across four websites for low brand favorability (green bars) and high brand favorability (peach bars).
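The subgroup split described above can be sketched in a few lines. This is an illustrative example only: the function name `split_by_favorability` and the responses are hypothetical, and the cutoff (7 = high, 1 to 6 = low) mirrors the segmentation used in this study.

```python
from statistics import mean

def split_by_favorability(rows, top_box=7):
    """Average SUPR-Q by site within high and low brand favorability.

    rows is a list of (site, brand_favorability, supr_q) tuples.
    High = the top-box rating; low = every rating below it.
    """
    segments = {}
    for site, favorability, supr_q in rows:
        level = "high" if favorability >= top_box else "low"
        segments.setdefault((level, site), []).append(supr_q)
    return {key: mean(scores) for key, scores in segments.items()}

# Hypothetical responses, invented for illustration only.
rows = [
    ("A", 7, 4.6), ("A", 7, 4.4), ("A", 5, 3.6), ("A", 6, 3.8),
    ("B", 7, 4.5), ("B", 4, 3.5), ("B", 6, 3.7), ("B", 7, 4.3),
]
segment_means = split_by_favorability(rows)

for (level, site), score in sorted(segment_means.items()):
    print(f"{level:>4} favorability, site {site}: mean SUPR-Q = {score:.2f}")
```

Once the data is segmented this way, you would rerun the comparison between sites within each segment, for example with a separate ANOVA per segment.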
As we’ve seen with this example, brand attitudes matter and they can explain much of the difference in UX metrics. But they don’t always render the differences nonsignificant.
In another study with five websites and 492 participants, the differences between sites remained even after factoring in brand favorability. Brand favorability again was correlated with SUPR-Q scores (r = .46). The ANCOVA results still showed statistically significant differences between websites after removing existing brand attitudes, F(4,467) = 3.10; p = .02.
Figure 5 shows how the differences remain after factoring out brand favorability. Website Q had higher SUPR-Q scores than the other websites with the raw scores (orange bars), and this difference, while reduced, remained with the corrected scores from the ANCOVA (blue bars).
Figure 5: Raw and corrected SUPR-Q scores for brand favorability on five websites.
A few things to keep in mind when measuring the user experience and accounting for brand favorability:
- Be sure to track brand favorability. Even a simple single item asking how favorable your participants feel towards a brand is usually sufficient. Collect this prior to any UX metrics or tasks.
- Brand favorability has a strong and statistically significant relationship with UX metrics. The correlation is often above r = .4, meaning brand accounts for at least 16% of the variation in UX metrics (r squared = .16), which often exceeds the differences between interfaces being compared.
- To control for the effects of brand favorability you can use an ANCOVA (preferred) or create a segment of low and high brand favorability participants to see whether the patterns in metrics hold.
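The step from a correlation to "percent of variation explained" is just squaring r. The short sketch below shows that arithmetic along with a plain Pearson correlation you could run on your own favorability and UX scores. The function names are hypothetical helpers, not part of any statistics package.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def variance_explained(r):
    """Share of variance in one measure accounted for by the other."""
    return r ** 2

# Correlations mentioned in this article.
for r in (0.40, 0.46, 0.52):
    print(f"r = {r:.2f} -> {variance_explained(r):.0%} of variation")
```

For the two studies described here, r = .52 and r = .46 correspond to roughly 27% and 21% of the variation in SUPR-Q scores, respectively.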