Having participants think aloud is a valuable tool used in UX research.
It’s primarily used to understand participants’ mental processes, which can ultimately uncover problems with an interface.
It has a rich history in the behavioral sciences that dates back over a century. Despite its value, it’s not without its controversy.
Some research has shown that, depending on the activity, having participants think aloud can actually affect their behavior.
But further research has shown that how the behavior is affected is very much a matter of context and what’s being asked. Sweeping generalizations about the effect of think aloud on behavior should be used with caution.
There are a number of ways to assess the effects of thinking aloud on behavior, including:
- Number and type of usability problems uncovered
- Task metrics (e.g. completion rates, time, clicks)
- Where people look
- How long people look at elements
- Comprehension and recall
- Purchase behavior
- Scores on standardized questionnaires
One good place to start investigating the effects of thinking aloud is how and where people look at webpages. If thinking aloud causes people to systematically look at each part of a page differently, then it’s likely other metrics can be affected: for example, task time, usability problems encountered, and attitudinal metrics collected in standardized questionnaires. Tracking where participants look is a good place to start because it’s more sensitive to subtle differences compared to blunt measures like task completion rates.
Earlier research on the effects of thinking aloud (often called Concurrent Thinking Aloud or CTA) on where and how people look has produced mixed results. Eger found that more problems were uncovered when participants thought aloud while watching videos of their own gaze paths (called Retrospective Think Aloud) than when they thought aloud concurrently. The study also found participants completed fewer tasks and had slightly longer task completion times when they thought aloud. It didn’t indicate, though, how participants’ gaze paths were or were not affected.
In a study by Hertzum et al., researchers found participants took longer to solve tasks and had an increase in mental workload while thinking aloud; they also found some differences in where people looked when thinking aloud. They separated thinking aloud into classic think aloud (as explained by Ericsson and Simon) and what they call Relaxed Thinking Aloud (something more akin to what practitioners do today).
During Relaxed Thinking Aloud they found eye-movements (saccades) were of shorter duration for an assessment task and suggested it was a function of decreased visual search.
Another study by Ogolla found that participants had more fixations and viewed more of the screen on low information scent tasks when they thought aloud. This suggested participants “looked around more” when thinking aloud, something akin to “surveying” a page before beginning a task.
In another study of 95 adults viewing the U.S. Census Bureau website, Romano Bergstrom and Olmsted-Hawala found thinking aloud affected where participants looked on the website. They reported different numbers of eye-fixations on the top and left navigation areas of the website across two tasks depending on whether the participants were concurrently thinking aloud or retrospectively thinking aloud.
The difference was most pronounced between the oldest and youngest cohort and within the young cohort. For example, young adults had twice the number of eye-fixations in the navigation when concurrently thinking aloud.
We wanted to further understand how thinking aloud may affect where and how participants view websites but wanted to include more websites and a variety of layouts (also a suggestion for future research in the Bergstrom and Olmsted-Hawala study).
To help minimize the variables, we started with a core activity: viewing the homepage of a website. In earlier research we’ve seen that first impressions of websites can have a significant impact on attitudinal measures and task success. But does thinking aloud affect where people look on homepages?
We selected 20 homepages across a few industries and measured participants’ gaze paths in our lab in Denver, CO using our SMI eye tracker.
To mitigate the effects of between-person variability and increase statistical power, we used a within-subjects approach. All 13 participants viewed all 20 websites. Each participant was randomly assigned 10 of the websites to view while thinking aloud and viewed the other 10 without being prompted to think aloud.
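The within-subjects assignment described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the site names, participant IDs, the fixed seed, and the shuffled presentation order are all assumptions.

```python
import random


def assign_conditions(websites, n_think_aloud, rng):
    """Randomly split one participant's websites into two conditions.

    Returns a list of (website, condition) pairs in a shuffled viewing order.
    The shuffled order is an assumption; the study does not describe ordering.
    """
    think_aloud = set(rng.sample(websites, n_think_aloud))
    order = list(websites)
    rng.shuffle(order)
    return [
        (site, "think_aloud" if site in think_aloud else "silent")
        for site in order
    ]


# Placeholder names standing in for the 20 homepages.
websites = [f"site_{i:02d}" for i in range(1, 21)]

# Fixed seed (hypothetical) so the plan can be reproduced.
rng = random.Random(42)

# One plan per participant: 13 participants, 10 think-aloud sites each.
plan = {f"P{p:02d}": assign_conditions(websites, 10, rng) for p in range(1, 14)}
```

Because every participant sees every website, each person serves as their own control; the random split keeps any one website from always falling in the same condition.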
Areas of Interest
A challenge with using multiple websites is that the layout can differ quite substantially between websites. Some websites have more prominent navigation, hero images, or logos. Apple, for example, doesn’t have side navigation. We identified four areas of interest (AOIs) that were common across most websites and mentioned in previous research: upper navigation, side navigation, the company logo, and the main content area. An example of these areas of interest is shown on the Zappos homepage in Figure 1.
Figure 1: Areas of interest (AOIs) on the Zappos homepage.
When participants were instructed to think aloud while viewing a website, they were asked to tell the moderator anything they found interesting, things they liked, and things they didn’t like. They were then shown the homepage for five seconds. After the five seconds, they were asked to briefly describe what they saw on the website (in typed words) and answer the 8 SUPR-Q items. Our primary dependent variables are the number of times (fixation count) and the total time (dwell time) people looked at these common areas of interest.
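To make the two dependent variables concrete, here is a minimal sketch of how fixation count and dwell time could be computed per AOI from a list of fixations. It is not the eye tracker's own software: the rectangular AOIs, pixel coordinates, and fixation values are made up for illustration, and dwell time is simplified to the sum of fixation durations inside an AOI (some tools also count the saccades between them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """A rectangular area of interest in screen pixels (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def aoi_metrics(fixations, aois):
    """Return {aoi_name: (fixation_count, dwell_time_ms)}.

    fixations is an iterable of (x, y, duration_ms) tuples.
    """
    totals = {aoi.name: [0, 0.0] for aoi in aois}
    for x, y, duration_ms in fixations:
        for aoi in aois:
            if aoi.contains(x, y):
                totals[aoi.name][0] += 1
                totals[aoi.name][1] += duration_ms
                break  # AOIs are assumed not to overlap
    return {name: (count, dwell) for name, (count, dwell) in totals.items()}


# Made-up AOIs and fixations for a single five-second viewing.
aois = [
    AOI("upper_nav", 0, 0, 1280, 100),
    AOI("left_nav", 0, 100, 200, 800),
    AOI("main_content", 200, 100, 1280, 800),
]
fixations = [(640, 50, 200), (100, 300, 250), (120, 400, 300), (700, 500, 180)]

metrics = aoi_metrics(fixations, aois)
```

With these made-up values, the left navigation gets two fixations and 550 ms of dwell time, while the upper navigation and main content each get one fixation.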
Even with this relatively small sample size we found a statistically significant difference in viewing patterns based on whether participants were thinking aloud. Figure 2 shows the heatmaps for all websites when participants thought aloud and when they didn’t. The heatmaps show the aggregated fixations from the participants across the 20 websites.
Figure 2: Aggregated heatmaps for the 20 websites when participants didn’t think aloud (left panel) versus when thinking aloud (right panel).
You can also see a slightly different fixation pattern from the participants for the Zappos website in Figure 3.
Figure 3: Aggregated heatmaps for the Zappos website when participants didn’t think aloud (left panel) versus when thinking aloud (right panel).
When we looked at differences in the AOIs we also saw some different patterns in the upper navigation and left navigation.
Figure 4 shows the difference in the mean dwell time and mean number of fixations on the upper navigation when participants thought aloud.
Figure 4: Differences in dwell time and fixation counts on the upper navigation.
We found fewer average fixations (1.4 versus 2; p = .002) and a shorter dwell time (353 ms versus 552 ms; p = .008) on the upper navigation when participants were thinking aloud. That is, participants spent about a third less time (roughly 36%) looking at the upper navigation when thinking aloud.
We found slightly more fixations (2.5 versus 1.9; p = .12) and a longer average dwell time (1022 ms versus 577 ms; p = .02) on the left navigation when participants were thinking aloud. That works out to about 77% more time looking at the left navigation when thinking aloud. Figure 5 charts these differences.
Figure 5: Differences in dwell time and fixation counts on the left navigation. Fixation count difference is not statistically significant (p=.12).
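The relative differences in dwell time can be rechecked from the rounded means reported above. The short sketch below treats the not-thinking-aloud mean as the baseline and assumes both upper navigation dwell times are in milliseconds; the function name is ours.

```python
def percent_change(baseline, comparison):
    """Percent change of comparison relative to baseline."""
    return (comparison - baseline) / baseline * 100


# Mean dwell times in ms: not thinking aloud (baseline) vs. thinking aloud.
upper_nav_change = percent_change(552, 353)   # upper navigation
left_nav_change = percent_change(577, 1022)   # left navigation

print(f"Upper navigation: {upper_nav_change:+.0f}%")
print(f"Left navigation: {left_nav_change:+.0f}%")
```

Using the rounded means, the upper navigation comes out to roughly 36% less dwell time and the left navigation to roughly 77% more.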
This study found that participants tended to shift their gaze from the upper navigation to the left navigation when thinking aloud. The results partially matched the findings from the Bergstrom and Olmsted-Hawala study, which found generally more fixations for both upper and left navigation when participants were thinking aloud. So while both studies show a clear effect when participants think aloud, there is an inconsistent pattern as to where the visual focus shifts. An advantage of using multiple websites in this study is that the pattern we observed is less likely to be a consequence of the design of one website (such as the Census Bureau website in the Bergstrom and Olmsted-Hawala study).
It’s been hypothesized that one reason a difference exists is that participants are looking away from the screen to describe something to the researcher or focusing on certain areas of the screen while describing their thought processes regarding that area.
Another explanation for the shift is that participants are quickly leaving the top navigation and moving to the left navigation to find things to read (especially because the participants knew they would be asked to recall what the website did). Future research can examine whether this pattern holds.
Weaknesses & Future Research
There are a few shortcomings and possible mitigating factors that can be addressed in future research.
- Homepages: Only homepages were used in this study and not an entire website. Participants were also not asked to complete a task while thinking aloud on the homepage. Future research can continue to explore whether these results continue to hold when participants engage in a task, rather than just viewing a homepage for five seconds.
- Carryover effects in a within-subjects study: Because of the within-subjects nature of this study, it’s possible that the repeated websites (20 websites total; 10 viewed while thinking aloud) may have primed participants to act differently. For example, after a few websites, participants knew they would be asked what they recalled on the website.
It’s possible that participants may have employed a strategy for systematically looking at certain parts of a webpage (such as the left navigation). However, participants were asked the same questions and had the same number of websites to recall for both thinking aloud and not thinking aloud, which would mean these unwanted carryover effects are equally applied to thinking aloud and non-thinking aloud conditions.
- Different gaze patterns might not translate to other metrics: Just because people had different gaze paths doesn’t necessarily mean it translates into other behaviors, such as task completion rates, time on task, attitudes, or buying behavior. This research did establish more evidence that thinking aloud does affect where participants look and other research has found differences in metrics from participants thinking aloud. Future research can continue to examine how thinking aloud affects other metrics.
This study found that thinking aloud does affect where and how long people look at parts of a website homepage in the first five seconds. The results were consistent with earlier research that also found thinking aloud affects where participants look (at least in some capacity and depending on how participants are asked to think aloud). In this study, thinking aloud tended to shift participants’ gaze from the upper navigation of the webpage to the left navigation. It’s unclear if this pattern will hold with different tasks and webpages.
These results add to earlier research that shows thinking aloud can affect behavior, including where people look. However, as other researchers have stated, that doesn’t mean we should abandon thinking aloud. What’s gained in understanding participants’ thoughts, and the method’s ability to help uncover usability problems, likely outweighs the slight changes in behavior. This research suggests that when you need a precise measure of where and how long people look at parts of a homepage, don’t prompt participants to think aloud.
Thanks to Preston Malenke for assisting with the data collection and analysis.