Are Click Scales More Sensitive than Radio Button Scales?

Jim Lewis, PhD • Jeff Sauro, PhD

Response scales are a basic type of interface. They should reflect the attitude of the respondent as precisely as possible while requiring as little effort as possible to answer.

When we collect data from participants, some wish they could pick a value between two numbers, for example, 5.5 or 6.5, something that slider scales allow.

In previous research, we found that UX-Lite® data collected with slider scales produced slightly more sensitive measurements than comparable data collected with standard five-point scales with radio buttons.

When all else is equal, researchers prefer to use scales that are more sensitive. The choice between multipoint rating scales and sliders, however, is complicated by the relative physical difficulty of setting a slider to a desired location compared to just clicking a radio button.

This led us to use the click testing capability of our MUiQ® platform to experiment with a new rating format—the click scale, where we present an image of a scale on which a respondent can click anywhere.

Using the Single Ease Question (SEQ®) as an example, Figure 1 shows the standard format on which respondents select an integer on the scale by clicking its associated radio button, while Figure 2 shows an example of a click format on which respondents can simply click anywhere on the image, whether directly on a number or somewhere between the numbers (see Figure 3).

Figure 1: Standard SEQ where respondents click on the radio button.

Figure 2: The click SEQ where respondents click anywhere on the image.

Figure 3: Example of responses on the click SEQ (rating of ease of most recent tax filing).

There are two ways to assign numbers to locations on a click scale, which are illustrated in Figure 4. One is the location of the click on the horizontal scale from 0 to 100; the other is to define hotspots on the image associated with the seven response options and then interpolate the seven-point scale to match the range of the horizontal scale for easier comparison of means and standard deviations.

Figure 4: Two methods for assigning numbers to responses on the SEQ click scale.
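A minimal sketch of the two assignment methods, assuming we know a click's x-coordinate and the left and right edges of the scale image (the function and variable names, and the clamping of edge clicks, are illustrative assumptions, not MUiQ's actual implementation):

```python
def click_metrics(click_x, scale_left, scale_right, n_points=7):
    """Convert a click's x-coordinate into the two click-scale metrics."""
    # Horizontal position: linear 0-100 along the scale image.
    frac = (click_x - scale_left) / (scale_right - scale_left)
    frac = min(max(frac, 0.0), 1.0)  # clamp clicks at the image edges
    horizontal = 100 * frac

    # Hotspot: snap the click to the nearest of the seven response
    # options (1..7), then interpolate back onto the 0-100 range so
    # means and standard deviations are comparable across metrics.
    hotspot = round(frac * (n_points - 1)) + 1
    hotspot_0_100 = (hotspot - 1) / (n_points - 1) * 100
    return horizontal, hotspot_0_100
```

Note that the hotspot value discards within-option position by snapping to the nearest option before rescaling, while the horizontal position keeps the full granularity of the click.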

There are many interesting topics to investigate as we experiment with these new click scales. For this article, we collected data to compare the sensitivity of the two methods for assigning numbers to responses on click scales and to compare the sensitivity of the click SEQ with that of the standard (radio button) SEQ.

Study Details

We used our MUiQ® platform to conduct an unmoderated usability study with 200 participants (sampled from a U.S. panel provider in October and November 2023). Participants completed five exercises that were a mix of retrospective and task-based activities with varying difficulty (Table 1) presented in random order. After each task, respondents rated task ease with the click SEQ.

Task Code | Type | Description
XFI | Task-based | Imagine you are helping your friend troubleshoot a problem with their cable TV and internet service. Find the Xfinity customer service phone number to call for support.
ATT | Task-based | You will be asked to find the monthly cost of an iPhone 14 with a specific plan on the AT&T website. Copy or remember the monthly cost (including all fees).
TAX | Retrospective | Reflect on the last time you filed your taxes.
SHA | Task-based | One of the numbered figures in the drawing below is more different than the others. What is the number in that figure? (Note: The fourth figure had six sides while all the others had four.)
AMA | Retrospective | Please reflect on your most recent purchase from Amazon.com.

Table 1: The five tasks in descending order from most difficult to easiest.

Study Results

We analyzed means and standard deviations for the two measurements (horizontal locations from 0 to 100 and hotspot regions from 1 to 7). We expected differences in task ratings because the tasks were specifically designed to cover a wide range from very easy to very difficult.

SEQ Click Scale Means

As shown in Figure 5, the pattern of means was similar for both click scale metrics. A within-subjects ANOVA with task and metric as independent variables found both main effects and their interaction to be statistically significant. As expected, the differences among the tasks had the strongest effect (F(4, 796) = 229.1, p < .0001, partial η² = .54). The main effect of SEQ click metric was the third strongest (F(1, 199) = 57.2, p < .0001, partial η² = .22, overall difference: 1.4, which is less than 2% of the scale range). The interaction between task and metric was the second strongest (F(4, 796) = 129.3, p < .0001, partial η² = .22). Despite the statistically significant differences due to metric and their interaction with task, the metric profiles across tasks in Figure 5 were similar enough to question whether the differences have practical significance.

Figure 5: Interaction between metric and task.
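As a consistency check, partial η² can be recovered from any reported F statistic and its degrees of freedom with the standard identity η²p = F × df_effect / (F × df_effect + df_error). A quick sketch using the main effects reported above:

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Main effects reported above: task F(4, 796) = 229.1
# and click metric F(1, 199) = 57.2.
task_effect = partial_eta_squared(229.1, 4, 796)   # ~.54
metric_effect = partial_eta_squared(57.2, 1, 199)  # ~.22
```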

SEQ Click Scale Standard Deviations

Table 2 shows the standard deviations for each metric and task. Across the five tasks, the mean standard deviations were 23.4 for the horizontal position metric and 27.7 for the hotspot metric.

Task | Horizontal Position | Hotspot
XFI | 30.9 | 35.7
ATT | 26.1 | 31.3
TAX | 22.4 | 26.5
SHA | 22.0 | 26.4
AMA | 11.1 | 13.1
Mean | 23.4 | 27.7

Table 2: Standard deviations for horizontal position and hotspot metrics for each task.

This difference in standard deviations isn’t that surprising. In general, you expect to lose sensitivity when converting a metric from many possible values to just a few.

This difference in standard deviations doesn’t seem that large. In a previous article, however, we demonstrated how changes in standard deviation affect scale sensitivity such that, all other things being equal, sample size requirements are smaller with metrics that have smaller standard deviations (i.e., the study is more efficient).

In this case, if a researcher decided to use the hotspot measurement instead of the horizontal position, the increase in sample size estimate (assuming no other changes) would be 40%:

(27.7 / 23.4)² − 1 = .40

This indicates that the horizontal position is a better click metric than the hotspot for this type of scale.
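Because required sample size scales with variance when all else is held equal, the 40% figure falls out of the squared ratio of the two standard deviations. A minimal sketch:

```python
def sample_size_increase(sd_new, sd_baseline):
    """Relative increase in required sample size when switching metrics,
    holding effect size, alpha, and power constant (n scales with variance)."""
    return (sd_new / sd_baseline) ** 2 - 1

# Hotspot (27.7) vs. horizontal position (23.4) standard deviations:
increase = sample_size_increase(27.7, 23.4)
print(f"{increase:.0%}")  # prints 40%
```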

But how about comparing horizontal position with the standard SEQ (radio buttons)?

We didn’t collect the SEQ with radio buttons in this experiment, but we have two estimates from previous research focused on comparing the standard SEQ with different click versions of the Subjective Mental Effort Question (SMEQ) using the same tasks as in the current experiment. Our standard (radio button) SEQ variability estimates in those experiments were 23.3 and 24.5 (averaging 23.9).

The combined estimate of 23.9 from previous research is very close to the 23.4 for the horizontal position in the current study—close enough that it’s reasonable to question whether it would be worth using click SEQ in place of the standard SEQ.

In our comparison of slider scales with five-point radio button scales to measure the UX-Lite, our conclusion was, “It’s possible that using scales with more than five points would reduce the sensitivity advantage of sliders (another good topic for future research).”

The results of this new experiment do not firmly prove that seven-point scales will always be as sensitive as more fine-grained slider or click scales, but they are consistent with that hypothesis.

Discussion and Summary

We collected data from 200 participants who used the click version of the SEQ to rate the difficulty of completing five online tasks that varied significantly in how hard they were. We compared the measurement sensitivity (standard deviation) of two metrics derived from click SEQ responses (horizontal position and hotspot) with data from previous studies that used the standard (radio button) version of the SEQ. Our key findings were:

The click SEQ does not appear to be more sensitive than the standard (radio button) SEQ. Our best estimate of the standard deviation of the click SEQ across these five tasks is 23.4, just a bit lower than our best estimate of the radio button SEQ standard deviation from two previous studies of those five tasks (23.9).

The slight sensitivity advantage we previously reported for slider scales over five-point scales seems to have disappeared with the seven-point SEQ. This result was not totally unexpected because rating scales with more points tend to be more reliable and sensitive than those with fewer points. There were other differences between the studies: the current study used a click scale and the former a slider, so it will take more experiments to disentangle these effects.

For click scale data, horizontal position metrics are more sensitive than hotspot metrics. The standard deviation for hotspot metrics across five tasks was 27.7, sufficiently larger than the standard deviation for the horizontal position metric (23.4) that researchers using click scales should use horizontal position rather than hotspot metrics.

Limits to generalization: To date, we have conducted only three studies with click scales and only one with the click SEQ; additional studies may identify sensitivity advantages to clicking.

Future research: Additional studies that would improve the generalizability of these findings should include comparisons of standard (radio button) scales with three, five, seven, and eleven response options with slider and click versions of the scales.
