{"id":41595,"date":"2024-06-18T15:07:26","date_gmt":"2024-06-18T21:07:26","guid":{"rendered":"https:\/\/measuringu.com\/?p=41595"},"modified":"2024-06-18T15:07:26","modified_gmt":"2024-06-18T21:07:26","slug":"how-to-analyze-click-test-data-standalone","status":"publish","type":"post","link":"https:\/\/measuringu.com\/how-to-analyze-click-test-data-standalone\/","title":{"rendered":"How to Analyze Click Test Metrics in Stand-Alone Studies"},"content":{"rendered":"
As we cover in our short course<\/a>, click testing tends to be used in the design and release phases of product development, and it generates mostly quantitative data.<\/p>\n Our earlier analyses also showed how click testing provides a reasonable approximation for how people would click on a live website<\/a> or live product pages<\/a> (especially when the live web page doesn\u2019t contain dynamic elements).<\/p>\n Click testing can be thought of as a specialized type of usability testing (defined by the ISO specification of usability<\/a>). Therefore, essential click-testing metrics<\/a>, like usability metrics, can be classified as:<\/p>\n These metrics provide quantitative support for visually interpreted representations such as click maps and heat maps. Software like the MUiQ platform makes it relatively straightforward to see where people click or don\u2019t click on images.<\/p>\n Beyond just eyeballing the data, how do you correctly analyze click-testing metrics? The answer depends on the study setup and your research questions. In this article, we\u2019ll walk through the steps to analyze the data you\u2019ve collected for common research questions addressed with click testing.<\/p>\n We’ll use data from our earlier click-testing study in which participants attempted to locate information across five homepage website images: Creative Commons, NASA, Disqus, IKEA, and California Parks.<\/p>\n There are two main ways to analyze click-test data: stand-alone analyses and comparative analyses. In stand-alone analyses, you use confidence intervals around your sample of data to infer the plausible range of a population parameter like a mean or proportion. In an upcoming article, we\u2019ll cover how to make comparisons with click-testing data (assessing differences in click patterns).<\/p>\n To know which one to use, we\u2019ll start with the research questions and the metric used. When using the MUiQ platform, many of these computations are done automatically. 
Otherwise, you can use statistical packages or online calculators.<\/p>\n As the name suggests, a stand-alone analysis involves summarizing a metric from a sample to make inferences about the larger population of users who were not sampled. To do this, you provide an average or percentage (called a point estimate) and generate a confidence interval around this value to show the plausible range for the average or percentage you would obtain if you could measure the entire population.<\/p>\n There are two types of confidence intervals for click-testing metrics: one for percentages (the adjusted-Wald binomial confidence interval) and one for rating scale, click count, and time data (the t<\/em>-confidence interval, using a log transformation when analyzing time data).<\/p>\n Below are common research questions for the three types of metrics and examples of their corresponding confidence intervals. These examples come from our image versus live site click comparisons of five websites<\/a>: Creative Commons, NASA, Disqus, IKEA, and the California State website.<\/p>\n The following are examples of research questions measuring effectiveness that use percentages:<\/p>\n To analyze percentages, we recommend computing adjusted-Wald binomial confidence intervals. They can be computed using our online calculator<\/a> for binomial confidence intervals, or the intervals can be automatically generated in our MUiQ platform<\/a>.<\/p>\n Although some of the example research questions above could be interpreted as asking about total numbers (e.g., \u201cHow many participants \u2026\u201d), it\u2019s more useful to report percentages even when sample sizes are small<\/a> (dividing the number of participants who made the designated click by the total number of participants who were exposed to the image, thus having an opportunity to click). 
This percentage applies to the larger population that will eventually be exposed to the image (usually a web page).<\/p>\n For example, in our study on testing an image of the NASA homepage, we wanted to know what percentage of participants would click on the \u201cAbout\u201d or \u201cHistory\u201d menu items shown in Figure 1.<\/p>\n Figure 1:<\/strong> Image from the Nasa.gov website with \u201cAbout\u201d and \u201cHistory\u201d hotspots identified.<\/p>\n Online calculator<\/em><\/a>. Of the 62 participants who attempted to click on the optimal locations on the NASA homepage image, 57 were successful. Figure 2 shows the 90% adjusted-Wald binomial confidence interval<\/a> from our online calculator for the task success rate for the image of the NASA site (determined by whether a participant\u2019s first click was on a valid area of interest on the image).<\/p>\n The calculator shows results for four different methods, but for these types of analyses, we recommend reporting the maximum-likelihood estimate (MLE) as the observed percentage and the adjusted-Wald method for the confidence interval (inside the green boxes in Figure 2). For these data, the observed percentage (MLE) is 91.9%, with the adjusted-Wald interval ranging from a low of 84.1% to a high of 96.3% (shown as ratios rather than percentages in the calculator).<\/p>\n Figure 2:<\/strong> 90% confidence interval for the NASA image condition\u2019s 57 successes out of 62 attempts using our online calculator.<\/p>\n MUiQ platform<\/em><\/a>.<\/em> Figure 3 shows the success rate analysis downloaded from the MUiQ platform with 90% adjusted-Wald confidence intervals for all five websites in both experimental conditions (image and live).<\/p>\n Figure 3:<\/strong> Task success rates (90% confidence intervals computed in the MUiQ platform).<\/p>\n When measuring efficiency, click tests use time (measured in seconds and milliseconds). 
Examples of research questions include:<\/p>\n When you analyze time data, you should use log-transformed t<\/em>-confidence intervals, which can be done using our online calculator<\/a>, or they are automatically computed in the MUiQ platform<\/a>.<\/em><\/p>\n Online calculator<\/em><\/a>.<\/em> Figure 4 shows a 90% confidence interval from our online calculator, displaying task-completion times for the image of the Creative Commons homepage. The calculator shows three measures of central tendency: arithmetic mean, median, and our preferred estimate, the geometric mean<\/a>. The geometric mean was 23.1 seconds with a 90% confidence interval ranging from 20.0 to 26.6. This online calculator always computes the confidence interval around the geometric mean rather than the median. Our research<\/a> has shown that the geometric mean is more accurate when n<\/em> < 25, and it’s about as good as the median when sample sizes are larger.<\/p>\n Figure 4:<\/strong> Time to task completion for Creative Commons (90% confidence interval around the geometric mean).<\/p>\n MUiQ platform<\/em><\/a>.<\/em> Figure 5 shows the completion times and 90% confidence intervals computed in the MUiQ platform. The slight difference in the confidence interval for the Creative Commons image is due to presenting the confidence interval around the median rather than the geometric mean. (MUiQ computes confidence intervals around the median when n <\/em>> 25 and around the geometric mean when n<\/em> < 25.)<\/p>\n Figure 5:<\/strong> Time to task completion (90% confidence intervals around the median).<\/p>\n We generally recommend time rather than clicks <\/a>to measure efficiency, but clicks can complement time. 
Research questions that involve clicks may include:<\/p>\n To analyze clicks, we recommend using t<\/em>-confidence intervals, which can be computed in our online calculator<\/a> or automatically in the MUiQ platform<\/a>.<\/p>\n Online calculator<\/em><\/a>.<\/em> With a sample size of 62, the mean number of clicks to task completion for the Disqus image was 1.4 with a standard deviation of 1.9. The 90% confidence interval around the mean ranged from 0.997 to 1.803 (Figure 6).<\/p>\n Figure 6:<\/strong> 90% confidence interval for the mean number of clicks to task completion on the Disqus image.<\/p>\n MUiQ platform<\/em><\/a>.<\/em> Figure 7 shows 90% confidence intervals for all websites and conditions around the mean number of clicks to task completion. At a glance, it\u2019s apparent that except for IKEA, task completion on images took fewer clicks than on live websites.<\/p>\n Figure 7:<\/strong> 90% confidence intervals around mean clicks to task completion computed in MUiQ for all five websites and both conditions (image and live).<\/p>\n Examples of research questions assessing the perception of the experience include:<\/p>\n When you analyze rating scale data, you can compute the mean (the average), top-box responses (extreme responders), or both.<\/p>\n You can compute confidence intervals around the mean with t<\/em>-confidence intervals using our online calculator<\/a> or the MUiQ platform<\/a>.<\/p>\n Online calculator<\/em><\/a>.<\/em> With the same calculator we used to analyze clicks, the 90% confidence interval around the mean SEQ of 5.18 with a standard deviation of 1.71 and sample size of 62 ranges from 4.817 to 5.543 for IKEA in the image condition (Figure 8).<\/p>\n Figure 8:<\/strong> 90% confidence interval around the mean SEQ for the IKEA image (with summary data input).<\/p>\n MUiQ platform<\/em><\/a>. 
Figure 9 shows, for all websites and conditions, the 90% confidence intervals computed around the mean SEQ in the MUiQ platform.<\/p>\n Figure 9:<\/strong> Mean SEQ ratings with 90% confidence intervals computed in MUiQ for all five websites and both conditions (image and live).<\/p>\n Box scores are percentages based on the frequency distributions of responses to rating scales. There are different types of box scores, including top box, top-two box, bottom box, and net box<\/a>. The top-box score is the percentage of most favorable responses (e.g., for the SEQ, the percentage of 7s). As with other percentages, we recommend computing adjusted-Wald binomial confidence intervals with our online calculator<\/a> for binomial confidence intervals.<\/p>\n Of the 62 participants who attempted the IKEA task in our click-test study, 15 selected the response option of 7 for the SEQ, so the top-box score was 24.2% with a 90% confidence interval ranging from 16.4% to 34.2% (Figure 10).<\/p>\n<\/a>In an earlier article, we reviewed when and why to use click testing<\/a>. Click testing involves presenting images to participants and tracking where they click as they attempt assigned tasks. It\u2019s typically administered using a tool like the MUiQ\u00ae<\/sup> platform<\/a>.<\/p>\n
\n
Stand-Alone Analysis (Confidence Intervals)<\/h1>\n
Effectiveness Measures: Percentages<\/h2>\n
\n
How to Analyze<\/h3>\n
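If you prefer to script the calculation rather than use a calculator, the adjusted-Wald interval for a percentage can be sketched in a few lines of Python. This is a minimal illustration, not the code behind our online calculator or MUiQ; the function name adjusted_wald is an assumption made for this example. It adds z²/2 to the number of successes and z² to the number of trials and then applies the standard Wald formula.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald(successes, n, confidence=0.90):
    """Return (low, high) for an adjusted-Wald binomial confidence interval."""
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add z^2/2 successes and z^2 trials before applying the Wald formula.
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# NASA image condition from the article: 57 successes out of 62 attempts.
observed = 57 / 62
low, high = adjusted_wald(57, 62)
print(f"Observed: {observed:.1%}, 90% CI: {low:.1%} to {high:.1%}")
```

With the NASA figures, the sketch should reproduce the 91.9% observed rate and the 84.1% to 96.3% interval reported in the article. The same function can be applied to any other percentage, such as a top-box score (for example, adjusted_wald(15, 62)).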
<\/a><\/p>\n
<\/a><\/p>\n
<\/a><\/p>\n
Efficiency Measures: Time<\/h2>\n
\n
How to Analyze<\/h3>\n
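The log-transformed t-confidence interval for time data can be sketched as follows: take the natural log of each time, compute a t-interval on the logs, and exponentiate the results to get the geometric mean and its bounds. This is an illustration only, not the implementation used by our calculator or MUiQ. The task times are hypothetical (they are not data from the study), and the helper names t_pdf, t_cdf, t_quantile, and geometric_mean_ci are assumptions made for this example. The t critical value is found numerically so the sketch needs only the standard library.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1 by bisection (assumes the answer is below 100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def geometric_mean_ci(times, confidence=0.90):
    """Return (geometric mean, low, high) from a t-interval on log times."""
    logs = [log(t) for t in times]
    n = len(logs)
    center = mean(logs)
    t_crit = t_quantile(1 - (1 - confidence) / 2, n - 1)
    margin = t_crit * stdev(logs) / sqrt(n)
    # Exponentiate to return to the original scale (seconds).
    return exp(center), exp(center - margin), exp(center + margin)


# Hypothetical task times in seconds, for illustration only.
sample_times = [12.4, 18.9, 21.3, 25.0, 31.7, 16.2, 44.8, 22.5, 19.1, 27.6]
gm, low, high = geometric_mean_ci(sample_times)
print(f"Geometric mean: {gm:.1f} s, 90% CI: {low:.1f} to {high:.1f} s")
```

Because the interval is computed on the log scale, the bounds are not symmetric around the geometric mean once they are converted back to seconds, which suits the positive skew typical of task times.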
<\/a><\/p>\n
<\/a><\/p>\n
Efficiency Measures: Clicks<\/h2>\n
\n
How to Analyze<\/h3>\n
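When you have only summary statistics (mean, standard deviation, and sample size), a t-confidence interval around the mean can be sketched as below. This is a minimal example under stated assumptions, not the code used by our calculator or MUiQ; the helper names t_pdf, t_cdf, t_quantile, and mean_ci are hypothetical. The critical value is computed numerically to keep the example within the standard library; a statistics package would normally supply it.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1 by bisection (assumes the answer is below 100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_ci(sample_mean, sd, n, confidence=0.90):
    """Return (low, high) for a t-confidence interval around a mean."""
    t_crit = t_quantile(1 - (1 - confidence) / 2, n - 1)
    margin = t_crit * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Summary figures reported in the article (n = 62 in both cases).
clicks_low, clicks_high = mean_ci(1.4, 1.9, 62)   # Disqus image, clicks
seq_low, seq_high = mean_ci(5.18, 1.71, 62)       # IKEA image, mean SEQ
print(f"Clicks: {clicks_low:.3f} to {clicks_high:.3f}")
print(f"SEQ:    {seq_low:.3f} to {seq_high:.3f}")
```

With the figures from the article, the sketch should come out at roughly 0.997 to 1.803 for the Disqus clicks and 4.817 to 5.543 for the IKEA mean SEQ, matching the calculator output. The same function works for any metric analyzed with a t-interval around the mean.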
<\/a><\/p>\n
<\/a><\/p>\n
Perception Measures: Rating Scales (e.g., SEQ)<\/h2>\n
\n
How to Analyze<\/h3>\n
Analyzing The Mean<\/h4>\n
<\/a><\/p>\n
<\/a><\/p>\n
Analyzing Top-Box Scores<\/h4>\n