{"id":40896,"date":"2024-04-23T18:05:50","date_gmt":"2024-04-24T00:05:50","guid":{"rendered":"https:\/\/measuringu.com\/?p=40896"},"modified":"2024-04-23T14:07:31","modified_gmt":"2024-04-23T20:07:31","slug":"how-long-are-typical-unmoderated-ux-tasks","status":"publish","type":"post","link":"https:\/\/measuringu.com\/how-long-are-typical-unmoderated-ux-tasks\/","title":{"rendered":"How Long Are Typical Unmoderated UX Tasks?"},"content":{"rendered":"

A common logistical consideration when planning a task-based usability study is how much time to allot for each task.<\/p>\n

Many usability studies (especially benchmark studies) suffer from trying to do too many things. That includes asking participants to attempt too many tasks. It\u2019s understandable why tasks get packed in\u2014even low-cost usability testing takes time and money, so you want to make the most of the effort. This is especially the case when participants are difficult or expensive to recruit.<\/p>\n

You want to cover as many tasks as possible, but if you include too many, participants won\u2019t get through them in the allotted time. When planning a study, it helps to know how long typical tasks usually take to complete.<\/p>\n

Of course, task duration depends on the context. The application being tested, participants\u2019 roles, research goals, research protocols (e.g., think-aloud<\/a>, non-think-aloud<\/a>, tree test<\/a>), and the data collection mode (moderated or unmoderated<\/a>) will all play a role.<\/p>\n

We had a similar challenge a few years ago when we investigated the \u201caverage\u201d task completion rate<\/a>, but task times seem even more context dependent than completion rates.<\/p>\n

In this case, though, we\u2019re analyzing completion times not to set a performance benchmark<\/strong> (which is highly context dependent) but rather to get an idea about typical task duration for planning purposes. This will help researchers quickly calculate how many tasks they may plan to include given a study\u2019s time constraints.<\/p>\n

We narrowed our focus to unmoderated studies because participants can get stuck or flounder for a long time without a moderator. We also limited ourselves to traditional usability tasks with small numbers of precisely defined completion goals (e.g., book a hotel room on a specified date in a specified location within a specified price range). In other words, these are pragmatic tasks (where shorter task times reflect more efficient design) rather than hedonic<\/a> activities (where longer times reflect more engagement<\/a> as users interact with a product).<\/p>\n

Details of the Datasets<\/h1>\n

Using our MUiQ\u00ae<\/sup> platform<\/a> to conduct unmoderated UX studies, we collected data from 1,222 tasks as shown in Table 1. They included think-aloud (TA), non-TA, and tree test tasks from 112 different studies (a mix of desktop and mobile websites, mobile apps, and prototypes) conducted from 2021 to 2023.<\/p>\n

The average sample size per task was 71, with a low of 3 participants and a high of 211.<\/p>\n

The overall distribution of times ranged from 3 to 505 seconds (about 8.4 minutes), with a mean of 72 seconds, a median of 54 seconds (interquartile range from 30 to 88 seconds), and a geometric mean of 51 seconds. As expected, the distribution was skewed by a few very long tasks (Figure 1 shows the completion time distributions for the three types of studies). Consequently, we focused on medians and geometric means rather than arithmetic means to estimate the centers of the distributions<\/a>.<\/p>\n
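
To illustrate why the choice of center matters for skewed time data, here is a minimal Python sketch. The task times in it are hypothetical values chosen for illustration (they are not from our dataset), and the function name summarize_times is an assumption of this sketch rather than part of any tool we used.<\/p>\n

```python
from statistics import geometric_mean, mean, median


def summarize_times(times_seconds):
    # Return three estimates of the center of a set of task times.
    return {
        'mean': mean(times_seconds),
        'median': median(times_seconds),
        'geo_mean': geometric_mean(times_seconds),
    }


# Hypothetical task times in seconds (not from our dataset),
# including one very long task that skews the distribution.
example_times = [12, 30, 45, 54, 60, 88, 140, 505]
summary = summarize_times(example_times)
for name, value in summary.items():
    print(f'{name}: {value:.1f} seconds')
```

In this made-up sample, the single 505-second task pulls the arithmetic mean well above the median and the geometric mean, which is the same pattern that led us to focus on the latter two.<\/p>\n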


Figure 1:<\/strong> Dot plots of task completion times for TA, non-TA, and tree test tasks.<\/p>\n

As shown in Table 1, our dataset had many more non-TA (803) than TA (270) or tree test (149) tasks. We collected many of the non-TA tasks in large-scale UX benchmark studies. All differences between the means, medians, and geometric means of the study types were statistically significant (p<\/em> < .01). The median time for TA (67 seconds) was 18% longer than the median for non-TA (57 seconds). As expected, times for tree tests were much shorter than those for the more complex TA and non-TA tasks.<\/p>\n\n\n\n\n\t\n\n\t\n\t\n\t
Study Type<\/th>Median<\/th>Geo Mean<\/th>Mean<\/th>St Dev<\/th>Minimum<\/th>Maximum<\/th>N (Tasks)<\/th>\n<\/tr>\n<\/thead>\n
TA<\/td>67<\/td>74<\/td>96<\/td>80<\/td>3<\/td>489<\/td>270<\/td>\n<\/tr>\n
Non-TA<\/td>57<\/td>57<\/td>75<\/td>64<\/td>3<\/td>505<\/td>803<\/td>\n<\/tr>\n
Tree Test<\/td>13<\/td>14<\/td>15<\/td>\u20078<\/td>5<\/td>\u200741<\/td>149<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n

Table 1:<\/strong> Summary of task completion times (in seconds) for three types of unmoderated UX studies.<\/p>\n

Table 2 shows the key percentiles for each study type. The 50th<\/sup> percentile is the median. The 25th<\/sup> and 75th<\/sup> percentiles are the endpoints of the interquartile range<\/a>.<\/p>\n

It\u2019s rare for a study to have only a single task, so Table 3 shows the total task time for five tasks, a number more typical in our experience. Note that these estimates don\u2019t include the time needed to read task instructions or answer post-task questions.<\/p>\n\n\n\n\n\t\n\n\t\n\t\n\t
Study Type<\/th>5th<\/th>10th<\/th>25th<\/th>50th<\/th>75th<\/th>90th<\/th>95th<\/th>\n<\/tr>\n<\/thead>\n
TA<\/td>28<\/td>34<\/td>46<\/td>67<\/td>111<\/td>205<\/td>268<\/td>\n<\/tr>\n
Non-TA<\/td>15<\/td>23<\/td>34<\/td>57<\/td>\u200790<\/td>149<\/td>200<\/td>\n<\/tr>\n
Tree Test<\/td>\u20077<\/td>\u20078<\/td>10<\/td>13<\/td>\u200718<\/td>\u200729<\/td>\u200734<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n

Table 2:<\/strong> Key percentiles for each study type (cells are completion times in seconds).<\/p>\n\n\n\n\n\t\n\n\t\n\t\n\t
Study Type<\/th>50th (Median)<\/th>75th<\/th>90th<\/th>95th<\/th>\n<\/tr>\n<\/thead>\n
TA<\/td>5.6<\/td>9.2<\/td>17.1<\/td>22.3<\/td>\n<\/tr>\n
Non-TA<\/td>4.8<\/td>7.5<\/td>12.4<\/td>16.7<\/td>\n<\/tr>\n
Tree Test<\/td>1.1<\/td>1.5<\/td>\u20072.4<\/td>\u20072.8<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n

Table 3:<\/strong> Estimated times (in minutes) for different percentiles to complete five<\/strong> tasks for each study type (task completion time only\u2014does not include time for reading tasks or completing post-task questions).<\/p>\n
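
The arithmetic behind Table 3 can be sketched in a few lines of Python: multiply a per-task percentile from Table 2 by the number of tasks and convert seconds to minutes. The percentile values below are copied from Table 2; the names TABLE_2_SECONDS and total_minutes are assumptions made for this illustration.<\/p>\n

```python
# Percentiles from Table 2 (completion times in seconds per task).
TABLE_2_SECONDS = {
    'TA': {50: 67, 75: 111, 90: 205, 95: 268},
    'Non-TA': {50: 57, 75: 90, 90: 149, 95: 200},
    'Tree Test': {50: 13, 75: 18, 90: 29, 95: 34},
}


def total_minutes(seconds_per_task, n_tasks=5):
    # Total task time in minutes, assuming every task takes seconds_per_task.
    return seconds_per_task * n_tasks / 60


for study_type, percentiles in TABLE_2_SECONDS.items():
    cells = [f'{total_minutes(s):.1f}' for s in percentiles.values()]
    print(study_type, cells)
```

Changing n_tasks gives the equivalent estimate for a study with a different number of tasks, under the simplifying assumption that every task takes the same percentile time.<\/p>\n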

So which percentile should you use for planning? The median (the 50th<\/sup> percentile) seems like a good place to start, but it\u2019s risky. By definition, the median is the middle of the distribution: 50% of the tasks in our data took longer. If you plan around the median, you should expect about half of your tasks to run longer than planned. It\u2019s usually better to be a bit conservative and use something like the 75th<\/sup> percentile.<\/p>\n

At other times, you need to be sure you have enough time. In those cases, it\u2019s better to use the 90th<\/sup> or 95th<\/sup> percentile.<\/p>\n
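
As a rough planning aid, the sketch below estimates how many tasks fit in a session for a chosen percentile. It is only an illustration under stated assumptions: the function max_tasks, the 15-minute session, and the 60-second per-task overhead for reading instructions and answering post-task questions are all hypothetical, and only the 111-second value comes from Table 2.<\/p>\n

```python
import math


def max_tasks(session_minutes, seconds_per_task, overhead_seconds=0):
    # Largest whole number of tasks that fits in the session.
    # overhead_seconds is a hypothetical per-task allowance for reading
    # instructions and answering post-task questions.
    if seconds_per_task + overhead_seconds <= 0:
        raise ValueError('time per task must be positive')
    return math.floor(session_minutes * 60 / (seconds_per_task + overhead_seconds))


# 75th percentile for TA tasks from Table 2 (111 seconds),
# in a hypothetical 15-minute session.
print(max_tasks(15, 111))
print(max_tasks(15, 111, overhead_seconds=60))
```

With these inputs, task time alone allows 8 TA tasks, and adding the assumed 60 seconds of overhead per task reduces that to 5.<\/p>\n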

Summary and Discussion<\/h1>\n

The key takeaways from our analysis of 1,222 unmoderated task times are:<\/p>\n

Median times for different research protocols are significantly different.<\/strong> The task time distributions for the three research protocols differ enough that researchers should use the protocol-specific estimates in Tables 2 and 3 when planning a TA, non-TA, or tree test study rather than the overall median and interquartile range.<\/p>\n

Use the 75th<\/sup> percentile for planning.<\/strong> When you plan an unmoderated study and lack historical data about the task, we generally recommend using the 75th<\/sup> percentile for the planned research protocol. Based on the 75th<\/sup> percentiles in Table 2 (rounded up), most tasks should take less than the following times by task type:<\/p>\n

    \n
  • Tree Tests: ~20 seconds<\/li>\n
  • Non-TA Tasks: ~90 seconds<\/li>\n
  • TA Tasks: ~120 seconds<\/li>\n<\/ul>\n

    These estimates don\u2019t include pre-task or post-task questions.<\/strong> The task times in the analysis are only the times participants spend attempting a task and don\u2019t include reading instructions or answering post-task questions. Both activities will add some time to an overall study depending on the length and complexity of instructions and the number of questions.<\/p>\n

    There are no moderated data in these estimates.<\/strong> This analysis did not include any data from moderated studies (tasks with an attending researcher). We included only data from unmoderated studies collected using the MUiQ platform.<\/p>\n

    The dataset is not necessarily broadly representative.<\/strong> Although we created a large dataset of task times from unmoderated TA, non-TA, and tree test studies, a broader dataset would likely include longer times. Our dataset comes from the types of UX studies we typically conduct, which may differ from other UX research contexts where pragmatic tasks are more complex (e.g., coding an error-free speech recognition app) or there is more of a focus on hedonic activities (e.g., \u201cspend as much time as you want to browse the website to see if you find anything interesting\u201d).<\/p>\n

    These data do not define \u201cgood\u201d task times.<\/strong> Be careful not to treat these time data as a benchmark. An unmoderated task that takes 40 seconds is shorter than the median time in our dataset, but this doesn\u2019t mean it\u2019s necessarily a fast or efficient task experience. For example, clicking a login link or entering a search string should take much less than 30 seconds.<\/p>\n","protected":false},"excerpt":{"rendered":"

    A common logistical consideration when planning a task-based usability study is how much time you should plan for a task. Many usability studies (especially benchmark studies) suffer from trying to do too many things. That includes asking participants to attempt too many tasks. It\u2019s understandable why tasks get packed in\u2014even low-cost usability testing takes time […]<\/p>\n","protected":false},"author":2,"featured_media":41187,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"default","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"_price":"field_56e41332a1ae5","_stock":"","_tribe_ticket_header":"","_tribe_default_ticket_provider":"Tribe__Tickets_Plus__Commerce__WooCommerce__Main","_tribe_ticket_capacity":"0","_ticket_start_date":"","_ticket_end_date":"","_tribe_ticket_show_description":"","_tribe_ticket_show_not_going":false,"_tribe_ticket_use_global_stock":"","_tribe_ticket_global_stock_level":"","_global_stock_mode":"","_global_stock_cap":"","_tribe_rsvp_for_event":"","_tribe_ticket_going_count":"","_tribe_ticket_not_going_count":"","_tribe_tickets_list":"[]","_tribe_ticket_has_attendee_info_fields":false,"footnotes":""},"categories":[183,189,2,39],"tags":[26,30,190,73],"acf":[],"ticketed":false,"_links":{"self":[{"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/posts\/40896"}],"collection":[{"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/comments?post=40896"}],"version-history":[{"count":7,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/posts\/40896\/revisions"}],"predecessor-version":[{"id":41186,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/posts\/40896\/revisions\/41186"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/media\/41187"}],"wp:attachment":[{"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/media?parent=40896"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/categories?post=40896"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/measuringu.com\/wp-json\/wp\/v2\/tags?post=40896"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{
rel}","templated":true}]}}