Perhaps it’s something about the precision of minutes and seconds that demands greater scrutiny.
There’s a lot to consider when measuring and analyzing task time. Here are 10 things to know.
- Task times are collected in about half of formative usability tests and 75% of summative tests.
- Task times can be great for diagnosing usability problems. A long task time is often a sign that users are having trouble interacting with the interface.
- Time on task can be collected even when users think aloud. Some studies show that thinking aloud actually increases users’ speed rather than decreasing it. This is different from probing users (asking them questions during the task). Retrospectively probing users after the task allows you to both collect more stable task times and better understand any problems they were having.
- There are three core ways to report task times:
  - Average Task Completion Time: Include only users who completed the task successfully. This is the most common one to report.
  - Mean Time to Failure: The average time users spend on the task before they give up or complete the task incorrectly.
  - Average Time on Task: The total time users spend on the task, regardless of outcome.
- When reporting the average task time, use the geometric mean for small samples (< 25) and the median for larger samples. The arithmetic mean tends to be heavily skewed by a few outliers and the median tends to be less accurate for small sample sizes.
- Compute a confidence interval around the task time after log-transforming your data. You can use this free calculator or download the Excel version.
- Time saved can be an excellent measure of productivity: Showing that a new design can substantially reduce the time it takes users to complete a task is a metric even executives understand.
- You can estimate the task time for experienced users who commit no errors using Keystroke Level Modeling. You can consider this the fastest time to complete a task.
- How long should a task take? There are several strategies for identifying an acceptable task time:
  - Find task times for competing products or products with acclaimed user interfaces.
  - Find the time for the prior version or a similar version of the product.
  - Find the times from the most satisfied users. I call this a bootstrapped specification limit [pdf].
  - Time the task carried out without a computer system (or on an older platform, e.g., DOS).
  - Identify the expert or fastest task time and set the unacceptable condition to 1.5 times (or another multiple of) this time for each task. You can use the results from a KLM analysis. Note: It’s unclear what a good multiple should be. More research is needed to make this approach more meaningful, so use it as a last resort.
- Task Times are Relative: Don’t get hung up on what the “right” task time is. All task times will be wrong. Lab-based studies are too idealized. Unmoderated remote times are easily affected by users being distracted by other activities. It’s the relative comparison using the same method that provides the most meaning.
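
The reporting rules above (three measures, geometric mean for samples under 25, and a confidence interval on log-transformed times) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample data are hypothetical, and the t critical value is passed in by the caller (looked up from a t table) because the standard library has no t distribution.

```python
import math
import statistics


def geometric_mean(times):
    """Exponentiate the mean of the log-transformed times."""
    return math.exp(statistics.fmean([math.log(t) for t in times]))


def average_task_time(times, small_sample_cutoff=25):
    """Geometric mean below the cutoff, median at or above it."""
    if not times:
        return None  # nothing to summarize
    if len(times) < small_sample_cutoff:
        return geometric_mean(times)
    return statistics.median(times)


def log_confidence_interval(times, t_crit):
    """Confidence interval computed on log times, then converted back to seconds.

    t_crit is the two-sided t critical value for len(times) - 1 degrees of
    freedom; the caller looks it up (e.g. 2.262 for n = 10 at 95%).
    """
    logs = [math.log(t) for t in times]
    mean_log = statistics.fmean(logs)
    std_err = statistics.stdev(logs) / math.sqrt(len(logs))
    margin = t_crit * std_err
    return math.exp(mean_log - margin), math.exp(mean_log + margin)


def report_measures(attempts):
    """attempts: list of (seconds, completed_successfully) pairs."""
    completed = [t for t, ok in attempts if ok]
    failed = [t for t, ok in attempts if not ok]
    everyone = [t for t, _ in attempts]
    return {
        "average_task_completion_time": average_task_time(completed),
        "mean_time_to_failure": average_task_time(failed),
        "average_time_on_task": average_task_time(everyone),
    }


# Hypothetical data: 10 users, times in seconds, True = completed successfully.
attempts = [(42, True), (55, True), (38, True), (120, False), (61, True),
            (47, True), (210, False), (52, True), (66, True), (49, True)]
measures = report_measures(attempts)
completed_times = [t for t, ok in attempts if ok]
# 8 completed users -> 7 degrees of freedom; 2.365 is the 95% two-sided t value.
low, high = log_confidence_interval(completed_times, t_crit=2.365)
print(measures)
print(f"95% CI for completion time: {low:.1f}s to {high:.1f}s")
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the geometric mean, which matches the right-skewed shape task times usually have.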
For more information on collecting and analyzing task times, see How to Conduct a Quantitative Usability Test, also available in print from Amazon under the title A Practical Guide to Measuring Usability.