{"id":326,"date":"2016-07-05T21:30:57","date_gmt":"2016-07-05T21:30:57","guid":{"rendered":"http:\/\/measuringu.com\/stats-started\/"},"modified":"2021-01-28T06:30:16","modified_gmt":"2021-01-28T06:30:16","slug":"stats-started","status":"publish","type":"post","link":"https:\/\/measuringu.com\/stats-started\/","title":{"rendered":"5 Steps for Getting Started with Statistics for Research"},"content":{"rendered":"

Statistics can be daunting, especially for UX professionals who aren’t particularly excited about the idea of using numbers<\/a> to improve designs.<\/p>\n

But statistics is a skill that can be learned like any other; it just takes some time to understand the concepts and put them into practice.<\/p>\n

Most participants at our UX Boot Camp<\/a> go from little knowledge of statistics to running statistical comparisons in just three days.<\/p>\n

Here’s the path we take participants on.<\/p>\n

1. Understand the Different Data Types<\/h2>\n

Not all data is created equal. You need to know what type of data you have to determine the best statistical test and method for finding the sample size.<\/p>\n

While there are a number of ways to organize data into types<\/a>, start by segmenting the data into two groups: qualitative and quantitative. Qualitative data<\/a> limits the types of statistical analysis you can perform, but at the very least you can still count<\/a> observations, names, and other non-numeric attributes and summarize those counts.<\/p>\n

Quantitative data can be subdivided into discrete and continuous groups. Discrete data is countable (e.g., number of errors) and is often binary (having only two options), such as purchased\/didn’t purchase or completed\/didn’t complete. You can’t divide discrete data into smaller units; you can’t have half a completion, for example. Continuous data can be subdivided into smaller meaningful units. For example, task time is continuous: you can break it down from, say, a minute to 30 seconds, 1 second, and so on.<\/p>\n

Continuous data provides more fidelity and therefore generally requires a smaller sample size to detect differences. Where possible, collect continuous measures; you can always reduce a continuous measure to a discrete or binary one later, but not the reverse.<\/p>\n
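As a rough illustration of reducing a continuous measure to a binary one, here is a minimal Python sketch. The task times and the 60-second cutoff are made-up values chosen for the example, not data from a real study.

```python
# Hypothetical task times in seconds (illustrative values, not real data).
task_times = [42.5, 58.0, 61.3, 75.9, 33.2, 120.4, 49.8, 66.1]

# Assumed cutoff: a task counts as "fast" if finished within 60 seconds.
CUTOFF_SECONDS = 60

# Continuous summary: the mean uses the full value of every observation.
mean_time = sum(task_times) / len(task_times)

# Binary summary: each time collapses to True/False, which discards
# how far above or below the cutoff the participant actually was.
within_cutoff = [t <= CUTOFF_SECONDS for t in task_times]
proportion_fast = sum(within_cutoff) / len(within_cutoff)

print(f"Mean task time: {mean_time:.1f} s")
print(f"Proportion within {CUTOFF_SECONDS} s: {proportion_fast:.2f}")
```

Going from the times to the proportion is easy; going from the proportion back to the times is impossible, which is why it pays to collect the continuous measure first.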

2. Grasp Sampling Error<\/h2>\n

Rarely can you measure the entire population of users for a product or website. Instead, you have to rely on a sample of users. For example, instead of surveying all 100,000 customers to see how likely they are to renew their service contract, we can use a sample of 100. The percentage of customers who state they will renew will fluctuate from sample to sample.<\/p>\n

Just from random chance, we might get a lot more interested customers in our sample than we would if we measured all 100,000 customers. Interestingly enough, the mean or proportion<\/a> we collect from samples, even rather small samples, doesn’t fluctuate as much as you might think. In fact, there’s a pattern to how much sample means fluctuate that allows us to better predict the unknown population mean. This is embodied in the most important concept in statistics, called the central limit theorem<\/a>. It’s why techniques like confidence intervals and statistical tests work: they take sampling error into account.<\/p>\n
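To see sampling error in action, you can simulate it. The sketch below builds a pretend population of 100,000 customers, assumes 40% of them would renew (an invented figure for illustration), and repeatedly draws samples of 100. It uses only Python's standard library.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

POPULATION_SIZE = 100_000
NUM_RENEWERS = 40_000      # assumed: 40% would renew (illustrative only)
SAMPLE_SIZE = 100
NUM_SAMPLES = 1_000

true_rate = NUM_RENEWERS / POPULATION_SIZE

# Build the population: 1 = would renew, 0 = would not.
population = [1] * NUM_RENEWERS + [0] * (POPULATION_SIZE - NUM_RENEWERS)

# Draw many random samples and record the renewal proportion in each.
sample_proportions = [
    sum(random.sample(population, SAMPLE_SIZE)) / SAMPLE_SIZE
    for _ in range(NUM_SAMPLES)
]

mean_of_samples = statistics.mean(sample_proportions)
spread_of_samples = statistics.stdev(sample_proportions)

# Standard error of a proportion from theory: sqrt(p * (1 - p) / n)
expected_se = (true_rate * (1 - true_rate) / SAMPLE_SIZE) ** 0.5

print(f"True renewal rate:         {true_rate:.3f}")
print(f"Mean of sample results:    {mean_of_samples:.3f}")
print(f"Spread of sample results:  {spread_of_samples:.3f}")
print(f"Spread predicted by theory: {expected_se:.3f}")
```

Individual samples miss the true rate, but the sample results cluster around it, and the observed spread should land close to the value the formula predicts (about 0.049 for these inputs). That predictability is what confidence intervals rely on.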

3. Compute Confidence Intervals<\/h2>\n

To understand how precise an estimate is from a sample, you use a confidence interval<\/a>. Confidence intervals take into account the sampling error and tell you the most likely range for the unknown population average.<\/p>\n

You’ve almost certainly heard of the margin of error, usually in regard to an electorate’s attitude toward an issue. If 50% approve of the president’s job performance with a margin of error of 3%, the confidence interval is 47% to 53%. This range is the best estimate, usually based on a sample of around 1,000 people, of how all eligible voters (millions of people) would assess the president’s performance.<\/p>\n
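The polling example can be checked with a few lines of Python. This sketch uses the simple normal-approximation (Wald) formula at 95% confidence, which is an assumption made here for illustration; it is one common way to compute a margin of error, not necessarily the method a given pollster uses. The function name `wald_interval` is made up for this example.

```python
import math

def wald_interval(proportion, sample_size, z=1.96):
    """Return (low, high, margin) using the normal approximation.

    z=1.96 corresponds to a 95% confidence level.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    margin = z * standard_error
    return proportion - margin, proportion + margin, margin

# 50% approval from a sample of 1,000 respondents.
low, high, margin = wald_interval(0.50, 1000)

print(f"Margin of error: {margin:.1%}")           # prints "Margin of error: 3.1%"
print(f"95% interval: {low:.1%} to {high:.1%}")   # prints "95% interval: 46.9% to 53.1%"
```

A sample of about 1,000 gives a margin of roughly 3 percentage points, which matches the 47% to 53% range in the example.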

With confidence intervals, you need to consider three factors<\/a>:<\/p>\n