{"id":31204,"date":"2022-02-15T18:52:59","date_gmt":"2022-02-16T01:52:59","guid":{"rendered":"https:\/\/measuringu.com\/?p=31204"},"modified":"2022-02-15T07:32:06","modified_gmt":"2022-02-15T14:32:06","slug":"five-styles-of-statistical-rhetoric","status":"publish","type":"post","link":"https:\/\/measuringu.com\/five-styles-of-statistical-rhetoric\/","title":{"rendered":"Five Styles of Statistical Rhetoric"},"content":{"rendered":"

When learning statistics, you\u2019ll encounter many formulas based on principles of probability and mathematics. But statistics isn\u2019t just a formulaic process where you enter data and are told what to do. Statistics should guide, not dictate, decisions.<\/p>\n

In making decisions, though, there are different styles of interpreting data. Although a lot of people think statistics will provide definitive right and wrong answers, the practice is much more nuanced and subject to interpretation. In fact, statistics, like other disciplines, is influenced by the people using it and the context in which it is used. Interpreting statistical output can be a lot like interpreting the law.<\/p>\n

For example, in constitutional interpretation<\/a>, multiple well-educated and informed people can read the same words and interpret those words to mean different things. What constitutes speech? With legal interpretation, we\u2019ve come to expect differences that can result in heated debate, as judges and justices are selected based on how they differ in their interpretation of the same few words. Styles of legal interpretation can lie on a spectrum from conservative to liberal, with many subtle distinctions.<\/p>\n

While statistics doesn\u2019t have a Magna Carta, constitution, or bill of rights, it does have guiding principles. Those principles, like the written law, are subject to interpretation. And while it may seem odd to compare statistics to politics, the two have much in common when it comes to understanding differences in opinions.<\/p>\n

Experts in statistics, as in other disciplines<\/a> (e.g., particle physics, radiology), disagree. We see this disagreement in how a video of a participant attempting a task elicits different interpretations<\/a> and remedies from different UX professionals.<\/p>\n

Interpretative differences can take many forms. In statistics, there tends to be a continuum of interpretation styles. Robert P. Abelson, a former statistics professor at Yale, attempted to define these styles, modeling them on what we see in politics.<\/p>\n

In 1995, Abelson published Statistics as Principled Argument<\/a><\/em>, a book targeted at students taking their first-year graduate statistics course. He would joke that his original title was going to be, \u201cLots of Things You Ought to Know about Statistics, but Are Too Stupefied to Ask.\u201d<\/p>\n

In that book, Abelson articulated four different styles of statistical rhetoric, displayed in Figure 1 from brash to stuffy.<\/p>\n

Figure 1:<\/strong> Four styles of statistical rhetoric, both reasonable and unreasonable.<\/figcaption><\/figure>\n

In this article, we review Abelson\u2019s four styles and describe a fifth style\u2014the pragmatic style.<\/p>\n

The Brash Style<\/h2>\n

On the spectrum of attitudes, we start at the extreme left. Researchers who exhibit the brash style would, in most cases, rather have their analyses produce statistically significant outcomes than fail to reject their null hypotheses<\/a>.<\/p>\n

Researchers who are desperate to achieve statistical significance have several ways to create the illusion of it. The unjustified use of these methods is the hallmark of the brash style. Here are some symptoms of this style:<\/p>\n

Conducts one-tailed tests when they are not appropriate.<\/strong> Statistical tests can be one-tailed or two-tailed. All other things being equal, one-tailed tests indicate statistical significance more often than two-tailed tests. One-tailed tests are appropriate in a small set of research contexts, for example, when comparing a sample of data to a set benchmark<\/a>. Otherwise, the logic of statistical hypothesis testing<\/a> requires the use of two-tailed tests.<\/p>\n

Engages in p-hacking.<\/strong> If you run enough tests, some of them will indicate statistical significance just by chance, even though there is no real effect (i.e., Type I errors<\/a>). Running many tests and highlighting only the significant ones is known as p-hacking<\/a>. One way to do this is to test all possible splits in a data set, whether or not they are related to hypotheses documented before data collection. Another is to run all possible multiple comparisons for an independent variable with many levels (e.g., multiple products in a retrospective UX survey), focusing on significant results without regard to the number of comparisons.<\/p>\n

States actual p-value but talks around it.<\/strong> If you have decided ahead of time, based on careful consideration of the research context, to set the alpha criterion to p < .05, you should stick with it. If you have analyzed the relative costs of Type I and Type II statistical decision errors and concluded that an alpha criterion of p < .10 (or p < .001) is appropriate, then stick with that. (There is nothing magical about p < .05<\/a>.) But these decisions need to be made before running statistical tests. If you fudge on the p-values to support the conclusions you want to make, your style is brash.<\/p>\n

Manipulates outliers.<\/strong> Like the decision about which alpha criterion to use, decisions about how to treat outliers should be made before data collection and analysis. Researchers with a brash style make that decision after they see the data, either including or excluding outliers as needed to support the points they want to make.<\/p>\n
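To make the one-tailed symptom above concrete, here is a minimal Python sketch that computes both p-values for the same test statistic. It assumes a simple z-test with a standard normal reference distribution, and the z value of 1.80 is a hypothetical number chosen for illustration, not a result from any study.

```python
from statistics import NormalDist


def z_test_p_values(z):
    """Return (one_tailed, two_tailed) p-values for a z statistic.

    The one-tailed value assumes the effect was predicted to be positive
    before the data were collected (an assumption of this sketch).
    """
    upper_tail = 1.0 - NormalDist().cdf(z)
    one_tailed = upper_tail
    # The two-tailed value doubles the smaller tail area.
    two_tailed = 2.0 * min(upper_tail, 1.0 - upper_tail)
    return one_tailed, two_tailed


one_tailed, two_tailed = z_test_p_values(1.80)  # hypothetical z statistic
print(f'one-tailed p = {one_tailed:.3f}')
print(f'two-tailed p = {two_tailed:.3f}')
```

With these inputs the one-tailed p-value is about .036 and the two-tailed p-value is about .072, so the same data cross the p < .05 line only when the test is one-tailed. That is why switching to a one-tailed test after seeing the data is a brash move.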

We refer to the rhetorical style that overstates every statistical result as brash. Investigators who use the \u2026 devices previously listed, freely and inappropriately, invite skepticism and disfavor. <\/em>(Abelson, 1995, p. 55)<\/p>\n
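The p-hacking symptom can also be sketched numerically. The Python example below estimates how often at least one test comes out significant when there is no real effect anywhere. It rests on two simplifying assumptions: the tests are treated as independent (pairwise comparisons of the same products are not, so the figure is only an approximation), and p-values under the null hypothesis are uniformly distributed. The function names and the five-product survey are hypothetical.

```python
import math
import random


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_false_positive_rate(alpha, n_tests, n_studies=20000, seed=1):
    """Simulate studies with no real effects.

    Under the null hypothesis each p-value is uniform on [0, 1], so a
    uniform random draw stands in for running an actual test.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_studies


n_products = 5  # hypothetical number of products in the survey
n_comparisons = math.comb(n_products, 2)  # all pairwise comparisons
print(n_comparisons)
print(round(familywise_error_rate(0.05, n_comparisons), 3))
print(round(simulate_false_positive_rate(0.05, n_comparisons), 3))
```

Five products yield 10 pairwise comparisons, and under these assumptions the chance of at least one spurious significant result at p < .05 is roughly 40%, which the simulation should approximately reproduce. Reporting only the comparison that happened to cross the line, without mentioning the other nine, is what makes the practice brash.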

The Stuffy Style<\/h2>\n

At the other end of the spectrum is the conservative stuffy style. It\u2019s the opposite of the brash style (\u201canything goes\u201d), pushing caution to an extreme (\u201cnothing goes\u201d). Researchers using a stuffy style exhibit these symptoms:<\/p>