Tuesday, April 26, 2016

Statistical software should remove *** notation for statistical significance

Now that the emotional storm following the American Statistical Association's statement on p-values is dying down (is it? was there even a storm outside of statistics?), let's think about a practical issue, one that greatly influences data analysis in most fields: statistical software. Statistical software influences which methods are used and how results are reported. Software companies thus affect how entire disciplines progress and communicate.

Star notation for p-value thresholds in statistical software

Whether your field uses SAS, SPSS (now part of IBM), Stata, or another statistical software package, you have likely seen the star notation (this isn't about hotel ratings). One star (*) means p-value < 0.05, two stars (**) mean p-value < 0.01, and three stars (***) mean p-value < 0.001.
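
To make the thresholds concrete, here is a minimal Python sketch of the mapping described above. The function name `stars` and the example p-values are illustrative assumptions; this is not the code of any particular statistical package.

```python
def stars(p: float) -> str:
    """Map a p-value to the conventional star notation (hypothetical helper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""  # no star: p-value is 0.05 or larger


# Illustrative p-values (made up for this example), printed next to their stars.
for p in (0.049, 0.0168, 0.011, 0.0009):
    print(f"p = {p:<7} stars: {stars(p)}")
```

Note that 0.049, 0.0168, and 0.011 all collapse into the same single star, even though the first is more than four times larger than the last.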

According to the ASA statement, the source of the problem is not p-values themselves but their discretization. The ASA recommends:

"P-values, when used, would be reported as values, rather than inequalities (p = .0168, rather than p < 0.05). Indeed, we envision there being better recognition that measurement of the strength of evidence really is continuous, rather than discrete."

This statement is a strong signal to statistical software companies: continuing to use the star notation, even if your users are addicted to it, violates the ASA recommendation. Will we be seeing any change soon?