Friday, February 02, 2007

The legendary threshold of 5% for p-values

Almost every introductory course in statistics gets to a point where the concept of the p-value is introduced. It is a tough concept, usually one of the hardest for students to internalize, and it takes time to absorb. An interesting paper by Hubbard and Armstrong discusses the confusion in marketing research, as it shows up in textbooks and journal articles.

Another "fact" that usually accompanies the p-value concept is the 5% threshold. One typically learns to compare the p-value (computed from the data) to a 5% threshold: if it falls below that threshold, the effect is declared statistically significant.
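To make the comparison concrete, here is a minimal sketch in Python. The coin-flip setting, the function name `binomial_two_sided_p`, and the numbers are my own illustration, not something taken from the paper or from any textbook: the null hypothesis is that a coin is fair, and the p-value is the probability of a result at least as lopsided as the one observed.

```python
from math import comb

def binomial_two_sided_p(heads, flips):
    """Exact two-sided p-value for H0: the coin is fair (hypothetical helper)."""
    # Under H0 the number of heads is symmetric around flips / 2, so "at least
    # as extreme" means at least as far from flips / 2 as the observed count.
    distance = abs(heads - flips / 2)
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= distance
    )
    return extreme / 2 ** flips

p_value = binomial_two_sided_p(heads=9, flips=10)

# The same p-value leads to different verdicts depending on the threshold.
for threshold in (0.05, 0.01):
    verdict = "significant" if p_value < threshold else "not significant"
    print(f"p = {p_value:.4f}, threshold = {threshold}: {verdict}")
```

With 9 heads in 10 flips the result clears the 5% threshold but not a 1% one, so the conclusion depends entirely on which cutoff was picked in advance.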

Where does the 5% come from? I pondered that at some point. If a p-value is thought of as a measure of risk, then 5% is pretty arbitrary: some applications obviously warrant lower risk levels, while others might tolerate higher ones. According to Jerry Dallal's webpage, the reason is historical: before the age of computers, tables were used for computing p-values. In Fisher's original tables the levels computed were 5% and a few others. The rest, as they say, is history.
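One way to see the threshold as a risk level is to simulate experiments in which there is truly no effect and count how often the p-value still dips below the cutoff. The sketch below is only an illustration under assumptions of my own: normally distributed data with a known standard deviation of 1, a two-sided z-test, and the hypothetical helper names `z_test_p_value` and `false_alarm_rate`.

```python
import random
from statistics import NormalDist

def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, assuming a known standard deviation."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_alarm_rate(threshold, trials=10_000, n=30, seed=0):
    """Share of no-effect experiments whose p-value falls below the threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # The data are generated with a true mean of 0, so H0 holds every time.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < threshold:
            hits += 1
    return hits / trials

for threshold in (0.10, 0.05, 0.01):
    print(threshold, false_alarm_rate(threshold))
```

In this setup the false-alarm rate should land close to whatever threshold is chosen, which is the sense in which 5% is just one risk level among many.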

2 comments:

Star River said...

Totally agree. And this is why some econ journals require authors to just report the standard errors and get rid of the little magic stars over the coefficients.

It leaves the readers (and of course the authors) to judge whether the coefficients are significant or not.

Gordon

Star River said...

Totally agree. And this is why some econ journals require authors to just report the standard errors and not show the little magic stars over the coefficients.

It leaves the readers (and of course the authors) to judge whether the coefficients are significant or not.

Another thought: if everyone is using the 5% threshold, does this mean that 5% of all empirical research on earth is wrong? Because in expectation, 5% of the hypotheses are rejected by mistake. I might be wrong though.