There are various proposed corrections for multiple testing, the most basic principle being reducing the individual α's. However, the various corrections suffer in this way or the other from reducing statistical power (the probability of detecting a real effect). One important approach is to limit the number of hypotheses to be tested. All this is not new to statisticians and also to some circles of researchers in other areas (a 2008 technical report by the US department of education nicely summarizes the issue and proposes solutions for education research).
|"Large-Scale" = many measurements|
And now I get to my (hopefully novel) point: empirical research in the social sciences is now moving to the era of "large n and same old k" datasets. This is what I call "large samples". With large datasets becoming more easily available, researchers test few hypotheses using tens and hundreds of thousands of observations (such as lots of online auctions on eBay or many books on Amazon). Yet, the focus has remained on confirmatory inference, where a set of hypotheses that are derived from a theoretical model are tested using data. What happens to multiple testing issues in this environment? My claim is that they are gone! Decrease α to your liking, and you will still have more statistical power than you can handle.
But wait, it's not so simple: With very large samples, the p-value challenge kicks in, such that we cannot use statistical significance to infer practically significant effects. Even if we decrease α to a tiny number, we'll still likely get lots of statistically-significant-but-practically-meaningless results.
The bottom line is that with large samples (large-n-same-old-k), the approach to analyzing data is totally different: no need to worry about multiple testing, which is so crucial in small samples. This is only one among many other differences between small-sample and large-sample data analysis.
Post a Comment