Monday, December 13, 2010

Discovering moderated relationship in the era of large samples

I am currently visiting the Indian School of Business (ISB) and enjoying their excellent library. As in my student days, I roam the bookshelves and discover books on topics that I know little, some, or a lot. Reading and leafing through a variety of books, especially across different disciplines, gives some serious points for thought.

As a statistician I have the urge to see how statistics is taught and used in other disciplines. I discovered an interesting book coming from the psychology literature by Herman Aguinas called Regression Analysis for Categorical Moderators. "Moderators" in statistician language is "interactions". However, when social scientists talk about moderated relationships or moderator variables, there is no symmetry between the two variables that create the interaction. For example if X1=education level, X2=Gender, and Y=Satisfaction at work, then an inclusion of the moderator X1*X2 would follow a direct hypothesis such as "education level affects satisfaction at work differently for women and for men."

Now to the interesting point: Aguinis stresses the scientific importance of discovering moderated relationships and opens the book with the quote:
"If we want to know how well we are doing in the biological, psychological, and social sciences, an index that will serve us well is how far we have advanced in our understanding of the moderator variables of our field."        --Hall & Rosenthal, 1991
Discovering moderators is important for understanding the bounds of generalizability as well as for leading to adequate policy recommendations. Yet, it turns out that "Moderator variables are difficult to detect even when the moderator test is the focal issue in a research study and a researcher has designed the study specifically with the moderator test in mind."

One main factor limiting the ability to detect moderated relationships (which tend to have small effects) is statistical power. Aguinas describes simulation studies showing this:
a small effect size was typically undetected when sample size was as large as 120, and ...unless a sample size of at least 120 was used, even ... medium and large moderating effects were, in general, also undetected.
This is bad news. But here is the good news: today, even researchers in the social sciences have access to much larger datasets! Clearly n=120 is in the past. Since this book has come out in 2004, have there been large-sample studies of moderated relationships in the social sciences?

I guess that's where searching electronic journals is the way to go...

No comments: