## Thursday, December 23, 2010

### No correlation -> no causation?

I found an interesting variation on the "correlation does not imply causation" mantra in the book Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences by Cohen et al. (apparently one of the statistics bibles in behavioral sciences). The quote (p.7) looks like this:
Correlation does not prove causation; however, the absence of correlation implies the absence of the existence of a causal relationship
Let's let the first part rest in peace. At first glance, the second part seems logical: you find no correlation, then how can there be causation? However, after further pondering I reached the conclusion that this logic is flawed, and that one might observe no correlation when in fact there exists underlying causation. The reason is that causality is typically discussed at the conceptual level while correlation is computed at the measurable data level.

 Where is Waldo?
Consider an example where causality is hypothesized at an unmeasurable conceptual level, such as "higher creativity leads to more satisfaction in life". Computing the correlation between "creativity" and "satisfaction" requires operationalizing these concepts into measurable variables, that is, identifying measurable variables that adequately represent these underlying concepts. For example, answers to survey questions regarding satisfaction in life might be used to operationalize "satisfaction", while a Rorschach test might be used to measure "creativity". This process of operationalization obviously does not lead to perfect measures, not to mention that data quality can be sufficiently low to produce no correlation even if there exists an underlying causal relationship.

In short, the absence of correlation can also imply that the underlying concepts are hard to measure, are inadequately measured, or that the quality of the measured data is too low (i.e., too noisy) for discovering a causal underlying relationship.