Showing posts with label causality. Show all posts

Friday, August 09, 2013

Predictive relationships and A/B testing

I recently watched an interesting webinar, "Seeking the Magic Optimization Metric: When Complex Relationships Between Predictors Lead You Astray", by Kelly Uphoff, manager of experimental analytics at Netflix. The presenter mentioned that Netflix is a heavy user of A/B testing for experimentation, and the talk focused on the goal of optimizing retention.

In ideal A/B testing, the company would test the effect of an intervention of choice (such as displaying a promotion on their website) on retention by assigning it to a random sample of users, and then comparing the retention of the intervention group to that of a control group that was not subject to the intervention. Because assignment is random, this experimental setup can help infer a causal effect of the treatment on retention. The problem is that retention can take a long time to measure: if retention is defined as "customer paid for the next 6 months", you have to wait 6 months before you can determine the outcome.
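The comparison described above can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the retention rates (0.70 and 0.74), the group size, and the helper names `simulate_retention` and `two_proportion_z` are all hypothetical and do not come from the webinar. The sketch simulates a randomized experiment and compares the two retention proportions with a standard two-proportion z-test.

```python
import math
import random


def simulate_retention(n_users, retention_rate, rng):
    """Return how many of n_users are retained, each with probability retention_rate."""
    return sum(rng.random() < retention_rate for _ in range(n_users))


def two_proportion_z(retained_a, n_a, retained_b, n_b):
    """z statistic and two-sided p-value for the difference between two proportions."""
    p_a, p_b = retained_a / n_a, retained_b / n_b
    pooled = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


rng = random.Random(0)
n = 5000  # users per group (assumed)

# Hypothetical true 6-month retention rates, invented for this illustration.
control = simulate_retention(n, 0.70, rng)
treated = simulate_retention(n, 0.74, rng)

z, p = two_proportion_z(treated, n, control, n)
print(f"control retention: {control / n:.3f}")
print(f"treated retention: {treated / n:.3f}")
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

Note that the outcome in this sketch is only available once the full retention window has elapsed, which is exactly the practical difficulty raised above.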

Thursday, December 23, 2010

No correlation -> no causation?

I found an interesting variation on the "correlation does not imply causation" mantra in the book Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences by Cohen et al. (apparently one of the statistics bibles in behavioral sciences). The quote (p.7) looks like this:
"Correlation does not prove causation; however, the absence of correlation implies the absence of the existence of a causal relationship"
Let's let the first part rest in peace. At first glance, the second part seems logical: if you find no correlation, how can there be causation? However, after further pondering I reached the conclusion that this logic is flawed, and that one might observe no correlation when in fact there exists underlying causation. The reason is that causality is typically discussed at the conceptual level, while correlation is computed at the level of measurable data.

Where is Waldo?
Consider an example where causality is hypothesized at an unmeasurable conceptual level, such as "higher creativity leads to more satisfaction in life". Computing the correlation between "creativity" and "satisfaction" requires operationalizing these concepts into measurable variables, that is, identifying measurable variables that adequately represent these underlying concepts. For example, answers to survey questions regarding satisfaction in life might be used to operationalize "satisfaction", while a Rorschach test might be used to measure "creativity". This process of operationalization obviously does not lead to perfect measures, not to mention that data quality can be sufficiently low to produce no correlation even if there exists an underlying causal relationship.

In short, the absence of correlation can also imply that the underlying concepts are hard to measure, are inadequately measured, or that the quality of the measured data is too low (i.e., too noisy) for discovering an underlying causal relationship.
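A small simulation can make this point concrete. The sketch below is an illustration only: the effect size (0.8), the noise levels, and the variable names are assumptions chosen for the example, not estimates from any real study. It builds a conceptual-level relationship in which "creativity" really does drive "satisfaction", then measures both concepts with very noisy instruments and compares the two correlations.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)
n = 2000

# Conceptual level (unobservable in practice): creativity causes satisfaction.
creativity = [rng.gauss(0, 1) for _ in range(n)]
satisfaction = [0.8 * c + rng.gauss(0, 0.6) for c in creativity]

# Data level: poor operationalizations add heavy measurement noise (assumed sd = 5).
creativity_score = [c + rng.gauss(0, 5) for c in creativity]
satisfaction_score = [s + rng.gauss(0, 5) for s in satisfaction]

r_concepts = pearson(creativity, satisfaction)
r_measured = pearson(creativity_score, satisfaction_score)
print(f"correlation between the concepts:      {r_concepts:.2f}")
print(f"correlation between the noisy measures: {r_measured:.2f}")
```

Under these assumptions the causal link is strong by construction, yet the correlation between the measured scores comes out close to zero, because the measurement noise swamps the signal.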

Saturday, January 10, 2009

Beer and ... crime

I often glance at the local newspapers while visiting a foreign country (as long as they are in a language I can read). Yesterday, the Australian Herald Sun ran the article "Drop in light beer sales blamed for surge in street violence".

The facts presented: "Light beer sales have fallen 15% in seven years, while street crime has soared 43%". More specifically: "Police statistics show street assaults rose from 6400 in 2000-01 to more than 9000 in 2007-08. At the same time, Victorians' thirst for light beer dried up."

The interpretation by health officials: "there was a definite connection between the move away from light beer and the rise in drunken violence."

The action: There is now a suggestion to drastically reduce tax on light beer to encourage people to switch back from full-strength beer.

I am far from being an expert on drinking problems or crime in Australia (although they are both very visible here in Melbourne), but let's look at the title of this article and the data-interpretation-action sequence more carefully. The title "Drop in light beer sales blamed for surge in street violence" implies that the drop in light beer sales is the cause of the increase in violence. Obviously such a direct causal relationship cannot be true unless perhaps retailers of light beer have become frustrated and violent... So, the first causal argument (I suppose) is that the decline in drinking light beer reflects a move to full-strength beer, which in turn leads to more violence. If there indeed is a shift of this sort, then the decline in light beer sales is merely a proxy for violent behavior trends*.

The second causal hypothesis, implied by the proposed action, is that beer drinkers in Victoria will switch from full-strength to light beer if the latter is sufficiently cheap.

To establish such causal arguments I'd like to see a bit more research (which might already exist and simply not be mentioned in the article):
  • Have people in Victoria indeed shifted from drinking light beer to full-strength beer? (perhaps via a survey or from transactional data at "bottle" stores) -- there might just be an overall decline in beer consumption, as well as an overall increase in violent behavior, as in other places in the world
  • Has violence also increased among people who do not drink beer? What about drugs, violent movies, shift of populations, economic trends, global violence levels?
  • If such a shift exists, what are its reasons? (e.g., better quality of full strength beer, social trends, price)
  • What segment of beer drinkers in Victoria becomes violent? (age, gender, employment, income, where they buy beer, etc.)
  • Is today's beer drinking population different from the population 7 years ago in some other important ways that relate to violence?
  • Has violence been treated differently over the years? (police presence, social norms, etc.)
  • How price-sensitive are today's drinkers-turned-aggressors?
Only after answering the above questions, and perhaps others, would I be comfortable with seeing a causal relationship between the price of light beer (compared to full-strength beer) and the levels of aggression.

And if beer drinking is indeed a cause of violence in Victoria, how about adopting behavioral and educational ideas from other countries like France? Or maybe alcohol is simply loosening the inhibitions of an increasingly aggressive 21st-century society.

------------------------------------------
*Note: Even if light beer sales are merely a proxy for crime levels, they can be used for predictive purposes. For example, police stations can use light beer sales levels for staffing decisions.
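The distinction in the note between prediction and causation can be sketched with a toy simulation. Everything here is hypothetical: the weekly figures, the coefficients, and the assumption that a latent level of full-strength drinking drives both series while light beer sales have no direct effect on assaults. Under those assumptions, a simple least-squares line predicts assaults from light beer sales reasonably well, yet boosting light beer sales without changing full-strength drinking leaves assaults untouched.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - slope * mx, slope


rng = random.Random(1)
n_weeks = 200

# Latent, unobserved level of full-strength drinking (made up for illustration).
latent = [rng.gauss(0, 1) for _ in range(n_weeks)]
# Both observed series depend on the latent level, not on each other.
light_sales = [100 - 10 * z + rng.gauss(0, 2) for z in latent]
assaults = [150 + 20 * z + rng.gauss(0, 5) for z in latent]

intercept, slope = fit_line(light_sales, assaults)


def predict_assaults(sales):
    """Predict weekly assaults from light beer sales using the fitted line."""
    return intercept + slope * sales


# Predictive use: the proxy carries real information about assaults.
print(f"fitted slope: {slope:.2f} assaults per unit of light beer sales")

# Intervention: raise light beer sales by 20 units, with the latent level unchanged
# (an assumption of this sketch). Assaults were never generated from sales,
# so they stay the same.
boosted_sales = [s + 20 for s in light_sales]
model_forecast = sum(predict_assaults(s) for s in boosted_sales) / n_weeks
actual_after = sum(assaults) / n_weeks
print(f"model forecast after boost: {model_forecast:.1f}")
print(f"simulated actual after boost: {actual_after:.1f}")
```

In this toy world the fitted line is useful for staffing-style forecasts, but its forecast of fewer assaults after the sales boost is wrong, because the relationship it learned is not causal.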