A recent Harvard Business Review article, "Don't Let Big Data Bury Your Brand," touches on one alarming aspect of how predictive analytics is used: companies do not realize that machine-learning-based predictive analytics can be excellent for short-term prediction but poor in the long term. The HBR article describes a CMO torn between the CEO's pressure to push prediction-based promotions (driven by the IT department's data analysts) and his or her long-term brand-building efforts:
"Advanced marketing analytics and big data make [balancing short-term revenue pursuit and long-term brand building] much harder today. If it was difficult before to defend branding investments with indefinite and distant payoffs, it is doubly so now that near-term sales can be so precisely engineered. Analytics allows a seeming omniscience about what promotional offers customers will find appealing. Big data allows impressive amounts of information to be obtained about the buying patterns and transaction histories of identifiable customers. Given marketing dollars and the discretion to invest them in either direction, the temptation to keep cash registers ringing is nearly irresistible."
There are two reasons for the weakness of prediction in the long term. First, predictive analytics learns from the past to predict the future. In a dynamic setting where the future is very different from the past, predictions will obviously fail. Second, predictive analytics relies on correlations and associations between the inputs and the to-be-predicted output, not on causal relationships. While correlations can work well in the short term, they are much more fragile in the long term: a correlation can weaken or vanish once the conditions that produced it change.
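To make the second reason concrete, here is a minimal sketch on purely synthetic data. Everything in it is an assumption made for illustration: a hidden driver `z` causes the outcome `y`, the input `x` merely tracks `z` in the early period, and in the later period that link is broken. The names `fit_line`, `mse`, and `simulate` are hypothetical helpers, not part of any library.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mse(xs, ys, intercept, slope):
    """Mean squared error of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def simulate(rng, n, linked):
    """Synthetic data: y is caused by a hidden driver z, never by x.

    When `linked` is True, x tracks z, so x and y are correlated.
    When `linked` is False, x is unrelated to z and the correlation is gone.
    """
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                       # hidden causal driver
        base = z if linked else rng.gauss(0, 1)   # x follows z only while linked
        xs.append(base + rng.gauss(0, 0.3))
        ys.append(2 * z + rng.gauss(0, 0.3))
    return xs, ys

rng = random.Random(0)  # fixed seed so the sketch is reproducible
x_early, y_early = simulate(rng, 2000, linked=True)
x_late, y_late = simulate(rng, 2000, linked=False)

# Learn the correlation from the early period only.
intercept, slope = fit_line(x_early, y_early)

mse_early = mse(x_early, y_early, intercept, slope)
mse_late = mse(x_late, y_late, intercept, slope)
print(f"error while the correlation holds: {mse_early:.2f}")
print(f"error after the correlation breaks: {mse_late:.2f}")
```

The model never learned why `y` moves, only that `x` used to move with it, so its error grows sharply once that association no longer holds.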
The danger, then, lies in using predictive analytics for long-term prediction or planning. It's a good tool, but it has its limits. Prediction becomes much more valuable when it is combined with explanation. The good news is that establishing causality is also possible with Big Data: you run experiments (the now-popular A/B testing is a simple experiment), or you rely on causal knowledge from domain experts. There are even methods that use Big Data to quantify causal relationships from observational data, but they are trickier and more commonly used in academia than in practice (that will come!).
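As a rough illustration of the A/B test mentioned above, the sketch below compares conversion rates between two randomly assigned groups with a two-proportion z-test, one common way (among several) to analyze such an experiment. The function name `two_proportion_z_test` and the visitor and conversion counts are hypothetical, chosen only for the example.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a, conv_b: number of conversions in groups A and B
    n_a, n_b: number of randomly assigned visitors in each group
    Returns (z statistic, two-sided p-value).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

# Hypothetical counts: A is the current offer, B is the new promotion.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Because visitors are assigned to A or B at random, a difference this unlikely under the null hypothesis can be attributed to the promotion itself rather than to some hidden factor, which is what separates an experiment from a correlation found in historical data.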
Bottom line: we need a combination of causal modeling and predictive modeling in order to make use of data for short-term and long-term actions and planning. The predictive toolkit can help discover correlations; we can then use experiments (or surveys) to figure out why they arise, and use that understanding to improve our long-term predictions. It's a cycle.