Thursday, August 20, 2009

Data Exploration Celebration: The ENBIS 2009 Challenge

The European Network for Business and Industrial Statistics (ENBIS) has released the 2009 ENBIS Challenge. The challenge this time is to use an exploratory data analysis (EDA) tool to answer a bunch of questions regarding sales of laptop computers in London. The data on nearly 200,000 transactions include 3 files: sales data (for each computer sold, with time stamps and zipcode locations of customer and store), computer configuration information, and geographic information linking zipcodes to GIS coordinates. Participants are challenged to answer a set of 11 questions using EDA.

The challenge is sponsored by JMP (by SAS), who are obviously promoting the EDA strengths of JMP (fair enough), yet analysis can be done using any software.

What I love about this competition is that unlike other data-based competitions such as the KDD Cup, INFORMS, or the many forecasting competitions (e.g. NN3), it focuses solely on exploratory analysis. No data mining, no statistical models. From my experience, the best analyses rely on a good investment of time and energy in data visualization. Some of today's data visualization tools are way beyond static boxplots and histograms. Interactive visualization software such as TIBCO Spotfire (and Tableau, which I haven't tried) allows many operations such as zooming, filtering, and panning. These tools support multivariate exploration via the use of color, shape, panels, etc., and they include specialized visualization tools such as treemaps and parallel coordinate plots.

And finally, although the focus is on data exploration, the business context and larger questions are stated:

In the spirit of a "virtuous circle of learning", the insights gained from this analysis could then be used to design an appropriate choice experiment for a consumer panel to determine which characteristics of the various configurations they actually value, thus helping determine product strategy and pricing policies that will maximise Acell's projected revenues in 2009. This latter aspect is not part of the challenge as such.

The Business Objective:
Determine product strategy and pricing policies that will maximise Acell's projected revenues in 2009.

Management's Charter:
Uncover any information in the available data that may be useful in meeting the business objective, and make specific recommendations to management that follow from this (85%). Also assess the relevance of the data provided, and suggest how Acell can make better use of data in 2010 to shape this aspect of their business strategy and operations (15%).


mehul said...

Data visualization tools seem great for exploratory analysis. But in the enterprise software space they seem to occupy a niche segment. Their related cousins, business intelligence tools (Business Objects, Cognos, MicroStrategy), have much wider coverage in corporate environments.
These tools are great for reporting.
Do you know of any business intelligence tool that also has very good data visualization capabilities? Ideally a company would like to have one tool that can be used for reporting, analysis, data visualization, as well as data mining. Have you seen anything in industry that comes close to meeting all these needs?

Galit Shmueli said...

Hi Mehul,
TIBCO (who bought Spotfire a couple of years ago) actually has a wider suite of BI tools, but I have no experience with them. I have not heard of an overall BI tool that has as good visualization capabilities. Even the data mining analytic tools such as SAS Enterprise Miner and SPSS Clementine (now IBM) do not have visualization capabilities that are even close to designated visualization tools such as Spotfire and Tableau. I guess it's like looking for fantastic salads at Starbucks...

Galit Shmueli said...

Mehul, I asked Stephen Few, an expert in visualization and BI. Here is his response:
"I've worked in the business intelligence field for almost 20 years and have specialized exclusively in data visualization for the last seven. I haven't seen a full-blown BI reporting solution--certainly not from the big BI vendors (Cognos, Business Objects, etc.)--that includes a decent visual analysis tool. Some of the vendors that have specialized in data visualization such as Tableau, however, are rapidly building into their software the kind of functionality that is needed for enterprise-level production reporting. I predict that these smaller, data visualization oriented vendors will manage to add the functionality that is currently missing for a full-blown BI solution much faster than the traditional BI vendors will figure out what data visualization is about. If you take a look at Tableau, you might find that it already comes close to the full solution you're hoping to find."

Tabrez said...

BI tools are used at the company I work for. I talked to an analyst today to see if they used any data visualization tools. His response was that it depends on the client for whom they are working. He mentioned that he knows of at least some projects where Tableau is used.

I will try to check out Tableau to see how it compares to Spotfire.

Galit Shmueli said...

Tabrez - It would be great if indeed you had a chance to try out Tableau and write back about your experience. That would be useful to many others who are contemplating what software to use.