Friday, May 20, 2011

Nice April Fool's Day prank

A recent issue of the Journal of Computational and Graphical Statistics published a short article by Columbia University professor Andrew Gelman (I believe he is the most active statistician-blogger) called "Why tables are really much better than graphs", based on his April 1, 2009 blog post (note the difference in publishing speed between blogs and refereed journals!). The last parts made me laugh hysterically, so let me share them:

About creating and reporting "good" tables:
It's also helpful in a table to have a minimum of four significant digits. A good choice is often to use the default provided by whatever software you have used to fit the model. Software designers have chosen their defaults for a good reason, and I'd go with that. Unnecessary rounding is risky; who knows what information might be lost in the foolish pursuit of a "clean"-looking table?
About creating and reporting "good" graphs:
If you must make a graph, try only to graph unadorned raw data, so that you are not implying you have anything you do not. And I recommend using Excel, which has some really nice defaults as well as options such as those 3-D colored bar charts. If you are going to have a graph, you might as well make it pretty. I recommend a separate color for each bar—and if you want to throw in a line as well, use a separate y-axis on the right side of the graph.
Note: please do not follow these instructions for creating tables and graphs! Remember, this is an April Fool's Day prank!
From Stephen Few's examples of bad visualizations (