Showing posts with label nytimes. Show all posts

Sunday, November 14, 2010

Data visualization in the media: Interesting video

A colleague who knows my fascination with data visualization pointed me to an interesting recent video by Geoff McGhee, "Journalism in the Age of Data". In this 8-part video, he interviews media people who create visualizations for their websites at the New York Times, Washington Post, CNBC, and more. It is interesting to hear their views on why interactive visualization might be useful to their audience, and how it is linked to "good journalism".

Also interviewed are a few visualization interface developers (e.g., IBM's Many Eyes designers) as well as infographics experts and participants at the major infographics conference in Pamplona, Spain. The line between beautiful visualizations (art) and effective ones is discussed in Part IV ("too sexy for its own good" - Gert Nielsen) - see also John Grimwade's article.


Journalism in the Age of Data from Geoff McGhee on Vimeo.

The videos can be downloaded as a series of 8 podcasts, for those with narrower bandwidth.

Friday, May 11, 2007

NYT to mine their own data

You might ask yourself how on earth I have time for an entry during the last day of classes. Well, I don't. That's why I am doing it.

The New York Times recently announced to its stockholders that it is going to be revolutionary by mining its own data. As quoted in the Village Voice:
Data mining, [The company CEO Janet Robinson] told the crowd, would be used "to determine hidden patterns of uses to our website." This was just one of the many futuristic projects in the works by the newspapers company's research and development program

The article focuses on the alarm that this causes in terms of "what happens when the government comes in and subpoenas it?"

My question is, since every company and organization is mining (or potentially can mine) its own data anyway, what is the purpose of announcing it publicly? Clearly data mining is not such a "futuristic" act. What kind of "hidden patterns" are they looking for? The paths that readers take when they move between articles? What precedes their clicking an ad? Or maybe there is a futuristic goal?

Monday, April 02, 2007

Visualizing hierarchical data

Today much data is gathered from the web. Data from websites often tends to be hierarchical in nature: For example, on Amazon we have categories (music, books, etc.), then within a category there are sub-categories (e.g., within Books: Business & Technology, Children's Books, etc.), and sometimes there are even additional layers. Other examples are eBay, epinions, and almost any e-tailer. Even travel sites usually include some level of hierarchy.

Standard plots and graphs such as bar charts, histograms, and boxplots might be useful for visualizing a particular level of the hierarchy, but not the "big picture". The method of trellising, where a particular graph is "broken down" by one or more variables, is useful. However, you still do not directly see the hierarchy.

An ingenious method for visualizing hierarchical data is the Treemap, designed by Professor Ben Shneiderman from the Human-Computer Interaction Lab (HCIL) at the University of Maryland. The treemap is basically a rectangular region broken down into sub-rectangles (and then possibly into further sub-sub-rectangles), where each of the smallest rectangles represents the unit of interest. Color and/or size can then be used to describe measures of interest.
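To make the idea of rectangles nested inside rectangles concrete, here is a minimal Python sketch of a "slice-and-dice" style layout, in which each level of the hierarchy splits its parent rectangle along alternating directions, in proportion to size. This is only an illustration under my own assumptions, not the code behind any particular Treemap tool: the `Node` class, the `layout` function, and the category sizes are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One item in the hierarchy (hypothetical structure for this sketch)."""
    name: str
    size: float = 0.0                      # only meaningful for leaves
    children: list["Node"] = field(default_factory=list)

    def total(self) -> float:
        # A leaf reports its own size; an inner node sums its children.
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


def layout(node, x, y, w, h, depth=0, out=None):
    """Return a list of (name, x, y, width, height) tuples, one per leaf."""
    if out is None:
        out = []
    if not node.children:
        out.append((node.name, x, y, w, h))
        return out
    total = node.total()
    offset = 0.0                           # fraction of the parent already used
    for child in node.children:
        frac = child.total() / total if total else 0.0
        if depth % 2 == 0:
            # Even depth: divide the parent's width among the children.
            layout(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:
            # Odd depth: divide the parent's height among the children.
            layout(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out


# Made-up sizes, loosely following the Amazon-style categories above.
root = Node("Store", children=[
    Node("Books", children=[
        Node("Business & Technology", size=40),
        Node("Children's Books", size=20),
    ]),
    Node("Music", size=40),
])

rects = layout(root, 0, 0, 100, 100)
for name, rx, ry, rw, rh in rects:
    print(f"{name}: x={rx:.1f} y={ry:.1f} w={rw:.1f} h={rh:.1f}")
```

Each leaf ends up with an area proportional to its size, and the leaves of one category stay together inside that category's rectangle. Color is left out here; in a real treemap it would encode a second measure.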

Treemap's original goal was to visualize one's hard drive (with all its directories and sub-directories) for detecting phenomena such as duplications. There, a file was a single entity, and its size, for instance, could be represented by the rectangle's size. Since its development in the 1990s it has spread widely across almost every possible discipline. Probably the most popular application is SmartMoney's Map of the Market, where you can visualize the current state of the entire stock market. The strength of the treemap lies both in its ability to include multiple levels of hierarchy (you can drill in and out to different levels) and in its interactive nature, where users can choose to manipulate color, size, and order to represent measures of interest.

Microsoft Research posts a free Excel add-on called Treemapper, but after trying it out I think it is too limited: It allows only one level of hierarchy and does not have any interactivity (it also accepts only numerical information).

Last month the business section of the New York Times featured an article on DaimlerChrysler, "This time, no roadside assistance", which included a neat Treemap. Since it is no longer available online (NYT does not include graphics in its archives...) here it is -- courtesy of Amanda Cox from the NYT, known as their "statistics wiz".


You can find many more neat examples of using Treemap on the HCIL website.