Tuesday, April 17, 2012

Google Scholar -- you're not alone; Microsoft Academic Search coming up in searches

While searching for a few colleagues' webpages, I noticed a new URL popping up in the search results. It included either the prefix academic.microsoft.com or an IP address. I got curious and checked it out, and discovered Microsoft Academic Search (Beta) -- a neat presentation of an author's research publications and collaborations. In addition to the usual list of publications, there are nice visualizations of publications and citations over time, a network chart of co-authors and citations, and even an Erdős Number graph. The genealogy graph carries a note that it is based on data mining so "might not be perfect".

All this is cool and helpful. But there is one issue that really bothers me: who owns my academic profile?

I checked my "own" Microsoft Academic Search page. Microsoft's software tried to guess my details (affiliation, homepage, papers, etc.) and was correct on some but wrong on others. Correcting the details required me to open a Windows Live ID account. Until now I had managed to avoid opening such an account (I am not a fan of endless accounts) and would have continued to avoid it, had I not been forced to do so: Microsoft created an academic profile page for me, without my consent, with wrong details. Guessing that this page will soon come up in user searches, I felt compelled to correct the inaccurate details.

The next step was even more disturbing: once I logged in with my verified Windows Live ID, I tried to correct my affiliation and homepage and to add a photo. However, I received a message that the affiliation (Indian School of Business) is not recognized (!) and that Microsoft will have to review all my edits before applying them.

So who "owns" my academic identity? Since obviously Microsoft is crawling university websites to create these pages, it would have been more appropriate to find the authors' academic email addresses and email them directly to notify them of the page (with an "opt out" option!) and allow them to make any corrections without Microsoft's moderation.

Tuesday, April 03, 2012

New Google Consumer Surveys: revolutionizing academic data collection?

Surveys are a key data collection tool in several academic research areas. As opposed to experiments or field studies that yield observational data, surveys can give access to attitudes, reaching "inside the head" of people rather than observing their behavior.

Technological advances in survey tool development now offer "poor academics" sufficiently powerful online survey tools, such as surveymonkey.com and Google forms. Yet, obtaining access to a large pool of potential respondents from a particular population remains a challenge. Another challenge is getting fast responses -- how do you reach people quickly and get many of them to respond quickly?

We may now have a solution that is affordable for academic research: a few days ago Google announced a new service called "Google Consumer Surveys". Similar to AdSense, where Google places ads on publishers' websites (and pays the publishers a commission), with Consumer Surveys Google places a single-question survey (=poll) on publishers' websites. The publishers require website users to complete the poll to get access to premium content.

Google Consumer Surveys: How it works (from their website)

The good:

  • Very affordable: the charge for each response is $0.10 (=only $100 for the magic number of 1,000 responses). Or, for an audience targeted by demographics or some trait, it is $0.50 per response (more here).
  • Fast: Google will likely post the polls on pages with high traffic.
  • Google presents the results with attractive charts
  • Getting IRB permission may be easier, given the stringent policies that Google mandates

The bad:

  • You can only post one question at a time. For a longer survey, breaking it up into single questions means that different people answer different questions, so you cannot relate one person's answers across questions. Also, since you pay per response, each additional question adds its own full cost.
  • Google does not supply the poll creator with the raw data. You only get aggregated data. You can choose the aggregation (inferred age, gender, urban density, geography, or income). This is likely to be a huge "bad" for researchers who need access to the raw data for more advanced analyses than those provided by Google. 
  • Currently Google only offers this service for websites in the US. To collect information from users visiting non-US websites we will all have to continue holding our breath.

A curious anecdote: I filled in the support contact form to ask a few extra questions. I received speedy and helpful answers (within 24 hours), but they all landed in my Google Spam folder!
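
For budgeting a study, the pricing quoted above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the function name and parameters are my own, and it assumes that every question is billed separately at the quoted per-response rate, which is how I read the pricing but is not something Google's announcement spells out here.

```python
# Per-response prices quoted in this post, in US cents.
# Working in cents keeps the arithmetic exact.
PRICE_GENERAL_CENTS = 10    # $0.10 per response, general audience
PRICE_TARGETED_CENTS = 50   # $0.50 per response, targeted audience


def survey_cost(responses, questions=1, targeted=False):
    """Estimated cost in dollars.

    Assumption (not confirmed by Google): each question is a separate
    poll, billed at the per-response rate.
    """
    if responses < 0 or questions < 0:
        raise ValueError("responses and questions must be non-negative")
    price = PRICE_TARGETED_CENTS if targeted else PRICE_GENERAL_CENTS
    return responses * questions * price / 100


if __name__ == "__main__":
    print(survey_cost(1000))                  # one question, general audience
    print(survey_cost(1000, questions=5))     # five questions, general audience
    print(survey_cost(1000, targeted=True))   # one question, targeted audience
```

Under that assumption, a five-question study with 1,000 responses per question costs five times as much as a single poll, and the respondents to each question are still different people.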

Monday, April 02, 2012

The world is flat? Only for US students

Learning and teaching have become a global endeavor with lots of online resources and technologies. Contests are an effective way to engage a diverse community from around the world. In the past I have written several posts about contests and competitions in data mining, statistics and more. And now about a new one.

Tableau is a US-based company that sells a cool data visualization tool (there's a free version too). The company has recently seen huge growth with lots of new adopters in industry and academia. Their "Tableau for teaching" (TfT) program is intended to assist instructors and teachers by providing software and resources for data visualization courses. The program is promoted as global "Tableau for Teaching Around the World" (see the interactive dashboard at the bottom of this post). As part of this program, a student contest was recently launched where students are provided with real data and are challenged to produce good visualizations that tell compelling stories. The data are from Lesotho, Africa (given by the NGO CARE) and the prizes are handsome. I was almost getting excited about this contest (non-US data, visualization, nice prizes for students) when I read the draconian contest eligibility rules:
ELIGIBILITY: The Tableau Student Data Challenge Contest (“The Awards,” “Contest” or “Promotion”) is offered and open only to legal residents of the 50 United States and the District of Columbia (“United States”) who at time of entry (a) are the legal age of majority in their state of residence; (b) physically reside in the United States; (c) are enrolled as a college or university accredited in the United States; and (d) are not an Ineligible Person
I was deeply disappointed. Not only does the contest exclude non-US students (even branches of US universities outside of the US are excluded!), but more disturbing is the fact that only US residents can win a prize for telling a story about lives of people in Lesotho. Condescending? Wouldn't local Lesotho students (or at least students in the region) be the most knowledgeable about the meaning of the data? Wouldn't they be the ones most qualified to tell the story of Lesotho people that emerges from the data? Wouldn't they be the first to identify surprising patterns or exceptions and even wrong data?

While one country "telling the story" of another country is common at the political level, there is no reason that open-minded private visualization software companies should endorse the same behavior. If the problem of awarding cash prizes to non-US citizens is tax-related, I am sure there are creative ways, such as giving free software licenses, to offer prizes that can be distributed to any enthusiastic and talented student of visualization around the world. In short, I call on Tableau to change the rules and follow CARE's motto "Defending Dignity".