BzST | Business Analytics, Statistics, Teaching: Business intelligence

Showing posts with label Business intelligence. Show all posts

Wednesday, March 07, 2012

Forecasting + Analytics = ?

Quantitative forecasting is an age-old discipline, highly useful across different functions of an organization: from forecasting sales and workforce demand to economic forecasting and inventory planning.

Business schools have offered courses with titles such as "Time Series Forecasting", "Forecasting Time Series Data", "Business Forecasting", more specialized courses such as "Demand Planning and Sales Forecasting" or even graduate programs with title "Business and Economic Forecasting". Simple "Forecasting" is also popular. Such courses are offered at the undergraduate, graduate and even executive education. All these might convey the importance and usefulness of forecasting, but they are far from conveying the coolness of forecasting.

I've been struggling to find a better term for the courses that I teach on-ground and online, as well as for my recent book (with the boring name Practical Time Series Forecasting). The name needed to convey that we're talking about forecasting, particularly about quantitative data-driven forecasting, plus the coolness factor. Today I discovered it! Prof Refik Soyer from GWU's School of Business will be offering a course called "Forecasting for Analytics". A quick Google search did not find any results with this particular phrase -- so the credit goes directly to Refik. I also like "Forecasting Analytics", which links it to its close cousins "Predictive Analytics" and "Visual Analytics", all members of the Business Analytics family.

Wednesday, August 17, 2011

Where computer science and business meet

Data mining is taught very differently at engineering schools and at business schools. At engineering schools, data mining is taught more technically, deciphering how different algorithms work. In business schools the focus is on how to use algorithms in a business context.

Business students with a computer science background can now enjoy both worlds: take a data mining course with a business focus, and supplement it with the free course materials from Stanford Engineering school's Machine Learning course (including videos of lectures and handouts by Prof Andrew Ng). There are a bunch of other courses with free materials as part of the Stanford Engineering Everywhere program.

Similarly, computer science students with a business background can take advantage of MIT's Sloan School of Management Open Courseware program, and in particular their Data Mining course (last offered in 2003 by Prof Nitin Patel). Unfortunately, there are no lecture videos, but you do have access to handouts.

And for instructors in either world, these are great resources!

Thursday, August 04, 2011

The potential of being good

Yesterday I happened to hear talks by two excellent speakers, both on major data mining applications in industry. One common theme was that both speakers gave compelling and easy to grasp examples of what data mining algorithms and statistics can do beyond human intelligence, and how the two relate.

The first talk, by IBM's Global Services Christer Johnson, was given at the 2011 INFORMS Conference on Business Analytics and Operations Research (see video). Christer Johnson described the idea behind Watson, the artificial intelligence computer system developed by IBM that beat two champions of the Jeopardy quiz show. Two main points in the talk about the relationship between humans and data mining methods that I especially liked are:

Data analytics methods are designed not only to give an answer, but also to evaluate how confident they are about the answer. In answering the jeopardy questions, the data mining approach tells you not only what is the most likely answer, but also how confident you are about that answer.
Building trust in an analytics tool occurs when you see it make mistakes and learn from those mistakes.

The second talk, "The Art and Science of Matching Items to Users" was given by Deepak Agarwal , a Yahoo! principle research scientist and fellow statistician, was webcasted at ISB's seminar series. You can still catch it on Aug 10 at Yahoo!'s Big Thinker Series in Bangalore. The talk was about recommender systems and their use within Yahoo!. Among various approaches used by Yahoo! to improve recommendations, Deepak described a main idea for improving the customization of news item displays on news.yahoo.com.

On the relation between human intelligence and automation, the process of choosing which items to display on Yahoo! is a two-step process, where first human editors create a pool of potential interesting news items, and then automated machine-learning algorithms choose which individual items to display from that pool.

Like Christer Johnson's point #2, Deepak illustrated the difference between "the answer" (what we statisticians call a point estimate) and "the potential of it being good" (what we call the confidence in the estimate, AKA variability) in a very cool way: Consider two news items of which one will be displayed to a user. The first item was already shown to 100 users and 2 users clicked on links from that page. The second was shown to 10,000 users and 250 users clicked on links. Which news item should you show to maximize clicks? (yes, this is about ad revenues...) Although the first item has a lower click-through-rate (2%), it is also less certain, in the sense that it is based on less data than item 2. Hence, it is potentially good. He then took this one step further: Combine the two! "Exploit what is known to be good, explore what is potentially good".

So what do we have here? Very practical and clear examples of why we care about variance, the weakness of point estimates, and expanding the notion of diversification to combining certain good results with uncertain not-that-good results.

Wednesday, July 27, 2011

Analytics: You want to be in Asia

Business Intelligence and Data Mining have become hot buzzwords in the West. Using Google Insights for Search to "see what the world is searching for" (see image below), we can see that the popularity of these two terms seems to have stabilized (if you expand the search to 2007 or earlier, you will see the earlier peak and also that Data Mining was hotter for a while). Click on the image to get to the actual result, with which you can interact directly. There are two very interesting insights from this search result:

Looking at the "Regional Interest" for these terms, we see that the #1 country searching for these terms is India! Hong Kong and Singapore are also in the top 5. A surge of interest in Asia!
Adding two similar terms that have the term Analytics, namely Business Analytics and Data Analytics, unveils a growing interest in Analytics (whereas the two non-analytics terms have stabilized after their peak).

What to make of this? First, it means Analytics is hot. Business Analytics and Data Analytics encompass methods for analyzing data that add value to a business or any other organization. Analytics includes a wide range of data analysis methods, from visual analytics to descriptive and explanatory modeling, and predictive analytics. From statistical modeling, to interactive visualization (like the one shown here!), to machine-learning algorithms and more. Companies and organizations are hungry for methods that can turn their huge and growing amounts of data into actionable knowledge. And the hunger is most pressing in Asia.

Click on the image to refresh the Google Insight for Search result (in a new window)