Thursday, August 15, 2013

Designing a Business Analytics program, Part 3: Structure

This post continues two earlier posts (Part 1: Intro and Part 2: Content) on Designing a Business Analytics (BA) program. This part focuses on the structure of a BA program, and especially course structure.

In the program that I designed, each of the 16 courses combines on-ground sessions with online components. Importantly, the opening and closing of a course should be on-ground.

The hybrid online/on-ground design is intended to accommodate participants who cannot take long periods of time off to attend campus. Yet even in a residential program, a hybrid structure can be more effective if it is properly implemented. The reason is that a hybrid model more closely resembles the real-world functioning of an analyst. At the start and end of a project, close communication is needed with the domain experts and stakeholders to ensure that everyone is clear about the goals and the implications. In between these touch points, the analytics group works "offline" (building models, evaluating, testing, going back and forth) while communicating among the group and from time to time with the domain people.

A hybrid "sandwich" BA program can be set up to mimic this process:
  • The on-ground sessions at the start and end of each course help set the stage and expectations, build communication channels between the instructor and participants as well as among participants; at the close of a course, participants present their work and receive peer and instructor feedback.
  • The online components guide participants (and teams of participants) through the skill development and knowledge acquisition that the course aims at. Working through a live project, participants can acquire the needed knowledge (1) via lecture videos, textbook readings, case studies and articles, software tutorials and more, (2) via self-assessment and small deliverables that build up needed proficiency, and (3) via a live online discussion board where participants are required to ask, answer, discuss and share experiences, challenges and discoveries. If designing and implementing the online component is beyond the institution's capabilities, it is possible to integrate existing successful online courses, such as those offered on Statistics.com, Coursera, edX and other established online course providers.
For example, in a Predictive Analytics course, a major component is a team project with real data, solving a potentially real problem. The opening on-ground sessions would focus on translating a business problem into an analytics problem and on setting the stage and expectations for the process the teams will be going through. Teams would submit proposals and discuss them with the instructor to ensure feasibility and determine the way forward. The online components would include short lecture videos, textbook reading, short individual assignments to master software and technique, and a vibrant online discussion board with topics at different technical and business levels (this is similar to my semi-MOOC course Business Analytics Using Data Mining). In the closing on-ground sessions, teams present their work to the entire group and discuss challenges and insights; each team might meet with the instructor to receive feedback and do a second round of improvement. Finally, an integrative session would provide closure and linkage to other courses.

Designing a Business Analytics program, Part 2: Content

This post follows Part 1: Intro of Designing a Business Analytics program. In this post, I focus on the content to be covered in the program, in the form of courses and projects.

The following design is based on my research of many programs, on discussions with faculty in various analytics areas, with analysts and managers at different levels, and on feedback from many past MBA students who have taken my analytics courses over the years (data mining, forecasting, visualization, statistics, etc.) and are now managing data at a broad range of companies and organizations.

Content
Dealing with data, whether a little or mountains of it, and tackling an array of business challenges and opportunities requires a broad and diverse set of tools and approaches. Covering everything from data access and management to modeling, assessment and deployment requires a skill set that derives from the fields of statistics, computer science, operations research, and more. In addition, one needs integrative and "big picture" thinking and effective communication skills. Here is a list of 16 courses, divided into four sets, that attempts to achieve such a skill set (this is by no means the only possible set; I would love to hear comments):

Set I
  1. Analytic Thinking (what is a model? what is the role of a model? data in context and data-domain integration)
  2. Data Visualization (data exploration, interactive visualization, charts and dashboards, data presentation and effective communication, use of BI tools)
  3. Statistical Analysis 1: Estimation and inference (observational studies and experiments; estimating population means, proportions, and more; testing hypotheses regarding population numbers; using programming and menu-driven software)
  4. Statistical Analysis 2: Regression models (linear, logistic, ANOVA)
Set II
  1. Data Management 1: Database design and implementation, data warehousing
  2. Forecasting Analytics: Exploring and modeling time series
  3. Data Management 2: Big Data (Hadoop-MapReduce and more)
  4. Operations 1: Simulation (principles of simulation; Monte Carlo and Discrete Event simulation)
Set III
  1. Operations 2: Optimization (optimization techniques, sensitivity analysis, and more)
  2. Statistical Analysis 3: Advanced statistical models (censoring and truncation, modeling count data, handling missing values, design of experiments (A/B testing and beyond))
  3. Data Collection (Web data collection, online surveys, experiments)
  4. Data Mining 1: Supervised Learning - Predictive Analytics (predictive algorithms, evaluating predictive power, using software)
Set IV
  1. Data Mining 2: Unsupervised Learning (dimension reduction, clustering, association rules, recommender systems)
  2. Contemporary Analytics 1 (choose between: text mining, network analytics, social analytics, customer analytics, web analytics, risk analytics)
  3. Contemporary Analytics 2 (from the list above)
  4. Integrative Thinking (BA in different fields, choosing and integrating tools and analytic approaches into an effective solution)
The courses are divided into sets of four, where courses in each set can be offered in parallel. The order should take into account coverage of other courses and natural linkages.

Lastly: two industry team projects that require integrating skills from multiple courses should give participants the opportunity to interface with industry, test their skills in a more realistic setting, and gain initial experience and confidence to move forward on their own.

Continue to Part 3: Structure

Designing a Business Analytics program, Part 1: Intro

I have been receiving many inquiries about programs in "Business Analytics" (BA), online and offline, in the US and outside the US. The few programs that are already out there (see an earlier post) are relatively new, so it is difficult to assess their success in producing data-savvy analysts.

Rather than concentrate on the uncertainty, let me share my view and experience regarding the skill set that such programs should provide. To be practical, I will share the program that I designed for the Indian School of Business one-year certificate program in BA(*), in terms of content and structure. Both reflect the skills and knowledge that I believe make a valuable data analyst in a company, as well as a powerful consultant.

The program was designed for participants who have a few years of business experience and are planning to manage the data crunchers, but must acquire a solid knowledge of the crunchers' toolkit, and especially how it can be used effectively to tackle business goals, challenges and opportunities.

Business Analytics experts have a broad skill set
One important note: Although some universities and business schools are tempted to rename an existing operations or statistics program as a BA (or "Big Data" or "Data Science", etc.) program, this will by no means supply the required diversity of skills. A program in BA should not look like a statistics program. It also should not look like a program in operations research. The key is therefore a combination of courses from different areas (statistics and operations among them), which usually requires experts from across campus. In a recent post, visualization expert Nathan Yau comments on the need to know more than just visualization to be successful in the field ("It still surprises me how little statistics visualization people know... Look at job listings though, and most employers list it in the required skill set, so it's a big plus for you hiring-wise.")

The next two posts describe the content and structure of the program.

Continue to Part 2: Content

(*) The final program structure and content at ISB were modified by the program administrator to accommodate constraints and shortages.

Friday, August 09, 2013

Predictive relationships and A/B testing

I recently watched an interesting webinar on Seeking the Magic Optimization Metric: When Complex Relationships Between Predictors Lead You Astray by Kelly Uphoff, manager of experimental analytics at Netflix. The presenter mentioned that Netflix is a heavy user of A/B testing for experimentation, and in this talk focused on the goal of optimizing retention.

In ideal A/B testing, the company would test the effect of an intervention of choice (such as displaying a promotion on their website) on retention by assigning it to a random sample of users, and then comparing the retention of the intervention group to that of a control group that was not subject to the intervention. This experimental setup can help infer a causal effect of the treatment on retention. The problem is that retention can take a long time to measure: if retention is defined as "customer paid for the next 6 months", you have to wait 6 months before you can determine the outcome.
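
The comparison step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming retention is recorded as a yes/no outcome per user and that a pooled two-proportion z-test is an acceptable way to compare the groups; the webinar does not say which test Netflix uses. The function name and the counts in the usage note are hypothetical.

```python
import math


def compare_retention(retained_treat, n_treat, retained_ctrl, n_ctrl):
    """Pooled two-proportion z-test (an assumed choice of test).

    Returns (difference in retention rates, z statistic, two-sided p-value).
    """
    p_treat = retained_treat / n_treat
    p_ctrl = retained_ctrl / n_ctrl

    # Pooled retention rate under the null hypothesis of no effect.
    pooled = (retained_treat + retained_ctrl) / (n_treat + n_ctrl)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_treat + 1 / n_ctrl))
    if se == 0:
        raise ValueError("retention is 0% or 100% in both groups; test undefined")

    z = (p_treat - p_ctrl) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_treat - p_ctrl, z, p_value
```

For instance, with made-up counts of 560 retained out of 1,000 users in the intervention group and 500 out of 1,000 in the control group, `compare_retention(560, 1000, 500, 1000)` returns the 6-percentage-point difference together with its z statistic and p-value. Note that with a 6-month retention definition, these counts only become available 6 months after the experiment starts.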