Monday, March 09, 2009

Start the Revolution

Variability is a key concept in statistics. The Greek letter sigma is so important that it is probably associated more closely with statistics than with Greek. Yet, if you have a chance to examine the bookshelf of introductory statistics textbooks in a bookstore or the library, you will notice that the variability between the zillions of textbooks, whether in engineering, business, or the social sciences, is nearly zero. And I am not only referring to price. I can close my eyes and place a bet on the topics that will show up in the table of contents of any textbook (summaries and graphs; basic probability; random variables; expected value and variance; conditional probability; the central limit theorem and sampling distributions; confidence intervals for the mean, proportion, two groups, etc.; hypothesis tests for one mean, comparing groups, etc.; linear regression). I can also predict the order of those topics quite accurately, although there might be a tiny bit of diversity in terms of introducing regression up front and then returning to it at the end.

You may say: if it works, then why break it? Well, my answer is: no, it doesn't work. What is the goal of an introductory statistics course taken by non-statistics majors? Is it to familiarize them with buzzwords in statistics? If so, then maybe this textbook approach works. But in my eyes the goal is very different: give them a taste of how statistics can really be useful! Teach 2-3 major concepts that will stick in their minds; give them a coherent picture of when the statistics toolkit (or "technology", as David Hand calls it) can be useful.

I was recently asked by a company to develop for their managers a module on modeling input-output relationships. I chose to focus on using linear/logistic regression, with an emphasis on how it can be used for predicting new records or for explaining input-output relationships (in a different way, of course); on defining the analysis goal clearly; on the use of quantitative and qualitative inputs and output; on how to use standard errors to quantify sampling variability in the coefficients; on how to interpret the coefficients and relate them to the problem (for explanatory purposes); on how to trouble-shoot; on how to report results effectively. The reaction was "oh, we don't need all that, just teach them R-squares and p-values".

We've created monsters: the one-time students of statistics courses remember just buzzwords such as R-square and p-values, yet they have no real clue what those are and how limited they are in almost any sense.
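To make the point about limits concrete, here is a minimal sketch on simulated data (not data from any real course or company). The function names `simple_ols` and `simulate`, the sample size, and the true slope of 0.02 are all assumptions chosen for illustration, and the p-value uses a normal approximation that is reasonable only because the sample is large. With enough records, a practically negligible relationship produces a tiny p-value while R-squared stays close to zero.

```python
import math
import random
from statistics import NormalDist, fmean


def simple_ols(x, y):
    """Fit y = a + b*x by least squares.

    Returns the slope, its standard error, and R-squared.
    """
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares around the fitted line
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    r2 = 1 - sse / syy
    return slope, se_slope, r2


def simulate(n=100_000, true_slope=0.02, seed=1):
    """Simulate a very weak input-output relationship (assumed values)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [true_slope * xi + rng.gauss(0, 1) for xi in x]
    slope, se, r2 = simple_ols(x, y)
    z = slope / se
    # Two-sided p-value via the normal approximation (large n assumed)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return slope, se, r2, p_value


if __name__ == "__main__":
    slope, se, r2, p_value = simulate()
    print(f"slope={slope:.4f}  se={se:.4f}  R^2={r2:.5f}  p={p_value:.2e}")
```

A manager who remembers only "small p-value is good" would call this input important, even though it explains almost none of the variation in the output and would be nearly useless for predicting new records.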

I keep checking on the latest in statistics intro textbooks and see excerpts from the publishers. New books have this bell or that whistle (some new software, others nicer examples), but they almost always revolve around the same mishmash of topics with no clear big story to remember.

A few textbooks have tried going the case-study avenue. One nice example is A Casebook for a First Course in Statistics and Data Analysis (by Chatterjee, Handcock, and Simonoff). It presents multiple "stories" with data, and shows how statistical methods are used to derive some insight. However, the authors suggest using this book as an addendum to the ordinary teaching method: "The most effective way to use these cases is to study them concurrently with the statistical methodology being learned".

I've taught a "core" statistics course to audiences of engineers of different sorts and to MBAs. I had to work very hard to make the sequence of seemingly unrelated topics appear coherent, which in retrospect I do not think is possible in a single statistics course. Yes, you can show how cool and useful the concepts of expected value and variance are in the context of risk and portfolio management, or how the distribution of the mean is used effectively in control charts for monitoring industrial processes, but then you must move on to the next chapter (usually sampling variance and the normal distribution), thereby erasing the point by piling totally different information on top of it. A first taste of statistics should be more pointed, more coherent, and more useful. Forget the details, focus on the big picture.

Bring on the revolution!

5 comments:

William said...

Hi. I agree that examples or cases must be included. However, I feel that my goal in teaching Statistics is to make the text readable to the students. It is impossible to cover all the material and case studies. I want to break down the barrier that exists when trying to read a technical book. Make that book accessible to the students so that after they are finished with the course they feel confident about going back to the text on their own and getting more information and topics we could not cover in class.
Another point is that as a Statistics teacher you are giving them the tools which hopefully in other (exciting) classes they can make use of in interesting ways.

Galit Shmueli said...

Hi William - thanks for your comment. Let me respond to two points in particular. First, I believe that students who want to learn about new topics in statistics would most likely search for the information on the web (Googling is faster than even finding that old textbook). There is also a growing number of books that are available for browsing online (free: Google Books, Amazon's look-inside). So I don't think a textbook should be treated as a reference/encyclopedia. Second, I share with you the hope that students see statistical tools used in other courses within their domain of interest. The problem is that in many disciplines statistical methods are used and taught quite differently (e.g., the social sciences rely heavily on causal models and use statistical models for verifying causal hypotheses). In addition, I think it is pedagogically sounder to integrate the statistics toolkit and the domain area more closely. I recall hearing about universities experimenting with teaching physics in this way (integrating all the intro math courses directly into the physics courses rather than separating them), and I think that they showed good results. In that sense, I prefer a textbook that is truly domain-oriented, addressing a few big and statistically ill-defined problems rather than one that solves a large number of well-defined small exercises/examples.

Finally, I am not afraid of technical textbooks, as long as they are written clearly and the instructor can teach well. Technophobia is something that students should get over, or at least learn to cope with.

William said...

Hi Galit. I still feel partial to a text because of the continuity that the writer must maintain in terms of terminology and progress from one topic to another topic which depends on the first topic. Of course the web is there as well to use but as a base I think a text is very important. I don't have experience on how well students can learn by looking up serious mathematical concepts on the web so I don't know.
With regard to teaching them together, I think that it can definitely be a good idea; however, I actually used a text for teaching Physics and Calculus together and I was not all that happy with it. One problem I felt was that not all the topics I thought should be covered in a Calculus course were covered in that presentation. However, it may have been that we just did not have enough time for reasons not related to the text. Not sure.
And with Statistics it may be a different story.

ronkenett said...

From RSS News, January 2008, in response to a letter by Stephen Senn published in December 2007

I entered the Huxley building at Imperial College, as a mathematics undergraduate, in 1974. Our instructor in Introduction to Statistics was D R Cox. By the third year, Professor David Cox was running the exercise class, with junior instructors teaching the advanced courses such as Design of Experiments (Ms White) and Decision Theory (Dr R Coleman). Students were thus presented, from the start, with the power of statistics and a first hand demonstration of how a remarkable statistician solves problems.

In 2005, as ENBIS president elect, I awarded Sir David with the George Box Medal, and remarked that I never saw this model routinely applied in the various universities where I taught later on. The classical model is for juniors to teach introductory courses and for seniors to present graduate lectures.

These days, students expect and demand effective education with a combination of high tech visuals and relevant course topics. The challenge is to get students to think and acquire statistical skills. We hope most of them will learn to apply these skills in research, business and industry. Some will remain theoreticians and have careers in more mathematically oriented statistics departments or elsewhere.

Stephen Senn emphasises content over form, a long term perspective, as opposed to short term student satisfaction, and stimulation and hard work as a basis for in-depth learning and the setting of standards of excellence.

These goals require role models that can be emulated. Education demands leadership. Proper leadership will get students to accept responsibility for their education. Proper leadership will also adapt the education process to modern expectations and allow teachers to introduce new technologies. These challenges will open new possibilities, such as the use of computing intensive methods and novel application areas for statistical technology.

After pointing out problems, a statistical leader should move to the driver's seat and lead focused initiatives to improve the position of the discipline, relative to other disciplines. Unfortunately statistics is losing its standing as a relevant discipline. Physicists, computer scientists, biologists, psychologists are playing an increased role in the development of statistics. One reason might be the weak outreach of the statistical community, perhaps because of the academic emphasis on the mathematical aspects of statistics.

Stephen Senn is provoking us. Creativity management experts deliberately provoke people to produce ideas. In statistics, we definitely need new ideas. I hope Stephen’s call will generate some good ones. The "Cox Model" described above is a great example that worked for me.

Ron S Kenett
Raanana, ISRAEL

Unknown said...

As a student (non-statistics major) I agree with both Professor Shmueli and William. I agree with Shmueli that the goal of an introductory statistics course should not be to familiarize students with buzzwords in statistics but instead to show students how statistics can be useful in real-life situations. With that said, I feel that the most useful textbook for a non-statistics major should be full of cases and examples based on common business issues/problems. It is also critical that the textbook is written in easily understandable language; if the language is too technical, the message and concepts will be lost. Additionally, I agree with William that textbooks should be accessible for students to use as a reference after they have finished their statistics course. In the past I have used many of my textbooks as a reference.