Saturday, March 15, 2014

Can women be professors or doctors? Not according to Jet Airways

I am already used to the comical scene at airports in Asia, where a sign-holder with "Professor Galit Shmueli" sees us walk in his/her direction and right away rushes to my husband. Whether or not the stereotype is based on actual gender statistics of professors in Asia is a good question.

What I don't find amusing is when a corporation like Jet Airways, under the guise of "celebrating International Women's Day", follows the same stereotype. When I tried to book a flight online, it would not allow me to use the Women's Day discount code if I chose the title "Prof" or "Dr". Only if I chose "Mrs" or "Ms" would it work.

A Professor does not qualify as a woman

So I bowed low and switched the title in the reservation to "Mrs", only to get an error message.

After scratching my head, I realized that I was (unfortunately?) logged into my JetPrivilege account, where my title is "Dr" - a detail set at the time of account creation that I cannot modify online. The workaround I found was to dissociate the passenger from the account owner and book for a "Mrs. Galit Shmueli", who obviously cannot be a Professor.
"Conflicting" information

For those who won't tolerate the humiliation of giving up Dr/Prof but are determined to get the discount in principle, a solution is to include another (non-Professor or non-Doctor) "Mrs" or "Ms" in the same booking. Yes, I'm being cynical.

In case you're thinking: "but how will the airline's booking system identify the passenger's gender if you use Prof or Dr?" - I can think of a few easy solutions, such as adding the option "Prof. (Ms.)" or simply asking for the traveler's gender, as is common in train bookings. In short, this goes beyond blaming "technology".

One thing is clear: according to Jet Airways, you just can't have it all - a JetPrivilege account with the title "Prof", flying solo, and availing the Women's Day discount with your JetPrivilege number.

My only consolation is that during the flight I'll be able to enjoy "audio tracks from such leading female international artists as Beyonce Knowles, Lady Gaga, Jennifer Hudson, Taylor Swift, Kelly Clarkson and Rihanna on the airline's award-winning in-flight entertainment system." Luckily, Jet Airways doesn't include "artist" as a title.

Thursday, March 06, 2014

The use of dummy variables in predictive algorithms

Anyone who has taken a course in statistics that covers linear regression has heard some version of the rule about pre-processing categorical predictors with more than two categories: they must be converted into binary dummy/indicator variables.
"If a variable has k levels, you can create only k-1 indicators. You have to choose one of the k categories as a "baseline" and leave out its indicator." (from Business Statistics by Sharpe, De Veaux & Velleman)
Technically, one can easily create k dummy variables for k categories in any software. The reason for not including all k dummies as predictors in a linear regression is to avoid perfect multicollinearity, where an exact linear relationship exists between the k predictors. Perfect multicollinearity causes computational and interpretation challenges (see slide #6). This k-dummies issue is also called the Dummy Variable Trap.
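To see the trap concretely, here is a minimal sketch (using pandas and NumPy, which are my choice of tools here, not something specified above): with an intercept column included, the k dummies always sum to the intercept, so the design matrix loses a rank; dropping one dummy restores full column rank.

```python
import numpy as np
import pandas as pd

colors = pd.Series(["red", "green", "blue", "red", "green", "blue"])

D_full = pd.get_dummies(colors)                   # all k = 3 dummies
D_drop = pd.get_dummies(colors, drop_first=True)  # k-1 = 2 dummies, "blue" is the baseline

# With an intercept, the k dummies are perfectly collinear:
# red + green + blue = 1 for every row.
X_full = np.column_stack([np.ones(len(colors)), D_full.to_numpy(float)])
X_drop = np.column_stack([np.ones(len(colors)), D_drop.to_numpy(float)])

print(np.linalg.matrix_rank(X_full))  # 3, although X_full has 4 columns: rank-deficient
print(np.linalg.matrix_rank(X_drop))  # 3, with exactly 3 columns: full column rank
```

The rank deficiency in `X_full` is exactly what makes the least-squares coefficients non-unique, which is why regression software either errors out or silently drops a column.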

These guidelines are required for linear regression, but which other predictive models require them? The k-1 dummy rule applies to models where all the predictors are considered together, as a linear combination. Therefore, in addition to linear regression models, the rule would apply to logistic regression models, discriminant analysis, and in some cases to neural networks.

What happens if we use k-1 dummies in other predictive models? 
The choice of the dropped dummy variable does not affect the results of regression models, but it can affect other methods. For instance, consider a classification/regression tree. In a tree, predictors are evaluated one by one, so omitting one of the k dummies can result in an inferior predictive model. For example, suppose we have 12 monthly dummies and that in reality only January is different from the other months (the outcome differs between January and all other months). Now we run a tree omitting the January dummy as an input and keep the other 11 monthly dummies. The only way the tree can discover the January effect is by splitting on each of the remaining 11 dummies in turn, building 11 levels of splits. This is far less efficient than a single split on the January dummy.
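The January example can be simulated in a few lines. This is a sketch of the idea using scikit-learn (my choice of library, not named above): with all 12 dummies available, a depth-1 tree captures the January effect in a single split; drop the January dummy and no single split can isolate it.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
months = rng.integers(1, 13, size=1000)

# Outcome differs only in January (month 1), plus a little noise
y = np.where(months == 1, 10.0, 0.0) + rng.normal(0, 0.1, size=1000)

X_full = pd.get_dummies(pd.Series(months), prefix="m")  # all 12 monthly dummies
X_drop = X_full.drop(columns="m_1")                     # omit the January dummy

# A single split (max_depth=1) is enough when m_1 is available ...
tree_full = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X_full, y)
# ... but without m_1, no single dummy split separates January from the rest
tree_drop = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X_drop, y)

print(tree_full.score(X_full, y))  # R^2 near 1: one split on m_1 captures the effect
print(tree_drop.score(X_drop, y))  # R^2 near 0: the best single split barely helps
```

Deepening the tree lets the k-1 version eventually carve out January by eliminating the other 11 months one split at a time, which illustrates the inefficiency described above.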

This post is inspired by a discussion in the recent Predictive Analytics 1 online course. This topic deserves more than a short post, yet I haven't seen a thorough discussion anywhere.