K-nearest neighbors (k-NN) is a simple yet often powerful classification / prediction method. The basic idea, for predicting a new observation, is to find the k most similar observations in terms of the predictor (X) values, and then let those k neighbors vote to determine the predicted class membership (or take their Y average to predict their numerical outcome). Since this is such an intuitive method, I thought it would be useful to discuss two improvements that have been suggested by data miners. Both use weighting, but in different ways.
One intuitive improvement is to weight the neighbors by their proximity to the observation of interest. In other words, rather than giving each neighbor equal importance in the vote (or average), closer neighbors have higher impact on the prediction.
A second way to use weighting to improve the predictive performance of k-NN is related to predictors: In ordinary k-NN predictors are typically brought to the same scale by normalization, and then treated equally for the purpose of determining proximities of observations. An improvement is therefore to weight the predictors according to their predictive power, such that higher importance is given to more informative predictors. The question is how to assign the weights, or in other words, how to assign predictive power scores to the different predictors. There are a variety of papers out there suggesting different methods. The main approach is to use a different classification/prediction method that yield predictor importance measures (e.g., logistic regression), and then to use those measures in constructing the predictor weights within k-NN.