Superensemble classifier for learning from imbalanced business school data set
Private business schools in India face a common problem: selecting quality students for their MBA programs in order to achieve a desired placement percentage. The business school data set is biased towards one class, i.e., it is imbalanced in nature, and learning from an imbalanced data set is difficult. Most existing classification methods tend not to perform well on minority class examples when the data set is extremely imbalanced, because they aim to optimize the overall accuracy without considering the relative distribution of each class. The aim of the paper is twofold. We first propose an integrated sampling technique with an ensemble of classification tree (CT) and artificial neural network (ANN) models as one methodology that works better than other similar methods. We then propose a superensemble imbalanced classifier which works better on the original business school data set. Our proposed superensemble classifier not only handles the imbalanced data set but also achieves higher accuracy in feature selection cum classification problems. The proposal has been compared with other state-of-the-art classifiers and found to be very competitive.
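The point about optimizing overall accuracy can be made concrete with a small arithmetic sketch. The class sizes below (95 majority, 5 minority) and the helper names are hypothetical and chosen only for illustration; they are not taken from the business school data set, and the trivial classifier shown is not the proposed method.

```python
# Illustrative sketch only: the 95/5 split is an assumed toy example,
# not the class distribution of the business school data set.

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive):
    """Fraction of examples of class `positive` that were predicted as such."""
    preds_for_class = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_class) / len(preds_for_class)

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    classes = sorted(set(y_true))
    return sum(recall(y_true, y_pred, c) for c in classes) / len(classes)

# 95 majority examples (label 0) and 5 minority examples (label 1).
y_true = [0] * 95 + [1] * 5

# A trivial classifier that always predicts the majority class.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)
bal_acc = balanced_accuracy(y_true, y_pred)

print(f"overall accuracy:  {acc:.2f}")
print(f"minority recall:   {minority_recall:.2f}")
print(f"balanced accuracy: {bal_acc:.2f}")
```

In this toy setting the always-majority classifier scores high on overall accuracy while identifying none of the minority examples, which is why class-aware measures such as per-class recall are usually reported alongside accuracy for imbalanced data.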