Robust Combining of Disparate Classifiers through Order Statistics

05/20/1999
by   Kagan Tumer, et al.
0

Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In the typical setting investigated till now, each classifier is trained on data taken or resampled from a common data set, or (almost) randomly selected subsets thereof, and thus experiences similar quality of training data. However, in certain situations where data is acquired and analyzed on-line at several geographically distributed locations, the quality of data may vary substantially, leading to large discrepancies in performance of individual classifiers. In this article we introduce and investigate a family of classifiers based on order statistics, for robust handling of such cases. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when such combiners are used. We show analytically that the selection of the median, the maximum and in general, the i^th order statistic improves classification performance. Furthermore, we introduce the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that they are quite beneficial in presence of outliers or uneven classifier performance. Experimental results on several public domain data sets corroborate these findings.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset