Convex Hull-Based Multi-objective Genetic Programming for Maximizing ROC Performance
ROC is usually used to analyze the performance of classifiers in data mining. ROC convex hull (ROCCH) is the least convex major-ant (LCM) of the empirical ROC curve, and covers potential optima for the given set of classifiers. Generally, ROC performance maximization could be considered to maximize the ROCCH, which also means to maximize the true positive rate (tpr) and minimize the false positive rate (fpr) for each classifier in the ROC space. However, tpr and fpr are conflicting with each other in the ROCCH optimization process. Though ROCCH maximization problem seems like a multi-objective optimization problem (MOP), the special characters make it different from traditional MOP. In this work, we will discuss the difference between them and propose convex hull-based multi-objective genetic programming (CH-MOGP) to solve ROCCH maximization problems. Convex hull-based sort is an indicator based selection scheme that aims to maximize the area under convex hull, which serves as a unary indicator for the performance of a set of points. A selection procedure is described that can be efficiently implemented and follows similar design principles than classical hyper-volume based optimization algorithms. It is hypothesized that by using a tailored indicator-based selection scheme CH-MOGP gets more efficient for ROC convex hull approximation than algorithms which compute all Pareto optimal points. To test our hypothesis we compare the new CH-MOGP to MOGP with classical selection schemes, including NSGA-II, MOEA/D) and SMS-EMOA. Meanwhile, CH-MOGP is also compared with traditional machine learning algorithms such as C4.5, Naive Bayes and Prie. Experimental results based on 22 well-known UCI data sets show that CH-MOGP outperforms significantly traditional EMOAs.
READ FULL TEXT