1 Introduction
Object detection is one of several fundamental topics in computer vision. The task of object detection is to identify predefined objects in a given image using knowledge gained through analysis of a set of labelled positive and negative exemplars. Viola and Jones’ face detection algorithm
[23] forms the basis of many of the state-of-the-art real-time algorithms for object detection tasks. The most commonly adopted evaluation method by which to compare the detection performance of different algorithms is the Receiver Operating Characteristic (ROC) curve. The curve illustrates the varying performance of a binary classifier system as its discrimination threshold is altered. In the face and human detection literature, researchers are often interested in the low false positive area of the ROC curve, since this region characterizes the performance needed for most real-world vision applications. This is due to the fact that object detection is a highly asymmetric classification problem, as there are only ever a small number of target objects among the millions of background patches in a single test image. Even a modest false positive rate per scanning window would result in thousands of false positives in a single image, which is impractical for most applications. For many tasks, and particularly human detection, researchers also report the partial area under the ROC curve (pAUC), typically over a fixed range of false positives per image [7]. As the name implies, pAUC is calculated as the area under the ROC curve between two specified false positive rates (FPRs). It summarizes the practical performance of a detector and is often the primary performance measure of interest.
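To make the measure concrete, the following is a minimal sketch of how an empirical pAUC over an FPR range [alpha, beta] can be computed from classifier scores. The function name and the simplifying assumption that alpha * n and beta * n are (close to) integers are ours, not the paper's.

```python
import numpy as np

def partial_auc(pos_scores, neg_scores, alpha, beta):
    """Empirical pAUC over the FPR range [alpha, beta], normalized so a
    perfect ranking scores 1.0.  Assumes alpha * n and beta * n are
    (close to) integers, where n is the number of negatives."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # descending
    n = len(neg)
    j_lo, j_hi = int(round(alpha * n)), int(round(beta * n))
    # detection rate (TPR) at each false-positive step inside the range
    tprs = [np.mean(pos > neg[j]) for j in range(j_lo, j_hi)]
    return float(np.mean(tprs))
```

Restricting the range to the top-ranked negatives is what makes the measure focus on the low-FPR region that matters for detection.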
Although pAUC is the metric of interest used to evaluate detection performance, most classifiers do not directly optimize this evaluation criterion and, as a result, often underperform. In this paper, we present a principled approach for learning an ensemble classifier which directly optimizes the partial area under the ROC curve, where the range over which the area is calculated may be selected according to the desired application. Built upon the structured learning framework, we thus propose a novel form of ensemble classifier which directly optimizes the partial AUC score, which we call pAUCEns. As with all other boosting algorithms, our approach learns a predictor by building an ensemble of weak classification rules in a greedy fashion. It also relies on a sample reweighting mechanism to pass information between iterations. However, unlike traditional boosting, at each iteration the proposed approach places a greater emphasis on samples which have an incorrect ordering (a positive sample has an incorrect ordering if it is ranked below a negative sample; in other words, we want all positive samples to be ranked above all negative samples) in order to achieve the optimal partial AUC score. The result is an ensemble learning method which yields a scoring function consistent with the correct relative ordering of positive and negative samples and which optimizes the partial AUC score in a false positive rate range [α, β], where 0 ≤ α < β ≤ 1.
Main contributions
(1) We propose a new ensemble learning approach which explicitly optimizes the partial area under the ROC curve (pAUC) between any two given false positive rates. The method is of particular interest in the wide variety of applications where performance is most important over a particular range within the ROC curve. The approach shares similarities with conventional boosting methods, but differs significantly in that the proposed method optimizes a multivariate performance measure using structured learning. Our design is simple, and a conventional boosting-based visual detector can be transformed into a pAUCEns-based visual detector with very few modifications to the existing code. Our approach is efficient since it exploits both efficient weak classifier training and an efficient cutting plane solver for optimizing the partial AUC score in the structural SVM setting. (2) We show that our approach is more intuitive and simpler to use than alternative algorithms, such as Asymmetric AdaBoost [22] and Cost-Sensitive AdaBoost [14], where one needs to cross-validate the asymmetric parameter over a fixed set of discrete points. Furthermore, it is unclear how one would set the asymmetric parameter in order to achieve a maximal pAUC score for a specified false positive range. To our knowledge, our approach is the first principled ensemble method that directly optimizes the partial AUC in an arbitrary false positive range [α, β]. (3) Experimental results on several data sets, especially on challenging human detection data sets, demonstrate the effectiveness of the proposed approach. Our pedestrian detector performs better than or on par with the state-of-the-art, despite the fact that our detector uses only two standard low-level image features.
Related work
Various ensemble classifiers have been proposed in the literature. Of these, AdaBoost is one of the most well known, as it has achieved tremendous success in computer vision and machine learning applications. In object detection, the cost of missing a true target is often higher than the cost of a false positive. Classifiers that are optimal under a symmetric cost, and thus treat false positives and false negatives equally, cannot exploit this information. Several cost-sensitive learning algorithms, in which the classifier weights the positive class more heavily than the negative class, have thus been proposed.
Viola and Jones introduced the asymmetry property in Asymmetric AdaBoost (AsymBoost) [22]. However, the authors reported that this asymmetry is immediately absorbed by the first weak classifier, and heuristics are then used to avoid this problem. Wang et al. proposed a fully-corrective asymmetric boosting method which does not have this problem [25]. Note that one needs to carefully cross-validate the asymmetric parameter in order to achieve the desired result. Masnadi-Shirazi and Vasconcelos [14] proposed a cost-sensitive boosting algorithm based on the statistical interpretation of boosting. Their approach is to optimize the cost-sensitive loss by means of gradient descent. Shen et al. proposed LACBoost and FisherBoost to address this asymmetry issue in cascade classifiers [20]. Most works along this line address the pAUC evaluation criterion indirectly. In addition, one needs to carefully cross-validate the asymmetric parameter in order to maximize the detection rate in a particular false positive range. Several algorithms that directly optimize the pAUC score have been proposed in bioinformatics [9, 11]. Komori and Eguchi optimize the pAUC using a boosting-based algorithm [11], which is heuristic in nature. Narasimhan and Agarwal develop a structural SVM based method which directly optimizes the pAUC score [16]. They demonstrate that their approach, which uses a support vector method, significantly outperforms several existing algorithms, including pAUCBoost [11] and asymmetric SVM [28]. Building on Narasimhan and Agarwal’s work, we propose a principled fully-corrective ensemble method which directly optimizes the pAUC evaluation criterion. The approach is flexible and can be applied to an arbitrary false positive range [α, β]. To our knowledge, our approach is the first principled ensemble learning method that directly optimizes the partial AUC in a false positive range not bounded by zero. It is important to emphasize here the difference between our approach and that of [16]: the authors of [16] train a linear structural SVM, while our approach learns an ensemble of classifiers. For pedestrian detection, HOG with an ensemble of classifiers reduces the average miss rate over HOG+SVM by a significant margin [2].
Notation
Bold lowercase letters, e.g., w, u, denote column vectors and bold uppercase letters, e.g., H, denote matrices. Let {x_i^+}, i = 1, …, m, be the set of positive training data and {x_j^-}, j = 1, …, n, be the set of negative training data. The set of all training samples can be written as S = {x_k}, k = 1, …, m+n, where x_k = x_k^+ for k = 1, …, m and x_k = x_{k−m}^- for k = m+1, …, m+n. We denote by H the set of all possible outputs of weak learners. Assuming that we have T possible weak learners, the outputs of the weak learners on the positive and negative data can be represented as matrices H^+ ∈ {−1, +1}^{T×m} and H^- ∈ {−1, +1}^{T×n}, respectively. Here H^+_{t,i} = h_t(x_i^+) is the label predicted by the weak learner h_t on the positive training datum x_i^+. Each column of the matrix represents the output of all weak learners when applied to a single training instance; each row represents the output predicted by a single weak learner on all the training data. The goal is to learn a set of binary weak learners and a scoring function, f, that has good performance in terms of the pAUC between two specified false positive rates α and β, where 0 ≤ α < β ≤ 1.
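As a concrete illustration of this notation, the sketch below builds the weak-learner output matrix for a set of decision stumps. The function name and the stump parameterization (one feature index and one threshold per stump) are our own assumptions for illustration.

```python
import numpy as np

def stump_output_matrix(X, feature_idx, thresholds):
    """Return H with H[t, i] in {-1, +1}: the prediction of stump t on
    sample i, where stump t thresholds feature feature_idx[t]."""
    X = np.asarray(X, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # X[:, feature_idx] has shape (num_samples, T); transpose to (T, num_samples)
    return np.where(X[:, feature_idx].T > thresholds[:, None], 1, -1)
```

Columns of the result correspond to training instances and rows to weak learners, matching the convention above.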
Structured learning approach for optimizing pAUC
Before we propose our approach, we briefly review the structural SVM approach to pAUC optimization of [16], upon which our ensemble learning approach is built. Unless otherwise stated, we follow the symbols used in [16]. The area under the empirical ROC curve (AUC) can be defined as,
(1)  AUC = (1 / mn) Σ_{i=1}^{m} Σ_{j=1}^{n} 1(f(x_i^+) > f(x_j^-)),
and the partial AUC in the false positive range [α, β] can be written as [5, 16],
(2)  pAUC(α, β) = 1 / (mn(β − α)) Σ_{i=1}^{m} [ (j_α − nα) · 1(f(x_i^+) > f(x_{(j_α)}^-)) + Σ_{j=j_α+1}^{j_β} 1(f(x_i^+) > f(x_{(j)}^-)) + (nβ − j_β) · 1(f(x_i^+) > f(x_{(j_β+1)}^-)) ],
where j_α = ⌈nα⌉, j_β = ⌊nβ⌋, and x_{(j)}^- denotes the negative instance in S ranked in the j-th position amongst negative samples in descending order of scores. The three terms in (2) correspond to the sum of detection rates at the false positive rates α, j/n for j = j_α+1, …, j_β, and β, respectively.
Given a training sample S, our objective is to find a linear function f(x) = w^T x that optimizes the pAUC in an FPR range of [α, β]. We cast this pAUC optimization problem as a structured learning task. For any ordering of the training instances, the relative ordering of positive instances and negative instances is represented via a matrix π ∈ {0, 1}^{m×n} where,
(3)  π_{ij} = 0 if x_i^+ is ranked above x_j^-, and π_{ij} = 1 otherwise.
We define the correct relative ordering of S as π̄, where π̄_{ij} = 0 for all i and j. The pAUC loss in the false positive range [α, β] with respect to π̄ can be written as,
(4)  Δ(π̄, π) = 1 / (mn(β − α)) Σ_{i=1}^{m} [ (j_α − nα) π_{i,(j_α)} + Σ_{j=j_α+1}^{j_β} π_{i,(j)} + (nβ − j_β) π_{i,(j_β+1)} ],
where (j) denotes the index of the negative instance ranked in the j-th position consistent with the matrix π. We define the joint feature map of the form
(5)  φ(S, π) = (1 / mn) Σ_{i=1}^{m} Σ_{j=1}^{n} (1 − π_{ij}) (x_i^+ − x_j^-).
The choice of φ guarantees that the variable w, which optimizes w^T φ(S, π), will also produce the scoring function f(x) = w^T x that achieves the optimal partial AUC score. The above problem can be summarized as the following convex optimization problem [16]:
(6)  min_{w, ξ} (1/2) ‖w‖² + ν ξ
     s.t. w^T (φ(S, π̄) − φ(S, π)) ≥ Δ(π̄, π) − ξ for all π,
and ξ ≥ 0, where ν > 0 is a regularization parameter and ξ is a slack variable. Note that π̄ denotes the correct relative ordering and π denotes any arbitrary ordering.
2 Our approach
In order to design an ensemble-like algorithm for the pAUC, we first introduce a projection function, h(·), which projects an instance vector x to {−1, +1}. This projection function is also known as the weak learner in boosting. In contrast to the previously described structured learning, we learn a scoring function which optimizes the area under the curve between two false positive rates and which has the form f(x) = w^T h(x), where w is the linear coefficient vector and h(·) = [h_1(·), …, h_T(·)] denotes a set of binary weak learners. Let us assume for the moment that the set of all projection functions has already been learned. By using the same pAUC loss, Δ(π̄, π), as in (4), and the same form of feature mapping as in (5), the optimization problem we want to solve is:
(7)  min_{w, ξ} (1/2) ‖w‖² + ν ξ
     s.t. w^T (φ(H, π̄) − φ(H, π)) ≥ Δ(π̄, π) − ξ for all π,
and w ≥ 0, ξ ≥ 0. Here H denotes the projected outputs for the positive and negative training samples, and the feature map φ(H, π) is defined as,
(8)  φ(H, π) = (1 / mn) Σ_{i=1}^{m} Σ_{j=1}^{n} (1 − π_{ij}) (h(x_i^+) − h(x_j^-)).
The only difference between (6) and (7) is that the original data is now projected into a new nonlinear feature space. We will show how this further improves the pAUC score in the experiment section. The dual problem of (7) can be written as (see supplementary),
(9)  max_{λ ≥ 0, q ≥ 0}  Σ_π λ(π) Δ(π̄, π) − (1/2) ‖ Σ_π λ(π) (φ(H, π̄) − φ(H, π)) + q ‖²
     s.t. Σ_π λ(π) ≤ ν,
where λ is the dual variable associated with the ordering constraints, and q denotes the dual variable associated with the inequality constraint w ≥ 0. To derive the Lagrange dual problem, the following KKT condition is used,
(10)  w = Σ_π λ(π) (φ(H, π̄) − φ(H, π)) + q.
Finding best weak learners
In this section, we show how one can explicitly learn the projection functions, h(·). We use the idea of column generation to derive an ensemble-like algorithm similar to LPBoost [4]. The condition for applying column generation is that the duality gap between the primal and dual problems is zero (strong duality). By inspecting the KKT conditions, at optimality, (10) must hold for all t = 1, …, T; in other words, w_t = Σ_π λ(π) (φ_t(H, π̄) − φ_t(H, π)) + q_t must hold for all t.
For the weak learners in the current working set, the corresponding condition in (10) is satisfied by the current solution. Weak learners that have not yet been selected do not appear in the current restricted optimization problem, and for these w_t = 0 and hence q_t = −Σ_π λ(π) (φ_t(H, π̄) − φ_t(H, π)). It is easy to see that if Σ_π λ(π) (φ_t(H, π̄) − φ_t(H, π)) ≤ 0 for every t that is not in the current working set, then the current solution is already the globally optimal one. Hence the subproblem for selecting the best weak learner is:
(11)  h* = argmax_{h} Σ_π λ(π) (φ_h(H, π̄) − φ_h(H, π)),
where φ_h denotes the component of the feature map corresponding to the weak learner h. In other words, we pick the weak learner whose value deviates most from zero. Substituting (8) into (11), the subproblem for generating the optimal weak learner at each iteration can be written as,
(12)  h* = argmax_{h} | Σ_{k=1}^{m+n} u_k h(x_k) |,
where i, j and k index the positive training samples (i = 1, …, m), the negative training samples (j = 1, …, n) and the entire training set (k = 1, …, m+n), respectively. Here
(13)  u_k = (1 / mn) Σ_π λ(π) Σ_{j=1}^{n} π_{kj} for k = 1, …, m, and u_k = −(1 / mn) Σ_π λ(π) Σ_{i=1}^{m} π_{i,k−m} for k = m+1, …, m+n.
For decision stumps, taking the absolute value in (12) is always valid since the weak learner set is negation-closed [12]. In other words, if h ∈ H, then −h ∈ H, and vice versa; here −h(·) = −1 × h(·). For decision stumps, one can simply flip the inequality sign of the stump to obtain its negation. In fact, any linear classifier of the form sign(a^T x + b) is negation-closed. Using (12) to choose the best weak learner is not heuristic, as the solution to (11) decreases the duality gap the most for the current solution. See supplementary for more details.
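The selection step above can be sketched with an exhaustive decision-stump search, with u playing the role of the signed per-sample weights in (13). The helper name and the brute-force threshold enumeration are our own illustration, not the paper's implementation.

```python
import numpy as np

def best_stump(X, u):
    """Exhaustively search decision stumps, returning the (feature,
    threshold, polarity) whose weighted response |sum_k u_k * h(x_k)|
    is largest.  u holds signed per-sample weights; because stumps are
    negation-closed, a large negative sum is as good as a positive one."""
    X, u = np.asarray(X, float), np.asarray(u, float)
    best_val, best = -np.inf, None
    for d in range(X.shape[1]):
        vals = np.unique(X[:, d])
        # candidate thresholds: midpoints between consecutive feature values
        for thr in 0.5 * (vals[:-1] + vals[1:]):
            h = np.where(X[:, d] > thr, 1, -1)
            edge = float(np.sum(u * h))
            if abs(edge) > best_val:
                best_val = abs(edge)
                best = (d, float(thr), 1 if edge >= 0 else -1)
    return best
```

A negative edge simply means the negated stump should be used, which is what the returned polarity encodes.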
Optimizing weak learners’ coefficients
We solve for the optimal w that minimizes our objective function (7). However, the optimization problem (7) has an exponential number of constraints, one for each matrix π. As in [16, 10], we use the cutting plane method to solve this problem. The basic idea of the cutting plane method is that a small subset of the constraints is sufficient to find an approximate solution to the original problem. The algorithm starts with an empty constraint set and adds the most violated constraint at each iteration. The QP problem is solved as a linear SVM and the process continues until no constraint is violated by more than a tolerance ε. Since the quadratic program is of constant size and the cutting plane method converges in a constant number of iterations, the major bottleneck lies in the combinatorial optimization (over the orderings π) associated with finding the most violated constraint at each iteration. Narasimhan and Agarwal show how this combinatorial problem can be solved efficiently in polynomial time [16]. We briefly discuss their efficient algorithm in this section. The combinatorial optimization problem associated with finding the most violated constraint can be written as,
(14)  π̂ = argmax_{π} Q_w(π),
where
(15)  Q_w(π) = Δ(π̄, π) − w^T (φ(H, π̄) − φ(H, π)).
The trick to speeding up (14) is to note that any ordering of the instances that is consistent with w yields the same objective value, Q_w(π), in (15). In addition, one can break (14) down into smaller maximization problems by restricting the search space from the set of all orderings to the subset of orderings in which the relative ranking of the scores of any two negative instances is consistent with w. The new optimization problem is now easier to solve, as the set of negative instances over which the loss term in (15) is computed is the same for all orderings in the search space. This simplification substantially reduces the computational complexity of solving (14); interested readers may refer to [16].
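The overall cutting-plane procedure can be sketched generically as follows. The callback names (solve_qp, most_violated, loss) are our own abstraction of the steps described above, not an interface from the paper's code.

```python
def cutting_plane(solve_qp, most_violated, loss, eps=1e-3, max_iter=100):
    """Generic cutting-plane loop: grow a working set of constraints,
    re-solve the restricted QP, and stop once the most violated
    constraint is violated by no more than eps."""
    working_set = []
    w, xi = solve_qp(working_set)            # solve with no constraints
    for _ in range(max_iter):
        c = most_violated(w)                 # combinatorial subproblem
        if loss(w, c) <= xi + eps:           # tolerance reached
            break
        working_set.append(c)                # add the cut
        w, xi = solve_qp(working_set)        # re-solve restricted QP
    return w
```

For pAUCEns, solve_qp would solve (7) restricted to the working set, and most_violated would solve the problem in (14).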
Discussion
Our final ensemble classifier has a similar form to the AdaBoost-based object detector of [23]. In Algorithm 1, steps ① and ② of our algorithm are exactly the same as in [23]. Similar to AdaBoost, u in step ① plays the role of the sample weight associated with each training sample. The major difference between AdaBoost and our approach is in steps ③ and ④, where the weak learner’s coefficient is computed and the sample weights are updated. In AdaBoost, the weak learner’s coefficient is calculated as α = (1/2) log((1 − err)/err), where err = Σ_k w_k 1(h(x_k) ≠ y_k) is the weighted error and 1(·) is the indicator function, and the sample weights are updated with w_k ← w_k exp(−α y_k h(x_k)). We point this out here since only a minimal modification is required in order to transform an existing implementation of AdaBoost into pAUCEns. Given existing AdaBoost code and the publicly available implementation of [16], our pAUCEns can be implemented with very little additional code. A computational complexity analysis of our approach can be found in the supplementary.
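For reference, the standard AdaBoost update that steps ③ and ④ replace can be sketched as follows (the function name is ours; labels and weak-learner outputs are assumed to be in {-1, +1}):

```python
import numpy as np

def adaboost_step(w, y, h):
    """One AdaBoost iteration on current sample weights w: compute the
    weak learner coefficient from the weighted error, then reweight so
    that misclassified samples gain weight.  y and h are {-1,+1} vectors."""
    err = np.sum(w * (h != y)) / np.sum(w)          # weighted error
    coef = 0.5 * np.log((1.0 - err) / err)          # weak learner coefficient
    w_new = w * np.exp(-coef * y * h)               # exponential reweighting
    return coef, w_new / np.sum(w_new)              # renormalize weights
```

In pAUCEns these two lines are replaced by the structured-learning update, while the rest of the boosting loop is unchanged.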
3 Experiments
Synthetic data set
We first illustrate the effectiveness of our approach on a synthetic data set similar to the one used in [22]. We compare pAUCEns against the baseline AdaBoost, Cost-Sensitive AdaBoost (CS-AdaBoost) [14] and Asymmetric AdaBoost (AsymBoost) [22]. We use vertical and horizontal decision stumps as the weak classifiers and evaluate the partial AUC score of the different algorithms at low FPRs. For each algorithm, we train strong classifiers consisting of varying numbers of weak classifiers. Additional details of the experimental setup are provided in the supplementary. Fig. 1 illustrates the decision boundaries (with each threshold set to a fixed false positive rate) and the pAUC scores. Our approach outperforms all other asymmetric classifiers. We observe that pAUCEns places more emphasis on positive samples than on negative samples, so as to ensure the highest detection rate at the leftmost part of the ROC curve. Even though we choose the asymmetric parameter from a large range of values, both CS-AdaBoost and AsymBoost perform slightly worse than our approach. AdaBoost performs worst on this toy data set since it optimizes the overall classification accuracy. However, as the number of weak classifiers increases, we observe that all algorithms perform similarly on this simple toy data set. This observation could explain the success of AdaBoost in many object detection applications even though AdaBoost only minimizes the symmetric error rate.
In the next experiment, we train a strong classifier with a small number of weak classifiers and compare the performance of the different classifiers at a fixed low FPR. We choose this operating point since it is the node learning goal often used in training a cascade classifier, and we learn only a few weak classifiers since the first node of a cascade often contains a small number of weak classifiers for real-time performance. For pAUCEns, we set the FPR range [α, β] accordingly. In Fig. 2, we display the decision boundary of each algorithm together with its pAUC score and its detection rate at the chosen false positive rate. We observe that our approach and AsymBoost achieve the highest detection rate at this false positive rate; however, our approach outperforms AsymBoost on the pAUC score. We also observe that our approach places more emphasis on positive samples near the corners than the other algorithms do.
Proteinprotein interaction prediction
In this experiment, we compare our approach with existing algorithms which optimize the pAUC in bioinformatics. The problem we consider here is protein-protein interaction prediction [18], in which the task is to predict whether a pair of proteins interact or not. We used the data set labelled ‘Physical Interaction Task in Detailed feature type’, which is publicly available online (http://www.cs.cmu.edu/~qyj/papers_sulp/proteins05_pages/featuredownload.html). The data set contains protein pairs known to be interacting (positive) and a random set of protein pairs labelled as non-interacting (negative). We use a subset of the features, as in [16]. We randomly split the data into two groups, one for training/validation and one for evaluation. We choose the best regularization parameter from a fixed set of candidate values by cross-validation, and repeat our experiments several times using the same regularization parameter. We train a linear classifier as our weak learner using LIBLINEAR [8], set a maximum number of boosting iterations, and report the pAUC score of our approach in Table 1. Baselines include SVM-based AUC and pAUC optimization [16], pAUCBoost [11] and Asymmetric SVM [28]. Our approach outperforms all existing algorithms which optimize either the AUC or the pAUC. We attribute our improvement over the linear method of [16] to the introduction of nonlinearity into the original problem. This phenomenon has also been observed in face detection, as reported in [27].
Comparison to other asymmetric boosting
Here we compare pAUCEns against several boosting algorithms previously proposed for the problem of object detection, namely AdaBoost with Fisher LDA post-processing [27], AsymBoost [22] and CS-AdaBoost [14]. The results of AdaBoost are also presented as the baseline. For each algorithm, we train a strong classifier with a fixed number of weak classifiers and then calculate the pAUC score by varying the threshold value over the chosen FPR range. For each algorithm, the experiment is repeated several times and the average pAUC score is reported. For AsymBoost and CS-AdaBoost, the asymmetric parameter is chosen by cross-validation from a fixed set of candidate values. We evaluate the performance of all algorithms on three vision data sets: USPS digits, scenes and faces. See supplementary for more details on feature extraction. We report the experimental results in Table 2, which lists the average pAUC scores and their standard deviations of pAUCEns (ours), AdaBoost [23], AdaBoost + LDA [27], AsymBoost [22] and CS-AdaBoost [14] on the USPS, SCENE and FACE data sets at various boosting iterations, with the best average performance shown in boldface. From the table, pAUCEns demonstrates the best performance on all three vision data sets.
Pedestrian detection: Strong classifier
We evaluate our approach on the pedestrian detection task. We train our approach on the INRIA pedestrian data set. For the positive training data, we use all INRIA cropped pedestrian images. To generate the negative training data, we first train a cascade classifier using Viola and Jones’ approach; we then combine random negative windows generated in the first node with additional negative windows generated in the subsequent nodes, and the resulting negative windows are used for training the strong classifier. We generate a large pool of features by combining histogram of oriented gradient (HOG) features [3] and covariance (COV) features [21]. (Covariance features capture the relationship between different image statistics and have been shown to perform well in our previous experiments; however, other discriminative features could also be used here instead, e.g., Haar-like features, Local Binary Patterns (LBP) [15], Sketch Tokens [13] and self-similarity of low-level features (CSS) [24].) Additional details of the HOG and COV parameters are provided in the supplementary. We use weighted linear discriminant analysis (WLDA) as the weak classifier. We train the weak classifiers with multiple exits [17]; to be more specific, we place exit thresholds after fixed numbers of weak classifiers. These exits significantly reduce the evaluation time during testing. The regularization parameter ν is cross-validated; since we have not carefully cross-validated over a finer range of values, tuning this parameter could yield a further improvement. The training time of our approach is under two hours on a parallelized quad-core Xeon machine.
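The multi-exit evaluation described above can be sketched as follows; the function name and the representation of the exits as a mapping from weak-learner count to rejection threshold are our own illustration.

```python
def multi_exit_score(responses, coefs, exits):
    """Evaluate a boosted classifier with intermediate exits.  `exits`
    maps a weak-learner count to a rejection threshold; a window whose
    partial score falls below the threshold is rejected early."""
    score = 0.0
    for t, (r, c) in enumerate(zip(responses, coefs), start=1):
        score += c * r                     # accumulate weighted responses
        if t in exits and score < exits[t]:
            return None                    # early rejection at this exit
    return score                           # survived all exits
```

A window returning None is discarded without evaluating the remaining weak classifiers, which is what makes the detector fast at test time.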
During evaluation, each test image is scanned with a fixed step stride over an input image pyramid. The overlapping detection windows are merged using the greedy non-maximum suppression strategy introduced in [6]. We use the continuous AUC evaluation software of Sermanet et al. [19] and report the pAUC score over several FPPI (false positives per image) ranges in Table 3. From the table, we observe that setting β to be minimal yields the best pAUC score over the smallest FPPI range; as the FPPI range increases, higher values of β tend to perform better. This clearly illustrates the advantage of our approach.
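The greedy non-maximum suppression used to merge detections can be sketched as below. Note that [6] merges windows with its own overlap criterion; this illustration uses intersection-over-union, which is an assumption on our part.

```python
import numpy as np

def greedy_nms(boxes, scores, overlap_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    box and discard boxes overlapping it by more than overlap_thresh.
    Boxes are (x1, y1, x2, y2); overlap is intersection-over-union."""
    boxes = np.asarray(boxes, float)
    order = np.argsort(scores)[::-1]       # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # intersection of the kept box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= overlap_thresh]   # drop suppressed boxes
    return keep
```

The returned indices are the surviving detections in descending score order.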
Fig. 3 compares the performance of our approach with other state-of-the-art algorithms on the INRIA pedestrian data set. We use the evaluation software of Dollár et al. [7], which computes the AUC from discrete points sampled over a fixed FPPI range. Our approach performs second best on this data set, comparable to VeryFast [1], which trains multiple detectors at multiple scales. Upon closer inspection, our pAUCEns performs slightly better than VeryFast at lower FPPI, and VeryFast performs slightly better at higher FPPI. We also evaluated our strong classifier on the TUD-Brussels and ETH pedestrian data sets, but observed that the detection results contain a large number of false positives. Instead of bootstrapping with more negative samples as in [6, 24], we train a cascade classifier in the next section.
Pedestrian detection: Cascade classifier
In this section, we train a cascade classifier using our pAUCEns. We train our detector on the INRIA training set and evaluate it on the INRIA, TUD-Brussels and ETH test sets. On both the TUD-Brussels and ETH data sets, we upsample the original images before applying our pedestrian detector. We train the human detector with a combination of HOG and COV features as previously described. To achieve the node learning goal of the cascade (each node achieves an extremely high detection rate and a moderate false positive rate), we optimize the pAUC over the corresponding FPR range. We train a multi-exit cascade [17]. In this experiment, we use the software of [19] to compute the continuous AUC score over the FPPI range of interest, sort the different algorithms based on the pAUC score over this range, and report the results in Fig. 4. We compare our proposed approach with the baseline HOG-COV classifier (using AdaBoost) and observe that our approach reduces the average miss rate over HOG-COV on the INRIA test set. From Fig. 4, our approach achieves performance similar to the state-of-the-art detectors. We then break down the experimental results of different measures using the partial AUC score (FPPI range (0, 0.1)) in Table 4. On average, our approach performs best on the large evaluation setting, where pedestrians are at least 100 pixels tall. On other settings, our approach yields results competitive with the best detector in that category. In summary, our approach performs better than or on par with the state-of-the-art despite its simplicity (in comparison to LatSvm, a part-based approach which models unknown parts as latent variables). In addition, the current detector is trained with only two discriminative visual features (HOG and COV); applying additional discriminative features, e.g., LBP [26] or motion features [24], could further improve the overall detection performance.
Ours  ChnFtrs  ConvNet  CrossTalk  FPDW  FeatSynth  FtrMine  HOG  HikSvm  HogLbp  LatSvmV1  LatSvmV2  MultiFtr  Pls  PoseInv  Shapelet  VJ  VeryFast  

Reasonable (min. 50 pixels tall & no/partial occlusion)  Partial AUC(0,0.1)%  
INRIAFixed  27.4  31.6  31.9  26.9  30.9  49.3  71.5  59.4  53.9  50.5  62.6  29.1  51.3  50.4  89.4  93.0  83.2  23.9 
TudBrussels  65.8  72.2  77.8  68.8  77.0      87.9  92.3  91.3  95.5  80.8  81.6  82.6  94.5  97.8  97.4   
ETH  62.8  72.4  62.8  67.0  74.5      78.1  85.5  67.4  86.1  61.4  73.1  69.5  98.1  97.3  95.4  68.7 
Large (min. 100 pixels tall)  Partial AUC(0,0.1)%  
INRIAFixed  26.0  29.6  27.6  25.0  28.7  48.6  70.9  59.0  53.1  49.5  61.6  25.9  50.0  49.3  89.4  92.9  82.9  21.7 
TudBrussels  47.3  50.0  49.5  52.8  53.8      88.4  85.1  73.9  88.9  67.9  71.4  66.8  94.2  92.8  96.3   
ETH  42.9  57.6  48.1  48.6  62.7      56.9  66.0  54.2  74.7  45.5  59.4  50.8  96.3  93.4  92.0  48.6 
Near (min. 80 pixels tall)  Partial AUC(0,0.1)%  
INRIAFixed  25.9  30.1  30.8  25.4  29.5  48.3  70.9  58.5  53.0  49.4  62.0  27.6  50.3  49.5  89.2  92.9  82.8  22.7 
TudBrussels  49.0  59.1  60.1  58.7  62.1      87.1  88.0  79.5  91.7  70.6  74.5  75.1  93.9  95.1  96.3   
ETH  51.3  64.8  51.8  55.9  66.5      69.2  74.2  56.7  77.5  51.0  63.2  60.9  97.8  95.2  93.6  55.4 
Medium (min. 30 pixels tall and max. 80 pixels tall)  Partial AUC(0,0.1)%  
INRIAFixed  100.0  100.0  54.9  96.5  100.0  100.0  100.0  100.0  100.0  94.6  88.8  96.5  94.3  100.0  96.5  96.5  89.6  51.3 
TudBrussels  75.3  78.0  84.6  74.0  81.3      86.4  93.3  97.0  96.1  85.7  83.7  86.3  94.0  98.6  97.3   
ETH  67.2  64.8  76.0  65.0  67.1      69.8  80.9  78.7  88.3  76.7  68.6  67.0  96.3  89.8  89.0  73.2 
4 Conclusion
We have proposed a new ensemble learning method for object detection. The proposed approach is based on optimizing the partial AUC score in a given FPR range [α, β]. Extensive experiments demonstrate the effectiveness of the proposed approach in visual detection tasks. We plan to explore the possibility of applying the proposed approach to the multi-scale detector of [1] in order to further improve detection results on very low resolution pedestrian images.
References
 [1] R. Benenson, M. Mathias, R. Timofte, and L. V. Gool. Pedestrian detection at 100 frames per second. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2012.
 [2] R. Benenson, M. Mathias, T. Tuytelaars, and L. V. Gool. Seeking the strongest rigid detector. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2013.
 [3] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., volume 1, 2005.
 [4] A. Demiriz, K. Bennett, and J. Shawe-Taylor. Linear programming boosting via column generation. Mach. Learn., 46(1–3):225–254, 2002.
 [5] L. E. Dodd and M. S. Pepe. Partial AUC estimation and regression. Biometrics, 59(3):614–623, 2003.
 [6] P. Dollár, Z. Tu, P. Perona, and S. Belongie. Integral channel features. In Proc. of British Mach. Vis. Conf., 2009.
 [7] P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: An evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell., 34(4):743–761, 2012.
 [8] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. J. Mach. Learn. Res., 9:1871–1874, 2008.
 [9] M.-J. Hsu and H.-M. Hsueh. The linear combinations of biomarkers which maximize the partial area under the ROC curves. Comp. Stats., 28(2):1–20, 2012.
 [10] T. Joachims, T. Finley, and C.-N. J. Yu. Cutting-plane training of structural SVMs. Mach. Learn., 77(1):27–59, 2009.
 [11] O. Komori and S. Eguchi. A boosting method for maximizing the partial area under the ROC curve. BMC Bioinformatics, 11(1):314, 2010.
 [12] O. Komori and S. Eguchi. Boosting learning algorithm for pattern recognition and beyond. IEICE Trans. Infor. and Syst., 94(10):1863–1869, 2011.
 [13] J. J. Lim, C. L. Zitnick, and P. Dollár. Sketch Tokens: A learned mid-level representation for contour and object detection. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2013.
 [14] H. Masnadi-Shirazi and N. Vasconcelos. Cost-sensitive boosting. IEEE Trans. Pattern Anal. Mach. Intell., 33(2):294–309, 2011.
 [15] Y. Mu, S. Yan, Y. Liu, T. Huang, and B. Zhou. Discriminative local binary patterns for human detection in personal album. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., Anchorage, AK, US, 2008.
 [16] H. Narasimhan and S. Agarwal. A structural SVM based approach for optimizing partial AUC. In Proc. Int. Conf. Mach. Learn., 2013.
 [17] M.-T. Pham, V.-D. D. Hoang, and T.-J. Cham. Detection with multi-exit asymmetric boosting. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2008.
 [18] Y. Qi, Z. BarJoseph, and J. KleinSeetharaman. Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Proteins: Struct., Func., and Bioinfor., 63(3):490–500, 2006.
 [19] P. Sermanet, K. Kavukcuoglu, S. Chintala, and Y. LeCun. Pedestrian detection with unsupervised multi-stage feature learning. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2013.
 [20] C. Shen, P. Wang, S. Paisitkriangkrai, and A. van den Hengel. Training effective node classifiers for cascade classification. Int. J. Computer Vision, 103(3):326–347, 2013.
 [21] O. Tuzel, F. Porikli, and P. Meer. Pedestrian detection via classification on Riemannian manifolds. IEEE Trans. Pattern Anal. Mach. Intell., 30(10):1713–1727, 2008.
 [22] P. Viola and M. Jones. Fast and robust classification using asymmetric AdaBoost and a detector cascade. In Proc. Adv. Neural Inf. Process. Syst., pages 1311–1318. MIT Press, 2002.
 [23] P. Viola and M. J. Jones. Robust real-time face detection. Int. J. Comp. Vis., 57(2):137–154, 2004.
 [24] S. Walk, N. Majer, K. Schindler, and B. Schiele. New features and insights for pedestrian detection. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., San Francisco, US, 2010.
 [25] P. Wang, C. Shen, N. Barnes, and H. Zheng. Fast and robust object detection using asymmetric totally-corrective boosting. IEEE Trans. Neural Networks and Learning Systems, 23(1):33–46, 2012.
 [26] X. Wang, T. X. Han, and S. Yan. An HOG-LBP human detector with partial occlusion handling. In Proc. IEEE Int. Conf. Comp. Vis., 2009.
 [27] J. Wu, S. C. Brubaker, M. D. Mullin, and J. M. Rehg. Fast asymmetric learning for cascade face detection. IEEE Trans. Pattern Anal. Mach. Intell., 30(3):369–382, 2008.
 [28] S.-H. Wu, K.-P. Lin, C.-M. Chen, and M.-S. Chen. Asymmetric support vector machines: low false-positive learning under the user tolerance. In Proc. of Intl. Conf. on Knowledge Discovery and Data Mining, 2008.