Binary Classification with Bounded Abstention Rate
We consider the problem of binary classification with abstention in the comparatively less studied bounded-rate setting. We begin by obtaining a characterization of the Bayes optimal classifier for an arbitrary input-label distribution P_XY. Our result generalizes, and provides an alternative proof for, the result first obtained by Chow (1957) and later re-derived by Denis and Hebiri (2015) under a continuity assumption on P_XY. We then propose a plug-in classifier that employs unlabeled samples to decide the region of abstention, and derive an upper bound on its excess risk under standard Hölder smoothness and margin assumptions. Unlike the plug-in rule of Denis and Hebiri (2015), our classifier satisfies the abstention constraint with high probability and can also handle discontinuities in the empirical CDF. We also derive lower bounds that demonstrate the minimax near-optimality of our proposed algorithm. To address the excessive complexity of the plug-in classifier in high dimensions, we propose a computationally efficient algorithm that builds upon prior work on convex loss surrogates, and obtain bounds on its excess risk in the realizable case. We empirically compare the performance of the proposed algorithm with a baseline on several UCI benchmark datasets.
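The plug-in idea described above can be illustrated with a minimal sketch: assuming an estimate eta_hat of the regression function P(Y = 1 | X = x) (fitted, e.g., on the labeled data), the abstention region consists of points whose confidence |eta_hat(x) - 1/2| falls below a threshold calibrated on the unlabeled samples so that at most a fraction delta of points are rejected. The function names and the budget parameter delta below are illustrative and not taken from the paper.

```python
import numpy as np

def fit_abstention_threshold(eta_hat, X_unlabeled, delta):
    """Calibrate a rejection threshold on unlabeled data so that the
    empirical abstention rate does not exceed the budget delta."""
    # Confidence score: distance of the estimated regression function
    # eta_hat from the decision boundary 1/2.
    scores = np.abs(eta_hat(X_unlabeled) - 0.5)
    # Abstaining on points with score below the empirical delta-quantile
    # rejects roughly a delta fraction of the unlabeled sample.
    return np.quantile(scores, delta)

def plug_in_predict(eta_hat, X, threshold):
    """Predict +1/-1, or 0 on points where the classifier abstains."""
    eta = eta_hat(X)
    preds = np.where(eta >= 0.5, 1, -1)
    preds[np.abs(eta - 0.5) < threshold] = 0
    return preds
```

With discrete or tied scores the empirical quantile can overshoot the budget; the high-probability guarantee and the handling of discontinuities in the empirical CDF mentioned above require a more careful choice of threshold, which this sketch omits.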