Learning to Bound the Multi-class Bayes Error

11/15/2018
by Salimeh Yasaei Sekeh, et al.

In the context of supervised learning, meta learning uses features, metadata, and other side information to learn about the difficulty, behavior, or composition of a problem. This knowledge can contextualize classifier results or guide decisions about future data sampling. In this paper, we are specifically interested in learning the Bayes error rate (BER) from a labeled data sample. Providing a tight bound on the BER that is also feasible to estimate has been a long-standing challenge. Previous work [1] showed that a pairwise bound, based on the sum of the Henze-Penrose (HP) divergence over label pairs, can be estimated directly from a sum of Friedman-Rafsky (FR) multivariate run test statistics. However, when the dataset and the number of classes are large, this bound is computationally infeasible to calculate and may not be tight. Other multi-class bounds also suffer from computationally complex estimation procedures. In this paper, we present a generalized HP divergence measure that allows the Bayes error rate to be estimated with log-linear computation. We prove that the proposed bound is tighter than both the pairwise method and a bound proposed by Lin [2]. We also show empirically that these bounds are close to the BER. We illustrate the proposed method on the MNIST dataset and demonstrate its utility for evaluating feature reduction strategies.
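To make the pairwise machinery the abstract refers to concrete, the sketch below estimates the HP divergence between two classes with the FR statistic: build a Euclidean minimum spanning tree over the pooled sample and count edges that join points from different classes. This is a minimal two-class illustration of the estimator from [1], not the paper's generalized multi-class bound; it assumes equal class priors, and the function names are ours.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def fr_hp_divergence(X, Y):
    """Friedman-Rafsky estimate of the Henze-Penrose divergence
    between samples X (m x d) and Y (n x d)."""
    m, n = len(X), len(Y)
    Z = np.vstack([X, Y])
    labels = np.concatenate([np.zeros(m), np.ones(n)])
    # Dense pairwise-distance matrix; fine for modest pooled sample sizes.
    D = squareform(pdist(Z))
    mst = minimum_spanning_tree(D).tocoo()
    # R = number of MST edges whose endpoints come from different samples.
    R = int(np.sum(labels[mst.row] != labels[mst.col]))
    # FR estimator of the HP divergence: 1 - R(m+n)/(2mn), clipped at 0.
    return max(0.0, 1.0 - R * (m + n) / (2.0 * m * n))

def ber_bounds(X, Y):
    """Two-class BER bounds from the HP divergence, assuming equal
    priors (p = q = 1/2), in which case u = D_hat and
    1/2 - sqrt(u)/2 <= BER <= 1/2 - u/2."""
    u = fr_hp_divergence(X, Y)
    return 0.5 - 0.5 * np.sqrt(u), 0.5 - 0.5 * u
```

For well-separated classes the MST contains few cross edges, so the estimated divergence is near 1 and both bounds are near 0; for identical distributions roughly half the edges cross, the divergence is near 0, and the bounds approach the chance level of 1/2.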


Related research:

- 04/27/2015  Meta learning of bounds on the Bayes classifier error
- 10/31/2017  Rate-optimal Meta Learning of Classification Error
- 11/09/2017  Fast Meta-Learning for Adaptive Hierarchical Classifier Design
- 09/16/2019  Learning to Benchmark: Determining Best Achievable Misclassification Error from Training Data
- 10/25/2022  Some Simulation and Empirical Results for Semi-Supervised Learning of the Bayes Rule of Allocation
- 08/04/2022  Equivalence between Time Series Predictability and Bayes Error Rate
