Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

11/15/2017
by Dhruv Mahajan et al.

For many applications, an ensemble of base classifiers is an effective solution. Tuning its parameters (the number of base classifiers, the amount of data on which each classifier is trained, etc.) requires G, the generalization error of a given ensemble. The efficient estimation of G is the focus of this paper. The key idea is to approximate, at each point x in the input feature space, the distribution of the class scores/probabilities of the base classifiers over the randomness imposed by the training subset by a normal/beta distribution. We estimate the parameters of this distribution using a small set of randomly chosen base classifiers and use those parameters to give efficient estimation schemes for G. We give empirical evidence for the quality of the various estimators. We also demonstrate their usefulness in making design choices, such as the number of classifiers in the ensemble and the size of the training subset needed to achieve a target generalization error. Our approach also has great potential for designing distributed ensemble classifiers.
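The recipe described in the abstract lends itself to a short sketch. The code below is a minimal illustration, not the paper's exact estimator: it assumes a binary task, uses only the normal (not beta) approximation, and treats the ensemble as a plain average of i.i.d. base-classifier scores. The function name estimate_generalization_error and its arguments are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def estimate_generalization_error(score_matrix, y_true, n_ensemble, threshold=0.5):
    """Sketch of a normal-approximation estimate of ensemble error.

    score_matrix : (n_sampled_classifiers, n_points) positive-class scores
                   from a small random sample of base classifiers.
    y_true       : (n_points,) labels in {0, 1}.
    n_ensemble   : size of the (larger) score-averaging ensemble to predict for.
    """
    mu = score_matrix.mean(axis=0)           # per-point mean score
    var = score_matrix.var(axis=0, ddof=1)   # per-point score variance
    # Averaging n_ensemble (assumed i.i.d.) scores shrinks the variance by
    # n_ensemble, so the ensemble score at x is roughly N(mu, var / n_ensemble).
    sd = np.sqrt(var / n_ensemble) + 1e-12
    # Probability that the averaged score lands on the wrong side of the threshold.
    p_pred_pos = 1.0 - norm.cdf(threshold, loc=mu, scale=sd)
    p_error = np.where(y_true == 1, 1.0 - p_pred_pos, p_pred_pos)
    return p_error.mean()
```

Under these assumptions, a design-choice question such as "how many classifiers do I need for a given error?" reduces to sweeping n_ensemble (or regenerating the sampled scores with a different training-subset size) and reading off the predicted error, without retraining the full ensemble.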


research · 06/16/2021 · A Dataset-Level Geometric Framework for Ensemble Classifiers
Ensemble classifiers have been investigated by many in the artificial in...

research · 09/08/2023 · Probabilistic Safety Regions Via Finite Families of Scalable Classifiers
Supervised classification recognizes patterns in the data to separate cl...

research · 04/23/2021 · Selecting a number of voters for a voting ensemble
For a voting ensemble that selects an odd-sized subset of the ensemble c...

research · 10/14/2016 · Generalization Error of Invariant Classifiers
This paper studies the generalization error of invariant classifiers. In...

research · 06/11/2022 · Kaggle Kinship Recognition Challenge: Introduction of Convolution-Free Model to boost conventional
This work aims to explore a convolution-free base classifier that can be...

research · 03/04/2013 · A Sharp Bound on the Computation-Accuracy Tradeoff for Majority Voting Ensembles
When random forests are used for binary classification, an ensemble of t...

research · 06/07/2019 · On the Current State of Research in Explaining Ensemble Performance Using Margins
Empirical evidence shows that ensembles, such as bagging, boosting, rand...
