Reproducible Bootstrap Aggregating

01/12/2020
by Meimei Liu, et al.

Heterogeneity between training and testing data degrades the reproducibility of a well-trained predictive algorithm. In modern applications, how to deploy a trained algorithm in a different domain has become an urgent question for many domain scientists. In this paper, we propose a reproducible bootstrap aggregating (Rbagging) method coupled with a new algorithm, the iterative nearest neighbor sampler (INNs), which draws bootstrap samples from the training data to mimic the distribution of the test data. Rbagging is a general ensemble framework that can be applied to most classifiers. We further propose Rbagging+ to detect anomalous samples in the testing data. Our theoretical results show that the resamples based on Rbagging have the same distribution as the testing data. Moreover, under suitable assumptions, we provide a general bound that controls the test excess risk of the ensemble classifiers. The proposed method is compared with several other popular domain adaptation methods via extensive simulation studies and real applications, including medical diagnosis and image classification.
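To make the resampling idea concrete, here is a minimal sketch in Python. It is not the paper's INNs algorithm, whose details are not given in this abstract. It substitutes a much simpler one-step rule: pick a test point at random, then take its nearest training point. The base learner (a nearest-centroid classifier) and all function names (`nn_resample`, `fit_centroids`, `predict_centroid`, `bagged_predict`) are assumptions chosen for illustration only.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nn_resample(X_train, X_test, n_draws, rng):
    """Return training indices drawn so the resample resembles the test data.

    Simplified stand-in for INNs (an assumption, not the paper's sampler):
    each draw picks a random test point and keeps its nearest training point.
    """
    idx = []
    for _ in range(n_draws):
        t = rng.choice(X_test)
        idx.append(min(range(len(X_train)), key=lambda i: sq_dist(X_train[i], t)))
    return idx


def fit_centroids(X, y):
    """Hypothetical base learner: one mean vector per class label."""
    counts = Counter(y)
    sums = {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def bagged_predict(X_train, y_train, X_test, n_models=25, seed=0):
    """Train one base learner per test-like resample, then majority-vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = nn_resample(X_train, X_test, len(X_test), rng)
        models.append(fit_centroids([X_train[i] for i in idx],
                                    [y_train[i] for i in idx]))
    return [Counter(predict_centroid(m, x) for m in models).most_common(1)[0][0]
            for x in X_test]


# Toy data: two class-0 training points (10.0, 11.0) lie far outside the test domain.
X_train_demo = [[0.0], [0.2], [1.0], [1.2], [10.0], [11.0]]
y_train_demo = [0, 0, 1, 1, 0, 0]
X_test_demo = [[0.05], [0.1], [0.15], [0.95], [1.05], [1.1]]

demo_idx = nn_resample(X_train_demo, X_test_demo, 200, random.Random(1))
demo_preds = bagged_predict(X_train_demo, y_train_demo, X_test_demo)
print(sorted(set(demo_idx)), demo_preds)
```

In this toy setup the out-of-domain training points are never the nearest neighbor of any test point, so they never enter a resample. A centroid classifier fit on the full training set, by contrast, has its class-0 centroid pulled toward those points. The sketch shows only the resample-then-vote structure; it makes no claim about the distributional guarantee or the risk bound stated in the abstract.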


research
02/20/2020

Predictive Inference Is Free with the Jackknife+-after-Bootstrap

Ensemble learning is widely used in applications to make predictions in ...
research
09/18/2017

Model-Powered Conditional Independence Test

We consider the problem of non-parametric Conditional Independence testi...
research
05/08/2022

One-Class Knowledge Distillation for Face Presentation Attack Detection

Face presentation attack detection (PAD) has been extensively studied by...
research
02/17/2021

BEDS: Bagging ensemble deep segmentation for nucleus segmentation with testing stage stain augmentation

Reducing outcome variance is an essential task in deep learning based me...
research
03/05/2020

A Nearest-Neighbor Based Nonparametric Test for Viral Remodeling in Heterogeneous Single-Cell Proteomic Data

An important problem in contemporary immunology studies based on single-...
research
03/26/2015

Towards Learning free Naive Bayes Nearest Neighbor-based Domain Adaptation

As of today, object categorization algorithms are not able to achieve th...
research
01/19/2021

Testing Simultaneous Diagonalizability

This paper proposes novel methods to test for simultaneous diagonalizati...
