A Non-Intrusive Correction Algorithm for Classification Problems with Corrupted Data

02/11/2020
by Jun Hou, et al.

A novel correction algorithm is proposed for multi-class classification problems with corrupted training data. The algorithm is non-intrusive in the sense that it post-processes a trained classification model by adding a correction procedure to the model prediction. The correction procedure can be coupled with any approximator, such as logistic regression or neural networks of various architectures. When the training dataset is sufficiently large, we prove that the corrected models deliver correct classification results, as if there were no corruption in the training data. For datasets of finite size, the corrected models produce significantly better recovery results than the models without the correction algorithm. All of the theoretical findings in the paper are verified by our numerical examples.
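
The abstract does not spell out the correction procedure, so the following is only an illustrative sketch of what a non-intrusive, post-processing correction can look like, not the authors' algorithm. It assumes the label corruption is described by a known, invertible matrix `transition` (a hypothetical input), where `transition[i, j]` is the probability that true class `i` is recorded as class `j`. Under that assumption, a model trained on corrupted labels approximates `clean_probs @ transition`, and the clean probabilities can be recovered by solving a linear system on the model output. The function name `correct_predictions` and the example numbers are made up for illustration.

```python
import numpy as np


def correct_predictions(noisy_probs, transition):
    """Post-process class probabilities from a model trained on corrupted labels.

    noisy_probs: (n_samples, n_classes) array of model outputs.
    transition:  (n_classes, n_classes) array, assumed known and invertible,
                 with transition[i, j] = P(observed label j | true label i).
    Returns the corrected class index for each sample.
    """
    noisy_probs = np.asarray(noisy_probs, dtype=float)
    transition = np.asarray(transition, dtype=float)
    # noisy = clean @ transition, so clean^T solves transition^T x = noisy^T.
    clean_probs = np.linalg.solve(transition.T, noisy_probs.T).T
    return clean_probs.argmax(axis=1)


# Hypothetical corruption: 60% of class-0 samples are mislabeled as class 1.
T = np.array([
    [0.4, 0.6, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Made-up clean class probabilities for four samples.
clean = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.2, 0.1],
])

# What an idealized model trained on the corrupted labels would output.
noisy = clean @ T

uncorrected = noisy.argmax(axis=1)
corrected = correct_predictions(noisy, T)
print("uncorrected:", uncorrected)
print("corrected:  ", corrected)
```

The trained model itself is never touched: the correction acts only on its output, which is the sense in which such a procedure is non-intrusive. In practice the corruption matrix would have to be estimated, and the model output only approximates the corrupted distribution, so this sketch shows the idealized case.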
