Gray Learning from Non-IID Data with Out-of-distribution Samples

06/19/2022
by Zhilin Zhao, et al.

The quality of training data annotated by experts cannot be guaranteed, and the problem is even more severe for non-IID data that mix in-distribution and out-of-distribution samples (i.e., the two groups of samples are drawn from different distributions). Experts may mistakenly annotate out-of-distribution samples as if they were in-distribution ones, yielding untrustworthy ground-truth labels. Learning from such non-IID data with untrustworthy labels poses a significant challenge for both shallow and deep learning, and no prior work addresses it. However, it is possible to identify trustworthy complementary labels for each sample, i.e., the classes it does not belong to, because neither in-distribution nor out-of-distribution samples belong to any class other than the one indicated by the ground-truth label. With this insight, we propose a novel gray learning approach that learns robustly from non-IID data containing both in- and out-of-distribution samples. Because the distributions of the training samples are uncertain, we train the model to reject the classes indicated by the complementary labels for low-confidence inputs, while mapping high-confidence inputs to their ground-truth labels. Building on statistical learning theory, we derive a generalization-error analysis showing that gray learning achieves a tight bound on the non-IID data. Extensive experiments show that our method yields significant improvements over alternative methods from robust statistics.
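To make the confidence-based split concrete, the following is a minimal sketch in PyTorch of how such a training objective could look. The confidence threshold, the particular complementary-label loss form, and all names below (gray_learning_loss, confidence_threshold) are illustrative assumptions, not the authors' implementation.

# Sketch of a gray-learning-style loss: trust the ground-truth label for
# high-confidence inputs, and only the complementary labels (the classes
# the sample does not belong to) for low-confidence inputs.
import torch
import torch.nn.functional as F

def gray_learning_loss(logits, targets, confidence_threshold=0.9):
    # logits:  (batch, num_classes) raw classifier outputs
    # targets: (batch,) possibly untrustworthy ground-truth labels
    probs = F.softmax(logits, dim=1)
    confidence = probs.max(dim=1).values

    # High-confidence inputs: use the standard cross-entropy loss
    # with the annotated ground-truth label.
    high = confidence >= confidence_threshold
    ce = F.cross_entropy(logits, targets, reduction="none")

    # Low-confidence inputs: trust only the complementary labels,
    # i.e. penalize probability mass placed on every class other
    # than the annotated one, without forcing the annotated class.
    num_classes = logits.size(1)
    one_hot = F.one_hot(targets, num_classes).float()
    complementary = -torch.log(
        torch.clamp(1.0 - probs * (1.0 - one_hot), min=1e-8)
    ).sum(dim=1)

    loss = torch.where(high, ce, complementary)
    return loss.mean()

In practice the threshold (or a soft weighting) would be tuned and confidences re-estimated as training progresses; the sketch only illustrates the idea of mapping confident inputs to their ground-truth labels while rejecting the complementary-label classes for uncertain ones.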

