Adapting to Online Label Shift with Provable Guarantees

by Yong Bai, et al.

The standard supervised learning paradigm works effectively when training data shares the same distribution as the upcoming testing samples. However, this assumption is often violated in real-world applications, especially when testing data appear in an online fashion. In this paper, we formulate and investigate the problem of online label shift (OLaS): the learner trains an initial model from labeled offline data and then deploys it to an unlabeled online environment where the underlying label distribution changes over time but the label-conditional density does not. The non-stationary nature of the environment and the lack of supervision make the problem challenging. To address this difficulty, we construct a new unbiased risk estimator that utilizes the unlabeled data and exhibits many benign properties, albeit with potential non-convexity. Building upon that, we propose novel online ensemble algorithms to deal with the non-stationarity of the environments. Our approach enjoys optimal dynamic regret, indicating that its performance is competitive with a clairvoyant who knows the online environments in hindsight and chooses the best decision for each round. The obtained dynamic regret bound scales with the intensity and pattern of the label distribution shift, hence exhibiting adaptivity in the OLaS problem. Extensive experiments are conducted to validate the effectiveness of the approach and support our theoretical findings.
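The idea of estimating risk from unlabeled data under label shift can be illustrated with a minimal sketch. It assumes a confusion-matrix-based estimate of the online label distribution (in the spirit of black-box shift estimation), which may differ from the exact estimator constructed in the paper. The function names `confusion_matrix`, `estimate_label_dist`, and `reweighted_risk` are hypothetical. The sketch relies on the stated assumption that the label-conditional density does not change, so the per-class risks measured offline can be reweighted by the estimated online class priors.

```python
import numpy as np

def confusion_matrix(pred, y, k):
    """C[i, j] = P(f(x) = i | y = j), estimated on labeled offline data.

    Assumes every class 0..k-1 appears at least once in y.
    """
    C = np.zeros((k, k))
    for j in range(k):
        mask = y == j
        C[:, j] = np.bincount(pred[mask], minlength=k) / mask.sum()
    return C

def estimate_label_dist(C, online_pred, k):
    """Solve C @ mu = q for mu, where q is the frequency of the initial
    model's predicted labels on the unlabeled online batch.

    Assumes C is invertible; the solution may have small negative
    entries on finite samples.
    """
    q = np.bincount(online_pred, minlength=k) / len(online_pred)
    return np.linalg.solve(C, q)

def reweighted_risk(class_risks, mu_hat):
    """Risk estimate: sum over classes of mu_hat[k] * offline risk on class k."""
    return float(np.dot(mu_hat, class_risks))
```

Because the expected predicted-label frequency equals `C @ mu` when the label-conditional density is fixed, solving the linear system recovers the class priors without any online labels. In practice the estimate is often projected back onto the probability simplex, a step omitted here for brevity.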



Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms

This paper focuses on supervised and unsupervised online label shift, wh...

Online Adaptation to Label Distribution Shift

Machine learning models often encounter distribution shifts when deploye...

Adapting to Continuous Covariate Shift via Online Density Ratio Estimation

Dealing with distribution shifts is one of the central challenges for mo...

Online Continual Adaptation with Active Self-Training

Models trained with offline data often suffer from continual distributio...

An Unbiased Risk Estimator for Learning with Augmented Classes

In this paper, we study the problem of learning with augmented classes (...

Model Specification Test with Unlabeled Data: Approach from Covariate Shift

We propose a novel framework of the model specification test in regressi...

NetRCA: An Effective Network Fault Cause Localization Algorithm

Localizing the root cause of network faults is crucial to network operat...
