The Sample Complexity of Multi-Distribution Learning for VC Classes

07/22/2023
by Pranjal Awasthi, et al.

Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a class of VC dimension d on k distributions to be O(ϵ^-2 ln(k)(d + k) + min{ϵ^-1 dk, ϵ^-4 ln(k) d}), the best lower bound is Ω(ϵ^-2(d + k ln(k))). We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
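The game-dynamics perspective the abstract alludes to can be illustrated with a toy sketch. This is not the paper's algorithm; every name, parameter, and the threshold-classifier setup below are invented for illustration. An adversary runs Hedge (multiplicative weights) over the k distributions, and at each round the learner best-responds with ERM against the adversary's current mixture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multi-distribution setup (illustrative only): k distributions over a
# finite domain of n_points, hypothesis class = threshold classifiers
# h_t(x) = 1[x >= t], labels realizable by a target threshold.
k = 3
n_points = 20
thresholds = np.arange(n_points + 1)

target_t = 12
labels = (np.arange(n_points) >= target_t).astype(int)
dists = rng.dirichlet(np.ones(n_points), size=k)  # k rows, each a distribution

def err(t, p):
    """Error of threshold classifier t under distribution p over the domain."""
    preds = (np.arange(n_points) >= t).astype(int)
    return float(p @ (preds != labels))

eta = 0.5           # Hedge learning rate
w = np.ones(k) / k  # adversary's weights over the k distributions
T = 200
avg_err = np.zeros(k)
for _ in range(T):
    mix = w @ dists                                      # current mixture
    t_star = min(thresholds, key=lambda t: err(t, mix))  # learner: ERM best response
    losses = np.array([err(t_star, p) for p in dists])
    avg_err += losses / T
    w *= np.exp(eta * losses)  # Hedge: upweight distributions the learner errs on
    w /= w.sum()

max_avg = avg_err.max()  # average error on the worst distribution
```

In this realizable toy example the learner's best response drives the error on every distribution to zero, so the dynamics converge immediately; the hurdles discussed in the paper concern the sample cost of running such dynamics with finite data, not this idealized population-level loop.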


Related research

- 07/02/2015 · The Optimal Sample Complexity of PAC Learning. This work establishes a new upper bound on the number of samples suffici...
- 08/31/2022 · Fine-Grained Distribution-Dependent Learning Curves. Learning curves plot the expected error of a learning algorithm as a fun...
- 04/18/2023 · Impossibility of Characterizing Distribution Learning – a simple solution to a long-standing problem. We consider the long-standing question of finding a parameter of a class...
- 02/24/2020 · On the Sample Complexity of Adversarial Multi-Source PAC Learning. We study the problem of learning from multiple untrusted data sources, a...
- 01/12/2022 · On the Statistical Complexity of Sample Amplification. Given n i.i.d. samples drawn from an unknown distribution P, when is it ...
- 04/07/2020 · On the Complexity of Learning from Label Proportions. In the problem of learning with label proportions, which we call LLP lea...
- 06/16/2022 · Generalization Bounds for Data-Driven Numerical Linear Algebra. Data-driven algorithms can adapt their internal structure or parameters ...
