VC Dimension and Distribution-Free Sample-Based Testing

12/07/2020
by Eric Blais, et al.

We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call "lower VC" (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong, and in many cases nearly optimal, lower bounds on the sample complexity of testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems.
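
As a reminder of the quantity the abstract builds on, the standard VC dimension of a class is the size of the largest point set on which the class realizes every possible labeling. The following is a minimal brute-force sketch for finite classes over a small finite domain. The names `shatters`, `vc_dimension`, `thresholds`, and `intervals` are illustrative assumptions and do not come from the paper, and the LVC dimension is not sketched here because its definition is given only in the full text.

```python
from itertools import combinations

def shatters(hypotheses, points):
    """True if the class realizes all 2^|points| labelings of `points`."""
    patterns = {tuple(h(x) for x in points) for h in hypotheses}
    return len(patterns) == 2 ** len(points)

def vc_dimension(hypotheses, domain):
    """Brute-force VC dimension of a finite, non-empty class over a finite domain."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, k)):
            d = k
        else:
            # Subsets of shattered sets are shattered, so if no set of
            # size k is shattered, no larger set can be either.
            break
    return d

# Toy domain {0, ..., 5} (an assumption chosen for illustration).
domain = list(range(6))

# Thresholds: h_t(x) = 1 iff x >= t.
thresholds = [lambda x, t=t: int(x >= t) for t in range(7)]

# Single intervals: h_{a,b}(x) = 1 iff a <= x < b (empty when a == b).
intervals = [
    lambda x, a=a, b=b: int(a <= x < b)
    for a in range(7)
    for b in range(a, 7)
]

print(vc_dimension(thresholds, domain))  # thresholds cannot label (1, 0) on x < y
print(vc_dimension(intervals, domain))   # intervals cannot label (1, 0, 1)
```

On this toy domain the sketch reports VC dimension 1 for thresholds and 2 for single intervals, matching the textbook values. The enumeration is exponential in the domain size, so it is only meant for building intuition on very small examples.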

