Learning Structured Distributions From Untrusted Batches: Faster and Simpler

02/24/2020
by Sitan Chen, et al.

We revisit the problem of learning from untrusted batches introduced by Qiao and Valiant [QV17]. Recently, Jain and Orlitsky [JO19] gave a simple semidefinite programming approach based on the cut-norm that achieves essentially information-theoretically optimal error in polynomial time. Concurrently, Chen et al. [CLM19] considered a variant of the problem where the underlying distribution μ is assumed to be structured, e.g. log-concave, monotone hazard rate, t-modal, etc. In this case, it is possible to achieve the same error with sample complexity sublinear in the domain size n, and they exhibited a quasi-polynomial time algorithm for doing so using Haar wavelets. In this paper, we find an appealing way to synthesize the techniques of [JO19] and [CLM19] to give the best of both worlds: an algorithm which runs in polynomial time and can exploit structure in the underlying distribution to achieve sublinear sample complexity. Along the way, we simplify the approach of [JO19] by avoiding the need for SDP rounding and giving a more direct interpretation of it through the lens of soft filtering, a powerful recent technique in high-dimensional robust estimation.

research
02/11/2021

Sample-Optimal PAC Learning of Halfspaces with Malicious Noise

We study efficient PAC learning of homogeneous halfspaces in ℝ^d in the ...
research
09/18/2020

Longest Common Subsequence in Sublinear Space

We present the first o(n)-space polynomial-time algorithm for computing ...
research
05/21/2018

Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time

MAP perturbation models have emerged as a powerful framework for inferen...
research
07/24/2023

A faster and simpler algorithm for learning shallow networks

We revisit the well-studied problem of learning a linear combination of ...
research
06/08/2020

Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability

In this paper we revisit some classic problems on classification under m...
research
11/05/2019

Efficiently Learning Structured Distributions from Untrusted Batches

We study the problem, introduced by Qiao and Valiant, of learning from u...
research
07/12/2021

Forster Decomposition and Learning Halfspaces with Noise

A Forster transform is an operation that turns a distribution into one w...
