Faster PAC Learning and Smaller Coresets via Smoothed Analysis

06/09/2020
by Alaa Maalouf et al.

PAC learning usually aims to compute a small subset (an ε-sample or ε-net) of n items that provably approximates a given loss function for every query (model, classifier, hypothesis) from a given set of queries, up to an additive error ε ∈ (0,1). Coresets generalize this idea to support multiplicative error 1 ± ε. Inspired by smoothed analysis, we suggest a natural generalization: approximate the average (instead of the worst-case) error over the queries, in the hope of obtaining smaller subsets. Because the errors of different queries are dependent, we can no longer apply the Chernoff-Hoeffding inequality to a fixed query and then invoke the VC-dimension or a union bound. This paper provides deterministic and randomized algorithms for computing such coresets and ε-samples of size independent of n, for any finite set of queries and any loss function. Example applications include new and improved coreset constructions for, e.g., streaming vector summarization [ICML'17] and k-PCA [NIPS'16]. Experimental results with open-source code are provided.
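The gap between the worst-case guarantee and the proposed average-case relaxation can be illustrated with a small simulation. The following is a minimal sketch, not the paper's algorithm: it draws a uniform random sample as a baseline ε-sample and compares its worst-case and average relative errors over a finite query set, using a squared-distance loss and sizes chosen purely for illustration.

```python
import numpy as np

# Minimal illustrative sketch (not the paper's construction): a uniform
# sample of the input serves as a baseline epsilon-sample, and we compare
# the worst-case error over all queries (the classic PAC/coreset goal)
# with the average error over queries (the relaxation studied here).
# The squared-distance loss and all sizes below are hypothetical choices.

rng = np.random.default_rng(0)

n, d, m = 10_000, 5, 100          # n input points, dimension d, m queries
P = rng.normal(size=(n, d))       # input set
Q = rng.normal(size=(m, d))       # finite set of queries

def loss(points, q):
    """Mean squared distance from the given points to query q."""
    return np.mean(np.sum((points - q) ** 2, axis=1))

k = 200                           # sample (coreset) size, independent of n
S = P[rng.choice(n, size=k, replace=False)]

# Relative approximation error of the sample, per query.
errs = np.array([abs(loss(S, q) - loss(P, q)) / loss(P, q) for q in Q])

print(f"worst-case error over queries: {errs.max():.4f}")
print(f"average error over queries:    {errs.mean():.4f}")
```

Since the average error never exceeds the worst-case error, and in such simulations is typically much smaller, bounding only the average allows smaller subsets; that gap is what the paper's constructions exploit.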

