Compression, Generalization and Learning

01/30/2023
by Marco C. Campi, et al.

A compression function is a map that slims down an observational set into a subset of reduced size, while preserving its informational content. In multiple applications, the condition that one new observation makes the compressed set change is interpreted as meaning that this observation brings in extra information; in learning theory, this corresponds to misclassification or misprediction. In this paper, we lay the foundations of a new theory that allows one to keep control of the probability of change of compression (called the "risk"). We identify conditions under which the cardinality of the compressed set is a consistent estimator of the risk (without any upper limit on the size of the compressed set) and prove unprecedentedly tight bounds to evaluate the risk under a generally applicable condition of preference. All results are usable in a fully agnostic setup, without requiring any a priori knowledge of the probability distribution of the observations. Not only do these results offer valid support for developing trust in observation-driven methodologies, they also play a fundamental role in learning techniques as a tool for hyper-parameter tuning.
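
To make the notions of "compression" and "change of compression" concrete, here is a minimal Python sketch. It is a toy illustration, not the construction used in the paper: the compression function `compress` (keep only the smallest and largest observation), the helper names, and the uniform data distribution are all assumptions chosen for simplicity. A new observation changes the compressed set exactly when it falls outside the current range, and the risk is estimated here by plain Monte Carlo sampling.

```python
import random

def compress(observations):
    """Toy compression (an assumption, not the paper's): keep only min and max."""
    return frozenset((min(observations), max(observations)))

def changes_compression(observations, new_obs):
    """Return True if adding new_obs alters the compressed set."""
    return compress(list(observations) + [new_obs]) != compress(observations)

def estimate_risk(observations, draw, trials=10_000):
    """Monte Carlo estimate of the probability of change of compression."""
    hits = sum(changes_compression(observations, draw()) for _ in range(trials))
    return hits / trials

# Hypothetical data: 50 observations drawn uniformly from [0, 1).
rng = random.Random(0)
sample = [rng.random() for _ in range(50)]

compressed = compress(sample)
risk = estimate_risk(sample, rng.random)
print("size of compressed set:", len(compressed))
print("estimated risk:", risk)
```

In this toy case the risk can also be computed directly as the probability mass outside the interval spanned by the sample, which gives a simple check on the Monte Carlo estimate. In the agnostic setup the abstract describes, the distribution is unknown, which is why a quantity observable from the data, such as the cardinality of the compressed set, is of interest.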

research
10/14/2017

Agnostic Distribution Learning via Compression

We study sample-efficient distribution learning, where a learner is give...
research
05/21/2018

A New Lower Bound for Agnostic Learning with Sample Compression Schemes

We establish a tight characterization of the worst-case rates for the ex...
research
09/25/2019

Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network

One of the biggest issues in deep learning theory is its generalization abil...
research
01/09/2020

Gaussian Approximation of Quantization Error for Estimation from Compressed Data

We consider the distributional connection between the lossy compressed r...
research
04/06/2023

Compression of enumerations and gain

We study the compressibility of enumerations, and its role in the relati...
research
01/27/2019

Information-Theoretic Understanding of Population Risk Improvement with Model Compression

We show that model compression can improve the population risk of a pre-...
