Replicability and stability in learning

04/07/2023
by Zachary Chase et al.

Replicability is essential in science, as it allows us to validate and verify research findings. Impagliazzo, Lei, Pitassi and Sorrell ('22) recently initiated the study of replicability in machine learning: a learning algorithm is replicable if it typically produces the same output when applied to two i.i.d. inputs using the same internal randomness. We study a variant of replicability that does not involve fixing the randomness: an algorithm satisfies this form of replicability if it typically produces the same output when applied to two i.i.d. inputs, without fixing the internal randomness. This variant is called global stability and was introduced by Bun, Livni and Moran ('20) in the context of differential privacy.

Impagliazzo et al. showed how to boost any replicable algorithm so that it produces the same output with probability arbitrarily close to 1. In contrast, we demonstrate that for numerous learning tasks, global stability can only be achieved weakly, meaning the same output is produced only with probability bounded away from 1. To overcome this limitation, we introduce the notion of list replicability, which is equivalent to global stability, and we prove that list replicability can be boosted so that it is achieved with probability arbitrarily close to 1. We also describe basic relations between standard learning-theoretic complexity measures and list replicable numbers. In addition, our results imply that, apart from trivial cases, replicable algorithms (in the sense of Impagliazzo et al.) must be randomized.

The proof of the impossibility result is based on a topological fixed-point theorem: for every algorithm, we locate a "hard input distribution" by applying the Poincaré-Miranda theorem in a related topological setting. The equivalence between global stability and list replicability is algorithmic.
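As a toy illustration (not taken from the paper), the two definitions above can be contrasted with a minimal Python sketch. It uses a standard randomized-rounding mean estimator, a common example of a replicable algorithm: the empirical mean is snapped to a grid whose offset is the algorithm's internal randomness. When that offset is shared across two i.i.d. samples, the outputs usually coincide; with fresh randomness per run, they almost never do, which hints at why global stability is a stronger requirement.

```python
import random
import statistics

def replicable_mean(sample, grid=0.5, rng=None):
    """Toy replicable estimator: round the empirical mean to a grid
    shifted by a random offset. The offset is the internal randomness;
    sharing it across runs makes two i.i.d. samples typically agree."""
    rng = rng or random.Random()
    offset = rng.uniform(0, grid)  # internal randomness
    m = statistics.fmean(sample)
    return round((m - offset) / grid) * grid + offset

data_rng = random.Random(0)
draw = lambda: [data_rng.gauss(1.0, 1.0) for _ in range(2000)]  # i.i.d. sample

# Replicability (Impagliazzo et al. sense): same internal randomness,
# two independent i.i.d. samples -- outputs usually coincide exactly.
agree = sum(
    replicable_mean(draw(), rng=random.Random(s)) ==
    replicable_mean(draw(), rng=random.Random(s))
    for s in range(200)
)

# Global stability would require agreement even with fresh randomness;
# here independent offsets almost never produce identical outputs.
fresh = sum(
    replicable_mean(draw()) == replicable_mean(draw())
    for _ in range(200)
)

print(f"agreement with shared randomness: {agree}/200")
print(f"agreement with fresh randomness:  {fresh}/200")
```

The grid width `grid=0.5` and sample size are arbitrary choices for the sketch; the agreement probability under shared randomness depends on the ratio between the empirical mean's fluctuation and the grid width.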


