Detecting Correlations with Little Memory and Communication

03/04/2018
by Yuval Dagan, et al.

We study the problem of identifying correlations in multivariate data under information constraints: either on the amount of memory that can be used by the algorithm, or on the amount of communication when the data is distributed across several machines. We prove a tight trade-off between the memory/communication complexity and the sample complexity, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/communication bits is at least quadratic in the dimension. Our results substantially improve those of Shamir [2014], who studied a similar question in a much more restricted setting. To the best of our knowledge, these are the first provable sample/memory/communication trade-offs for a practical estimation problem, using standard distributions, and in the natural regime where the memory/communication budget is larger than the size of a single data point. To derive our theorems, we prove a new information-theoretic result, which may be relevant for studying other information-constrained learning problems.
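To make the quadratic-memory regime concrete, here is a minimal sketch of the naive streaming baseline: one running sum per coordinate pair, so d(d-1)/2 counters in total. It is not the paper's algorithm or its lower-bound construction. The function names, the ±1-valued data model, and the planted correlation `rho` are all assumptions chosen for illustration.

```python
import random


def sample_stream(n, d, pair, rho, rng):
    """Yield n points in {-1,+1}^d (hypothetical data model).

    All coordinates are independent fair signs, except that
    E[x_i * x_j] = rho for the single planted pair (i, j).
    """
    i, j = pair
    for _ in range(n):
        x = [rng.choice((-1, 1)) for _ in range(d)]
        # Copy x_i with probability (1 + rho) / 2, otherwise flip it.
        x[j] = x[i] if rng.random() < (1 + rho) / 2 else -x[i]
        yield x


def detect_correlated_pair(samples, d):
    """Naive one-pass detector keeping one counter per coordinate pair."""
    sums = {}  # (i, j) -> running sum of x_i * x_j
    n = 0
    for x in samples:
        n += 1
        for i in range(d):
            for j in range(i + 1, d):
                sums[(i, j)] = sums.get((i, j), 0) + x[i] * x[j]
    # Report the pair with the largest absolute empirical correlation.
    best = max(sums, key=lambda p: abs(sums[p]))
    return best, sums[best] / n, len(sums)


if __name__ == "__main__":
    rng = random.Random(0)
    stream = sample_stream(2000, 8, (2, 5), 0.5, rng)
    pair, estimate, counters = detect_correlated_pair(stream, 8)
    print(pair, round(estimate, 3), counters)
```

The number of counters grows as d(d-1)/2, which is the memory footprint the abstract refers to as quadratic in the dimension. The sketch only illustrates that footprint and says nothing about whether a smaller one is achievable.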

research
06/11/2019

Communication and Memory Efficient Testing of Discrete Distributions

We study distribution testing with communication and memory constraints ...
research
12/01/2017

Fundamental Limits on Data Acquisition: Trade-offs between Sample Complexity and Query Difficulty

In this paper, we consider query-based data acquisition and the correspo...
research
11/14/2013

Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation

Many machine learning approaches are characterized by information constr...
research
07/20/2019

Domain Compression and its Application to Randomness-Optimal Distributed Goodness-of-Fit

We study goodness-of-fit of discrete distributions in the distributed se...
research
06/16/2023

Memory-Constrained Algorithms for Convex Optimization via Recursive Cutting-Planes

We propose a family of recursive cutting-plane algorithms to solve feasi...
research
09/12/2019

Learning Graphs from Linear Measurements: Fundamental Trade-offs and Applications

We consider a specific graph learning task: reconstructing a symmetric m...
research
11/07/2016

Optimal Binary Autoencoding with Pairwise Correlations

We formulate learning of a binary autoencoder as a biconvex optimization...
