Sketching for Large-Scale Learning of Mixture Models

06/09/2016
by   Nicolas Keriven, et al.
0

Learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements. We propose a "compressive learning" framework where we estimate model parameters from a sketch of the training data. This sketch is a collection of generalized moments of the underlying probability distribution of the data. It can be computed in a single pass on the training set, and is easily computable on streams or distributed datasets. The proposed framework shares similarities with compressive sensing, which aims at drastically reducing the dimension of high-dimensional signals while preserving the ability to reconstruct them. To perform the estimation task, we derive an iterative algorithm analogous to sparse reconstruction algorithms in the context of linear inverse problems. We exemplify our framework with the compressive estimation of a Gaussian Mixture Model (GMM), providing heuristics on the choice of the sketching procedure and theoretical guarantees of reconstruction. We experimentally show on synthetic data that the proposed algorithm yields results comparable to the classical Expectation-Maximization (EM) technique while requiring significantly less memory and fewer computations when the number of database elements is large. We further demonstrate the potential of the approach on real large-scale data (over 10 8 training samples) for the task of model-based speaker verification. Finally, we draw some connections between the proposed framework and approximate Hilbert space embedding of probability distributions using random features. We show that the proposed sketching operator can be seen as an innovative method to design translation-invariant kernels adapted to the analysis of GMMs. We also use this theoretical framework to derive information preservation guarantees, in the spirit of infinite-dimensional compressive sensing.

READ FULL TEXT
research
06/22/2017

Compressive Statistical Learning with Random Feature Moments

We describe a general framework --compressive statistical learning-- for...
research
04/17/2020

Statistical Learning Guarantees for Compressive Clustering and Compressive Mixture Modeling

We provide statistical learning guarantees for two unsupervised learning...
research
01/25/2012

Task-Driven Adaptive Statistical Compressive Sensing of Gaussian Mixture Models

A framework for adaptive and non-adaptive statistical compressive sensin...
research
03/28/2016

Estimating Mixture Models via Mixtures of Polynomials

Mixture modeling is a general technique for making any simple model more...
research
08/04/2020

Sketching Datasets for Large-Scale Learning (long version)

This article considers "sketched learning," or "compressive learning," a...
research
02/12/2020

Compressive Learning of Generative Networks

Generative networks implicitly approximate complex densities from their ...
research
09/14/2020

When compressive learning fails: blame the decoder or the sketch?

In compressive learning, a mixture model (a set of centroids or a Gaussi...

Please sign up or login with your details

Forgot password? Click here to reset