Autoencoders Learn Generative Linear Models

by   Thanh V. Nguyen, et al.

Recent progress in learning theory has led to the emergence of provable algorithms for training certain families of neural networks. Under the assumption that the training data is sampled from a suitable generative model, the weights of the trained networks obtained by these algorithms recover (either exactly or approximately) the generative model parameters. However, the large majority of these results are only applicable to supervised learning architectures. In this paper, we complement this line of work by providing a series of results for unsupervised learning with neural networks. Specifically, we study the familiar setting of shallow autoencoder architectures with shared weights. We focus on three generative models for the data: (i) the mixture-of-gaussians model, (ii) the sparse coding model, and (iii) the non-negative sparsity model. All three models are widely studied in the machine learning literature. For each of these models, we rigorously prove that under suitable choices of hyperparameters, architectures, and initialization, the autoencoder weights learned by gradient descent successfully recover the parameters of the corresponding model. To our knowledge, this is the first result that rigorously studies the dynamics of gradient descent for weight-sharing autoencoders. Our analysis can be viewed as theoretical evidence that shallow autoencoder modules indeed can be used as unsupervised feature training mechanisms for a wide range of datasets, and may shed insight on how to train larger stacked architectures with autoencoders as basic building blocks.



There are no comments yet.


page 1

page 2

page 3

page 4


On the convergence of group-sparse autoencoders

Recent approaches in the theoretical analysis of model-based deep learni...

The dynamics of representation learning in shallow, non-linear autoencoders

Autoencoders are the simplest neural network for unsupervised learning, ...

Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis

A remarkable recent discovery in machine learning has been that deep neu...

Downsampling leads to Image Memorization in Convolutional Autoencoders

Memorization of data in deep neural networks has become a subject of sig...

The emergence of a concept in shallow neural networks

We consider restricted Boltzmann machine (RBMs) trained over an unstruct...

On the Regularization of Autoencoders

While much work has been devoted to understanding the implicit (and expl...

Rapid Feature Learning with Stacked Linear Denoisers

We investigate unsupervised pre-training of deep architectures as featur...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.