Log In Sign Up

Learning and Generalization in Overparameterized Normalizing Flows

by   Kulin Shah, et al.

In supervised learning, it is known that overparameterized neural networks with one hidden layer provably and efficiently learn and generalize, when trained using stochastic gradient descent with sufficiently small learning rate and suitable initialization. In contrast, the benefit of overparameterization in unsupervised learning is not well understood. Normalizing flows (NFs) constitute an important class of models in unsupervised learning for sampling and density estimation. In this paper, we theoretically and empirically analyze these models when the underlying neural network is one-hidden-layer overparameterized network. Our main contributions are two-fold: (1) On the one hand, we provide theoretical and empirical evidence that for a class of NFs containing most of the existing NF models, overparametrization hurts training. (2) On the other hand, we prove that unconstrained NFs, a recently introduced model, can efficiently learn any reasonable data distribution under minimal assumptions when the underlying network is overparametrized.


page 1

page 2

page 3

page 4


Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data

Neural networks have many successful applications, while much less theor...

Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

We consider a one-hidden-layer leaky ReLU network of arbitrary width tra...

Learning One Convolutional Layer with Overlapping Patches

We give the first provably efficient algorithm for learning a one hidden...

Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent

A major challenge in understanding the generalization of deep learning i...

A Perturbation Resilient Framework for Unsupervised Learning

Designing learning algorithms that are resistant to perturbations of the...

Multi-Model Least Squares-Based Recomputation Framework for Large Data Analysis

Most multilayer least squares (LS)-based neural networks are structured ...

Neural Empirical Bayes

We formulate a novel framework that unifies kernel density estimation an...