Learning Distributions Generated by One-Layer ReLU Networks

09/04/2019
by   Shanshan Wu, et al.

We consider the problem of estimating the parameters of a d-dimensional rectified Gaussian distribution from i.i.d. samples. A rectified Gaussian distribution is defined by passing a standard Gaussian distribution through a one-layer ReLU neural network. We give a simple algorithm that estimates the parameters (i.e., the weight matrix and bias vector of the ReLU neural network) up to error ϵ||W||_F using Õ(1/ϵ^2) samples and Õ(d^2/ϵ^2) time (log factors are ignored for simplicity). This implies that we can estimate the distribution up to ϵ in total variation distance using Õ(κ^2 d^2/ϵ^2) samples, where κ is the condition number of the covariance matrix. Our only assumption is that the bias vector is non-negative. Without this non-negativity assumption, we show that estimating the bias vector to within error ϵ requires a number of samples exponential in 1/ϵ^2. Our algorithm is based on the key observation that vector norms and pairwise angles can be estimated separately. We use a recent result on learning from truncated samples. We also prove two sample complexity lower bounds: Ω(1/ϵ^2) samples are required to estimate the parameters up to error ϵ, while Ω(d/ϵ^2) samples are necessary to estimate the distribution up to ϵ in total variation distance. The first lower bound implies that our algorithm is optimal for parameter estimation. Finally, we show an interesting connection between learning a two-layer generative model and non-negative matrix factorization. Experimental results are provided to support our analysis.
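The coordinate-wise separability is easy to see: each output y_i = max(0, w_i^T x + b_i) with x ~ N(0, I_d) is a one-dimensional rectified Gaussian, and its zero-probability P(y_i = 0) = Φ(-b_i/||w_i||) together with its mean E[y_i] = b_i Φ(b_i/||w_i||) + ||w_i|| φ(b_i/||w_i||) pin down ||w_i|| and b_i. The sketch below illustrates this moment-matching idea under the paper's non-negative-bias assumption; it is not the paper's exact algorithm (which uses estimators for truncated samples), and the helper name estimate_norm_and_bias is ours.

```python
# Minimal sketch: recover (||w_i||, b_i) per coordinate of
# y = ReLU(Wx + b), x ~ N(0, I_d), by matching two moments of the
# one-dimensional rectified Gaussian z = max(0, s), s ~ N(b, sigma^2):
#   P(z = 0) = Phi(-b / sigma)
#   E[z]     = b * Phi(b / sigma) + sigma * phi(b / sigma)
import numpy as np
from scipy.stats import norm

def estimate_norm_and_bias(y_i, eps=1e-6):
    """Estimate (sigma, b) for one output coordinate from samples y_i."""
    p0 = np.clip(np.mean(y_i == 0.0), eps, 1 - eps)  # empirical P(z = 0)
    r = -norm.ppf(p0)                                # r = b / sigma
    m = np.mean(y_i)                                 # empirical E[z]
    sigma = m / (r * norm.cdf(r) + norm.pdf(r))      # solve E[z] for sigma
    return sigma, r * sigma

# Synthetic check with a known one-layer ReLU generator.
rng = np.random.default_rng(0)
d, n = 5, 200_000
W = rng.normal(size=(d, d))
b = np.abs(rng.normal(size=d))          # non-negative bias, as assumed
X = rng.normal(size=(n, d))
Y = np.maximum(X @ W.T + b, 0.0)

for i in range(d):
    sigma_hat, b_hat = estimate_norm_and_bias(Y[:, i])
    print(f"row {i}: ||w_i||={np.linalg.norm(W[i]):.3f} est={sigma_hat:.3f}, "
          f"b_i={b[i]:.3f} est={b_hat:.3f}")
```

The pairwise angles between rows w_i and w_j, the other quantity the abstract says can be estimated separately, can similarly be read off the empirical correlations between the corresponding output coordinates.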


