Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks

11/12/2019
by Yuan Cao, et al.

We study the sample complexity of learning one-hidden-layer convolutional neural networks (CNNs) with non-overlapping filters. We propose a novel algorithm called approximate gradient descent for training CNNs, and show that, with high probability, the proposed algorithm with random initialization converges linearly to the ground-truth parameters up to statistical precision. Compared with existing work, our result applies to general non-trivial, monotonic, and Lipschitz continuous activation functions, including ReLU, Leaky ReLU, Sigmoid, and Softplus. Moreover, our sample complexity improves upon existing results in its dependence on the number of hidden nodes and the filter size. In fact, our result matches the information-theoretic lower bound for learning one-hidden-layer CNNs with linear activation functions, suggesting that our sample complexity is tight. Our theoretical analysis is supported by numerical experiments.
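To make the model class concrete, below is a minimal sketch of a one-hidden-layer CNN with non-overlapping filters: a shared filter w is applied to k disjoint patches of size r of each input (so the input dimension is d = k*r), followed by output weights v. The training loop uses plain gradient descent on the empirical square loss purely for illustration; the paper's approximate gradient descent update is not specified in the abstract, and the helper names (forward, gd_step) and all hyperparameters here are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    return (z > 0).astype(float)

def forward(X, w, v):
    """f(x) = sum_j v_j * relu(w . x_j) over k non-overlapping patches x_j."""
    k, r = v.shape[0], w.shape[0]
    patches = X.reshape(-1, k, r)      # (n, k, r): disjoint patch split
    pre = patches @ w                  # (n, k): pre-activation per patch
    return relu(pre) @ v, patches, pre

def gd_step(X, y, w, v, lr=0.1):
    """One plain gradient step on 0.5 * mean squared error (illustrative,
    not the paper's approximate gradient descent update)."""
    n = X.shape[0]
    out, patches, pre = forward(X, w, v)
    resid = out - y
    # d(loss)/dw = mean_n resid_n * sum_k v_k * relu'(w . x_{n,k}) * x_{n,k}
    grad_w = np.einsum('n,nk,nkr->r', resid, relu_grad(pre) * v, patches) / n
    grad_v = relu(pre).T @ resid / n
    return w - lr * grad_w, v - lr * grad_v

# Synthetic check: try to recover a planted (w*, v*) from Gaussian inputs.
rng = np.random.default_rng(0)
k, r, n = 8, 5, 2000
w_star, v_star = rng.normal(size=r), rng.normal(size=k)
X = rng.normal(size=(n, k * r))
y, _, _ = forward(X, w_star, v_star)

w, v = 0.1 * rng.normal(size=r), 0.1 * rng.normal(size=k)
for _ in range(500):
    w, v = gd_step(X, y, w, v)
print(f"final squared loss: {0.5 * np.mean((forward(X, w, v)[0] - y) ** 2):.3e}")
```

In this sketch the filter is shared across patches and the patches do not overlap, which is the structural assumption under which the paper's convergence and sample-complexity guarantees are stated.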


Related research

06/10/2017 · Recovery Guarantees for One-hidden-layer Neural Networks
In this paper, we consider regression problems with one-hidden-layer neu...

05/21/2018 · How Many Samples are Needed to Learn a Convolutional Neural Network?
A widespread folklore for explaining the success of convolutional neural...

07/14/2017 · On the Complexity of Learning Neural Networks
The stunning empirical successes of neural networks currently lack rigor...

11/08/2017 · Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels
In this paper, we consider parameter recovery for non-overlapping convol...

11/15/2018 · Mathematical Analysis of Adversarial Attacks
In this paper, we analyze efficacy of the fast gradient sign method (FGS...

12/09/2021 · A New Measure of Model Redundancy for Compressed Convolutional Neural Networks
While recently many designs have been proposed to improve the model effi...

11/17/2022 · On the Sample Complexity of Two-Layer Networks: Lipschitz vs. Element-Wise Lipschitz Activation
We investigate the sample complexity of bounded two-layer neural network...
