Memorizing Gaussians with no over-parameterization via gradient descent on neural networks

03/28/2020
by Amit Daniely, et al.

We prove that a single step of gradient descent over a depth-two network with q hidden neurons, starting from orthogonal initialization, can memorize Ω(dq/log^4(d)) independent and randomly labeled Gaussians in R^d. The result holds for a large class of activation functions, including the absolute value.
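The setting described in the abstract can be sketched directly in code. Below is a minimal PyTorch sketch, not the paper's exact construction: a depth-two network with absolute-value activation and an orthogonally initialized first layer takes a single full-batch gradient step on randomly labeled Gaussian inputs. The dimensions, learning rate, and fixed second-layer weights are illustrative assumptions; the theorem's memorization guarantee depends on the paper's specific scaling, which this snippet does not reproduce.

```python
# Minimal sketch (assumptions throughout): depth-two network
# f(x) = sum_i v_i * |<w_i, x>|, orthogonal first-layer init,
# one full-batch gradient step on randomly labeled Gaussians.
import torch

torch.manual_seed(0)

d, q, n = 64, 64, 256   # input dim, hidden width, sample count (assumed)
lr = 0.1                # learning rate for the single step (assumed)

# Independent Gaussian inputs with random +/-1 labels.
X = torch.randn(n, d)
y = torch.sign(torch.randn(n))

# First layer: orthogonal initialization (trainable).
W = torch.empty(q, d)
torch.nn.init.orthogonal_(W)
W.requires_grad_(True)

# Second layer: fixed random signs (an illustrative assumption).
v = torch.sign(torch.randn(q)) / q

def model(X, W):
    # Absolute-value activation, one of the activations covered by the result.
    return torch.abs(X @ W.t()) @ v

# One step of full-batch gradient descent on the squared loss.
loss = ((model(X, W) - y) ** 2).mean()
loss.backward()
with torch.no_grad():
    W -= lr * W.grad

# Check how many of the random labels are fitted in sign after the step.
with torch.no_grad():
    acc = (torch.sign(model(X, W)) == y).float().mean()
print(f"training sign agreement after one step: {acc:.2f}")
```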


Related research

05/04/2022 · Most Activation Functions Can Win the Lottery Without Excessive Depth
The strong lottery ticket hypothesis has highlighted the potential for t...

08/15/2019 · Improving Randomized Learning of Feedforward Neural Networks by Appropriate Generation of Random Parameters
In this work, a method of random parameters generation for randomized le...

02/13/2022 · Learning from Randomly Initialized Neural Network Features
We present the surprising result that randomly initialized neural networ...

07/14/2017 · On the Complexity of Learning Neural Networks
The stunning empirical successes of neural networks currently lack rigor...

03/28/2023 · Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control
Classical results in neural network approximation theory show how arbitr...

12/30/2021 · A Unified and Constructive Framework for the Universality of Neural Networks
One of the reasons why many neural networks are capable of replicating c...

07/02/2020 · Persistent Neurons
Most algorithms used in neural network (NN)-based learning tasks are stro...
