Memorizing Gaussians with no over-parameterization via gradient descent on neural networks

03/28/2020
by Amit Daniely, et al.

We prove that a single step of gradient descent over a depth-two network with q hidden neurons, starting from orthogonal initialization, can memorize Ω(dq/log^4(d)) independent and randomly labeled Gaussians in R^d. The result holds for a large class of activation functions, including the absolute value.
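The setting described in the abstract can be sketched directly in code. Below is a minimal PyTorch sketch, not the paper's exact construction: a depth-two network with absolute-value activation and an orthogonally initialized first layer takes a single full-batch gradient step on randomly labeled Gaussian inputs. The dimensions, learning rate, and fixed second-layer weights are illustrative assumptions; the theorem's memorization guarantee depends on the paper's specific scaling, which this snippet does not reproduce.

```python
# Minimal sketch (assumptions throughout): depth-two network
# f(x) = sum_i v_i * |<w_i, x>|, orthogonal first-layer init,
# one full-batch gradient step on randomly labeled Gaussians.
import torch

torch.manual_seed(0)

d, q, n = 64, 64, 256   # input dim, hidden width, sample count (assumed)
lr = 0.1                # learning rate for the single step (assumed)

# Independent Gaussian inputs with random +/-1 labels.
X = torch.randn(n, d)
y = torch.sign(torch.randn(n))

# First layer: orthogonal initialization (trainable).
W = torch.empty(q, d)
torch.nn.init.orthogonal_(W)
W.requires_grad_(True)

# Second layer: fixed random signs (an illustrative assumption).
v = torch.sign(torch.randn(q)) / q

def model(X, W):
    # Absolute-value activation, one of the activations covered by the result.
    return torch.abs(X @ W.t()) @ v

# One step of full-batch gradient descent on the squared loss.
loss = ((model(X, W) - y) ** 2).mean()
loss.backward()
with torch.no_grad():
    W -= lr * W.grad

# Check how many of the random labels are fitted in sign after the step.
with torch.no_grad():
    acc = (torch.sign(model(X, W)) == y).float().mean()
print(f"training sign agreement after one step: {acc:.2f}")
```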


Related research

05/04/2022 · Most Activation Functions Can Win the Lottery Without Excessive Depth
The strong lottery ticket hypothesis has highlighted the potential for t...

08/15/2019 · Improving Randomized Learning of Feedforward Neural Networks by Appropriate Generation of Random Parameters
In this work, a method of random parameters generation for randomized le...

02/13/2022 · Learning from Randomly Initialized Neural Network Features
We present the surprising result that randomly initialized neural networ...

07/14/2017 · On the Complexity of Learning Neural Networks
The stunning empirical successes of neural networks currently lack rigor...

03/28/2023 · Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control
Classical results in neural network approximation theory show how arbitr...

12/30/2021 · A Unified and Constructive Framework for the Universality of Neural Networks
One of the reasons why many neural networks are capable of replicating c...

07/02/2020 · Persistent Neurons
Most algorithms used in neural network (NN)-based learning tasks are stro...
