
A Study of Neural Training with Non-Gradient and Noise Assisted Gradient Methods

by Anirbit Mukherjee et al.
Johns Hopkins University

In this work we demonstrate provable guarantees on the training of depth-2 neural networks in regimes beyond those previously explored. (1) We start with a simple stochastic algorithm that can train a ReLU gate in the realizable setting under significantly milder conditions on the data distribution than previous results required. Leveraging some additional distributional assumptions, we also show near-optimal guarantees for training a ReLU gate when an adversary is allowed to corrupt the true labels. (2) Next we analyze the behaviour of noise-assisted gradient descent on a ReLU gate in the realizable setting. Making no further distributional assumptions, we locate a ball centered at the origin such that all the iterates remain inside it with high probability. (3) Lastly we demonstrate a non-gradient iterative algorithm for which we give near-optimal guarantees for training a class of depth-2 neural networks in the presence of an adversary who additively corrupts the true labels. This analysis brings to light the advantage of a large network width in defending against an adversary: we demonstrate that, faced with data-poisoning attacks of the kind we instantiate, for our chosen class of nets the error incurred by the algorithm in recovering the ground-truth parameters scales inversely with the width.
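As a concrete illustration of the kind of update rule analyzed in (2), here is a minimal sketch, not the paper's exact algorithm, of noise-assisted gradient descent on a single ReLU gate in the realizable setting: labels are generated noiselessly as y = max(0, <w_star, x>), and each step takes a stochastic (sub)gradient of the squared loss plus an injected Gaussian perturbation. The Gaussian data, step size eta, noise scale beta, batch size, and iteration count below are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)
d = 10
w_star = rng.normal(size=d)  # ground-truth weights (realizable setting)

def relu(z):
    return np.maximum(z, 0.0)

# Noise-assisted (sub)gradient descent on the squared loss of a ReLU gate:
#   w <- w - eta * grad + sqrt(2 * eta / beta) * xi,   xi ~ N(0, I_d)
# (a Langevin-style perturbation; eta, beta, batch, T are illustrative choices)
eta, beta, batch, T = 1e-2, 1e4, 64, 5000
w = rng.normal(size=d)
for _ in range(T):
    X = rng.normal(size=(batch, d))   # fresh Gaussian samples each iteration
    y = relu(X @ w_star)              # noiseless labels: realizable setting
    pred = X @ w
    # subgradient of (1/2)*(relu(pred) - y)^2 w.r.t. w, using ReLU'(z) = 1{z > 0}
    grad = ((relu(pred) - y) * (pred > 0)) @ X / batch
    w = w - eta * grad + np.sqrt(2 * eta / beta) * rng.normal(size=d)

print("parameter recovery error:", np.linalg.norm(w - w_star))

Note that the guarantee stated in (2) is not about convergence per se but about the noisy iterates remaining inside an origin-centered ball with high probability; the sketch above only fixes the form of the update. As a side effect of this form, the injected noise can also move the iterate out of the flat region where every preactivation is negative and the plain subgradient vanishes.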



