Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time

12/14/2021
by Zhao Song, et al.

We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function. In the typical over-parametrized setting, the network width m is much larger than both the data dimension d and the number of training samples n (m = poly(n, d)), which induces a prohibitively large weight matrix W∈ℝ^m×m per layer. Naively, one must pay O(m^2) time to read the weight matrix and evaluate the neural network function in both the forward and backward computation. In this work, we show how to reduce the training cost per iteration: we propose a framework that incurs the m^2 cost only in the initialization phase and achieves a truly subquadratic cost per iteration in terms of m, i.e., m^(2-Ω(1)) per iteration. To obtain this result, we combine several techniques: a shifted-ReLU-based sparsifier, a lazy low-rank maintenance data structure, fast rectangular matrix multiplication, tensor-based sketching, and preconditioning.
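To give intuition for the shifted-ReLU sparsifier mentioned above, here is a minimal NumPy sketch (not the paper's actual algorithm; the layer sizes, the threshold value b, and the variable names are illustrative assumptions). With a shifted ReLU σ_b(x) = max(x - b, 0), only a small fraction of the m neurons fire, so the next layer's matrix-vector product only needs the columns for active neurons, costing O(m·|active|) instead of O(m^2):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4096
# One over-parametrized layer with standard 1/sqrt(m) initialization.
W = rng.standard_normal((m, m)) / np.sqrt(m)
h = rng.standard_normal(m)  # incoming activations

b = 2.0                          # shift threshold; larger b => sparser firing
pre = W @ h                      # dense pre-activation, O(m^2)
act = np.maximum(pre - b, 0.0)   # shifted ReLU: only a small tail fires
active = np.nonzero(act)[0]
print(f"active neurons: {len(active)} / {m}")

# Exploit sparsity: the next layer reads only the active columns,
# cost O(m * |active|) rather than O(m^2).
W2 = rng.standard_normal((m, m)) / np.sqrt(m)
out_sparse = W2[:, active] @ act[active]
out_dense = W2 @ act             # reference dense computation
assert np.allclose(out_sparse, out_dense)
```

Since the pre-activations are approximately standard Gaussian, a threshold b = 2 leaves only roughly the ~2% upper tail of neurons active, which is what makes the sparse product pay off.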


Related research

- 08/10/2022: A Sublinear Adversarial Training Algorithm
- 07/21/2018: On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks
- 11/25/2022: Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
- 01/26/2020: Inference in Multi-Layer Networks with Matrix-Valued Unknowns
- 07/12/2022: Look-ups are not (yet) all you need for deep learning inference
- 05/26/2019: On Learning Over-parameterized Neural Networks: A Functional Approximation Prospective
