The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies

06/02/2019
by Ronen Basri, et al.

We study the relationship between the speed at which a neural network learns a function and the frequency of the function. We build on recent results that show that the dynamics of overparameterized neural networks trained with gradient descent can be well approximated by a linear system. When normalized training data is uniformly distributed on a hypersphere, the eigenfunctions of this linear system are spherical harmonic functions. We derive the corresponding eigenvalues for each frequency after introducing a bias term in the model. This bias term had been omitted from the linear network model without significantly affecting previous theoretical results. However, we show theoretically and experimentally that, in the limit of large amounts of data, a shallow neural network without bias cannot learn simple, low-frequency functions with odd frequencies. Our results enable us to make specific predictions of the time it will take a network with bias to learn functions of varying frequency. These predictions match the behavior of real shallow and deep networks.
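The role of the bias term can be illustrated numerically in the simplest setting, data spread uniformly on the unit circle. The sketch below is not taken from the paper's code. It assumes the arc-cosine-type kernels commonly used for this linearized model, `cos(a) * (pi - a) / (2*pi)` without bias and `(cos(a) + 1) * (pi - a) / (4*pi)` with bias, where `a` is the angle between two inputs. The function name `kernel_eigenvalues` and the grid size `n` are hypothetical choices for illustration. On an equispaced grid the kernel matrix is circulant, so its eigenvalues are the discrete Fourier transform of its first row, and the entry at index `k` corresponds to frequency `k`.

```python
import numpy as np


def kernel_eigenvalues(n=512, bias=False):
    """Eigenvalues (scaled by 1/n) of the kernel matrix on n equispaced
    points of the unit circle; entry k corresponds to frequency k.

    The kernel forms below are assumptions stated in the lead-in, not a
    transcription of the authors' implementation.
    """
    theta = 2.0 * np.pi * np.arange(n) / n
    # Angle in [0, pi] between the first point and every other point.
    angle = np.minimum(theta, 2.0 * np.pi - theta)
    cos_a = np.cos(angle)
    if bias:
        row = (cos_a + 1.0) * (np.pi - angle) / (4.0 * np.pi)
    else:
        row = cos_a * (np.pi - angle) / (2.0 * np.pi)
    # A symmetric circulant matrix is diagonalized by the Fourier basis,
    # so its eigenvalues are the (real) DFT of its first row.
    return np.fft.fft(row).real / n


no_bias = kernel_eigenvalues(bias=False)
with_bias = kernel_eigenvalues(bias=True)

for k in range(1, 7):
    print(f"k={k}  no bias: {no_bias[k]:.3e}   with bias: {with_bias[k]:.3e}")
```

Under these assumptions, the eigenvalues at odd frequencies of 3 and above vanish (up to rounding) without bias and become positive once the bias is included. In the linearized dynamics, the error along an eigenfunction with eigenvalue λ shrinks roughly like (1 − ηλ)^t for learning rate η, so a zero eigenvalue means that component is never learned, and a small positive eigenvalue means the number of iterations grows roughly like 1/λ.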


research
03/10/2020

Frequency Bias in Neural Networks for Input of Non-Uniform Density

Recent works have partly attributed the generalization ability of over-p...
research
10/06/2021

Spectral Bias in Practice: The Role of Function Frequency in Generalization

Despite their ability to represent highly expressive functions, deep lea...
research
05/12/2021

Convergence Analysis of Over-parameterized Deep Linear Networks, and the Principal Components Bias

Convolutional Neural networks of different architectures seem to learn t...
research
01/12/2022

Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks

We study the dynamics of a neural network in function space when optimiz...
research
12/30/2017

Logarithmic Frequency Scaling and Consistent Frequency Coverage for the Selection of Auditory Filterbank Center Frequencies

This paper provides new insights into the problem of selecting filter ce...
research
05/16/2023

A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks

Despite the capacity of neural nets to learn arbitrary functions, models...
research
03/03/2023

Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies

Our theoretical understanding of the inner workings of general convoluti...
