Linear Neural Network Layers Promote Learning Single- and Multiple-Index Models

05/24/2023
by Suzanna Parkinson, et al.

This paper explores the implicit bias of overparameterized neural networks with depth greater than two. Our framework considers a family of networks of varying depth that all have the same capacity but different implicitly defined representation costs. The representation cost of a function induced by a neural network architecture is the minimum sum of squared weights needed for the network to represent the function; it reflects the function-space bias associated with the architecture. Our results show that adding linear layers to a ReLU network yields a representation cost that favors functions that can be approximated by a low-rank linear operator composed with a function that has low two-layer representation cost. Specifically, using a neural network to fit training data with minimum representation cost yields an interpolating function that is nearly constant in directions orthogonal to a low-dimensional subspace; that is, the learned network is approximately a single- or multiple-index model. Our experiments show that when this active-subspace structure exists in the data, adding linear layers can improve generalization and produce a network that is well aligned with the true active subspace.
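To make the setup concrete, here is a minimal sketch (not the authors' code) of the kind of experiment the abstract describes: a two-layer ReLU network with extra linear layers, trained with weight decay so that the sum of squared weights, a surrogate for the representation cost, is penalized. The abstract does not specify where the linear layers are inserted; this sketch places them at the input. Function names, hyperparameters, and the synthetic single-index data are illustrative assumptions.

```python
# Sketch (assumptions, not the authors' code): a family of networks obtained by
# inserting extra *linear* layers in front of a two-layer ReLU network. The
# linear layers add no expressive power (their composition is a single matrix),
# but they change the implicit bias. Weight decay penalizes the sum of squared
# weights, an upper bound on the representation cost
#   R(f) = min_{theta : f_theta = f} ||theta||_2^2.

import torch
import torch.nn as nn


def make_deep_linear_relu_net(d_in, width, n_linear_layers):
    """Two-layer ReLU network preceded by `n_linear_layers` linear layers."""
    layers = []
    for _ in range(n_linear_layers):
        layers.append(nn.Linear(d_in, d_in, bias=False))  # extra linear layer
    layers.append(nn.Linear(d_in, width))                 # ReLU layer
    layers.append(nn.ReLU())
    layers.append(nn.Linear(width, 1))                    # linear output layer
    return nn.Sequential(*layers)


def fit_min_representation_cost(X, y, n_linear_layers=2, width=512,
                                weight_decay=1e-4, lr=1e-3, epochs=5000):
    """Fit the data while penalizing the sum of squared weights (weight decay),
    a rough surrogate for finding a minimum-representation-cost interpolant."""
    net = make_deep_linear_relu_net(X.shape[1], width, n_linear_layers)
    opt = torch.optim.Adam(net.parameters(), lr=lr, weight_decay=weight_decay)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    return net


# Hypothetical single-index data: y depends on x only through one direction w,
# so the true active subspace is span{w}.
torch.manual_seed(0)
d, n = 10, 200
w = torch.randn(d)
w /= w.norm()
X = torch.randn(n, d)
y = torch.relu(X @ w)

net = fit_min_representation_cost(X, y)

# If the extra linear layers' product is (nearly) low rank, its top
# right-singular vectors estimate the learned active subspace, which can be
# compared with the true index direction w.
with torch.no_grad():
    A = torch.eye(d)
    for m in net:
        if isinstance(m, nn.Linear) and m.bias is None:
            A = m.weight @ A
    _, S, Vt = torch.linalg.svd(A)
    print("singular values of linear-layer product:", S[:4])
    print("alignment with true index direction:", (Vt[0] @ w).abs().item())
```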

Related research

- The Role of Linear Layers in Nonlinear Interpolating Networks (02/02/2022)
- Side-effects of Learning from Low Dimensional Data Embedded in an Euclidean Space (03/01/2022)
- Adversarial Examples Exist in Two-Layer ReLU Networks for Low Dimensional Data Manifolds (03/01/2023)
- The Kolmogorov-Arnold representation theorem revisited (07/31/2020)
- Derivative-Informed Projected Neural Networks for High-Dimensional Parametric Maps Governed by PDEs (11/30/2020)
- Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions (09/29/2022)
- Multi-Head ReLU Implicit Neural Representation Networks (10/07/2021)
