Neural Teleportation

12/02/2020
by Marco Armenta, et al.

In this paper, we explore a process called neural teleportation, a mathematical consequence of applying quiver representation theory to neural networks. Neural teleportation "teleports" a network to a new position in the weight space while leaving its function unchanged. This concept generalizes the notion of positive scale invariance of ReLU networks to networks with any activation function and any architecture. In this paper, we shed light on surprising and counter-intuitive consequences that neural teleportation has on the loss landscape. In particular, we show that teleportation can be used to explore loss level curves, that it changes the loss landscape, sharpens global minima, and boosts back-propagated gradients. From these observations, we demonstrate that teleportation accelerates training when used during initialization, regardless of the model, its activation function, the loss function, and the training data. Our results can be reproduced with the code available here: https://github.com/vitalab/neuralteleportation.
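As a concrete illustration of the special case the abstract mentions, the following is a minimal sketch (not the paper's implementation) of positive scale invariance in a two-layer ReLU network: scaling a hidden unit's incoming weights by c > 0 and its outgoing weights by 1/c moves the network to a new point in weight space but leaves its function unchanged, since relu(c·x) = c·relu(x) for c > 0. Neural teleportation generalizes this kind of function-preserving change of coordinates.

```python
import numpy as np

def forward(x, W1, W2):
    """Two-layer network: output = W2 @ relu(W1 @ x)."""
    return W2 @ np.maximum(W1 @ x, 0.0)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # input -> hidden weights
W2 = rng.standard_normal((2, 4))   # hidden -> output weights

# "Teleport" the network: pick one positive scaling factor per
# hidden unit, scale incoming weights by c and outgoing by 1/c.
c = rng.uniform(0.5, 2.0, size=4)
W1_tel = c[:, None] * W1           # scale each unit's incoming weights
W2_tel = W2 / c[None, :]           # inversely scale its outgoing weights

# The weights are different, but the function is identical.
x = rng.standard_normal(3)
assert np.allclose(forward(x, W1, W2), forward(x, W1_tel, W2_tel))
assert not np.allclose(W1, W1_tel)
```

The paper's construction extends this idea beyond positive scalings on ReLU units, using quiver representation theory to define analogous transformations for arbitrary activations and architectures.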


