A deep learning theory for neural networks grounded in physics

by Benjamin Scellier, et al.

In the last decade, deep learning has become a major component of artificial intelligence, leading to a series of breakthroughs across a wide variety of domains. The workhorse of deep learning is the optimization of loss functions by stochastic gradient descent (SGD). Traditionally in deep learning, neural networks are differentiable mathematical functions, and the loss gradients required for SGD are computed with the backpropagation algorithm. However, the computer architectures on which these neural networks are implemented and trained are slow and energy-inefficient, because they separate memory from processing. To solve these problems, the field of neuromorphic computing aims to implement neural networks on hardware architectures that merge memory and processing, as brains do. In this thesis, we argue that building large, fast and efficient neural networks on neuromorphic architectures requires rethinking the algorithms used to implement and train them. To this end, we present an alternative mathematical framework, also compatible with SGD, which makes it possible to design neural networks in substrates that directly exploit the laws of physics. Our framework applies to a very broad class of models, namely systems whose state or dynamics are described by variational equations. The procedure to compute the loss gradients in such systems, which in many practical situations requires only information that is locally available to each trainable parameter, is called equilibrium propagation (EqProp). Since many systems in physics and engineering can be described by variational principles, our framework could be applied to a broad variety of physical systems, with applications in various fields of engineering beyond neuromorphic computing.
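The abstract does not spell out the EqProp procedure, so the following is only a minimal sketch of the two-phase estimator as it is usually stated in the EqProp literature: relax the system to a free equilibrium, relax it again with the cost weakly added to the energy (nudging strength `beta`), and compare the parameter derivative of the energy at the two equilibria. The scalar energy `E(w, x, s) = s^2/2 - w*x*s`, the squared-error cost, and all function names are assumptions chosen for illustration; they are not taken from the thesis.

```python
# Toy EqProp sketch (illustrative assumptions, not the thesis's model):
#   energy  E(w, x, s) = 0.5 * s**2 - w * x * s   (free equilibrium: s = w * x)
#   cost    C(s, y)    = 0.5 * (s - y)**2

def energy_grad_s(w, x, s):
    """dE/ds: the 'force' that drives the state s toward equilibrium."""
    return s - w * x

def energy_grad_w(w, x, s):
    """dE/dw: depends only on the input x and the state s (local information)."""
    return -x * s

def cost_grad_s(s, y):
    """dC/ds for the squared-error cost."""
    return s - y

def relax(w, x, y, beta, s_init=0.0, step=0.1, n_steps=2000):
    """Settle s to a minimum of E + beta * C by gradient descent on s."""
    s = s_init
    for _ in range(n_steps):
        s -= step * (energy_grad_s(w, x, s) + beta * cost_grad_s(s, y))
    return s

def eqprop_gradient(w, x, y, beta=1e-3):
    """Two-phase estimate of dL/dw, where L(w) = C(s_free(w), y)."""
    s_free = relax(w, x, y, beta=0.0)                      # free phase
    s_nudged = relax(w, x, y, beta=beta, s_init=s_free)    # nudged phase
    return (energy_grad_w(w, x, s_nudged) - energy_grad_w(w, x, s_free)) / beta

w, x, y = 0.5, 2.0, 3.0
estimate = eqprop_gradient(w, x, y)
exact = (w * x - y) * x   # analytic gradient of 0.5 * (w*x - y)**2 for this toy model
print(estimate, exact)
```

For this toy energy the estimate works out analytically to the exact gradient divided by `1 + beta`, so the two values agree as `beta` shrinks. The weight update uses only `x` and the two equilibrium states, which is the sense in which the rule is local here.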


