Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning

09/23/2019
by Jérôme Bolte, et al.

The Clarke subdifferential is not suited to tackle nonsmooth deep learning issues: backpropagation, mini-batches and steady states are not properly modelled. As a remedy, we introduce set valued conservative fields as surrogates to standard subdifferential mappings. We study their properties and provide elements of a calculus. Functions having a conservative field are called path differentiable. Convex/concave, semi-algebraic, or Clarke regular Lipschitz continuous functions are path differentiable, as their corresponding subdifferentials are conservative. Another concrete and considerable class of examples of conservative fields, which are not subdifferential mappings, is given by the automatic differentiation oracle, as for instance the "subgradients" provided by the backpropagation algorithm in deep learning. Our differential model is eventually used to ensure subsequential convergence for nonsmooth stochastic gradient methods in the tame Lipschitz continuous setting, offering the possibility of using mini-batches, the actual backpropagation oracle, and o(1/log k) stepsizes.
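The last point is concrete: automatic differentiation can output "derivatives" that are elements of a conservative field without being subgradients in any classical sense. Below is a minimal sketch in JAX (the framework choice is ours; the paper treats backpropagation oracles in general, and frameworks such as PyTorch or TensorFlow behave the same way). The function relu(x) - relu(-x) equals x everywhere, so every subdifferential notion gives {1} at 0, yet backpropagation with the usual convention relu'(0) = 0 returns 0 there.

```python
import jax

# f(x) = relu(x) - relu(-x) is identically x, so its true derivative is 1
# everywhere. JAX's jax.nn.relu uses the convention relu'(0) = 0, so
# reverse-mode autodiff returns 0 at the kink: a value of the conservative
# field attached to the backpropagation oracle, not a Clarke subgradient.
def f(x):
    return jax.nn.relu(x) - jax.nn.relu(-x)

print(f(2.0), f(0.0), f(-2.0))   # 2.0 0.0 -2.0 : f is the identity map
print(jax.grad(f)(1.0))          # 1.0 : correct away from the kink
print(jax.grad(f)(0.0))          # 0.0 : the oracle's output at the kink
```

Per the paper, a conservative field coincides with the gradient almost everywhere, so such disagreements occur only on a Lebesgue-null set of kinks; this is what makes it possible to run stochastic gradient descent with this oracle, mini-batches, and o(1/log k) stepsizes while retaining the subsequential convergence stated above.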

Related research

01/03/2021 · The structure of conservative gradient fields
The classical Clarke subdifferential alone is inadequate for understandi...

01/11/2022 · Path differentiability of ODE flows
We consider flows of ordinary differential equations (ODEs) driven by pa...

09/27/2019 · Backpropagation in the Simply Typed Lambda-calculus with Linear Negation
Backpropagation is a classic automatic differentiation algorithm computi...

05/31/2022 · Automatic differentiation of nonsmooth iterative algorithms
Differentiation along algorithms, i.e., piggyback propagation of derivat...

06/01/2022 · Nonsmooth automatic differentiation: a cheap gradient principle and other complexity results
We provide a simple model to estimate the computational costs of the bac...

01/31/2022 · Differentiable Neural Radiosity
We introduce Differentiable Neural Radiosity, a novel method of represen...

07/22/2020 · Examples of pathological dynamics of the subgradient method for Lipschitz path-differentiable functions
We show that the vanishing stepsize subgradient method – widely adopted ...
