In appropriate frameworks, automatic differentiation is transparent to t...
We leverage path differentiability and a recent result on nonsmooth impl...
We show that the derivatives of the Sinkhorn-Knopp algorithm, or iterati...
We provide a simple model to estimate the computational costs of the bac...
Differentiation along algorithms, i.e., piggyback propagation of derivat...
We consider flows of ordinary differential equations (ODEs) driven by pa...
In theory, the choice of ReLU'(0) in [0, 1] for a neural network has a n...
In view of training increasingly complex learning architectures, we esta...
In view of a direct and simple improvement of vanilla SGD, this paper pr...
We prove that the iterates produced by either the scalar step size vari...
We present a new algorithm to solve min-max or min-min problems out of t...
Minibatch decomposition methods for empirical risk minimization are comm...
Automatic differentiation, as implemented today, does not have a simple ...
We consider the long-term dynamics of the vanishing stepsize subgradient...
The Lipschitz constant of a network plays an important role in many appl...
We consider the problem of estimating the support of a measure from a fi...
The Clarke subdifferential is not suited to tackle nonsmooth deep learni...
We devise a learning algorithm for possibly nonsmooth deep neural networ...
Spectral features of the empirical moment matrix constitute a resourcefu...
Statistical leverage scores emerged as a fundamental tool for matrix ske...