Sparse approximation in learning via neural ODEs

02/26/2021
by Carlos Esteve Yagüe, et al.

We consider the continuous-time, neural ordinary differential equation (neural ODE) perspective of deep supervised learning and study the impact of the final time horizon T in training. We focus on a cost consisting of the integral of the empirical risk over the time interval, together with L^1 regularization of the parameters. Under homogeneity assumptions on the dynamics (typical for ReLU activations), we prove that any global minimizer is sparse, in the sense that there exists a positive stopping time T^* beyond which the optimal parameters vanish. Moreover, under appropriate interpolation assumptions on the neural ODE, we provide quantitative estimates of the stopping time T^* and of the training error of the trajectories at the stopping time. The latter establishes a quantitative approximation property of neural ODE flows with sparse parameters. In practical terms, a shorter time horizon in the training problem can be interpreted as a shallower residual neural network (ResNet); since the optimal parameters are concentrated over a shorter time horizon, this may lower the computational cost of training without discarding relevant information.
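In symbols, the training problem described above can be sketched as follows, where x_u denotes the neural ODE trajectory driven by the time-dependent parameters u, the empirical risk is written ℰ, and λ > 0 is a regularization weight (the notation is ours, chosen for illustration; see the paper for the precise setting):

```latex
% Sketch of the regularized training problem (illustrative notation).
\min_{u}\;\int_0^T \mathcal{E}\bigl(x_u(t)\bigr)\,\mathrm{d}t
  \;+\;\lambda\,\|u\|_{L^1(0,T)}
\quad\text{subject to}\quad
\dot{x}_u(t) = f\bigl(x_u(t),\,u(t)\bigr),\qquad x_u(0) = x_0.
```

The sparsity result then says that any global minimizer u^* satisfies u^*(t) = 0 for a.e. t ≥ T^*. The ResNet interpretation in the last sentence comes from the standard forward-Euler discretization of the neural ODE in time; the short NumPy sketch below (all function and variable names are ours, not the paper's) shows how a depth-N residual network and a discrete analogue of the L^1-in-time regularizer arise from a time grid on [0, T]:

```python
# Minimal sketch: a depth-N ResNet as the forward-Euler discretization of
# the neural ODE x'(t) = W(t) relu(x(t)) + b(t) on [0, T].
# All names here are illustrative, not taken from the paper.
import numpy as np

def resnet_flow(x0, W, b, T):
    """Forward Euler: x_{k+1} = x_k + (T/N) * (W_k @ relu(x_k) + b_k)."""
    dt = T / len(W)                  # N layers <-> N time steps of size T/N
    x = x0
    for Wk, bk in zip(W, b):
        x = x + dt * (Wk @ np.maximum(x, 0.0) + bk)   # ReLU activation
    return x

def l1_penalty(W, b, T):
    """Discrete analogue of the L^1(0, T) norm of the parameters."""
    dt = T / len(W)
    return dt * sum(np.abs(Wk).sum() + np.abs(bk).sum()
                    for Wk, bk in zip(W, b))

# If the optimal parameters vanish for t >= T^*, every layer with index
# k >= N * T^* / T carries zero weights and can be dropped, leaving a
# shallower (and cheaper to train) network.
```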

Related research

08/06/2020 · Large-time asymptotics in deep learning
It is by now well-known that practical deep supervised learning may roug...

08/18/2019 · Neural Dynamics on Complex Networks
We introduce a deep learning model to learn continuous-time dynamics on ...

10/19/2022 · Deep neural network expressivity for optimal stopping problems
This article studies deep neural network expression rates for optimal st...

06/22/2019 · Semi-tractability of optimal stopping problems via a weighted stochastic mesh algorithm
In this article we propose a Weighted Stochastic Mesh (WSM) Algorithm fo...

02/27/2019 · ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs
Residual neural networks can be viewed as the forward Euler discretizati...

02/10/2018 · Graph Planning with Expected Finite Horizon
Graph planning gives rise to fundamental algorithmic questions such as s...

03/01/2021 · Statistically Significant Stopping of Neural Network Training
The general approach taken when training deep learning classifiers is to...
