Distribution learning via neural differential equations: a nonparametric statistical perspective

09/03/2023
by Youssef Marzouk, et al.

Ordinary differential equations (ODEs), via their induced flow maps, provide a powerful framework to parameterize invertible transformations for representing complex probability distributions. While such models have achieved enormous success in machine learning, particularly for generative modeling and density estimation, little is known about their statistical properties. This work establishes the first general nonparametric statistical convergence analysis for distribution learning via ODE models trained through likelihood maximization. We first prove a convergence theorem applicable to arbitrary velocity field classes ℱ satisfying certain simple boundary constraints. This general result captures the trade-off between approximation error ("bias") and the complexity of the ODE model ("variance"). We show that the latter can be quantified via the C^1-metric entropy of the class ℱ. We then apply this general framework to the setting of C^k-smooth target densities, and establish nearly minimax-optimal convergence rates for two relevant velocity field classes ℱ: C^k functions and neural networks. The latter is the practically important case of neural ODEs. Our proof techniques require a careful synthesis of (i) analytical stability results for ODEs, (ii) classical theory for sieved M-estimators, and (iii) recent results on approximation rates and metric entropies of neural network classes. The results also provide theoretical insight into how the choice of velocity field class, and the dependence of this choice on sample size n (e.g., the scaling of width, depth, and sparsity of neural network classes), impacts statistical performance.
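For context, here is a minimal sketch of the likelihood-maximization setup the abstract refers to, written in standard continuous-time normalizing flow notation; the symbols f, T_f, ρ, and p_f are illustrative choices and need not match the paper's own conventions:

    \dot{x}(t) = f(x(t), t), \quad x(0) = z, \qquad T_f(z) := x(1), \quad f \in \mathcal{F},

    \log p_f(x) = \log \rho\bigl(T_f^{-1}(x)\bigr) - \int_0^1 (\nabla \cdot f)(x(t), t)\, dt,

    \hat{f}_n \in \operatorname*{arg\,max}_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^n \log p_f(X_i),

where ρ is a fixed reference density, p_f = (T_f)_# ρ is the density induced by the flow map, the divergence integral is taken along the ODE trajectory ending at x(1) = x (the instantaneous change-of-variables formula), and \hat{f}_n is the maximum-likelihood velocity field over the class ℱ given i.i.d. samples X_1, …, X_n.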

