Ordinary differential equations (ODE): metric entropy and nonasymptotic theory for noisy function fitting

by Ying Zhu et al.

This paper establishes novel results on the metric entropy of ODE solution classes. In addition, we develop a nonasymptotic theory for noisy function fitting via nonparametric least squares and via least squares based on Picard iterations. Our metric entropy results answer the question: how do the degree of smoothness and the "size" of a class of ODEs affect the "size" of the associated class of solutions? We establish a general upper bound on the covering number of solution classes associated with higher-order Picard-type ODEs, y^(m)(x) = f(x, y(x), y'(x), ..., y^(m-1)(x)). This result implies that the covering number of the underlying solution class is (essentially) bounded from above by the covering number of the class ℱ that f ranges over. This general bound yields an (essentially) sharp scaling when f is parameterized by a K-dimensional vector of coefficients belonging to a ball; in that case, the noisy recovery problem is essentially no harder than estimating a K-dimensional element of the ball. For m=1, when ℱ is an infinite-dimensional smooth class, the solution class contains functions whose derivatives grow factorially fast in magnitude, a "curse of smoothness". We introduce a new notion, the "critical smoothness parameter", to derive an upper bound on the covering number of the solution class. When the sample size is large relative to the degree of smoothness, the rate of convergence for the noisy recovery problem obtained from this "critical smoothness parameter" approach improves upon the rate obtained from the general upper bound on the covering number (and vice versa).
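To make the Picard iteration concrete: for a first-order ODE y'(x) = f(x, y(x)) with y(x₀) = y₀, each Picard step maps the current iterate y_k to y_{k+1}(x) = y₀ + ∫_{x₀}^x f(t, y_k(t)) dt. The sketch below is an illustration only (it is not the paper's least-squares estimator); the grid, the trapezoidal quadrature, and the iteration count are all choices made here for the example.

```python
import numpy as np

def picard_iterate(f, y0, xs, n_iter=20):
    """Approximate the solution of y'(x) = f(x, y(x)), y(xs[0]) = y0,
    by Picard iteration on the grid xs:
        y_{k+1}(x) = y0 + integral from xs[0] to x of f(t, y_k(t)) dt,
    with the integral computed by the cumulative trapezoidal rule."""
    y = np.full_like(xs, y0, dtype=float)  # start from the constant function y0
    for _ in range(n_iter):
        g = f(xs, y)
        # cumulative trapezoidal integral of g along xs
        integral = np.concatenate(
            ([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(xs)))
        )
        y = y0 + integral
    return y

# Example: y' = y, y(0) = 1, whose exact solution is exp(x)
xs = np.linspace(0.0, 1.0, 201)
y = picard_iterate(lambda x, y: y, 1.0, xs)
```

On this example the iterates converge to the exponential, reflecting the factorial contraction x^k/k! of the Picard map on a bounded interval; the factorial growth of solution derivatives discussed above is the flip side of this structure.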
