Bless and curse of smoothness and phase transitions in nonparametric regressions: a nonasymptotic perspective

by Ying Zhu, et al.

When the regression function belongs to the standard smooth classes consisting of univariate functions with derivatives up to the (γ+1)th order bounded by a common constant everywhere or a.e., it is well known that the minimax optimal rate of convergence in mean squared error (MSE) is (σ^2/n)^{(2γ+2)/(2γ+3)} when γ is finite and the sample size n→∞. From a nonasymptotic viewpoint that considers finite n, this paper shows that, for the standard Hölder and Sobolev classes, the minimax optimal rate is σ^2(γ∨1)/n when n/σ^2 ≾ (γ∨1)^{2γ+3} and (σ^2/n)^{(2γ+2)/(2γ+3)} when n/σ^2 ≿ (γ∨1)^{2γ+3}. To establish these results, we derive upper and lower bounds on the covering and packing numbers for the generalized Hölder class where the kth (k=0,...,γ) derivative is bounded from above by a parameter R_k and the γth derivative is R_{γ+1}-Lipschitz (and also for the generalized ellipsoid class of smooth functions). Our bounds sharpen the classical metric entropy results for the standard classes, and give the general dependence on γ and R_k. By deriving the minimax optimal MSE rates under R_k=1, R_k≤(k-1)! and R_k=k! (with the latter two cases motivated in our introduction) with the help of our new entropy bounds, we show a couple of interesting results that cannot be shown with the existing entropy bounds in the literature. For the Hölder class of d-variate functions, our result suggests that the classical asymptotic rate (σ^2/n)^{(2γ+2)/(2γ+2+d)} could be an underestimate of the MSE in finite samples.
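The two-regime rate for the standard univariate Hölder and Sobolev classes can be illustrated with a short numerical sketch. The function name `minimax_rate` and the regime labels are hypothetical, and the code assumes every universal constant hidden by ≾ and ≿ equals 1, which the abstract does not claim. It therefore shows orders of magnitude only, not the paper's exact bounds.

```python
def minimax_rate(n, sigma2, gamma):
    """Order-of-magnitude minimax MSE rate for the standard smooth classes.

    Assumption (not from the paper): all universal constants are set to 1,
    so the regime boundary sits exactly at n / sigma2 = (gamma v 1)^(2*gamma + 3).
    """
    g = max(gamma, 1)                      # gamma v 1
    threshold = g ** (2 * gamma + 3)       # phase-transition point in n / sigma^2
    if n / sigma2 <= threshold:
        # Small-sample regime: rate sigma^2 * (gamma v 1) / n
        return sigma2 * g / n, "small-sample"
    # Large-sample regime: classical rate (sigma^2 / n)^((2*gamma + 2) / (2*gamma + 3))
    exponent = (2 * gamma + 2) / (2 * gamma + 3)
    return (sigma2 / n) ** exponent, "asymptotic"


if __name__ == "__main__":
    # With gamma = 2 the boundary is at n / sigma^2 = 2^7 = 128.
    for n in (50, 128, 10_000):
        rate, regime = minimax_rate(n, sigma2=1.0, gamma=2)
        print(f"n={n:>6}  regime={regime:<12}  rate={rate:.6f}")
```

With constants set to 1, the two expressions coincide at the boundary n/σ^2 = (γ∨1)^{2γ+3}, where both equal (γ∨1)^{-(2γ+2)}, so the sketch's rate is continuous across the phase transition.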




