
# Asymptotic properties of Bernstein estimators on the simplex. Part 2: the boundary case

In this paper, we study the asymptotic properties (bias, variance, mean squared error) of Bernstein estimators for cumulative distribution functions and density functions near and on the boundary of the $d$-dimensional simplex. The simplex is an important case, as it is the natural domain of compositional data, yet it has been neglected in the literature. Our results generalize those found in Leblanc (2012), who treated the case $d = 1$, and complement the results from Ouimet (2020) in the interior of the simplex. The fact that different parts of the boundary have different dimensions makes the analysis more complex.

02/18/2020


## 1 Introduction

The $d$-dimensional simplex and its interior are defined by

$$S := \big\{x \in [0,1]^d : \|x\|_1 \leq 1\big\}, \qquad \mathrm{Int}(S) := \big\{x \in (0,1)^d : \|x\|_1 < 1\big\}, \tag{1.1}$$

where $\|x\|_1 := \sum_{i=1}^d |x_i|$. For any cumulative distribution function $F$ on $S$, define the Bernstein polynomial of order $m$ for $F$ by

$$F^\star_m(x) := \sum_{k \in \mathbb{N}_0^d \cap mS} F(k/m)\, P_{k,m}(x), \qquad x \in S, \ m \in \mathbb{N}, \tag{1.2}$$

where the weights $P_{k,m}(x)$ are the following probabilities from the $\mathrm{Multinomial}(m, x)$ distribution:

$$P_{k,m}(x) := \frac{m!}{(m - \|k\|_1)! \prod_{i=1}^d k_i!} \, (1 - \|x\|_1)^{m - \|k\|_1} \prod_{i=1}^d x_i^{k_i}, \qquad k \in \mathbb{N}_0^d \cap mS. \tag{1.3}$$

The Bernstein estimator of $F$, denoted by $F^\star_{n,m}$, is the Bernstein polynomial of order $m$ for the empirical cumulative distribution function $F_n(x) := n^{-1} \sum_{i=1}^n \mathbb{1}_{[0,x]}(X_i)$, where the random variables $X_1, X_2, \ldots, X_n$ are independent and $F$-distributed. Precisely,

$$F^\star_{n,m}(x) := \sum_{k \in \mathbb{N}_0^d \cap mS} F_n(k/m)\, P_{k,m}(x), \qquad x \in S, \ m, n \in \mathbb{N}. \tag{1.4}$$

Similarly, if $F$ has a density function $f$, we define the Bernstein density estimator of $f$ (also called a smoothed histogram) by

$$\hat{f}_{n,m}(x) := \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} \frac{m^d}{n} \sum_{i=1}^n \mathbb{1}_{(\frac{k}{m}, \frac{k+1}{m}]}(X_i)\, P_{k,m-1}(x), \qquad x \in S, \ m, n \in \mathbb{N}, \tag{1.5}$$

where $m^d$ is just a scaling factor, namely the inverse of the volume of the hypercube $(\frac{k}{m}, \frac{k+1}{m}]$.
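To make the definitions concrete, here is a small, self-contained Python sketch (not from the paper) of the weights (1.3) and the two estimators (1.4) and (1.5). The helper names `simplex_lattice`, `bernstein_cdf` and `bernstein_density` are hypothetical, and the brute-force lattice enumeration is only practical for small $d$ and $m$.

```python
import itertools
import math

def simplex_lattice(m: int, d: int):
    """Yield all k in N_0^d with ||k||_1 <= m, i.e. the lattice N_0^d ∩ mS."""
    for k in itertools.product(range(m + 1), repeat=d):
        if sum(k) <= m:
            yield k

def P(k, m: int, x) -> float:
    """Multinomial(m, x) weight P_{k,m}(x) from (1.3)."""
    s = sum(k)
    coef = math.factorial(m) / (math.factorial(m - s) * math.prod(math.factorial(ki) for ki in k))
    return coef * (1 - sum(x)) ** (m - s) * math.prod(xi ** ki for xi, ki in zip(x, k))

def bernstein_cdf(data, m: int, x) -> float:
    """F*_{n,m}(x) from (1.4): Bernstein polynomial of the empirical c.d.f."""
    n, d = len(data), len(x)
    def F_n(t):  # empirical c.d.f. at the grid point t
        return sum(all(Xi[i] <= t[i] for i in range(d)) for Xi in data) / n
    return sum(F_n(tuple(ki / m for ki in k)) * P(k, m, x) for k in simplex_lattice(m, d))

def bernstein_density(data, m: int, x) -> float:
    """f_hat_{n,m}(x) from (1.5): smoothed histogram with bin volume m^{-d}."""
    n, d = len(data), len(x)
    total = 0.0
    for k in simplex_lattice(m - 1, d):
        # count the observations falling in the half-open hypercube (k/m, (k+1)/m]
        count = sum(all(ki / m < Xi[i] <= (ki + 1) / m for i, ki in enumerate(k)) for Xi in data)
        total += (m ** d / n) * count * P(k, m - 1, x)
    return total
```

Since the $P_{k,m}(x)$ are multinomial probabilities, they sum to one over the lattice, so the c.d.f. estimator always lies in $[0,1]$ and the density estimator is nonnegative.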

## 2 Results for the density estimator $\hat{f}_{n,m}$

For every result stated in this section, we will make the following assumption.

$$\bullet \quad f \text{ is two times differentiable and its second-order partial derivatives are (uniformly) continuous on } S. \tag{2.1}$$

In the first lemma, we obtain a general expression for the bias of the density estimator.

###### Lemma 2.1 (Bias of $\hat{f}_{n,m}(x)$ on $S$).

Under assumption (2.1), we have, uniformly for $x \in S$,

$$\mathrm{Bias}[\hat{f}_{n,m}(x)] = m^{-1} \Delta_1(x) + m^{-2} \Delta_2(x) + \frac{1}{m^2} \sum_{i=1}^d o\Big(1 + \sqrt{(m-1)x_i(1-x_i) + x_i^2} + (m-1)x_i(1-x_i) + x_i^2\Big), \tag{2.2}$$

as $m \to \infty$, where

$$\Delta_1(x) := \sum_{i=1}^d \Big(\frac{1}{2} - x_i\Big) \frac{\partial}{\partial x_i} f(x) + \frac{1}{2} \sum_{i,j=1}^d (x_i \mathbb{1}_{\{i=j\}} - x_i x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x), \tag{2.3}$$

$$\Delta_2(x) := \sum_{i,j=1}^d \Big(\frac{1}{6} \mathbb{1}_{\{i=j\}} + \frac{1}{8} \mathbb{1}_{\{i \neq j\}} - \frac{1}{2} x_i \mathbb{1}_{\{i=j\}} - \frac{1}{2} x_j + x_i x_j\Big) \frac{\partial^2}{\partial x_i \partial x_j} f(x).$$
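As a sanity check on Lemma 2.1 (an illustration, not part of the paper), $\mathbb{E}[\hat{f}_{n,m}(x)]$ can be computed exactly in $d = 1$, where $S = [0,1]$ and $P_{k,m-1}$ is a binomial weight, and the bias compared with $m^{-1}\Delta_1(x)$. The Beta(2,2) test density is an arbitrary choice.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Test density f(x) = 6x(1-x) on [0,1] (the d = 1 simplex), c.d.f. F(x) = 3x^2 - 2x^3.
f   = lambda x: 6 * x * (1 - x)
F   = lambda x: 3 * x**2 - 2 * x**3
df  = lambda x: 6 - 12 * x      # f'
d2f = lambda x: -12.0           # f''

def expected_density_estimate(m, x):
    """E[f_hat_{n,m}(x)] = sum_k m * (F((k+1)/m) - F(k/m)) * P_{k,m-1}(x) -- exact, no sampling."""
    return sum(m * (F((k + 1) / m) - F(k / m)) * binom_pmf(k, m - 1, x) for k in range(m))

m, x = 200, 0.4
bias = expected_density_estimate(m, x) - f(x)
# Lemma 2.1 predicts bias ~ m^{-1} Delta_1(x), with (in d = 1)
# Delta_1(x) = (1/2 - x) f'(x) + (1/2) x (1 - x) f''(x).
delta1 = (0.5 - x) * df(x) + 0.5 * x * (1 - x) * d2f(x)
print(m * bias, delta1)
```

At $m = 200$, the rescaled bias $m \cdot \mathrm{Bias}$ agrees with $\Delta_1(x)$ up to the $m^{-1}\Delta_2(x)$ correction.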

By considering points $x$ that are close to the boundary in some components (see the subset of indices $J$ below), we get the bias of the density estimator near the boundary.

###### Theorem 2.2 (Bias of $\hat{f}_{n,m}(x)$ near the boundary of $S$).

Assume (2.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\mathrm{Bias}(\hat{f}_{n,m}(x)) = \cdots + o_{\lambda_J}\big(m^{-2} + \mathbb{1}_{\{J \neq [d]\}} m^{-1}\big), \tag{2.4}$$

as $m \to \infty$, where the constants $\lambda_i \geq 0$ are fixed, $J \subseteq [d] := \{1, \ldots, d\}$, and $\lambda_J := (\lambda_i)_{i \in J}$.

Next, we obtain a general expression for the variance of the density estimator.

###### Lemma 2.3 (Variance of $\hat{f}_{n,m}(x)$ on $S$).

Assume (2.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\mathrm{Var}(\hat{f}_{n,m}(x)) = \frac{m^d}{n} f(x) \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} P_{k,m-1}^2(x) + \cdots, \tag{2.5}$$

as $m \to \infty$.

By combining Lemma 2.3 and the technical estimate in Lemma A.1, we get the asymptotics of the variance of the density estimator near the boundary.

###### Theorem 2.4 (Variance of $\hat{f}_{n,m}(x)$ near the boundary of $S$).

Assume (2.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\mathrm{Var}(\hat{f}_{n,m}(x)) = \frac{m^{(d+|J|)/2}}{n} f(x)\Big|_{x_J = 0} \psi_{[d] \setminus J}(x) \prod_{i \in J} e^{-2\lambda_i} I_0(2\lambda_i) + \frac{m^{(d+|J|)/2}}{n} O_{\lambda_J, x_{[d] \setminus J}}\big(m^{-1} + \mathbb{1}_{\{J \neq [d]\}} m^{-1/2}\big), \tag{2.6}$$

as $m \to \infty$.
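The Bessel-function constants appearing here can be checked numerically (an illustration written for this note, not the paper's Lemma A.1): in $d = 1$ the variance is driven by $\sum_k P_{k,m}^2(x)$, which behaves like $(4\pi m x(1-x))^{-1/2}$ at an interior point and like $e^{-2\lambda} I_0(2\lambda)$ when $x = \lambda/m$.

```python
import math

def binom_pmf(k, n, p):
    # log-space evaluation to avoid overflow of binomial coefficients for large n
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def sum_sq(m, x):
    """sum_k P_{k,m}(x)^2, the quantity driving Var(f_hat) in Lemma 2.3 (d = 1)."""
    return sum(binom_pmf(k, m, x) ** 2 for k in range(m + 1))

def I0(z, terms=60):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    return sum((z / 2) ** (2 * j) / math.factorial(j) ** 2 for j in range(terms))

m = 4000
x, lam = 0.3, 1.0
# Interior regime: sum_sq(m, x) ~ (4 pi m x (1 - x))^{-1/2}.
interior_ratio = sum_sq(m, x) * math.sqrt(4 * math.pi * m * x * (1 - x))
# Boundary regime x = lambda/m: sum_sq(m, lambda/m) ~ e^{-2 lambda} I_0(2 lambda).
boundary_ratio = sum_sq(m, lam / m) / (math.exp(-2 * lam) * I0(2 * lam))
print(interior_ratio, boundary_ratio)  # both ratios approach 1
```

The two regimes explain the extra factor $m^{|J|/2} \prod_{i \in J} e^{-2\lambda_i} I_0(2\lambda_i)$ picked up by components close to the boundary.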

By combining Theorem 2.2 and Theorem 2.4, we get the asymptotics of the mean squared error of the density estimator near the boundary. In particular, the optimal smoothing parameter $m$ will depend on the number of components of $x$ that are close to the boundary.

###### Corollary 2.5 (Mean squared error of $\hat{f}_{n,m}(x)$ near the boundary of $S$).

Assume (2.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\begin{aligned} \mathrm{MSE}(\hat{f}_{n,m}(x)) &= n^{-1} m^{(d+|J|)/2} f(x)\Big|_{x_J = 0} \psi_{[d] \setminus J}(x) \prod_{i \in J} e^{-2\lambda_i} I_0(2\lambda_i) \\ &\quad + n^{-1} m^{(d+|J|)/2} O_{\lambda_J,\, x_{[d] \setminus J}}\big(m^{-1} + \mathbb{1}_{\{J \neq [d]\}} m^{-1/2}\big) + O_{\lambda_J}(m^{-3}) + o_{\lambda_J}\big(\mathbb{1}_{\{J \neq [d]\}} m^{-2}\big), \end{aligned} \tag{2.7}$$

as $n, m \to \infty$. If the quantity inside the big bracket is non-zero in (2.7), the asymptotically optimal choice of $m$, with respect to MSE, is

$$m_{\mathrm{opt}} = \cdots \tag{2.8}$$

in which case

$$\mathrm{MSE}(\hat{f}_{n,m_{\mathrm{opt}}}(x)) = \cdots \tag{2.9}$$

as $n \to \infty$.
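The right-hand sides of (2.8) and (2.9) did not survive extraction, but the rate they must contain can be sketched as follows (a hedged back-of-the-envelope calculation, assuming, as (2.4) and (2.7) indicate, a bias of exact order $m^{-1}$ with constant $B(x)$ and a leading variance of order $n^{-1} m^{(d+|J|)/2}$ with constant $V(x)$):

```latex
% Sketch under the assumptions stated above:
\mathrm{MSE}(m) \;\approx\; V(x)\, n^{-1} m^{(d+|J|)/2} \;+\; B(x)^2\, m^{-2},
\qquad
\frac{\mathrm{d}}{\mathrm{d}m}\,\mathrm{MSE}(m) = 0
\;\Longrightarrow\;
m_{\mathrm{opt}} \;\asymp\; n^{2/(d+|J|+4)},
\qquad
\mathrm{MSE}(m_{\mathrm{opt}}) \;\asymp\; n^{-4/(d+|J|+4)}.
```

For $J = \varnothing$ this reduces to the familiar interior rate $m_{\mathrm{opt}} \asymp n^{2/(d+4)}$.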

By adding conditions on the partial derivatives of $f$, we can remove terms from the bias in Theorem 2.2 and obtain another expression for the mean squared error of the density estimator near the boundary, together with the corresponding optimal smoothing parameter when $J = [d]$.

###### Corollary 2.6 (Mean squared error of $\hat{f}_{n,m}(x)$ near the boundary of $S$).

Assume (2.1) and also

 (2.10)

(in particular, the first bracket in (2.2) is zero). Then, for any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ independent of $m$ for all $i \in [d] \setminus J$, we have

$$\mathrm{MSE}(\hat{f}_{n,m}(x)) = \cdots \tag{2.11}$$

as $n, m \to \infty$. Note that the last error term is bigger than the main term except when $J = [d]$. Therefore, if $J = [d]$ and we assume that the quantity inside the big bracket is non-zero in (2.11), the asymptotically optimal choice of $m$, with respect to MSE, is

$$m_{\mathrm{opt}} = \cdots \tag{2.12}$$

in which case

$$\mathrm{MSE}(\hat{f}_{n,m_{\mathrm{opt}}}(x)) = \cdots \tag{2.13}$$

as $n \to \infty$.

###### Remark 2.7.

In order to optimize when $J \neq [d]$ in Corollary 2.6, we would need an even more precise expression for the bias in Theorem 2.2, which would require more regularity conditions on $f$ than we assumed in (2.1). We have not tried to do so because the number of terms to manage in the proof of Theorem 2.2 is already barely tractable.

## 3 Results for the c.d.f. estimator $F^\star_{n,m}$

For every result stated in this section, we will make the following assumption.

$$\bullet \quad F \text{ is three times differentiable and its third-order partial derivatives are (uniformly) continuous on } S. \tag{3.1}$$

Below, we obtain a general expression for the bias of the c.d.f. estimator on the simplex, and then near the boundary.

###### Lemma 3.1 (Bias of $F^\star_{n,m}(x)$ on $S$).

Under assumption (3.1), we have, uniformly for $x \in S$,

$$\mathbb{E}[F^\star_{n,m}(x)] = F(x) + \frac{1}{2m} \sum_{i,j=1}^d (x_i \mathbb{1}_{\{i=j\}} - x_i x_j) \frac{\partial^2}{\partial x_i \partial x_j} F(x) + o(m^{-1}), \tag{3.2}$$

as $m \to \infty$.
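A quick numerical check of Lemma 3.1 in $d = 1$ (again an illustration written for this note, not from the paper): $\mathbb{E}[F^\star_{n,m}(x)] = \sum_k F(k/m) P_{k,m}(x)$ can be evaluated exactly and compared with the $\frac{1}{2m} x(1-x) F''(x)$ correction, using the Beta(2,2) c.d.f. as an arbitrary test case.

```python
import math

F   = lambda t: 3 * t**2 - 2 * t**3   # Beta(2,2) c.d.f. on [0,1] (test case)
d2F = lambda t: 6 - 12 * t            # F''

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def expected_cdf_estimate(m, x):
    """E[F*_{n,m}(x)] = sum_k F(k/m) P_{k,m}(x), exact in d = 1 (no sampling needed)."""
    return sum(F(k / m) * binom_pmf(k, m, x) for k in range(m + 1))

m, x = 200, 0.3
bias = expected_cdf_estimate(m, x) - F(x)
predicted = 0.5 * x * (1 - x) * d2F(x) / m   # leading term of (3.2) in d = 1
print(m * bias, m * predicted)
```

At $m = 200$ the rescaled bias already matches the predicted constant to about three decimal places.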

###### Theorem 3.2 (Bias of $F^\star_{n,m}(x)$ near the boundary of $S$).

Assume (3.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\mathrm{Bias}(F^\star_{n,m}(x)) = \frac{1}{m} \sum_{i,j \in [d] \setminus J} \frac{1}{2} (x_i \mathbb{1}_{\{i=j\}} - x_i x_j) \frac{\partial^2}{\partial x_i \partial x_j} F(x)\Big|_{x_J = 0} + o_{\lambda_J}\big(m^{-2} + \mathbb{1}_{\{J \neq [d]\}} m^{-3/2}\big), \tag{3.3}$$

as $m \to \infty$.

Next, we obtain a general expression for the variance of the c.d.f. estimator on the simplex.

###### Lemma 3.3 (Variance of $F^\star_{n,m}(x)$ on $S$).

Under assumption (3.1), we have, uniformly for $x \in S$,

$$\begin{aligned} \mathrm{Var}(F^\star_{n,m}(x)) &= n^{-1} F(x)(1 - F(x)) \\ &\quad - n^{-1} \sum_{i=1}^d \frac{\partial}{\partial x_i} F(x) \sum_{k, \ell \in \mathbb{N}_0^d \cap mS} \big((k_i \wedge \ell_i)/m - x_i\big) P_{k,m}(x) P_{\ell,m}(x) \\ &\quad + n^{-1} \sum_{i=1}^d O\big(\mathbb{E}[|\xi_i/m - x_i|^2]\big), \end{aligned} \tag{3.4}$$

as $m \to \infty$, where $\xi = (\xi_1, \ldots, \xi_d) \sim \mathrm{Multinomial}(m, x)$.

By combining Lemma A.2 and Lemma 3.3, we get the asymptotics of the variance of the c.d.f. estimator near the boundary.

###### Theorem 3.4 (Variance of $F^\star_{n,m}(x)$ near the boundary of $S$).

Assume (3.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\begin{aligned} \mathrm{Var}(F^\star_{n,m}(x)) &= n^{-1} F(x)(1 - F(x))\, \mathbb{1}_{\{J = \varnothing\}} \\ &\quad - n^{-1} m^{-1} \sum_{i=1}^d \frac{\partial}{\partial x_i} F(x)\Big|_{x_J = 0} \Big\{ -\big(\lambda_i - e^{-2\lambda_i}(I_0(2\lambda_i) + \lambda_i I_1(2\lambda_i))\big) \cdot \mathbb{1}_{\{i \in J\}} \\ &\qquad + m^{1/2} \sqrt{\pi^{-1} x_i (1 - x_i)} \cdot \mathbb{1}_{\{i \in [d] \setminus J\}} \Big\} + \cdots, \end{aligned} \tag{3.5}$$

as $m \to \infty$.

By combining Theorem 3.2 and Theorem 3.4, we get the asymptotics of the mean squared error of the c.d.f. estimator near the boundary.

###### Corollary 3.5 (Mean squared error of $F^\star_{n,m}(x)$ near the boundary of $S$).

Assume (3.1). For any $x \in S$ such that $x_i := \lambda_i / m$ for all $i \in J$ and $x_i$ is independent of $m$ for all $i \in [d] \setminus J$, we have

$$\begin{aligned} \mathrm{MSE}(F^\star_{n,m}(x)) &= n^{-1} F(x)(1 - F(x))\, \mathbb{1}_{\{J = \varnothing\}} \\ &\quad - n^{-1} m^{-1} \sum_{i=1}^d \frac{\partial}{\partial x_i} F(x)\Big|_{x_J = 0} \Big\{ -\big(\lambda_i - e^{-2\lambda_i}(I_0(2\lambda_i) + \lambda_i I_1(2\lambda_i))\big) \cdot \mathbb{1}_{\{i \in J\}} \\ &\qquad + m^{1/2} \sqrt{\pi^{-1} x_i (1 - x_i)} \cdot \mathbb{1}_{\{i \in [d] \setminus J\}} \Big\} + \cdots, \end{aligned} \tag{3.6}$$

as $n, m \to \infty$. As pointed out in (Leblanc, 2012b, p. 2772) for $d = 1$, there is no optimal $m$ with respect to the MSE when $J \neq \varnothing$. This is also true here. The remaining case $J = \varnothing$ (when $x$ is far from the boundary in every component) was already treated in Corollary 2.4 of Ouimet (2020a).
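As a numerical illustration of (3.5) and (3.6) (written for this note, not from the paper), in $d = 1$ at an interior point ($J = \varnothing$) the quantity $n \cdot \mathrm{Var}(F^\star_{n,m}(x))$ can be computed exactly and compared with $F(x)(1-F(x)) - m^{-1/2}\sqrt{x(1-x)/\pi}\, F'(x)$; the Beta(2,2) c.d.f. is an arbitrary test case.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))

F  = lambda t: 3 * t**2 - 2 * t**3   # Beta(2,2) c.d.f. on [0,1] (test case)
dF = lambda t: 6 * t * (1 - t)       # F'

def n_times_var(m, x):
    """Exact n * Var(F*_{n,m}(x)) in d = 1.

    F*_{n,m}(x) = (1/n) sum_i g(X_i) with g(t) = sum_{k : t <= k/m} P_{k,m}(x), so
    n * Var = E[g(X)^2] - E[g(X)]^2, where
    E[g(X)^2] = sum_{k,l} F(min(k, l)/m) P_{k,m}(x) P_{l,m}(x).
    """
    p = [binom_pmf(k, m, x) for k in range(m + 1)]
    Eg = sum(F(k / m) * p[k] for k in range(m + 1))
    Eg2 = sum(F(min(k, l) / m) * p[k] * p[l]
              for k in range(m + 1) for l in range(m + 1))
    return Eg2 - Eg ** 2

m, x = 100, 0.3
exact = n_times_var(m, x)
# Leading terms of (3.5) with J empty (interior point):
approx = F(x) * (1 - F(x)) - m ** -0.5 * math.sqrt(x * (1 - x) / math.pi) * dF(x)
print(exact, approx)
```

The negative $m^{-1/2}$ correction is visible numerically: the Bernstein c.d.f. estimator has strictly smaller variance than the empirical c.d.f. at interior points.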

## 4 Proof of the results for the density estimator $\hat{f}_{n,m}$

### 4.1 Proof of Lemma 2.1

Using Taylor expansions for any $y \in S$ such that $y \in (\frac{k}{m}, \frac{k+1}{m}]$, we obtain

$$\begin{aligned} m^d &\int_{(\frac{k}{m}, \frac{k+1}{m}]} f(y)\, dy - f(x) \\ &= f(k/m) - f(x) + \frac{1}{2m} \sum_{i=1}^d \frac{\partial}{\partial x_i} f(k/m) + \frac{1}{m^2} \sum_{i,j=1}^d \Big(\frac{1}{6} \mathbb{1}_{\{i=j\}} + \frac{1}{8} \mathbb{1}_{\{i \neq j\}}\Big) \frac{\partial^2}{\partial x_i \partial x_j} f(k/m) \cdot (1 + o(1)) \\ &= \frac{1}{m} \sum_{i=1}^d (k_i - m x_i) \frac{\partial}{\partial x_i} f(x) + \frac{1}{2m^2} \sum_{i,j=1}^d (k_i - m x_i)(k_j - m x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) \cdot (1 + o(1)) \\ &\quad + \frac{1}{2m} \sum_{i=1}^d \frac{\partial}{\partial x_i} f(x) + \frac{1}{2m^2} \sum_{i,j=1}^d (k_j - m x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) \cdot (1 + o(1)) \\ &\quad + \frac{1}{m^2} \sum_{i,j=1}^d \Big(\frac{1}{6} \mathbb{1}_{\{i=j\}} + \frac{1}{8} \mathbb{1}_{\{i \neq j\}}\Big) \frac{\partial^2}{\partial x_i \partial x_j} f(x) \cdot (1 + o(1)) \\ &= \frac{1}{m} \sum_{i=1}^d (k_i - (m-1) x_i) \frac{\partial}{\partial x_i} f(x) + \frac{1}{m} \sum_{i=1}^d \Big(\frac{1}{2} - x_i\Big) \frac{\partial}{\partial x_i} f(x) \\ &\quad + \frac{1}{2m^2} \sum_{i,j=1}^d (k_i - (m-1) x_i)(k_j - (m-1) x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) \\ &\quad - \frac{1}{2m^2} \sum_{i,j=1}^d x_i (k_j - (m-1) x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) - \frac{1}{2m^2} \sum_{i,j=1}^d x_j (k_i - (m-1) x_i) \frac{\partial^2}{\partial x_i \partial x_j} f(x) \\ &\quad + \frac{1}{2m^2} \sum_{i,j=1}^d x_i x_j \frac{\partial^2}{\partial x_i \partial x_j} f(x) + \frac{1}{2m^2} \sum_{i,j=1}^d (k_j - (m-1) x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) \\ &\quad + \frac{1}{m^2} \sum_{i,j=1}^d \Big(\frac{1}{6} \mathbb{1}_{\{i=j\}} + \frac{1}{8} \mathbb{1}_{\{i \neq j\}} - \frac{1}{2} x_j\Big) \frac{\partial^2}{\partial x_i \partial x_j} f(x) + \frac{1}{m^2} \sum_{i=1}^d o\big(1 + |k_i - m x_i| + |k_i - m x_i|^2\big). \end{aligned}$$

If we multiply the last expression by $P_{k,m-1}(x)$ and sum over $k \in \mathbb{N}_0^d \cap (m-1)S$, then the joint moments from Lemma B.1, the notation for $\Delta_1$ and $\Delta_2$ in (2.3), Jensen's inequality and

$$V_i := \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} |k_i - m x_i|^2 P_{k,m-1}(x), \tag{4.1}$$

yield

$$\begin{aligned} \mathbb{E}[\hat{f}_{n,m}(x)] - f(x) &= \frac{1}{m} \sum_{i=1}^d \Big(\frac{1}{2} - x_i\Big) \frac{\partial}{\partial x_i} f(x) + \frac{m-1}{2m^2} \sum_{i,j=1}^d (x_i \mathbb{1}_{\{i=j\}} - x_i x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) + \cdots \\ &= m^{-1} \Delta_1(x) + m^{-2} \Delta_2(x) + \frac{1}{m^2} \sum_{i=1}^d o\Big(1 + \sqrt{(m-1)x_i(1-x_i) + x_i^2} + (m-1)x_i(1-x_i) + x_i^2\Big). \end{aligned} \tag{4.2}$$

This ends the proof.

### 4.2 Proof of Theorem 2.2

Take $x \in S$ as in the statement of the theorem. Using the notation from (2.3), we have

$$\begin{aligned} \Delta_1(x) &= \sum_{i \in J} \Big(\frac{1}{2} - \frac{\lambda_i}{m}\Big) \frac{\partial}{\partial x_i} f(x) + \sum_{i \in J} \frac{\lambda_i}{2m} \cdot \frac{\partial^2}{\partial x_i^2} f(x) + \sum_{i \in [d] \setminus J} \Big(\frac{1}{2} - x_i\Big) \frac{\partial}{\partial x_i} f(x) \\ &\quad - \sum_{\substack{i \in J \\ j \in [d] \setminus J}} \frac{\lambda_i x_j}{m} \cdot \frac{\partial^2}{\partial x_i \partial x_j} f(x) + \sum_{i,j \in [d] \setminus J} \frac{1}{2} (x_i \mathbb{1}_{\{i=j\}} - x_i x_j) \frac{\partial^2}{\partial x_i \partial x_j} f(x) + O_{\lambda_J}(m^{-2}), \end{aligned} \tag{4.3}$$

and

$$\begin{aligned} \Delta_2(x) &= \sum_{(i,j) \in [d]^2 \setminus ([d] \setminus J)^2} \Big(\frac{1}{6} \mathbb{1}_{\{i=j\}} + \frac{1}{8} \mathbb{1}_{\{i \neq j\}}\Big) \frac{\partial^2}{\partial x_i \partial x_j} f(x) - \sum_{\substack{i \in J \\ j \in [d] \setminus J}} \frac{1}{2} x_j \frac{\partial^2}{\partial x_i \partial x_j} f(x) \\ &\quad + \sum_{i,j \in [d] \setminus J} \Big(\frac{1}{6} \mathbb{1}_{\{i=j\}} + \frac{1}{8} \mathbb{1}_{\{i \neq j\}} - \frac{1}{2} x_i \mathbb{1}_{\{i=j\}} - \frac{1}{2} x_j + x_i x_j\Big) \frac{\partial^2}{\partial x_i \partial x_j} f(x) + O_{\lambda_J}(m^{-1}). \end{aligned} \tag{4.4}$$

Using the fact that $x_i = \lambda_i / m$ for all $i \in J$, note that, for all $i, j \in [d]$,

$$\begin{aligned} \frac{\partial}{\partial x_i} f(x) &= \frac{\partial}{\partial x_i} f(x)\Big|_{x_J = 0} + \sum_{j \in J} \frac{\lambda_j}{m} \cdot \frac{\partial^2}{\partial x_i \partial x_j} f(x)\Big|_{x_J = 0} \cdot (1 + o_{\lambda_J}(1)), \\ \frac{\partial^2}{\partial x_i \partial x_j} f(x) &= \frac{\partial^2}{\partial x_i \partial x_j} f(x)\Big|_{x_J = 0} \cdot (1 + o_{\lambda_J}(1)). \end{aligned} \tag{4.5}$$

Then, from (4.3), (4.4), (4.5) and Lemma 2.1, we can deduce the conclusion.

### 4.3 Proof of Lemma 2.3

By the independence of the observations $X_1, X_2, \ldots, X_n$, we have

$$\mathrm{Var}(\hat{f}_{n,m}(x)) = \frac{m^{2d}}{n} \Bigg[ \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} \int_{(\frac{k}{m}, \frac{k+1}{m}]} f(y)\, dy \; P_{k,m-1}^2(x) - \Bigg( \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} \int_{(\frac{k}{m}, \frac{k+1}{m}]} f(y)\, dy \; P_{k,m-1}(x) \Bigg)^{\!2}\, \Bigg]. \tag{4.6}$$

From Lemma 2.1, we already know that $\mathbb{E}[\hat{f}_{n,m}(x)] = f(x) + o(1)$, uniformly for $x \in S$. We can also expand the integral using a Taylor expansion:

$$\begin{aligned} m^d \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} \int_{(\frac{k}{m}, \frac{k+1}{m}]} f(y)\, dy \; P_{k,m-1}^2(x) &= f(x) \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} P_{k,m-1}^2(x) \\ &\quad + \frac{1}{m} \sum_{i=1}^d O\Bigg( \sum_{k \in \mathbb{N}_0^d \cap (m-1)S} |k_i - m x_i| P_{k,m-1}^2(x) \Bigg) + O(m^{-1}). \end{aligned} \tag{4.7}$$

Since $P_{k,m-1}(x) \leq 1$, the Cauchy–Schwarz inequality yields

$$\sum_{k \in \mathbb{N}_0^d \cap (m-1)S} |k_i - m x_i| P_{k,m-1}^2(x) \leq \sqrt{\sum_{k \in \mathbb{N}_0^d \cap (m-1)S} |k_i - m x_i|^2 P_{k,m-1}(x)} \; \sqrt{\sum_{k \in \mathbb{N}_0^d \cap (m-1)S} P_{k,m-1}^2(x)}. \tag{4.8}$$

Putting (4.6), (4.7) and (4.8) together, we get the conclusion.

### 4.4 Proof of Theorem 2.4

By Lemma 2.3, Lemma A.1 and (B.2) in Lemma B.1, we have

$$\mathrm{Var}(\hat{f}_{n,m}(x)) = \cdots \tag{4.9}$$

Using the Taylor expansion $f(x) = f(x)\big|_{x_J = 0} + O_{\lambda_J}(m^{-1})$, we get

$$\mathrm{Var}(\hat{f}_{n,m}(x)) = \frac{m^{(d+|J|)/2}}{n} \Big( f(x)\Big|_{x_J = 0} + O_{\lambda_J}(m^{-1}) \Big) \times \cdots \tag{4.10}$$

The conclusion follows.

## 5 Proof of the results for the c.d.f. estimator $F^\star_{n,m}$

### 5.1 Proof of Lemma 3.1

By a Taylor expansion,

$$\begin{aligned} F(k/m) &= F(x) + \sum_{i=1}^d (k_i/m - x_i) \frac{\partial}{\partial x_i} F(x) + \frac{1}{2} \sum_{i,j=1}^d (k_i/m - x_i)(k_j/m - x_j) \frac{\partial^2}{\partial x_i \partial x_j} F(x) \\ &\quad + \frac{1}{6} \sum_{i,j,\ell=1}^d (k_i/m - x_i)(k_j/m - x_j)(k_\ell/m - x_\ell) \frac{\partial^3}{\partial x_i \partial x_j \partial x_\ell} F(x) + o\big(\|k/m - x\|_1^3\big). \end{aligned} \tag{5.1}$$

If we multiply by $P_{k,m}(x)$, sum over $k \in \mathbb{N}_0^d \cap mS$, and then take the expectation on both sides, we get

$$\begin{aligned} \mathbb{E}[F^\star_{n,m}(x)] &= \sum_{k \in \mathbb{N}_0^d \cap mS} F(k/m) P_{k,m}(x) \\ &= F(x) + \sum_{i=1}^d \mathbb{E}[\xi_i/m - x_i] \frac{\partial}{\partial x_i} F(x) + \frac{1}{2} \sum_{i,j=1}^d \mathbb{E}[(\xi_i/m - x_i)(\xi_j/m - x_j)] \frac{\partial^2}{\partial x_i \partial x_j} F(x) \\ &\quad + \frac{1}{6} \sum_{i,j,\ell=1}^d \mathbb{E}[(\xi_i/m - x_i)(\xi_j/m - x_j)(\xi_\ell/m - x_\ell)] \frac{\partial^3}{\partial x_i \partial x_j \partial x_\ell} F(x) \\ &\quad + \sum_{i,j,\ell=1}^d o\big(\mathbb{E}[|\xi_i/m - x_i| |\xi_j/m - x_j| |\xi_\ell/m - x_\ell|]\big). \end{aligned} \tag{5.2}$$

From the multinomial joint central moments in Lemma B.1, we get

$$\mathbb{E}[F^\star_{n,m}(x)] = F(x) + \frac{1}{2m} \sum_{i,j=1}^d (x_i \mathbb{1}_{\{i=j\}} - x_i x_j) \frac{\partial^2}{\partial x_i \partial x_j} F(x) + \sum_{i,j,\ell=1}^d o\big(\mathbb{E}[|\xi_i/m - x_i| |\xi_j/m - x_j| |\xi_\ell/m - x_\ell|]\big). \tag{5.3}$$

We apply the Cauchy-Schwarz inequality on the error term to get the conclusion.

### 5.2 Proof of Theorem 3.2

Take $x \in S$ as in the statement of the theorem. For all $i, j, \ell \in [d]$, note that

$$\begin{aligned} \frac{\partial^2}{\partial x_i \partial x_j} F(x) &= \frac{\partial^2}{\partial x_i \partial x_j} F(x)\Big|_{x_J = 0} + \sum_{\ell \in J} \frac{\lambda_\ell}{m} \cdot \frac{\partial^3}{\partial x_i \partial x_j \partial x_\ell} F(x)\Big|_{x_J = 0} \cdot (1 + o_{\lambda_J}(1)), \\ \frac{\partial^3}{\partial x_i \partial x_j \partial x_\ell} F(x) &= \frac{\partial^3}{\partial x_i \partial x_j \partial x_\ell} F(x)\Big|_{x_J = 0} \cdot (1 + o_{\lambda_J}(1)). \end{aligned} \tag{5.4}$$