# Functional delta residuals and applications to functional effect sizes

Given a functional central limit theorem (fCLT) and a parameter transformation, we use the functional delta method to construct random processes, called functional delta residuals, which asymptotically have the same covariance structure as the transformed limit process. Moreover, we prove a multiplier bootstrap fCLT for these transformed residuals and show how it can be used to construct simultaneous confidence bands for transformed functional parameters. As motivation for this methodology, we provide the formal application of these residuals to a functional version of the effect size parameter Cohen's d, a problem appearing in current brain imaging applications. The performance and necessity of such residuals are illustrated in a simulation experiment for the covering rate of simultaneous confidence bands for the functional Cohen's d parameter.

## 1 Introduction

The motivation for our work comes from the following problem in spatial functional data analysis. Sommerfeld et al. (2018), in the context of climate data, and Bowring et al. (2019), in the context of functional magnetic resonance imaging, study confidence statements for estimators of the mean function $\mu(s)$ from a sample of a signal plus noise model $Y(s) = \mu(s) + \varepsilon(s)$, where $\varepsilon(s)$ is a stochastic error process with variance function $\sigma^2(s)$ and $s \in S$ is a spatial index. This requires estimation of the quantiles of the maximum of a limiting Gaussian process. The quantiles are estimated using standardized residuals from the estimated mean function, either through a multiplier bootstrap (Chang and Ogden, 2009; Chang et al., 2017) or the Gaussian kinematic formula (Worsley et al., 2004; Adler and Taylor, 2009). These methods successfully approximate the quantiles, since the standardized residuals asymptotically have the same covariance structure as the limiting Gaussian process.

However, this approach no longer works when the object of interest is a nonlinear transformation of the parameters. In order to guarantee comparability between different scanners, Bowring et al. (2020) extend the work of Bowring et al. (2019) to the population Cohen’s $d$, i.e., $d(s) = \mu(s)/\sigma(s)$, rather than the mean function $\mu$. This causes a new conceptual problem. While the standard residuals capture the covariance structure of the limiting Gaussian process in estimation of the mean, this no longer holds true for Cohen’s $d$, as we show in Corollary 1. We visualize this effect in Figures 1 and 2. In particular, Figure 1 shows samples of Cohen’s $d$ residuals approximating the correct covariance structure.

In this paper, we use the functional delta method to construct random processes, called functional delta residuals, which can be used for obtaining distributional properties of the limiting process whenever the object of inference is a nonlinear transformation of the functional parameters. The proposed delta residuals are necessary because the nonlinearity affects not only the variance of the limiting transformed process but also its covariance function. As an application, we use delta residuals and the quantiles of the maximum of the limiting process to construct simultaneous confidence bands, a problem commonly found in functional data analysis (Degras, 2011; Cao et al., 2012; Cao and others, 2014; Chang et al., 2017; Wang et al., 2019), for the Cohen’s $d$ parameter. An extension to spatial inference using coverage probability excursion sets for the Cohen’s $d$ parameter can be found in Bowring et al. (2020).

Given a functional central limit theorem (fCLT) and a parameter transformation, the delta residuals are constructed by the linearization underlying the functional delta method. Our main result, Theorem 1, shows that delta residuals asymptotically have the covariance structure of the limiting process of the transformed parameters. In Section 3 we apply the general theory to the functional Cohen’s $d$ statistic, prove the necessary fCLT in Theorem 2, derive the corresponding delta residuals in Section 3.2 and prove, in Theorem 3, a multiplier functional limit theorem for the delta residuals based on Chang and Ogden (2009). We use these results to construct simultaneous confidence bands and study the accuracy of their covering rate and the effect of using the wrong residuals in a simulation study in Section 4.

The methods for simultaneous confidence bands for Cohen’s $d$ are implemented in the R package SCBfda, available at https://github.com/ftelschow/SCBfda, and code reproducing the presented simulation results is available at https://github.com/ftelschow/DeltaResiduals.

## 2 Functional Delta Residuals

In this section we introduce the construction of functional delta residuals. We develop the idea in the framework of the Banach space $C(S,\mathbb{R}^D)$ of continuous functions with values in $\mathbb{R}^D$ over a compact domain $S$; however, the concept can also be generalized to other Banach spaces. The norm on $C(S,\mathbb{R}^D)$ is the maximum norm $\|f\|_\infty = \max_{s\in S}|f(s)|$, where $|\cdot|$ denotes the standard norm on $\mathbb{R}^D$. For ease of readability $C(S,\mathbb{R})$ will be denoted by $C(S)$.

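Since a purely formal treatment hides the basic idea of delta residuals, we motivate them with a special case. Let $\theta \in C(S,\mathbb{R}^D)$ be a functional population parameter and let $\hat\theta^{(1)},\dots,\hat\theta^{(N)}$ be estimators of $\theta$. Further, assume that their average $\hat\theta_N = N^{-1}\sum_{n=1}^N \hat\theta^{(n)}$ satisfies a fCLT, i.e.,

$$\sqrt{N}\big(\hat\theta_N-\theta\big)=\frac{1}{\sqrt{N}}\sum_{n=1}^N\big(\hat\theta^{(n)}-\theta\big)\rightsquigarrow G, \tag{1}$$

where $G$ is a tight zero mean Gaussian process in $C(S,\mathbb{R}^D)$ with covariance function $c$ and “$\rightsquigarrow$” denotes weak convergence in $C(S,\mathbb{R}^D)$. We call the processes $R_{N,n}$, $n=1,\dots,N$, with values in $\mathbb{R}^D$ standard residuals since, by the fCLT, their empirical covariance function converges to the covariance of $G$, i.e.,

$$\begin{aligned}
\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N R_{N,n}(s)R_{N,n}^T(t)
&=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N\big(\hat\theta^{(n)}(s)-\theta(s)\big)\big(\hat\theta^{(n)}(t)-\theta(t)\big)^T\\
&=\mathrm{Cov}[G(s),G(t)]=c(s,t).
\end{aligned} \tag{2}$$

Here the convergence is in probability and the superscript $T$ denotes the transpose of a vector. Almost sure convergence would require additional regularity conditions on the standard residuals $R_{N,n}$. We discuss sufficient conditions in the case of Cohen’s $d$ in Section 3.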

Let $H$ be a differentiable function and suppose we are interested in inferring on $H(\theta)$, where $H$ is applied pointwise via $H(\theta)(s) = H(\theta(s))$. Let $dH_x$ denote the derivative of $H$ at $x$. Then equations (1) and (2) suggest that the transformed processes

$$\tilde R_{N,n}(s)=dH_{\hat\theta_N(s)}R_{N,n}(s), \tag{3}$$

which we call functional delta residuals, can be used to approximate the covariance structure of $dH_\theta G$, which is the limiting process from the delta method, in the sense that

$$\lim_{N\to\infty}N^{-1}\sum_{n=1}^N\tilde R_{N,n}(s)\tilde R_{N,n}^T(s')=dH_{\theta(s)}\,c(s,s')\,dH_{\theta(s')}^T,$$

with convergence again being in probability.

For illustrative purposes, consider the following more concrete example. Let be a triangular array of random processes in independent and identically distributed as with and . Let so that , and suppose that weakly in for with being a tight, zero mean Gaussian process with covariance . Then are standard residuals satisfying (2). For continuously differentiable, the delta residuals are given by , which can be used to approximate the covariance function . To be even more concrete, let . Thus, we can define . Say we are interested in the asymptotic behavior of the product of the sample means of the two components of the process. Then and the delta residuals are given by . These delta residuals can be used to approximate the covariance , where and .
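As a numerical illustration of the product example, the following Python sketch evaluates delta residuals at a single location. It rests on assumptions that are not in the text: the two components are independent Gaussians with means 2 and 3 and standard deviations 1 and 0.5, and the names `product_delta_residuals`, `y` and `z` are hypothetical. For the product map $(a,b)\mapsto ab$ the derivative at the sample means is the row vector of the two means in swapped order, which is what the function uses.

```python
import random

def product_delta_residuals(y, z):
    """Delta residuals for the map (a, b) -> a * b, evaluated at the sample means."""
    n = len(y)
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    # The derivative of (a, b) -> a * b at (y_bar, z_bar) is (z_bar, y_bar);
    # it is applied to the standard residuals (y_i - y_bar, z_i - z_bar).
    return [z_bar * (yi - y_bar) + y_bar * (zi - z_bar) for yi, zi in zip(y, z)]

rng = random.Random(0)
N = 20000
y = [rng.gauss(2.0, 1.0) for _ in range(N)]   # assumed: mean 2, sd 1
z = [rng.gauss(3.0, 0.5) for _ in range(N)]   # assumed: mean 3, sd 0.5

res = product_delta_residuals(y, z)
emp_var = sum(r * r for r in res) / N

# Delta-method variance for independent components:
# mu_Z^2 * Var[Y] + mu_Y^2 * Var[Z] = 9 * 1 + 4 * 0.25
target = 3.0 ** 2 * 1.0 ** 2 + 2.0 ** 2 * 0.5 ** 2
print(emp_var, target)
```

Under these assumptions the empirical second moment of the delta residuals should be close to the delta-method variance of the product of the sample means, while the standard residuals alone would only recover the variances of the individual components.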

The next result is immediate, yet it generalizes the concept of functional delta residuals to estimators $\hat\theta_N$ that are not averages.

###### Theorem 1.

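Let $\theta \in C(S,\mathbb{R}^D)$ and let $\hat\theta_N$ be an estimator such that, as $N\to\infty$,

$$\sqrt{N}\big(\hat\theta_N-\theta\big)\rightsquigarrow G, \tag{4}$$

weakly in $C(S,\mathbb{R}^D)$, where $G$ denotes a tight Gaussian process on $S$ with mean zero and covariance function $c$. Let $H$ be a continuously differentiable function. Moreover, let $R_{N,n}$, $n=1,\dots,N$, be a triangular array of random processes satisfying uniformly in probability that

$$\lim_{N\to\infty}N^{-1}\sum_{n=1}^N R_{N,n}(s)R_{N,n}^T(s')=c(s,s'). \tag{5}$$

Then the functional delta method yields

$$\sqrt{N}\big(H(\hat\theta_N)-H(\theta)\big)\rightsquigarrow dH_\theta G=\tilde G, \quad N\to\infty, \tag{6}$$

with $\tilde G$ being a zero mean Gaussian process with covariance $\tilde c(s,s') = dH_{\theta(s)}\,c(s,s')\,dH_{\theta(s')}^T$. Furthermore, the functional delta residuals $\tilde R_{N,n}(s) = dH_{\hat\theta_N(s)}R_{N,n}(s)$, $n=1,\dots,N$, satisfy

$$\lim_{N\to\infty}N^{-1}\sum_{n=1}^N\tilde R_{N,n}(s)\tilde R_{N,n}^T(s')=\tilde c(s,s')$$

uniformly in probability.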

###### Proof.

By a simple Taylor expansion argument, $H$, considered as a function on $C(S,\mathbb{R}^D)$, is Hadamard differentiable tangentially to $C(S,\mathbb{R}^D)$, and therefore Kosorok (2008, Theorem 2.8) implies that the functional delta method is applicable.

To prove the second statement, we obtain by linearity of the differential that

$$N^{-1}\sum_{n=1}^N\tilde R_{N,n}(s)\tilde R_{N,n}^T(s')=dH_{\hat\theta_N(s)}\Big(N^{-1}\sum_{n=1}^N R_{N,n}(s)R_{N,n}^T(s')\Big)dH_{\hat\theta_N(s')}^T. \tag{7}$$

The fCLT (4) yields $\hat\theta_N\to\theta$ uniformly in probability. Hence the claim follows from (5), (7) and $dH_{\hat\theta_N}\to dH_\theta$ uniformly in probability as $N\to\infty$ by the continuous mapping theorem. ∎

###### Remark 1.

Two observations are noteworthy. First, the factors $\sqrt{N}$ in equation (4) and $N^{-1}$ in equation (5) can be replaced by general factors tending to infinity and zero, respectively. We keep these factors only for notational simplicity. Secondly, if $dH_{\theta(s)} = 0$ for all $s\in S$, then the delta residuals can be identically equal to zero. In that case, an assumption of higher differentiability of $H$ can be used to establish a similar result using a second order delta method.

## 3 Functional delta residuals for Cohen’s d

In this section we show how to apply Theorem 1 to the pointwise Cohen’s $d$ statistic for processes with $\mathcal{L}^p$-Hölder continuous paths, see Definition 1 below. This special continuity condition is needed to ensure that the sample mean and the sample variance satisfy a fCLT, which is necessary to obtain the functional delta residuals of Cohen’s $d$. As a second step, we establish a multiplier bootstrap result for the delta residuals. This result implies that the quantiles of the maximum of the limiting process of the functional delta method can be estimated consistently.

The purpose of these considerations is to provide a theoretical basis for the approach taken in Bowring et al. (2020). Neuroimaging data are typically smoothed with a Gaussian kernel, and therefore the assumption on the sample paths is satisfied for the smoothed process provided that the observed data at the voxels have a finite $p$th moment. To circumvent technicalities from the application in Bowring et al. (2020), we will demonstrate the usefulness of the delta residuals for the task of constructing simultaneous confidence bands for the functional Cohen’s $d$ parameter.

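Hereafter, we assume that $X_1,\dots,X_N$ is an i.i.d. sample in $C(S)$, distributed as $X$. The pointwise population Cohen’s $d$ is the function $d$ on $S$ defined by

$$d(s)=\frac{E[X(s)]}{\sqrt{\mathrm{Var}[X(s)]}}=\frac{\mu(s)}{\sigma(s)}=H\big(\mu(s),\sigma^2(s)\big) \tag{8}$$

with $H(a,b) = a/\sqrt{b}$. The Cohen’s $d$ parameter is estimated using its corresponding sample counterpart

$$\hat d_N(s)=H\big(\bar X_N(s),\hat\sigma_N^2(s)\big)=\frac{N^{-1}\sum_{n=1}^N X_n(s)}{\sqrt{N^{-1}\sum_{n=1}^N\big(X_n(s)-N^{-1}\sum_{m=1}^N X_m(s)\big)^2}}. \tag{9}$$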

The biased variance estimator is used in the denominator, since the delta residuals will be simpler.
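The estimator (9) is straightforward to evaluate on a discretized domain. The sketch below is a minimal illustration, assuming each curve is stored as a list of values on a common grid; the function name `cohens_d_curve` and the toy data are hypothetical and are not taken from the SCBfda package.

```python
import math

def cohens_d_curve(sample):
    """Pointwise Cohen's d as in (9).

    sample: list of N curves, each a list of values on a common grid.
    Returns a list with one estimate per grid point.
    """
    n = len(sample)
    d_hat = []
    for values in zip(*sample):          # all N observations at one grid point
        mean = sum(values) / n
        # Biased variance estimator: divides by N, not N - 1, as in (9).
        var_biased = sum((v - mean) ** 2 for v in values) / n
        d_hat.append(mean / math.sqrt(var_biased))
    return d_hat

# Toy data: three curves observed at two grid points.
curves = [[1.0, 2.0], [3.0, 2.5], [2.0, 4.5]]
d = cohens_d_curve(curves)
print(d)
```

The function raises a `ZeroDivisionError` if all observations at a grid point coincide, which mirrors the requirement that the variance function is strictly positive.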

###### Remark 2.

Two observations are noteworthy here.

a.) The estimator (9) is not unbiased as an estimate of $d$. In the case that $X$ is Gaussian, unbiasedness can be achieved by introducing a bias correcting factor depending on $N$ (Laubscher, 1960, p. 1106).

b.) It will be obvious from the proofs that the theorems on delta residuals for Cohen’s hold true not only for but any .

### 3.1 A Functional Central Limit Theorem

We want to apply Theorem 1 to the function $H$ from (8). As such we need to establish a fCLT for the process $(\bar X_N, \hat\sigma_N^2)$, which takes values in $C(S,\mathbb{R}^2)$. The following sample path property will be our main assumption on the process $X$ to prove the fCLT.

###### Definition 1.

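Let $Z$ be a process in $C(S)$. Given $p\geq 1$, we say that $Z$ has $\mathcal{L}^p$-Hölder continuous paths, if

$$|Z(s)-Z(s')|\leq L\,|s-s'|^{\alpha} \quad \text{for all } s,s'\in S \tag{10}$$

for a positive random variable $L$ with $E[L^p]<\infty$ and $\alpha\in(0,1]$.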

###### Remark 3.

-Hölder continuous paths ensure that satisfies a fCLT, i.e., for iid processes in , the sum converges weakly to a tight mean zero Gaussian process which has the same covariance structure as , see Jain and Marcus (1975, Theorem 1).

The following Lemma states useful properties of processes with -Hölder continuous paths.

###### Lemma 1.

Let and be iid processes in having -Hölder continuous paths with over a compact set , and assume that there exists such that is finite. Then for all and uniformly almost surely. If , then also uniformly almost surely.

###### Proof.

First claim: Using the convexity of and we have

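$$\begin{aligned}
E\|Y\|_\infty^p &\leq 2^{p-1}\Big(E\|Y-Y(s')\|_\infty^p+E|Y(s')|^p\Big)\\
&\leq 2^{p-1}\Big(E[A^p]\max_{s\in S}|s-s'|^{\alpha p}+E|Y(s')|^p\Big),
\end{aligned}$$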

where is the random variable from the -Hölder property. This yields for all .

We now apply the generic uniform convergence result in Davidson (1994, Theorem 21.8). Since pointwise convergence is obvious by the strong law of large numbers, we only need to establish strong stochastic equicontinuity of the random function $\bar Y_N - E[Y]$. This is established using Davidson (1994, Theorem 21.10 (ii)), since

$$\Big|\bar Y_N(s)-\bar Y_N(s')-E\big[Y(s)-Y(s')\big]\Big|\leq\Big(\sum_{n=1}^N\frac{A_n}{N}+E[A]\Big)|s-s'|^{\alpha}=C_N|s-s'|^{\alpha}$$

for all . Here iid denote the random variables from the -Hölder paths of the ’s and . Hence the random variable converges almost surely to the constant by the strong law of large numbers.

Second claim: With the same strategy and assuming w.l.o.g. , we compute

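$$\begin{aligned}
\Big|\frac{1}{N}\sum_{n=1}^N\big(Y_n(s)Z_n(s)-Y_n(s')Z_n(s')\big)\Big|
&\leq\Big(\frac{1}{N}\sum_{n=1}^N\big(\|Y_n\|_\infty B_n+\|Z_n\|_\infty A_n\big)\Big)|s-s'|^{\alpha}\\
&\leq\Bigg(\sqrt{\sum_{n=1}^N\frac{\|Y_n\|_\infty^2}{N}\sum_{n=1}^N\frac{B_n^2}{N}}+\sqrt{\sum_{n=1}^N\frac{\|Z_n\|_\infty^2}{N}\sum_{n=1}^N\frac{A_n^2}{N}}\Bigg)|s-s'|^{\alpha},
\end{aligned}$$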

where iid denote the random variables from the -Hölder property of the ’s and . Again by the strong law of large numbers the random Hölder constant converges almost surely and is finite. ∎

Since we are dealing with vector-valued random processes, we need the next lemma in our proof of Theorem 2. It states simple conditions for obtaining weak convergence of a vector-valued process from its components.

###### Lemma 2.

Let be -valued random variables on the probability space such that and . If the finite dimensional distributions of converge to those of , we have in .

###### Proof.

Since and in and is complete and separable, the sequences are tight and so for each , there exist compact such that for all , and . This implies

$$P_{X_N,Y_N}(A\times B)=P_{X_N,Y_N}\big((C(S)\times B)\cap(A\times C(S))\big)\geq 1-2\epsilon.$$

The latter is true, since in general , if and , and since

$$P_{X_N,Y_N}\big(C(S)\times B\big)=P\big(X_N\in C(S),\,Y_N\in B\big)=P(Y_N\in B)\geq 1-\epsilon$$

and similarly . This holds for all and so the sequence is tight. Tightness implies relative compactness by Prohorov’s theorem.

Moreover, the finite dimensional distributions converge and form a separating class in $C(S)\times C(S)$ (the proof is along the lines of Example 1.3 in Billingsley (1999, p. 12)), so in particular the joint distribution converges (arguing as in Example 5.1 in Billingsley (1999, p. 57)). ∎

With these preparatory results we are now able to prove the main theorem of this section.

###### Theorem 2.

Let be a compact space and be an i.i.d. sample in satisfying and having -Hölder continuous paths. Then

$$\sqrt{N}\Big(\big(\bar X_N,\hat\sigma_N^2\big)-\big(\mu,\sigma^2\big)\Big)\rightsquigarrow G \tag{11}$$

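where $G$ is a two-dimensional mean zero Gaussian process with covariance matrix function given by

$$c(s,s')=\begin{pmatrix}c_{11}(s,s') & c_{12}(s,s')\\ c_{21}(s,s') & c_{22}(s,s')\end{pmatrix}$$

with $c_{11}(s,s') = \mathrm{Cov}[X(s),X(s')]$, $c_{12}(s,s') = \mathrm{Cov}\big[X(s),(X(s')-\mu(s'))^2\big]$, $c_{21}(s,s') = c_{12}(s',s)$ and $c_{22}(s,s') = \mathrm{Cov}\big[(X(s)-\mu(s))^2,(X(s')-\mu(s'))^2\big]$.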

###### Proof.

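Since for iid random variables $Z_1,\dots,Z_N$ with mean $\mu$ and finite variance we have that

$$\frac{1}{\sqrt{N}}\sum_{n=1}^N\Big(Z_n-\frac{1}{N}\sum_{m=1}^N Z_m\Big)^2=\frac{1}{\sqrt{N}}\sum_{n=1}^N(Z_n-\mu)^2+e_N,$$

where $e_N\to 0$ in probability as $N\to\infty$, we can w.l.o.g. replace $\bar X_N$ by $\mu$ in the definition of $\hat\sigma_N^2$ from equation (9) and further, for simplicity, assume $\mu(s)=0$ for all $s\in S$.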
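The remainder term in this decomposition can be written out explicitly. As a short worked step, writing $\bar Z_N$ for the sample mean of the $Z_n$ (notation introduced here only for this illustration) and assuming finite variance, one has

$$\sum_{n=1}^N\big(Z_n-\bar Z_N\big)^2=\sum_{n=1}^N(Z_n-\mu)^2-N\big(\bar Z_N-\mu\big)^2,
\qquad\text{so that}\qquad
e_N=-\sqrt{N}\big(\bar Z_N-\mu\big)^2=-\frac{\big(\sqrt{N}(\bar Z_N-\mu)\big)^2}{\sqrt{N}}.$$

By the central limit theorem $\sqrt{N}(\bar Z_N-\mu)$ is bounded in probability, hence $e_N$ tends to zero in probability at rate $N^{-1/2}$.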

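For any finite collection $s_1,\dots,s_d\in S$, applying the multivariate CLT to the sequence of vectors

$$\frac{1}{\sqrt{N}}\sum_{n=1}^N\Big(X_n(s_1),\,X_n^2(s_1)-\sigma^2(s_1),\,\dots,\,X_n(s_d),\,X_n^2(s_d)-\sigma^2(s_d)\Big)^T$$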

yields convergence to the finite dimensional distributions of from the statement of the theorem. Hence the finite dimensional distributions of converge to those of . Since the process is -Hölder continuous, we have -Hölder bounds on and as shown in the proof of Theorem in Telschow and Schwartzman (2019). Thus, by Jain and Marcus (1975) both satisfy the CLT in . In particular, by Lemma 2 we obtain the fCLT for (, ). ∎

The functional delta method yields the following corollary.

###### Corollary 1.

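Under the assumptions of Theorem 2 we have that $\sqrt{N}\big(\hat d_N-d\big)\rightsquigarrow\tilde G$, where $\tilde G$ is a mean zero Gaussian process with covariance structure given by

$$\tilde c(s,s')=\Big(\sigma(s)^{-1},\,-\frac{\mu(s)\sigma(s)^{-3}}{2}\Big)\,c(s,s')\,\Big(\sigma(s')^{-1},\,-\frac{\mu(s')\sigma(s')^{-3}}{2}\Big)^T. \tag{12}$$

Moreover, if $X$ is a Gaussian process, then $\tilde c$ simplifies to

$$\tilde c(s,s')=\frac{c_{11}(s,s')}{\sigma(s)\sigma(s')}+\frac{c_{11}(s,s')^2\,\mu(s)\mu(s')}{2\,\sigma^3(s)\,\sigma^3(s')}. \tag{13}$$

###### Proof.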

By Theorem 2 and the fact that is continuously differentiable, we can apply Theorem 1. Note that the denominator in the fCLT will be nonzero with probability for all , if , by Adler and Taylor (2009, Lemma 11.2.10).

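For the Gaussian situation the asymptotic covariance structure simplifies significantly. To show this, we define $\varepsilon = X-\mu$ and use the following fact about moments of multivariate normal distributions, better known as Isserlis’ theorem, cf. Theorem 1 in Vignat (2012),

$$E\big[\varepsilon^2(s)\varepsilon^2(s')\big]=\sigma^2(s)\sigma^2(s')+2c_{11}(s,s')^2,$$

for all $s,s'\in S$ to compute

$$\begin{aligned}
c_{22}(s,s') &=E\Big[\big(\varepsilon(s)^2-\sigma^2(s)\big)\big(\varepsilon(s')^2-\sigma^2(s')\big)\Big]\\
&=E\big[\varepsilon^2(s)\varepsilon^2(s')\big]-E\big[\varepsilon^2(s')\big]\sigma^2(s)-E\big[\varepsilon^2(s)\big]\sigma^2(s')+\sigma^2(s)\sigma^2(s')\\
&=2c_{11}(s,s')^2.
\end{aligned}$$

Finally, we note that

$$c_{12}(s,s')=E\Big[\varepsilon(s)\big(\varepsilon^2(s')-\sigma^2(s')\big)\Big]=E\big[\varepsilon(s)\varepsilon^2(s')\big]=0=c_{21}(s,s'),$$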

yielding the simplified version of the limiting covariance structure. ∎
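For readers who want the substitution spelled out, here is a brief worked version of the last step; it uses only the quantities defined in Theorem 2 and Corollary 1. With $c_{12}=c_{21}=0$ and $c_{22}=2c_{11}^2$, expanding the quadratic form in (12) leaves only the two diagonal contributions:

$$\tilde c(s,s')=\frac{c_{11}(s,s')}{\sigma(s)\sigma(s')}+\frac{\mu(s)\mu(s')}{4\,\sigma^3(s)\sigma^3(s')}\,c_{22}(s,s')
=\frac{c_{11}(s,s')}{\sigma(s)\sigma(s')}+\frac{\mu(s)\mu(s')\,c_{11}(s,s')^2}{2\,\sigma^3(s)\sigma^3(s')}.$$

On the diagonal $s=s'$, where $c_{11}(s,s)=\sigma^2(s)$, this evaluates to $\tilde c(s,s)=1+d(s)^2/2$, so the asymptotic variance exceeds one wherever $d(s)\neq 0$.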

The above corollary shows why linear residuals fail to capture the asymptotic correlation structure of Cohen’s $d$ and are therefore unsuitable for building inferential tools. For simplicity, suppose that $X$ is Gaussian. Then the standardized linear residuals $e_{N,n}(s) = \big(X_n(s)-\bar X_N(s)\big)/\sigma(s)$ yield

$$\mathrm{Cov}\big[e_{N,n}(s),e_{N,n}(s')\big]=(1-1/N)\frac{c_{11}(s,s')}{\sigma(s)\sigma(s')}\xrightarrow{N\to\infty}\frac{c_{11}(s,s')}{\sigma(s)\sigma(s')},$$

which is not equal to (13).

### 3.2 Functional Delta Residuals

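In the previous section we established that the estimator of $d$ given in (9) satisfies a fCLT. Therefore, we now derive the corresponding functional delta residuals. Theorem 2 states that

$$\sqrt{N}\Big(N^{-1}\sum_{n=1}^N\Big(X_n(s),\big(X_n(s)-\bar X_N(s)\big)^2\Big)-\big(\mu(s),\sigma^2(s)\big)\Big)$$

converges to a tight mean zero Gaussian process. This is an average estimator as discussed in Section 2. Hence we can identify $\hat\theta^{(n)} = \big(X_n,(X_n-\bar X_N)^2\big)$ with standard residuals

$$R_{N,n}=\hat\theta^{(n)}-\hat\theta_N=\Big(X_n-\bar X_N,\,(X_n-\bar X_N)^2-\hat\sigma_N^2\Big). \tag{14}$$

The functional delta residuals for $\hat d_N$ therefore are

$$\tilde R_{N,n}=\Big(\hat\sigma_N^{-1},\,-\frac{\bar X_N\hat\sigma_N^{-3}}{2}\Big)R_{N,n}^T=\frac{X_n-\bar X_N}{\hat\sigma_N}-\frac{\hat d_N}{2}\Bigg(\Big(\frac{X_n-\bar X_N}{\hat\sigma_N}\Big)^2-1\Bigg). \tag{15}$$

We call these residuals Cohen’s $d$ residuals. It is easy to show that $\sum_{n=1}^N\tilde R_{N,n}=0$. To prove that these residuals satisfy Theorem 1, the following result is necessary.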
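To make formula (15) concrete, the following Python sketch computes Cohen’s $d$ residuals at a single location and compares their empirical second moment with the Gaussian diagonal value $1+d^2/2$ implied by (13). The simulation settings (Gaussian observations with mean 1 and standard deviation 1, sample size 50000) and the name `cohens_d_residuals` are assumptions made for this illustration only.

```python
import math
import random

def cohens_d_residuals(x):
    """Cohen's d residuals (15) at one location from scalar observations x."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # biased, as in (9)
    d_hat = mean / sd
    residuals = []
    for v in x:
        e = (v - mean) / sd                      # standardized linear residual
        residuals.append(e - 0.5 * d_hat * (e * e - 1.0))
    return d_hat, residuals

rng = random.Random(1)
mu, sigma, N = 1.0, 1.0, 50000                   # assumed simulation settings
x = [rng.gauss(mu, sigma) for _ in range(N)]

d_hat, res = cohens_d_residuals(x)
res_var = sum(r * r for r in res) / N
target = 1.0 + (mu / sigma) ** 2 / 2.0           # diagonal of (13) for Gaussian data
print(d_hat, res_var, target)
```

By construction the standardized linear residuals `e` have empirical second moment exactly one, whereas the delta residuals should reproduce the larger value $1+d^2/2$ up to sampling error.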

###### Lemma 3.

Suppose has -Hölder continuous paths and for some , then the standard residuals (14) are componentwise -Hölder continuous.

###### Proof.

For the component it is clear that the process has -Hölder continuous paths. Therefore we only prove the claim for . Note that for all

$$\big(X_n(s)-\bar X_N(s)\big)^2-\big(X_n(t)-\bar X_N(t)\big)^2=X_n^2(s)-X_n^2(t)-2\big(X_n(s)\bar X_N(s)-X_n(t)\bar X_N(t)\big)+\bar X_N^2(s)-\bar X_N^2(t).$$

Here each term can be bounded in a similar manner. As such we only provide the bound for the middle term, which is

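$$\begin{aligned}
\big|X_n(s)\bar X_N(s)-X_n(t)\bar X_N(t)\big|
&\leq\big|X_n(s)\bar X_N(s)-X_n(s)\bar X_N(t)\big|+\big|X_n(s)\bar X_N(t)-X_n(t)\bar X_N(t)\big|\\
&\leq\|X_n\|_\infty\big|\bar X_N(s)-\bar X_N(t)\big|+\|\bar X_N\|_\infty\big|X_n(s)-X_n(t)\big|\\
&\leq\Big(\frac{\|X_n\|_\infty}{N}\sum_{m=1}^N L_m+\frac{L_n}{N}\sum_{m=1}^N\|X_m\|_\infty\Big)|s-t|^{\alpha}=A\,|s-t|^{\alpha}.
\end{aligned}$$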

Applying the inequality , using that by Lemma 1 and shows that . ∎

Lemma 3 together with Lemma 1 implies almost sure uniform convergence of

$$N^{-1}\sum_{n=1}^N R_{N,n}(s)R_{N,n}^T(s')\to c(s,s'), \tag{16}$$

where $c$ is given in Theorem 2. Hence Theorem 1 holds true for the Cohen’s $d$ residuals.

### 3.3 A Multiplier Bootstrap Functional Limit Theorem

Our main application of delta residuals is to approximate statistics that depend on the limiting process in (6) or quantiles thereof, such as quantiles of the maximum of the process. The latter are used in Bowring et al. (2020) to construct coverage probability excursion sets for Cohen’s $d$. In order to justify their construction we establish weak conditional convergence for the multiplier process based on the delta residuals. The multiplier bootstrap process is defined by

$$\tilde G_N^r=\frac{1}{\sqrt{N}}\sum_{n=1}^N r_{N,n}\tilde R_{N,n}, \tag{17}$$

where $r_{N,1},\dots,r_{N,N}$ is an iid triangular array of multipliers satisfying $E[r_{N,n}]=0$ and $\mathrm{Var}[r_{N,n}]=1$. Moreover, the multipliers are assumed to be independent of the $X_n$’s and thereby independent of the delta residuals defined in equation (15).
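The way (17) is used to estimate a quantile of the maximum can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in SCBfda: the residual curves are assumed to be stored on a finite grid, the multipliers are taken to be standard Gaussian (one choice with mean zero and unit variance), and the function name `multiplier_bootstrap_quantile` and its parameters are hypothetical.

```python
import math
import random

def multiplier_bootstrap_quantile(residuals, level=0.95, n_boot=1000, seed=0):
    """Empirical `level` quantile of max_s |N^{-1/2} sum_n r_n R_n(s)|.

    residuals: list of N residual curves, each a list of values on a common grid.
    """
    rng = random.Random(seed)
    n = len(residuals)
    grid_size = len(residuals[0])
    maxima = []
    for _ in range(n_boot):
        # Fresh iid multipliers with mean 0 and variance 1 (Gaussian by assumption).
        r = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sup = 0.0
        for j in range(grid_size):
            g = sum(r[i] * residuals[i][j] for i in range(n)) / math.sqrt(n)
            sup = max(sup, abs(g))
        maxima.append(sup)
    maxima.sort()
    # Order statistic corresponding to the requested level.
    return maxima[min(n_boot - 1, math.ceil(level * n_boot) - 1)]

# Sanity check: residuals +1/-1 at a single grid point make the bootstrap
# process exactly standard normal, so the quantile should be near 1.96.
toy_residuals = [[1.0], [-1.0]] * 100
q95 = multiplier_bootstrap_quantile(toy_residuals, level=0.95)
q50 = multiplier_bootstrap_quantile(toy_residuals, level=0.5)
print(q50, q95)
```

Given Cohen’s $d$ residual curves on a grid, a band of the form $\hat d_N(s)\pm q/\sqrt{N}$ could then be formed from the returned quantile $q$. Whether the residuals should first be standardized pointwise is a design choice that this sketch leaves open.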

The following theorem is based on the proofs from Chang and Ogden (2009) and implies that the multiplier bootstrap process conditioned on the delta residuals asymptotically has similar sample path properties as the limiting process from the delta method. As such it can, for example, be used to estimate quantiles of the maximum, see Remark 4.

###### Theorem 3.

Under the assumptions of Theorem 2 the following statements hold

Here convergence in is in outer probability, is the expectation over conditional on and is the set of all such that and for all .

###### Proof.

It suffices to prove the result for the multiplier bootstrap process defined by the standard residuals with weak convergence towards . This can be seen as follows: converges to uniformly almost surely by Theorem 2 and the continuous mapping theorem. Here and . Thus, applying Theorem 18.10(v) from Van der Vaart (2000), we obtain the weak convergence

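$$\Big(dH_{\hat\theta_N},\,\frac{1}{\sqrt{N}}\sum_{n=1}^N r_{N,n}R_{N,n}\Big)\rightsquigarrow\big(dH_\theta,\,G\big),$$

where $G$ is given in Theorem 2. The continuous mapping theorem (Van der Vaart, 2000, Theorem 18.11) then yields

$$dH_{\hat\theta_N}\Big(\frac{1}{\sqrt{N}}\sum_{n=1}^N r_{N,n}R_{N,n}\Big)\rightsquigarrow dH_\theta G=\tilde G.$$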

Let us define the unobservable iid samples

$$R_{N,n}=\big(X_n-\mu,\,(X_n-\mu)^2-\sigma^2\big),$$

where and . By definition these samples satisfy . Since has -Hölder continuous paths and , both components of divided by satisfy (A), (B), (C) and (D) from Chang and Ogden (2009) meaning their Theorem 1 and 2 are applicable. In particular, applying Lemma 2, this means and converge weakly to . Simple algebra shows that is a random process converging uniformly to zero as tends to infinity. Thus, converge weakly to in the space of bounded functions over . Since and all are assumed to be continuous processes, the convergence is also in by Van Der Vaart and Wellner (1996, Theorem 1.3.10). This finishes the proof of part .

Let and