# Strong-consistent autoregressive predictors in abstract Banach spaces

This work derives new results on the strong-consistency of a componentwise estimator of the autocorrelation operator, and its associated plug-in predictor, in the context of autoregressive processes of order one in a real separable Banach space B (ARB(1) processes). For the estimator of the autocorrelation operator, strong-consistency is proved in the norm of the space L(B) of bounded linear operators on B. The strong-consistency of the associated plug-in predictor then follows in the norm of B. The methodology applied is based on assuming suitable continuous embeddings between the Banach, Hilbert and Reproducing Kernel Hilbert spaces involved in the construction proposed in Kuelbs [1970]. This paper extends the results in Bosq [2000] and Labbas and Mourid [2002].


## 1 Introduction

In the last few decades, there has been growing interest in the statistical analysis of high–dimensional data from the Functional Data Analysis (FDA) perspective. The book by Ramsay and Silverman [2005] provides an overview of FDA techniques, extended from the multivariate data context, or specifically formulated for the FDA framework. The monograph by Hsing and Eubank [2015] introduces functional analytical tools usually applied in the estimation of random elements in function spaces. The book by Horváth and Kokoszka [2012] is mainly concerned with inference based on second order statistics; a central topic in this book is the analysis of functional data displaying dependent structures in time and space. The methodological survey paper by Cuevas [2014] discusses the state of the art and central topics in FDA. Recent advances in the statistical analysis of high–dimensional data, from the parametric, semiparametric and nonparametric FDA frameworks, are collected in the Special Issue by Goia and Vieu [2016].

Linear time series models traditionally arise for processing temporally correlated linear data. In the FDA context, the monograph by Bosq [2000] introduces linear functional time series theory. The RKHS generated by the autocovariance operator plays a crucial role in the estimation approach presented in this monograph. In particular, the eigenvectors of the autocovariance operator are considered for projection (see also Álvarez-Liébana [2017]); their empirical versions are computed when they are unknown. The resulting plug–in predictor is obtained as a linear functional of the observations, based on the empirical approximation of the autocorrelation operator. This approach exploits the Hilbert space structure, and its extension to the metric space context, and in particular to the Banach space context, requires deriving a relationship (continuous embeddings) between the Banach space norm and the RKHS norm induced by the autocovariance operator, in contrast with the nonparametric regression approach for functional prediction (see, for instance, Ferraty et al. [2012], where asymptotic normality is derived). Specifically, in the nonparametric approach, a linear combination of the observed response values is usually considered. That is the case of the nonparametric local–weighting–based approach, involving weights defined from an isotropic kernel, depending on the metric or semi–metric of the space where the regressors take their values (see, for example, Ferraty and Vieu [2006]; see also Ferraty et al. [2002], in the functional time series framework). The nonparametric approach is then more flexible regarding the structure of the space where the functional values of the regressors lie (usually a semi–metric space is considered). However, some computational drawbacks are present in its implementation, requiring the resolution of several selection problems. For instance, a choice of the smoothing parameter, and of the kernel involved in the definition of the weights, should be performed. Real–valued covariates were incorporated in the novel semiparametric kernel–based proposal by Aneiros-Pérez and Vieu [2008], involving an extension to the functional partial linear time series framework (see also Aneiros-Pérez and Vieu [2006]). Goia and Vieu [2015] also adopt a semi–parametric approach in their formulation of a two–term Partitioned Functional Single Index Model. Geenens [2011] exploits the alternative provided by semi–metrics to avoid the curse of infinite dimensionality of some functional estimators.

On the other hand, in a parametric linear framework, Mas and Pumo [2010] introduced functional time series models in Banach spaces. In particular, strong mixing conditions and the absolute regularity of Banach–valued autoregressive processes have been studied in Allam and Mourid [2001]. Empirical estimators for Banach–valued autoregressive processes are studied in Bosq [2002], where, under some regularity conditions, and for the case of orthogonal innovations, the empirical mean is proved to be asymptotically optimal with respect to almost surely (a.s.) convergence and convergence of order two. The empirical autocovariance operator was also interpreted as a sample mean of an autoregressive process in a suitable space of linear operators. The extension of these results to the case of weakly dependent innovations is obtained in Dehling and Sharipov [2005]. A strongly–consistent sieve estimator of the autocorrelation operator of a Banach–valued autoregressive process is considered in Rachedi and Mourid [2003]. Limit theorems for a seasonality estimator, in the case of Banach autoregressive perturbations, are formulated in Mourid [2002]; confidence regions for the periodic seasonality function, in the Banach space of continuous functions, are obtained as well. An approximation of Parzen’s optimal predictor, in the RKHS framework, is applied in Mokhtari and Mourid [2003] for prediction of temporal stochastic processes in Banach spaces. The existence and uniqueness of an almost surely strictly periodically correlated solution to the first order autoregressive model in Banach spaces is derived in Parvardeh et al. [2017]. Under some regularity conditions, limit results are obtained for AR(1) processes in D([0,1]) in Hajj [2011], where D([0,1]) denotes the Skorokhod space of right–continuous functions on [0,1] having limits to the left at each point. Conditions for the existence of strictly stationary solutions of ARMA equations in Banach spaces, with independent and identically distributed noise innovations, are derived in Spangenberg [2013].

In the derivation of strong–consistency results for ARB(1) componentwise estimators and predictors, Bosq [2000] restricts his attention to the case of the Banach space C([0,1]) of continuous functions on [0,1], with the supremum norm. Labbas and Mourid [2002] consider an ARB(1) context, for B an arbitrary real separable Banach space, under the construction of a Hilbert space ˜H where B is continuously embedded, as given in the Kuelbs Lemma in [Kuelbs, 1970, Lemma 2.1]. Under the existence of a continuous extension ˜ρ to ˜H of the autocorrelation operator ρ, Labbas and Mourid [2002] obtain the strong-consistency of the formulated componentwise estimator of ρ, and of its associated plug–in predictor, in the norms of ˜H and L(˜H), respectively.

The present approach can also cover functional data in nuclear spaces, arising, for example, in the observation of the solution to stochastic fractional and multifractional linear pseudodifferential equations (see, for example, Anh et al. [2016a, b]). The scales of Banach spaces constituted by fractional Sobolev and Besov spaces play a central role in the context of nuclear spaces. Continuous (nuclear) embeddings usually connect the elements of these scales (see, for example, Triebel [1983]). In this paper, a Rigged–Hilbert–Space structure is defined, involving the separable Hilbert space appearing in the construction of the Kuelbs Lemma in [Kuelbs, 1970, Lemma 2.1]. A key assumption here is the existence of a continuous (Hilbert–Schmidt) embedding introducing the RKHS, associated with the autocovariance operator of the ARB(1) process, into the Hilbert space generating the Gelfand triple, equipped with a finer topology than the B–topology. Under this scenario, strong–consistency results are derived in the space L(B) of bounded linear operators on B, considering an abstract separable Banach space framework.

The outline of this paper is as follows. Notation and preliminaries are fixed in Section 2. Fundamental assumptions and some key lemmas are formulated in Section 3, and proved in Section 4. The main result of this paper on strong–consistency is derived in Section 5. Section 6 provides some examples. Final comments on our approach can be found in Section 7. The Supplementary Material, provided in Appendix 8, numerically illustrates the results derived in Section 5, under the scenario described in Section 6, in a simulation study.

## 2 Preliminaries

Let B be a real separable Banach space, with norm ∥·∥_B, and let L²_B(Ω, A, P) be the space of zero-mean B–valued random variables X such that

 √(∫_B ∥X∥²_B dP) < ∞.

Consider X = {X_n, n ∈ Z} to be a zero–mean B–valued stochastic process on the basic probability space (Ω, A, P), satisfying (see Bosq [2000]):

 X_n = ρ(X_{n−1}) + ε_n, n ∈ Z, ρ ∈ L(B), (1)

where ρ denotes the autocorrelation operator of X. In equation (1), the B–valued innovation process ε = {ε_n, n ∈ Z} on (Ω, A, P) is assumed to be strong white noise, uncorrelated with the random initial condition. Thus, ε is a zero–mean Banach–valued stationary process, with independent and identically distributed components, and with finite second-order moment E∥ε_n∥²_B < ∞, for each n ∈ Z. Assume that there exists an integer j₀ ≥ 1 such that

 ∥ρ^{j₀}∥_{L(B)} < 1. (2)

Then, equation (1) admits a unique strictly stationary solution {X_n, n ∈ Z} ⊂ L²_B(Ω, A, P); i.e., belonging to the space of zero-mean B–valued random variables with finite second-order moment, given by X_n = Σ_{k=0}^∞ ρ^k(ε_{n−k}), for each n ∈ Z (see Bosq [2000]). Under (2), the autocovariance operator C of an ARB(1) process X is defined from the autocovariance operator of X₀, as

 C(x∗)=E{x∗(X0)X0},x∗∈B∗.

The cross–covariance operator D of X is given by

 D(x∗)=E{x∗(X0)X1},x∗∈B∗.
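As a purely numerical illustration (not part of the paper's formal development), the ARB(1) equation (1) can be simulated after discretization: functions on [0,1] are sampled at m grid points, the operator ρ becomes an m × m matrix, and condition (2) is enforced by scaling its spectral norm below one. The kernel choice and all names below are hypothetical.

```python
import numpy as np

# Minimal sketch of an ARB(1)-type simulation under a hypothetical
# discretization: X_n = rho(X_{n-1}) + eps_n, with rho an integral operator
# scaled so that ||rho|| = 0.8 < 1, a sufficient version of condition (2).
rng = np.random.default_rng(0)
m, n = 50, 500
t = np.linspace(0.0, 1.0, m)

K = np.exp(-np.abs(t[:, None] - t[None, :]))  # kernel k(s, u) = exp(-|s - u|)
K *= 1.0 / m                                  # quadrature weight of the integral
K *= 0.8 / np.linalg.norm(K, 2)               # enforce spectral norm 0.8

X = np.zeros((n, m))
for i in range(1, n):
    eps = rng.normal(0.0, 1.0, m)             # strong white noise innovation
    X[i] = K @ X[i - 1] + eps                 # X_n = rho(X_{n-1}) + eps_n
```

Since ∥ρ∥ < 1, the simulated trajectories are numerically stationary after a short burn-in.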

Since C is assumed to be a nuclear operator, there exists a sequence {x_j, j ≥ 1} ⊂ B such that, for every x* ∈ B* (see [Bosq, 2000, Eq. (6.24), p. 156]):

 C(x*) = Σ_{j=1}^∞ x*(x_j) x_j, Σ_{j=1}^∞ ∥x_j∥²_B < ∞.

D is also assumed to be a nuclear operator. Then, there exist sequences {x**_j, j ≥ 1} ⊂ B** and {y_j, j ≥ 1} ⊂ B such that, for every x* ∈ B*,

 D(x*) = Σ_{j=1}^∞ x**_j(x*) y_j, Σ_{j=1}^∞ ∥x**_j∥_{B**} ∥y_j∥_B < ∞,

(see [Bosq, 2000, Eq. (6.23), p. 156]). Empirical estimators of C and D are respectively given by (see [Bosq, 2000, Eqs. (6.45) and (6.58), pp. 164–168]), for n ≥ 2,

 C_n(x*) = (1/n) Σ_{i=0}^{n−1} x*(X_i) X_i, D_n(x*) = (1/(n−1)) Σ_{i=0}^{n−2} x*(X_i) X_{i+1}, x* ∈ B*.
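Continuing the discretized sketch above, the empirical estimators C_n and D_n reduce to sample auto- and cross-covariance matrices acting on grid values; `empirical_operators` is a hypothetical helper name, not the paper's notation.

```python
import numpy as np

# Sketch of the empirical estimators, with functions discretized as rows of an
# (n, m) array X. This identifies C_n and D_n with m x m matrices (an
# assumption of this illustration only).
def empirical_operators(X):
    n = X.shape[0]
    Cn = X.T @ X / n                   # C_n = (1/n) sum_i X_i (x) X_i
    Dn = X[1:].T @ X[:-1] / (n - 1)    # D_n v = (1/(n-1)) sum_i <X_i, v> X_{i+1}
    return Cn, Dn
```

C_n is symmetric and non-negative definite by construction, matching the role of the autocovariance operator.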

[Kuelbs, 1970, Lemma 2.1], now formulated, plays a key role in our approach.

###### Lemma 2.1

If B is a real separable Banach space with norm ∥·∥_B, then there exists an inner product ⟨·,·⟩_˜H on B such that the norm ∥·∥_˜H generated by ⟨·,·⟩_˜H is weaker than ∥·∥_B. The completion of B under the norm ∥·∥_˜H defines the Hilbert space ˜H, where B is continuously embedded.

Denote by {x_n, n ≥ 1} a dense sequence in B, and by {F_n, n ≥ 1} ⊂ B* a sequence of bounded linear functionals on B satisfying

 Fn(xn)=∥xn∥B,∥Fn∥=1, (3)

such that

 ∥x∥B=supn∈N|Fn(x)|,x∈B. (4)

The inner product and its associated norm in Lemma 2.1 are defined by

 ⟨x, y⟩_˜H = Σ_{n=1}^∞ t_n F_n(x) F_n(y), x, y ∈ ˜H,
 ∥x∥²_˜H = Σ_{n=1}^∞ t_n {F_n(x)}² ≤ ∥x∥²_B, x ∈ B,

where {t_n, n ≥ 1} is a sequence of positive numbers such that Σ_{n=1}^∞ t_n < ∞.
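A minimal numerical sketch of the weighted inner product in Lemma 2.1, assuming (hypothetically) that the functionals F_n act as coordinate evaluations and taking t_n = 2^{-n}, so that Σ t_n < ∞; with these choices, the bound ∥x∥²_˜H ≤ ∥x∥²_B can be checked directly.

```python
import numpy as np

# Sketch of the Kuelbs inner product <x, y> = sum_n t_n F_n(x) F_n(y), under
# the hypothetical choice F_n(x) = n-th coordinate of x and t_n = 2^{-n}.
def kuelbs_inner(x, y, t):
    return float(np.sum(t * x * y))

m = 64
t = 0.5 ** np.arange(1, m + 1)        # t_n = 2^{-n}, sum_n t_n < 1
x = np.cos(np.linspace(0.0, 1.0, m))  # sample "function" values F_n(x)
norm_H = np.sqrt(kuelbs_inner(x, x, t))
norm_B = np.max(np.abs(x))            # ||x||_B = sup_n |F_n(x)|
```

Because Σ t_n < 1 here, the weaker-norm property ∥x∥_˜H ≤ ∥x∥_B of Lemma 2.1 holds coordinate-wise.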

## 3 Main assumptions and preliminary results

In view of Lemma 2.1, every X_n satisfies, a.s.,

 Xn=˜H∞∑j=1⟨Xn,vj⟩˜Hvj,n∈Z,

for any orthonormal basis {v_j, j ≥ 1} of ˜H. The trace autocovariance operator

 C=E{(∞∑j=1⟨Xn,vj⟩˜Hvj)⊗(∞∑j=1⟨Xn,vj⟩˜Hvj)}

of the extended ARB(1) process is a trace operator in ˜H, admitting a diagonal spectral representation in terms of its eigenvalues {C_j, j ≥ 1} and eigenvectors {ϕ_j, j ≥ 1}, which provide an orthonormal system in ˜H. Summarizing, in the subsequent developments, the following identities in ˜H will be considered for the extended version of the ARB(1) process X. For each f, h ∈ ˜H,

 C(f) =_˜H Σ_{j=1}^∞ C_j ⟨f, ϕ_j⟩_˜H ϕ_j, (6)
 D(h) =_˜H Σ_{j=1}^∞ Σ_{k=1}^∞ ⟨D(ϕ_j), ϕ_k⟩_˜H ⟨h, ϕ_j⟩_˜H ϕ_k,
 C_n(f) =_˜H Σ_{j=1}^n C_{n,j} ⟨f, ϕ_{n,j}⟩_˜H ϕ_{n,j} a.s., (7)
 C_{n,j} =_{a.s.} (1/n) Σ_{i=0}^{n−1} X²_{i,n,j}, X_{i,n,j} = ⟨X_i, ϕ_{n,j}⟩_˜H, C_n(ϕ_{n,j}) =_˜H C_{n,j} ϕ_{n,j} a.s.,
 D_n(h) =_˜H Σ_{j=1}^n Σ_{k=1}^n ⟨D_n(ϕ_{n,j}), ϕ_{n,k}⟩_˜H ⟨h, ϕ_{n,j}⟩_˜H ϕ_{n,k} a.s., (8)

where, for each n ≥ 2, {ϕ_{n,j}, j ≥ 1} is a complete orthonormal system in ˜H, and

 Cn,1≥Cn,2≥⋯≥Cn,n≥0=Cn,n+1=Cn,n+2=….
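In the discretized sketch, the empirical eigenpairs (C_{n,j}, ϕ_{n,j}) of (7), with the decreasing ordering displayed above, can be obtained from a symmetric eigendecomposition; `empirical_eigenpairs` is a hypothetical helper name.

```python
import numpy as np

# Sketch: eigenpairs of the (discretized, symmetric non-negative) C_n, sorted
# so that C_{n,1} >= C_{n,2} >= ... >= 0, as in the text.
def empirical_eigenpairs(Cn):
    vals, vecs = np.linalg.eigh(Cn)     # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]      # reverse to descending
    return vals[order], vecs[:, order]
```

The columns of the returned matrix form an orthonormal system, mirroring {ϕ_{n,j}, j ≥ 1}.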

The following assumption plays a crucial role in the derivation of the main results in this paper.

Assumption A1. X₀ is a.s. bounded, and the eigenspace associated with each eigenvalue C_j in (6) is one-dimensional, for every j ≥ 1.

Under Assumption A1, we can define the following quantities:

 a₁ = 2√2 / (C₁ − C₂), a_j = 2√2 max(1/(C_{j−1} − C_j), 1/(C_j − C_{j+1})), j ≥ 2. (9)
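The spectral-gap quantities in (9) translate directly into code; a sketch, assuming a strictly decreasing eigenvalue sequence as required by Assumption A1 (`gap_quantities` is a hypothetical helper name).

```python
import numpy as np

# The quantities a_1 = 2*sqrt(2)/(C_1 - C_2) and
# a_j = 2*sqrt(2) * max(1/(C_{j-1} - C_j), 1/(C_j - C_{j+1})), j >= 2,
# computed from a strictly decreasing eigenvalue array C.
def gap_quantities(C):
    a = np.empty(len(C) - 1)
    a[0] = 2.0 * np.sqrt(2.0) / (C[0] - C[1])
    for j in range(1, len(C) - 1):
        a[j] = 2.0 * np.sqrt(2.0) * max(1.0 / (C[j - 1] - C[j]),
                                        1.0 / (C[j] - C[j + 1]))
    return a
```

Small spectral gaps inflate a_j, which is why the gap condition on Λ_{k_n} below controls the eigenvector estimation error.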
###### Remark 3.1

This assumption can be relaxed to allow multidimensional eigenspaces, by redefining the quantities a_j, for each j ≥ 1, as the corresponding quantities given in [Bosq, 2000, Lemma 4.4].

Assumption A2. Let k_n < n, n ∈ N, be such that

 C_{n,k_n} > 0 a.s., k_n → ∞, k_n/n → 0, n → ∞.

###### Remark 3.2

Consider

 Λ_{k_n} = sup_{1 ≤ j ≤ k_n} (C_j − C_{j+1})^{−1}. (10)

For n sufficiently large,

 kn

Assumption A3. The following limit holds:

 sup_{x ∈ B; ∥x∥_B ≤ 1} ∥ρ(x) − Σ_{j=1}^k ⟨ρ(x), ϕ_j⟩_˜H ϕ_j∥_B → 0, k → ∞. (11)

Assumption A4. C and {t_n, n ≥ 1} are such that the inclusion of H(X) into ˜H* is continuous; i.e.,

 H(X) ↪ ˜H*,

where ↪ denotes, as usual, the continuous embedding, ˜H* the dual space of ˜H, and H(X) the Reproducing Kernel Hilbert Space associated with C.

Let us consider the closed subspace H of ˜H, with the norm induced by the inner product ⟨·,·⟩_H, defined as follows:

 H = {x∈B; ∞∑n=1{Fn(x)}2<∞},⟨f,g⟩H=∞∑n=1Fn(f)Fn(g),f,g∈H. (12)

Then, B* is continuously embedded into H, and the following remark provides the isometric identification, established by the Riesz Representation Theorem, between the space ˜H* and H.

###### Remark 3.3

Let f*, g* ∈ ˜H* and ˜f, ˜g ∈ H be such that, for every n ≥ 1, F_n(f*) = √t_n F_n(˜f) and F_n(g*) = √t_n F_n(˜g). Then, the following identities hold:

 ⟨f*, g*⟩_˜H* = Σ_{n=1}^∞ (1/t_n) F_n(f*) F_n(g*) = Σ_{n=1}^∞ (1/t_n) √t_n √t_n F_n(˜f) F_n(˜g) = ⟨˜f, ˜g⟩_H.
###### Lemma 3.1

Under Assumption A4, the following continuous embeddings hold:

 H(X)↪˜H∗↪B∗↪H↪B↪˜H↪[H(X)]∗, (13)

where

 ˜H = {x ∈ B; Σ_{n=1}^∞ t_n {F_n(x)}² < ∞}, ⟨f, g⟩_˜H = Σ_{n=1}^∞ t_n F_n(f) F_n(g), f, g ∈ ˜H,
 H = {x ∈ B; Σ_{n=1}^∞ {F_n(x)}² < ∞}, ⟨f, g⟩_H = Σ_{n=1}^∞ F_n(f) F_n(g), f, g ∈ H,
 ˜H* = {x ∈ B; Σ_{n=1}^∞ (1/t_n) {F_n(x)}² < ∞}, ⟨f, g⟩_˜H* = Σ_{n=1}^∞ (1/t_n) F_n(f) F_n(g), f, g ∈ ˜H*,
 H(X) = {x ∈ ˜H; ⟨C^{−1}(x), x⟩_˜H < ∞}, ⟨f, g⟩_{H(X)} = ⟨C^{−1}(f), g⟩_˜H, f, g ∈ C^{1/2}(˜H),
 [H(X)]* = {x ∈ ˜H; ⟨C(x), x⟩_˜H < ∞}, ⟨f, g⟩_{[H(X)]*} = ⟨C(f), g⟩_˜H, f, g ∈ C^{−1/2}(˜H).

Proof. Let us consider the following inequalities, for each x ∈ B:

 ∥x∥_˜H = √(Σ_{n=1}^∞ t_n {F_n(x)}²) ≤ ∥x∥_B = sup_{n≥1} |F_n(x)|,
 ∥x∥_B = sup_{n≥1} |F_n(x)| ≤ √(Σ_{n=1}^∞ {F_n(x)}²) = ∥x∥_H ≤ Σ_{n=1}^∞ |F_n(x)| = ∥x∥_{B*},
 ∥x∥_{B*} = Σ_{n=1}^∞ |F_n(x)| ≤ √(Σ_{n=1}^∞ (1/t_n) {F_n(x)}²) = ∥x∥_˜H*. (14)

 ∥f∥_{H(X)} = √(⟨C^{−1}(f), f⟩_˜H) ≥ ∥f∥_˜H* = √(Σ_{n=1}^∞ (1/t_n) {F_n(f)}²). (15)

From equations (14)–(15), the inclusions in (13) are continuous.

It is well–known that {ϕ_j, j ≥ 1} is also an orthogonal system in H(X). Furthermore, under Assumption A4, from Lemma 3.1,

 {ϕj, j≥1}⊂H(X)↪˜H∗↪B∗↪H.

Therefore, from equation (12), for every j ≥ 1,

 ∥ϕj∥2H=∞∑m=1{Fm(ϕj)}2<∞. (16)

The following assumption is now considered on the norm (16):

Assumption A5. The continuous embedding of H(X) into H belongs to the trace class. That is,

 ∞∑j=1∥ϕj∥2H<∞.

Let {F_m, m ≥ 1} be defined as in Lemma 2.1. Assumption A5 leads to

 Σ_{j=1}^∞ ∥ϕ_j∥²_H = Σ_{j=1}^∞ Σ_{m=1}^∞ {F_m(ϕ_j)}² = Σ_{m=1}^∞ Σ_{j=1}^∞ {F_m(ϕ_j)}² < ∞, (17)

where, in particular, from equation (17),

 N_m = Σ_{j=1}^∞ {F_m(ϕ_j)}² < ∞, sup_{m≥1} N_m = N < ∞, (18)
 V = sup_{j≥1} ∥ϕ_j∥_B ≤ Σ_{j=1}^∞ Σ_{m=1}^∞ {F_m(ϕ_j)}² < ∞. (19)

The following preliminary results are considered from [Bosq, 2000, Theorem 4.1, pp. 98–99; Corollary 4.1, pp. 100–101; Theorem 4.8, pp. 116–117].

###### Lemma 3.2

Under Assumption A1, the following identities hold, for any standard AR(1) process (e.g., the extension to ˜H of the ARB(1) process satisfying equation (1)):

 ∥C_n − C∥_{S(˜H)} = O((ln(n)/n)^{1/2}) a.s., ∥D_n − D∥_{S(˜H)} = O((ln(n)/n)^{1/2}) a.s.,

where ∥·∥_{S(˜H)} is the norm in the Hilbert space S(˜H) of Hilbert–Schmidt operators on ˜H; i.e., the subspace of compact operators A such that

 Σ_{j=1}^∞ ⟨A*A(φ_j), φ_j⟩_˜H < ∞,

for any orthonormal basis {φ_j, j ≥ 1} of ˜H.

###### Lemma 3.3

Under Assumption A1, let C_j and C_{n,j} be the eigenvalues introduced in (6) and (7), respectively. Then,

 (n/ln(n))^{1/2} sup_{j≥1} |C_{n,j} − C_j| → 0 a.s., n → ∞.

###### Lemma 3.4

(See details in [Bosq, 2000, Corollary 4.3, p. 107].) Under Assumption A1, consider Λ_{k_n} in equation (10) satisfying

 Λ_{k_n} = o((n/ln(n))^{1/2}), n → ∞.

Then,

 sup_{1 ≤ j ≤ k_n} ∥ϕ′_{n,j} − ϕ_{n,j}∥_˜H → 0 a.s., n → ∞,

where, for n ≥ 2 and j ≥ 1,

 ϕ′_{n,j} = sgn⟨ϕ_{n,j}, ϕ_j⟩_˜H ϕ_j, sgn⟨ϕ_{n,j}, ϕ_j⟩_˜H = 1_{⟨ϕ_{n,j}, ϕ_j⟩_˜H ≥ 0} − 1_{⟨ϕ_{n,j}, ϕ_j⟩_˜H < 0},

with 1_A denoting the indicator function of a set A.
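The sign convention ϕ′_{n,j} = sgn⟨ϕ_{n,j}, ϕ_j⟩_˜H ϕ_j of Lemma 3.4 resolves the sign indeterminacy of empirical eigenvectors. In the discretized sketch it reads as follows (`align_signs` is a hypothetical helper name).

```python
import numpy as np

# Sign alignment of Lemma 3.4 (sketch): eigenvectors are only defined up to
# sign, so each theoretical eigenvector phi_j (columns of phi_true) is flipped
# to match its empirical counterpart phi_{n,j} (columns of phi_emp).
def align_signs(phi_emp, phi_true):
    # sgn is +1 when <phi_{n,j}, phi_j> >= 0, and -1 otherwise
    signs = np.where(np.sum(phi_emp * phi_true, axis=0) >= 0.0, 1.0, -1.0)
    return phi_true * signs
```

After alignment, ∥ϕ_{n,j} − ϕ′_{n,j}∥ measures genuine estimation error rather than an arbitrary sign flip.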

An upper bound for ∥c∥_{B×B} is now obtained.

###### Lemma 3.5

Under Assumption A5, the following inequality holds:

 ∥c∥_{B×B} = sup_{n,m ≥ 1} |C(F_n)(F_m)| ≤ N ∥C∥_{L(˜H)},

where N has been introduced in equation (18), L(˜H) denotes the space of bounded linear operators on ˜H, and ∥·∥_{L(˜H)} the usual uniform norm on such a space.

Let us consider the following notation.

 c =_{˜H⊗˜H} Σ_{j=1}^∞ C_j ϕ′_{n,j} ⊗ ϕ′_{n,j} =_{˜H⊗˜H} Σ_{j=1}^∞ C_j ϕ_j ⊗ ϕ_j, c_n =_{˜H⊗˜H} Σ_{j=1}^∞ C_{n,j} ϕ_{n,j} ⊗ ϕ_{n,j},
 c − c_n =_{˜H⊗˜H} Σ_{j=1}^∞ C_j ϕ′_{n,j} ⊗ ϕ′_{n,j} − Σ_{j=1}^∞ C_{n,j} ϕ_{n,j} ⊗ ϕ_{n,j}. (21)

###### Remark 3.4

From Lemma 3.2, for n sufficiently large, there exist positive constants K₁ and K₂ such that, for every x ∈ H(X),

 (1/K₁)⟨C^{−1}(x), x⟩_˜H ≥ ⟨C_n^{−1}(x), x⟩_˜H ≥ (1/K₂)⟨C^{−1}(x), x⟩_˜H
 ⇔ (1/K₁)∥x∥²_{H(X)} ≥ ⟨C_n^{−1}(x), x⟩_˜H ≥ (1/K₂)∥x∥²_{H(X)}. (22)

Equation (22) means that, for n sufficiently large, the norm of the RKHS H(X) of X is equivalent to the norm of the RKHS generated by C_n, with spectral kernel c_n given in (21).

###### Lemma 3.6

Under Assumptions A1 and A4–A5, let us consider Λ_{k_n} in (10) satisfying

 √k_n Λ_{k_n} = o(√(n/ln(n))), n → ∞, (23)

where k_n has been introduced in Assumption A2. The following a.s. inequality then holds:

 ∥c − c_n∥_{B×B} ≤ max(N, √N)[∥C − C_n∥_{L(˜H)} + 2 max(√∥C∥_{L(˜H)}, √∥C_n∥_{L(˜H)}) [sup_{l≥1} sup_{m≥1} |F_l(ϕ′_{n,m})|] × √(8 k_n Λ²_{k_n} ∥C_n − C∥²_{L(˜H)} + Σ_{m=k_n+1}^∞ ∥ϕ_{n,m} − ϕ′_{n,m}∥²_˜H)].

Therefore, under (23), ∥c − c_n∥_{B×B} → 0 a.s., as n → ∞.

###### Lemma 3.7

For a standard ARB(1) process satisfying equation (1), under Assumptions A1 and A3–A5, for n sufficiently large,

 sup_{1 ≤ j ≤ k_n} ∥ϕ_{n,j} − ϕ′_{n,j}∥_B ≤ (2/C_{k_n})[max(N, √N)[∥C − C_n∥_{L(˜H)} + 2 max(√∥C∥_{L(˜H)}, √∥C_n∥_{L(˜H)}) (sup_{l≥1} sup_{m≥1} |F_l(ϕ′_{n,m})|) × √(8 k_n Λ²_{k_n} ∥C_n − C∥²_{L(˜H)} + Σ_{m=k_n+1}^∞ ∥ϕ_{n,m} − ϕ′_{n,m}∥²_˜H)] + sup_{1 ≤ j ≤ k_n} ∥ϕ_{n,j} − ϕ′_{n,j}∥_˜H N ∥C∥_{S(˜H)} + V ∥C − C_n∥_{S(˜H)}] a.s. (24)

Under (23),

 sup_{1 ≤ j ≤ k_n} ∥ϕ_{n,j} − ϕ′_{n,j}∥_B → 0 a.s., n → ∞.

###### Lemma 3.8

Under Assumption A3, if

 Σ_{j=1}^{k_n} ∥ϕ_{n,j} − ϕ′_{n,j}∥_B →_{a.s.} 0, n → ∞,

then

 sup_{x ∈ B; ∥x∥_B ≤ 1} ∥ρ(x) − Σ_{j=1}^{k_n} ⟨ρ(x), ϕ_{n,j}⟩_˜H ϕ_{n,j}∥_B → 0 a.s., n → ∞. (25)

###### Remark 3.5

Under the conditions of Lemma 3.7, if

then equation (25) holds.

Let us now consider the projection operators

 ˜Π_{k_n}(x) = Σ_{j=1}^{k_n} ⟨x, ϕ_{n,j}⟩_˜H ϕ_{n,j}, Π_{k_n}(x) = Σ_{j=1}^{k_n} ⟨x, ϕ′_{n,j}⟩_˜H ϕ′_{n,j}, x ∈ B ⊂ ˜H. (26)

###### Remark 3.6

Under the conditions of Remark 3.5, let

 ˜Π_{k_n} ρ ˜Π_{k_n} = Σ_{j=1}^{k_n} Σ_{p=1}^{k_n} ⟨ρ(ϕ_{n,j}), ϕ_{n,p}⟩_˜H ϕ_{n,j} ⊗ ϕ_{n,p},

then

 sup_{x ∈ B; ∥x∥_B ≤ 1} ∥ρ(x) − Σ_{j=1}^{k_n} Σ_{p=1}^{k_n} ⟨x, ϕ_{n,j}⟩_˜H ⟨ρ(ϕ_{n,j}), ϕ_{n,p}⟩_˜H ϕ_{n,p}∥_B → 0 a.s., n → ∞.
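In the discretized sketch, the projected componentwise approximation above suggests a plug-in predictor: regularize C_n^{−1} by keeping the first k empirical eigenpairs and apply the empirical cross-covariance D_n, following the assumed estimator form ρ_n ≈ D_n C_n^{−1} on the span of the leading eigenvectors (since D = ρC). This is an illustrative reading, not the paper's exact construction; `plugin_predict` is a hypothetical helper name.

```python
import numpy as np

# Illustrative plug-in predictor (sketch): X has shape (n, m), one discretized
# function per row; the last row X[-1] plays the role of X_n, and the returned
# vector predicts X_{n+1} via the truncated estimator of rho.
def plugin_predict(X, k):
    n = X.shape[0]
    Cn = X.T @ X / n                    # empirical autocovariance
    Dn = X[1:].T @ X[:-1] / (n - 1)     # empirical cross-covariance
    vals, vecs = np.linalg.eigh(Cn)
    order = np.argsort(vals)[::-1][:k]  # leading k eigenpairs
    V, lam = vecs[:, order], vals[order]
    coords = V.T @ X[-1]                # <X_n, phi_{n,j}>, j = 1..k
    return Dn @ (V @ (coords / lam))    # rho_n(X_n), prediction of X_{n+1}
```

The truncation level k plays the role of k_n in Assumption A2: too large a k divides by near-zero empirical eigenvalues and destabilizes the prediction.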

## 4 Proofs of Lemmas

### Proof of Lemma 3.5

Proof. Applying the Cauchy–Schwarz inequality, for every k, l ≥ 1,

 |C(F_k, F_l)| = |Σ_{j=1}^∞ C_j F_k(ϕ_j) F_l(ϕ_j)| ≤ √(Σ_{j=1}^∞ C_j [F_k(ϕ_j)]² Σ_{p=1}^∞ C_p [F_l(ϕ_p)]²) ≤ sup_{j≥1} |C_j| √(Σ_{j=1}^∞ [F_k(ϕ_j)]² Σ_{p=1}^∞ [F_l(ϕ_p)]²) = sup_{j≥1} |C_j| √(N_k N_l),

where {F_k, k ≥ 1} have been introduced in equation (3), and satisfy (4). Under Assumption A5, from equation (18),

 ∥c∥_{B×B} = sup_{k,l≥1} |C(F_k, F_l)| ≤ sup_{k,l≥1} sup_{j≥1} |C_j| √(N_k N_l) = N sup_{j≥1} |C_j| = N ∥C∥_{L(˜H)}.

### Proof of Lemma 3.6

Proof. Let us first consider the following identities and inequalities:

 |C−Cn(Fk)(Fl)| = ∣∣ ∣∣∞∑j=1CjFk(ϕ′n,j)Fl(ϕ′n,j)−Cn,jF<