# Nearly Unstable Integer-Valued ARCH Process and Unit Root Testing

This paper introduces a Nearly Unstable INteger-valued AutoRegressive Conditional Heteroskedasticity (NU-INARCH) process for dealing with count time series data. It is proved that a proper normalization of the NU-INARCH process endowed with a Skorohod topology weakly converges to a Cox-Ingersoll-Ross diffusion. The asymptotic distribution of the conditional least squares estimator of the correlation parameter is established as a functional of certain stochastic integrals. Numerical experiments based on Monte Carlo simulations are provided to verify the behavior of the asymptotic distribution under finite samples. These simulations reveal that the nearly unstable approach provides satisfactory and better results than those based on the stationarity assumption even when the true process is not that close to non-stationarity. A unit root test is proposed and its Type-I error and power are examined via Monte Carlo simulations. As an illustration, the proposed methodology is applied to the daily number of deaths due to COVID-19 in the United Kingdom.


## 1 Introduction

First-order nearly unstable continuous autoregressive processes have been well explored in the literature, see for example Chan and Wei (1987), Phillips (1987), Chan, Ing and Zhang (2019), and the references therein. In these works, it is assumed that the model approaches the non-stationarity region as the sample size increases. More specifically, a nearly unstable continuous process is defined by

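$$Y^{(n)}_t = \rho_n Y^{(n)}_{t-1} + \eta_t, \quad t \in \mathbb{N},$$

where $\{\eta_t\}$ is a white noise and $\rho_n = 1 - \gamma/n$, for a fixed constant $\gamma$.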

In the past few years, nearly unstable discrete processes have emerged based on the INteger-valued AutoRegressive (INAR) approach (McKenzie, 1985; Al-Osh and Alzaid, 1987). The first attempt on this subject was due to Ispány, Pap and Van Zuijlen (2003). More specifically, a nearly unstable INAR(1) process is defined by

$$X^{(n)}_t = \alpha_n \circ X^{(n)}_{t-1} + \epsilon^{(n)}_t, \quad t \in \mathbb{N},$$

where $\circ$ is the thinning operator proposed by Steutel and van Harn (1979), given by $\alpha \circ X = \sum_{j=1}^{X} \xi_j$ with $\{\xi_j\}$ a sequence of independent and identically distributed (iid) Bernoulli random variables with success probability $\alpha$ (the counting series), for $\alpha \in [0,1]$, and $\{\epsilon^{(n)}_t\}$ is a sequence of iid random variables, independent of the counting series. These authors assumed that $\alpha_n$ approaches 1 (non-stationarity) as $n \to \infty$, as in Chan and Wei (1987) in the continuous context. By assuming that the mean of the innovations is known, the conditional least squares (CLS) estimator of $\alpha_n$ was explored by Ispány, Pap and Van Zuijlen (2003). They showed that, under near non-stationarity and assuming a finite second moment for the innovations, the CLS estimator weakly converges to a normal distribution at the rate $n^{3/2}$. Other related works dealing with nearly unstable INAR (Galton-Watson/branching) processes are due to Wei and Winnicki (1990), Winnicki (1991), Ispány, Pap and Van Zuijlen (2005), Rahimov (2007), Rahimov (2008), Drost, Van Den Akker and Werker (2009), Rahimov (2009), Barczy, Ispány and Pap (2011), Ispány, Körmendi and Pap (2014), Barczy, Ispány and Pap (2014), Guo and Zhang (2014), and Barczy, Körmendi and Pap (2016). Practical situations demonstrating evidence of a nearly unstable INAR model are discussed, for instance, by Hellström (2001).

Another popular way of dealing with count time series data is the class of INteger-valued Generalized AutoRegressive Conditional Heteroskedastic (INGARCH) models by Ferland, Latour and Oraichi (2006), Fokianos, Rahbek and Tjøstheim (2009), Fokianos and Fried (2010), Zhu (2011), Fokianos and Tjøstheim (2011), Zhu (2012), Christou and Fokianos (2015), Gonçalves et al. (2015), Davis and Liu (2016), Silva and Barreto-Souza (2019), Weiß et al. (2020), which constitute in some sense an integer-valued counterpart of the classical GARCH models by Bollerslev (1986). The INGARCH methodology is the focus of this paper. Like the existing literature on nearly unstable continuous and INAR processes, which assumes first-order autoregressive dependence, in this paper we consider the first-order autoregressive version of the INGARCH approach, which is known as INARCH(1) (INteger-valued AutoRegressive Conditional Heteroskedasticity).

Our chief goal in this paper is to introduce a Nearly Unstable INARCH (denoted by NU-INARCH) process for dealing with count time series data. To the best of our knowledge, this is the first time that a nearly unstable count time series model is proposed based on the INARCH approach; all existing nearly unstable discrete processes in the literature consider the INAR approach. We establish the weak convergence of the NU-INARCH process (when properly normalized) endowed with a Skorokhod topology. With this result at hand, we derive the asymptotic distribution of the conditional least squares estimator of the correlation parameter as a functional of certain stochastic integrals. An equally important contribution of this paper is the development of a unit root test (URT) for the INARCH model, for which the asymptotic distribution of the test statistic under the null hypothesis is provided. Note that although URTs are well explored in the continuous case, only sporadic results are available for the discrete case. A few works dealing with this relevant problem, based on the INAR approach, are due to Hellström (2001) and Drost, Van Den Akker and Werker (2009).

The paper is organized as follows. In Section 2, the NU-INARCH model is introduced and a fluctuation theorem is established, which involves the Cox-Ingersoll-Ross diffusion process. The asymptotic distribution of the CLS estimator for the correlation parameter is derived in Section 3 under the nearly unstable and stationarity assumptions. Section 4 provides simulated results about the asymptotic distribution of the CLS estimator under both nearly unstable and stationary approaches and also compares them in terms of confidence interval coverages. A unit root test for the INARCH process is proposed in Section 5 and its performance is evaluated via Monte Carlo simulations. An empirical application about the daily number of deaths due to COVID-19 in the United Kingdom, which exhibits a nearly unstable/non-stationary behavior, is provided in Section 6. Concluding remarks and future research are addressed in Section 7.

## 2 Model and the Fluctuation Theorem

In this section, we define the nearly unstable INARCH process and obtain its weak convergence (under a proper normalization) in the space of the non-negative càdlàg functions endowed with the Skorokhod topology.

###### Definition 2.1.

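We say that a sequence $\{X^{(n)}_t\}_{t \geq 0}$ is a first-order nearly unstable integer-valued ARCH process (in short NU-INARCH) if

$$X^{(n)}_t \mid \mathcal{F}^{(n)}_{t-1} \sim \text{Poisson}(\lambda^{(n)}_t), \tag{1}$$

$$\lambda^{(n)}_t \equiv E(X^{(n)}_t \mid \mathcal{F}^{(n)}_{t-1}) = \beta + \alpha_n X^{(n)}_{t-1}, \quad t \geq 1, \tag{2}$$

for $n \in \mathbb{N}$, where $\mathcal{F}^{(n)}_{t-1} = \sigma(X^{(n)}_0, \ldots, X^{(n)}_{t-1})$, $\beta > 0$, and $\alpha_n = 1 - \gamma_n/n$, with $\gamma_n \to \gamma$ as $n \to \infty$, and $X^{(n)}_0 = 0$ (constant starting value).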
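As a rough illustration of Definition 2.1, the following minimal sketch simulates one trajectory of the process. It is not the authors' code (the paper's experiments were run in R); it assumes $\alpha_n = 1 - \gamma/n$ with a fixed $\gamma$ (that is, $\gamma_n \equiv \gamma$) and the starting value $X^{(n)}_0 = 0$. The function names `poisson_draw` and `simulate_nu_inarch` are hypothetical, and the Poisson sampler (counting unit-rate exponential arrivals) is chosen only to keep the example free of external libraries.

```python
import random


def poisson_draw(lam, rng):
    """Draw a Poisson(lam) variate by counting the arrivals of a
    unit-rate Poisson process in the interval [0, lam]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_nu_inarch(n, beta, gamma, seed=None):
    """Return [X_0, ..., X_n] with X_0 = 0 and
    X_t | past ~ Poisson(beta + alpha_n * X_{t-1}), alpha_n = 1 - gamma / n.

    Assumption of this sketch: gamma_n is held fixed at gamma.
    """
    alpha_n = 1.0 - gamma / n
    if beta <= 0 or alpha_n < 0:
        raise ValueError("need beta > 0 and alpha_n = 1 - gamma/n >= 0")
    rng = random.Random(seed)
    path = [0]
    for _ in range(n):
        lam = beta + alpha_n * path[-1]  # conditional mean, Equation (2)
        path.append(poisson_draw(lam, rng))
    return path


if __name__ == "__main__":
    trajectory = simulate_nu_inarch(n=200, beta=1.0, gamma=1.0, seed=123)
    print(trajectory[:10], max(trajectory))
```

The sampler costs time proportional to the conditional mean, which is acceptable for moderate $n$; for long series a library Poisson generator would be the more natural choice.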

###### Remark 2.1.

For the nearly unstable INARCH model defined above, we have that , for . The parameterization of $\alpha_n$ in (2) was first proposed by Chan and Wei (1987) and subsequently used in Ispány, Pap and Van Zuijlen (2003).

In the next proposition, we provide the mean, variance, and autocorrelation function of the NU-INARCH process. These results will be important to establish the proper normalization in order to obtain a non-trivial limit for the counting process.

###### Proposition 2.2.

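Let $\{X^{(n)}_t\}$ be a nearly unstable INARCH process. Then, its marginal mean, variance, and autocovariance function are given respectively by

$$E(X^{(n)}_t) = \beta\,\frac{1-\alpha_n^{t}}{1-\alpha_n}, \qquad \mathrm{Var}(X^{(n)}_t) = \frac{\beta}{1-\alpha_n}\left\{\frac{1-\alpha_n^{2t}}{1-\alpha_n^{2}} - \alpha_n^{t}\,\frac{1-\alpha_n^{t}}{1-\alpha_n}\right\},$$

$$\mathrm{cov}(X^{(n)}_{t+k}, X^{(n)}_t) = \alpha_n^{k}\,\mathrm{Var}(X^{(n)}_t), \quad t, k \in \mathbb{N}_0 \equiv \{0, 1, 2, \ldots\}.$$
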
###### Proof.

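We have that $E(X^{(n)}_t) = \beta + \alpha_n E(X^{(n)}_{t-1})$. By using this recursion $t$ times, we obtain the result for the marginal mean. For the variance, it follows that

$$\begin{aligned}
\mathrm{Var}(X^{(n)}_t) &= E(\mathrm{Var}(X^{(n)}_t \mid \mathcal{F}^{(n)}_{t-1})) + \mathrm{Var}(E(X^{(n)}_t \mid \mathcal{F}^{(n)}_{t-1})) = \beta + \alpha_n E(X^{(n)}_{t-1}) + \alpha_n^{2}\,\mathrm{Var}(X^{(n)}_{t-1}) \\
&= \beta\,\frac{1-\alpha_n^{t}}{1-\alpha_n} + \alpha_n^{2}\,\mathrm{Var}(X^{(n)}_{t-1}) = \frac{\beta}{1-\alpha_n}\left\{\sum_{k=0}^{t-1}\alpha_n^{2k} - \alpha_n^{t}\sum_{k=0}^{t-1}\alpha_n^{k}\right\} \\
&= \frac{\beta}{1-\alpha_n}\left\{\frac{1-\alpha_n^{2t}}{1-\alpha_n^{2}} - \alpha_n^{t}\,\frac{1-\alpha_n^{t}}{1-\alpha_n}\right\}.
\end{aligned}$$

Finally, for $k \geq 1$, the autocovariance function becomes

$$\begin{aligned}
\mathrm{cov}(X^{(n)}_{t+k}, X^{(n)}_t) &= E(\mathrm{cov}(X^{(n)}_{t+k}, X^{(n)}_t \mid \mathcal{F}^{(n)}_t)) + \mathrm{cov}(E(X^{(n)}_{t+k} \mid \mathcal{F}^{(n)}_t), E(X^{(n)}_t \mid \mathcal{F}^{(n)}_t)) \\
&= \mathrm{cov}(E(X^{(n)}_{t+k} \mid \mathcal{F}^{(n)}_t), X^{(n)}_t) = \alpha_n\,\mathrm{cov}(E(X^{(n)}_{t+k-1} \mid \mathcal{F}^{(n)}_t), X^{(n)}_t) \\
&= \alpha_n\,\mathrm{cov}(X^{(n)}_{t+k-1}, X^{(n)}_t) = \alpha_n^{k}\,\mathrm{Var}(X^{(n)}_t),
\end{aligned}$$

where we have used in the third equality the fact that $E(X^{(n)}_{t+k} \mid \mathcal{F}^{(n)}_t) = \beta + \alpha_n E(X^{(n)}_{t+k-1} \mid \mathcal{F}^{(n)}_t)$ since $\mathcal{F}^{(n)}_t \subseteq \mathcal{F}^{(n)}_{t+k-1}$ for $k \geq 1$. ∎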
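The closed-form mean and variance in Proposition 2.2 can be checked numerically against the one-step recursions used in the proof. The sketch below is only a sanity check under the assumption $X^{(n)}_0 = 0$; the function names are hypothetical and `alpha` stands for $\alpha_n$.

```python
import math


def mean_closed_form(t, beta, alpha):
    """E(X_t) = beta * (1 - alpha^t) / (1 - alpha), assuming X_0 = 0."""
    return beta * (1.0 - alpha ** t) / (1.0 - alpha)


def var_closed_form(t, beta, alpha):
    """Var(X_t) as stated in Proposition 2.2, assuming X_0 = 0."""
    first = (1.0 - alpha ** (2 * t)) / (1.0 - alpha ** 2)
    second = alpha ** t * (1.0 - alpha ** t) / (1.0 - alpha)
    return beta / (1.0 - alpha) * (first - second)


def moments_by_recursion(t, beta, alpha):
    """Iterate E(X_t) = beta + alpha * E(X_{t-1}) and
    Var(X_t) = E(X_t) + alpha^2 * Var(X_{t-1}), starting from X_0 = 0."""
    mean, var = 0.0, 0.0
    for _ in range(t):
        mean = beta + alpha * mean
        var = mean + alpha ** 2 * var  # beta + alpha*E(X_{t-1}) equals E(X_t)
    return mean, var


if __name__ == "__main__":
    n, beta, gamma = 100, 1.0, 1.0
    alpha_n = 1.0 - gamma / n
    m_rec, v_rec = moments_by_recursion(n, beta, alpha_n)
    print(math.isclose(m_rec, mean_closed_form(n, beta, alpha_n), rel_tol=1e-9))
    print(math.isclose(v_rec, var_closed_form(n, beta, alpha_n), rel_tol=1e-9))
```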

From Proposition 2.2, we have that and . We then define the normalized process and obtain that , for . In the following theorem, we establish the weak convergence of the process as $n \to \infty$. We introduce some notation before presenting such a result. Denote by the space of the non-negative càdlàg (right-continuous with left limits) functions on $[0,\infty)$ and by $C_c^{\infty}[0,\infty)$ the space of infinitely differentiable functions on $[0,\infty)$ having compact supports.

###### Theorem 2.3.

The stochastic process weakly converges in endowed with the Skorokhod topology to a diffusion process given by the solution of the stochastic differential equation

$$dX(t) = (\beta - \gamma X(t))\,dt + \sqrt{X(t)}\,dB(t), \quad t > 0, \tag{3}$$

and $X(0) = 0$, as $n \to \infty$, where $\{B(t)\}_{t \geq 0}$ is a standard Brownian motion.

###### Remark 2.4.

The process appearing in Theorem 2.3, Equation (3), is known in the literature as the Cox-Ingersoll-Ross (CIR) process (Cox, Ingersoll and Ross, 1985).

###### Proof.

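We have that , with ; we here denote and . In particular, almost surely. Note that is a Markov chain assuming values in $E_n$. For $x \in E_n$, define $\tilde{Z}^{(n)}_x$ such that $n\tilde{Z}^{(n)}_x \sim \text{Poisson}(\beta + \alpha_n n x)$. From Theorem 6.5 in Chapter 1 and Corollary 8.9 in Chapter 4 of Ethier and Kurtz (1986), to obtain the desired result, it is enough to show that

$$\lim_{n \to \infty} \sup_{x \in E_n} |\epsilon_n(x)| = 0, \quad h \in C_c^{\infty}[0,\infty), \tag{4}$$

with $\epsilon_n(x) = E\big(n(h(\tilde{Z}^{(n)}_x) - h(x))\big) - (\beta - \gamma x)h'(x) - \tfrac{1}{2} x h''(x)$, where $h'$ and $h''$ denote the first and second derivatives of $h$, respectively.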

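For $\tilde{Z}^{(n)}_x \neq x$, we have that

$$\int_0^1 h''\big(x + v(\tilde{Z}^{(n)}_x - x)\big)\,dv = \frac{h'(\tilde{Z}^{(n)}_x) - h'(x)}{\tilde{Z}^{(n)}_x - x} \tag{5}$$

and

$$\int_0^1 v\,h''\big(x + v(\tilde{Z}^{(n)}_x - x)\big)\,dv = \frac{h'(\tilde{Z}^{(n)}_x)}{\tilde{Z}^{(n)}_x - x} - \frac{h(\tilde{Z}^{(n)}_x) - h(x)}{(\tilde{Z}^{(n)}_x - x)^2}. \tag{6}$$

By combining (5) and (6), we obtain that

$$n\big(h(\tilde{Z}^{(n)}_x) - h(x)\big) = \int_0^1 n(\tilde{Z}^{(n)}_x - x)^2 (1 - v)\,h''\big(x + v(\tilde{Z}^{(n)}_x - x)\big)\,dv + n(\tilde{Z}^{(n)}_x - x)\,h'(x). \tag{7}$$

Note that Equation (7) also holds for $\tilde{Z}^{(n)}_x = x$. Further, we can write

$$-\frac{1}{2} E\big(n(\tilde{Z}^{(n)}_x - x)^2\big)\,h''(x) = E\left(-\int_0^1 n(\tilde{Z}^{(n)}_x - x)^2 (1 - v)\,h''(x)\,dv\right). \tag{8}$$

We now use Equations (7) and (8) to express $\epsilon_n(x)$ as follows:

$$\begin{aligned}
\epsilon_n(x) &= E\left(\int_0^1 n(\tilde{Z}^{(n)}_x - x)^2 (1 - v)\Big\{h''\big(x + v(\tilde{Z}^{(n)}_x - x)\big) - h''(x)\Big\}\,dv\right) \\
&\quad + h'(x)\Big\{E\big(n(\tilde{Z}^{(n)}_x - x)\big) - (\beta - \gamma x)\Big\} + \frac{1}{2} h''(x)\Big\{E\big(n(\tilde{Z}^{(n)}_x - x)^2\big) - x\Big\} \\
&=: \epsilon^{(1)}_n(x) + \epsilon^{(2)}_n(x) + \epsilon^{(3)}_n(x).
\end{aligned} \tag{9}$$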

We will show that $\lim_{n \to \infty} \sup_{x \in E_n} |\epsilon^{(i)}_n(x)| = 0$, for $i = 1, 2, 3$. This result, Equation (9), and the triangle inequality imply that (4) holds and therefore conclude the proof of the theorem.

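To show the case $i = 1$, we argue as in the proof of Theorem 3.1 in Chapter 9 of Ethier and Kurtz (1986). Then, the result follows by showing that $\epsilon^{(1)}_n(x_n) \to 0$ for any convergent sequence $x_n \to x$, where $x = \infty$ is allowed. Without loss of generality, suppose that the support of $h$ is contained in the interval $[0, c]$, for a constant $c > 0$. For and , it follows that and therefore the integral involved in $\epsilon^{(1)}_n$ equals 0 under the region ( for ), that is . Define for , , for , and . Hence, it follows that

$$\begin{aligned}
|\epsilon^{(1)}_n(x_n)| &= \left| E\left(\int_{\omega^*(x)}^1 n(\tilde{Z}^{(n)}_x - x)^2 (1 - v)\Big\{h''\big(x + v(\tilde{Z}^{(n)}_x - x)\big) - h''(x)\Big\}\,dv\right) \right| \\
&\leq E\left(\int_{\omega^*(x)}^1 n(\tilde{Z}^{(n)}_x - x)^2 (1 - v)\,2\|h''\|\,dv\right) = n E\big((\tilde{Z}^{(n)}_x - x)^2\big)\,\|h''\|\,\omega^*(x)^2.
\end{aligned} \tag{10}$$

Further, we have that . Consider , then and . These results give us that the right-hand side of (10) goes to 0 as $n \to \infty$. We obtain the same conclusion when since and , and hence . Suppose now that . We can establish the weak convergence of $\sqrt{n}(\tilde{Z}^{(n)}_{x_n} - x_n)$ via its characteristic function as follows:

$$\begin{aligned}
E\big(\exp\{it\sqrt{n}(\tilde{Z}^{(n)}_{x_n} - x_n)\}\big) &= \exp\big\{-it\sqrt{n}\,x_n + (\beta + \alpha_n n x_n)(e^{it/\sqrt{n}} - 1)\big\} \\
&= \exp\left\{-\frac{it\,x_n \gamma_n}{\sqrt{n}} - \frac{\alpha_n x_n t^2}{2} + O(n^{-3/2})\right\} \longrightarrow \exp\{-x t^2/2\}, \quad t \in \mathbb{R},
\end{aligned}$$

as $n \to \infty$. Therefore, $\sqrt{n}(\tilde{Z}^{(n)}_{x_n} - x_n)$ converges in distribution to a $N(0, x)$ random variable. Hence, the integrand in $\epsilon^{(1)}_n(x_n)$ is bounded above by an integrable random variable. Further, this integrand converges in probability to 0 since . We then apply the Dominated Convergence Theorem to conclude that $\epsilon^{(1)}_n(x_n) \to 0$.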

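For the case $i = 2$, it follows that

$$\sup_{x \in E_n} |\epsilon^{(2)}_n(x)| = \sup_{x \in E_n} x\,|h'(x)|\,|\gamma_n - \gamma| \leq \sup_{x \in E_n} x\,|h'(x)|\,I\{0 \leq x \leq c\}\,|\gamma_n - \gamma| \leq c\,\|h'\|\,|\gamma_n - \gamma| \to 0,$$

as $n \to \infty$. In a similar fashion, for $i = 3$, it can be shown that $\sup_{x \in E_n} |\epsilon^{(3)}_n(x)| \to 0$, which concludes the proof. ∎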

## 3 Conditional Least Squares

In this section, we provide the asymptotic distribution of the conditional least squares estimator of $\alpha_n$ for the nearly unstable INARCH process. The parameter $\beta$ is assumed to be known. It can be seen as a nuisance parameter since our main interest lies in the parameter that controls the dependence in the model. In the empirical illustration, we discuss how to deal with the case where $\beta$ is unknown.

The CLS estimator of $\alpha_n$ is obtained by minimizing the $Q$-function given by $Q_n(\alpha_n) = \sum_{t=2}^{n} (X_t - \beta - \alpha_n X_{t-1})^2$. Hence, we obtain explicitly the CLS estimator of $\alpha_n$, say $\hat{\alpha}_n$, which is given by

$$\hat{\alpha}_n = \frac{\sum_{t=2}^{n} X_{t-1}(X_t - \beta)}{\sum_{t=2}^{n} X_{t-1}^2}. \tag{11}$$
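Equation (11) translates directly into a few lines of code. The sketch below is an illustrative implementation with a hypothetical function name `cls_alpha`; it takes the observed series $X_1, \ldots, X_n$ as a list and treats $\beta$ as known, as assumed in this section.

```python
def cls_alpha(x, beta):
    """Conditional least squares estimate of alpha from Equation (11).

    x    : sequence X_1, ..., X_n of observed counts
    beta : known intercept of the conditional mean
    """
    if len(x) < 2:
        raise ValueError("at least two observations are required")
    # Pairs (X_{t-1}, X_t) for t = 2, ..., n.
    numerator = sum(prev * (cur - beta) for prev, cur in zip(x, x[1:]))
    denominator = sum(prev * prev for prev in x[:-1])
    if denominator == 0:
        raise ValueError("all lagged observations are zero; estimate undefined")
    return numerator / denominator


if __name__ == "__main__":
    # A noiseless series satisfying X_t = 1 + 2 * X_{t-1} recovers alpha = 2.
    print(cls_alpha([1, 3, 7, 15], beta=1.0))
```

The estimate is undefined when every lagged observation is zero, so the sketch raises an error in that case rather than returning a placeholder value.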

We begin by deriving the asymptotic distribution of $\hat{\alpha}_n$ under the stationarity assumption, where we denote the count time series by $\{X_t\}$ (no need for the superscript $(n)$). This case will be contrasted with the nearly unstable INARCH process through simulation in the following section.

###### Theorem 3.1.

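Assume that $X_1, \ldots, X_n$ is a trajectory from a stationary Poisson INARCH(1) model, that is, $\alpha < 1$. Then, the CLS estimator given in (11) satisfies

$$\sqrt{n}\,(\hat{\alpha}_n - \alpha) \stackrel{d}{\longrightarrow} N(0, \tilde{\sigma}^2),$$

as $n \to \infty$, where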

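$$\tilde{\sigma}^2 = \frac{(1-\alpha)(1-\alpha^2)}{(1+\beta(1+\alpha))^2}\left\{1 + \beta(1-\alpha) + \frac{\alpha(2+\beta^{-1})}{1-\alpha} - \frac{\alpha^2(1-\alpha)\beta^{-1}}{1-\alpha^3} + \frac{1+\beta(1+\alpha)}{1-\alpha}\right\}.$$
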
###### Proof.

From Fokianos, Rahbek and Tjøstheim (2009), we have that $\{X_t\}$ is strictly stationary and ergodic since $\alpha < 1$. Hence, we can use Theorem 3.2 from Tjøstheim (1986) to establish the asymptotic normality of the CLS estimator $\hat{\alpha}_n$. The other conditions necessary to obtain this weak convergence can be straightforwardly checked in our case and are therefore omitted. Applying this theorem, we get that the asymptotic variance, say $\tilde{\sigma}^2$, assumes the form , with and . Explicit expressions for the marginal moments of a Poisson INARCH(1) model are given in Weiß (2010). Using these results and the notation considered there with , for , we obtain and . Direct algebraic manipulations conclude the proof. ∎

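From now on, assume that $\{X^{(n)}_t\}$ is a nearly unstable INARCH process as given in Definition 2.1. Define $W^{(n)}_t = X^{(n)}_t - \beta - \alpha_n X^{(n)}_{t-1}$, $X^{(n)}(s) = X^{(n)}_{\lfloor ns \rfloor}$, and $W^{(n)}(s) = \sum_{k=1}^{\lfloor ns \rfloor} W^{(n)}_k$, for $s \in [0,1]$ and $n \in \mathbb{N}$, where $\lfloor x \rfloor$ denotes the integer part of $x$. As in the nearly unstable INAR process of Ispány, Pap and Van Zuijlen (2003), we can express $\hat{\alpha}_n - \alpha_n$ as

$$\hat{\alpha}_n - \alpha_n = \frac{\sum_{t=2}^{n} X^{(n)}_{t-1} W^{(n)}_t}{\sum_{t=2}^{n} (X^{(n)}_{t-1})^2} = \frac{\int_0^1 X^{(n)}(s)\,dW^{(n)}(s)}{n \int_0^1 (X^{(n)}(s))^2\,ds}. \tag{12}$$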

In the following lemma, we provide the asymptotic behavior of the autocovariance function of the process $\{W^{(n)}(s)\}$; note that $E(W^{(n)}(s)) = 0$. This will be important to identify the proper normalization of $\hat{\alpha}_n - \alpha_n$ in (12) yielding a non-trivial weak limit.

###### Lemma 3.2.

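For $s, v \in [0,1]$, we have that $\mathrm{cov}(W^{(n)}(s), W^{(n)}(v)) \approx n^2 C_W(s \wedge v)$, where $C_W(u) = \beta\gamma^{-2}(\gamma u + e^{-\gamma u} - 1)$ for $u \in [0,1]$, $s \wedge v = \min(s, v)$, and $a_n \approx b_n$ denotes that $a_n/b_n \to 1$ for real sequences $\{a_n\}$ and $\{b_n\}$.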

###### Proof.

It is straightforward that and for . Further, , where the last equality follows from the expression of the covariance given in Proposition 2.2. After using the expression of the variance given in that lemma, we obtain that .

From the above results and Proposition 2.2, we obtain that

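$$\begin{aligned}
\mathrm{cov}(W^{(n)}(s), W^{(n)}(v)) &= \sum_{k=1}^{\lfloor ns \rfloor \wedge \lfloor nv \rfloor} \mathrm{Var}(W^{(n)}_k) = \frac{\beta}{1-\alpha_n}\left\{\lfloor ns \rfloor \wedge \lfloor nv \rfloor - \alpha_n\,\frac{1 - \alpha_n^{\lfloor ns \rfloor \wedge \lfloor nv \rfloor}}{1-\alpha_n}\right\} \\
&\approx n^2 \beta\gamma^{-2}(\gamma u + e^{-\gamma u} - 1) = n^2 C_W(s \wedge v), \quad u = s \wedge v.
\end{aligned}$$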
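The approximation in Lemma 3.2 can be inspected numerically by comparing the exact finite-$n$ expression in the display above with $n^2 C_W(s \wedge v)$. The sketch below assumes $\alpha_n = 1 - \gamma/n$ with fixed $\gamma \neq 0$; the function names are hypothetical.

```python
import math


def cov_w_exact(n, s, v, beta, gamma):
    """Finite-n covariance of W^(n)(s) and W^(n)(v): the sum of Var(W_k)
    for k up to min(floor(n*s), floor(n*v)), in closed form.
    Assumes alpha_n = 1 - gamma / n."""
    alpha = 1.0 - gamma / n
    m = min(math.floor(n * s), math.floor(n * v))
    return beta / (1.0 - alpha) * (m - alpha * (1.0 - alpha ** m) / (1.0 - alpha))


def c_w(u, beta, gamma):
    """Limiting kernel C_W(u) = beta * gamma^-2 * (gamma*u + exp(-gamma*u) - 1)."""
    return beta * (gamma * u + math.exp(-gamma * u) - 1.0) / gamma ** 2


if __name__ == "__main__":
    beta, gamma, s, v = 1.0, 2.0, 0.5, 0.8
    for n in (100, 10_000, 1_000_000):
        ratio = cov_w_exact(n, s, v, beta, gamma) / (n ** 2 * c_w(min(s, v), beta, gamma))
        print(n, ratio)
```

The printed ratio should approach 1 as $n$ grows, which is what the relation $\approx$ in the lemma asserts.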

Lemma 3.2 and Theorem 2.3 give us that . We now are able to establish the asymptotic distribution of the CLS estimator under the nearly unstable INARCH process as follows.

###### Theorem 3.3.

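Let $\{X(t)\}$ be the diffusion process given in (3). Then, the CLS estimator $\hat{\alpha}_n$ satisfies the following weak convergence

$$n(\hat{\alpha}_n - \alpha_n) \stackrel{d}{\longrightarrow} \frac{\int_0^1 X(t)\,dW(t)}{\int_0^1 X(t)^2\,dt} = \frac{\int_0^1 X(t)^{3/2}\,dB(t)}{\int_0^1 X(t)^2\,dt}, \tag{13}$$

as $n \to \infty$, where $W(t) = \int_0^t \sqrt{X(s)}\,dB(s)$, for $t \in [0,1]$.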

###### Proof.

Define , for . We have that

$$n(\hat{\alpha}_n - \alpha_n) = \frac{\int_0^1 X^{(n)}(s)\,dW^{(n)}(s)}{\int_0^1 (X^{(n)}(s))^2\,ds} = \frac{\int_0^1 X^{(n)}(s)\,dW^{(n)}(s)}{\int_0^1 (X^{(n)}(s))^2\,ds},$$

where both numerator and denominator have the same order of magnitude .

For , it follows that

and then can be expressed by

Define the functions () and mapping into as and . Hence, it follows that . Using the fact that the CIR process has almost sure continuous trajectories and similar arguments given in the proof of Proposition 4.1 of Ispány, Pap and Van Zuijlen (2003), we obtain that weakly converges to as .

In particular, we have that weakly converges to . From the definition of , we have that and, therefore, . In other words, . The above results and the continuous mapping theorem give us that .

The above arguments are straightforwardly extended to establish the joint weak convergence

$$\left(X^{(n)}(s),\, W^{(n)}(s),\, \int_0^1 (X^{(n)}(u))^2\,du\right) \Rightarrow \left(X(s),\, W(s),\, \int_0^1 (X(u))^2\,du\right)$$

in as . Then, the desired result given in (13) is obtained by applying the continuous mapping theorem. ∎

## 4 Simulated Experiments

In this section, we present simulated results illustrating the behavior of the asymptotic distributions of the normalized CLS estimator under the nearly unstable and stable cases. All the numerical results of this paper were obtained by using the statistical software R (R Development Core Team, 2021). We conduct Monte Carlo simulations with 10000 replications, where we generate Poisson INARCH(1) trajectories with , , and initially a sample size of . Note that the chosen values for here indicate nearly unstable count processes. For each replication, we compute the CLS estimate of using (11) and then its standardized estimate as and according to the nearly unstable (Theorem 3.3) and stable/stationary (Theorem 3.1) cases, respectively.

A generator from the asymptotic distribution given on the right-hand side of (13) was implemented, where the stochastic integrals are approximately evaluated via Riemann-type sums. Hence, for instance, we can obtain its quantiles and also plot the associated density function by generating samples and then applying a non-parametric density estimator (here the Gaussian kernel is considered), which are important for what follows. We present the histograms and qq-plots of the standardized CLS estimates along with their associated asymptotic density/quantiles under the stable and nearly unstable cases in Figures 2 and 2, respectively. From Figure 2, it is evident that the normal approximation is not adequate and that it worsens as $\alpha$ gets closer to 1, which is expected since these results are based on stationarity. On the other hand, the histograms and qq-plots regarding the nearly unstable approximation given in Figure 2 show an excellent agreement between the empirical standardized estimates and the theoretical asymptotic distribution for all scenarios.
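A generator of this kind can be sketched as follows. This is not the authors' R implementation: it assumes an Euler discretization of the CIR equation (3) started at $X(0) = 0$, with negative excursions truncated at zero, and it approximates both integrals in (13) by left-endpoint sums on the same grid. The names `draw_limit` and `sample_limit`, and the choice of 500 grid steps, are illustrative assumptions.

```python
import math
import random


def draw_limit(beta, gamma, steps, rng):
    """One approximate draw from the ratio on the right-hand side of (13).

    The CIR path is advanced with an Euler scheme on a grid of `steps`
    points over [0, 1]; negative values are truncated at zero before use.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    dt = 1.0 / steps
    x = 0.0
    numerator = 0.0    # approximates the integral of X^{3/2} dB
    denominator = 0.0  # approximates the integral of X^2 dt
    for _ in range(steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        x_pos = max(x, 0.0)
        numerator += x_pos ** 1.5 * db
        denominator += x_pos * x_pos * dt
        x = x + (beta - gamma * x_pos) * dt + math.sqrt(x_pos) * db
    return numerator / denominator


def sample_limit(beta, gamma, n_draws, steps=500, seed=None):
    """Return a list of approximate draws from the limiting distribution."""
    rng = random.Random(seed)
    return [draw_limit(beta, gamma, steps, rng) for _ in range(n_draws)]


if __name__ == "__main__":
    draws = sorted(sample_limit(beta=1.0, gamma=1.0, n_draws=2000, seed=42))
    # Empirical 2.5% and 97.5% quantiles of the simulated limit.
    print(draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))])
```

Empirical quantiles obtained this way could then be compared with those of $n(\hat{\alpha}_n - \alpha_n)$ from simulated trajectories; the accuracy depends on the grid size and the number of draws, neither of which is taken from the paper.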

A natural question is what happens when $\alpha$ is not close to 1. To address this point, we run additional simulations with