1 Introduction
Let be a strictly stationary univariate time series. We say that the time series is regularly varying if all its finite dimensional distributions are regularly varying, i.e. for each , there exists a nonzero measure , boundedly finite on the space punctured at zero, such that
(1.1) 
on , as , where means vague convergence. Following [Kal17], we say that a measure defined on a complete separable metric space (endowed with its Borel field) is boundedly finite if for all Borel bounded sets and a sequence of boundedly finite measures is said to converge vaguely to a measure if for all continuous functions with bounded support. See also [HL06] who use the terminology of convergence. Here the metric space considered is endowed with the metric
where is an arbitrary norm on . This metric induces the usual topology and makes the space complete and separable; the bounded sets are the sets separated from zero. Moreover, the space is still locally compact, so this definition essentially yields the same notion as the classical vague convergence, without the need for compactification at infinity.
This assumption implies that there exists such that the measure is homogeneous of degree and the marginal distribution of is regularly varying and satisfies the balanced tail condition: there exists such that
Without loss of generality, we assume that .
If , there are two fundamentally different cases: either the exponent measure is concentrated on the axes or it is not. The former case is referred to as extremal independence and the latter as extremal dependence. In other words, extremal independence means that no two components can be extremely large at the same time, while extremal dependence means that some pairs of components can be simultaneously extremely large.
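Since the displayed formulas are not reproduced above, the distinction can be illustrated with a small Monte Carlo sketch (every numerical choice here is an illustrative assumption, not taken from the paper): for a pair with independent Pareto components, the conditional probability that the second component is extreme given that the first one is extreme vanishes, whereas for a fully dependent pair it equals one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
alpha = 2.0  # hypothetical tail index

# Independent Pareto(alpha) components: extremal independence.
x = rng.pareto(alpha, n) + 1.0
y = rng.pareto(alpha, n) + 1.0

u = np.quantile(x, 0.999)  # high threshold

# Conditional probability that the second component is also extreme.
p_indep = np.mean((x > u) & (y > u)) / np.mean(x > u)

# Fully dependent pair (second component equal to the first):
# extremal dependence.
p_dep = np.mean((x > u) & (x > u)) / np.mean(x > u)

print(p_indep)  # close to 0: joint extremes are rare
print(p_dep)    # exactly 1.0: extremes always coincide
```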
In a time series context, we may want to assess the influence of an extreme event at time zero on future observations. If the finite dimensional distributions of the time series model under consideration are extremally independent or more generally if the vector
is extremally independent for some , then, for any Borel set which is bounded away from zero in and ,(1.2) 
Thus, in the case of extremal independence, the exponent measure provides no information on (most) extreme events occurring after an extreme event at time 0.
In order to obtain a nondegenerate limit in (1.2) and a finer analysis of the sequence of extreme values, it is necessary to change the normalization in (1.1), and possibly the space on which we will assume that vague convergence holds. One idea is to find a sequence of normalizations , such that for each , the conditional distribution of given has a nondegenerate limit. Pursuing the direction opened by [HR07], [DR11], [LRR14] and [KS15], we will consider vague convergence on the set endowed with the metric on defined by
The bounded sets for this metric are those sets such that implies for some . Note that under the present definition of vague convergence, we avoid the pitfalls described in [DJe17].
Assumption 1.1.
There exist scaling functions , and nonzero measures , , boundedly finite on , , such that
(1.3) 
on and for every , the measures and on
are not concentrated on a hyperplane.
This assumption does not exclude regularly varying time series with extremal dependence, for which for all . But our interest will be in extremally independent time series, for which for all . This assumption is fulfilled by many time series, such as stochastic volatility models with heavy tailed noise or heavy tailed volatility, exponential moving averages, and certain Markov chains with regularly varying initial distribution and appropriate conditions on the transition kernel; see [KS15], [MR13] and [JD16]. An important consequence of Assumption 1.1 is that the functions , are regularly varying (see [HR07, Proposition 1] and [KS15]). To put emphasis on the regular variation of the functions , we recall the following definition of [KS15].
Definition 1.2 (Conditional scaling exponent).
Under Assumption 1.1, for , we call the index of regular variation of the functions the (lag ) conditional scaling exponent.
The exponents , reflect the influence of an extreme event at time zero on future lags. Even though we expect this influence to decrease with the lag in the case of extremal independence, these exponents are not necessarily monotone decreasing. The measures also have some important homogeneity properties: For all Borel sets , ,
(1.4) 
Equivalently, for all bounded measurable functions ,
(1.5) 
Cf. [HR07, Proposition 1] and [KS15, Lemma 2.1]. Define the probability measure on by for and a Borel subset of . Let be an -valued random vector with distribution . Then, for all Borel subsets , we have
(1.6) 
See [KS15, Section 2.4]. Let be a Pareto random variable with tail index , independent of . Then, as , (1.7) 
In particular, we define for the distribution function on :
(1.8) 
for all since the distribution of is continuous at all points except possibly 0.
The goal of this paper is to complement the investigation of this assumption started in [KS15] by providing valid statistical procedures to estimate the conditional scaling functions , the conditional limiting distributions and scaling exponents .
2 Statistical inference
Let be the distribution of . All our results will be proved under the following mixing assumptions.
Assumption 2.1.

[(A1)]

The sequence is beta-mixing with rate .

There exist a nondecreasing sequence and nondecreasing sequences of integers and such that
(2.1) (2.2) (2.3)
2.1 Non parametric estimation of the limiting conditional distribution
In order to define an estimator of , we must first consider the infeasible statistic
(2.4) 
Then, Assumption 1.1 and the homogeneity property (1.5) imply that for all and ,
We consider weak convergence of the processes and defined on by
Assumption 2.2.
For all ,
(2.5) 
Assumption 2.3.
There exists such that
(2.6) 
Remark 2.4.
An assumption similar to (2.5) is unavoidable. Its purpose is to prove the convergence of the intrablock variance in the blocking method and tightness. The present one is taken from [KSW15]. Similar ones have been considered in [Roo09], [DR10] and [DSW15]. Some of these conditions have been checked directly for extremally dependent time series like GARCH(1,1) or ARMA models (see e.g. [Dre02]), or for Markov chains that satisfy a drift condition (cf. [KSW15]). This assumption will be checked in Section 3 for some specific models. Assumption 2.3 is unavoidable if one wants to remove bias; this will not be discussed further in the paper. The condition holds for some sequences . Let be the Gaussian process on with covariance , , . We note that
is a standard Brownian motion on . The following theorem establishes weak convergence of the tail empirical process and forms the basis for statistical inference on . Its proof is given in Section 6.2.
Theorem 2.5.
Let be a strictly stationary regularly varying sequence such that Assumption 1.1 holds with extremal independence at all lags. Assume moreover that Assumptions 2.1 and 2.2 hold and that the function is continuous on . Then the process converges weakly in to . If moreover Assumption 2.3 holds, then converges weakly in to .
We now need proxies for the unknown and in order to obtain a feasible statistical procedure. As usual, will be replaced by an order statistic. To estimate the scaling functions we will exploit their representation in terms of conditional means. Therefore, we need additional conditions.
Assumption 2.6.
There exists and such that
(2.7) 
(2.8)  
(2.9)  
(2.10) 
Condition (2.8) requires and implies that the sequence is uniformly integrable conditionally on and therefore,
(2.11) 
Since the function and the limiting distribution are defined up to a scaling constant, we can and will assume without loss of generality that
Condition (2.9) is again unavoidable and must be checked for specific models. Condition (2.10) is a bias condition which will not be further discussed.
Set and let be the order statistics of . Define an estimator of by
(2.12) 
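A feasible estimator along these lines can be sketched in code. Since the displayed formula (2.12) and the surrounding notation are not reproduced above, everything in this sketch is a plausible reading rather than the paper's exact definition: the threshold is taken to be an upper order statistic, the scaling function is approximated by a conditional mean over the exceedances (following the conditional-mean representation mentioned in the text), and the limiting conditional distribution by the empirical distribution of the rescaled lagged observations.

```python
import numpy as np

def conditional_tail_estimator(x, j, k, y_grid):
    """Sketch of a feasible estimator of the limiting conditional
    distribution of X_{i+j} given X_i large (all names hypothetical).

    x      : 1-d array, the observed series
    j      : lag
    k      : number of upper order statistics defining the threshold
    y_grid : increasing grid of points at which the df is evaluated
    """
    n = len(x)
    base = x[: n - j]     # conditioning observations X_1, ..., X_{n-j}
    lagged = x[j:]        # lagged observations X_{1+j}, ..., X_n
    # Random threshold: the (k+1)-th largest of the conditioning values.
    u = np.sort(base)[-(k + 1)]
    exceed = base > u
    # Conditional-mean proxy for the scaling function b_j(u).
    b_hat = lagged[exceed].mean()
    # Empirical conditional df of X_{i+j} / b_hat given X_i > u.
    z = lagged[exceed] / b_hat
    return np.array([np.mean(z <= y) for y in y_grid])
```

By construction the output is a nondecreasing step function with values in [0, 1], evaluated on the supplied grid.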
Corollary 2.7.
Let the assumptions of Theorem 2.5 and Assumption 2.6 hold with extremal independence at all lags. Then
where is a standard Brownian motion and is a standard Brownian bridge on .
Remark 2.8.
The moment conditions in Assumption 2.6 may seem too restrictive. In fact, we can consider a family of estimators , where in (2.12) is replaced with with some . However, we do not pursue this in this paper. Define now the following estimator of :
(2.13) 
The theory for this estimator is easily obtained by applying Corollary 2.7 and the delta method.
Corollary 2.9.
Under the assumptions of Corollary 2.7 and if the function is differentiable, in , where the process is defined by
(2.14) 
where is the standard Brownian bridge.
Remark 2.10.
The additional term in the limiting distribution is due to the method of estimation of the conditional scaling function. Note that the limiting distribution depends only on and can therefore be used for a Kolmogorov–Smirnov type goodness-of-fit test of the conditional distribution.
2.2 Estimation of the conditional scaling exponent
We now consider the estimation of the scaling exponent . We will use the following result.
Lemma 2.11.
This is [KS15, Proposition 2], where the finiteness of is assumed; it is easily seen, however, that this is actually a consequence of (2.15). This is all we need in order to state our results, but in Section 6.1 we will prove a generalized version of Lemma 2.11; see Lemma 6.4. It must be noted that Condition (2.15) does not hold for an i.i.d. sequence. See also Section 3.1.
If (2.15) holds, then the product has tail index . Hence, we suggest the following procedure for estimating the scaling exponent .

Let , where is the tail index of the sequence . Estimate using the Hill estimator based on an intermediate sequence , i.e.

Let be estimated by , the Hill estimator of the tail index of , based on the sequence (assuming without loss of generality that we have observations) and on the same intermediate sequence:

Estimate by
(2.17)
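The three-step procedure above can be sketched in code. The displayed formula (2.17) is not reproduced above, so the exact combination of the two Hill estimates below — based on the reading that the lagged product has its tail index reduced by the factor one plus the scaling exponent — is an assumption, as are all parameter choices.

```python
import numpy as np

def hill(sample, k):
    """Hill estimator of gamma = 1/(tail index), based on the k
    upper order statistics of the sample."""
    xs = np.sort(sample)[::-1]
    return np.mean(np.log(xs[:k])) - np.log(xs[k])

def scaling_exponent(x, j, k):
    """Sketch of the three-step estimator of the lag-j conditional
    scaling exponent kappa_j, under the (assumed) relation that the
    product X_0 * X_j has tail index alpha / (1 + kappa_j)."""
    gamma_hat = hill(x, k)                    # step 1: estimates 1/alpha
    prod = x[:-j] * x[j:]                     # lagged products
    gamma_prod_hat = hill(prod, k)            # step 2: (1 + kappa_j)/alpha
    return gamma_prod_hat / gamma_hat - 1.0   # step 3: kappa_j
```

Note that, as remarked after Lemma 2.11, the underlying condition (2.15) fails for an i.i.d. sequence, so the estimator should not be expected to behave well in that degenerate case.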
Asymptotic normality of the Hill estimator for beta-mixing sequences is well known; see e.g. [Dre00, Dre02]. The asymptotic normality of will follow from the delta method. To state the result, we need additional anti-clustering and second-order conditions.
Assumption 2.12.
For all ,
(2.18) 
Furthermore,
(2.19)  
(2.20) 
Theorem 2.13.
Let be a strictly stationary regularly varying sequence such that Assumption 1.1 holds with extremal independence at all lags. Assume moreover that Assumptions 2.1, 2.2, 2.3 and 2.12 and the bound (2.15) hold, and that is chosen in such a way that
(2.21) 
for some . Then
3 Examples
3.1 Stochastic volatility process
Consider the sequence , , where is a Gaussian process independent of the i.i.d. sequence , which is regularly varying with index . For simplicity we assume that the random variables are nonnegative. We list the relevant properties of (see [DM01], [KS11], [KS15]).

[(i),wide=0pt]

The sequence is regularly varying with extremal independence. It satisfies Assumption 1.1 with for all .

By Breiman’s lemma, as .

By [Bra05, Theorem 5.2a),c)], if the spectral density of the Gaussian sequence is bounded away from zero and if with then ;

Conditionally on the sequence , the equivalence between the tails of and , together with Potter's bounds, yields, for ,
as if (2.3) holds.
In summary, the results in Section 2.1 are applicable to the stochastic volatility model.
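As a sanity check on the properties listed above, a stochastic volatility path is easy to simulate; all parameter values below are illustrative assumptions. By Breiman's lemma the simulated series should inherit the tail index of the noise, which a Hill estimate roughly confirms.

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha, phi = 10**5, 3.0, 0.5   # hypothetical parameters

# Gaussian AR(1) log-volatility, independent of the heavy tailed noise.
eps = 0.5 * rng.standard_normal(n)
logvol = np.zeros(n)
for t in range(1, n):
    logvol[t] = phi * logvol[t - 1] + eps[t]

z = rng.pareto(alpha, n) + 1.0    # i.i.d. Pareto(alpha) noise
x = np.exp(logvol) * z            # stochastic volatility process

# Breiman's lemma: X inherits the tail index alpha of the noise Z.
k = 1000
xs = np.sort(x)[::-1]
gamma_hat = np.mean(np.log(xs[:k])) - np.log(xs[k])
print(1.0 / gamma_hat)  # roughly alpha = 3 (Hill bias can be noticeable)
```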
On the other hand, condition (2.15) does not hold and hence the method of estimating the conditional scaling exponent is not applicable here (note however that the exponent itself is zero).
3.2 Markov chains
As in [KSW15], assume that is a function of a stationary Markov chain , defined on a probability space , with values in a measurable space . That is, there exists a measurable real-valued function such that . Assume moreover that:
Assumption 3.1.

[(i),wide=0pt]

The Markov chain is strictly stationary under .

The sequence is regularly varying with tail index .

The sequence satisfies Assumption 1.1.

There exist a measurable function , , and such that for all ,
(3.1) 
There exist an integer and, for all , a probability measure on and such that, for all and all measurable sets ,

There exist and a constant such that

For every ,
(3.2) where .
In [KSW15] we showed that the above assumptions (without (iii) and with in (3.2)) imply that is beta-mixing with geometric rate and that conditions (2.2), (2.5) and (2.8)–(2.9) are satisfied. Following the calculations in [KSW15], we can argue that (2.8)–(2.9) hold with . Therefore, we conclude the following result.
Corollary 3.2.
Assume that Assumption 3.1 holds and that conditions (2.1), (2.3) and (2.6) are satisfied. Then the conclusion of Theorem 2.5 holds. If also (2.10) is satisfied, then the conclusion of Corollary 2.7 holds. If moreover is differentiable, then the conclusion of Corollary 2.9 holds.
4 Simulations
We simulated from the exponential AR(1) model , , where and are i.i.d. with exponential distribution and the parameter . Hence, , , . In Figure 1 we plot estimates of the tail index of using the Hill estimator, along with the confidence intervals: where is the reciprocal of the Hill estimator based on order statistics. On the same graph we plot the estimates of the tail index for the products, along with the corresponding confidence intervals (left panel). On the right panels we display estimates of the scaling exponent , along with the confidence intervals: where indicates that the estimator of the scaling exponent is based on order statistics. The factor
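Since the model's formulas and parameter values were stripped from the text above, the following simulation sketch uses one standard specification of an exponential AR(1) — the logarithm of the series is an AR(1) with exponential innovations — together with the product-based estimator of Section 2.2. The model form, the tail-index relation used in the last lines, and all parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, a, mu = 2 * 10**5, 0.5, 2.0   # hypothetical parameters

# Exponential AR(1): Y_j = a * Y_{j-1} + eps_j with Exp(mu) innovations,
# and X_j = exp(Y_j).  Then X_j has tail index mu and, heuristically,
# conditional scaling exponent kappa_j = a**j at lag j.
eps = rng.exponential(1.0 / mu, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = a * y[t - 1] + eps[t]
x = np.exp(y)

def hill(sample, k):
    """Hill estimator of gamma = 1/(tail index)."""
    xs = np.sort(sample)[::-1]
    return np.mean(np.log(xs[:k])) - np.log(xs[k])

k = 2000
gamma_hat = hill(x, k)                 # estimates 1/mu = 0.5
gamma_prod = hill(x[:-1] * x[1:], k)   # estimates (1 + a)/mu = 0.75
kappa_hat = gamma_prod / gamma_hat - 1.0
print(kappa_hat)  # should be near a = 0.5
```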