# Hypothesis Test of a Truncated Sample Mean for the Extremely Heavy-Tailed Distributions

This article studies hypothesis testing for extremely heavy-tailed distributions with infinite mean or variance, using a truncated sample mean. We obtain three necessary and sufficient conditions under which the truncated test statistics converge, respectively, to a normal distribution, to a distribution that is neither normal nor stable, or to -∞ or a combination of stable distributions. A numerical simulation illustrates how these theoretical results apply to hypothesis testing.


## 1 Introduction

Many random systems exhibit heavy-tailed distributions, for example changes in cotton prices, financial market returns, magnitudes of earthquakes and floods, internet data and so on ([4]). A random variable is said to have extremely heavy tails if its mean or variance is infinite. Three continuous distributions with infinite mean, namely the Cauchy, Lévy and Pareto distributions, are often used to model extremely heavy tails in stock returns and in transmission line restoration times (see [8], [9], [16]).

When the mean is finite, there are many methods for estimating the mean and using the sample mean

to do the hypothesis testing (see [7], [11], [13-17]), where the sequence is i.i.d. with the common distribution function . Lugosi and Mendelson [12] gave an overview on mean estimation and regression under heavy-tailed distributions.

When the variance is infinite, for example, for some , Bubeck et al. [2] proposed a truncated empirical mean to estimate the mean and gave the convergence rate of , where , , and is the indicator function. Lee et al. [10] further presented a novel robust estimator with the same convergence rate. Avella-Medina et al. [1] considered Huber's M-estimator of which is defined as the solution to , where is the Huber function. They also obtained the same convergence rate of .

However, when the mean or variance is infinite, little research exists on hypothesis testing. Let for , where the index satisfies and is a positive slowly varying function. It is well known (see [3]) that converges in distribution to the stable random variable with the index and the shape parameter 1, where , for and for . Unfortunately, it is difficult to carry out hypothesis testing with the stable distribution, since the density function of the stable random variable has no analytic expression when (see [5]).

This raises a natural question: how can one construct a test statistic that is easy to use for hypothesis testing when the mean or variance is infinite? The main purpose of this paper is to answer this question.

In Section 2, we present two test statistics based on the truncated sample mean. Section 3 gives the three necessary and sufficient conditions under which the truncated sample statistics converge, respectively, to a normal distribution, to a distribution that is neither normal nor stable, or to $-\infty$ or a combination of stable distributions. A relationship between the limiting distribution of the truncated sample statistics and the stable distribution is shown in Section 4. Section 5 presents simulation results on hypothesis testing for three rejection regions. Section 6 provides some concluding remarks. Proofs of the theorems are given in the Appendix.

## 2 Two test statistics with the truncated sample mean

Note that a random variable can be written as the summation of positive and negative parts, that is, . Without loss of generality, we only consider the nonnegative random variables in the following. Let be a sequence of mutually independent nonnegative random variables with extremely heavy-tailed distribution functions . Similar to the paper [1], we present a truncated sample mean in the following

$$
\hat{\mu}(b_n) = n^{-1}\sum_{k=1}^{n} X_k\, I(X_k \le b_n), \tag{1}
$$

where is a positive sequence satisfying as , which we call the truncated sequence. Note that when is replaced with , the resulting truncated sample mean has the same statistical properties except for a constant factor.
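The truncated sample mean in (1) can be sketched in a few lines of Python. The function name `truncated_mean` and the sample values are illustrative only. Note that observations above the truncation level contribute zero to the sum but still count in the divisor $n$, exactly as the indicator in (1) prescribes.

```python
def truncated_mean(xs, b):
    """Truncated sample mean from (1): n^{-1} * sum of X_k * I(X_k <= b)."""
    # Values above b are zeroed out (not dropped), so the divisor stays len(xs).
    return sum(x for x in xs if x <= b) / len(xs)


sample = [0.5, 2.0, 1.5, 400.0]           # hypothetical data with one extreme value
print(truncated_mean(sample, b=10.0))     # (0.5 + 2.0 + 1.5 + 0) / 4 = 1.0
```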

Based on the sample variance and the variance , we define two test statistics

$$
T_n := \frac{n\,[\hat{\mu}(b_n)-\mu_0(b_n)]}{\hat{B}(b_n)}, \qquad
T_n^{o} := \frac{n\,[\hat{\mu}(b_n)-\mu_0(b_n)]}{\sqrt{Var(b_n)}} \tag{2}
$$

for the following hypothesis test:

$$
\text{Null hypothesis } H_0:\ \mu_n=\mu_0(b_n), \qquad \text{alternative hypothesis } H_1:\ \mu_n\neq\mu_0(b_n),
$$

where , , for and .
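To make the two statistics in (2) concrete, here is a minimal Python sketch. The definitions of $\hat{B}(b_n)$ and $Var(b_n)$ are not legible in this copy of the text, so the sketch assumes that $\hat{B}(b_n)$ is the square root of the sum of squared deviations of the truncated observations from $\hat{\mu}(b_n)$, and that $Var(b_n)$ is the variance of the sum of the truncated observations, supplied by the caller as `var_total`. Both are assumptions, as are all function and parameter names.

```python
import math


def truncate(xs, b):
    """Replace every observation above b by zero, i.e. X_k * I(X_k <= b)."""
    return [x if x <= b else 0.0 for x in xs]


def t_statistics(xs, b, mu0, var_total):
    """Return (T_n, T_n^o) in the form of (2).

    mu0       : hypothesised truncated mean mu_0(b_n)
    var_total : ASSUMED meaning of Var(b_n): variance of the truncated sum
    B_hat     : ASSUMED to be sqrt(sum of squared deviations from mu_hat)
    """
    n = len(xs)
    xt = truncate(xs, b)
    mu_hat = sum(xt) / n
    b_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xt))
    numerator = n * (mu_hat - mu0)
    return numerator / b_hat, numerator / math.sqrt(var_total)


tn, ton = t_statistics([1.0, 2.0, 3.0, 100.0], b=10.0, mu0=1.0, var_total=4.0)
print(tn, ton)
```

Under these assumptions $T_n$ is a self-normalised statistic that needs no knowledge of the underlying distribution, whereas $T_n^{o}$ requires the variance to be known.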

This leads to another problem: how should the truncated sequence be chosen so that the distributions of the test statistics and converge to some limiting distribution? We are particularly interested in how to choose the truncated sequence so that and

converge to the normal distribution.

To this end, we first present two conditions. Let be mutually independent non-negative random variables with the extremely heavy-tailed distribution functions satisfying

$$
\text{(I)}\qquad \max_{1\le k\le n}\{1-F_k(b_n)\}\to 0 \quad (n\to\infty)
$$

and there is a series of positive numbers satisfying such that

$$
\text{(II)}\qquad E\big(X_k^{r}(b_n)\big)=(1+o(1))\,d_k(r)\,b_n^{r}\,[1-F_k(b_n)]\to\infty \quad (n\to\infty)
$$

for and .

Many probability density functions satisfy the two conditions (I) and (II), for example the following five.

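$$
\begin{aligned}
f_k(x) &= \frac{c_{k1}\cos(x^{-\alpha_k})}{x^{1+\alpha_k}\,\big(\sin(x^{-\alpha_k})+1\big)}, &
g_k(x) &= c_{k2}\,x^{-(\alpha_k+1)}\cos(x^{-1}),\\
h_k(x) &= c_{k3}\,x^{-\alpha_k}\sin(1/x), &
p_k(x) &= c_{k4}\sin\big(x^{-(\alpha_k+1)}\big), \qquad q_k(x)=c_{k5}\,x^{-(\alpha_k+1)}
\end{aligned}
$$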

for and , where are the normalized positive numbers, and . In fact, for with density functions we have

$$
1-F_k(b_n)=\frac{c_{k1}}{\alpha_k}\log\big(\sin(b_n^{-\alpha_k})+1\big)=(1+o(1))\,\frac{c_{k1}}{\alpha_k}\,b_n^{-\alpha_k}\to 0
$$

uniformly for as and

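$$
E\big(X_k^{r}(b_n)\big)=(1+o(1))\int_{M}^{b_n}x^{r}f_k(x)\,dx=(1+o(1))\,\frac{c_{k1}\,b_n^{r-\alpha_k}}{r-\alpha_k}=(1+o(1))\,\frac{\alpha_k}{r-\alpha_k}\,b_n^{r}\,[1-F_k(b_n)]
$$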

for and large , where . That is, the conditions (I) and (II) hold for $f_k$. Similarly, we can check that the other four kinds of probability densities $g_k$, $h_k$, $p_k$ and $q_k$ also satisfy the two conditions.
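For the pure power-law density $q_k$ the check can be done in closed form, which the following Python sketch illustrates. It assumes, for illustration only, a Pareto law supported on $[1,\infty)$ with $c_{k5}=\alpha$, so that $1-F(b)=b^{-\alpha}$ and, for $r>\alpha$, $E(X^r I(X\le b))=\alpha\,(b^{r-\alpha}-1)/(r-\alpha)$. The constant $d(r)=\alpha/(r-\alpha)$ is the one appearing in the display above; the function names are hypothetical.

```python
def pareto_tail(b, alpha):
    """1 - F(b) for the assumed Pareto law on [1, inf): b^(-alpha)."""
    return b ** (-alpha)


def truncated_moment(b, alpha, r):
    """E(X^r * I(X <= b)) in closed form; requires r > alpha."""
    return alpha * (b ** (r - alpha) - 1.0) / (r - alpha)


def condition_II_ratio(b, alpha, r):
    """Ratio of the truncated moment to d(r) * b^r * (1 - F(b)); should tend to 1."""
    d = alpha / (r - alpha)
    return truncated_moment(b, alpha, r) / (d * b ** r * pareto_tail(b, alpha))


for b in (1e2, 1e4, 1e6):
    # The ratio equals 1 - b^(-(r - alpha)), so it approaches 1 as b grows,
    # while the truncated moment itself diverges.
    print(b, condition_II_ratio(b, alpha=0.5, r=1), truncated_moment(b, 0.5, 1))
```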

Next we consider the case that . For this case, we may have and . In addition to the condition (I), the condition (II) for this case will be replaced by the following condition (II)′

$$
\text{(II)}'\qquad E\big(X_k^{r}(b_n)\big)=(1+o(1))\,d_k(r)\,b_n^{r}\,[1-F_k(b_n)]\to\infty \quad (n\to\infty)
$$

for and , where for satisfying .

For the five kinds of probability densities and above, with , we can similarly check that they all satisfy the conditions (I) and (II)′.

Remark 1. Note that the probability density of the stable random variable can be written as (see [5]) for large , where . We can also check that with and with , satisfy the conditions (I), (II) or (II)′, respectively.

The following lemma shows that the regularly varying distribution functions (see [2]) also satisfy the conditions (I), (II) or (II)′.

Let the regularly varying distribution functions satisfy

$$
1-F_k(x)=x^{-\alpha_k}L(x), \qquad 0<\alpha_k<1 \ \text{ or } \ 1\le\alpha_k<2 \tag{3}
$$

for , where is bounded, , ( or ) and is a positive monotone continuous slowly varying function. Here, slowly varying means that

$$
L(tx)/L(t)\to 1 \tag{4}
$$

for any positive number as . For example, , and their reciprocals, are all slowly varying functions, where are two positive constants.
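Property (4) is easy to observe numerically. The sketch below uses the natural logarithm, a standard example of a slowly varying function chosen here for illustration (the paper's own examples are not legible in this copy), and contrasts it with a power function, for which the ratio does not tend to 1.

```python
import math


def ratio(L, t, x):
    """L(t * x) / L(t), the quantity that must tend to 1 in (4)."""
    return L(t * x) / L(t)


def power(u):
    """u^0.1 is regularly varying with index 0.1, hence NOT slowly varying."""
    return u ** 0.1


for t in (1e2, 1e6, 1e12):
    # For log: ratio = 1 + log(x) / log(t), which decreases to 1 as t grows.
    # For the power function: ratio = x^0.1 for every t.
    print(t, ratio(math.log, t, 10.0), ratio(power, t, 10.0))
```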

Lemma 1. The non-negative random variables with the regularly varying distribution functions satisfying (3) and (4) also satisfy the conditions (I), (II) or (II)′.

## 3 Asymptotic distributions of the two test statistics

### 3.1 Asymptotic distributions of $T_n$

Let

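$$
\xi_n := \frac{\sum_{k=1}^{n}\big(X_k(b_n)-\mu_k(b_n)\big)}{b_n}, \qquad
\eta_n := \frac{\sum_{k=1}^{n}\big(X_k(b_n)-\mu_0(b_n)\big)^2}{b_n^{2}}, \tag{5}
$$

$$
h_n := \sum_{k=1}^{n}\big(1-F_k(b_n)\big), \qquad
h_n(m) := \sum_{k=1}^{n} d_k(m)\big(1-F_k(b_n)\big) \tag{6}
$$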

for and . Let and denote the standard normal distribution and the convergence in distribution, respectively.

The following theorem gives the three sufficient conditions, , and for the asymptotic convergence of .

Theorem 1. Assume that the null hypothesis holds and are mutually independent non-negative random variables with the extremely heavy-tailed distribution functions satisfying the conditions (I) and (II) or (II)′.
(i) If , then ;
(ii) If for some positive number , then, there are two random variables such that and , where both and

have the following characteristic functions

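$$
C_{\xi}(t) = \exp\Big\{h\alpha|t|^{\alpha}\Big(i\,\mathrm{sgn}(t)\int_{0}^{|t|}\frac{\sin x-x}{x^{1+\alpha}}\,dx-\int_{0}^{|t|}\frac{1-\cos x}{x^{1+\alpha}}\,dx\Big)\Big\} \tag{7}
$$

$$
C_{\eta}(t) = \exp\Big\{\frac{h\alpha}{2}|t|^{\frac{\alpha}{2}}\Big(i\,\mathrm{sgn}(t)\int_{0}^{|t|}\frac{\sin x}{x^{1+\frac{\alpha}{2}}}\,dx-\int_{0}^{|t|}\frac{1-\cos x}{x^{1+\frac{\alpha}{2}}}\,dx\Big)\Big\} \tag{8}
$$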

respectively for .
(iii) If and , then in probability;
(iv) If and , then, and , where both and are stable random variables with the shape parameters and , respectively.

Remark 2. It can be seen from the proof of (i) of Theorem 1 that the result in (i) still holds under the following weaker conditions: there is a positive sequence with

 0

satisfies the conditions (II) or (II)′ for or , respectively. The assumption that ( or ) satisfies the conditions (II) or (II)′ allows us to obtain explicit expressions for the four characteristic functions of the random variables , and .

Remark 3. The distribution functions of and are not symmetric, and neither are those of and . In other words, the distributions of both and are not the standard normal, . Moreover, have continuous probability density functions since the absolute values of the characteristic functions of are integrable.

It is clear that or mean that the truncated sequence approaches infinity slowly or goes to infinity quickly, respectively.

As an application of (i) of Theorem 1, we can get the confidence interval for

in the following

$$
P\Big(\mu_0(b_n)\in\Big[\hat{\mu}(b_n)-\frac{x\hat{B}(b_n)}{n},\ \hat{\mu}(b_n)+\frac{x\hat{B}(b_n)}{n}\Big]\Big)\approx 2\Phi(x)-1 \quad (x>0) \tag{9}
$$

for large with the probability , where is the standard normal distribution. It can be seen that if with and , then the mean is closer to the true mean than the mean .
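The interval in (9) can be computed directly once a confidence level is fixed. The sketch below solves $2\Phi(x)-1=\text{level}$ for $x$ with the standard library's `NormalDist`. As before, $\hat{B}(b_n)$ is assumed to be the square root of the sum of squared deviations of the truncated observations, since its definition is not legible in this copy; the function name and data are illustrative.

```python
import math
from statistics import NormalDist


def truncated_mean_ci(xs, b, level=0.95):
    """Approximate confidence interval for mu_0(b_n) in the form of (9)."""
    n = len(xs)
    xt = [x if x <= b else 0.0 for x in xs]             # X_k * I(X_k <= b)
    mu_hat = sum(xt) / n
    # ASSUMPTION: B_hat = sqrt(sum of squared deviations from mu_hat).
    b_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xt))
    x_crit = NormalDist().inv_cdf(0.5 + level / 2.0)    # solves 2*Phi(x) - 1 = level
    half_width = x_crit * b_hat / n
    return mu_hat - half_width, mu_hat + half_width


low, high = truncated_mean_ci([1.0, 2.0, 3.0, 100.0], b=10.0)
print(low, high)
```

The approximation is only justified in the regime where the normal limit of $T_n$ applies, so the truncation level `b` should grow slowly relative to the sample size.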

By Theorem 1 we can further get the following theorem which gives the three necessary and sufficient conditions of asymptotic convergence of .

Theorem 2. Let the conditions of Theorem 1 hold.
(i) if and only if ;
(ii) if and only if .
(iii) Let . in probability if and only if .
(iv) Let . if and only if .
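A small Monte Carlo experiment can illustrate the slowly growing regime. This is a sketch under stated assumptions, not the paper's simulation design: the data are i.i.d. Pareto with $1-F(x)=x^{-\alpha}$ on $[1,\infty)$ and $\alpha\neq 1$, the truncation level is $b_n=n^{\gamma/\alpha}$ with $\gamma<1$ (so that $n(1-F(b_n))=n^{1-\gamma}\to\infty$), and $\hat{B}(b_n)$ is taken to be the square root of the sum of squared deviations. The function estimates how often $|T_n|$ exceeds the normal critical value when the null hypothesis is true.

```python
import math
import random
from statistics import NormalDist


def pareto_sample(n, alpha, rng):
    """Inverse-transform sampling: U^(-1/alpha) with U uniform on (0, 1]."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def mu0(b, alpha):
    """Exact truncated mean E(X * I(X <= b)) for the assumed Pareto law, alpha != 1."""
    return alpha * (b ** (1.0 - alpha) - 1.0) / (1.0 - alpha)


def t_n(xs, b, alpha):
    """Self-normalised statistic T_n with the ASSUMED form of B_hat."""
    n = len(xs)
    xt = [x if x <= b else 0.0 for x in xs]
    mu_hat = sum(xt) / n
    b_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xt))
    return n * (mu_hat - mu0(b, alpha)) / b_hat


def rejection_rate(n, alpha, gamma, reps, level=0.05, seed=0):
    """Fraction of replications with |T_n| above the two-sided normal critical value."""
    rng = random.Random(seed)
    b = n ** (gamma / alpha)
    crit = NormalDist().inv_cdf(1.0 - level / 2.0)
    hits = sum(abs(t_n(pareto_sample(n, alpha, rng), b, alpha)) > crit
               for _ in range(reps))
    return hits / reps


# If the normal approximation is adequate, this should be in the vicinity of 0.05.
print(rejection_rate(n=2000, alpha=0.5, gamma=0.5, reps=200))
```

Because the truncated observations remain strongly skewed, the empirical rate can deviate noticeably from the nominal level at moderate sample sizes.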

Next, we discuss the set of the truncated sequence . Let and define three regions or sets and of the truncated sequence in the following

$$
R_s=\{\{b_n\}: D_n(b_n)\to\infty\}, \qquad R_c=\{\{b_n\}: h_1\le D_n(b_n)\le h_2\} \quad\text{and}\quad R_b=\{\{b_n\}: D_n(b_n)\to 0\},
$$

where are two positive numbers. Let the truncated sequence . We call and a critical truncated sequence and the critical truncated region, respectively, for the following reason. When the truncated sequence , that is, is smaller than the critical truncated sequence , then ; when the truncated sequence , that is, is bigger than the critical truncated sequence , then converges to in probability or ; and for the critical truncated sequence, there is a subsequence such that .

Now we consider an application of Theorem 1 to the distribution functions satisfying (3) and (4). Let be an increasing positive sequence satisfying as . Assume that

$$
(\alpha_k-\alpha)\log n\to 0 \tag{10}
$$

for as . Take the truncated sequence satisfying for , it follows that

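$$
\begin{aligned}
\sum_{k=1}^{n}[1-F_k(c_n)] &= \frac{h}{n}\sum_{k=1}^{n}\Big(\frac{h}{nL(c_n)}\Big)^{\frac{\alpha_k-\alpha}{\alpha}}\\
&= (1+o(1))\,\frac{n-l(n)}{n}\,\frac{h}{n-l(n)}\sum_{k=l(n)}^{n}e^{\frac{(\alpha_k-\alpha)}{\alpha}(\log h-\log n-\log L(c_n))}\\
&= (1+o(1))\,h
\end{aligned}
$$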

for large . This means that is the critical truncated sequence. It is clear that for and for , where is a positive monotone continuous slowly varying function satisfying as .
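In the simplest special case the computation above becomes exact, which the following sketch illustrates. It assumes identical indices $\alpha_k=\alpha$ and $L\equiv 1$, so that $1-F(x)=x^{-\alpha}$, and it reads the defining relation of the critical sequence from the display above as $c_n^{-\alpha}L(c_n)=h/n$, giving $c_n=(n/h)^{1/\alpha}$. These simplifications and the function names are assumptions made for illustration.

```python
def tail(x, alpha):
    """1 - F(x) for the assumed pure power-law tail with L identically 1."""
    return x ** (-alpha)


def critical_sequence(n, alpha, h):
    """c_n solving n * (1 - F(c_n)) = h in the assumed special case."""
    return (n / h) ** (1.0 / alpha)


def h_n(n, b, alpha):
    """Sum over k of (1 - F_k(b)) when all F_k are identical: n * (1 - F(b))."""
    return n * tail(b, alpha)


n, alpha = 10 ** 4, 0.5
print(h_n(n, critical_sequence(n, alpha, h=2.0), alpha))  # stays at h = 2 for every n
print(h_n(n, n ** (0.5 / alpha), alpha))                  # slower sequence: n^0.5, grows
print(h_n(n, n ** (2.0 / alpha), alpha))                  # faster sequence: n^-1, vanishes
```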

By Lemma 1 and Theorem 1 we have the following corollary.

Corollary 1. Let be mutually independent non-negative random variables with the distribution functions satisfying the conditions (3), (4) and (10).
(i) If the truncated sequence satisfies , then .
(ii) For the critical truncated sequence , .
(iii) Let . If satisfies , then converges to in probability.
(iv) Let . If satisfies , then .

### 3.2 Asymptotic distributions of $T_n^{o}$

The following theorem gives three necessary and sufficient conditions of asymptotic convergence of .

Theorem 3. Under the conditions of Theorem 1, we have
(i) if and only if ;
(ii) if and only if , where the characteristic function of random variable has the following form

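$$
C_{\alpha,h}(t)=\exp\Big\{h\alpha\sigma^{-\alpha}|t|^{\alpha}\Big(i\,\mathrm{sgn}(t)\int_{0}^{|t|/\sqrt{h}\sigma}\frac{\sin x-x}{x^{1+\alpha}}\,dx-\int_{0}^{|t|/\sqrt{h}\sigma}\frac{1-\cos x}{x^{1+\alpha}}\,dx\Big)\Big\} \tag{11}
$$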

for and ;
(iii) converges to in probability if and only if .

It can be seen that the distribution of is neither normal nor stable. We therefore call the distribution of the NNS-distribution.

Remark 4. Similar to (iv) of Theorem 1 we can check that as for . In other words, although the random variable is not a stable random variable, its limit in this sense is.

Remark 5. From (iii)-(iv) of Theorem 1 and (iii) of Theorem 3 it follows that the asymptotic statistical properties of and are completely different. The reason is that in probability when . In fact, for small and , by the condition (II) we have

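$$
\begin{aligned}
P\Big(\frac{T_n^{o}}{T_n}\ge\varepsilon\Big) &\le \frac{E(\hat{B}(b_n))}{\varepsilon\sqrt{Var(b_n)}}\le\frac{\sum_{k=1}^{n}E|X_k(b_n)-\hat{\mu}(b_n)|}{\varepsilon\sqrt{Var(b_n)}}\\
&\le \frac{2\sum_{k=1}^{n}E(X_k(b_n))/b_n}{\varepsilon\sqrt{Var(b_n)/b_n^{2}}}\le O(\sqrt{h_n})\to 0
\end{aligned}
$$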

as , where . Note that . For , it follows from (iv) of Theorem 1 that

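$$
\begin{aligned}
P\Big(\frac{T_n^{o}}{T_n}\ge\varepsilon\Big) &= P\Big(\frac{\eta_n-\xi_n^{2}/n}{Var(b_n)/b_n^{2}}\ge\varepsilon^{2}\Big)\\
&= P\Big(\frac{h_n^{2/\alpha}\big[\eta_n h_n^{-2/\alpha}-(\xi_n h_n^{-1/\alpha})^{2}/n\big]}{O(h_n)}\ge\varepsilon^{2}\Big)\\
&= P\Big((1+o(1))\,h_n^{\frac{2-\alpha}{\alpha}}\big[\eta_n h_n^{-2/\alpha}-(\xi_n h_n^{-1/\alpha})^{2}/n\big]\ge\varepsilon^{2}\Big)\to 0
\end{aligned}
$$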

as since .

Next we consider the general truncated sequence satisfying for every as . Similar to we can define the test statistics for in the following

$$
T_n^{o}(b_{nk})=\frac{n}{\sqrt{Var(b_{nk})}}\big(\hat{\mu}(b_{nk})-\mu_0(b_{nk})\big), \tag{12}
$$

where , , for and .

Let be continuous with and for . Hence, and , where . Similar to Theorem 3, we can get the following Theorem 4, which shows that is a critical truncated sequence.

Theorem 4. Let be mutually independent non-negative random variables with the continuous distribution functions satisfying (3), (4) and (10). Let the variance be known and be a positive slowly varying function satisfying as . Under the null hypothesis , we have that
(i) if , then, ;
(ii) if , then, converges to in probability;
(iii) if , then, .

By (iii) of Theorem 4 we see that for any large number , there is a subsequence such that as , where .

## 4 The relationship between $T_{\alpha,1}$ and $S_{\alpha,1}$

By Remark 4, we know that there is a relationship between and for . In this section, we will show another relationship between and .

Let be i.i.d. positive random variables with the heavy-tailed continuous distribution function satisfying . We have or and therefore, , where . It is well-known that (see [3])

$$
S_{n,\alpha,1}:=c_n^{-1}\sum_{k=1}^{n}(X_k-\mu_n)\Rightarrow S_{\alpha,1}
$$

as , where for and for .

By the critical truncated sequence , we may decompose into two parts, , where and are defined in the following

$$
U_n=c_n^{-1}\sum_{k=1}^{n}\big[X_k I(X_k\le c_n)-\mu_n\big], \qquad V_n=c_n^{-1}\sum_{k=1}^{n}X_k I(X_k>c_n) \tag{13}
$$

for . In fact, and can be regarded as the sum of the lower parts of and the sum of the upper parts of , respectively.
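The decomposition in (13) is a purely algebraic identity, which can be checked on any data set. In the sketch below the centring constant `mu` and the truncation level `c` are passed in as plain numbers, because their definitions are not legible in this copy; the function name and the sample values are illustrative.

```python
def decompose(xs, c, mu):
    """Return (U_n, V_n, S_n) in the form of (13) for given c_n = c and mu_n = mu."""
    u = sum((x if x <= c else 0.0) - mu for x in xs) / c   # lower parts, centred
    v = sum(x for x in xs if x > c) / c                    # upper parts, not centred
    s = sum(x - mu for x in xs) / c                        # full normalised sum
    return u, v, s


u, v, s = decompose([1.0, 2.0, 8.0, 40.0], c=4.0, mu=0.5)
print(u, v, s)   # 0.25 12.0 12.25, so U_n + V_n equals S_n
```

Each observation falls into exactly one of the two parts, and the centring is carried entirely by $U_n$, which is why the two sums add up to the full normalised sum.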

The following theorem shows that the limiting distribution of is the distribution of and the limiting distribution of is the distribution of linear combination of and .

Theorem 5. Suppose are i.i.d. positive random variables with the heavy-tailed continuous distribution function satisfying . Then
(i)

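$$
U_n\Rightarrow\sqrt{\frac{\alpha}{2-\alpha}}\,T_{\alpha,1}+\frac{\alpha}{1-\alpha}, \qquad
V_n\Rightarrow S_{\alpha,1}-\sqrt{\frac{\alpha}{2-\alpha}}\,T_{\alpha,1}+\frac{\alpha}{1-\alpha}
$$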

for ;
(ii)

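$$
U_n\Rightarrow\sqrt{\frac{\alpha}{2-\alpha}}\,T_{\alpha,1}, \qquad
V_n\Rightarrow S_{\alpha,1}-\sqrt{\frac{\alpha}{2-\alpha}}\,T_{\alpha,1}
$$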

for .

## 5 Numerical simulations

In this section, as an application of (i) of Theorems 1 and 3, hypothesis testing is studied by numerical simulations based on the asymptotic distribution of and