# On discrimination between classes of distribution tails

We propose a test for distinguishing between two classes of distribution tails that uses only the largest order statistics of the sample, and we establish its consistency. We do not assume that the corresponding distribution functions belong to any maximum domain of attraction.


## 1 Introduction

Let $X_1,\dots,X_n$ be independent identically distributed (i.i.d.) random variables with a continuous distribution function (d.f.) $F$. We set $M_n=\max(X_1,\dots,X_n)$. Let $G$ and $H$ be two distribution functions. We say that the tail of the d.f. $H$ is lighter than the tail of the d.f. $G$ (and the tail of $G$ is heavier than the tail of $H$) if the following condition holds:

$$\frac{1-H(x)}{1-G(x)}\ \to\ 0,\quad x\to+\infty.$$
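For instance, an exponential tail is lighter than any Pareto tail; taking $H(x)=1-e^{-x}$ and $G(x)=1-x^{-2}$, $x>1$, as an illustration,

$$\frac{1-H(x)}{1-G(x)}=\frac{e^{-x}}{x^{-2}}=x^{2}e^{-x}\ \to\ 0,\quad x\to+\infty.$$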

This paper is concerned with the problem of distinguishing between two arbitrary classes of distribution tails, where the tails of the distributions lying in one class are lighter than the tails of the distributions lying in the other. The test of discrimination between classes of distribution tails proposed in this paper is asymptotic; we also establish its consistency. To develop the discrimination test, we consider the auxiliary problem of distinguishing between a simple hypothesis about the distribution tail and two composite alternatives that include almost all distributions with tails lighter or heavier than the distribution tail of the null hypothesis. We emphasize that, unlike the overwhelming majority of works concerned with testing hypotheses about distribution tails, we do not assume that the distribution of the sample satisfies the conditions of the Fisher-Tippett-Gnedenko limit theorem, i.e. belongs to some maximum domain of attraction (see the definitions below).

In statistics, one often encounters the problem of discrimination between close distributions from truncated or censored data, in particular in fields related to insurance, reliability, telecommunications, computer science and earth sciences. The setting in which only the observations over some threshold are known is well studied (see the works [1, 2, 3] and references therein and the book [4]). On the other hand, according to the statistics of extremes (see the book [5]), only the higher order statistics can be used for discrimination of close tails of distributions, whereas moderate sample values can be handled using standard statistical tools.

Gnedenko’s limit theorem (or the extreme value theorem, see [6]), which is the central result in stochastic extreme value theory, states that if there exist sequences of constants $a_n>0$ and $b_n$ such that the d.f. of the normalized maximum $M_n$ tends to some non-degenerate d.f. $G$, i.e.,

$$\lim_{n\to\infty}P(M_n\le a_n x+b_n)=G(x), \tag{1}$$

then there exist constants $\gamma\in\mathbb{R}$, $a>0$ and $b$ such that $G(x)=G_\gamma(ax+b)$, where

$$G_\gamma(x)=\exp\left(-(1+\gamma x)^{-1/\gamma}\right),\quad 1+\gamma x>0,\ \gamma\in\mathbb{R},$$

and for $\gamma=0$ the right-hand side should be understood as $\exp(-e^{-x})$. The parameter $\gamma$ is called the extreme value index [5]. The d.f. of a sample is said to belong to the Fréchet (Weibull, Gumbel, respectively) maximum domain of attraction if (1) holds with $\gamma>0$ ($\gamma<0$, $\gamma=0$, respectively). The distribution functions belonging to the Fréchet and Gumbel maximum domains of attraction are called heavy-tailed and light-tailed distributions, respectively. The distributions with tails heavier than the tails of the distributions belonging to the Fréchet maximum domain of attraction are called distributions with super-heavy tails; these distributions do not belong to any maximum domain of attraction (see [7] for details). To investigate the rates of convergence in Gnedenko’s limit theorem and to solve some other problems of extreme value theory, a second-order extreme value index is considered (see [8]); its detailed investigation is beyond the scope of our paper.

The estimators of the extreme value indices (see [5] for details) can be used in the problem of discrimination between close distribution tails; in this connection we refer to [9, 10, 11, 12, 13, 14, 15], among many others. Another approach is to estimate the distribution tails directly using the higher order statistics (see [16, 17, 18]).

It is clear that the asymptotically normal estimators of the extreme value index can be used in the problem of discrimination between the tails of distributions belonging to the Fréchet and Weibull domains of attraction, respectively. However, this approach does not work for a huge class of distributions, for instance, those belonging to the Gumbel maximum domain of attraction (because in this case $\gamma=0$) or super-heavy-tailed distributions (because for them $\gamma$ is not defined). In this connection we mention the work [19], whose authors propose a test distinguishing between heavy-tailed and super-heavy-tailed distributions.

The Weibull and log-Weibull classes form an important family of distributions from the Gumbel maximum domain of attraction. We say that the d.f. $F$ is a Weibull-type d.f. if there exists $\theta>0$ such that for all $\lambda>0$

$$\lim_{x\to\infty}\frac{\ln(1-F(\lambda x))}{\ln(1-F(x))}=\lambda^{\theta}.$$

The parameter $\theta$ is also called the Weibull tail index. The class of Weibull-type distributions contains, in particular, the normal, exponential and gamma distributions and other ones of great value in statistics. If for some d.f. $F$ the distribution function $x\mapsto F(e^x)$ belongs to the Weibull class with some $\theta>0$, then one says that $F$ belongs to the log-Weibull class of distributions. A method capable of discriminating between close tails of Weibull and log-Weibull distributions was proposed in [20]; it is based on the well-known Hill estimator (see [21] and Section 2 of this work). The estimators of the Weibull tail index (see [22, 23]) can also be applied for discriminating between tails of Weibull-type distributions. The likelihood ratio method applied to the higher order statistics of the sample was used in [24, 25] to develop tests discriminating between the tails of distributions from the Gumbel maximum domain of attraction.
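The defining limit above can be checked numerically. A minimal sketch for the standard normal d.f. (a Weibull-type law with tail index $\theta=2$), computing $\ln(1-\Phi(\lambda x))/\ln(1-\Phi(x))$ via the complementary error function; the helper name is ours:

```python
import math

def log_tail_normal(x):
    """ln(1 - Phi(x)) for the standard normal d.f. Phi."""
    return math.log(0.5 * math.erfc(x / math.sqrt(2.0)))

lam = 2.0
ratios = []
for x in (2.0, 5.0, 10.0, 13.0):
    # the ratio should approach lam**theta = 4 for the normal law (theta = 2)
    ratios.append(log_tail_normal(lam * x) / log_tail_normal(x))

print([round(r, 2) for r in ratios])
```

The ratios increase toward $\lambda^{2}=4$, illustrating the slow, logarithmic-scale convergence typical of Weibull-type tails.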

This paper is, to the best of our knowledge, the first attempt to develop a general test for the problem of discrimination between distribution tails that are not assumed to belong to any maximum domain of attraction. The problem of optimality of this test is natural and will be considered in our future work. The problem of the optimal choice of the number of retained higher order statistics is considered in Section 3.

The paper is organized as follows. In Section 2 we formulate the problem and propose the discrimination test. In Section 3 we illustrate the numerical performance of the test and compare it with some other tests. The results and the proofs are given in Sections 4 and 5.

## 2 The model and the test of discrimination

Let, as above, $X_1,\dots,X_n$ be i.i.d. random variables with a common continuous d.f. $F$. Let $X_{(1)}\le\dots\le X_{(n)}$ be the order statistics pertaining to $X_1,\dots,X_n$. For the purpose of developing the discrimination test consider the Hill-type statistic

$$R_{k,n}=\ln\bigl(1-F_0(X_{(n-k)})\bigr)-\frac{1}{k}\sum_{i=n-k+1}^{n}\ln\bigl(1-F_0(X_{(i)})\bigr),$$

where $F_0$ is a continuous d.f. Note that if $F_0$ is the Pareto d.f., $F_0(x)=1-x^{-1/\gamma}$, $x\ge 1$, $\gamma>0$, then

$$R_{k,n}=\hat\gamma_H/\gamma,$$

where $\hat\gamma_H$ is the Hill estimator of the extreme value index (which in this case is equal to the parameter $\gamma$ of the Pareto distribution, [5]),

$$\hat\gamma_H=\frac{1}{k}\sum_{i=n-k+1}^{n}\ln X_{(i)}-\ln X_{(n-k)};$$

the estimator $\hat\gamma_H$ is consistent for positive values of $\gamma$.
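A minimal sketch of the statistic (function names and parameter values are ours). For a Pareto sample with $F_0$ equal to the true d.f., $R_{k,n}$ coincides with $\hat\gamma_H/\gamma$ and fluctuates around 1:

```python
import math
import random

def r_statistic(sample, k, log_tail):
    """Hill-type statistic R_{k,n}; log_tail(x) must return ln(1 - F0(x))."""
    x = sorted(sample)
    n = len(x)
    top = [log_tail(v) for v in x[n - k:]]        # X_{(n-k+1)}, ..., X_{(n)}
    return log_tail(x[n - k - 1]) - sum(top) / k  # anchored at X_{(n-k)}

def hill_estimator(sample, k):
    """Classical Hill estimator of a positive extreme value index."""
    x = sorted(sample)
    n = len(x)
    return sum(math.log(v) for v in x[n - k:]) / k - math.log(x[n - k - 1])

random.seed(0)
gamma = 0.5
n, k = 5000, 200
# Pareto sample: F(x) = 1 - x**(-1/gamma), x >= 1
sample = [(1.0 - random.random()) ** (-gamma) for _ in range(n)]
log_tail0 = lambda x: -math.log(x) / gamma  # ln(1 - F0(x)) for the same Pareto F0

R = r_statistic(sample, k, log_tail0)
print(round(R, 3), round(hill_estimator(sample, k) / gamma, 3))
```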

###### Definition 1

We say that distribution functions $H$ and $G$ satisfy the condition $B(H,G)$ (the B-condition) if for some $\varepsilon\in(0,1)$ and some $x_0$

$$\frac{(1-H(x))^{1-\varepsilon}}{1-G(x)}\ \text{ is nonincreasing for }\ x>x_0. \tag{2}$$

It is easy to see that under this condition the tail of the d.f. $H$ is lighter than the tail of the d.f. $G$, i.e.

$$\frac{1-H(x)}{1-G(x)}\ \to\ 0\quad\text{as }x\to\infty.$$

For example, if $H(x)=1-e^{-\lambda x}$ and $G(x)=1-e^{-\mu x}$, $x>0$, are two exponential distribution functions with $\lambda>\mu$, then the condition $B(H,G)$ is satisfied for every $\varepsilon\le 1-\mu/\lambda$ and all $x_0>0$. The B-condition is also satisfied for normal distribution functions with different variances. In addition, note that the tails of log-Weibull-type distributions are lighter than the Pareto tails but heavier than the tails of Weibull-type distributions (see [20]), and the B-condition holds between all these classes of distributions.
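For the exponential example, the monotonicity in (2) can be verified directly:

$$\frac{(1-H(x))^{1-\varepsilon}}{1-G(x)}=\frac{e^{-\lambda(1-\varepsilon)x}}{e^{-\mu x}}=e^{-(\lambda(1-\varepsilon)-\mu)x},$$

which is nonincreasing in $x$ exactly when $\lambda(1-\varepsilon)\ge\mu$, i.e. for every $\varepsilon\le 1-\mu/\lambda$.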

Let us formulate the main problem of our paper. Consider two distribution classes such that the tails of the distributions lying in the first class are lighter than the tails of the distributions lying in the second one. Assume that there exists a “separating” d.f. $F_0$ such that the conditions $B(F,F_0)$ and $B(F_0,G)$ are satisfied with some (possibly different) $\varepsilon$ and $x_0$ for all $F$ from the first class and $G$ from the second one. Let us consider in detail some examples where these conditions are fulfilled.

###### Example 1

Let the common distribution of the sample belong to some one-parametric distribution family $\{F_\theta\}$, and suppose that for all $\theta_1<\theta_2$ the distribution functions $F_{\theta_1}$ and $F_{\theta_2}$ satisfy the condition $B(F_{\theta_1},F_{\theta_2})$. Then, taking $F_0=F_{\theta_0}$, we can distinguish between the classes $\{F_\theta:\theta<\theta_0\}$ and $\{F_\theta:\theta>\theta_0\}$.

###### Example 2

A suitable separating function can be selected in the problem of discrimination between distributions with Weibull and log-Weibull tails. Indeed, we can represent a Weibull-type d.f. in the form

$$F_W(x)=1-\exp\{-\exp\{\theta\ln x\,(1+o(1))\}\}\quad\text{as }x\to\infty,$$

whereas a log-Weibull-type d.f. can be represented as

$$F_{LW}(x)=1-\exp\{-\exp\{\theta\ln\ln x\,(1+o(1))\}\}\quad\text{as }x\to\infty,$$

hence the required B-conditions are fulfilled.

###### Example 3

In the problem of discrimination between heavy-tailed distributions (i.e. those belonging to the Fréchet maximum domain of attraction) and super-heavy-tailed distributions (i.e. those with tails heavier than the tail of any heavy-tailed distribution), a separating function cannot be correctly selected, but the problem can be solved if we consider a more particular setting. According to [5], the tail of an arbitrary d.f. $F$ belonging to the Fréchet maximum domain of attraction can be represented as $1-F(x)=x^{-1/\gamma}\ell(x)$, where $\ell$ is a slowly varying function, i.e. such that $\ell(tx)/\ell(x)\to1$ as $x\to\infty$ for every $t>0$, and the parameter $\gamma>0$ coincides with the extreme value index of this distribution. So if one considers the problem of discrimination between super-heavy-tailed distributions and distributions with heavy tails whose extreme value index is less than some parameter $\gamma_0$, then the Pareto d.f. $F_0(x)=1-x^{-1/\gamma_0}$, $x\ge1$, can be selected as a separating function.

Thus, suppose that the two classes of distributions are separable by a function $F_0$. Consider the null hypothesis $\tilde H_0$ that the sample distribution belongs to the first (lighter-tailed) class and the alternative $\tilde H_1$ that it belongs to the second (heavier-tailed) class. We propose the following procedure for testing the hypothesis $\tilde H_0$:

$$\text{if }\ R_{k,n}>1+\frac{u_{1-\alpha}}{\sqrt{k}},\ \text{ then }\tilde H_0\text{ is rejected}. \tag{3}$$

According to the corollary of Theorem 2 (see below), the proposed test is consistent against the alternative $\tilde H_1$ and has asymptotic significance level $\alpha$; here $u_{1-\alpha}$ is the quantile of the corresponding level of the standard normal distribution.

If the tails of the distributions lying in the null class are heavier than the tails of the distributions from the alternative class, and there exists a d.f. $F_0$ such that both conditions $B(G,F_0)$ and $B(F_0,F)$ are satisfied for some (possibly different) $\varepsilon$, $x_0$ and arbitrary distribution functions $F$ from the null class and $G$ from the alternative class, then the test discriminating between the hypothesis $\tilde H_0$ and the alternative $\tilde H_1$ with the same asymptotic properties is the following:

$$\text{if }\ R_{k,n}<1+\frac{u_{\alpha}}{\sqrt{k}},\ \text{ then }\tilde H_0\text{ is rejected}. \tag{4}$$
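The two one-sided decision rules (3) and (4) can be sketched as follows; the function names are ours, and the normal quantiles come from the standard library:

```python
import math
from statistics import NormalDist

def reject_light_null(R, k, alpha=0.05):
    """Test (3): the null class has the lighter tails;
    reject when R_{k,n} exceeds 1 + u_{1-alpha}/sqrt(k)."""
    u = NormalDist().inv_cdf(1.0 - alpha)
    return R > 1.0 + u / math.sqrt(k)

def reject_heavy_null(R, k, alpha=0.05):
    """Test (4): the null class has the heavier tails;
    reject when R_{k,n} falls below 1 + u_alpha/sqrt(k)."""
    u = NormalDist().inv_cdf(alpha)
    return R < 1.0 + u / math.sqrt(k)

# With k = 100 and alpha = 0.05 the critical values are 1 +/- 1.645/10.
print(reject_light_null(1.3, 100), reject_heavy_null(0.8, 100))  # True True
```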

## 3 Simulation study

The aim of the present simulation study is to illustrate the use of the proposed test and to demonstrate its asymptotic properties. First consider the problem of discrimination between the Weibull and log-Weibull classes of distribution tails applying the proposed test (3) with an appropriate separating function $F_0$. As follows from Theorem 2, the statistic $R_{k,n}$ stays below 1 in probability on distributions belonging to the Weibull class and above 1 on distributions belonging to the log-Weibull class, which is confirmed by simulations (see Fig. 1).

Empirical type I error probabilities and empirical power of the test (3) discriminating between the Weibull and log-Weibull classes for different distributions with this separating function are given in Table 1 (the nominal level is $\alpha=0.05$). We include the simulation results for a log-Weibull distribution, a log-normal distribution and the standard exponential distribution. For the standard normal distribution and distributions with lighter tails, the empirical type I error probabilities are significantly smaller for all considered values of $n$ and $k$, so we do not include them in Table 1. In addition, as an example we include the empirical power of the test for the Pareto distribution with the parameter 2 and the standard Cauchy distribution.

Now consider the problem of discrimination between a hypothesis and an alternative formulated in terms of the Weibull tail index $\theta$. Let us compare the test (3) (we select the standard normal d.f. as a separating function) with the test proposed in the work [20] (see in addition [26]). If the sample is drawn from the standard normal distribution, then the limit distributions of both the test statistic of (3) and the test statistic proposed in [20] are standard normal (see the second plot in Fig. 2).

Consider the problem of distinguishing between distributions with heavy and super-heavy tails. As mentioned before, the proposed test cannot be applied to this problem in full generality, since a separating function cannot be selected. But the proposed test is applicable to the more particular problem of discriminating between distributions with heavy tails and a class of logarithmic (super-heavy-tailed) distributions. Let the null class be this class of logarithmic distributions and the alternative class be the class of heavy-tailed distributions, and consider the problem of discrimination between the corresponding hypotheses. We select as a separating function for the test (4) the following function:

$$F_0(x)=1-\exp\left(-\sqrt{\ln x}\right).$$

In Tables 2 and 3 one can find the empirical type I error probabilities and the empirical power of the test (4) and of the test proposed in [19], respectively, for the log-Pareto distribution, another logarithmic distribution specified by its density, the standard Cauchy distribution, the Pareto distribution with the parameter 2 and the standard log-normal distribution. Note the high values of the type I error probabilities of the test proposed in [19] for the second of these logarithmic distributions.

The problem of selecting the number $k$ of retained order statistics for constructing estimators and tests is significant, but it is one of the most complicated in the statistics of extremes (see the book [27] and references therein). In the context of the problem of distinguishing between heavy-tailed and super-heavy-tailed distributions, consider the behavior of the empirical type I error probability and the empirical power of the test (4) with the separating function above as $k$ increases (see Fig. 3). We see that, at the level 0.05, the optimal value of $k$ lies between 50 and 150.
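A Monte Carlo sketch of this kind of experiment (the sample sizes, the log-Pareto null representative with $1-F(x)=(\ln x)^{-2}$ and the Pareto alternative are our own illustrative choices, not the paper's exact setup). To avoid overflow for super-heavy tails we work with $\ln X$ directly, using $\ln(1-F_0(x))=-\sqrt{\ln x}$:

```python
import math
import random

def r_statistic_lnx(lnx_sample, k):
    """R_{k,n} for the separating d.f. F0(x) = 1 - exp(-sqrt(ln x)),
    computed from the logarithms of the observations."""
    s = sorted(lnx_sample)
    n = len(s)
    return sum(math.sqrt(v) for v in s[n - k:]) / k - math.sqrt(s[n - k - 1])

def rejection_rate(ln_sampler, n, k, reps):
    """Empirical frequency with which test (4) rejects the super-heavy null."""
    u_alpha = -1.6449  # alpha-quantile of the standard normal, alpha = 0.05
    hits = 0
    for _ in range(reps):
        lnx = [ln_sampler() for _ in range(n)]
        hits += r_statistic_lnx(lnx, k) < 1.0 + u_alpha / math.sqrt(k)
    return hits / reps

random.seed(1)
n, k, reps = 20_000, 50, 40
# ln X for a log-Pareto law, 1 - F(x) = (ln x)**-2 (null: super-heavy tail)
log_pareto = lambda: (1.0 - random.random()) ** -0.5
# ln X for the Pareto law with parameter 2, 1 - F(x) = x**-2 (heavy alternative)
pareto2 = lambda: -0.5 * math.log(1.0 - random.random())

type1 = rejection_rate(log_pareto, n, k, reps)
power = rejection_rate(pareto2, n, k, reps)
print("empirical type I error:", type1, "empirical power:", power)
```

As in Fig. 3, the type I error stays small on the super-heavy null while the power against the heavy-tailed alternative is close to one; both quantities are sensitive to the choice of $k$.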

## 4 Main results

This section discusses the theoretical properties of the test proposed in Section 2. As before, $X_1,\dots,X_n$ are i.i.d. random variables with the continuous d.f. $F$. Let $\Theta$ be the class of continuous distribution functions $F_1$ satisfying either $B(F_0,F_1)$ or $B(F_1,F_0)$. Consider the simple hypothesis $H_0$ that the tail of $F$ coincides with the tail of $F_0$ (actually we check the hypothesis that $F$ and $F_0$ have the same tail) and the alternative $H_1$ that the tail of $F$ coincides with the tail of some $F_1\in\Theta$. Note that if $F_0$, $F_1$ satisfy either $B(F_0,F_1)$ or $B(F_1,F_0)$ for some $\varepsilon$, then the same holds for all $\varepsilon'\in(0,\varepsilon)$. So set

$$\varepsilon(F_0,F_1)=\max\{\varepsilon:\ F_0,F_1\ \text{satisfy}\ B(F_0,F_1)\ \text{or}\ B(F_1,F_0)\ \text{with this}\ \varepsilon\}.$$

Let $\Theta_\varepsilon(F_0)$ be the class of continuous distribution functions $F_1$ satisfying either $B(F_0,F_1)$ or $B(F_1,F_0)$ with $\varepsilon(F_0,F_1)\ge\varepsilon$, and consider another alternative hypothesis $H_1^\varepsilon$ that the tail of $F$ coincides with the tail of some $F_1\in\Theta_\varepsilon(F_0)$.

Let us show that the conditional distribution of the statistic $R_{k,n}$ differs depending on whether $H_0$ or $H_1$ holds, which makes it possible to propose a test discriminating between these hypotheses. The following results discuss the asymptotic behavior of the statistic $R_{k,n}$ as $k,n\to\infty$, $k/n\to0$, if $H_0$ holds.

###### Theorem 1

If $H_0$ holds, then

$$\sqrt{k}\,(R_{k,n}-1)\ \xrightarrow{d}\ \xi\quad\text{as }k,n\to\infty,$$

where $\xi$ is standard normal and $k=k(n)\to\infty$, $k/n\to0$.

The above theorem allows us to propose a test distinguishing the hypotheses $H_0$ and $H_1$ for the tail of the d.f. $F$:

$$\text{if}\ \ R_{k,n}\notin\left(1+\frac{u_{\alpha/2}}{\sqrt{k}},\ 1+\frac{u_{1-\alpha/2}}{\sqrt{k}}\right),\ \ \text{then}\ H_0\ \text{is rejected}, \tag{5}$$

where $u_{\alpha/2}$ and $u_{1-\alpha/2}$ are the quantiles of the corresponding levels of the standard normal distribution. It is easy to see that the test is asymptotic with significance level $\alpha$.

Next, the following result shows the consistency of the proposed test. Suppose that $H_0$ does not hold and the tail of the d.f. $F$ coincides with the tail of some d.f. $F_1\in\Theta$, but not with the tail of $F_0$ (on distinguishing distributions in the remaining cases, see [28, 5]). The consistency of the test (5) is shown in

###### Theorem 2

(i) If $H_1$ holds, then

$$\sqrt{k}\,|R_{k,n}-1|\ \xrightarrow{d}\ +\infty$$

as $k,n\to\infty$, $k/n\to0$.

(ii) If $H_1^\varepsilon$ holds, then under the same conditions

$$\inf_{F_1\in\Theta_\varepsilon(F_0)}\sqrt{k}\,|R_{k,n}-1|\ \xrightarrow{d}\ +\infty.$$

Theorems 1 and 2 justify the correctness of the statistical procedure proposed in (3). Consider two classes of distribution tails. Consider the null hypothesis $\tilde H_0$ that the sample distribution belongs to the first class and the alternative $\tilde H_1$ that it belongs to the second one. Suppose that there exists a d.f. $F_0$ such that the conditions $B(F,F_0)$ and $B(F_0,G)$ are satisfied with some (possibly different) $\varepsilon$ and $x_0$ for arbitrary distribution functions $F$ from the first class and $G$ from the second one.

###### Corollary 1

The procedure (3) testing the null hypothesis $\tilde H_0$ has asymptotic significance level $\alpha$. Moreover, the test is consistent if $\tilde H_1$ holds.

Let us return to the discussion of the test (5). The proposed test allows us to distinguish the tails of two normal distributions with different variances, but we should weaken the B-condition (2) to be able to distinguish, for instance, the tails of two normal distributions with the same variance and different means. Weakening the B-condition, however, implies imposing some conditions on the sequence $k=k(n)$.

###### Definition 2

We say that distribution functions $F$ and $G$ satisfy the condition $C(F,G)$ if for some $\varepsilon>0$ and some $x_0$

$$\frac{(1-F(x))\,\bigl(-\ln(1-F(x))\bigr)^{\varepsilon}}{1-G(x)}\ \text{ is nonincreasing for }\ x>x_0. \tag{6}$$

Let $\Theta'$ be the class of continuous distribution functions $F_1$ satisfying either $C(F_0,F_1)$ or $C(F_1,F_0)$ and the following condition: for some $\delta>0$ and $x_0$

$$1-F_1(x)\le(1-F_0(x))^{\delta},\quad x>x_0. \tag{7}$$

It is easy to check that if $F_0$, $F_1$ satisfy either $C(F_0,F_1)$ or $C(F_1,F_0)$ with some $\varepsilon$, then the same holds for all $\varepsilon'\in(0,\varepsilon)$. Set

$$\varepsilon'(F_0,F_1)=\max\{\varepsilon:\ F_0,F_1\ \text{satisfy}\ C(F_0,F_1)\ \text{or}\ C(F_1,F_0)\ \text{with this}\ \varepsilon\}.$$

Let $\Theta'_\varepsilon(F_0)$ be the class of continuous distribution functions satisfying (7) and either $C(F_0,F_1)$ or $C(F_1,F_0)$ with $\varepsilon'(F_0,F_1)\ge\varepsilon$. As before, consider the simple hypothesis $H_0$ and the two alternatives $H'_1$ and $H'^{\,\varepsilon}_1$ defined via $\Theta'$ and $\Theta'_\varepsilon(F_0)$; suppose in addition that $F_0$ is continuous.

###### Theorem 3

(i) If $H'_1$ holds, then

$$\sqrt{k}\,|R_{k,n}-1|\ \xrightarrow{d}\ +\infty$$

as $k,n\to\infty$ for some sequence $k=k(n)$.

(ii) If $H'^{\,\varepsilon}_1$ holds, then under the same conditions

$$\inf_{F_1\in\Theta'_\varepsilon(F_0)}\sqrt{k}\,|R_{k,n}-1|\ \xrightarrow{d}\ +\infty.$$

## 5 Auxiliary results and proofs

### 5.1 Auxiliary results

Since the statistic $R_{k,n}$ depends only on the higher order statistics of the sample, we cannot directly use the independence of the random variables $X_1,\dots,X_n$. So we consider the conditional distribution of the statistic given $X_{(n-k)}$, with the help of the following result.

###### Lemma [5, 3.4.1]

Let $X_1,\dots,X_n$ be i.i.d. random variables with a common distribution function $F$ and let $X_{(1)}\le\dots\le X_{(n)}$ be the corresponding order statistics. The joint distribution of the set of statistics $\{X_{(n-k+1)},\dots,X_{(n)}\}$ given $X_{(n-k)}=q$ for some $k<n$ agrees with the joint distribution of the set of order statistics of $k$ i.i.d. random variables with the common d.f.

$$F_q(x)=P(X\le x\,|\,X>q)=\frac{F(x)-F(q)}{1-F(q)},\quad x>q.$$

Let us call $F_q(x)$, $x>q$, the tail distribution function associated with the d.f. $F$. Consider two distribution functions $F$ and $G$, and let $\xi_q$ be a random variable with the d.f. $G_q(x)$, $x>q$. We set

$$\eta_q=\ln\left(\frac{1-F(q)}{1-F(\xi_q)}\right).$$

It is easy to check that $\eta_q\ge0$ for all $q$. The crucial point in the proof of Theorem 2 is the investigation of the behavior of the random variable $\eta_q$.
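A quick numerical illustration of the simplest case $F=G$ (item (i) of the proposition below): for the standard exponential law, $\eta_q=\xi_q-q$ by construction, and it should be standard exponential for any $q$. The rejection sampler below is our own shortcut:

```python
import random
from statistics import mean

random.seed(0)
q = 2.0
eta = []
while len(eta) < 100_000:
    xi = random.expovariate(1.0)
    if xi > q:                 # crude rejection sampling from the tail d.f. F_q
        # eta_q = ln((1 - F(q)) / (1 - F(xi))) = xi - q for F(x) = 1 - exp(-x)
        eta.append(xi - q)

print(round(mean(eta), 2))     # the standard exponential mean is 1
```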

###### Proposition 1

Let $F_q$ and $G_q$ be the tail distribution functions associated with $F$ and $G$ respectively. Then:

(i) If $F_q(x)=G_q(x)$ for some $q$ and all $x>q$, then $\eta_q$ is standard exponential.

(ii) $G_q(x)\ge F_q(x)$ for all $x>q$ if and only if $\eta_q$ is stochastically smaller than the standard exponential random variable; $G_q(x)\le F_q(x)$ for all $x>q$ if and only if $\eta_q$ is stochastically greater than the standard exponential random variable.

(iii) $G_q(x)\ge F_q(x)$ for all $x>q\ge x_0$ and some $x_0$ if and only if $\dfrac{1-G(x)}{1-F(x)}$ is nonincreasing for $x>x_0$.

### 5.2 The proof of Proposition 1

(i) Suppose that $F_q(x)=G_q(x)$ for all $x>q$. Then for the d.f. of the random variable $\eta_q$ we have

$$P(\eta_q\le y)=P\left(\ln\frac{1-F(q)}{1-F(\xi_q)}\le y\right)=P\left(\frac{1-F(q)}{1-F(\xi_q)}\le e^{y}\right)=P\bigl(F(\xi_q)\le1-(1-F(q))e^{-y}\bigr)=P\Bigl(\xi_q\le F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\Bigr), \tag{8}$$

where $F^{\leftarrow}$ denotes the generalized inverse of $F$ and $y\ge0$. Besides, for the same values of $y$,

$$P\Bigl(\xi_q\le F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\Bigr)=\frac{F\bigl(F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\bigr)-F(q)}{1-F(q)}=1-e^{-y}.$$

(ii) Suppose that the relation $G_q(x)\ge F_q(x)$ holds for all $x>q$. From (8) and this relation it follows that

$$P(\eta_q\le y)\ \ge\ \frac{F\bigl(F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\bigr)-F(q)}{1-F(q)}=1-e^{-y}, \tag{9}$$

which is the required result. Let us prove the assertion in the opposite direction. Suppose that $\eta_q$ is stochastically smaller than the standard exponential random variable, i.e. $P(\eta_q\le y)\ge1-e^{-y}$ for all $y\ge0$. Using (8), we have

$$P(\eta_q\le y)\ge1-e^{-y}\iff\frac{1-G\bigl(F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\bigr)}{1-G(q)}\le e^{-y}\iff G\bigl(F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\bigr)\ge1-(1-G(q))e^{-y}\iff F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)\ge G^{\leftarrow}\bigl(1-(1-G(q))e^{-y}\bigr).$$

Denote $z_F=F^{\leftarrow}\bigl(1-(1-F(q))e^{-y}\bigr)$ and $z_G=G^{\leftarrow}\bigl(1-(1-G(q))e^{-y}\bigr)$. Since $F$ and $G$ are continuous, we have

$$e^{-y}=\frac{1-G(z_G)}{1-G(q)}=\frac{1-F(z_F)}{1-F(q)}.$$

Next, since $z_G\le z_F$,

$$\frac{1-F(z_F)}{1-F(q)}=\frac{1-G(z_G)}{1-G(q)}\ge\frac{1-G(z_F)}{1-G(q)}.$$

This observation completes the proof, since $z_F$ runs through all points $x>q$ as $y$ varies. The proof of the second assertion of item (ii) is similar.

(iii) We have

$$\frac{G(x)-G(q)}{1-G(q)}\ge\frac{F(x)-F(q)}{1-F(q)}\quad\forall x>q\ge x_0$$
$$\iff\quad\frac{1-G(x)}{1-G(q)}\le\frac{1-F(x)}{1-F(q)}\quad\forall x>q\ge x_0$$
$$\iff\quad\frac{1-G(x)}{1-F(x)}\le\frac{1-G(q)}{1-F(q)}\quad\forall x>q\ge x_0$$
$$\iff\quad\frac{1-G(x)}{1-F(x)}\ \text{is nonincreasing for all}\ x>x_0.$$

### 5.3 The proof of Theorem 1

The proof of Theorem 1 is straightforward: the same steps are used in extreme value theory to prove the consistency of the Hill estimator of the extreme value index (see [5, Lemma 3.2.3] and its proof). Nevertheless, let us give the proof. Under the assumptions of Theorem 1, $F_0(X_1)$ has a uniform distribution on the interval $(0,1)$, hence $-\ln(1-F_0(X_1))$ is standard exponential. From Rényi's representation (see [5]) it follows that

$$\Bigl\{-\ln\bigl(1-F_0(X_{(n-i)})\bigr)+\ln\bigl(1-F_0(X_{(n-k)})\bigr)\Bigr\}_{i=0}^{k-1}\ \overset{d}{=}\ \Bigl\{\sum_{j=i+1}^{k}\frac{E_{n-j+1}}{j}\Bigr\}_{i=0}^{k-1},$$

where $E_1,E_2,\dots$ are independent standard exponential random variables. Therefore, the distribution of the left-hand side does not depend on $F_0$, and

$$\Bigl\{-\ln\bigl(1-F_0(X_{(n-i)})\bigr)+\ln\bigl(1-F_0(X_{(n-k)})\bigr)\Bigr\}_{i=0}^{k-1}\ \overset{d}{=}\ \bigl\{E_{(k-i)}\bigr\}_{i=0}^{k-1},$$

where $E_{(1)}\le\dots\le E_{(k)}$ are the order statistics of $E_1,\dots,E_k$. Averaging over $i$ gives

$$\sqrt{k}\,(R_{k,n}-1)\ \overset{d}{=}\ \sqrt{k}\Bigl(\frac{1}{k}\sum_{i=1}^{k}E_i-1\Bigr),$$

and the result follows from the fact that the characteristic function of the right-hand side tends to the characteristic function of the standard normal distribution as $k\to\infty$.

### 5.4 The proof of Theorem 2

Let us prove item (i) first. The scheme of the proof includes steps similar to the ones used in the works [24, 25]. Consider the asymptotic behavior of the statistic $R_{k,n}$ as $k,n\to\infty$. Denote

$$Y_i=\ln\bigl(1-F_0(q)\bigr)-\ln\bigl(1-F_0(X^{*}_i)\bigr),$$

where $X^{*}_1,\dots,X^{*}_k$ are the i.i.d. random variables introduced in the Lemma with the d.f.

$$F_q(x)=\frac{F_1(x)-F_1(q)}{1-F_1(q)},\quad x>q.$$

Assume that $k,n\to\infty$ and $k/n\to0$. It follows from the Lemma that the joint conditional distribution of the $k$ largest order statistics of the sample given $X_{(n-k)}=q$ equals the joint distribution of the order statistics of the sample $X^{*}_1,\dots,X^{*}_k$. Set

$$Z_j=\ln\bigl(1-F_0(X_{(n-k)})\bigr)-\ln\bigl(1-F_0(X_{(n-j+1)})\bigr),\quad j=1,\dots,k.$$

Next, we see that

$$R_{k,n}=\frac{1}{k}\sum_{i=1}^{k}Z_i.$$

Therefore, the conditional distribution of $R_{k,n}$ given $X_{(n-k)}=q$ agrees with the distribution of the statistic $\frac{1}{k}\sum_{i=1}^{k}Y_i$. Next, the distribution functions $F_0$ and $F_1$ satisfy either the condition $B(F_0,F_1)$ or the condition $B(F_1,F_0)$ under the assumptions of Theorem 2. First suppose that $B(F_0,F_1)$ is satisfied for some $\varepsilon$ and $x_0$. Since $k/n\to0$, we have $X_{(n-k)}\to\infty$ a.s., so we can consider only the case $q>x_0$. Item (iii) of the Proposition implies that

$$\frac{1-F_1(x)}{1-F_1(x_0)}\ \ge\ \frac{(1-F_0(x))^{1-\varepsilon}}{(1-F_0(x_0))^{1-\varepsilon}},\quad x>x_0.$$

Using (9), we have

$$P(Y_1\le x)=1-\frac{1-F_1\bigl(F_0^{\leftarrow}\bigl(1-(1-F_0(q))e^{-x}\bigr)\bigr)}{1-F_1(q)}\ \le\ 1-\frac{\Bigl(1-F_0\bigl(F_0^{\leftarrow}\bigl(1-(1-F_0(q))e^{-x}\bigr)\bigr)\Bigr)^{1-\varepsilon}}{(1-F_0(q))^{1-\varepsilon}}=1-e^{-(1-\varepsilon)x},$$

therefore $Y_1$ is stochastically greater than a random variable $E$ distributed exponentially with parameter $1-\varepsilon$, written $Y_1\succeq E$. Next, let $E_1,E_2,\dots$ be i.i.d. copies of $E$; then

$$\sqrt{k}\Bigl(\frac{1}{k}\sum_{i=1}^{k}Y_i-1\Bigr)\ \succeq\ \sqrt{k}\Bigl(\frac{1}{k}\sum_{i=1}^{k}E_i-1\Bigr). \tag{10}$$

Since (10) holds for every $q>x_0$ and $X_{(n-k)}>x_0$ a.s. for all $n$ large enough, under the conditions of Theorem 2 we have

$$\sqrt{k}\,(R_{k,n}-1)\ \succeq\ \sqrt{k}\Bigl(\frac{1}{k}\sum_{i=1}^{k}E_i-1\Bigr). \tag{11}$$

From the Lindeberg–Feller central limit theorem,

$$(1-\varepsilon)\sqrt{k}\Bigl(\frac{1}{k}\sum_{i=1}^{k}E_i-\frac{1}{1-\varepsilon}\Bigr)\ \xrightarrow{d}\ \xi\sim N(0,1),\quad n\to\infty,$$

hence

$$\sqrt{k}\Bigl(\frac{1}{k}\sum_{i=1}^{k}E_i-1\Bigr)\ \xrightarrow{P}\ +\infty,\quad n\to\infty. \tag{12}$$

Finally, using (11), we have

$$\sqrt{k}\,(R_{k,n}-1)\ \xrightarrow{P}\ +\infty.$$