Divergence measures estimation and its asymptotic normality theory: Discrete case

12/12/2018, by Ba Amadou Diadie, et al.

In this paper we provide the asymptotic theory of the general $\phi$-divergence measures, estimated by plugging in the empirical probability distribution.


1. Introduction

1.1. Motivations

In this paper, we study the convergence of empirical discrete probability distributions supported on a finite set.

Throughout the following, let $\Omega$ be a finite (hence countable) space, which we identify with $\{1, \ldots, K\}$, $K \geq 1$. Probability distributions on $\Omega$ are finite-dimensional vectors

$$p = (p_1, \ldots, p_K), \qquad p_j \geq 0, \qquad \sum_{j=1}^{K} p_j = 1,$$

in $\mathcal{P}$, the class of probability measures on $\Omega$.

A divergence measure on $\mathcal{P}$ is a function

(1.1)
$$\mathcal{D} : \mathcal{P} \times \mathcal{P} \longrightarrow \mathbb{R}_{+}, \qquad (p, q) \mapsto \mathcal{D}(p, q),$$

such that $\mathcal{D}(p, p) = 0$ for any $p$ in the domain of application of $\mathcal{D}$.

The functional $\mathcal{D}$ is not necessarily an application, that is, it may fail to be defined (finite) for some pairs of distributions. And when it is, it is not always symmetric, nor does it have to be a metric. In the absence of symmetry, the following more general notation is more appropriate:

(1.2)
$$\mathcal{D} : \mathcal{P}_1 \times \mathcal{P}_2 \longrightarrow \mathbb{R}_{+}, \qquad (p, q) \mapsto \mathcal{D}(p, q),$$

where $\mathcal{P}_1$ and $\mathcal{P}_2$ are two families of probability distributions on $\Omega$, not necessarily the same. To better explain our concern, let us introduce some of the most celebrated divergence measures.


We may present the following divergence measures: let $\alpha > 0$ with $\alpha \neq 1$, and let $p = (p_1, \ldots, p_K)$ and $q = (q_1, \ldots, q_K)$ be two probability distributions on $\Omega$.

(1) The $L_2^2$-divergence measure:

(1.3)
$$\mathcal{D}_{L_2^2}(p, q) = \sum_{j=1}^{K} \left( p_j - q_j \right)^2.$$

(2) The family of Renyi's divergence measures indexed by $\alpha > 0$, $\alpha \neq 1$, known under the name of Renyi-$\alpha$:

(1.4)
$$\mathcal{D}_{R,\alpha}(p, q) = \frac{1}{\alpha - 1} \log \left( \sum_{j=1}^{K} p_j^{\alpha} q_j^{1-\alpha} \right).$$

(3) The family of Tsallis divergence measures indexed by $\alpha > 0$, $\alpha \neq 1$, also known under the name of Tsallis-$\alpha$:

(1.5)
$$\mathcal{D}_{T,\alpha}(p, q) = \frac{1}{\alpha - 1} \left( \sum_{j=1}^{K} p_j^{\alpha} q_j^{1-\alpha} - 1 \right).$$

(4) The Kullback-Leibler divergence measure:

(1.6)
$$\mathcal{D}_{KL}(p, q) = \sum_{j=1}^{K} p_j \log \left( \frac{p_j}{q_j} \right).$$

The latter, the Kullback-Leibler measure, may be interpreted as a limit case of both the Renyi family and the Tsallis one by letting $\alpha \rightarrow 1$. As well, for $\alpha$ near 1, the Tsallis family may be seen as derived from the Renyi family based on the first-order expansion of the logarithm function in the neighborhood of unity.
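
To make this limit explicit, assuming the standard forms (1.4)-(1.6) written above, a first-order expansion gives

$$\sum_{j=1}^{K} p_j^{\alpha} q_j^{1-\alpha} = \sum_{j=1}^{K} p_j \exp\left( (\alpha - 1) \log \frac{p_j}{q_j} \right) = 1 + (\alpha - 1) \sum_{j=1}^{K} p_j \log \frac{p_j}{q_j} + o(\alpha - 1),$$

so that $\mathcal{D}_{T,\alpha}(p, q) \rightarrow \mathcal{D}_{KL}(p, q)$ as $\alpha \rightarrow 1$, and, since $\log(1 + u) = u + o(u)$ as $u \rightarrow 0$, $\mathcal{D}_{R,\alpha}(p, q) \rightarrow \mathcal{D}_{KL}(p, q)$ as well.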

From this small sample of divergence measures, we may make the following remarks.

(a) The $L_2^2$-divergence measure is both an application and, up to the square root, a metric on $\mathcal{P}$, the class of probability measures on $\Omega$.

(b) For both the Renyi and the Tsallis families, we may have computation problems and lack of symmetry. Let us give examples. It is clear from the very form of these divergence measures that we do not have symmetry, unless in the special case where $\alpha = 1/2$. Both families are built on the following functional:

$$(p, q) \mapsto \sum_{j=1}^{K} p_j^{\alpha} q_j^{1-\alpha}.$$

1.2. Previous work and main contributions

Our main contribution may be summarized as follows: for data sampled from one or two unknown random variables, we derive almost-sure convergence and central limit theorems for the empirical divergences.

1.3. Overview of the paper

2. Distribution limit for empirical divergence

2.1. Notation and definitions

Before we state the main results we need a few definitions. Define the empirical probability distribution $p^{(n)} = (p_1^{(n)}, \ldots, p_K^{(n)})$ generated by i.i.d. random variables $X_1, \ldots, X_n$ from the probability distribution $p$ as

(2.1)
$$p_j^{(n)} = \frac{1}{n} \sum_{i=1}^{n} 1_{\{X_i = j\}}, \qquad j \in \Omega,$$

and $q^{(m)} = (q_1^{(m)}, \ldots, q_K^{(m)})$ is defined in the same way from i.i.d. random variables $Y_1, \ldots, Y_m$ with probability distribution $q$, that is

(2.2)
$$q_j^{(m)} = \frac{1}{m} \sum_{i=1}^{m} 1_{\{Y_i = j\}}, \qquad j \in \Omega.$$
Definition 1.

The $\phi$-divergence between the two probability distributions $p$ and $q$ is given by

(2.3)
$$\mathcal{D}_{\phi}(p, q) = \sum_{j=1}^{K} \phi\left( p_j, q_j \right),$$

where $\phi$ is a measurable function on which we will impose the appropriate conditions.

The results on the functional $\mathcal{D}_{\phi}$ will lead to those on the particular cases of the Renyi, Tsallis, and Kullback-Leibler measures.
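
As a small illustration of this plug-in approach, the following Python sketch (with hypothetical helper names; the divergence forms assume (1.4)-(1.6) and the empirical distributions (2.1)-(2.2) as written above) builds the empirical distributions and evaluates the corresponding empirical divergences:

    import numpy as np

    def empirical_distribution(sample, K):
        """Empirical probabilities p_j^(n) = (1/n) #{i : X_i = j} on Omega = {0, ..., K-1}."""
        return np.bincount(sample, minlength=K) / len(sample)

    def kl_divergence(p, q):
        """Kullback-Leibler divergence, summed over cells where p_j > 0 (assumes q_j > 0 there)."""
        mask = p > 0
        return np.sum(p[mask] * np.log(p[mask] / q[mask]))

    def renyi_divergence(p, q, alpha):
        """Renyi-alpha divergence, alpha > 0, alpha != 1."""
        mask = p > 0
        return np.log(np.sum(p[mask] ** alpha * q[mask] ** (1.0 - alpha))) / (alpha - 1.0)

    def tsallis_divergence(p, q, alpha):
        """Tsallis-alpha divergence, alpha > 0, alpha != 1."""
        mask = p > 0
        return (np.sum(p[mask] ** alpha * q[mask] ** (1.0 - alpha)) - 1.0) / (alpha - 1.0)

    # Plug-in estimation from two samples X ~ p and Y ~ q on K = 5 cells.
    rng = np.random.default_rng(0)
    K = 5
    p_true = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
    q_true = np.full(K, 1.0 / K)
    X = rng.choice(K, size=2000, p=p_true)
    Y = rng.choice(K, size=2000, p=q_true)
    p_n = empirical_distribution(X, K)
    q_m = empirical_distribution(Y, K)
    print(kl_divergence(p_n, q_m), renyi_divergence(p_n, q_m, 2.0), tsallis_divergence(p_n, q_m, 2.0))

The same plug-in scheme applies to any $\phi$ of the form (2.3).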

2.2. Main results

Since, for a fixed $j \in \Omega$, $n\,p_j^{(n)}$ has a binomial distribution with parameters $n$ and success probability $p_j$, we have

(2.4)
$$\mathbb{E}\left( p_j^{(n)} \right) = p_j \qquad \text{and} \qquad \mathrm{Var}\left( p_j^{(n)} \right) = \frac{p_j (1 - p_j)}{n}.$$

Furthermore, by the strong law of large numbers, $p_j^{(n)}$ converges almost surely (and hence in probability) to $p_j$ for every fixed $j \in \Omega$. By the central limit theorem,

(2.5)
$$\sqrt{n} \left( p_j^{(n)} - p_j \right) \rightsquigarrow \mathcal{N}\left( 0, \, p_j (1 - p_j) \right),$$

where we use the symbol $\rightsquigarrow$ to denote convergence in distribution.

Also, for a fixed $j \in \Omega$, we have

(2.6)
$$\sqrt{m} \left( q_j^{(m)} - q_j \right) \rightsquigarrow \mathcal{N}\left( 0, \, q_j (1 - q_j) \right).$$

More generally, since $\left( n p_1^{(n)}, \ldots, n p_K^{(n)} \right)$ is drawn from a multinomial distribution of size $n$ with probabilities $p$, we have (see Lo et al. (2016))

$$\sqrt{n} \left( p^{(n)} - p \right) \rightsquigarrow \mathcal{N}_K\left( 0, \Sigma_p \right),$$

where $\Sigma_p$ is the multinomial covariance matrix given by

$$\Sigma_p(i, j) = p_i \left( \delta_{ij} - p_j \right), \qquad 1 \leq i, j \leq K,$$

with $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise.
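
A quick numerical sanity check of this covariance structure (a sketch in Python; the matrix built below is simply $\mathrm{diag}(p) - p p^{\top}$, i.e. $\Sigma_p(i,j) = p_i(\delta_{ij} - p_j)$):

    import numpy as np

    # Compare Sigma_p = diag(p) - p p^T with the empirical covariance of sqrt(n) (p^(n) - p)
    # over many replicated multinomial samples.
    rng = np.random.default_rng(0)
    p = np.array([0.5, 0.3, 0.2])
    n, n_rep = 1000, 20000
    sigma_p = np.diag(p) - np.outer(p, p)
    props = rng.multinomial(n, p, size=n_rep) / n      # n_rep replicates of p^(n)
    emp_cov = np.cov(np.sqrt(n) * (props - p), rowvar=False)
    print(np.round(sigma_p, 3))
    print(np.round(emp_cov, 3))                        # approximately equal for large n_rep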

3. Asymptotic theory for the $\phi$-divergence measure

3.1. Boundedness assumption and notations

Define

Let

where $\phi$ is a measurable function having continuous second-order partial derivatives, defined as follows:

and

Set


Based on (2.1) and (2.2), we will use the following empirical $\phi$-divergences:

$$\mathcal{D}_{\phi}\left( p^{(n)}, q \right) = \sum_{j=1}^{K} \phi\left( p_j^{(n)}, q_j \right), \qquad \mathcal{D}_{\phi}\left( p, q^{(m)} \right) = \sum_{j=1}^{K} \phi\left( p_j, q_j^{(m)} \right),$$

and

$$\mathcal{D}_{\phi}\left( p^{(n)}, q^{(m)} \right) = \sum_{j=1}^{K} \phi\left( p_j^{(n)}, q_j^{(m)} \right).$$

Set

(3.1)

3.2. Statements of the main results

The first concerns the almost sure efficiency of the estimators.

Theorem 1.

Let $\Omega$ be a finite countable space and let $p^{(n)}$ and $q^{(m)}$ be generated by the i.i.d. samples $X_1, \ldots, X_n \sim p$ and $Y_1, \ldots, Y_m \sim q$. Then the following asymptotic results hold for the empirical $\phi$-divergences.

  • One sample

    (3.2)
    (3.3)
  • Two samples

    (3.4)

where the constants involved are those defined in (3.1).


The second concerns the asymptotic normality of the estimators.

Theorem 2.

Let

Under the same assumptions as in Theorem 1, the following central limit theorems hold for the empirical $\phi$-divergences (a small Monte Carlo illustration of the one-sample case is sketched right after the theorem).

  • One sample: as $n \rightarrow \infty$,

    (3.5)
    (3.6)
  • Two samples: as $n \rightarrow \infty$ and $m \rightarrow \infty$,

    (3.7)
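
The following Python sketch illustrates the one-sample statement for the Kullback-Leibler case. The standardizing variance used here, $\sigma^2 = \mathrm{Var}_{X \sim p}\left[ \log\left( p(X)/q(X) \right) \right]$, is the usual delta-method value and is only assumed to match the variance appearing in (3.5); it is not taken from the theorem.

    import numpy as np

    # Monte Carlo check: sqrt(n) * (D_KL(p^(n), q) - D_KL(p, q)) / sigma should be
    # approximately standard normal for large n, with sigma^2 the delta-method variance.
    rng = np.random.default_rng(1)
    p = np.array([0.4, 0.3, 0.2, 0.1])
    q = np.array([0.25, 0.25, 0.25, 0.25])             # q known and fixed

    log_ratio = np.log(p / q)
    d_true = np.sum(p * log_ratio)                     # D_KL(p, q)
    sigma2 = np.sum(p * log_ratio ** 2) - d_true ** 2  # Var_{X~p}[log(p(X)/q(X))]

    n, n_rep = 5000, 2000
    z = np.empty(n_rep)
    for r in range(n_rep):
        p_n = rng.multinomial(n, p) / n
        mask = p_n > 0
        d_hat = np.sum(p_n[mask] * np.log(p_n[mask] / q[mask]))
        z[r] = np.sqrt(n) * (d_hat - d_true) / np.sqrt(sigma2)

    print(np.mean(z), np.std(z))   # close to 0 and 1 if the CLT holds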

II - Direct extensions.

Quite a few divergence measures are not symmetric. Among these non-symmetric measures are some of the most interesting ones. For such measures, the estimators $\mathcal{D}_{\phi}\left( p^{(n)}, q \right)$, $\mathcal{D}_{\phi}\left( p, q^{(m)} \right)$ and $\mathcal{D}_{\phi}\left( p^{(n)}, q^{(m)} \right)$ are not equal to $\mathcal{D}_{\phi}\left( q, p^{(n)} \right)$, $\mathcal{D}_{\phi}\left( q^{(m)}, p \right)$ and $\mathcal{D}_{\phi}\left( q^{(m)}, p^{(n)} \right)$, respectively.

In one-sided tests, we have to decide whether the hypothesis $p = q$, for $q$ known and fixed, is true, based on data from $p$. In such a case, we may use one of the statistics $\mathcal{D}_{\phi}\left( p^{(n)}, q \right)$ and $\mathcal{D}_{\phi}\left( q, p^{(n)} \right)$ to perform the tests. We may have information that allows us to prefer one of them. If not, it is better to use both of them, upon the finiteness of both $\mathcal{D}_{\phi}(p, q)$ and $\mathcal{D}_{\phi}(q, p)$, in a symmetrized form as

(3.8)
$$\mathcal{D}_{\phi}^{(s)}(p, q) = \frac{1}{2} \left( \mathcal{D}_{\phi}(p, q) + \mathcal{D}_{\phi}(q, p) \right).$$

The same situation applies when we face two-sided tests, i.e., testing $p = q$ from data generated by $p$ and $q$.
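
A minimal Python sketch of the symmetrized statistic for the Kullback-Leibler case, computed from two samples; the $1/2$-averaging used below is one common convention and is only assumed to correspond to (3.8):

    import numpy as np

    def kl(p, q):
        """Kullback-Leibler divergence over cells where the first argument is positive."""
        mask = p > 0
        return np.sum(p[mask] * np.log(p[mask] / q[mask]))

    def symmetrized_kl(p, q):
        """Symmetrized statistic (1/2)(D_KL(p, q) + D_KL(q, p)); the 1/2 convention is assumed."""
        return 0.5 * (kl(p, q) + kl(q, p))

    rng = np.random.default_rng(2)
    K, n, m = 5, 3000, 4000
    p_true = np.array([0.30, 0.25, 0.20, 0.15, 0.10])
    q_true = np.full(K, 1.0 / K)
    p_n = np.bincount(rng.choice(K, size=n, p=p_true), minlength=K) / n
    q_m = np.bincount(rng.choice(K, size=m, p=q_true), minlength=K) / m
    print(symmetrized_kl(p_n, q_m))   # two-sample symmetrized statistic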

Asymptotic a.e. efficiency.

Theorem 3.

Under the same assumptions as in Theorem 1, the following hold

  • One sample :

    (3.9)
    (3.10)
  • Two samples :

    (3.11)

Asymptotic Normality.

Denote

We have

Theorem 4.

Under the same assumptions as in Theorem 1, the following hold

  • One sample: as $n \rightarrow \infty$,

    (3.12)
    (3.13)
  • Two samples: as $n \rightarrow \infty$ and $m \rightarrow \infty$,

    (3.14)

Remark. The proofs of these extensions will not be given here, since they are straightforward consequences of the main results. Likewise, such considerations will not be repeated for the particular measures below, for the same reason.

4. Particular Cases

4.1. Renyi and Tsallis families

These two families are expressed through the functional

(4.1)
$$\mathcal{I}_{\alpha}(p, q) = \sum_{j=1}^{K} p_j^{\alpha} q_j^{1-\alpha},$$

which is of the form of the $\phi$-divergence measure (2.3) with $\phi(x, y) = x^{\alpha} y^{1-\alpha}$.
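
In terms of this functional (using the notation $\mathcal{I}_{\alpha}$ introduced above, and assuming the forms (1.4)-(1.5) written earlier), the two families read

$$\mathcal{D}_{T,\alpha}(p, q) = \frac{\mathcal{I}_{\alpha}(p, q) - 1}{\alpha - 1}, \qquad \mathcal{D}_{R,\alpha}(p, q) = \frac{\log \mathcal{I}_{\alpha}(p, q)}{\alpha - 1},$$

so that the asymptotic behavior of both estimators follows from that of the empirical version of $\mathcal{I}_{\alpha}$.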

A-(a)- The asymptotic behavior of the Tsallis divergence measure.

Denote

We have

Corollary 1.

Under the same assumptions as in Theorem 1, and for any $\alpha > 0$, $\alpha \neq 1$, the following hold

  • One sample :

  • Two samples :

Denote

We have

Corollary 2.

Under the same assumptions as in Theorem 1, and for any $\alpha > 0$, $\alpha \neq 1$, the following hold

and as

As to the symmetrized form

we need the supplementary notations:

and

We have

Corollary 3.

Let Assumptions (C1) and (C2) hold and let (BD) be satisfied. Then for any $\alpha > 0$, $\alpha \neq 1$,

and

Denote

We also have

Corollary 4.

Let Assumptions (C1) and (C2) hold and let (BD) be satisfied. Then for any $\alpha > 0$, $\alpha \neq 1$, we have

and as

A-(b)- The asymptotic behavior of the Renyi-$\alpha$ divergence measure.

The treatment of the asymptotic behavior of the Renyi-$\alpha$ divergence, $\alpha > 0$, $\alpha \neq 1$, is obtained from Part A-(a) by expansions and by the application of the delta method.
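
A sketch of that delta-method step, assuming the forms written above: since $\mathcal{D}_{R,\alpha}(p, q) = g\left( \mathcal{D}_{T,\alpha}(p, q) \right)$ with

$$g(t) = \frac{\log\left( 1 + (\alpha - 1) t \right)}{\alpha - 1}, \qquad g'\left( \mathcal{D}_{T,\alpha}(p, q) \right) = \frac{1}{1 + (\alpha - 1)\,\mathcal{D}_{T,\alpha}(p, q)} = \mathcal{I}_{\alpha}(p, q)^{-1},$$

any central limit theorem for the empirical Tsallis-$\alpha$ divergence transfers to the empirical Renyi-$\alpha$ divergence, with the asymptotic variance multiplied by $\mathcal{I}_{\alpha}(p, q)^{-2}$.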

We first remark that

Corollary 5.

Under the same assumptions as in Theorem 1, and for any $\alpha > 0$, $\alpha \neq 1$, the following hold

and

Denote

We have

Corollary 6.

Let Assumptions (C1) and (C2) hold and let (BD) be satisfied. Then for any $\alpha > 0$, $\alpha \neq 1$,

and as

As to the symmetrized form

we need the supplementary notations:

Corollary 7.

Let Assumptions (C1) and (C2) hold and let (BD) be satisfied. Then for any $\alpha > 0$, $\alpha \neq 1$,

and

Denote