Axiomatic characterization of the χ^2 dissimilarity measure

We axiomatically characterize the χ^2 dissimilarity measure. To this end, we solve a new generalization of a functional equation discussed in Aczel (Lectures on functional equations and their applications, Academic Press, 1966).

1 Introduction

Let $N = \{1, \ldots, n\}$ be a set of categories (with $|N| = n$). The vector $x = (x_1, \ldots, x_n)$ represents the respective numbers of observations in each category and the total number of observations is denoted by $s(x) = \sum_{i \in N} x_i$. We want to measure the dissimilarity between the observed distribution $x$ and a reference distribution $\pi = (\pi_1, \ldots, \pi_n)$, with $\sum_{i \in N} \pi_i = 1$ and $\pi_i \in \mathbb{Q}_{++}$ for all $i \in N$, where $\mathbb{Q}_{++}$ is the set of positive rational numbers. We exclude reference distributions with null components because the $\chi^2$ dissimilarity measure is not defined when a component of $\pi$ is zero. The set of all observed distributions is $\mathbb{N}^N$, i.e. the set of all mappings from $N$ to $\mathbb{N}$, where $\mathbb{N}$ is the set of non-negative integers. The set of all reference distributions is defined by $\{\pi \in \mathbb{Q}_{++}^N : \sum_{i \in N} \pi_i = 1\}$.

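A dissimilarity measure is a mapping $f$ that assigns to each pair $(x, \pi)$ of an observed and a reference distribution a number in $\mathbb{R}_+$ (the set of non-negative real numbers) and satisfies $f(x, \pi) = 0$ iff $x = s(x)\pi$. It measures how far the observed distribution $x$ is from the reference $\pi$. In this paper, we axiomatically characterize the dissimilarity measure $\chi^2_1$ defined by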

$$\chi^2_1(x, \pi) = \sum_{i \in N} \frac{(s(x)\pi_i - x_i)^2}{s(x)\pi_i}$$

and frequently used in statistics as a measure of goodness of fit.

The dissimilarity measure $\chi^2_0$ defined by $\chi^2_0(x, \pi) = \sum_{i \in N} (\pi_i - x_i/s(x))^2 / \pi_i$ has been characterized in [7] and we will also provide a new characterization thereof. It is popular in ecology [6], sociology [8], economics [9], and other fields.
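To make the two definitions concrete, here is a minimal Python sketch using exact rational arithmetic. The function names `s`, `chi2_0` and `chi2_1`, as well as the numerical values of the reference distribution and of the counts, are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction

def s(x):
    """Total number of observations in the vector of counts x."""
    return sum(x)

def chi2_0(x, pi):
    """Degree-0 variant: sum_i (pi_i - x_i/s(x))^2 / pi_i."""
    n = s(x)
    return sum((p - Fraction(xi, n)) ** 2 / p for xi, p in zip(x, pi))

def chi2_1(x, pi):
    """Degree-1 (Pearson-type) variant: sum_i (s(x) pi_i - x_i)^2 / (s(x) pi_i)."""
    n = s(x)
    return sum((n * p - xi) ** 2 / (n * p) for xi, p in zip(x, pi))

# Illustrative data (assumed): three categories, twelve observations.
pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
x = [5, 4, 3]

print(chi2_1(x, pi))  # 2/3
print(chi2_0(x, pi))  # 1/18
```

On this example the two values differ exactly by the factor $s(x) = 12$, which is the relation between the two measures that follows from their definitions.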

While we consider in our paper that the number $n$ of categories is given and fixed, [7] considers that $n$ can vary. Depending on the context, one or the other assumption can be more relevant. For instance, when we use Pearson's $\chi^2$ test, we have a sample distributed over $n$ categories and the $p$-value is computed conditional on a theoretical probability distribution with the same number $n$ of categories. If we repeat the experiment and draw other samples, we obtain other $p$-values always based on the same theoretical probability distribution with the same number $n$ of categories. It therefore makes sense to consider $n$ as given.

A common feature of [7] and our paper is that we use a framework in which $\pi$ can vary and such that comparisons of the dissimilarity measure across different reference distributions are relevant. Yet, unlike [7], we also consider the case in which the reference distribution is fixed (as in our Pearson's example).

For characterizations of other dissimilarity measures, in the context of political sciences, see [3]. See [2] for a characterization of a wide class of dissimilarity measures. While we consider dissimilarity measures in this paper, it is also interesting to consider dissimilarity rankings as in [4].

Section 2 presents our main conditions and results. Section 3 shows the independence of the conditions used in our results. Section 4 concludes the discussion. All the proofs are gathered in Section 5.

2 Axioms and results

The dissimilarity measures $\chi^2_0$ and $\chi^2_1$ are homogeneous of degree 0 and 1, respectively, where homogeneity is defined as follows.

A 1.

Homogeneity of degree $d$. For all positive integers $\lambda$ and all $(x, \pi)$, $f(\lambda x, \pi) = \lambda^d f(x, \pi)$.

In statistics, it seems unanimously accepted that a dissimilarity measure (used as a goodness-of-fit statistic) should be homogeneous of degree 1, but in ecology, many researchers seem to favour homogeneity of degree 0. Indeed, when they measure the dissimilarity between the species distribution in an ecosystem and a reference distribution, they want the dissimilarity to be independent of the size of the ecosystem. It is easy to see that Homogeneity of degree 0 (resp. 1) is satisfied by $\chi^2_0$ (resp. $\chi^2_1$). Indeed, we have

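$$\chi^2_0(\lambda x, \pi) = \sum_{i \in N} \frac{(\pi_i - \lambda x_i / s(\lambda x))^2}{\pi_i} = \sum_{i \in N} \frac{(\pi_i - x_i / s(x))^2}{\pi_i} = \chi^2_0(x, \pi)$$

and

$$\chi^2_1(\lambda x, \pi) = \sum_{i \in N} \frac{(s(\lambda x)\pi_i - \lambda x_i)^2}{s(\lambda x)\pi_i} = \lambda \sum_{i \in N} \frac{(s(x)\pi_i - x_i)^2}{s(x)\pi_i} = \lambda \chi^2_1(x, \pi).$$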
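The two homogeneity computations can also be checked numerically. The sketch below does so in exact arithmetic for a few scaling factors; the helper `scale` and the sample data are assumptions made for illustration only.

```python
from fractions import Fraction

def chi2_0(x, pi):
    """sum_i (pi_i - x_i/s(x))^2 / pi_i."""
    n = sum(x)
    return sum((p - Fraction(xi, n)) ** 2 / p for xi, p in zip(x, pi))

def chi2_1(x, pi):
    """sum_i (s(x) pi_i - x_i)^2 / (s(x) pi_i)."""
    n = sum(x)
    return sum((n * p - xi) ** 2 / (n * p) for xi, p in zip(x, pi))

def scale(lam, x):
    """Replicate the observed distribution lam times."""
    return [lam * xi for xi in x]

# Illustrative data (assumed).
pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
x = [5, 4, 3]

# Degree 0: the value is unchanged by scaling.
degree0_ok = all(chi2_0(scale(lam, x), pi) == chi2_0(x, pi) for lam in range(1, 6))
# Degree 1: the value is multiplied by the scaling factor.
degree1_ok = all(chi2_1(scale(lam, x), pi) == lam * chi2_1(x, pi) for lam in range(1, 6))

print(degree0_ok, degree1_ok)
```

A finite check of this kind illustrates the identities but does not replace the algebraic argument.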

Suppose the dissimilarity between a distribution $x$ and $\pi$ is zero. This implies $x = k\pi$ for some positive integer $k$. The next condition states that, when we modify $x$ by moving a single individual from category $l$ to $j$, then the dissimilarity measure is inversely proportional to the harmonic mean of $\pi_j$ and $\pi_l$. Let $1_j$ be a vector such that $(1_j)_j = 1$ and $(1_j)_i = 0$ for all $i \neq j$.

A 2.

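Inverse Effects. If $k\pi$ and $k\pi'$ are observed distributions for some positive integer $k$ and reference distributions $\pi, \pi'$, then, for all $j, l, r, s \in N$, with $j \neq l$ and $r \neq s$,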

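$$\frac{f(k\pi + 1_j - 1_l, \pi)}{f(k\pi' + 1_r - 1_s, \pi')} = \frac{\dfrac{1}{\pi_j} + \dfrac{1}{\pi_l}}{\dfrac{1}{\pi'_r} + \dfrac{1}{\pi'_s}}.$$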

In our first result, we will use a restricted variant of Inverse Effects in which $\pi = \pi'$. This weaker condition is named Restricted Inverse Effects and is trivially satisfied when . We now prove that Inverse Effects is satisfied by $\chi^2_0$. Since $s(k\pi + 1_j - 1_l) = k$, we have

$$\chi^2_0(k\pi + 1_j - 1_l, \pi) = \frac{(-1/k)^2}{\pi_j} + \frac{(1/k)^2}{\pi_l} = \frac{1}{k^2}\left(\frac{1}{\pi_j} + \frac{1}{\pi_l}\right).$$

The factor $1/k^2$ cancels in the ratio, which yields the required equality. The proof for $\chi^2_1$ is similar.
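As a sanity check of Inverse Effects for both measures, the following sketch compares the two sides of the condition over all admissible pairs of categories. The helper names `move_one` and `inverse_effects_holds` and the two reference distributions are assumptions chosen for illustration; $k = 12$ is picked so that $k\pi$ and $k\pi'$ have integer components.

```python
from fractions import Fraction
from itertools import permutations

def chi2_0(x, pi):
    n = sum(x)
    return sum((p - Fraction(xi, n)) ** 2 / p for xi, p in zip(x, pi))

def chi2_1(x, pi):
    n = sum(x)
    return sum((n * p - xi) ** 2 / (n * p) for xi, p in zip(x, pi))

def move_one(k, pi, j, l):
    """Return k*pi + 1_j - 1_l: one individual moved from category l to j."""
    x = [k * p for p in pi]
    x[j] += 1
    x[l] -= 1
    return x

def inverse_effects_holds(f, k, pi, pi2):
    """Compare both sides of Inverse Effects for every j != l and r != s."""
    pairs = list(permutations(range(len(pi)), 2))
    for j, l in pairs:
        for r, s in pairs:
            lhs = f(move_one(k, pi, j, l), pi) / f(move_one(k, pi2, r, s), pi2)
            rhs = (1 / pi[j] + 1 / pi[l]) / (1 / pi2[r] + 1 / pi2[s])
            if lhs != rhs:
                return False
    return True

# Illustrative reference distributions (assumed).
pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
pi2 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
k = 12

print(inverse_effects_holds(chi2_0, k, pi, pi2))
print(inverse_effects_holds(chi2_1, k, pi, pi2))
```

For the degree-1 measure the perturbed value is $(1/k)(1/\pi_j + 1/\pi_l)$ instead of $(1/k^2)(1/\pi_j + 1/\pi_l)$, so the factor depending on $k$ again cancels in the ratio.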

Let $x$ and $y$ be two observed distributions of size $k$. The deviation between $x$ and $\pi$ is $x - k\pi$. The corresponding deviation for $y$ is $y - k\pi$. If we add these two vectors of deviations, we obtain $x + y - 2k\pi$ and the corresponding observed distribution is $x + y - k\pi$ (provided all components are non-negative). Hence, $f(x + y - k\pi, \pi)$ represents the dissimilarity corresponding to the additive combination of two deviations: between $x$ (resp. $y$) and $\pi$. Similarly, $f(x - y + k\pi, \pi)$ corresponds to the subtractive combination of the same two deviations. Finally, $f(x + y - k\pi, \pi) + f(x - y + k\pi, \pi)$ corresponds in some sense to four deviations (two $x$- and two $y$-deviations) combined once additively and once subtractively. Our next condition states that this must be equal to $2(f(x, \pi) + f(y, \pi))$, which is another way to combine the same four deviations.

A 3.

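Deviations Balancedness. For all observed distributions $x, y$ with $s(x) = s(y) = k$, if $x + y - k\pi \in \mathbb{N}^N$ and $x - y + k\pi \in \mathbb{N}^N$, then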

$$f(x + y - k\pi, \pi) + f(x - y + k\pi, \pi) = 2\bigl(f(x, \pi) + f(y, \pi)\bigr).$$

This condition is inspired by [5], which characterizes the Euclidean distance in . Let us prove that $\chi^2_1$ satisfies Deviations Balancedness. Since $s(x + y - k\pi) = s(x - y + k\pi) = k$, we have

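$$\chi^2_1(x + y - k\pi, \pi) = \sum_{i \in N} \frac{\bigl(s(x + y - k\pi)\pi_i - (x_i + y_i - k\pi_i)\bigr)^2}{s(x + y - k\pi)\pi_i} = \sum_{i \in N} \frac{(2k\pi_i - x_i - y_i)^2}{k\pi_i}$$

and

$$\chi^2_1(x - y + k\pi, \pi) = \sum_{i \in N} \frac{\bigl(s(x - y + k\pi)\pi_i - (x_i - y_i + k\pi_i)\bigr)^2}{s(x - y + k\pi)\pi_i} = \sum_{i \in N} \frac{(x_i - y_i)^2}{k\pi_i}.$$

Hence, $\chi^2_1(x + y - k\pi, \pi) + \chi^2_1(x - y + k\pi, \pi)$ is equal to

$$\sum_{i \in N} \frac{(2k\pi_i - x_i - y_i)^2}{k\pi_i} + \sum_{i \in N} \frac{(x_i - y_i)^2}{k\pi_i} = \sum_{i \in N} \frac{2(k^2\pi_i^2 + x_i^2 - 2k\pi_i x_i) + 2(k^2\pi_i^2 + y_i^2 - 2k\pi_i y_i)}{k\pi_i} = 2\chi^2_1(x, \pi) + 2\chi^2_1(y, \pi).$$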
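The identity just derived can be illustrated on a small instance. In the sketch below, the helper `combine` and the particular vectors are assumptions; they are chosen so that both combined vectors have non-negative integer components, as the condition requires.

```python
from fractions import Fraction

def chi2_1(x, pi):
    n = sum(x)
    return sum((n * p - xi) ** 2 / (n * p) for xi, p in zip(x, pi))

def combine(x, y, k, pi, sign):
    """sign=+1 gives x + y - k*pi, sign=-1 gives x - y + k*pi."""
    return [xi + sign * (yi - k * p) for xi, yi, p in zip(x, y, pi)]

# Illustrative data (assumed): k*pi = (6, 4, 2).
pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
k = 12
x = [7, 4, 1]
y = [6, 5, 1]

plus = combine(x, y, k, pi, +1)    # additive combination of the deviations
minus = combine(x, y, k, pi, -1)   # subtractive combination of the deviations

# Both combinations must be valid observed distributions of size k.
assert all(c >= 0 and c.denominator == 1 for c in plus + minus)

lhs = chi2_1(plus, pi) + chi2_1(minus, pi)
rhs = 2 * (chi2_1(x, pi) + chi2_1(y, pi))
print(lhs, rhs)
```

Both sides evaluate to the same rational number on this instance, in line with the algebraic proof.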

We are now ready to state our first result in which we consider that $\pi$ is given and does not vary.

Theorem 2.1.

Assume $\pi$ is given. For $d \in \{0, 1\}$, a dissimilarity measure $f$ satisfies Homogeneity of degree $d$, Deviations Balancedness and Restricted Inverse Effects iff $f = \gamma \chi^2_d$, for some positive $\gamma$. Restricted Inverse Effects is not required when .

Notice that Theorem 2.1 does not hold when $\pi$ is not fixed. Indeed, for any positive mapping $\phi$ defined on the set of reference distributions, with $\phi$ not constant, the dissimilarity measure

$$f_\phi(x, \pi) = \phi(\pi) \sum_{i \in N} \frac{(\pi_i - x_i / s(x))^2}{\pi_i}$$

satisfies Homogeneity of degree 0, Deviations Balancedness and Restricted Inverse Effects but is not of the form $\gamma\chi^2_0$ or $\gamma\chi^2_1$. In order to characterize the dissimilarity measure $\chi^2_d$ when $\pi$ varies, we need the full power of Inverse Effects.
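The role of the weight $\phi$ can be seen numerically. The sketch below uses the hypothetical choice $\phi(\pi) = \max_i \pi_i$ (any positive non-constant mapping would do) and reports the quotient of the left-hand and right-hand sides of Inverse Effects; the helper names and data are assumptions for illustration.

```python
from fractions import Fraction

def chi2_0(x, pi):
    n = sum(x)
    return sum((p - Fraction(xi, n)) ** 2 / p for xi, p in zip(x, pi))

def phi(pi):
    """Hypothetical positive, non-constant weight on reference distributions."""
    return max(pi)

def f_phi(x, pi):
    return phi(pi) * chi2_0(x, pi)

def move_one(k, pi, j, l):
    """Return k*pi + 1_j - 1_l."""
    x = [k * p for p in pi]
    x[j] += 1
    x[l] -= 1
    return x

def ratio_gap(f, k, pi, pi2, j, l, r, s):
    """Left-hand side of Inverse Effects divided by its right-hand side."""
    lhs = f(move_one(k, pi, j, l), pi) / f(move_one(k, pi2, r, s), pi2)
    rhs = (1 / pi[j] + 1 / pi[l]) / (1 / pi2[r] + 1 / pi2[s])
    return lhs / rhs

pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
pi2 = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
k = 12

same_reference = ratio_gap(f_phi, k, pi, pi, 0, 1, 1, 2)    # restricted variant
cross_reference = ratio_gap(f_phi, k, pi, pi2, 0, 1, 1, 2)  # full condition
print(same_reference, cross_reference)
```

With a single reference distribution the weight cancels and the quotient is 1; across two reference distributions the quotient equals $\phi(\pi)/\phi(\pi')$, so the full condition fails whenever the two weights differ.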

Theorem 2.2.

For $d \in \{0, 1\}$, a dissimilarity measure $f$ satisfies Homogeneity of degree $d$, Deviations Balancedness and Inverse Effects iff $f = \gamma \chi^2_d$, for some positive $\gamma$.

3 Independence of the axioms

In order to prove the independence of the conditions characterizing $\chi^2_0$ with variable $\pi$, we provide three examples of dissimilarity measures, each violating exactly one of the three conditions in Theorem 2.2.

The dissimilarity measure $\chi^2_1$ violates Homogeneity of degree 0 but satisfies Deviations Balancedness and Inverse Effects. The dissimilarity measure

$$f(x, \pi) = \sum_{i \in N} \frac{|\pi_i - x_i / s(x)|}{\pi_i}$$

violates Deviations Balancedness but satisfies Homogeneity of degree 0 and Inverse Effects. The dissimilarity measure

$$f(x, \pi) = \sum_{i \in N} (\pi_i - x_i / s(x))^2$$

violates Inverse Effects but satisfies Homogeneity of degree 0 and Deviations Balancedness.
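The behaviour of the last two examples can be checked on concrete numbers. In the sketch below, `f_abs` is the absolute-value measure and `f_sq` the unweighted squared measure; these names, the helper functions and the data are assumptions for illustration, and a finite check only exhibits a violation or a consistent instance, not a proof.

```python
from fractions import Fraction
from itertools import permutations

def props(x):
    n = sum(x)
    return [Fraction(xi, n) for xi in x]

def f_abs(x, pi):
    """sum_i |pi_i - x_i/s(x)| / pi_i."""
    return sum(abs(p - q) / p for p, q in zip(pi, props(x)))

def f_sq(x, pi):
    """sum_i (pi_i - x_i/s(x))^2."""
    return sum((p - q) ** 2 for p, q in zip(pi, props(x)))

def move_one(k, pi, j, l):
    x = [k * p for p in pi]
    x[j] += 1
    x[l] -= 1
    return x

def balanced(f, x, y, k, pi):
    """Deviations Balancedness on one pair (x, y) of size k."""
    plus = [a + b - k * p for a, b, p in zip(x, y, pi)]
    minus = [a - b + k * p for a, b, p in zip(x, y, pi)]
    return f(plus, pi) + f(minus, pi) == 2 * (f(x, pi) + f(y, pi))

def inverse_effects(f, k, pi, pi2):
    pairs = list(permutations(range(len(pi)), 2))
    return all(
        f(move_one(k, pi, j, l), pi) / f(move_one(k, pi2, r, s), pi2)
        == (1 / pi[j] + 1 / pi[l]) / (1 / pi2[r] + 1 / pi2[s])
        for j, l in pairs for r, s in pairs
    )

pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
pi2 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
k, x, y = 12, [7, 4, 1], [6, 5, 1]

print(balanced(f_abs, x, y, k, pi), inverse_effects(f_abs, k, pi, pi2))
print(balanced(f_sq, x, y, k, pi), inverse_effects(f_sq, k, pi, pi2))
```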

Our examples are easily adapted to prove the independence of the conditions characterizing $\chi^2_1$ with variable $\pi$. Finally, our examples can also be used for Theorem 2.1 since it involves the same conditions as Theorem 2.2 except for Restricted Inverse Effects, which is weaker than Inverse Effects.

4 Discussion

Theorems 2.1 and 2.2 characterize the dissimilarity measures $\chi^2_0$ and $\chi^2_1$ up to a multiplication by a positive real number $\gamma$. We could easily add a condition characterizing exactly $\chi^2_0$ or $\chi^2_1$. For instance, the extra condition is enough to force $\gamma = 1$ in both characterizations. Yet, unlike [7], we consider that such a normalization is not really interesting. Indeed $\chi^2_1$ and $\gamma\chi^2_1$ (with $\gamma > 0$) convey exactly the same information, just like a distance measurement in meters or yards. In particular, if we want to perform a Pearson's $\chi^2$ test, we are free to use Pearson's statistic (i.e. $\chi^2_1$) and to compute the $p$-value using the $\chi^2$ density or to use $\gamma\chi^2_1$ (with an arbitrary $\gamma > 0$) and to compute the $p$-value using the corresponding density. The resulting $p$-value will of course be identical. The same holds for $\chi^2_0$ and $\gamma\chi^2_0$.
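The invariance of the $p$-value under rescaling can be spelled out in one line. As a sketch, write $T$ for the test statistic viewed as a random variable under the null hypothesis and $t_{\mathrm{obs}}$ for its observed value; both symbols are notation introduced here for illustration and are not taken from the text.

```latex
\Pr\bigl(\gamma T \ge \gamma\, t_{\mathrm{obs}}\bigr)
  = \Pr\bigl(T \ge t_{\mathrm{obs}}\bigr)
  \qquad \text{for every } \gamma > 0 .
```

The events $\{\gamma T \ge \gamma t_{\mathrm{obs}}\}$ and $\{T \ge t_{\mathrm{obs}}\}$ coincide because multiplication by a positive constant preserves the order.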

5 Proofs

We need a few lemmas before proving Theorem 2.1.

Lemma 1.

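Let $f(x, \pi) = s(x)\, g(x, \pi)$. Then $f$ satisfies Homogeneity of degree 1 iff $g$ satisfies Homogeneity of degree 0. And $f$ satisfies Deviations Balancedness (resp. Inverse Effects) iff $g$ satisfies Deviations Balancedness (resp. Inverse Effects).

Proof.

Since $f$ satisfies Homogeneity of degree 1, we have $f(\lambda x, \pi) = \lambda f(x, \pi)$ for all positive integers $\lambda$. We thus have $s(\lambda x)\, g(\lambda x, \pi) = \lambda s(x)\, g(x, \pi)$. Since $s(\lambda x) = \lambda s(x)$, it follows that $g(\lambda x, \pi) = g(x, \pi)$ and $g$ is homogeneous of degree 0. The proof of the reverse implication is similar. The rest of the proof is left to the reader. ∎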

Lemma 2.

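Suppose $\pi$ is fixed. If a dissimilarity measure $f$ satisfies Homogeneity of degree 0, then $f(x, \pi) = g(x / s(x))$, for some mapping $g$.

Proof.

Since $\pi$ is fixed, we can define a mapping $f_\pi$ such that $f_\pi(x) = f(x, \pi)$. Define now the mapping $g$ as follows. For any vector $p$ of non-negative rational numbers summing to one, $g(p) = f_\pi(x)$ if there is $x$ such that $x / s(x) = p$. The mapping $g$ is defined everywhere because $p$ has rational components and, hence, there is always $x$ such that $x / s(x) = p$. The mapping $g$ is well defined. Indeed, suppose now there are $x, y$ such that $x / s(x) = p$ and $y / s(y) = p$. Then $s(y)\, x = s(x)\, y$ and, by Homogeneity of degree 0, $f_\pi(x) = f_\pi(s(y)\, x) = f_\pi(s(x)\, y) = f_\pi(y)$. Therefore, $g(p)$ does not depend on the choice of $x$. ∎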

We say that a set in is rational convex if whenever , then for all rational .

Lemma 3.

Let be a rational convex subset of such that is full-dimensional. Let be a mapping such that the graph of is a parabola on any line segment . Then for some real .

Proof.

Since is full-dimensional, the interior of is not empty and we can suppose without loss of generality that . Let us consider the line defined by for some and all . The intersection of this line with defines a line segment passing by the origin. The graph of on is a parabola. We can express this by means of the following polynomial of degree 2 in  :

$$g(\alpha t, (1 - \alpha)t) = k_\alpha t^2 + l_\alpha t + m_\alpha, \tag{5.1}$$

where $k_\alpha$, $l_\alpha$ and $m_\alpha$ are real numbers.

Let us now consider the line defined by for some and all . The intersection of this line with defines a line segment . We can express that the graph of on is a parabola by means of a polynomial of degree 2 in  :

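$$g(\alpha t, (1 - \alpha)t) = \alpha^2 \beta_t + \alpha \gamma_t + \delta_t, \tag{5.2}$$

where $\beta_t$, $\gamma_t$ and $\delta_t$ are real numbers. Setting $t = 0$ in (5.2) yields $g(0, 0) = \alpha^2 \beta_0 + \alpha \gamma_0 + \delta_0$. Since this must be true for all $\alpha$, we must have $\beta_0 = \gamma_0 = 0$.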

Equating (5.1) and (5.2) yields

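$$k_\alpha t^2 + l_\alpha t + m_\alpha = \alpha^2 \beta_t + \alpha \gamma_t + \delta_t. \tag{5.3}$$

Setting $t = 0$, $t = 1$ and $t = 2$ in (5.3) yields

$$\begin{aligned}
m_\alpha &= \delta_0 \\
k_\alpha + l_\alpha + m_\alpha &= \alpha^2 \beta_1 + \alpha \gamma_1 + \delta_1 \\
4k_\alpha + 2l_\alpha + m_\alpha &= \alpha^2 \beta_2 + \alpha \gamma_2 + \delta_2.
\end{aligned}$$

The solution of this system is

$$\begin{aligned}
m_\alpha &= \delta_0 \\
k_\alpha &= \alpha^2\, \frac{\beta_2 - 2\beta_1}{2} + \alpha\, \frac{\gamma_2 - 2\gamma_1}{2} + \frac{\delta_2 - 2\delta_1 + \delta_0}{2} \\
l_\alpha &= \alpha^2\, \frac{4\beta_1 - \beta_2}{2} + \alpha\, \frac{4\gamma_1 - \gamma_2}{2} + \frac{4\delta_1 - \delta_2 - 3\delta_0}{2}.
\end{aligned}$$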
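The solution of the three-equation system can be verified by substitution. The sketch below does this in exact arithmetic for arbitrary illustrative coefficients; the function names `rhs` and `solve` and all numerical values are assumptions, with $\beta_0 = \gamma_0 = 0$ imposed as in the proof.

```python
from fractions import Fraction as F

def rhs(alpha, beta, gamma, delta, t):
    """Right-hand side of the system: alpha^2 beta_t + alpha gamma_t + delta_t."""
    return alpha ** 2 * beta[t] + alpha * gamma[t] + delta[t]

def solve(alpha, beta, gamma, delta):
    """Closed-form solution (k, l, m) of the system obtained for t = 0, 1, 2."""
    r0 = rhs(alpha, beta, gamma, delta, 0)  # equals delta_0 since beta_0 = gamma_0 = 0
    r1 = rhs(alpha, beta, gamma, delta, 1)
    r2 = rhs(alpha, beta, gamma, delta, 2)
    m = r0
    k = (r2 - 2 * r1 + r0) / 2
    l = (4 * r1 - r2 - 3 * r0) / 2
    return k, l, m

# Illustrative coefficients (assumed), indexed by t = 0, 1, 2.
beta = [F(0), F(3), F(5)]
gamma = [F(0), F(-1), F(7, 2)]
delta = [F(2), F(1, 3), F(4)]
alpha = F(2, 5)

k, l, m = solve(alpha, beta, gamma, delta)
print(k, l, m)
```

Grouping the terms of `k` and `l` by powers of `alpha` gives the expressions for $k_\alpha$ and $l_\alpha$ stated in the text.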

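Let us rewrite (5.1):

$$g(\alpha t, (1 - \alpha)t) = \left(\alpha^2\, \frac{\beta_2 - 2\beta_1}{2} + \alpha\, \frac{\gamma_2 - 2\gamma_1}{2} + \frac{\delta_2 - 2\delta_1 + \delta_0}{2}\right) t^2 + \left(\alpha^2\, \frac{4\beta_1 - \beta_2}{2} + \alpha\, \frac{4\gamma_1 - \gamma_2}{2} + \frac{4\delta_1 - \delta_2 - 3\delta_0}{2}\right) t + \delta_0.$$

Letting $u = \alpha t$ and $v = (1 - \alpha)t$, so that $t = u + v$ and $\alpha = u / (u + v)$, we find that $g(u, v)$ is equal to

$$u^2\, \frac{\beta_2 - 2\beta_1}{2} + u(u + v)\, \frac{\gamma_2 - 2\gamma_1}{2} + (u + v)^2\, \frac{\delta_2 - 2\delta_1 + \delta_0}{2} + \frac{u^2}{u + v}\, \frac{4\beta_1 - \beta_2}{2} + u\, \frac{4\gamma_1 - \gamma_2}{2} + (u + v)\, \frac{4\delta_1 - \delta_2 - 3\delta_0}{2} + \delta_0. \tag{5.4}$$

The graph of $g$ must be a parabola on the line segment corresponding to $v = u + 1$. That is,

$$u^2\, \frac{\beta_2 - 2\beta_1}{2} + u(2u + 1)\, \frac{\gamma_2 - 2\gamma_1}{2} + (2u + 1)^2\, \frac{\delta_2 - 2\delta_1 + \delta_0}{2} + \frac{u^2}{2u + 1}\, \frac{4\beta_1 - \beta_2}{2} + u\, \frac{4\gamma_1 - \gamma_2}{2} + (2u + 1)\, \frac{4\delta_1 - \delta_2 - 3\delta_0}{2} + \delta_0$$

must be a parabola in $u$. This is possible only if $4\beta_1 - \beta_2 = 0$, because $u^2 / (2u + 1)$ is not a polynomial in $u$. We have therefore reached the conclusion that (5.4) can be written as in the statement of the lemma. ∎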

Let $K = \{1, \ldots, k\}$ and $K^* = \{1, \ldots, k - 1\}$.

Lemma 4.

Let be a rational convex subset of such that is full-dimensional. Let be a mapping such that the graph of is a parabola on any line segment . Suppose the restriction of

to the hyperplane defined by

( for all such that the hyperplane intersects ) has the form for some real .

Then for some real .

Proof.

Since is full-dimensional, there is and we can suppose without loss of generality that . Let us consider the line defined by for some and all . The intersection of this line with defines a line segment passing by the origin. The graph of on is a parabola. We can express this by means of the following polynomial of degree 2 in  :

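$$g\Bigl(\alpha_1 t, \alpha_2 t, \ldots, \alpha_{k-1} t, \bigl(1 - \sum_{i \in K^*} \alpha_i\bigr) t\Bigr) = k_\alpha t^2 + l_\alpha t + m_\alpha, \tag{5.5}$$

where $k_\alpha$, $l_\alpha$ and $m_\alpha$ are real numbers.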

Let us now consider the hyperplane defined by for some and . We assumed in the statement of the lemma,

 g(α1t,α2t,…,αk−1t,(1−∑i∈K∗αi)t)=∑i∈K∗σtiiα2i+∑i,j∈K∗,i

Setting in (5.6) yields, . Since this must be true for all , we must have , for all .

Equating (5.5) and (5.6) yields

 kαt2+lαt+mα=∑i∈K∗σtiiα2i+∑i,j∈K∗,i

Setting $t = 0$, $t = 1$ and $t = 2$ in (5.7) yields

 mα = σ0 kα+lα+mα = ∑i∈K∗σ1iiα2i+∑i,j∈K∗,i

The solution of this system is

 mα = σ0 kα = ∑i∈K∗α2i σ2ii−2σ1ii2+∑i,j∈K∗,i

Let us rewrite (5.5) :

 (∑i∈K∗α2i σ2ii−2σ1ii2 +∑i,j∈K∗,i

Letting , we have and , and the previous equation becomes,

 ∑i∈K∗u2i σ2ii−2σ1ii2+∑i,j∈K∗,i

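For any $j \in K^*$, the graph of $g$ must be a parabola on the line segment corresponding to $u_k = u_j + 1$ and $u_i = 0$ for all $i \in K^* \setminus \{j\}$. That is,

$$u_j^2\, \frac{\sigma^2_{jj} - 2\sigma^1_{jj}}{2} + u_j(2u_j + 1)\, \frac{\sigma^2_j - 2\sigma^1_j}{2} + (2u_j + 1)^2\, \frac{\sigma^2_0 - 2\sigma^1_0 + \sigma^0_0}{2} + \frac{u_j^2}{2u_j + 1}\, \frac{4\sigma^1_{jj} - \sigma^2_{jj}}{2} + u_j\, \frac{4\sigma^1_j - \sigma^2_j}{2} + (2u_j + 1)\, \frac{4\sigma^1_0 - \sigma^2_0 - 3\sigma^0_0}{2} + \sigma_0$$

must be a parabola in $u_j$. This is possible only if $4\sigma^1_{jj} - \sigma^2_{jj} = 0$ for all $j \in K^*$.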

Similarly, for any with , the graph of must be a parabola on the line segment corresponding to , . That is,

 u2i σ2ii−2σ1ii2+u2i σ2jj−2σ1jj2+u2i σ2ij−2σ1ij2+ui(3ui+1) σ2i−2σ1i2+ui(3ui+1) σ2j−2σ1j2+(3ui+1)2 σ20−2σ10+σ002+u2i3ui+1 4σ1ij−σ2ij2+ui 4σ1i−σ2i2+ui 4σ1j−σ2j2+(3ui+1) 4σ10−σ20−3σ002+σ