The Pareto Record Frontier

01/17/2019
by   James Allen Fill, et al.
Johns Hopkins University

For iid d-dimensional observations X^(1), X^(2), ... with independent Exponential(1) coordinates, consider the boundary (relative to the closed positive orthant), or "frontier", F_n of the closed Pareto record-setting (RS) region RS_n := {0 ≤ x ∈ R^d: x ⊀ X^(i) for all 1 ≤ i ≤ n} at time n, where 0 ≤ x means that 0 ≤ x_j for 1 ≤ j ≤ d and x ≺ y means that x_j < y_j for 1 ≤ j ≤ d. With x_+ := ∑_{j=1}^d x_j, let F_n^- := min{x_+: x ∈ F_n} and F_n^+ := max{x_+: x ∈ F_n}, and define the width of F_n as W_n := F_n^+ - F_n^-. We describe typical and almost sure behavior of the processes F^+, F^-, and W. In particular, we show that F_n^+ ∼ ln n ∼ F_n^- almost surely and that W_n / ln ln n converges in probability to d - 1; and for d ≥ 2 we show that, almost surely, the set of limit points of the sequence W_n / ln ln n is the interval [d - 1, d]. We also obtain modifications of our results that are important in connection with efficient simulation of Pareto records. Let T_m denote the time that the mth record is set. We show that F̃^+_m ∼ (d! m)^{1/d} ∼ F̃^-_m almost surely and that W_{T_m} / ln m converges in probability to 1 - d^{-1}; and for d ≥ 2 we show that, almost surely, the sequence W_{T_m} / ln m has lim inf equal to 1 - d^{-1} and lim sup equal to 1.

1. Introduction, background, and main results

The study of univariate records is very well developed ([1] being a classical reference), but that of multivariate records less well so, in part because there are many ways one can formulate the latter concept. See [6], and the references therein, and [1, Chap. 8] for background.

This paper is mainly about the stochastic process (F_n), where F_n is the boundary, or “frontier”, for Pareto records (otherwise known as nondominated records or weak records; consult Definitions 1.1–1.2) in general dimension d when the observed points X^(1), X^(2), ... are assumed (as they are throughout the paper) to be i.i.d. (independent and identically distributed) copies of a d-dimensional random vector X with independent Exponential(1) coordinates X_j.

The theoretical investigations leading to the results in this paper were spurred by empirical observations, whose generation is discussed briefly in Section 5 (see especially Figure 3) and in detail in [5], and began with the simple result of Theorem 1.4.

Notation: Throughout this paper we abbreviate the kth iterate of natural logarithm by L_k and L_1 by L, and we write x_+ := ∑_{j=1}^d x_j for the sum of coordinates of the d-dimensional vector x = (x_1, ..., x_d).

Unless otherwise specifically noted, all the results of this paper hold for any dimension d ≥ 1.

1.1. Pareto records and the record-setting region

We begin with some definitions. Write x ≺ y (respectively, x ⪯ y) to mean that x_j < y_j (resp., x_j ≤ y_j) for 1 ≤ j ≤ d. (We caution that, with this convention, x ⪯ y is weaker than “x ≺ y or x = y”; for example, in dimension 2 we have (0, 1) ⪯ (1, 1), but we have neither (0, 1) ≺ (1, 1) nor (0, 1) = (1, 1). This distinction will matter little in this paper, since the probability that any coordinate of an observation is repeated or vanishes is 0, but the distinction is important in [5].) The notation x ≻ y means y ≺ x, and x ⪰ y means y ⪯ x.

Definition 1.1.

(a) We say that X^(n) is a (Pareto) record (or that it sets a record at time n) if X^(n) ⊀ X^(i) for all 1 ≤ i < n.

(b) If 1 ≤ k ≤ n, we say that X^(k) is a current record (or remaining record, or maximum) at time n if X^(k) ⊀ X^(i) for all 1 ≤ i ≤ n.

(c) If 1 ≤ k ≤ n, we say that X^(k) is a broken record at time n if it is a record but not a current record, that is, if X^(k) ⊀ X^(i) for all 1 ≤ i < k but X^(k) ≺ X^(ℓ) for some k < ℓ ≤ n; in that case, the observation X^(ℓ) corresponding to the smallest such ℓ is said to break or kill the record X^(k).

For n ≥ 1 (or n = ∞, with the obvious conventions) let R_n denote the number of records X^(k) with 1 ≤ k ≤ n, let r_n denote the number of remaining records at time n, and let β_n denote the number of broken records. Note that R_n and β_n are nondecreasing in n, but the same is not true for r_n. For dimension d ≥ 2, by standard consideration of concomitants [that is, by considering the d-dimensional sequence X^(1), ..., X^(n) sorted from largest to smallest value of (say) last coordinate] we see that r_n(d) (that is, r_n for dimension d, with similar notation used here for R_n) has, for each n, the same (univariate) distribution as R_n(d − 1); note, however, that the same equality in distribution does not hold for the stochastic processes r(d) and R(d − 1).
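The three notions in Definition 1.1 and the associated counts can be illustrated with a short Python sketch. The function names (`strictly_dominated`, `classify_records`, `sample_exponential_points`) are illustrative assumptions and are not taken from the paper or from [5]; the quadratic-time scan is chosen for readability, not efficiency.

```python
import random


def strictly_dominated(x, y):
    """True if x ≺ y, i.e. x_j < y_j in every coordinate."""
    return all(xj < yj for xj, yj in zip(x, y))


def classify_records(points):
    """Return index lists (records, current, broken) after all points are seen.

    Indices are 0-based positions in `points`.
    """
    records = []  # indices k such that points[k] set a record when observed
    current = []  # indices of records that have not been killed so far
    for k, x in enumerate(points):
        # Comparing with current records suffices: strict dominance is
        # transitive, so any dominating earlier point is itself dominated
        # by (or equal to) some current record.
        if any(strictly_dominated(x, points[i]) for i in current):
            continue
        records.append(k)
        # The new record kills every current record it strictly dominates.
        current = [i for i in current if not strictly_dominated(points[i], x)]
        current.append(k)
    broken = [k for k in records if k not in current]
    return records, current, broken


def sample_exponential_points(n, d, seed=0):
    """n points in dimension d with independent Exponential(1) coordinates."""
    rng = random.Random(seed)
    return [tuple(rng.expovariate(1.0) for _ in range(d)) for _ in range(n)]


if __name__ == "__main__":
    pts = sample_exponential_points(n=1000, d=2, seed=1)
    records, current, broken = classify_records(pts)
    print(len(records), len(current), len(broken))
```

In this sketch the number of records is `len(records)`, the number of remaining records is `len(current)`, and the number of broken records is their difference.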

Definition 1.2.

(a) The record-setting region at time n is the (random) closed set of points

RS_n := {x ∈ R^d : 0 ≤ x and x ⊀ X^(i) for all 1 ≤ i ≤ n}.

(b) We call the (topological) boundary of RS_n (relative to the closed positive orthant determined by the origin) its frontier and denote it by F_n.

Figure 1. Record frontier based on observations resulting in 10 current records (shown as solid points). The values F_n^- and F_n^+ determine two hyperplanes, x_+ = F_n^- and x_+ = F_n^+. A new observation sets a record if and only if it falls in the region to the upper right of the frontier.

Remark 1.3.

The terminology in Definition 1.2(a) is natural since the next observation sets a record if and only if it falls in the record-setting region. Note that

and that the current records at time n all belong to RS_n but lie on its frontier. Observe also that F_n is a closed subset of RS_n. Because this paper makes heavy use of the classical probabilistic notion of boundary-crossing probabilities, to avoid confusion we have chosen to use the term “frontier” for F_n, rather than “boundary”, in Definition 1.2(b).
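As a minimal sketch of the record-setting region, the following Python function tests whether a point lies in the set of nonnegative points that are not strictly dominated by any observation so far, which is how the region is described in the abstract. The name `in_record_setting_region` is a hypothetical helper introduced here for illustration.

```python
def strictly_dominated(x, y):
    """True if x ≺ y, i.e. x_j < y_j in every coordinate."""
    return all(xj < yj for xj, yj in zip(x, y))


def in_record_setting_region(x, observations):
    """True if x is nonnegative and not strictly dominated by any observation."""
    if any(xj < 0 for xj in x):
        return False
    return not any(strictly_dominated(x, obs) for obs in observations)


if __name__ == "__main__":
    obs = [(2.0, 2.0), (3.0, 0.5)]
    print(in_record_setting_region((1.0, 1.0), obs))  # dominated by (2, 2)
    print(in_record_setting_region((2.5, 1.0), obs))  # would set a record
```

Because dominance is strict, each observation that is a current record passes this test itself, in line with the remark that current records belong to the region while lying on its frontier.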

1.2. The record-setting frontier

Our first result shows that deviations of the sum of coordinates for a generic current record at time n from ln n are typically of constant order. Observe that the conditional distribution of X^(k)_+ given that X^(k) is a current record at time n doesn’t depend on k ∈ {1, ..., n}; in particular, it’s the conditional distribution of X^(n)_+ given that X^(n) sets a record. Let G_n be a random variable with that distribution. Let G denote a random variable with the standard Gumbel distribution (i.e., distribution function g ↦ exp(−e^{−g}), g ∈ R), and write →L for convergence in law (i.e., in distribution).

Theorem 1.4.

We have

G_n − ln n →L G.

Proof.

This is quite elementary. Let p_n denote the probability that X^(n) sets a record. Fix n for the moment. For x ≻ 0 we have

P(X^(n) ∈ dx | X^(n) sets a record) = p_n^{−1} e^{−x_+} (1 − e^{−x_+})^{n−1} dx,

and so the conditional density depends on x only through x_+. It follows that the density of G_n satisfies

P(G_n ∈ dy) = p_n^{−1} [y^{d−1} / (d − 1)!] e^{−y} (1 − e^{−y})^{n−1} dy, y > 0.

Using the well-known asymptotic equivalence p_n ∼ n^{−1} (ln n)^{d−1} / (d − 1)! as n → ∞ [see (4.5) below], it is easy to check that, for each fixed g ∈ R, the density of G_n − ln n at g converges to the standard Gumbel density e^{−g} exp(−e^{−g}) as n → ∞. The claimed result thus follows from Scheffé’s theorem (e.g., [4, Thm. 16.12]), which shows that there is in fact convergence in total variation. ∎

This paper primarily concerns the stochastic process (F_n), and specifically its “width” as defined next (see Figure 1).

Definition 1.5.

Recall that F_n denotes the frontier of RS_n, and let

F_n^- := min{x_+ : x ∈ F_n} and F_n^+ := max{x_+ : x ∈ F_n}.    (1.1)

We define the width of F_n as

W_n := F_n^+ − F_n^-.    (1.2)

Very roughly put, what we will see in this paper is that, unlike the coordinate-sum of the generic current record in Theorem 1.4, deviations of F_n^+ from ln n are exactly of order ln ln n; on the other hand, we will see that deviations of F_n^- from ln n are of smaller order than ln ln n. It will follow that the width of the frontier is exactly of order ln ln n.
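For dimension d = 2 only, the frontier is a staircase, and the extreme coordinate-sums and the width can be computed directly. The sketch below assumes no ties among coordinates (which holds almost surely for Exponential coordinates) and uses hypothetical function names. The smallest coordinate-sum on a staircase is attained at one of its inner corners, including the two corners on the axes, while the largest is attained at a current record.

```python
def current_records_2d(points):
    """Points of a planar sample not strictly dominated by any other point."""
    return [p for p in points
            if not any(p[0] < q[0] and p[1] < q[1] for q in points)]


def frontier_summary_2d(points):
    """Return (F_minus, F_plus, width) for a planar sample without ties."""
    # Sorted by first coordinate ascending; second coordinate then descends.
    recs = sorted(current_records_2d(points))
    xs = [p[0] for p in recs]
    ys = [p[1] for p in recs]
    # Inner corners of the staircase: one on each axis, plus one between
    # each pair of consecutive current records.
    corners = [(0.0, ys[0])]
    corners += [(xs[i], ys[i + 1]) for i in range(len(recs) - 1)]
    corners.append((xs[-1], 0.0))
    f_minus = min(x + y for x, y in corners)
    f_plus = max(x + y for x, y in recs)
    return f_minus, f_plus, f_plus - f_minus


if __name__ == "__main__":
    print(frontier_summary_2d([(1.0, 4.0), (3.0, 1.0)]))
```

For the two-point sample in the usage example the staircase runs from (0, 4) to (1, 4), down to (1, 1), across to (3, 1), and down to (3, 0), so the minimum coordinate-sum is attained at the inner corner (1, 1).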

We next make some simple observations about the quantities appearing in Definition 1.5 that will prove fundamentally useful to our development.

Lemma 1.6 (characterization of F_n^+).

We have

F_n^+ = max{X^(i)_+ : 1 ≤ i ≤ n},

which is nondecreasing in n.

Proof.

The current records at time n all belong to F_n, and broken records and non-records all have coordinate-sums (strictly) smaller than some current record. Thus F_n^+ ≥ max{X^(i)_+ : 1 ≤ i ≤ n}. Conversely, if x ∈ F_n, then x ⪯ X^(i) for some 1 ≤ i ≤ n; it follows that x_+ ≤ X^(i)_+. ∎

Lemma 1.7 (two upper bounds on F_n^-).

(a) Define

Then

(b) Let . Define

Then, over the event that there are at least  remaining records at time , we have

(c) The processes , , and (for any ) all have nondecreasing sample paths.

Proof.

(a) For , let denote the almost surely unique index such that

Let denote the th coordinate vector. We claim that the points with all belong to (in fact, to ), and then the inequality is immediate. To prove the claim, note that all of the points belong to [because and hence ] but also to [because ].

(b) Over the event , is certainly at most the th-largest sum of coordinates of remaining records, which is in turn at most .

(c) The asserted monotonicity is clear for the bounding processes. The asserted monotonicity of follows easily from the observation that . ∎
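The argument for part (a) can be checked numerically. The sketch below assumes, reading the proof, that the bound in (a) is the smallest of the d coordinatewise maxima; the displayed definition is not reproduced in this text, so that reading and all function names here are assumptions. For each coordinate it builds the point on the corresponding axis at the largest observed value and checks that the point is not strictly dominated, while a slightly shrunken copy is.

```python
def strictly_dominated(x, y):
    """True if x ≺ y, i.e. x_j < y_j in every coordinate."""
    return all(xj < yj for xj, yj in zip(x, y))


def in_record_setting_region(x, observations):
    """True if x is nonnegative and not strictly dominated by any observation."""
    return all(xj >= 0 for xj in x) and not any(
        strictly_dominated(x, obs) for obs in observations)


def axis_points(observations):
    """For each coordinate j, the point (max_i X_j^(i)) * e_j on the jth axis."""
    d = len(observations[0])
    points = []
    for j in range(d):
        m = max(obs[j] for obs in observations)
        points.append(tuple(m if i == j else 0.0 for i in range(d)))
    return points


def axis_upper_bound(observations):
    """Smallest coordinate-sum among the axis points (assumed bound in (a))."""
    return min(sum(p) for p in axis_points(observations))


if __name__ == "__main__":
    obs = [(1.0, 3.0, 0.5), (2.0, 1.0, 0.7), (0.5, 0.5, 2.5)]
    for p in axis_points(obs):
        shrunk = tuple(0.95 * c for c in p)
        print(p, in_record_setting_region(p, obs),
              in_record_setting_region(shrunk, obs))
    print(axis_upper_bound(obs))
```

Each axis point lies in the region while arbitrarily close points do not (given strictly positive observations), which is why it sits on the frontier and its coordinate-sum bounds the minimum from above.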

It seems difficult to study the processes F^+ and F^- bivariately, so we draw all our conclusions about the width process W by studying F^+ and F^- univariately (that is, separately) and using W = F^+ − F^-. The behavior of F^+ is well known from classical extreme value theory and is reviewed in Section 2. Conclusions about F^- will be drawn from (i) the upper-bounding processes in Lemma 1.7(a)–(b) together with classical extreme value theory for those bounding processes and (ii) a rather nontrivial lower bound developed in Section 3.

1.3. Main results

We next present the main results of our paper. What the results show, in various precise senses, is that F_n^+ and F_n^- both concentrate near ln n, with deviations that are O(ln ln n), from which it follows of course that W_n = O(ln ln n). But for d ≥ 2 we show more, namely, that ln ln n is the exact scale for W_n, that is, that W_n = Θ(ln ln n). We can even narrow things down further: W_n / ln ln n → d − 1 in probability for each d, with an almost sure lim inf equal to d − 1 and an almost sure lim sup equal to d.

Here are our main results for arbitrary but fixed dimension d. We consider both convergence in probability (typical behavior) and almost sure largest and smallest deviations from ln n (top and bottom boundary-behavior, respectively) for large n.

Theorem 1.8 (Kiefer [7]).

Consider the process F^+ defined at (1.1).

(a) Typical behavior of F^+:

(b) Top boundaries for F^+:

(c) Bottom boundaries for F^+:

Theorem 1.8 gives rise immediately to the following succinct corollary.

Corollary 1.9 (Kiefer [7]).

Consider the process F^+ defined at (1.1).

(a) Typical behavior of F^+:

(b) Almost sure behavior for F^+:

Remark 1.10.

In fact, one can show rather simply from Corollary 1.9(b) and the fact that has nondecreasing sample paths that the set (call it ) of limit points of the sequence is almost surely the closed interval . Here is a sketch of the proof. The set  is closed, so we need only show that  is dense in , which clearly follows if we can show that

(1.3)

the roughly stated idea being that then (a.s.) the sequence “can’t leap downward over any interval i.o.” in its infinitely many downward moves from its to its . To prove (1.3), we first bound from below by , then express the resulting difference with a common denominator, and finally use the consequence of Corollary 1.9(b) to find

as .

Remark 1.11.

Our Theorem 1.8 formalizes and improves upon related computations in Bai et al. [3, Secs. 1 and 3.2] who, for the limited purpose of proving a central limit theorem reviewed in Theorem 4.1(a) below, “observe that nearly all maxima occur in a thin strip sandwiched between [the] two parallel hyper-planes”

Our results for F^- show that the deviations of F_n^- from ln n are almost surely negligible on a scale of ln ln n.

Theorem 1.12.

Consider the process F^- defined at (1.1).

(a) Typical behavior of F^-:

and

(b) Top outer boundaries for F^-: If d ≥ 2, then

(c1) A bottom outer boundary for on the scale of :

(c2) A bottom inner boundary for on the scale of :

Theorem 1.12 gives rise immediately to the following succinct corollary.

Corollary 1.13.

Consider the process F^- defined at (1.1).

(a) Typical behavior of F^-:

(b) Almost sure behavior for F^-: If d ≥ 2, then

We come now to our main focus, the process W. The results in Theorem 1.14 follow directly from Corollaries 1.9 and 1.13.

Theorem 1.14.

Consider the process W defined at (1.2).

(a) Typical behavior of W:

W_n / ln ln n → d − 1 in probability.

(b) Almost sure behavior for W: If d ≥ 2, then

and, in particular,

Remark 1.15.

(a) When d = 1, at each time n there is exactly one current record, F_n^+ = F_n^- is the value of that record, RS_n is the closed interval [F_n^+, ∞), and W_n = 0.

(b) Using Remark 1.10, Theorem 1.14(b) can be strengthened to the conclusion that the set of limit points of the sequence W_n / ln ln n is almost surely the closed interval [d − 1, d].

(c) Theorem 1.14(b) has the following immediate corollary. If, for some positive integer , processes corresponding to dimension , , are defined on a common probability space (regardless of any dependence among the processes), then

(1.4)

That is, roughly speaking, for time  large relative to large dimension , the width almost surely concentrates near .

(d) We could have used  in the denominators of (1.4), but we chose because of Theorem 1.14(a). A remark of a somewhat similar flavor as (b) for convergence in probability is the following. If, for some integer , processes corresponding to dimension , , are defined on a common probability space (regardless of any dependence among the processes), then

We have not investigated whether this result might extend to dimension d growing with n.

1.4. Outline of paper

The stochastic process F^+ is studied in Section 2, where we prove Theorem 1.8. We treat the process F^- in Section 3, where we prove Theorem 1.12. In Section 4 we assess asymptotic behavior of the record counts (of records, remaining records, and broken records) introduced following Definition 1.1 as preparation for Section 5, where we produce versions of our main results concerning the record-setting frontier process F when time is measured in the number of records (rather than observations) generated.

2. The process F^+

This section is devoted to the proof of Theorem 1.8 concerning the process F^+ defined at (1.1). In light of the characterization provided by Lemma 1.6, Theorem 1.8 follows from results of [7]. Kiefer is concerned with behavior of the law of the iterated logarithm type for the empirical distribution function and sample

-quantiles for a sequence of independent uniform

random variables, with and , but notes that his results “may easily be translated into results for general laws.” Since we are concerned here with a sequence from the Gamma distribution and with (only) the upper quantile, for completeness and the reader’s convenience we distill Kiefer’s proof(s) for our special case.
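Since each coordinate-sum of an observation is a Gamma(d, 1) variable, the largest coordinate-sum among n observations can be simulated directly. The sketch below centers that maximum at ln n + (d − 1) ln ln n − ln((d − 1)!), which is the standard extreme-value centering for Gamma(d, 1) maxima; treating it as the relevant centering here is an assumption, since the displays of Theorem 1.8 are not reproduced in this text. The function name and parameters are illustrative.

```python
import math
import random


def centered_gamma_maxima(n, d, reps, seed=0):
    """Simulate reps copies of max of n Gamma(d, 1) draws, minus the centering."""
    rng = random.Random(seed)
    # math.lgamma(d) equals ln((d - 1)!) for a positive integer d.
    center = math.log(n) + (d - 1) * math.log(math.log(n)) - math.lgamma(d)
    samples = []
    for _ in range(reps):
        largest = max(rng.gammavariate(d, 1.0) for _ in range(n))
        samples.append(largest - center)
    return samples


if __name__ == "__main__":
    for d in (1, 2, 3):
        vals = centered_gamma_maxima(n=1000, d=d, reps=100, seed=1)
        print(d, sum(vals) / len(vals))
```

For d = 1 the sample mean should be close to the Gumbel mean (Euler's constant, about 0.577). For d ≥ 2 the agreement is rougher at moderate n, because the error in the centering decays only on the order of ln ln n / ln n.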

Proof of Theorem 1.8.

(a) This is elementary. We have

where .

(b) Kiefer describes two proofs. The first proof observes, for any sequence which is ultimately monotone nondecreasing, that

and applies the Borel–Cantelli lemmas to the sequence of independent events with . The second proof exploits the nondecreasingness of the sample paths of the process noted in Lemma 1.7 and proceeds as follows. If is ultimately monotone nondecreasing and is any strictly increasing sequence of positive integers, then

where we note that the random variables

(2.1)

are independent. Now choose and and apply the Borel–Cantelli lemmas.

(c) For the case of outer-class bottom boundaries, we start with the observation that if is ultimately monotone nondecreasing and is any strictly increasing sequence of positive integers, then

We then choose with and and apply the first Borel–Cantelli lemma.

For the case of inner-class bottom boundaries, we start with the observation that if is ultimately monotone nondecreasing and is any strictly increasing sequence of positive integers, then, recalling the definition (2.1),

We then choose with and with and apply the first Borel–Cantelli lemma to the events and the second Borel–Cantelli lemma to the independent events . ∎

3. The process F^-

3.1. Towards a stochastic lower bound on F^-

To prove Theorem 1.12 we need a stochastic lower bound on to complement the upper bound of Lemma 1.7. For this we use the definitions of the frontier and the closed record-setting region to argue as follows. For , let

denote the open positive orthant determined by . For any set , let denote the number of observations with that fall in . Then

(3.1)

The difficulty with upper-bounding the probability of this event is of course that the last union is uncountable. In the next subsection we produce a geometric lemma whose application effectively bounds the uncountable union by a finite union.

Figure 2. Geometric lemma illustrated for . Given with , the orthant determined by  must contain a point  with integer coordinates on the hyperplane .

3.2. A geometric lemma

Consider the (uncountable) union of positive orthants whose vertices lie on the hyperplane in where is an integer. We can also form a finite union of positive orthants whose vertices lie on the hyperplane situated a bit further from the origin. Our key geometric lemma guarantees that the uncountable union contains the finite union (see Figure 2).

Lemma 3.1.

Given a positive integer , and with

(3.2)

there exists with

(3.3)

such that

(3.4)

Proof.

We need to prove the existence of satisfying (3.3) and (3.4) (i.e., ). The frugal choice defined by

satisfies (3.4) but not necessarily (3.3). However, using (3.2) we observe that is at least the integer

and strictly less than the integer , i.e., is at most . Thus we need only (arbitrarily) “sweeten” (i.e., add  to) precisely of the entries to obtain  with the desired properties. ∎
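The construction in the proof (a frugal choice followed by sweetening) can be sketched in Python. The displays (3.2)–(3.4) are not reproduced in this text, so the sketch works under one reading that is labeled here as an assumption: x has integer coordinate-sum b, the lattice point z must have coordinate-sum b + d − 1, and z must dominate x coordinatewise. The frugal choice is taken to be the coordinatewise ceiling, and the function name is hypothetical.

```python
import math


def lattice_point_above(x, b):
    """Integer vector z with z_j >= x_j for all j and sum(z) == b + d - 1.

    Assumes (as a reading of the lemma) that sum(x) equals the integer b.
    """
    d = len(x)
    if abs(sum(x) - b) > 1e-9:
        raise ValueError("coordinate-sum of x must equal b")
    # Frugal choice: round every coordinate up to the next integer.
    z = [math.ceil(xj) for xj in x]
    # The frugal sum lies between b and b + d - 1, so between 0 and d - 1
    # entries need to be "sweetened" (increased by 1).
    deficit = (b + d - 1) - sum(z)
    if not 0 <= deficit <= d - 1:
        raise ValueError("unexpected deficit; check the input")
    for j in range(deficit):
        z[j] += 1
    return z


if __name__ == "__main__":
    print(lattice_point_above([0.5, 1.25, 2.25], b=4))
    print(lattice_point_above([1.0, 2.0, 1.0], b=4))
```

Which entries are sweetened is arbitrary, as the proof notes; this sketch simply increases the first few.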

3.3. A stochastic lower bound on F^-

Let . Returning to (3.1), we now see from Lemma 3.1 with and

together with homogeneity [ for and ], that

and so by finite subadditivity

But

Since the cardinality of equals

we conclude that

where the last inequality holds assuming that as .

We summarize and simplify the bound we have derived in the next proposition, where we assume further that . The bound is the key to the proof of the first assertion in Theorem 1.12(a) and of Theorem 1.12(c1).

Proposition 3.2 (Stochastic lower bound on F_n^-).

Let with and . Then

3.4. Proof of Theorem 1.12

In this subsection we prove Theorem 1.12, part by part in the order (a), (c1), (c2), (b).

Proof of Theorem 1.12(a).

The second assertion in Theorem 1.12(a) follows from the d = 1 case of Theorem 1.8(a) since, according to Lemma 1.7(a), we have

(3.5)

where we recall the definition

The first assertion follows from part (c1), proved next. ∎

Proof of Theorem 1.12(c1).

As noted in Lemma 1.7, the process has nondecreasing sample paths. From this it follows that if is (ultimately) monotone nondecreasing and is any strictly increasing sequence of positive integers, then

To complete the proof, we choose and , bound using Proposition 3.2, and apply the first Borel–Cantelli lemma.

Here are the details. Since and

the hypotheses of Proposition 3.2 are met and

which is summable. ∎

Remark 3.3.

We chose the constant  as the coefficient of in parts (a) and (c1) of Theorem 1.12 for convenience. As the proof shows, we could have used any constant larger than .

Proof of Theorem