1 Introduction
Let $S$ be a set of $n$ two-dimensional points. The ``largest'' points in $S$ are its maximal points, a well-studied object. More formally:^{1}

^{1} We restrict our definition to $\mathbb{R}^2$ because that is what this paper addresses; the concept of maxima generalizes naturally to $\mathbb{R}^d$ for $d > 2$ and has been well-studied there as well. We discuss this in more detail in the Conclusions and Extensions section.
Definition 1
For $p \in \mathbb{R}^2$, let $p_x$ ($p_y$) denote the $x$ ($y$) coordinate of $p$. For $p, q \in \mathbb{R}^2$, $p$ is dominated by $q$ if $p \neq q$, $p_x \le q_x$ and $p_y \le q_y$. If $S \subseteq \mathbb{R}^2$, then
$$\mathrm{MAX}(S) = \{ p \in S : p \text{ is not dominated by any } q \in S \}$$
are the maximal points of $S$.
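For a finite set, the maximal points of Definition 1 can be computed with a single sort-and-sweep pass. The following short sketch (illustrative only, not part of the paper) does so in $O(n \log n)$ time:

```python
def maximal_points(points):
    """Return the maximal (undominated) points of a finite 2D set."""
    best_y = float("-inf")
    maxima = []
    # Sweep by x descending (ties broken by y descending); a point is
    # maximal exactly when its y beats every y seen so far.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            maxima.append((x, y))
            best_y = y
    return maxima

pts = [(1, 1), (2, 3), (3, 2), (0, 4), (3, 0)]
print(maximal_points(pts))  # [(3, 2), (2, 3), (0, 4)]
```

Here $(1, 1)$ and $(3, 0)$ are excluded because they are dominated by $(2, 3)$ and $(3, 2)$ respectively.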
The problems of finding and estimating the number of maximal points of a set in $\mathbb{R}^2$ appear very often in many fields under different denominations: maximal vectors, skylines, Pareto frontier/points, and others; see, e.g., [18, 5, 17, 12, 14] and, for a more exhaustive history of the problems and further references, Sections 1 and 2 in [7]. Recall that the $L_p$ metric for points in two-dimensional space is defined by $\|u\|_p = (|u_x|^p + |u_y|^p)^{1/p}$ for $1 \le p < \infty$, and $\|u\|_\infty = \max(|u_x|, |u_y|)$.
Let $S$ denote a set of $n$ points chosen Independently and Identically Distributed (IID) from some 2D distribution $\mathcal{D}$, and let $\mathrm{MAX}_n$ be the random variable counting the number of maximal points in $S$. Because maxima are so ubiquitous, understanding the expected number of maxima has been important in many areas, and many properties of $\mathrm{MAX}_n$ have been studied. More specifically, if $\mathcal{D}$ is the uniform distribution drawn from an $L_p$ ball with $1 \le p \le \infty$, then it is well known [12, 2, 13, 6] that:

If $p = \infty$, then $E[\mathrm{MAX}_n] = \Theta(\ln n)$.
The same result holds if the points are drawn from some product distribution in which the $x$ and $y$ coordinates come from ANY two 1-dimensional distributions that are independent of each other.
If $1 \le p < \infty$, then $E[\mathrm{MAX}_n] = \Theta(\sqrt{n})$,
where the constant implicit in the $\Theta$ is dependent only upon $p$.

Similar results to the above, i.e., that $E[\mathrm{MAX}_n] = \Theta(\sqrt{n})$, derived using similar techniques, are known if $\mathcal{D}$ is the uniform distribution over ANY bounded convex region [11].
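The two growth rates just cited are easy to observe empirically. The following Monte Carlo sketch (illustrative only; the sample size, trial count, and samplers are our choices, not the paper's) estimates $E[\mathrm{MAX}_n]$ for points uniform in an $L_\infty$ ball (a square) and in an $L_2$ ball (a disk):

```python
import random

def count_maxima(points):
    # Count maximal points by a descending-x sweep.
    best_y = float("-inf")
    count = 0
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            count += 1
            best_y = y
    return count

def sample_square():
    return (random.uniform(-1, 1), random.uniform(-1, 1))

def sample_disk():
    # Rejection sampling from the L_2 unit ball.
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)

def avg_maxima(sampler, n, trials=200):
    total = sum(count_maxima([sampler() for _ in range(n)])
                for _ in range(trials))
    return total / trials

random.seed(0)
n = 2000
sq = avg_maxima(sample_square, n)  # roughly ln(n) + 0.58, i.e. about 8
dk = avg_maxima(sample_disk, n)    # of order sqrt(n), i.e. tens
print(sq, dk)
```

The disk average is far larger than the square average, matching the $\Theta(\sqrt{n})$ versus $\Theta(\ln n)$ behavior quoted above.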
It is also known [15, 16] that if the points are chosen IID from a 2-D Gaussian distribution, then $E[\mathrm{MAX}_n] = \Theta(\sqrt{\ln n})$.
There are also generalizations of these results (both the $L_p$ ones and the Gaussian one) to higher dimensions. See [13] for a table containing most known results. Surprisingly, given the importance of the problem, not much else is known. The motivation for this work is to extend the family of distributions for which $E[\mathrm{MAX}_n]$ can be derived.
Consider a point that is originally generated from some uniform distribution over a unit $L_p$ ball but has some error in the $L_q$ metric when measured or reported. The actual reported point can be equivalently considered as being chosen from a new distribution, the convolution of the two uniform distributions (the next section provides formal definitions). Note that the support of this distribution is the Minkowski sum of the two balls.
As an example, Figure 2 shows the support of such a convolution. In the diagram, the shaded inner square is the unit ball. A point chosen from that square is then perturbed by the addition of another point, drawn uniformly from a ball with radius $\epsilon$. The support of this convoluted distribution is the interior of the dotted region in the figure.
Note that the distribution is NOT uniform on this support. Towards the centre the density is uniform, but it decreases approaching the boundary of the support, where it becomes zero. Note too that the rate of decrease differs in different parts of the support. It is this non-uniformity that will cause complications in calculating $E[\mathrm{MAX}_n]$.
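A quick way to see the shape of this convoluted distribution is to simulate it. The sketch below (an illustration under the assumption that both balls are $L_\infty$ balls, i.e. squares; not code from the paper) draws a point uniformly from the unit square, adds an independent uniform error of radius `eps`, verifies that every sample lies in the Minkowski-sum support, and shows that the corner regions of the support receive far less mass than a uniform density would give them:

```python
import random

def sample_perturbed(eps):
    ux, uy = random.uniform(-1, 1), random.uniform(-1, 1)          # u from the unit square
    vx, vy = random.uniform(-eps, eps), random.uniform(-eps, eps)  # error from the eps-square
    return (ux + vx, uy + vy)

random.seed(1)
eps = 0.25
samples = [sample_perturbed(eps) for _ in range(10000)]

# Support check: every sample lies in the Minkowski sum [-(1+eps), 1+eps]^2.
assert all(abs(x) <= 1 + eps and abs(y) <= 1 + eps for x, y in samples)

# Non-uniformity near the boundary: the four corner squares of the support
# (|x| > 1 and |y| > 1) make up 4*eps^2 / (2 + 2*eps)^2 = 4% of its area,
# but receive far less than 4% of the probability mass.
corner_frac = sum(1 for x, y in samples if abs(x) > 1 and abs(y) > 1) / len(samples)
print(corner_frac)
```

The observed corner fraction is an order of magnitude below the 4% that a uniform density over the support would produce, illustrating the decay of the density near the boundary.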
Although the problem described above was motivated by small $\epsilon$, it is well defined for all $\epsilon > 0$, which is what we analyze in this paper. More specifically, the motivation for the present work is twofold:

Explain how $E[\mathrm{MAX}_n]$ changes when the distribution is perturbed, and

Increase the families of distributions for which $E[\mathrm{MAX}_n]$ is understood.
The idea of analyzing how quantities change under perturbations is smoothed analysis [20, 21]. In the classic setting, smoothed analysis of the number of maxima would mean analyzing how, given a fixed point set $S$, the number of maxima would change under small perturbations of the points (as a function of the original set $S$). This was the approach in [9, 8] (see also similar work for convex hulls in [10]). This paper differs in that it is the distribution that is being smoothed (or convoluted) and not the point set. This paper also differs from recent work [22, 1] on the most-likely skyline and convex hull problems in that those papers assume that each point has a given probability distribution and they are attempting to find the subset of points that has the highest probability of being the skyline (or convex hull).
2 Definitions and Results
Definition 2
$x \in \mathbb{R}$ or $x < \infty$ will denote that $x$ is a real number; $x \le \infty$ will denote that $x < \infty$ OR $x = \infty$.
Definition 3
Let $\mathcal{D}$ be a distribution over $\mathbb{R}^2$.

If $c > 0$, the distribution $c\,\mathcal{D}$ is generated by choosing a point $u$ using $\mathcal{D}$ and then returning the point $c\,u$.

Let $\mathcal{D}_1, \mathcal{D}_2$ be two distributions over $\mathbb{R}^2$. $\mathcal{D}_1 * \mathcal{D}_2$ is the convolution of $\mathcal{D}_1$ and $\mathcal{D}_2$. It is generated by choosing a point $u$ from $\mathcal{D}_1$ and a point $v$ from $\mathcal{D}_2$ and returning $u + v$.

A set of $n$ points is Chosen from $\mathcal{D}$ if the points are IID, with each being generated using distribution $\mathcal{D}$.
Definition 4
Let $S, S_1, S_2 \subseteq \mathbb{R}^2$ and $c > 0$. Then:

The Minkowski sum of sets $S_1$ and $S_2$ is $S_1 \oplus S_2 = \{ u + v : u \in S_1,\ v \in S_2 \}$.

If $c > 0$, $c\,S$ will denote the set $\{ c\,u : u \in S \}$.
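On finite point sets both operations of Definition 4 are easy to compute. The following sketch (illustrative only, not from the paper) implements the Minkowski sum and the scaled set:

```python
def minkowski_sum(S1, S2):
    """S1 (+) S2 = { u + v : u in S1, v in S2 } for finite sets."""
    return {(u[0] + v[0], u[1] + v[1]) for u in S1 for v in S2}

def scale(c, S):
    """cS = { c*u : u in S }."""
    return {(c * u[0], c * u[1]) for u in S}

S1 = {(0, 0), (1, 0)}
S2 = {(0, 0), (0, 1)}
print(sorted(minkowski_sum(S1, S2)))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(sorted(scale(2, S1)))           # [(0, 0), (2, 0)]
```

The same definitions apply verbatim to infinite sets such as balls, which is how the support of the convoluted distribution arises.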
Definition 5
Let $p, q \ge 1$, $\epsilon > 0$, and $u \in \mathbb{R}^2$.

The $L_p$ ball of radius $\epsilon$ around $u$ is $\{ v \in \mathbb{R}^2 : \|v - u\|_p \le \epsilon \}$.

The $L_q$ ball of radius $\epsilon$ around $u$ is defined analogously, using the $L_q$ metric.

Let $B_p$ and $B_q$ denote the respective unit balls around the origin, and let $A_p$ and $A_q$ denote their respective areas.

For all $\epsilon > 0$, the uniform distribution on $\epsilon B_q$ selects a point uniformly from $\epsilon B_q$. This distribution has support $\epsilon B_q$, with uniform density within it.

The distribution studied in this paper will be the convolution of the uniform distributions on $B_p$ and on $\epsilon B_q$. This distribution’s support is the Minkowski sum $B_p \oplus \epsilon B_q$. Note that the density of this convolution is NOT uniform on $B_p \oplus \epsilon B_q$.
The main result of this paper is
Theorem 1
Fix $p$ and $q$ so that either $p, q < \infty$ or $p = q = \infty$. Let $S$ be $n$ points chosen from the convoluted distribution of Definition 5, and let $\epsilon$ be a function of $n$. Then $E[\mathrm{MAX}_n]$ behaves as below.
Observations: In the following, rows and columns refer to the table in Theorem 1.

When $\epsilon = 0$, $\mathrm{MAX}_n$ has exactly the same distribution as if $S$ were chosen from the unperturbed uniform distribution, so this is an uninteresting case.

When $\epsilon$ is small enough, $\mathrm{MAX}_n$ behaves almost as if $S$ were chosen from the unperturbed uniform distribution, and when $\epsilon$ is large enough it behaves almost as if $S$ were chosen from the perturbing distribution alone.

Later, Lemma 8 will show that $\mathrm{MAX}_n$ has the same distribution for $S$ chosen from either of the two correspondingly scaled convolutions. Thus row (iv) gives the behavior for one of them and row (v) the behavior for the other.

In the first case the behavior starts at the unperturbed value, smoothly decreases until reaching a minimum, and then increases again until reaching the limiting value. The behavior in the middle is different in the two subcases, but in both there is symmetry between small and large $\epsilon$ (from Lemma 8).

In the second case there is no symmetry. The behavior starts at the unperturbed value, decreases to a minimum, and then increases again, at a different rate, to the limiting value.

In the last case, the behavior is asymptotically equivalent for all parameter values, not just those singled out above; the only difference is in the value of the constant hidden by the $\Theta$. The behavior starts at the unperturbed value, stays there for a short while, and then smoothly increases to the limiting value.
3 Basic Lemmas
The following collection of lemmas comprises the basic toolkit used to derive Theorem 1. They are only stated here, with complete proofs being provided in Section 5.
Definition 6
Let $\mathcal{D}$ be a distribution over $\mathbb{R}^2$, and let $R$ be a measurable region.

$f_{\mathcal{D}}$ will denote the density function of $\mathcal{D}$.

$\mu_{\mathcal{D}}(R)$ will denote the measure of $R$ under $\mathcal{D}$.

If $\mathcal{D}$ is understood, we often simply write $f$ and $\mu(R)$.
Definition 7
Let , and

.


is dominant in or a dominant region in if
Note that, by definition, is a dominant region in
Lemma 1
Let and be chosen from and Then
The following observation will be used to prove most of our lower bounds.
Lemma 2 (Lower Bound)
Let be chosen from . Further let be a collection of pairwise disjoint dominant regions in with for all . Then
Definition 8
Let For define
the preimage of point in
Lemma 3
Fix . Let and let be a point chosen from Let . Then
(1)  
(2) 
Lemma 4
Fix . Let and be any constant.
The constants implicit in the asymptotic notation in (a) and (c) are only dependent upon $p$, while those implicit in the asymptotic notation in (b) and (d) are only dependent upon $q$.
Lemma 5 (Mirror)
Let be any distribution with a continuous density function and a set of points chosen from . Let be two disjoint regions in the support that are parameterized by and satisfy:

.

(Monotonicity in ) , and .

(Asymptotic dominance in measure)
Define the random variables
Then
Lemma 6 (Sweep)
Let be any distribution with a continuous density function and a set of points chosen from . Let be two disjoint regions in the support that are parameterized by , satisfy conditions (1)–(3) of Lemma 5 and, in addition, satisfy
Then
Corollary 7
Fix and choose from Let be the upper-right quadrant of the plane and the first octant, i.e.,
Then
(3)  
(4) 
Proof: Set
For set
Conditions (1) and (2) of Lemma 5 trivially hold. Condition (3) holds by symmetry around the axis. Finally, the additional condition of Lemma 6 holds because every point in the first region is below and to the left of every point in the second. Thus the expected number of maximal points below the axis is bounded as claimed; note that this bound is independent of the parameters. The same argument bounds the expected number of maximal points to the left of the axis. This proves Eq. 3.
To prove Eq. 4, define the second octant to be
By the symmetry between the $x$ and $y$ coordinates in the distribution,
Furthermore, since the two octants partition the quadrant,
Thus
The fact that for , dominates if and only if dominates implies
Lemma 8 (Scaling)
Fix $c > 0$ and $n$. Let $S$ be $n$ points chosen from a distribution and $S'$ be $n$ points chosen from that distribution scaled by $c$. Then $|\mathrm{MAX}(S)|$ and $|\mathrm{MAX}(S')|$ have exactly the same distribution. In particular,
Lemma 9 (Limiting Behavior)
Let , , and chosen from . Then
Note that if $S$ is chosen from the uniform distribution on an $L_\infty$ ball, the $x$ and $y$ coordinates of each point are independent random variables. Thus, for any $\epsilon$, if $S$ is chosen from the convolution of two such distributions, the $x$ and $y$ coordinates are again independent random variables. As noted in the introduction, this means that $E[\mathrm{MAX}_n]$ is exactly the same as if $S$ were just chosen from a single $L_\infty$ ball, i.e.,
Now note that Lemma 9 combined with Lemma 8 immediately implies the limiting behavior in columns (b) and (e) of the table in Theorem 1. Note too that for rows (ii) and (iii), column (d) follows directly from applying Lemma 8 to column (c).
Thus, proving Theorem 1 reduces to proving cells (ii) c, (iii) c, (iv) c,d and (v) c,d. In the next sections we sketch how to derive these results with full proofs relegated to the appendix.
4 The General Approach
4.1 A Simple Example:
Before sketching our results it is instructive to see how the lemmas in the previous section can be used to re-derive the fact that, if then . This is illustrated in Figure 4.
Even though the behavior of is already well understood, we provide this to sketch the generic steps that are needed to derive . These are exactly the same steps that are needed when and they permit identifying where the complications can arise in those more general cases.
Set and let be the points defined in the figure with and Also set
Finally, for set and . The steps in the derivation are as follows.
4.2 The General Approach
The proof of Theorem 1 will require case-by-case analyses of $E[\mathrm{MAX}_n]$ for different pairs $(p, q)$. The analysis for each pair will follow exactly the same five steps as the analysis above. We note where the complications arise.
Step 1, of restricting the analysis to the quadrant, will be the same for every case.
Step 2, of deriving the measure, will often be quite cumbersome. While Lemma 3 provides an integral formula, in many cases it is unusable. The density varies quite widely near the border of the support, which is where most of the maxima are located. A substantial amount of work is involved in finding usable functional representations for the densities/measures in different parts of the support.
Step 3, of deriving the lower bound, is usually a simple application of Lemma 2, given the results of step 2.
Step 4 is the hardest step. It is usually derived using the sweep lemma, with the difficulties arising from how to specify the regions to be swept. This strongly depends upon how the measure is represented.
5 Proofs of Basic Lemmas
Proof of Lemma 2:
First note that, from Lemma 1, Thus implies
If a region is dominant, then points in it can only be dominated by other points in the same region, so Since each region in the collection is dominant, this implies
Finally, since the are pairwise disjoint,
Proof of Lemma 3:
Note that for , and for , .
To see Eq. 2 note that
For Eq. 1 first note that
where (5) comes from the change of variables . Differentiating around yields Eq. 1.
Proof of Lemma 4:
The proof for (a) follows easily from the fact that, for all
so from Eq. 1, . Furthermore, if then
where is only dependent upon . Thus, again from Eq. 1, , proving (b). The proofs for (c) and (d) follow from plugging these inequalities into Eq. 2.
Proof of the Mirror Lemma (Lemma 5):
Without loss of generality smoothly rescale so that , and thus .
The informal intuition of the Lemma is that since the “first” point in appears when the sweep line is , Since is asymptotically dominated in measure by and thus
Note that by the continuity of the measure we know that . That is, we may assume that
Now assume that is known. Conditioned on known , the remaining points in are chosen from with the associated conditional distribution. More specifically, if is any one of those points.
Thus, conditioning on , and applying Lemma 1 (b)
and therefore
From the definition of and Lemma 1 (c), with exponentially low probability. Therefore, recalling that
Another application of Lemma 1 (c) shows
Thus
Proof of the Sweep Lemma (Lemma 6):
From the setup in Lemma 5, for all all points in are dominated by all points in . By the definition of , contains (exactly) one point. Thus no point in can be maximal, i.e.,
The proof follows from
Proof of Lemma 8:
Let be chosen from . Recall that the process of choosing point from is to choose from , from and return . Choosing a point from is the same except that it returns . Thus the distribution of choosing from is exactly the same as choosing from
Finally, note that dominance is invariant under multiplication by a positive scalar, i.e., $u$ dominates $v$ if and only if $cu$ dominates $cv$. Thus $|\mathrm{MAX}(S)|$ and $|\mathrm{MAX}(S')|$ have the same distribution and
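The key fact in this proof, that dominance is preserved under multiplication by a positive scalar, can be checked numerically. Below is a small sketch (illustrative only, not the paper's code): scaling every point of a random set by a constant $c > 0$ leaves the number of maxima unchanged.

```python
import random

def count_maxima(points):
    # Count maximal points by a descending-x sweep.
    best_y = float("-inf")
    count = 0
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            count += 1
            best_y = y
    return count

random.seed(2)
pts = [(random.random(), random.random()) for _ in range(500)]
c = 3.7
scaled = [(c * x, c * y) for x, y in pts]  # multiply every point by c > 0

# Dominance, and hence the set of maxima, is preserved under positive scaling.
assert count_maxima(pts) == count_maxima(scaled)
print(count_maxima(pts))
```

Note that a negative scalar would reverse dominance, which is why the lemma requires $c > 0$.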
The proof of Lemma 9 will need an observation that will be reused multiple times in the analysis of and is therefore stated first, in its own lemma.
Lemma 10
Recall from Definition 7. Fix and set