Shape comparison is a fundamental problem in shape recognition, shape classification and shape retrieval, with applications mainly in Computer Vision and Computer Graphics. The shape comparison problem is often dealt with by defining a suitable distance that measures the dissimilarity between shapes.
Over the last fifteen years, Size Theory has been developed as a geometrical-topological theory for comparing shapes, each shape viewed as a topological space $\mathcal{M}$, endowed with a real-valued continuous function $\varphi$. The pair $(\mathcal{M},\varphi)$ is called a size pair, while $\varphi$ is said to be a measuring function. The role of the function $\varphi$ is to take into account only the shape properties of the object described by $\mathcal{M}$ that are relevant to the shape comparison problem at hand, while disregarding the irrelevant ones, as well as to impose the desired invariance properties.
A measure of the dissimilarity between two size pairs is given by the natural pseudo-distance. The main idea in the definition of natural pseudo-distance between size pairs is to minimize the change in the measuring functions due to the application of homeomorphisms between topological spaces, with respect to the $\max$-norm: The natural pseudo-distance between $(\mathcal{M},\varphi)$ and $(\mathcal{N},\psi)$, with $\mathcal{M}$ and $\mathcal{N}$ homeomorphic, is the number
$$\inf_{h}\,\max_{P\in\mathcal{M}}\,|\varphi(P)-\psi(h(P))|,$$
where $h$ varies in the set of all the homeomorphisms between $\mathcal{M}$ and $\mathcal{N}$. In other words, the variation of the shapes is modeled by the infinite-dimensional group of homeomorphisms and the cost of warping an object’s shape into another is measured by the change of the measuring functions. An important feature of the natural pseudo-distance is that it requires neither a choice of parametrizations for the spaces under study nor a choice of origins of coordinates, both of which would be arbitrary in image applications.
The main aim of this paper is to provide a new method to estimate the natural pseudo-distance, motivated by the intrinsic difficulty of a direct computation. Since the natural pseudo-distance is defined by a minimization process, it would be natural to look for the optimal transformation that takes one shape into the other, as usual in energy minimization methods. In our case, however, the existence of an optimal homeomorphism attaining the natural pseudo-distance is not guaranteed.
Earlier results about the natural pseudo-distance can be divided into two classes. One class provides constraints on the possible values taken by the natural pseudo-distance between two size pairs. For example, if the considered topological spaces $\mathcal{M}$ and $\mathcal{N}$ are smooth closed manifolds and the measuring functions are also smooth, then the natural pseudo-distance is an integer sub-multiple of the Euclidean distance between two suitable critical values of the measuring functions. In particular, this integer can only be either $1$ or $2$ in the case of curves, while it can be either $1$, $2$ or $3$ in the case of surfaces. The other class of results furnishes lower bounds for the natural pseudo-distance. In particular, it is possible to estimate the natural pseudo-distance by using the concept of size function. Indeed, size functions can reduce the comparison of shapes to the comparison of certain countable subsets of the real plane. This reduction allows us to study the space of all homeomorphisms between the considered topological spaces, without actually computing them. The research on size functions has led to a formal setting, which has turned out to be useful not only from a theoretical point of view, but also in applications.
This paper investigates the problem of obtaining lower bounds for the natural pseudo-distance using size functions.
To this aim, we first introduce the concept of reduced size function. Reduced size functions are a slightly modified version of size functions based on the connectedness relation instead of path-connectedness. This new definition is introduced in order to obtain both theoretical and computational advantages (see Rem. 3 and Rem. 9). However, the main properties of size functions are maintained. In particular, reduced size functions can be represented by sets of points of the (extended) real plane, called cornerpoints.
Then we need a preliminary result about reduced size functions (Th. 25). It states that a suitable distance between reduced size functions exists, which is continuous with respect to the measuring functions (in the sense of the $C^0$-topology). We call this distance matching distance, since the underlying idea is to measure the cost of matching the two sets of cornerpoints describing the reduced size functions. The matching distance reduces to the bottleneck distance used for comparing Persistent Homology Groups when the measuring functions are taken in the subset of tame functions. We underline that the continuity of the matching distance implies a property of perturbation robustness for size functions, allowing them to be used in real applications.
Having proven this, we are ready to obtain our main results. Indeed, the stability of the matching distance allows us to prove a sharp lower bound for the change of measuring functions under the action of homeomorphisms between topological spaces, i.e. for the natural pseudo-distance (Th. 29). Furthermore, we prove that the lower bound obtained using the matching distance not only improves the previously known lower bound, but is the best possible lower bound for the natural pseudo-distance obtainable using size functions. The proof of these facts is based on Lemma 30. This lemma is a crucial result stating that it is always possible to construct two suitable measuring functions on a topological $2$-sphere with given reduced size functions and a pseudo-distance equaling their matching distance. On the basis of this lemma, in Th. 32 and Th. 34 we prove that the matching distance we are considering is, in two different ways, the best metric to compare reduced size functions.
This paper is organized as follows. In Section 2 we introduce the concept of reduced size function and its main properties. In Section 3 the definition of matching distance between reduced size functions is given. In Section 4 the stability theorem is proved, together with some other useful results. The connection with natural pseudo-distances between size pairs is shown in Section 5, together with the proof of the existence of an optimal matching between reduced size functions. Section 6 contains the proof that it is always possible to construct two size pairs with pre-assigned reduced size functions and a pseudo-distance equaling their matching distance. This result is used in Section 7 to conclude that the matching distance furnishes the finest lower bound for the natural pseudo-distance between size pairs among the lower bounds obtainable through reduced size functions. In Section 8 our results are briefly discussed.
2 Reduced size functions
In this section we introduce reduced size functions, that is, a notion derived from size functions that allows for a simplified treatment of the theory. The definition of reduced size function differs from that of size function in that it is based on the relation of connectedness rather than on path-connectedness. The motivation for this change, as explained in Remark 3, has to do with the right-continuity of size functions.
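In what follows, $\mathcal{M}$ denotes a non-empty compact connected and locally connected Hausdorff space, representing the object whose shape is under study.
The assumption on the connectedness of $\mathcal{M}$ can easily be weakened to any finite number of connected components without much affecting the following results. More serious problems would derive from considering an infinite number of connected components.
We shall call any pair $(\mathcal{M},\varphi)$, where $\varphi:\mathcal{M}\to\mathbb{R}$ is a continuous function, a size pair. The function $\varphi$ is said to be a measuring function. The role of the function $\varphi$ is to take into account only the shape properties of the object described by $\mathcal{M}$ that are relevant to the shape comparison problem at hand, while disregarding the irrelevant ones, as well as to impose the desired invariance properties.
Assume a size pair $(\mathcal{M},\varphi)$ is given. For every $x\in\mathbb{R}$, let $\mathcal{M}\langle\varphi\le x\rangle$ denote the lower level set $\{P\in\mathcal{M}:\varphi(P)\le x\}$.
For every real number $y$, we shall say that two points $P,Q\in\mathcal{M}$ are $\langle\varphi\le y\rangle$-connected if and only if a connected subset of $\mathcal{M}\langle\varphi\le y\rangle$ exists, containing both $P$ and $Q$.
In the following, $\Delta$ denotes the diagonal $\{(x,y)\in\mathbb{R}^2: x=y\}$; $\Delta^+$ denotes the open half-plane $\{(x,y)\in\mathbb{R}^2: x<y\}$ above the diagonal; $\bar{\Delta}^+$ denotes the closed half-plane $\{(x,y)\in\mathbb{R}^2: x\le y\}$ above the diagonal.
(Reduced size function) We shall call reduced size function associated with the size pair $(\mathcal{M},\varphi)$ the function $\ell^*_{(\mathcal{M},\varphi)}:\Delta^+\to\mathbb{N}$, defined by setting $\ell^*_{(\mathcal{M},\varphi)}(x,y)$ equal to the number of equivalence classes into which the set $\mathcal{M}\langle\varphi\le x\rangle$ is divided by the relation of $\langle\varphi\le y\rangle$-connectedness.
In other words, $\ell^*_{(\mathcal{M},\varphi)}(x,y)$ counts the number of connected components in $\mathcal{M}\langle\varphi\le y\rangle$ that contain at least one point of $\mathcal{M}\langle\varphi\le x\rangle$. The finiteness of this number is an easily obtainable consequence of the compactness and local-connectedness of $\mathcal{M}$.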
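The counting in this definition can be illustrated on a discrete stand-in for a size pair. The following is a minimal sketch, assuming the space is approximated by a finite graph whose vertices carry the values of the measuring function; this discretization, and the names `reduced_size_function`, `values` and `edges`, are assumptions made here for illustration and are not part of the definition. The function counts the connected components of the sublevel graph at level y that contain at least one vertex of value at most x.

```python
def reduced_size_function(values, edges, x, y):
    """Count components of the sublevel graph {phi <= y} that meet {phi <= x}.

    values: dict mapping each vertex to the value of the measuring function.
    edges:  iterable of vertex pairs (an assumed discretization of the space).
    """
    if not x < y:
        raise ValueError("the reduced size function is defined only for x < y")

    # Union-find restricted to the vertices of the sublevel set {phi <= y}.
    parent = {v: v for v, val in values.items() if val <= y}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # An edge belongs to the sublevel graph when both endpoints do.
    for u, v in edges:
        if u in parent and v in parent:
            parent[find(u)] = find(v)

    # Keep only the components containing a vertex with phi <= x.
    return len({find(v) for v in parent if values[v] <= x})


# Hypothetical example: a path graph 0-1-2-3-4 with local minima at 0, 2 and 4.
values = {0: 0.0, 1: 2.0, 2: 1.0, 3: 3.0, 4: 0.5}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

print(reduced_size_function(values, edges, 1.0, 1.5))  # 3: the minima are still separated
print(reduced_size_function(values, edges, 1.0, 2.5))  # 2: vertices 0 and 2 have merged
print(reduced_size_function(values, edges, 1.0, 3.0))  # 1: everything is connected
```

On a graph, connectedness and path-connectedness coincide, so this sketch cannot exhibit the difference between classical and reduced size functions; it only illustrates the counting.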
An example of reduced size function is illustrated in Fig. 1. In this example we consider the size pair , where is the curve represented by a continuous line in Fig. 1 (a), and is the function “distance from the point ”. The reduced size function associated with is shown in Fig. 1 (b). Here, the domain of the reduced size function is divided by solid lines, representing the discontinuity points of the reduced size function. These discontinuity points divide into regions in which the reduced size function is constant. The value displayed in each region is the value taken by the reduced size function in that region.
For instance, for , the set has two connected components which are contained in different connected components of when . Therefore, for and . When and , all the connected components of are contained in the same connected component of . Therefore, for and . When and , all of the three connected components of belong to the same connected component of , implying that in this case .
As for the values taken on the discontinuity lines, they are easily obtained by observing that reduced size functions are right-continuous, both in the variable and in the variable .
The property of right-continuity in the variable can easily be checked and holds for classical size functions as well. The analogous property for the variable is not immediate, and in general does not hold for classical size functions, if not under stronger assumptions, such as, for instance, that is a smooth manifold and the measuring function is Morse (cf. Cor. 2.1 in ). Indeed, the relation of -homotopy, used to define classical size functions, does not pass to the limit. On the contrary, the relation of -connectedness does, that is to say, if, for every it holds that and are -connected, then they are -connected. To see this, observe that connected components are closed sets, and the intersection of a family of compact, connected Hausdorff subspaces of a topological space, with the property that for every , is connected (cf. Th. 28.2 in  p. 203).
Most properties of classical size functions continue to hold for reduced size functions. For the aims of this paper, it is important that for reduced size functions it is possible to define an analog of classical size functions’ cornerpoints and cornerlines, here respectively called proper cornerpoints and cornerpoints at infinity. The main reference here is .
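(Proper cornerpoint) For every point $p=(x,y)\in\Delta^+$, let us define the number $\mu(p)$ as the minimum over all the positive real numbers $\epsilon$, with $x+\epsilon<y-\epsilon$, of
$$\ell^*_{(\mathcal{M},\varphi)}(x+\epsilon,y-\epsilon)-\ell^*_{(\mathcal{M},\varphi)}(x-\epsilon,y-\epsilon)-\ell^*_{(\mathcal{M},\varphi)}(x+\epsilon,y+\epsilon)+\ell^*_{(\mathcal{M},\varphi)}(x-\epsilon,y+\epsilon).$$
The finite number $\mu(p)$ will be called multiplicity of $p$ for $\ell^*_{(\mathcal{M},\varphi)}$. Moreover, we shall call proper cornerpoint for $\ell^*_{(\mathcal{M},\varphi)}$ any point $p\in\Delta^+$ such that the number $\mu(p)$ is strictly positive.
(Cornerpoint at infinity) For every vertical line $r$, with equation $x=k$, let us identify $r$ with the pair $(k,\infty)$, and define the number $\mu(r)$ as the minimum, over all the positive real numbers $\epsilon$ with $k+\epsilon<1/\epsilon$, of
$$\ell^*_{(\mathcal{M},\varphi)}\left(k+\epsilon,\frac{1}{\epsilon}\right)-\ell^*_{(\mathcal{M},\varphi)}\left(k-\epsilon,\frac{1}{\epsilon}\right).$$
When this finite number, called multiplicity of $r$ for $\ell^*_{(\mathcal{M},\varphi)}$, is strictly positive, we call the line $r$ a cornerpoint at infinity for the reduced size function.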
The multiplicity of points and of vertical lines is always non negative. This follows from an analog of Lemma 1 in , based on counting the equivalence classes in the set
quotiented by the relation of -connectedness, in order to obtain the number
Under our assumptions on $\mathcal{M}$, i.e. its connectedness, $\mu(r)$ can only take the values $0$ and $1$, but the definition can easily be extended to spaces with any finite number of connected components, so that $\mu(r)$ can equal any natural number. Moreover, the connectedness assumption also implies that there is exactly one cornerpoint at infinity.
As an example of cornerpoints in reduced size functions, in Fig. 2 we see that the proper cornerpoints are the points , and (with multiplicity , and , respectively). The line is the only cornerpoint at infinity.
The importance of cornerpoints is revealed by the next result, analogous to Prop. 10 of , showing that cornerpoints, with their multiplicities, uniquely determine reduced size functions.
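The open (resp. closed) half-plane $\Delta^+$ (resp. $\bar{\Delta}^+$) extended by the points at infinity of the kind $(k,\infty)$ will be denoted by $\Delta^*$ (resp. $\bar{\Delta}^*$), i.e.
$$\Delta^*:=\Delta^+\cup\{(k,\infty):k\in\mathbb{R}\},\qquad \bar{\Delta}^*:=\bar{\Delta}^+\cup\{(k,\infty):k\in\mathbb{R}\}.$$
(Representation Theorem) For every $(\bar{x},\bar{y})\in\Delta^+$ we have
$$\ell^*_{(\mathcal{M},\varphi)}(\bar{x},\bar{y})=\sum_{\substack{(x,y)\in\Delta^*\\ x\le\bar{x},\ y>\bar{y}}}\mu\big((x,y)\big).\qquad(1)$$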
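The Representation Theorem can be read as a recipe: the value of a reduced size function at a point is the total multiplicity of the cornerpoints lying to the left of and above that point. The following minimal sketch evaluates that sum for a finite list of cornerpoints and then recovers multiplicities through the alternating sums used in the definitions of proper cornerpoint and cornerpoint at infinity. The cornerpoint data and all function names are hypothetical, and the multiplicities are evaluated at one fixed small epsilon rather than as a minimum over all admissible values.

```python
import math


def ell_from_cornerpoints(cornerpoints, x_bar, y_bar):
    """Sum the multiplicities of cornerpoints (x, y) with x <= x_bar and y > y_bar.

    cornerpoints: list of ((x, y), multiplicity); y = math.inf marks a
    cornerpoint at infinity.
    """
    if not x_bar < y_bar:
        raise ValueError("the reduced size function is defined only for x < y")
    return sum(m for (x, y), m in cornerpoints if x <= x_bar and y > y_bar)


def proper_multiplicity(ell, x, y, eps):
    """Alternating sum of ell at the four corners of a square around (x, y)."""
    if not x + eps < y - eps:
        raise ValueError("eps is too large for this point")
    return (ell(x + eps, y - eps) - ell(x - eps, y - eps)
            - ell(x + eps, y + eps) + ell(x - eps, y + eps))


def multiplicity_at_infinity(ell, k, eps):
    """Jump of ell across the vertical line x = k, evaluated at height 1/eps."""
    if not k + eps < 1 / eps:
        raise ValueError("eps is too large for this line")
    return ell(k + eps, 1 / eps) - ell(k - eps, 1 / eps)


# Hypothetical data: one cornerpoint at infinity and two proper cornerpoints.
cornerpoints = [((0.0, math.inf), 1), ((0.5, 3.0), 1), ((1.0, 2.0), 1)]


def ell(x, y):
    return ell_from_cornerpoints(cornerpoints, x, y)


print(ell(1.0, 1.5), ell(1.0, 2.5), ell(1.0, 3.0))   # 3 2 1
print(proper_multiplicity(ell, 1.0, 2.0, 0.25))      # 1: (1, 2) is a cornerpoint
print(proper_multiplicity(ell, 0.7, 2.5, 0.25))      # 0: no cornerpoint here
print(multiplicity_at_infinity(ell, 0.0, 0.25))      # 1
```

For a locally finite set of cornerpoints the alternating sum stops changing once epsilon is small enough, which is why a single small value suffices in this example.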
The equality (1) can be checked in the example of Fig. 2. The points where the reduced size function takes value are exactly those for which there is no cornerpoint (either proper or at infinity) lying to the left and above them. Let us take a point in the region of the domain where the reduced size function takes the value . According to the above theorem, the value of the reduced size function at that point must be equal to .
By comparing Th. 8 and the analogous result stated in Prop. 10 of , one can observe that the former is stated more straightforwardly. As a consequence of this simplification, all the statements in this paper that follow from Th. 8 are less cumbersome than they would be if we applied size functions instead of reduced size functions. This is the main motivation for introducing the notion of reduced size function.
In order to make this paper self-contained, in the rest of this section we report all and only those results about size functions that will be needed for proving our statements in the next sections, re-stating them in terms of reduced size functions. Proofs are omitted, since they are completely analogous to those for classical size functions.
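The following result relates two reduced size functions corresponding to two spaces, $\mathcal{M}$ and $\mathcal{N}$, that can be matched without changing the measuring functions by more than $h$; it is analogous to Th. 3.2 in .
Let $(\mathcal{M},\varphi)$ and $(\mathcal{N},\psi)$ be two size pairs. If $f:\mathcal{M}\to\mathcal{N}$ is a homeomorphism such that $\max_{P\in\mathcal{M}}|\varphi(P)-\psi(f(P))|\le h$, then for every $(x,y)\in\Delta^+$ we have
$$\ell^*_{(\mathcal{M},\varphi)}(x-h,y+h)\le\ell^*_{(\mathcal{N},\psi)}(x,y).$$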
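The proposition can be checked numerically on a small discrete example. The sketch below assumes the inequality reads: the reduced size function of the first pair at (x - h, y + h) is at most that of the second pair at (x, y). It also assumes that both size pairs live on the same finite graph, so that the identity map plays the role of the homeomorphism. The graph, the integer-valued functions `phi` and `psi`, and the helper `ell` are all hypothetical.

```python
def ell(values, edges, x, y):
    """Discrete stand-in for a reduced size function on a finite graph."""
    parent = {v: v for v, val in values.items() if val <= y}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:
            parent[find(u)] = find(v)
    return len({find(v) for v in parent if values[v] <= x})


# Two measuring functions on the path graph 0-1-2-3-4 (integers avoid rounding).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
phi = {0: 0, 1: 20, 2: 10, 3: 30, 4: 5}
psi = {0: 2, 1: 18, 2: 13, 3: 29, 4: 4}

# With the identity as matching map, h is the largest pointwise difference.
h = max(abs(phi[v] - psi[v]) for v in phi)

# Collect every grid point (x, y), x < y, where the inequality would fail.
grid = range(-5, 36)
violations = [
    (x, y)
    for x in grid
    for y in grid
    if x < y and ell(phi, edges, x - h, y + h) > ell(psi, edges, x, y)
]
print(h, violations)  # 3 []
```

The empty list is what the proposition predicts: shrinking x and enlarging y by h can only decrease the count, because every sublevel set of the first function at level x - h sits inside the sublevel set of the second at level x, which in turn sits inside the sublevel set of the first at level y + h.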
The next proposition, analogous to Prop. 6 in , gives some constraints on the presence of discontinuity points for reduced size functions.
Let be a size pair. For every point , a real number exists such that the open set
is contained in , and does not contain any discontinuity point for .
The following analog of Prop. 8 and Cor. 4 in , stating that cornerpoints create discontinuity points spreading downwards and towards the right to , also holds for reduced size functions.
(Propagation of discontinuities) If is a proper cornerpoint for , then the following statements hold:
i) If , then is a discontinuity point for ;
ii) If , then is a discontinuity point for .
If is the cornerpoint at infinity for , then the following statement holds:
iii) If , then is a discontinuity point for .
The position of cornerpoints in is related to the extrema of the measuring function as the next proposition states, immediately following from the definitions.
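(Localization of cornerpoints) If $\bar{p}=(\bar{x},\bar{y})$ is a proper cornerpoint for $\ell^*_{(\mathcal{M},\varphi)}$, then
$$\bar{p}\in\{(x,y)\in\mathbb{R}^2:\min\varphi\le x<y\le\max\varphi\}.$$
If $\bar{r}=(\bar{x},\infty)$ is the cornerpoint at infinity for $\ell^*_{(\mathcal{M},\varphi)}$, then $\bar{x}=\min\varphi$.
(Local finiteness of cornerpoints) For each strictly positive real number $\epsilon$, reduced size functions have, at most, a finite number of cornerpoints in $\{(x,y)\in\mathbb{R}^2: x+\epsilon<y\}$.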
Therefore, if the set of cornerpoints of a reduced size function has an accumulation point, it necessarily belongs to the diagonal . An example of reduced size function with cornerpoints accumulating onto the diagonal is shown in Fig. 3.
Moreover, this last proposition implies that in the summation of Th. 8 (Representation Theorem), only finitely many terms are different from zero.
3 Matching distance
In this section we define a matching distance between reduced size functions. The idea is to compare reduced size functions by measuring the cost of transporting the cornerpoints of one reduced size function to those of the other one, with the property that the longest of the transportations should be as short as possible. Since, in general, the number of cornerpoints of the two reduced size functions is different, we also enable the cornerpoints to be transported onto the points of $\Delta$ (in other words, we can “destroy” them).
When the number of cornerpoints is finite, the matching distance may be related to the bottleneck transportation problem. In our case, however, the number of cornerpoints may be countably infinite, because of our loose assumption on the measuring function, which is only required to be continuous. Nevertheless, we prove the existence of an optimal matching. Under tighter assumptions on the measuring function, the number of cornerpoints is guaranteed to be finite and a bottleneck distance can be defined more straightforwardly. For example, a bottleneck distance for comparing Persistent Homology Groups has been introduced under the assumption that the measuring functions are tame. We recall that a continuous function is tame if it has a finite number of homological critical values and the homology groups of the lower level sets it defines are finite dimensional.
Although working with measuring functions that are continuous rather than tame involves working in an infinite dimensional space, yielding many technical difficulties in the proof of our results (for instance, compare our Matching Stability Theorem 25 with the analogous Bottleneck Stability Theorem for Persistence Diagrams), there are strong motivations for doing so. First of all, in real applications noise cannot be assumed to be tame, so that the perturbation of a tame function may fail to be tame. Second, when working in the more general setting of measuring functions with values in $\mathbb{R}^k$ instead of $\mathbb{R}$, it is important that the set of functions is closed under the action of the $\max$ operator, whereas the set of tame functions is not. Last but not least, working with continuous functions allows us to relate the matching distance to the natural pseudo-distance, which is our final goal, without restricting the set of homeomorphisms to those preserving the tameness property.
Of course, the matching distance is not the only metric between reduced size functions that we could think of. Other metrics for size functions have been considered in the past. However, the matching distance is of particular interest since, as we shall see, it allows for a connection with the natural pseudo-distance between size pairs, furnishing the best possible lower bound. Moreover, it has already been tested successfully in experiments.
In order to introduce the matching distance between reduced size functions we need some new definitions. The following definition of representative sequence is introduced in order to manage the presence in a size function of infinitely many cornerpoints as well as that of their multiplicities. Moreover, it allows us to add to the set of cornerpoints a subset of points of the diagonal.
(Representative sequence) Let $\ell^*$ be a reduced size function. We shall call representative sequence for $\ell^*$ any sequence of points $a_i\in\bar{\Delta}^*$, $i\in\mathbb{N}$ (briefly denoted by $(a_i)$), with the following properties:
i) $a_0$ is the cornerpoint at infinity for $\ell^*$;
ii) For each $i>0$, either $a_i$ is a proper cornerpoint for $\ell^*$, or $a_i$ belongs to $\Delta$;
iii) If $p$ is a proper cornerpoint for $\ell^*$ with multiplicity $\mu(p)$, then the cardinality of the set $\{i\in\mathbb{N}: a_i=p\}$ is equal to $\mu(p)$;
iv) The set of indexes $i$ for which $a_i$ is in $\Delta$ is countably infinite.
We now consider the following pseudo-distance $d$ on $\bar{\Delta}^*$ in order to assign a cost to each deformation of reduced size functions:
$$d\big((x,y),(x',y')\big):=\min\left\{\max\{|x-x'|,|y-y'|\},\ \max\left\{\frac{y-x}{2},\frac{y'-x'}{2}\right\}\right\},$$
with the convention about $\infty$ that $\infty-y=y-\infty=\infty$ for $y\neq\infty$, $\infty-\infty=0$, $\frac{\infty}{2}=\infty$, $|\infty|=\infty$, $\min\{\infty,c\}=c$, $\max\{\infty,c\}=\infty$.
In other words, the pseudo-distance $d$ between two points $p$ and $p'$ compares the cost of moving $p$ to $p'$ and the cost of moving $p$ and $p'$ onto the diagonal and takes the smaller. Costs are computed using the distance induced by the $\max$-norm. In particular, the pseudo-distance $d$ between two points $p$ and $p'$ on the diagonal is always $0$; the pseudo-distance $d$ between two points $p$ and $p'$, with $p$ above the diagonal and $p'$ on the diagonal, is equal to the distance, induced by the $\max$-norm, between $p$ and the diagonal. Points at infinity have a finite distance only to other points at infinity, and their distance depends on their abscissas.
Therefore, $d(p,p')$ can be considered a measure of the minimum of the costs of moving $p$ to $p'$ along two different paths (i.e. the path that takes $p$ directly to $p'$ and the path that passes through $\Delta$). This observation easily yields that $d$ is actually a pseudo-distance.
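The verbal description of the pseudo-distance translates directly into code. The following minimal sketch assumes the cost of a direct move is measured in the max-norm, the cost of a move through the diagonal is the larger of the two distances to the diagonal, and points at infinity are encoded with `math.inf` as their ordinate. The function name `d` and this encoding are choices made here for illustration.

```python
import math


def d(p, q):
    """Pseudo-distance between points (x, y) with x <= y, where y may be math.inf."""
    (x, y), (xq, yq) = p, q
    p_inf, q_inf = math.isinf(y), math.isinf(yq)

    # Two points at infinity: only the abscissas matter.
    if p_inf and q_inf:
        return abs(x - xq)
    # A point at infinity is infinitely far from every proper point.
    if p_inf or q_inf:
        return math.inf

    direct = max(abs(x - xq), abs(y - yq))        # move p straight to q
    via_diagonal = max((y - x) / 2, (yq - xq) / 2)  # move both onto the diagonal
    return min(direct, via_diagonal)


print(d((1.0, 1.0), (5.0, 5.0)))            # 0.0: both points lie on the diagonal
print(d((0.0, 4.0), (2.0, 2.0)))            # 2.0: distance of (0, 4) from the diagonal
print(d((0.0, 1.0), (10.0, 11.0)))          # 0.5: cheaper to pass through the diagonal
print(d((0.0, math.inf), (0.5, math.inf)))  # 0.5
print(d((0.0, math.inf), (0.0, 4.0)))       # inf
```

The explicit branches for points at infinity replace the arithmetic conventions on infinity and avoid evaluating `inf - inf`, which is not a number in floating point.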
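It is useful to observe what disks induced by the pseudo-distance $d$ look like. For $r>0$, the usual notation $B(p,r)$ will denote the open disk $\{p'\in\bar{\Delta}^*: d(p,p')<r\}$. Thus, if $p$ is a proper point with coordinates $(x,y)$ and $y-x\ge 2r$ (that is, $d(p,\Delta)\ge r$), then $B(p,r)$ is the open square centered at $p$ with sides of length $2r$ parallel to the axes. Whereas, if $p$ has coordinates $(x,y)$ with $y-x<2r$ (that is, $d(p,\Delta)<r$), then $B(p,r)$ is the union of the open square, centered at $p$, with sides of length $2r$ parallel to the axes, with the stripe $\{(x,y)\in\mathbb{R}^2: 0\le y-x<2r\}$, intersected with $\bar{\Delta}^+$ (see also Fig. 4). If $p=(x,\infty)$ is a point at infinity then $B(p,r)=\{(x',\infty): |x-x'|<r\}$.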
In what follows, the notation will refer to the open square centered at the proper point , with sides of length parallel to the axes (that is, the open disk centered at with radius , induced by the -norm). Also, when , will refer to the closure of in the usual Euclidean topology, while .
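(Matching distance) Let $\ell^*_1$ and $\ell^*_2$ be two reduced size functions. If $(a_i)$ and $(b_i)$ are two representative sequences for $\ell^*_1$ and $\ell^*_2$ respectively, then the matching distance between $\ell^*_1$ and $\ell^*_2$ is the number
$$d_{match}(\ell^*_1,\ell^*_2):=\inf_{\sigma}\sup_{i}\,d(a_i,b_{\sigma(i)}),$$
where $i$ varies in $\mathbb{N}$ and $\sigma$ varies among all the bijections from $\mathbb{N}$ to $\mathbb{N}$.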
In order to illustrate this definition, let us consider Fig. 5. Given two curves, their reduced size functions with respect to the measuring function distance from the center of the image are calculated. One sees that the top reduced size function has many cornerpoints close to the diagonal in addition to the cornerpoints , , , , , . Analogously, the bottom reduced size function has many cornerpoints close to the diagonal in addition to the cornerpoints , , , . Cornerpoints close to the diagonal are generated by noise and discretization. The superimposition of the two reduced size functions shows that an optimal matching is given by , , , , , , and all the other cornerpoints sent to . Sending cornerpoints to points of corresponds to the annihilation of cornerpoints. Since the matching is the one that achieves the maximum cost in the -norm, the matching distance is equal to the distance between and (with respect to the -norm).
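When both reduced size functions have finitely many cornerpoints, the matching distance can be computed by brute force. The sketch below is an illustration under assumptions, not an efficient algorithm: each list of proper cornerpoints is padded with as many diagonal points (encoded as `None`) as the other list has cornerpoints, every bijection between the padded lists is tried, and the single cornerpoints at infinity are matched to each other. The function names and the sample data are hypothetical, and the running time grows factorially with the number of cornerpoints.

```python
import math
from itertools import permutations


def point_cost(p, q):
    """Pseudo-distance between proper points; None stands for a diagonal point."""
    if p is None and q is None:
        return 0.0
    if p is None:
        return (q[1] - q[0]) / 2   # cost of sending q onto the diagonal
    if q is None:
        return (p[1] - p[0]) / 2   # cost of sending p onto the diagonal
    direct = max(abs(p[0] - q[0]), abs(p[1] - q[1]))
    via_diagonal = max((p[1] - p[0]) / 2, (q[1] - q[0]) / 2)
    return min(direct, via_diagonal)


def matching_distance(proper_a, k_a, proper_b, k_b):
    """Brute-force matching distance for finitely many cornerpoints.

    proper_a, proper_b: lists of proper cornerpoints (x, y), repeated
    according to their multiplicities; k_a, k_b: abscissas of the
    cornerpoints at infinity.
    """
    a = list(proper_a) + [None] * len(proper_b)
    b = list(proper_b) + [None] * len(proper_a)
    best = math.inf
    for perm in permutations(range(len(b))):
        # Cost of a matching = cost of its most expensive pair.
        cost = max((point_cost(a[i], b[j]) for i, j in enumerate(perm)),
                   default=0.0)
        best = min(best, cost)
    # Cornerpoints at infinity can only be matched to each other.
    return max(best, abs(k_a - k_b))


# Hypothetical data: the cornerpoint (1, 2) has no close partner and is
# sent to the diagonal at cost 0.5, which dominates the other costs.
print(matching_distance([(0.5, 3.0), (1.0, 2.0)], 0.0, [(0.6, 3.2)], 0.1))  # 0.5
```

Padding with finitely many diagonal points is enough here because pairs of diagonal points cost nothing, so the countably many remaining diagonal points of a representative sequence never affect the result.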
$d_{match}$ is a distance between reduced size functions.
[Proof.] It is easy to see that this definition is independent from the choice of the representative sequences of points for and . In fact, if and are representative sequences for the same reduced size function , a bijection exists such that for every index .
Furthermore, we have that , for any two size pairs and . Indeed, for any bijection such that , it holds , because of Prop. 13 (Localization of cornerpoints).
Finally, by recalling that reduced size functions are uniquely determined by their cornerpoints with multiplicities (Representation Theorem) and by using Prop. 14 (Local finiteness of cornerpoints), one can easily see that verifies all the properties of a distance. ∎
We will show in Th. 28 that the $\inf$ and the $\sup$ in the definition of matching distance are actually attained, that is $d_{match}(\ell^*_1,\ell^*_2)=\min_{\sigma}\max_{i}\,d(a_i,b_{\sigma(i)})$. In other words, an optimal matching always exists.
4 Stability of the matching distance
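In this section we shall prove that if $\varphi$ and $\psi$ are two measuring functions on $\mathcal{M}$ whose difference on the points of $\mathcal{M}$ is controlled by $\epsilon$ (namely $\max_{P\in\mathcal{M}}|\varphi(P)-\psi(P)|\le\epsilon$), then the matching distance between $\ell^*_{(\mathcal{M},\varphi)}$ and $\ell^*_{(\mathcal{M},\psi)}$ is also controlled by $\epsilon$ (namely $d_{match}(\ell^*_{(\mathcal{M},\varphi)},\ell^*_{(\mathcal{M},\psi)})\le\epsilon$).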
For the sake of clarity, we will now give a sketch of the proof that will lead to this result, stated in Th. 25. We begin by proving that each cornerpoint of with multiplicity admits a small neighborhood, where we find exactly cornerpoints (counted with multiplicities) for , provided that on the functions and take close enough values (Prop. 20). Next, this local result is extended to a global result by considering the convex combination of and . Following the paths traced by the cornerpoints of as varies in , in Prop. 21 we show that, along these paths, the displacement of the cornerpoints is not greater than (displacements are measured using the distance , and cornerpoints are counted with their multiplicities). Thus we are able to construct an injection , from the set of the cornerpoints of to the set of the cornerpoints of (extended to a countable subset of the diagonal), that moves points less than (Prop. 23). Repeating the same argument backwards, we construct an injection from the set of the cornerpoints of to the set of the cornerpoints of (extended to a countable subset of the diagonal) that moves points less than . By using the Cantor-Bernstein theorem, we prove that there exists a bijection from the set of the cornerpoints of to the set of the cornerpoints of (both the sets extended to countable subsets of the diagonal) that moves points less than . This will be sufficient to conclude the proof. Once again, we recall that in the proof we have just outlined, cornerpoints are always counted with their multiplicities.
We first prove that the number of proper cornerpoints contained in a sufficiently small square can be computed in terms of jumps of reduced size functions.
Let be a size pair. Let and let be such that . Also let , , , . Then
is equal to the number of (proper) cornerpoints for , counted with their multiplicities, contained in the semi-open square , with vertices at , given by
[Proof.] It easily follows from the Representation Theorem (Th. 8). ∎
We now show that, locally, small changes in the measuring functions produce small displacements of the existing proper cornerpoints and create no new cornerpoints.
(Local constancy of multiplicity) Let be a size pair and let be a point in , with multiplicity for (possibly ). Then there is a real number such that, for any real number with , and for any measuring function with , the reduced size function has exactly (proper) cornerpoints (counted with their multiplicities) in the closed square , centered at with side .
[Proof.] By Prop. 11, a sufficiently small real number exists such that the set
is contained in (i.e. ), and does not contain any discontinuity point for . Prop. 12 (Propagation of discontinuities) implies that is the only cornerpoint in .
Let be any real number such that . For each real number with , let us take a sufficiently small positive real number with and , so that .
We define , , ,
If is a measuring function such that , then by applying Prop. 10 twice,
Since is constant in each connected component of , we have that
implying . Analogously, , , . Hence, , i.e. the multiplicity of for , equals
By Prop. 19, we obtain that is equal to the number of cornerpoints for contained in the semi-open square with vertices , , , given by
This is true for any sufficiently small . Therefore, is equal to the number of cornerpoints for contained in the intersection . It follows that amounts to the number of cornerpoints for contained in the closed square . ∎
The following result states that if two measuring functions and differ less than in the -norm, then it is possible to match some finite sets of proper cornerpoints of to proper cornerpoints of , with a motion smaller than .
Let be a real number and let and be two size pairs such that . Then, for any finite set of proper cornerpoints for with , there exist and representative sequences for and respectively, such that for each with .
[Proof.] The claim is trivial for , so let us assume .
Let with . Then, for every , we have .
Let , let be the multiplicity of , for , and . Then we can easily construct a representative sequence of points for , such that
Now we will consider the set defined as
In other words, if we think of the variation of as the flow of time, is the set of instants for which the cornerpoints in move less than itself, when the homotopy is applied to the measuring function .
is non-empty, since . Let us set and show that . Indeed, let be a sequence of numbers of converging to . Since , for each there is a representative sequence for with , for each such that . Since , for any and any . Thus, recalling that , for each such that , it holds that for any . Hence, for each with , possibly by extracting a convergent subsequence, we can define . We have . Moreover, by Prop. 20 (Local constancy of multiplicity), is a cornerpoint for . Also, if indexes exist, such that and , then the multiplicity of for is not smaller than . Indeed, since , for each arbitrarily small and for any sufficiently great , the square contains at least cornerpoints for , counted with their multiplicities. But Prop. 20 implies that, for each sufficiently small , contains exactly as many cornerpoints for as the multiplicity of with respect to , if . Therefore, the multiplicity of for is greater than, or equal to, .
The previous reasoning allows us to claim that if a cornerpoint occurs times in the sequence , then the multiplicity of for is at least .
In order to conclude that , it is now sufficient to observe that is easily extensible to a representative sequence for , simply by setting equal to the cornerpoint at infinity of , and by continuing the sequence with the remaining proper cornerpoints of and with a countable collection of points of . So we have proved that is attained in .
We end the proof by showing that . In fact, if , by using Prop. 20 once again, it is not difficult to show that there exists , with , and a representative sequence for , such that for . Hence, by the triangular inequality, for , implying that . This would contradict the fact that . Therefore, , and so . ∎
We now give a result stating that if two measuring functions and differ less than in the -norm, then the cornerpoints at infinity have a distance smaller than .
Let be a real number and let and be two size pairs such that . Then, for each and representative sequences for and , respectively, it holds that .
[Proof.] By Prop. 13 (Localization of cornerpoints), . Let and , with . Since , then
By contradiction, let us assume that , that is, . So either or . In the first case,
in the latter case,
Hence we would conclude that either is not a minimum point for or is not a minimum point for . In both cases we get a contradiction. ∎
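The chain of inequalities behind this contradiction can be sketched as follows. This is a reconstruction under assumptions rather than the original display: $P_\varphi$ and $P_\psi$ are names introduced here for minimum points of $\varphi$ and $\psi$ on $\mathcal{M}$, $\epsilon$ denotes the bound on $\max_{P\in\mathcal{M}}|\varphi(P)-\psi(P)|$, and the cornerpoints at infinity are taken to be $a_0=(\min\varphi,\infty)$ and $b_0=(\min\psi,\infty)$, as given by the localization of cornerpoints.

```latex
\begin{align*}
  % Distance between the two cornerpoints at infinity.
  d(a_0,b_0) &= |\min\varphi-\min\psi| . \\
  % Case 1: \min\varphi < \min\psi - \epsilon.
  \psi(P_\varphi) &\le \varphi(P_\varphi)+\epsilon
      = \min\varphi+\epsilon < \min\psi = \psi(P_\psi), \\
  % Case 2: \min\psi < \min\varphi - \epsilon.
  \varphi(P_\psi) &\le \psi(P_\psi)+\epsilon
      = \min\psi+\epsilon < \min\varphi = \varphi(P_\varphi).
\end{align*}
```

In the first case $P_\psi$ would not be a minimum point for $\psi$, and in the second case $P_\varphi$ would not be a minimum point for $\varphi$, so $|\min\varphi-\min\psi|>\epsilon$ is impossible.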
Now we prove that it is possible to injectively match all the cornerpoints of to those of with a maximum motion not greater than the -distance between and .
Let be a real number and let and be two size pairs such that