1 Introduction
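Distances reveal the geometry of their underlying space. Even distance comparisons carry valuable information. Consider a set of points $x_1, \ldots, x_N$ in a space $S$. In nonmetric embedding problems, we have measurements of the form
$$y_{ij} = f\big(d(x_i, x_j)\big),$$
where $d(x_i, x_j)$ is the distance between $x_i$ and $x_j$, and $f$ is an unknown nonlinear and monotonically increasing (or decreasing) function. In such problems, only the order of these measurements is useful. We can interpret them as distance comparisons, since, for a strictly increasing $f$,
$$y_{ij} \leq y_{kl} \iff d(x_i, x_j) \leq d(x_k, x_l),$$
with the inequality on the right reversed when $f$ is decreasing.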
In this paper, we address the following question:
What can we say about a space from distance comparisons alone?
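The order-invariance behind this question can be checked numerically. The following is a minimal sketch, not part of the original method: the planar Gaussian points and the particular increasing map `log1p(d) ** 3` are assumptions chosen only for illustration, and `distance_order` is a hypothetical helper.

```python
import itertools
import math
import random


def distance_order(values):
    """Return pair indices sorted by decreasing value (ties broken by index)."""
    return sorted(range(len(values)), key=lambda k: (-values[k], k))


random.seed(0)
# Hypothetical point set: six Gaussian points in the plane.
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]
pairs = list(itertools.combinations(range(len(points)), 2))
dists = [math.dist(points[i], points[j]) for i, j in pairs]

# Assumed measurement function: any strictly increasing map would do.
measurements = [math.log1p(d) ** 3 for d in dists]

# The ranking of the measurements matches the ranking of the distances.
same_order = distance_order(dists) == distance_order(measurements)
print(same_order)
```

Replacing the map with a strictly decreasing one would reverse the ranking instead of preserving it, which carries the same comparison information.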
Euclidean distance geometry problems (DGP) have a rich history in the literature, from robotics [1, 2] and wireless sensor networks [3] to molecular conformations [4] and dimensionality reduction [5]. Typically, we want to find a representation for a set of measured distances in a Euclidean space [6]. Beyond Euclidean DGPs, there has recently been a surge in applications of hyperbolic geometry in data analysis, most notably as a natural space to work with hierarchical data. Social networks [7], gene ontologies [8], the Hearst graph of hypernyms [9] and olfactory data [10] are all examples of hierarchical data structures. Similarly, spherical embedding aims to embed a set of objects on a (hyper)sphere given their dissimilarities [11]. Spherical embedding problems have various applications in astronomy [12], distance problems on Earth [13], and texture mapping [14]. To compute an embedding, we have to know the geometry of the embedding space.
Euclidean, spherical and hyperbolic geometries are canonical examples of constant curvature spaces, known as space forms. A space form is characterized by its curvature and dimension. For nonmetric embedding problems posed in space forms, we want to characterize these two properties from the measured distance comparisons. However, it is impossible to infer the magnitude of a space form’s curvature based only on distance comparisons. In other words, if a set of distance comparisons is realizable in a space form with negative (or positive) curvature, then we can find an equivalent embedding in a space form with any other negative (respectively, positive) curvature.
In the literature, a related problem is to detect intrinsic structure in neural activity, invariant under nonlinear monotone transformations of measurements. Giusti et al. [15] propose a method based on the clique topology of the graph of correlations between pairs of neurons. The clique topology of a weighted graph describes the behavior of cycles in its order complex as a function of edge density, also known as Betti curves. (The order complex of a complete, weighted graph is a sequence of graphs in which the first graph has all the vertices and no edges, the second has a single edge corresponding to the highest edge weight, and each subsequent graph has an additional edge for the next-highest edge weight [15].) The statistical behavior of Betti curves can help distinguish random and geometric structures of a given size in Euclidean space. Zhou et al. [10] generalize this statistical approach to hyperbolic space.
Main contributions
In this paper, we propose a distribution-free approach to determine a lower bound on the embedding dimension of space forms with only distance comparisons. We show that the ordering of distances inferred from comparisons contains information about the dimension of space forms. We introduce the ordinal capacity of a metric space, defined as follows: the ordinal capacity of a metric space is the maximum number of points such that the pairwise distances among all but one of them are larger than (or equal to) their distances to the remaining point (see Section 4).
We prove that ordinal capacity characterizes the admissible patterns of ordinal measurements. Intuitively, in a Euclidean space with fixed dimension, we claim that only a specific pattern of distance comparisons is realizable. We show that the ordinal capacity of a space form is only related to its dimension and curvature sign. Then, we define the ordinal spread for a point set to describe the appearance pattern of vertex pairs in the sorted distance list.
We prove that the $N$-point ordinal spread of space forms (the maximum ordinal spread of their point sets) is related to their ordinal capacity. The theoretical bounds on the $N$-point ordinal spread of space forms give us a practical test to find a minimum Euclidean (and spherical) embedding dimension.
Notation
For any two numbers , we let and be their maximum and minimum. We use small letters for vectors, , and capital letters for matrices, . We denote the th standard basis vector in by , and let be short for the set . For vectors , their dot product is denoted by , and their Lorentzian inner product is . Finally, and are all-zero and all-one vectors of appropriate dimensions. Let be a subset of a metric space , and ; we define
The cardinality of a discrete set is denoted by . The graph-theoretic notation simplifies the main results of this paper. For a graph , we denote its edge set as . Let be a complete partite graph with part sizes . The Turán graph [16] is a complete partite graph with vertices, and part sizes . (Footnote 3: From , we have , .) Then, . (Footnote 4: This is simplified from .) For , we assume the graph is complete and .
2 Nonmetric distance problems in space forms
A space form is a complete, connected Riemannian manifold of dimension and constant sectional curvature. The hyperbolic, Euclidean and (hyper)spherical spaces are famous examples of space forms with constant negative, zero and positive curvatures. Space forms are equivalent to spherical, Euclidean, or hyperbolic spaces up to an isomorphism [17], see Table 1.
Table 1. Columns: Hyperbolic (’Loid model), Euclidean, Hyperspherical.

In general, distance geometry problems aim to find an embedding for a set of distance-related measurements in a metric space. They can be metric [18], nonmetric [19], or unlabeled [20] depending on the data modality and application domain. In this paper, we focus on nonmetric distance problems in space forms.
Problem 1.
Let be a space form with distance function . A nonmetric space form distance geometry problem aims to find , given a subset of ordinal distance measurements such that
(1) 
where .
For noise-free measurements, we can fully encode distance comparisons in a sorted list, namely
(2) 
This list is not necessarily unique. A deterministic or a randomized binary sort algorithm needs at least pairwise comparisons to uniquely sort the distance list [21]. In this paper, we assume that such a list always exists, and is unique.
Consider a set of points that form a centered, regular pentagon shown in Figure 1. The sorted distance list could be . We can summarize the labels appearing in the sorted distance list in the label matrix ,
(3) 
where each column lists the labels appearing at the corresponding position in the sorted distance list. The label matrix summarizes the appearance pattern of individual points in the distance list (2). Alternatively, we can represent this sequence in a binary label matrix, whose entry for a given point and position is one if that point appears at that position of the distance list (2), and zero otherwise. In Figure 2, we show the binary label matrix (in green) associated with the ordered distance list (3). In the next section, we use the binary label matrix of measured data in Problem 1 to extract useful information about the geometry of the underlying space.
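The construction of the sorted distance list and its binary label matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the list is sorted from largest to smallest distance, the matrix has one row per point and one column per position, and the four planar points are hypothetical. The function names are not from the paper.

```python
import itertools
import math


def sorted_distance_list(points):
    """Pairs (i, j), i < j, ordered from largest to smallest distance."""
    pairs = itertools.combinations(range(len(points)), 2)
    return sorted(pairs, key=lambda p: -math.dist(points[p[0]], points[p[1]]))


def binary_label_matrix(pair_list, num_points):
    """Entry [n][k] is 1 if point n appears in the k-th pair of the list."""
    matrix = [[0] * len(pair_list) for _ in range(num_points)]
    for k, (i, j) in enumerate(pair_list):
        matrix[i][k] = 1
        matrix[j][k] = 1
    return matrix


# Hypothetical planar points with distinct pairwise distances.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.5)]
pair_list = sorted_distance_list(points)
B = binary_label_matrix(pair_list, len(points))
for row in B:
    print(row)
```

Every column of the matrix contains exactly two ones (the two endpoints of a pair), and every row contains one entry per pair that involves the corresponding point.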
3 Ordinal Spread
We consider identifying the embedding space in Problem 1. Specifically, we want to characterize the dimension of the space form given a set of binary distance comparisons of the form (1). We focus on inferring geometrical information through the binary label matrix associated with (1). Any such inference must be invariant with respect to arbitrary permutations of point labels. It will be useful to devise a canonical procedure to relabel point sets. We assign to each point a unique number that corresponds to the order of its first appearance in the sorted distance list. For the point set shown in Figure 1, we have
according to (3). The following sequence shows the appearance order of points in the sorted distance list,
The canonical ordering of the labels assigns points and to the largest distance, to the second largest distance, etc. This procedure is illustrated as follows,
We can summarize this procedure as permuting the columns of binary label matrix by a permutation operator , see Figure 2.
This relabeling procedure gives us intuition for extracting geometrical information from a distance list. For instance, we show that the appearance pattern of new labels in the sorted distance list bears geometrical implications. Let us formalize this intuition by introducing the $n$-th ordinal spread of a point set. The $n$-th ordinal spread of the point set is defined as
for the ordered distance list . Simply, we write , where no confusion can arise.
The $n$-th ordinal spread of a point set is the position in the ordered distance list at which the $n$-th label first appears. In other words, we have
For the point set shown in Figure 1 with label matrix (3), we have
Proposition 3. For any metric space and any set of points in it, we have

, ,

Let us devise an experiment to show how the ordinal spread can distinguish space forms. We randomly generate i.i.d. points from absolutely continuous distributions with full support in hyperbolic (’Loid), Euclidean and spherical spaces. (We use normal and uniform distributions for Euclidean and spherical spaces. For hyperbolic space, we project a normally distributed vector onto the hyperboloid sheet.) Over repeated trials, we plot the ordinal spread for each realization, see Figure 3. We find the empirical maximum of the ordinal spread to be a sensitive indicator of the geometry of the underlying space. While the emerging pattern of ordinal spreads depends on the distribution of point sets, the behavior of the empirical maximum is robust to the choice of point set distribution, as it converges to its supremum almost surely. Therefore, we introduce the $N$-point ordinal spread of a metric space, a novel concept to categorize space forms based on their ability to realize extremal ordinal patterns, in the sense of the following definition.
Definition. For a metric space, the $N$-point ordinal spread is defined as
By definition, the ordinal spread number of a space form depends on extremal configurations of point sets. In Figure 1, we show point sets with maximum ordinal spread, see Section 3. In the next section, we introduce ordinally dense subsets, and show how they determine the $N$-point ordinal spread of space forms.
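The ordinal spread and its empirical maximum can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the distance list is sorted from largest to smallest, the $n$-th spread is taken to be the 1-indexed position at which the $n$-th new label first appears, and the Gaussian planar point sets, trial count, and function names are hypothetical. The bound `comb(N - 1, 2) + 1` used for comparison follows from counting: before the last label appears, only pairs among the other `N - 1` labels can occur.

```python
import itertools
import math
import random


def sorted_pairs(points):
    """Pairs (i, j), i < j, ordered from largest to smallest distance."""
    pairs = itertools.combinations(range(len(points)), 2)
    return sorted(pairs, key=lambda p: -math.dist(points[p[0]], points[p[1]]))


def ordinal_spreads(pair_list):
    """Entry n-1 is the position at which the n-th new label first appears."""
    first_seen = {}
    for position, pair in enumerate(pair_list, start=1):
        for label in pair:
            first_seen.setdefault(label, position)
    return sorted(first_seen.values())


def empirical_max_last_spread(num_points, dim, trials, rng):
    """Largest observed spread of the last label over random Gaussian sets."""
    best = 0
    for _ in range(trials):
        pts = [tuple(rng.gauss(0, 1) for _ in range(dim)) for _ in range(num_points)]
        best = max(best, ordinal_spreads(sorted_pairs(pts))[-1])
    return best


rng = random.Random(0)
N = 6
estimate = empirical_max_last_spread(N, dim=2, trials=2000, rng=rng)
upper = math.comb(N - 1, 2) + 1  # counting bound on the last label's position
print(estimate, upper)
```

Increasing the number of trials can only raise the estimate, which is why the empirical maximum is the quantity of interest rather than the average.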
4 Ordinal Capacity
Let be a set of distinct points in metric space . If
then we say that is an ordinally dense subset of , or in short . This definition formalizes the point configurations with maximum ordinal spread. A set of $N$ points is ordinally dense if and only if it has a subset of $N-1$ points whose pairwise distances are all larger than (or equal to) their distances to the $N$-th point. In other words, we have
The existence of an ordinally dense subset of size depends on the curvature sign and dimension of space forms. Hence, we want to find the maximum number of ordinally dense points in space forms. The ordinal capacity for a metric space is defined as
The ordinal capacity is an indicator of the capability of a metric space to accommodate different patterns of point labels. For space forms, this concept is intimately related to the famous spherical cap packing problem [22], as the proof of the following result shows (see Section 7.2).
Theorem 4. The ordinal capacity for a space form is given by
where . The ordinal capacity of a hyperbolic space is infinite. This implies that there exists an ordinally dense point set of any size. In the Poincaré model, a centered polygon with an extra point in the center is an ordinally dense set, see Figure 1. In comparison, Euclidean and spherical spaces have a finite ordinal capacity, increasing exponentially with their dimension as given in Table 2. (Their ordinal capacities have a lower bound of the form [23], see Section 7.2.) In Figure 1, we show a regular hexagon with an extra point in the center. All pairwise distances in the hexagon are larger than or equal to their distances to the center. This point set configuration in fact achieves the ordinal capacity of the Euclidean plane.
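The hexagon example can be verified directly. The sketch below assumes one reading of ordinal density: some point exists such that the smallest pairwise distance among the remaining points is at least the largest distance from them to that point. Ties count as satisfied, so a small tolerance `tol` absorbs floating-point error. The function names are hypothetical.

```python
import itertools
import math


def is_ordinally_dense(points, tol=1e-9):
    """Check the assumed density condition; needs at least three points."""
    n = len(points)
    for c in range(n):
        others = [k for k in range(n) if k != c]
        to_center = max(math.dist(points[k], points[c]) for k in others)
        among_others = min(
            math.dist(points[i], points[j])
            for i, j in itertools.combinations(others, 2)
        )
        if among_others >= to_center - tol:
            return True
    return False


def polygon_with_center(num_vertices):
    """Regular polygon on the unit circle plus its center."""
    ring = [
        (math.cos(2 * math.pi * k / num_vertices),
         math.sin(2 * math.pi * k / num_vertices))
        for k in range(num_vertices)
    ]
    return ring + [(0.0, 0.0)]


hexagon_dense = is_ordinally_dense(polygon_with_center(6))
heptagon_dense = is_ordinally_dense(polygon_with_center(7))
print(hexagon_dense, heptagon_dense)
```

In the plane, the hexagon's side length equals its circumradius, so the condition holds with equality. A regular heptagon has side length $2\sin(\pi/7) < 1$ and fails the check.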
The ordinal capacity cannot be used to distinguish between and . However, it is possible to refine the ordinal capacity of spherical space if we only consider point sets with ; see Section 7.2.3.
Theorem. The $N$-point ordinal spread of a space form is given by
This theorem gives a universal upper bound on the ordinal spread of point sets. We can use it to find a bound on the minimum dimension for embedding in a space form. In practice, given a set of nonmetric measurements associated with a point set, we calculate the empirical $N$-point ordinal spread as
(4) 
where . Then, we can find a lower bound for Euclidean (or spherical) embedding dimension by computing
(5) 
The ordinal capacity of hyperbolic spaces is infinite, regardless of their dimension. Hence, this test cannot be used to give a lower bound on the dimension of hyperbolic space, as it always gives .
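The empirical part of this test can be sketched as follows. This is a rough illustration under assumptions: the input is a symmetric dissimilarity matrix, subsets of a fixed size are sampled at random, and the estimate is the largest position at which the last label of a subset first appears in its decreasing distance list. The comparison with the theoretical values in (5) is not reproduced here. The data and function names are hypothetical.

```python
import itertools
import random


def last_ordinal_spread(dissim, subset):
    """Position at which the last label of `subset` first appears."""
    pairs = sorted(
        itertools.combinations(subset, 2),
        key=lambda p: -dissim[p[0]][p[1]],
    )
    seen = set()
    for position, (i, j) in enumerate(pairs, start=1):
        seen.update((i, j))
        if len(seen) == len(subset):
            return position
    raise ValueError("subset needs at least two points")


def empirical_point_ordinal_spread(dissim, size, num_samples, rng):
    """Maximum last-label spread over randomly sampled subsets."""
    indices = range(len(dissim))
    return max(
        last_ordinal_spread(dissim, rng.sample(indices, size))
        for _ in range(num_samples)
    )


rng = random.Random(0)
# Hypothetical dissimilarities: absolute differences of random scalars.
values = [rng.random() for _ in range(12)]
dissim = [[abs(a - b) for b in values] for a in values]
estimate = empirical_point_ordinal_spread(dissim, size=5, num_samples=200, rng=rng)
print(estimate)
```

Because the estimate is a maximum over sampled subsets, drawing more samples can only tighten it toward the value attained over all subsets of that size.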
5 Numerical Results
In this section, we numerically illustrate a geometrical intuition for the ordinal capacity of Euclidean and hyperbolic spaces. Then, we experiment with popular real-world datasets, namely olfactory data [24] and the Bitcoin Trust Network [25].
5.1 Stylized Experiments
We generate i.i.d. point sets from a normal distribution in hyperbolic and Euclidean spaces of the same dimension. (In the ’Loid model, each random point is obtained from a normally distributed vector.) Over repeated trials, we plot the ordinal spread of each realization for varying sizes of point sets. The maximum ordinal spread of the generated point sets gives an estimate for the $N$-point ordinal spread of Euclidean and hyperbolic spaces, see Section 3. We repeat this experiment by fixing a point in the center of the coordinate system, and projecting the remaining points onto their circumscribed circle. These point sets yield a more accurate estimate of the $N$-point ordinal spread of both spaces, see Figure 4. We also show that the individual points in sets with maximum ordinal spread accumulate on nonoverlapping spherical caps of the circle, see Section 7.3. In the proof of Theorem 4, we show that the number of strictly nonoverlapping spherical caps is finite for a Euclidean space of a given dimension, whereas this number is infinite for hyperbolic spaces. Therefore, the ordinal capacity of a space is equal to the total number of such caps plus the center point. The estimated $N$-point ordinal spread of Euclidean space is close to the theoretical bound. Finally, the estimated $N$-point ordinal spread of a hyperbolic space matches its theoretical bound.
5.2 Geometry of Similarity Graphs
Generally, in nonmetric embedding problems, the measurements are in the form of similarities (or dissimilarities) between a set of entities. In this section, we experiment with the olfactory [24] and Bitcoin Trust Network [25] datasets. The olfactory dataset contains monomolecular odor concentrations of blueberries. There are odors across the total of fruit samples. The cross-correlations between mono-odor concentrations across samples represent the similarity measurements. The embedding goal is to find a representation for odors in a space form, such that
We summarize these distance comparisons in a nonincreasing list of distances,
We randomly select up to different subcliques of size . In Figure 5, we show the ordinal spread of each subclique. The maximum ordinal spread of these subcliques, , serves as a test for the $N$-point ordinal spread of the underlying space, see (4). We compare with the theoretical values of , see (5). In this experiment, we show that the minimum dimension of Euclidean (and spherical) space must be at least .
Keeping a record of Bitcoin users’ reputation prevents transactions with fraudulent users. The Bitcoin OTC trust network is a weighted who-trusts-whom graph of people [25]. There are members in the network. Each member rates another member with an integer between (total distrust) and (total trust). This is normalized to a nonnegative number in interval, , and interpreted as the probability that user trusts user . For a network with nodes, there could be up to of such trust probabilities. (We assume each member trusts itself with probability of , and in general. If is unavailable for a pair , we replace it with the average trust probability of the network.) To embed such probabilities, we relate the distance between two users to a function of their probability of mutual trust, i.e.
where .
Similarly, we randomly choose up to different subcliques of size . In Figure 5, we show the ordinal spread of each subclique, along with their maximum value. The theoretical values for again suggest that the Euclidean embedding dimension must be at least . This estimate could be improved by sampling more subcliques, since the total number of subcliques grows rapidly with their size.
6 Conclusion
In this paper, we focus on inferring the geometry of space forms only from distance comparisons between a set of entities. We introduce novel notions such as ordinal capacity and spread for a metric space, as well as ordinally dense discrete sets. We provide a theoretical lower bound for the embedding dimension of Euclidean and spherical spaces. Our geometrical approach to studying embedding spaces in nonmetric problems brings a new perspective to the design of similar algorithms. Future work includes finding a useful upper bound for embedding dimensions, and generalizing the results to hyperbolic spaces.
Broader Impact
This work provides a theoretical framework to identify the underlying geometry of space forms from distance comparisons. The authors believe that this study does not have any future societal impacts.
7 Appendices
7.1 Proof of Proposition 3
From Section 3, the values for and are trivial. The lower bound for simply follows from the uniqueness of pairwise distances. Formally, we have
For the upper bound, is maximum when all smallest pairwise distances are incident to a unique point; for example, see Figure 1. The total length of the distance list is . Therefore, we have
7.2 Proof of Theorem 4
Let us separately consider hyperbolic, Euclidean, and spherical spaces.
7.2.1 Hyperbolic space
Let , and be a set of parameterized points in ’Loid model of dimensional hyperbolic space, such that
where , and . For an example, see Figure 6. Therefore,
Therefore, for any , there exists a such that . Hence,
7.2.2 Euclidean space
There is a set of points in such that
where and .
Proof.
Let be a set of points in such that
or . Without loss of generality, we assume and . Let and . We want to show that . Following the definition of ordinal spread, we have
where holds with equality if appears last in the sorted distance list, is due to . To prove inequality , let for distinct . Then,
where follows from , , , and follows from the symmetry in the argument. Therefore, we have
Hence, is an ordinally dense subset of . ∎
From Section 7.2.2, we want to find an ordinally dense set of points in such that
and . From the definition of ordinal spread, we have
Therefore, we can find a maximum number of ordinally dense points by solving a spherical cap packing problem, see Figure 7.
Let be the dimensional unit sphere in . We define the spherical cap as
for any .
The maximum number of nonoverlapping is defined as
Therefore, we have