    # Nonlinear Mapping and Distance Geometry

Distance Geometry Problem (DGP) and Nonlinear Mapping (NLM) are two well established questions: Distance Geometry Problem is about finding a Euclidean realization of an incomplete set of distances in a Euclidean space, whereas Nonlinear Mapping is a weighted Least Square Scaling (LSS) method. We show how all these methods (LSS, NLM, DGP) can be assembled in a common framework, being each identified as an instance of an optimization problem with a choice of a weight matrix. We study the continuity between the solutions (which are point clouds) when the weight matrix varies, and the compactness of the set of solutions (after centering). We finally study a numerical example, showing that solving the optimization problem is far from being simple and that the numerical solution for a given procedure may be trapped in a local minimum.

08/31/2021

### Riemannian Optimization for Distance Geometric Inverse Kinematics

Solving the inverse kinematics problem is a fundamental challenge in mot...
04/29/2018

### Embedding with a Rigid Substructure

This paper presents a new distance geometry algorithm for calculating at...
06/29/2022

### Minimum Weight Euclidean (1+ε)-Spanners

Given a set S of n points in the plane and a parameter ε>0, a Euclidean ...
06/20/2020

### Cycle-based formulations in Distance Geometry

The distance geometry problem asks to find a realization of a given simp...
02/04/2021

### The Analysis from Nonlinear Distance Metric to Kernel-based Drug Prescription Prediction System

Distance metrics and their nonlinear variant play a crucial role in mach...
09/21/2019

### Efficient estimation of a Gromov--Hausdorff distance between unweighted graphs

Gromov--Hausdorff distances measure shape difference between the objects...
06/02/2021

### Matrix factorisation and the interpretation of geodesic distance

Given a graph or similarity matrix, we consider the problem of recoverin...

## 1 Introduction

Let us have a set of objects, and distances between every pair of them. The distance between object and is denoted , or . This defines a metric space with . Among metric spaces, Euclidean spaces play a special role, because they establish a link with geometry. Moreover, the geometry of Euclidean spaces is well understood. Hence, even if many other metric spaces exist, like Riemanian manifolds with distances along a geodesic or graphs with shortest distance between vertices, many efforts have been devoted to specify those metric spaces for which an isometry exists with a Euclidean space, and if such an isometry exists, to build it. In such a case, any metric problem in

can be translated into a problem in Euclidean geometry, and, when lucky, solved. For example, supervised learning by discriminant analysis in a discrete metric space can be translated into the same problem in a Euclidean space and solved by Support Vector Machine approaches (see

).

The conditions for existence of an isometry between a discrete metric space and a subset of a Euclidean space are known, and are given by classical multidimensional scaling, proposed in  (see [12, 1] for classical presentation of MDS, and  for a recent presentation). If there is an isometry between and a subset of points in a Euclidean space , then the Gram matrix of vectors is definite positive. It is known that the Gram matrix can be computed from pairwise distances only. A set of points such that

 ∀i,j∈V,∥xi−xj∥=dij (1)

can be computed from the Singular Value Decomposition of the Gram matrix as a second step. If the Gram matrix has non positive eigenvalues

111In such a case, strictly speaking, the matrix built from the pairwise distances is not a Gram matrix., there is no isometry on any subset of , regardless of the dimension. In such a case, for a given dimension , one defines the cost of a map by

 ϕ=∑i

If there is an isometry between and a subset of , then there is a map for which . If not, Least Square Scaling (LSS) finds a map with minimal cost for a given dimension by solving

 Given a discrete metric space (V,d) a dimension k find a map i∈V⟼xi∈Rk such that ϕ is minimal

Least square scaling has been pioneered in . See as well  for a presentation and a comparison with classical MDS222It is unfortunate that least-square scaling has been proposed with the same name MDS than classical MDS. However, classical texts like [12, 1, 5] are clear on this matter and agree on setting the vocabulary..

In some situations, one is interested in relative error/distorsion between the distances and . Therefore, Sammon has developed in  what he called Non Linear Mapping (NLM) in which each term in the cost function is weighted by the inverse of the distance:

 ϕ=∑i

(Sammon introduces a normalizing constant , that we do not mention here). It is natural to extend this towards

 ϕ=∑i

where is a weight function, which can be as in Sammon’s seminal paper, or other classical maps such as or . Nonlinear mapping is minimizing the cost function over all possible point sets. It is clear that a solution is not unique, and is given up to an isometry in a Euclidean space, i.e. the composition of a translation, a rotation and a reflection.

Let us now consider a slightly different situation, where there exists an unknown point set of points in a Euclidean space and where distances are known for a subset only of pairs of points. Here, it is known as part of the problem that the distances are taken between points living in a Euclidean space, whereas in NLM, such an hypothesis is not required. Let us consider the graph where is the set of pairs for which is known. The aim is to find a mapping

 x:V−−−−→Rki−−−−→xi (5)

such that

 ∀(i,j)∈E,∥xi−xj∥=dij (6)

This is known as Distance Geometry Problem (DGP, see , equation ). A recent and thorough survey of this problem with historical background on how it grew over decades and in different guises is . See also . Here, the dimension is given, and the distances are known accurately. In real world problems, such as determination of protein structures, the distances are known up to a given precision only. DGP has been studied as well as embeddability problem for graphs. It has been proved in  that the embeddability problem is NP-complete and the embeddability problem is NP-hard for .

A link between both problems has been established in , where the DGP problem is recast as finding a map

 x:V−−−−→Rki−−−−→xi

such that

 ϕ(X)=∑i,j∈Sωij(∥xi−xj∥2−d2ij)2 (7)

is minimal, where is the matrix with in row and are weights. Let us note different choices for the exponents of the quantities to be compared: in equation (7) and in equation (4). Clearly, if DGP has a solution, the minimum of is zero, and a set of rows of where is a solution of the DGP problem. A known difficulty is that has in general many local minima. The technique used in  is to progressively smooth by a convolution (see [11, section 3.2.2] for a general presentation of smoothing-based methods and of DGSOL which implements it).

If all weights are equal to 1 in (4), one recovers least-square scaling (see ). If one has

 {(i,j)∈E⇒ωij=1(i,j)∉E⇒ωij=0

one recovers DGP. There is a sort of continuity between least-square scaling, nonlinear mapping and DGP when the weights vary smoothly from 1 to 0 on pairs of items outside .

Here, we study whether this continuity can be given a sound basis in a relevant topology, and whether it can be translated into a continuity of numerical solutions between NLM and DGP when one or a set of parameters vary.

## 2 Continuity between LSS, NLM and DGP

Let be a discrete metric space, with . We denote with . Let be a weight function on distances. being known, this yields a weight matrix of general term with enabling to run a nonlinear mapping of in a Euclidean space , where is the image of . The set of points is denoted as a matrix , with being row . The set of real matrices is denoted . We define

 ϕ(X,Ω)=n∑i,j=1ωij(∥xi−xj∥2−d2ij)2 (8)

and consider the NLM problem

 (9)

conditioned by . This problem encompasses the problems mentioned above. For example, one can see that

• if , i.e. the matrix with ones only, or , problem (9) is least-square scaling (see )

• if , i.e. the matrix with zeros only, or , any point cloud is a solution

• if is a symmetric boolean matrix, i.e. , problem (9) is DGP.

Let be the graph such that if . Then, in DGP vocabulary, is a realization of . (see [11, sec. 1.1.4]). Let us mention that here and in the following is undetermined, and one can select or . This can be summarized as follows:

minimum of minimum of
Least square scaling Isometry with a Euclidean space
= Classical MDS
Nonlinear mapping
if Distance Geometry Problem

This raises the question of the continuity of the solutions of (9) depending on . The definition of continuity is not straightforward as there is a set of matrices which are solutions of (9). We begin by an intuitive notion of continuity, and make it rigorous in section 3. A solution is said continuous at if, being a solution at , whatever the neighborhood of , there exists a neighborhood of such that for each in it there is a solution in , or333Rigorously, one should replace by: there exists an in the set of solutions for such that .

 ∀ϵ>0,∃η>0:|Ω′−Ω|<η⟹|X′−X|<ϵ

One can observe that the solution is not continuous at . Indeed, any point cloud is a solution for . Let us take an which has a "nice" solution, i.e. the set of solutions is an orbit of the group of isometries acting on , and denote a solution. Whatever ,

 ϕ(X,ηΩ)=ηϕ(X,Ω)

and for any , the set of solutions for still is the set of solutions for . Let us take a point cloud distant from any in this set of solutions, i.e. for some whatever the point cloud in the set of solutions for . is a solution for . Whatever the neighborhood of , there is a such that is in this neighborhood. And, for this , there is no point cloud in the neighborhood of that is a solution for . Then, the solution of (9) is not continuous at . Let us note that and are incongruent in the sense of [11, sect. 1.1.4.3], as a congruency class is an orbit of the action of the group of isometries in .

Our objective is to study the continuity of the solution of (9), when varies within the space of matrices with non negative elements. The motivation for this is that each problem has a specific approach to build a solution:

• DGP builds it explicitly (often with an optimization scheme)

• LSS or NLM uses an optimization scheme

We study here whether a continuity between solutions of NLM, when a parameter varies in a family and the limit corresponds to a DGP problem (some distances have zero weight), can lead to a numerical solution of DGP. On the other hand, efficient optimization schemes have been derived for solving DGP which are close to some used for NLM (DC programming, see  and references 6 & 7 therein).

## 3 A topology on the set of solutions

Let us define

 ψ(Ω)={X∈M(n,k):ϕ(X,Ω)is minimal}

For a given , if , we have

 ψ(Ω)=ϕ−1(m)

We will define a topology in two spaces, in which: an element is a point cloud and an element is a set of point clouds. The latter will enable to define the neighborhood of a set for a given and study continuity of . If the topology is associated to a distance, this will read

 ∀ϵ>0,∃η>0:|Ω′−Ω|<η⟹d(ψ(Ω′),d(Ω))<ϵ

For this, we must define a distance . A distance can be defined between compact subsets of a metric set: the Hausdorff distance (see e.g. [6, p. 34]). Let us briefly recall its definition. Let be two closed sets in a metric space where the distance between points is denoted . One defines

 δ(A,B)=maxx∈A{miny∈Bd(x,y)}

It is easy to show that . But it is not symmetric. So one defines

 d\textsch(A,B)=max{δ(A,B),δ(B,A}

This is defined if and are compact subsets, and the triangular identity is fulfilled. We denote the Hausdorff distance between and . We first define a distance between two points clouds and as the Hausdorff distance between them. For this to hold, and must be compact. As they are obviously closed, they must be bounded. Therefore, we show a first lemma which gives the condition under which is bounded. Let be the graph associated to and defined by

 (i,j)∈E⟺ωij>0

Then

###### Lemma 1.

The set is bounded if, and only if, is connected.

###### Proof.

connected is bounded. We show first that is bounded for any pair if . If , we have

 ϕ(X,Ω)=∑i∼jωij(∥xi−xj∥2−d2ij)2=m

Hence,

 ∀i∼j,ωij(∥xi−xj∥2−d2ij)2≤m

and

 ∥xi−xj∥2≤d2ij+mωij

Let and . Then,

 ∀i∼j,∥xi−xj∥2≤δ2+mω

Now, let with . As is connected, there is a path linking with . By triangular inequality, (with and ). As is finite, it has a diameter , and . If there is a such that for any pair and , then is bounded.

not connected is not bounded. If is not connected, there are at least two subsets of vertices without connection between and . We have

 ϕ(X,Ω)=∑i,j∈Aωij(∥xi−xj∥2−d2ij)2+∑k,m∈Bωkm(∥xk−xm∥2−d2km)2

Let us now set

 x′i=xi+h|A|,x′j=xj−h|B|

We still have and, , . If , is unbounded

As a consequence, is bounded if, and only if, is connected. ∎

Then, the set of point clouds solution of (9) for a weight matrix with a connected associated graph is a metric space, with Hausdorff distance. We then consider the subsets , running over the matrices fulfilling connectedness condition. We call a solution of (9), and the set of solutions. is closed as it is the pre-image of , the minimum of . But it is not bounded. Indeed, is invariant by any isometry in . A translation is an isometry, and the distance between two solutions such that with is . Then, is unbounded. Therefore, we impose on each solution that it is centered, and consider

 ψ\textscc(Ω)={X∈ψ(Ω):∑ixi=0}

Then is bounded. Being a closed and bounded subset of a metric space, it is a compact set. We can use the Hausdorff distance between and , and continuity of the solution of (9) is defined as

 ∀ϵ>0,∃η>0:|Ω′−Ω|<η⟹d\textsch(ψ(Ω′),d(Ω))<ϵ (10)

It is reasonable to impose that for any pair of points . The set of such weight matrices is homeomorphic to the hypercube with . The weight matrices associated to DGP are boolean. They correspond to vertices of the hypercube.

## 4 Continuity and rigidity

Let us denote by the set of weight matrices such that the associated graph is connected, and by the set of all centered point clouds with points. Then, is a compact subset of .

Let be the weight matrix of a DGP, i.e. there is a connected graph such that if and if . We show here that the continuity of at is linked with the rigidity of the framework associated to the graph . A framework is a set of points in solution of (6). A framework is rigid if it is defined up to an isometry (i.e. the set of known distances is sufficient to derive all other distances in a unique way). Otherwise, it is said flexible (see [11, sect. 1.1.4] for the definitions, and section 4.2 of same reference for a thorough discussion of those notions). We show here

###### Lemma 2.

The set of weight matrices for which any realization in is a rigid framework is not closed.

###### Proof.

For this, it suffices to exhibit a sequence for which for any DGP solution is a rigid framework which converges to a weight matrix for which the solution is flexible. Let us consider and (i.e. 3 points in ) and

 D=⎛⎜⎝01110√21√20⎞⎟⎠andΩη=⎛⎜⎝01110η1η1⎞⎟⎠,0≤η≤1

where is the pairwise distance matrix between points. A realization is

 X′=⎛⎜⎝x=00y=01z=10⎞⎟⎠

If , the realization is defined up to an isometry: the lengths of the edges of the triangle made by are known, and this fixes the triangle up to an isometry. Whereas if , is the set of points such that , which is isomorphic to where is the group of rotations in ( and are on the circle of center and radius 1). At , is flexible. ∎

As a consequence, is not continuous at . Indeed,

 X=⎛⎜⎝0010−10⎞⎟⎠∈ψ(Ω)

and there is a constant (the calculation is easy, and omitted here) such that

 ∀η>0,∀X′∈ψ(Ωη),d(X,X′)>C

whereas . and are incongruent.

## 5 Convergence to a Heaviside function

We first succinctly give a motivation for the next short case-study of a given family of weight matrices. Molecular based taxonomy consists in assigning specimens to species based on similarities between some relevant DNA sequences, called markers . Distances between sequences are computed with classical algorithms (see ). This dictionary between morphological based and molecular based taxonomy works up to a given threshold of distance (beyond this threshold, computed genetic distances are highly likely to be blurred). Thus, as a simplified setting, distances up to a given threshold only are taken into account. We have a set of pairwise distances, and we wish to build an Euclidean image of it that is as accurate as possible, ignoring the distances beyond a given threshold. Provided a dimension is given, this can be formalized as a DGP problem, where in distances below a given threshold only are given, i.e. is defined by

 E={(i,j):d(i,j)≤θ}

The weights are given by a Heaviside function . Such a function has a discontinuity at . We can consider the weight function

 ωa,θ(d)=1−tanh(a(d−θ))2 (11)

as well because distances are blurred progressively. We have

 lima→∞ωa,θ(d)=H(θ−d)

with the topology of uniform convergence

 ∀ϵ>0,∃α∈R+:∀d>0,∀a>α,|ωa,θ(d)−H(θ−d)|<ϵ

Let us assume that is fixed. We are given a converging family of weight matrices satisfying

 Ω∞=lima→∞Ωa

Then, if is continuous at , one can define

 ψ(Ω∞)=lima→∞ψ(Ωa)

As is the weight matrix of a DGP, i.e. there is a graph such that for and for , this means that the solution of DGP can be found as the limit of a family of solutions of NLM, when .

Let us make a few remarks on possible frameworks to address such a problem. We call Euclidean image a solution of problem (9).

• the dimension is not fixed by the problem itself. Hence, this problem can be rephrased as EMDCP.

• It is however interesting to have a Euclidean image in a low dimensional space where shapes of point clouds can be better studied (e.g. )

• In order to build such a solution, we can select a large

, and project it in an optimal way in a low dimensional space (by Principal Component Analysis)

• The choice to ignore distances beyond a given threshold is similar to the Isomap procedure, which is a manifold learning technique using graph based distances to approximate geodesic distances on a manifold [17, 9]. A link between Isomap and the solution of a DGP has recently been made in .

Here, we fix a dimension (in our example, ) and study the convergence of NLM solutions to a solution of DGP. We focus on the numerical stability of such an approach. Indeed, for a given , the manifold in defined by

 M={(X,z)∈Rn×k×Rs.% t.z=ϕ(X,Ω)}

is far from having a unique minimum on . The set when is the minimum of is a minimal submanifold (all its points have a minimum "elevation" ). It is likely that many other local minima exist, as well as "flat valleys". We study here whether the ill behavior of may hamper to use the continuity of the solution between NLM and DGP as a way to obtain a numerical solution to DGP. We mention that there exist several efficient methods to solve DGP by numerical optimization, like Difference on Convex functions programming , Distance Continuation 

or Isomap-based heuristics

. We are interested here in the continuity between solutions of NLM and DGP on continuous families of weight matrices .

## 6 Numerical optimization schemes

We have implemented a numerical optimization scheme for solving problem (9) with function (11) for , with two methods known to be efficient for finding global minima: BFGS and basin hopping  in package scipy.optimize. As distances are unreliable beyond a given threshold, we have adopted the framework of isomap-based heuristics . We have built a data set in silico, that we wish to recover (it should be the solution of DGP and NLM problems). The objective is to test the continuity of the numerical solution. The dataset consists of points in randomly (uniform law) distributed within the ring delimited by circles and centered at origin (see FIGURE 1 top left). We have selected the weight function defined in equation (11) with different values of stiffness coefficient and threshold . The numerical results for various values of and are presented in TABLE 1 for both BFGS and Basin Hoping. Each cell contains the value of the cost function for the result of the procedure for one value of , one of and one method. The starting point for NLM optimization phase could be the result of Multidimensional Scaling. But this requires the knowledge of all distances. In our case, we wish to avoid the use of distances larger than a given threshold (even if they have been measured). Therefore, the starting point is the point cloud built on distances as computed by Isomap on the partial distance matrix of distances . This yields the following observations:

• The cost function for Isomap + BFGS is always equal to or lower than the cost function for Isomap + basin hoping. For this type of problem, Isomap + BFGS is recommended (we have tested other methods, results not shown, like simulated annealing, for which results were worse).

• The cost function decreases when increases for a given , and is less sensitive to . If

is too small, there is a significative probability that the graph of partial distanes is not connected. When

and are low, the optimization step may not converge.

A picture of the datasets obtained by each method (Isomap, Isomap + BFGS, Isomap + basin hopping) is displayed in FIGURE 1. The eye can recognize a deformed ring in the bottom right graph, namely the best reconstruction with basin hopping. The optimization scheme has been trapped on this pattern. One is tempted to twist the outer small loop to recover a shape close to a ring. The fact that this twist (ouwards like here, or inwards in some other simulations) of a fraction of the ring is a trap can be heuristically understood: the main discrepancy between exact distances and reconstructed distances is for those points which have been twisted outward. They are much closer to the points on the opposite on the ring on the reconstruction than in the initial data set (top left). However, the stiffness of the decrease of the weights for lets the cost function be nearly insensitive to those discrepancies. The role of stiffness is then more important than the role of threshold as the weight of pairs of points separated by a large distance (like ) is annihilated by a very low weight ( for and ). Careful observation of many simulations lead to the observation that similar low cost function values may correspond to very different geometric settings for the solution.

## 7 Conclusion

We have set a framework to study continuity of the solution of (9) when the weight matrix varies. The main point to address is that a solution to is never a single point cloud, but always a set of point clouds, a union of orbits of the action of the group of isometries on . We have exhibited an example where the solution is not continuous at a weight matrix where the realization of the solution is not rigid. We expect that it is continuous when realizations are rigid, but this has not been shown here. This is deferred to further work on study of the topological structure of the solutions in relation with the weight matrices. We have linked DGP and NLM in a common framework using this continuity. We have studied whether the continuity of the solution can serve as a basis for ensuring the continuity of numerical solutions when a parameter varies in the weight matrix. Therefore, we have built a simple in silico dataset, and derived a procedure with the output of Isomap as initial point for optimization step in NLM, with two optimization schemes (BFGS and Basin Hopping). We have produced good hints to show convergence of NLM solution to DGP solution (a situation when ). We have shown as well that different geometric settings of the solution may correspond to similar very low values of the cost function. It is likely that this is due to the complicated shape of the manifold of cost function as function of coordinates of a point cloud, and motivates further studies. Figure 1: Results of numerical simulations for reconstructing the point cloud knowing pairwise distances lower than a given threshold only. Parameters: θ=0.5 and a=10 (see text). Cost function for BFGS is ϕ=1.018 and for Basin Hopping ϕ=2.211. Top left: Simulated dataset. Top right: dataset reconstructed with standard Isomap procedure. Bottom: dataset reconstructed with BFGS optimization scheme (left) or Basin Hopping (right) with top right dataset as initial values.

## References

•  Cox, T., Cox, M.A.A.: Multidimensional Scaling - Second edition, Monographs on Statistics and Applied Probability, vol. 88. Chapman & al. (2001)
•  Cristianini, N., Shawe-Taylor, J.: An introduction to Support Vector Machines and other Kernel-based learning methods. Cambridge University Press (2000)
•  Gusfield, D.: Algorithms on strings, trees, and sequences. Cambridge University Press, Cambridge, UK (1997)
•  Hillis, D.M., Moritz, C., Mable, B.: Molecular Systematics. Sinauer, Sunderland, Mass. (1996)
•  Izenman, A.J.: Modern Multivariate Statistical Techniques. Springer, NY (2008)
•  Krantz, S.G., Parks, H.R.: Geometric integration theory. Birkaüser (2008)
•  Kruskal, J.B.: Non Metric Multidimensional Scaling: a numerical approach. Psychometrika 29(2), 115–129 (1964)
•  Le Thi, H.A., Pham Dinh, T.: DC programming Approaches for Distance Geometry Problems, chap. 13 in Distance Geometry, Theory, Methods and Applications, Mucherino & al. (Ed.), pp. 225–290. Springer (2013)
•  Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Springer, NY (2007)
•  Liberti, L., D’Ambrosio, C.: The Isomap Algorithm in Distance Geometry. In: Iliopoulos, C.S., Pissis, S.P., Puglisi, S.J., Raman, R. (eds.) 16th International Symposium on Experimental Algorithms (SEA 2017). pp. 5:1–5:13. Leibniz International Proceedings in Informatics (2017)
•  Liberti, L., Lavor, C., Maculan, N., Mucherino, A.: Euclidean distance geometry and applications. SIAM review 56(1), 3–69 (2014)
• 

Mardia, K.V., Kent, J., Bibby, J.M.: Multivariate Analysis. Probability and Mathematical Statistics, Academic Press, (1979)

•  Moré, J.J., Wu, Z.: Global continuation for Distance geometry Problems. SIAM J. Optim. 7(3), 814–836 (1997)
•  Mucherino, A., Lavor, C., Liberti, L., Maculan, N. (eds.): Distance Geometry. Springer (2013)
•  Sammon, J.W.: A nonlinear mapping algorithm for data structure analysis. IEEE Transactions on Computers 18(5), 401–409 (1969)
•  Saxe, J.B.: Embeddability of weighted graphs in k-space is strongly NP-hard. In: 17th Allerton Conference in Communication Control and Computing. pp. 480–489 (1979)
•  Tennenbaum, J.B., de Silva B., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323 (2000)
•  Torgerson, W.S.: Multidimensional Scaling: I. Theory and Method. Psychometrika 17(4), 401–419 (1952)
•  Wales, D.J., Doye, J.P.K.: Global Optimization by Basin-Hopping and the Lowest Energy Structures of Lennard-Jones Clusters Containing up to 110 Atoms. Journal of Physical Chemistry A 101(5111-5116) (1997)