1 Introduction
We consider the following problem [28, 19], which arises in the determination of protein structure from distance data [27, 22], as well as in the study of rigid graphs constructed by “Henneberg type 1 moves” [13, 34]:
Discretizable Distance Geometry Problem (DDGP). Given an integer , a simple undirected graph with an edge weight function , and a vertex order on such that:
(the subgraph of induced by ) is a clique of size , where
,
determine if there is an embedding such that:
(1)
The DDGP is a subclass of the more general Distance Geometry Problem (DGP) [30, 19]: given as above, determine if there is a realization satisfying Eq. (1). An embedding satisfying Eq. (1) is called a realization. With a slight abuse of notation we shall also refer to an “invalid realization” to denote an embedding which does not satisfy Eq. (1), as well as, pleonastically, to a “valid realization”. We also note that the vertex order and the sequence of sets need not be unique.
Note that a realization in of a graph on vertices can be represented as an matrix. The symmetric zero-diagonal matrix having as its th entry is a squared Euclidean Distance Matrix (EDM). It turns out that (where is the centering matrix and is the all-one vector) is the Gram matrix of the realization , i.e. [33, 9]. Moreover, , which implies that [10]; since , we also have .
Given a realization , we can compute the corresponding EDM by evaluating all Euclidean distances between . Given an EDM for , we can compute a valid realization by obtaining the Gram matrix as a function of as explained above, and then factoring using the spectral decomposition , where is a matrix of eigenvectors and a diagonal matrix of corresponding eigenvalues. Then is a valid realization of (not necessarily equal to ).
As mentioned above, the DDGP is a subclass of the DGP including all instances having a vertex order that ensures that the first vertices form a clique in , and for each remaining vertex there is a set , of vertices, each of which is adjacent to and precedes in the given order. This structure allows the application of a certain geometric operation called trilateration [19] (see Sect. 2.1 below). Trilateration determines, almost surely, at most two positions for vertex using the distances for each . We remark that, when generalized to arbitrary , trilateration is sometimes called lateration. Moreover, it takes polynomial time in [2]. Since is usually fixed in applications, it takes constant time.
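The conversions between realizations, squared EDMs and Gram matrices recalled earlier in this section can be sketched numerically as follows. This is a minimal illustration in Python with NumPy, not code from the paper: the function names, the convention of storing one point per row, and the clipping of slightly negative eigenvalues (a floating-point safeguard) are all assumptions made for the example. The Gram matrix is obtained by the standard double centering of the squared EDM, and the realization by keeping the largest eigenvalues of its spectral decomposition.

```python
import numpy as np

def edm_from_realization(x):
    """Squared EDM of a realization x given as an n-by-K array (one point per row)."""
    sq = np.sum(x * x, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(D, 0.0)
    return np.maximum(D, 0.0)  # remove tiny negative round-off

def gram_from_edm(D):
    """Gram matrix of a centered realization: -1/2 * J D J, with J the centering matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ D @ J

def realization_from_edm(D, K):
    """Factor the Gram matrix spectrally and keep the K largest eigenvalues."""
    evals, evecs = np.linalg.eigh(gram_from_edm(D))  # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:K]
    lam = np.clip(evals[idx], 0.0, None)  # assumption: clip round-off negatives
    return evecs[:, idx] * np.sqrt(lam)
```

The recovered realization generally differs from the original one by a congruence, so a meaningful check compares the two EDMs rather than the coordinates.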
Trilateration is a construction also known as “Henneberg type 1” [13, 34], which entails that all DDGP graphs are rigid. In particular, they have a finite number of incongruent realizations. This follows by definition of rigidity: every isometric continuous motion of a subset of vertices must involve all vertices, and hence be a congruence.
The DDGP is also a superclass of the Discretizable Molecular Distance Geometry Problem (DMDGP), which requires each to consist of the immediate predecessors of . The DMDGP [35, 16], the DDGP [28] and the DGP [32] are all NP-complete. Given a DGP instance, recognizing whether it is DDGP is known as the Trilateration Ordering Problem (TOP); recognizing whether it is DMDGP is known as the Contiguous TOP (CTOP). It turns out that TOP is NP-complete, but it is in P for every fixed , whereas CTOP is NP-complete even for fixed [6]. For many protein graphs, however, it is possible to construct a contiguous trilateration order efficiently from the protein backbone [17, 7, 15], which makes the DMDGP a practically interesting class [18].
Repeated trilateration applied to DDGP and DMDGP instances yields an exact algorithm (in the real RAM model [3]), as follows. The realization of the initial clique of size can be carried out in constant time (assuming fixed) by trilateration; then for each subsequent we construct two alternative positions (again in constant time by trilateration), and branch on them. We verify whether neither, one of them, or both satisfy the distances to the predecessors of not in (if any), and prune those which do not. We obtain a tree search, called Branch-and-Prune (BP) [4, 21], over the set of possible positions for vertices . This tree has width at most and depth at most . If has depth , then no positions could be found for vertex , which means that the instance is NO. Otherwise, the instance is YES; and any sequence of positions found by the BP for all vertices in , where is a sequence of , is a realization of which certifies a YES. We recall that this certificate is only valid in the real RAM model, which describes a computer able to represent real numbers exactly. In practice, we take , perform operations in floating point, and attempt to minimize numerical errors using a variety of techniques [31, 27, 5, 11, 26].
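The tree search just described can be sketched as follows for the planar case. This is a hedged illustration rather than the implementation of [4, 21]: it assumes dimension 2, 0-based vertices listed in the discretization order, a dictionary `dist` of edge weights keyed by ordered pairs, a dictionary `U` giving the two adjacent predecessors used for each vertex, and a floating-point tolerance `tol`; all of these names are hypothetical. In the plane, trilateration reduces to intersecting two circles.

```python
import numpy as np

def circle_intersections(p, a, q, b, tol=1e-9):
    """Points at distance a from p and b from q in the plane (0, 1 or 2 points)."""
    u = q - p
    D = np.linalg.norm(u)
    u = u / D
    t = (a * a - b * b + D * D) / (2.0 * D)  # signed offset of the chord along u
    h2 = a * a - t * t
    if h2 < -tol:
        return []
    base = p + t * u
    if h2 <= tol:
        return [base]
    perp = np.array([-u[1], u[0]])
    h = np.sqrt(h2)
    return [base + h * perp, base - h * perp]

def branch_and_prune(n, dist, U, tol=1e-6):
    """Enumerate all realizations found by branching on the two candidate positions.

    The initial clique (vertices 0 and 1) is fixed on the horizontal axis, so
    mirror images with respect to that axis are both enumerated.
    """
    x = np.zeros((n, 2))
    x[1] = [dist[(0, 1)], 0.0]
    solutions = []

    def recurse(v):
        if v == n:
            solutions.append(x.copy())
            return
        u1, u2 = U[v]
        for cand in circle_intersections(x[u1], dist[(u1, v)], x[u2], dist[(u2, v)]):
            # Prune with the distances to adjacent predecessors outside U[v].
            feasible = all(
                abs(np.linalg.norm(cand - x[u]) - dist[(u, v)]) <= tol
                for u in range(v)
                if (u, v) in dist and u not in (u1, u2)
            )
            if feasible:
                x[v] = cand
                recurse(v + 1)

    recurse(2)
    return solutions
```

An empty list corresponds to a NO instance under the chosen tolerance; any returned array is a candidate certificate for YES, subject to the floating-point caveats mentioned above.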
We remark that the tree is a graph defined over , and is therefore itself naturally embedded in . Limited to the DMDGP only, two invariant groups of the embedding of were described in [25, 24]. Both groups are reflection groups. The discretization group is the invariant group of maximum width trees with leaf nodes, where each vertex is adjacent only to the predecessors in (and possibly some successors); unsurprisingly, its cardinality is a power of two. The pruning group, a subgroup of the discretization group, is the invariant group of the more general case where vertices may be adjacent to the predecessors in but also to other predecessors. More surprisingly, the cardinality of the pruning group is also a power of two. The simple expressions of the cardinalities of these groups derived in [24] were used to argue that the BP algorithm is Fixed-Parameter Tractable (FPT) [23]. They also allowed the determination of the number of incongruent solutions [20], and of a new “pruning device” for the BP algorithm [29] based on symmetry. On the other hand, it was also shown that random DMDGP instances are unlikely to possess large pruning groups [8], and, in particular, that this likelihood rapidly decreases with size.
The techniques used for the structure determination of the discretization and pruning groups are specific to the DMDGP. No easy extension to the DDGP has been found so far using those techniques (see [1] for an attempt). In this paper, we propose a new theoretical analysis of the number of solutions of the DDGP. Specifically, we show that an a priori computation (i.e. before running the BP algorithm on the given instance) of the number of incongruent solutions of a DDGP instance is only possible in those instances for which each set of adjacent predecessors induces a clique of size in the given graph . For those instances, we prove that the number of incongruent realizations is almost surely a power of two, similarly to the DMDGP.
The rest of this paper is organized as follows. In Sect. 2, we analyse the difference between DMDGP and DDGP, we recall the trilateration operation, and give a formal definition of “combinatorial counting”. In Sect. 3 we prove our impossibility result, and present a simple subclass of the DDGP where combinatorial counting is possible.
2 Preliminary notions and definitions
We recall that most of the properties discussed above only hold almost surely: this occurs because trilateration may fail to work as expected with probability zero, notably when the points realizing vertices in are not in general position [12, p. 20]: if is a realization of in general position and then, for each with , spans an affine subspace of dimension .
It is always possible to construct infinite families of instances where the edge weight function is carefully chosen so that there may be more than two possible positions for vertex using trilateration [24]. But these families all have measure zero in the set of all DDGP (and DMDGP) instances. The same holds for all of the results in this paper. For brevity, we shall refer to “probability zero instances” (those over which trilateration fails) and “probability one instances” (the rest); see also Sect. 2.2 below.
The only difference between DMDGP and DDGP is that the sets of adjacent predecessors must also be immediate in the former case, namely . This directly implies that each must be a clique of size in , which is the property which made it possible to study the symmetries and number of solutions of the DMDGP using the techniques sketched above. DDGP instances may not have this property, however.
For each we let , and . Moreover, let:

be the neighbourhood of ;

.
We partition the edge set into the discretization edges and pruning edges .
2.1 The trilateration operation
Given points and their distances to an unknown point , can be determined by solving the quadratic system of equations in unknowns
(2) 
The trilateration operation is as follows:

Rewrite Eq. (2) as .

Arbitrarily choose one of these equations, e.g. the th one, and form the system of equations in unknowns given by the difference of the th equation with the th one; this removes the term from all equations, leaving the following (after some rearrangements):
(3) which is a linear underdetermined system in .

We replace in and obtain a quadratic equation in the single unknown . We solve this equation and obtain two solutions with probability 1, yielding two positions for by using Eq. (4).

Finally, we check that satisfy the original equations Eq. (2). If they do, the system has two solutions with probability 1. Otherwise, it is infeasible.
We denote by the trilateration operation used to determine the position of vertex as a function of the positions . We remark that either or almost surely.
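The steps of the trilateration operation can be sketched in code as follows. This is an illustrative reading of the procedure, not the authors' implementation: the function name `trilaterate`, the array layout (one known point per row, as many points as the dimension), the tolerance values, and the use of a least-squares particular solution together with a null-space direction obtained from the singular value decomposition are assumptions made for the example. Subtracting the first sphere equation from the others gives the linear underdetermined system; its solution set is a line, and substituting that line into the first sphere equation yields a quadratic in a single unknown.

```python
import numpy as np

def trilaterate(P, d, tol=1e-9):
    """Positions x with ||x - P[i]|| = d[i] for all i; P is K-by-K, d has length K (K >= 2)."""
    P = np.asarray(P, dtype=float)
    d = np.asarray(d, dtype=float)
    # Differences of the sphere equations: K-1 linear equations in K unknowns.
    A = 2.0 * (P[1:] - P[0])
    b = np.sum(P[1:] ** 2, axis=1) - np.sum(P[0] ** 2) - (d[1:] ** 2 - d[0] ** 2)
    x0 = np.linalg.lstsq(A, b, rcond=None)[0]  # one solution of A x = b
    direction = np.linalg.svd(A)[2][-1]        # unit vector spanning the null space of A
    # Substitute x = x0 + t * direction into ||x - P[0]||^2 = d[0]^2.
    w = x0 - P[0]
    half_b = w @ direction
    disc = half_b ** 2 - (w @ w - d[0] ** 2)
    if disc < -tol:
        return []
    if disc <= tol:
        ts = [-half_b]
    else:
        root = np.sqrt(disc)
        ts = [-half_b + root, -half_b - root]
    candidates = [x0 + t * direction for t in ts]
    # Final check against the original equations, as in the last step above.
    return [x for x in candidates
            if np.allclose(np.linalg.norm(P - x, axis=1), d, atol=1e-6)]
```

When the known points are not in general position the linear system loses rank, and the sketch may return no position or only some of the possible positions; those are the probability zero cases excluded in the text.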
2.2 What we mean by “counting”
In the real RAM model, the DDGP problem contains an uncountable number of instances, since the edge weight function maps to the real numbers. This allows us to make statements with some probability (usually zero or one). The trilateration operation, for example, determines zero or two positions for a vertex with probability one (Sect. 2.1). If certain special relations between the edge weights hold, it might also determine a single position, or uncountably many [14]. These relations between the edge weights induce relations between the points in the valid realizations of the graph, which turn out to be linear equations such as Eq. (3).
It is intuitive to think that when one or more edge weights continuously change their values within some small enough interval, the positions of the adjacent vertices typically trace continuous trajectories in space.
2.1 Example
Consider a triangle graph over with and , embedded in . If and , then moves continuously on the segment for as move continuously in (see Fig. 1, left).
At (corresponding to ) the three points are aligned, and therefore their affine span has deficient rank equal to : this is a “probability zero” realization. All of the other values in the interval define a nontrivial isosceles triangle having full affine span rank .
A different choice of might have yielded an interval where the affine span rank of the associated realization is always full, e.g. . For more complicated graphs it is possible to have situations where both endpoints of the interval yield realizations of deficient ranks.
Suppose now that we add another vertex (labelled by ) to the triangle graph above. We let be adjacent to with edge weights and . We consider realizations in . When we apply the trilateration operation to the probability zero realization , , , can move in a circle of radius and centered at . In other words, this trilateration operation finds an uncountable number of positions for (see Fig. 1, right).
In light of Example 2.1, we can also define probability zero events over DDGP instances as follows: we construct an uncountable set of DDGP instances where the edge weights are allowed to vary over given intervals, and show that the probability zero event only holds at a finite or countable number of values of the weights in the corresponding intervals.
The goal of this paper is to count realizations of DDGP instances a priori. The counting methods we consider may not take any feature of the solution into account (for otherwise, the counting problem would be solved by finding all of the finitely many incongruent solutions and counting them). Moreover, we want to avoid events leading to failure of the trilateration operation, such as those shown in Example 2.1. Since these events happen with probability zero, they can be ignored by only considering combinatorial counting methods, i.e. those methods which only consider the graph topology.
3 Can we count DDGP realizations combinatorially?
In this section we claim that we can only (combinatorially) count realizations for a special subclass of DDGP instances, namely when induces a clique of size in for all . Our argument is based on the (easier) case of YES instances with no pruning edges.
For each let be the number of positions, found by the BP algorithm for vertex , which eventually lead to a valid realization of . We assume that the given DDGP instance is YES, and, wlog, that . Moreover, since the only possible choice for is , which are the immediate predecessors of , the DMDGP and DDGP coincide on instances of size , which implies that [16].
We start with the trivial observation that, by trilateration, there are two positions for vertex for each position of vertex :
(5) 
We now look at conditions which might cause to be strictly less than , discounting those which hold with probability zero. More precisely, we assume that the given DDGP instance is a probability one instance, and that all realizations of are in general position.
Given a realization of we let be the EDM of . If we also let be the EDM of . For brevity we also denote simply by , if no ambiguity should arise in .
3.1 Remark
If is a realization of and , then we can write the EDM of in the form below:
(6)  
where is expressed as a function of , whereas the last row and column are expressed as functions of the known edge weights .
3.2 Lemma
Consider a YES DDGP instance and a valid realization of . If is a valid EDM, then is also a valid EDM.
Proof.
Assume is a valid EDM but is not: then cannot be a valid realization of , against the assumption. ∎
3.3 Proposition
Consider a YES DDGP instance, and let such that . If is a valid EDM for any possible realization of , then
(7) 
Proof.
Let such that . Assume that every realization of yields a matrix which is a valid EDM. By definition, each of the possible positions for vertex gives rise to a valid realization of . By Lemma 3.2, every is a valid EDM. By trilateration, there are two positions for vertex for each , which yields Eq. (7). ∎
3.1 An impossibility result
The counterexample below shows what can go wrong if the condition of Prop. 3.3 is not met.
3.4 Example
Consider the graph on with edges
and edge weights , , , , realized in . We assume that , , . There are two possible positions for vertex , namely , , as shown in Fig. 2. However, cannot form a triangle with segments realizing both having unit length, since , which negates the triangular inequality on . On the other hand, the position is compatible with .
In this case, trilateration would return as the singleton , rather than ensuring as expected. Note that the above instance is not a “probability zero instance”, as all ’s are realized in general position. Generalizations of this counterexample can be obtained for all .
The counterexample in Ex. 3.4 showcases the necessity of the condition that each matrix needs to be a valid EDM. Verifying this condition involves checking that all of the matrices are Gram. By the equivalence of Gram and positive semidefinite (psd) matrices, this is equivalent to verifying that all of the ’s are psd.
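The validity check mentioned above can be sketched numerically. The following is a minimal illustration, assuming that the input is a matrix of squared distances and that a relative tolerance is acceptable in floating point; the function name `is_valid_edm` and its optional dimension bound `K` are hypothetical choices for the example. It forms the Gram matrix by double centering and tests whether its eigenvalues are nonnegative.

```python
import numpy as np

def is_valid_edm(D, K=None, tol=1e-9):
    """True if D (squared distances) is an EDM; optionally require embedding dimension <= K."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    if not np.allclose(D, D.T) or not np.allclose(np.diag(D), 0.0):
        return False
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    G = -0.5 * J @ D @ J                      # candidate Gram matrix
    evals = np.linalg.eigvalsh(G)
    scale = max(1.0, float(np.abs(evals).max()))
    if evals.min() < -tol * scale:            # not positive semidefinite
        return False
    if K is not None and int(np.sum(evals > tol * scale)) > K:
        return False                          # needs more than K dimensions
    return True
```

For instance, three points with pairwise distances 1, 1 and 3 violate the triangle inequality, so the corresponding matrix of squared distances (entries 1, 1 and 9) is rejected by this test.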
3.5 Theorem
The solutions of the DDGP cannot be counted combinatorially.
Proof.
Our definition of combinatorial counting (Sect. 2.2) states that acceptable methods may not consider the edge weights. We now construct an uncountable family of DDGP instances for which trilateration finds 0, 1 or 2 positions for a certain vertex, all with positive probability. This shows that the edge weights must necessarily be taken into account by any counting method, and hence that this counting method cannot be combinatorial. We consider the case of Example 3.4: our strategy is to define intervals for and such that: (i) at the lower extrema trilateration on finds two valid positions for , (ii) at the upper extrema trilateration on only finds one valid position (and hence fails) for , and (iii) there are neighbourhoods of these extrema for which the same behaviours hold. This will show that the probability of trilateration failure to find either or positions is nonzero, and depends on the edge weights only. Therefore there can be no general combinatorial counting method for dealing with the totality of DDGP instances.
In the rest of the proof (which simply consists of a long but easy symbolic calculation) we sometimes indicate distance between two vertices by for brevity. We generalize the instance in Example 3.4 to the uncountable family of instances given by and , for some small enough . If we take the lower extrema of both intervals and we obtain and , whence
When is negligible, we have and the same for , which implies that both positions for vertex yield a distance that satisfies the triangular inequality. As grows, decreases, which means that it satisfies the triangular inequality for all values of in the respective intervals (as verified in Ex. 3.4). We want to find the value of at which satisfies the triangular inequality at equality, namely . This happens at , namely , i.e. when . Since we assumed , is the only value for which . Thus, the family of DDGP instances under scrutiny has the property that vertex has two valid positions (almost surely) for , , only one position () for , , and zero positions in the remaining cases where no position for vertex exists.
In other words, assuming uniform probability distributions over the two distance intervals for , we have shown that this DDGP instance family has (almost surely) solutions (for some ) with probability , solutions with probability , and solutions in the remaining events where is towards the lower extremum while is towards the upper one and vice versa, which have joint probability . Note that , as claimed. ∎
We note that it is also hard to imagine the existence of a non-combinatorial counting method which does not require the realizations of prior to counting (Sect. 2.2): as mentioned above, we need to check that all of the matrices are psd, which typically requires the knowledge of the entries of , which in turn requires , and hence the realizations of , to be known a priori.
Thm. 3.5 does not prevent the existence of counting techniques for subclasses of the DDGP, or based on a condition involving parameters other than (such as the smallest eigenvalue over all being nonzero, which would make it easy to prove positive semidefiniteness), or taking into account special structures in the pruning edges.
3.2 A sufficient condition
A combinatorial condition making sure that the are valid EDMs is that should be a clique.
3.6 Corollary
Let such that . If is a clique of size in , then .
Proof.
It suffices to remark that, since all of the ’s are valid realizations of , they must satisfy the given distance constraints. Therefore, is simply the EDM for the clique , which is constant since the distance values are given for all the edges, and does not depend on . Since we are assuming that the DDGP instance is YES, is a valid EDM. For the same reason, is also a valid EDM. ∎
We also remark that Cor. 3.6 cannot be improved in general terms, for example by asking that is a clique without one or a few edges, since Ex. 3.4 portrays a failure when a single edge is missing from the clique on .
This shows that a combinatorial counting of the number of solutions of DDGP instances prior to actually solving the instance is only possible in the special case where all of the ’s induce cliques of size in . We call the class of such DDGP instances the combinatorial DDGP.
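Membership in this class depends on the graph only, so it can be tested without looking at the edge weights. The sketch below is an illustration under stated assumptions: the graph is given as a list of vertex pairs, the sets of adjacent predecessors are supplied as a dictionary named `U` (a hypothetical name), and the function only checks the clique condition, not that the input is a well-formed DDGP instance.

```python
from itertools import combinations

def is_combinatorial_ddgp(edges, U):
    """True if every set of adjacent predecessors in U induces a clique in the graph."""
    edge_set = {frozenset(e) for e in edges}
    return all(
        frozenset((a, b)) in edge_set
        for preds in U.values()
        for a, b in combinations(preds, 2)
    )
```

As a usage example with two-element predecessor sets, the graph with edges {1,2}, {1,3}, {2,3}, {2,4}, {3,4} and predecessor sets {1,2} for vertex 3 and {2,3} for vertex 4 passes the test, whereas removing the edge {2,3} makes it fail.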
3.7 Corollary
For a combinatorial DDGP instance with discretization edges only, the number of incongruent realizations of is almost surely.
Proof.
This follows by , , and Cor. 3.6. ∎
We remark that Cor. 3.7 applies to DMDGP instances. This provides an alternative proof to the result that DMDGP instances with discretization edges only have incongruent solutions almost surely.
4 Conclusion
An important property of DMDGP orders, mainly in applications related to protein conformation, is that given a DMDGP solution (calculated by any algorithm applied to the DMDGP), all the others can be obtained just using the DMDGP symmetries [29]. Whether there are symmetric properties similar to the DMDGP case at least for the combinatorial DDGP remains an open question.
Acknowledgements
CL is grateful to FAPESP and CNPq for support. LL is partly supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement n. 764759 ETN “MINOA”. AM and LL are grateful to ANR for partly supporting this research under PRCI grant “MultiBioStruct”.
References
 [1] G. Abud, J. Alencar, C. Lavor, L. Liberti, and A. Mucherino. The discretization and incident graphs for discretizable distance geometry. Optimization Letters, 14:469–482, 2020.
 [2] J. Alencar, C. Lavor, and L. Liberti. Realizing euclidean distance matrices by sphere intersection. Discrete Applied Mathematics, 256:5–10, 2019.

 [3] L. Blum, M. Shub, and S. Smale. On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions, and universal machines. Bulletin of the AMS, 21(1):1–46, 1989.
 [4] R. Carvalho, C. Lavor, and F. Protti. Extending the geometric buildup algorithm for the molecular distance geometry problem. Information Processing Letters, 108:234–237, 2008.
 [5] A. Cassioli, B. Bordeaux, G. Bouvier, A. Mucherino, R. Alves, L. Liberti, M. Nilges, C. Lavor, and T. Malliavin. An algorithm to enumerate all possible protein conformations verifying a set of distance constraints. BMC Bioinformatics, 16:23–38, 2015.
 [6] A. Cassioli, O. Günlük, C. Lavor, and L. Liberti. Discretization vertex orders for distance geometry. Discrete Applied Mathematics, 197:27–41, 2015.
 [7] V. Costa, A. Mucherino, C. Lavor, A. Cassioli, L. Carvalho, and N. Maculan. Discretization orders for protein side chains. Journal of Global Optimization, 60:333–349, 2014.
 [8] C. D’Ambrosio, Ky Vu, C. Lavor, L. Liberti, and N. Maculan. New error measures and methods for realizing protein graphs from distance data. Discrete and Computational Geometry, 57(2):371–418, 2017.
 [9] J. Dattorro. Convex Optimization and Euclidean Distance Geometry. , Palo Alto, 2015.
 [10] I. Dokmanić, R. Parhizkar, J. Ranieri, and M. Vetterli. Euclidean distance matrices: Essential theory, algorithms and applications. IEEE Signal Processing Magazine, 1053-5888:12–30, Nov. 2015.
 [11] D. Gonçalves and A. Mucherino. Discretization orders and efficient computation of cartesian coordinates for distance geometry. Optimization Letters, 8:2111–2125, 2014.
 [12] J. Graver, B. Servatius, and H. Servatius. Combinatorial Rigidity. AMS, 1993.
 [13] L. Henneberg. Die Graphische Statik der starren Systeme. Teubner, Leipzig, 1911.
 [14] C. Lavor, J. Lee, A. Lee-St. John, L. Liberti, A. Mucherino, and M. Sviridenko. Discretization orders for distance geometry problems. Optimization Letters, 6:783–796, 2012.
 [15] C. Lavor, L. Liberti, B. Donald, B. Worley, B. Bardiaux, T. Malliavin, and M. Nilges. Minimal NMR distance information for rigidity of protein graphs. Discrete Applied Mathematics, 256:91–104, 2019.
 [16] C. Lavor, L. Liberti, N. Maculan, and A. Mucherino. The discretizable molecular distance geometry problem. Computational Optimization and Applications, 52:115–146, 2012.
 [17] C. Lavor, A. Mucherino, L. Liberti, and N. Maculan. On the computation of protein backbones by using artificial backbones of hydrogens. Journal of Global Optimization, 50:329–344, 2011.
 [18] C. Lavor, M. Souza, L. Mariano, and L. Liberti. On the polinomiality of finding DMDGP reorders. Discrete Applied Mathematics, 267:190–194, 2019.
 [19] L. Liberti and C. Lavor. Euclidean Distance Geometry: An Introduction. Springer, New York, 2017.
 [20] L. Liberti, C. Lavor, J. Alencar, and G. Abud. Counting the number of solutions of DMDGP instances. In F. Nielsen and F. Barbaresco, editors, Geometric Science of Information, volume 8085 of LNCS, pages 224–230, New York, 2013. Springer.
 [21] L. Liberti, C. Lavor, and N. Maculan. A branch-and-prune algorithm for the molecular distance geometry problem. International Transactions in Operational Research, 15:1–17, 2008.
 [22] L. Liberti, C. Lavor, N. Maculan, and A. Mucherino. Euclidean distance geometry and applications. SIAM Review, 56(1):3–69, 2014.
 [23] L. Liberti, C. Lavor, and A. Mucherino. The discretizable molecular distance geometry problem seems easier on proteins. In A. Mucherino, C. Lavor, L. Liberti, and N. Maculan, editors, Distance Geometry: Theory, Methods, and Applications, pages 47–60. Springer, New York, 2013.
 [24] L. Liberti, B. Masson, C. Lavor, J. Lee, and A. Mucherino. On the number of realizations of certain Henneberg graphs arising in protein conformation. Discrete Applied Mathematics, 165:213–232, 2014.
 [25] L. Liberti, B. Masson, J. Lee, C. Lavor, and A. Mucherino. On the number of solutions of the discretizable molecular distance geometry problem. In Combinatorial Optimization, Constraints and Applications (COCOA11), volume 6831 of LNCS, pages 322–342, New York, 2011. Springer.
 [26] T. Malliavin, A. Mucherino, C. Lavor, and L. Liberti. Systematic exploration of protein conformational space using a distance geometry approach. Journal of Chemical Information and Modeling, 59:4486–4503, 2019.
 [27] T. Malliavin, A. Mucherino, and M. Nilges. Distance geometry in structural biology: new perspectives. In A. Mucherino, C. Lavor, L. Liberti, and N. Maculan, editors, Distance Geometry: Theory, Methods, and Applications, pages 329–350. Springer, New York, 2013.
 [28] A. Mucherino, C. Lavor, and L. Liberti. The discretizable distance geometry problem. Optimization Letters, 6:1671–1686, 2012.
 [29] A. Mucherino, C. Lavor, and L. Liberti. Exploiting symmetry properties of the discretizable molecular distance geometry problem. Journal of Bioinformatics and Computational Biology, 10:1242009(1–15), 2012.
 [30] A. Mucherino, C. Lavor, L. Liberti, and N. Maculan, editors. Distance Geometry: Theory, Methods, and Applications. Springer, New York, 2013.
 [31] A. Mucherino, C. Lavor, T. Malliavin, L. Liberti, M. Nilges, and N. Maculan. Influence of pruning devices on the solution of molecular distance geometry problems. In P. Pardalos and S. Rebennack, editors, Experimental Algorithms, volume 6630 of LNCS, pages 206–217, Berlin, 2011. Springer.
 [32] J. Saxe. Embeddability of weighted graphs in space is strongly NP-hard. Proceedings of 17th Allerton Conference in Communications, Control and Computing, pages 480–489, 1979.
 [33] I. Schoenberg. Remarks to Maurice Fréchet’s article “Sur la définition axiomatique d’une classe d’espaces distanciés vectoriellement applicable sur l’espace de Hilbert”. Annals of Mathematics, 36(3):724–732, 1935.
 [34] T.S. Tay and W. Whiteley. Generating isostatic frameworks. Structural Topology, 11:21–69, 1985.
 [35] Y. Yemini. Some theoretical aspects of position-location problems. In Proceedings of the 20th Annual Symposium on the Foundations of Computer Science, pages 1–8, Piscataway, 1979. IEEE.