 # Analysis of Farthest Point Sampling for Approximating Geodesics in a Graph

A standard way to approximate the distance between any two vertices $p$ and $q$ on a mesh is to compute, in the associated graph, a shortest path from $p$ to $q$ that goes through one of $k$ sources, which are well-chosen vertices. Precomputing the distance from each of the $k$ sources to all vertices of the graph yields an efficient computation of approximate distances between any two vertices. One standard method for choosing $k$ sources, which has been used extensively and successfully for isometry-invariant surface processing, is the so-called Farthest Point Sampling (FPS), which starts with a random vertex as the first source, and iteratively selects the farthest vertex from the already selected sources. In this paper, we analyze the stretch factor $F_{FPS}$ of approximate geodesics computed using FPS, which is the maximum, over all pairs of distinct vertices, of their approximated distance over their geodesic distance in the graph. We show that $F_{FPS}$ can be bounded in terms of the minimal value $F^*$ of the stretch factor obtained using an optimal placement of $k$ sources as $F_{FPS} \le 2 r_e^2 F^* + 2 r_e^2 + 8 r_e + 1$, where $r_e$ is the ratio of the lengths of the longest and the shortest edges of the graph. This provides some evidence explaining why farthest point sampling has been used successfully for isometry-invariant shape processing. Furthermore, we show that it is NP-complete to find $k$ sources that minimize the stretch factor.


## 1 Introduction

In this work, we analyze the stretch factor of approximate geodesics computed on triangle meshes or, more generally, in graphs. In our context, a triangle mesh represents a discretization of a two-dimensional manifold, possibly with boundary, embedded in $\mathbb{R}^d$ for a constant dimension $d$ using a set of vertices $V$, edges $E$, and triangular faces $F$. Such a triangle mesh can be viewed as a (hyper)graph $G=(V,E,F)$ such that each edge is adjacent to one or two triangles and the triangles incident to an arbitrary vertex can be ordered cyclically around that vertex. Due to discretization artifacts, the triangle mesh may contain holes, and different parts of the surface may even intersect in the embedding. However, we assume that the graph structure is planar and connected (see Figure 1 for an illustration). For $d=3$, this definition of a triangle mesh is commonly used in geometry processing when analyzing models obtained by scanning real-world objects.

Given a connected planar triangle graph $G$ with $n$ vertices, where every edge has a positive length (or weight), we consider the problem of approximating the geodesic distances between pairs of vertices in $G$, where distances are measured in the graph theoretic sense. Specifically, given an integer $k$, our goal is to select a set $S=\{s_1,\dots,s_k\}$ of $k$ vertices of $G$ that minimizes the stretch factor, defined as the value

$$\max_{(p,q)\in V^2,\ p\neq q}\ \min_{s_i\in S}\ \frac{d(p,s_i)+d(s_i,q)}{d(p,q)},$$

where the function $d(\cdot,\cdot)$ measures the shortest geodesic distance between two vertices. Throughout this paper, we use for simplicity the notation $\max_{p,q}$ for $\max_{(p,q)\in V^2,\ p\neq q}$.
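The definition above can be evaluated directly on small graphs. The following is a minimal brute-force sketch, assuming a graph stored as a dictionary of neighbor-to-length dictionaries; the names `all_pairs_distances` and `stretch_factor` are illustrative and not from the paper, and exact distances are obtained here with the Floyd-Warshall algorithm rather than with the Dijkstra-based approach discussed later.

```python
from itertools import combinations

def all_pairs_distances(adj):
    """Exact graph distances via Floyd-Warshall (fine for small graphs only)."""
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else adj[u].get(v, inf)) for v in adj} for u in adj}
    for m in adj:
        for u in adj:
            for v in adj:
                if dist[u][m] + dist[m][v] < dist[u][v]:
                    dist[u][v] = dist[u][m] + dist[m][v]
    return dist

def stretch_factor(adj, sources):
    """Maximum over distinct pairs of (min over sources of d(p,s)+d(s,q)) / d(p,q)."""
    dist = all_pairs_distances(adj)
    worst = 1.0  # the ratio is never below 1 by the triangle inequality
    for p, q in combinations(adj, 2):
        approx = min(dist[p][s] + dist[s][q] for s in sources)
        worst = max(worst, approx / dist[p][q])
    return worst

# Hypothetical example: a path a - b - c - d with unit edge lengths.
path = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
print(stretch_factor(path, ["b"]))       # worst pair is (c, d)
print(stretch_factor(path, ["b", "c"]))  # every pair is approximated exactly
```

On this toy path, a single source at `b` forces the pair `(c, d)` to detour through `b`, which is the kind of adjacent-pair worst case that Section 3 analyzes.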

The problem of approximating geodesic distances on surfaces represented by triangle meshes arises when studying shapes (usually shapes embedded in $\mathbb{R}^2$ or in $\mathbb{R}^3$) that are isometric, which means that they can be mapped to each other in a way that preserves geodesics. Isometric shapes have been studied extensively recently [3, 5, 12, 16, 17] because many shapes, such as human bodies, animals, and cloth, deform in a near-isometric way, as a large stretching of the surfaces would cause injury or tearing.

Figure 1: The graph $G=(V,E,F)$ is a triangle mesh, while $G'$ has a non-triangular face, and $G''$ is not connected.

In order to analyze near-isometric shapes, it is commonly required to compute, on each shape, geodesics between many pairs of vertices, and to compare the corresponding geodesics in order to compute the amount of stretching of the surface. Consider the problem of computing geodesic distances on a surface represented by a triangle mesh, where distances are measured on the graph induced by the vertices and edges of the triangulation. To solve the single-source shortest-path (SSSP) problem on $G$, Dijkstra's algorithm [4] takes $O(n\log n)$ time. (Alternatively, we can use the linear time algorithm of [10] for the SSSP problem, as the underlying graph is planar.) To solve the all-pairs shortest-path (APSP) problem, we can run Dijkstra's algorithm starting from each source point, yielding an $O(n^2\log n)$ time algorithm. While there are more efficient methods, the APSP problem has a trivial $\Omega(n^2)$ lower bound. In practical applications, a typical triangle mesh may contain a large number of vertices and, for such large meshes, it is impractical to use algorithms that take $\Omega(n^2)$ time.

To reduce the complexity, instead of considering the APSP problem, we consider the problem of pre-computing a data structure that allows us to efficiently approximate the distance between any two points. We call this the any-pair approximate shortest path problem in the following.

A commonly used method to solve the any-pair approximate shortest path problem is to select a set of $k$ sources, solve the SSSP problem from each of these sources, and use this information to approximate pairwise geodesic distances. Given a pair $p$ and $q$ of vertices, their shortest distance is approximated as the minimum, over all the sources $s_i$, of the sum of the distances of $p$ and $q$ to $s_i$. This method is used to approximate the intrinsic geometry of shapes [3, 5, 13]. A natural problem is thus to compute an optimal placement of $k$ sources that minimizes the stretch factor. We refer to this problem as the $k$-center path-dilation problem and show that this problem is NP-complete (see Theorem 6).
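The precompute-then-query scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is a dictionary of neighbor-to-length dictionaries, the helper names `dijkstra`, `precompute`, and `approx_distance` are hypothetical, and a binary heap is used for Dijkstra's algorithm.

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest-path distances; adj maps vertex -> {neighbor: length}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def precompute(adj, sources):
    """One SSSP table per source: k runs of Dijkstra in total."""
    return {s: dijkstra(adj, s) for s in sources}

def approx_distance(tables, p, q):
    """Minimum over sources s of d(p, s) + d(s, q); costs O(k) per query."""
    return min(table[p] + table[q] for table in tables.values())

# Hypothetical example: a 4-cycle 0 - 1 - 2 - 3 - 0 with unit edge lengths.
cycle = {
    0: {1: 1.0, 3: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 0: 1.0},
}
tables = precompute(cycle, [0])
print(approx_distance(tables, 1, 3))  # exact: a shortest path passes through 0
print(approx_distance(tables, 1, 2))  # overestimate: the detour through 0
```

The design trades accuracy for storage: only $k$ distance tables of size $n$ are kept, and each query takes time linear in $k$.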

A commonly used heuristic for selecting a set $S$ of $k$ sources is to use Farthest Point Sampling (FPS) [9, 14], which starts from a random vertex and iteratively adds to $S$ a vertex that has the largest geodesic distance to its closest already picked source, until $k$ sources are picked. Given $k$ sources on a graph $G$, the distance between any two vertices $p$ and $q$ is approximated as the minimum, over all sources, of the distance from $p$ to $q$ through one of the sources.
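The FPS loop described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the first source is passed explicitly as `first` (the text picks it at random) so that the example is reproducible, and ties between equally far vertices are broken arbitrarily.

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest-path distances; adj maps vertex -> {neighbor: length}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def farthest_point_sampling(adj, k, first):
    """Return k sources and their SSSP tables, assuming a connected graph."""
    sources = [first]
    tables = {first: dijkstra(adj, first)}
    nearest = dict(tables[first])  # distance of each vertex to its closest source
    while len(sources) < min(k, len(adj)):
        far = max(nearest, key=nearest.get)  # vertex farthest from all sources
        sources.append(far)
        tables[far] = dijkstra(adj, far)
        for v, d in tables[far].items():
            if d < nearest[v]:
                nearest[v] = d
    return sources, tables

# Hypothetical example: a path a - b - c - d - e with unit edge lengths.
path5 = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0, "e": 1.0},
    "e": {"d": 1.0},
}
sources, tables = farthest_point_sampling(path5, 3, first="a")
print(sources)
```

Each iteration costs one run of Dijkstra's algorithm, and the SSSP tables computed along the way are exactly the ones needed afterwards to answer approximate distance queries.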

FPS has been shown to perform well compared to other heuristics for isometry-invariant shape processing in practice [19, Chapter 3], which suggests that the stretch factor obtained by FPS is small. However, to the best of our knowledge, no theoretical results are known on the quality of the stretch factor, $F_{FPS}$, obtained by an FPS of $k$ sources, compared to the minimal stretch factor, $F^*$, obtained by an optimal choice of $k$ sources. In this paper, we prove that

$$F_{FPS} \le 2 r_e^2 (F^* + 1) + 8 r_e + 1,$$

where $r_e$ is the ratio of the lengths of the longest and the shortest edges of $G$ (see Theorem 1). Note that this bound holds for an arbitrary graph.
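As a quick sanity check of how the bound scales, consider the special case of unit edge lengths, which is an illustrative assumption and not a case singled out in the text. With $r_e = 1$, the bound becomes

$$F_{FPS} \le 2\cdot 1^2\,(F^* + 1) + 8\cdot 1 + 1 = 2F^* + 11,$$

and with $r_e = 2$ the same substitution gives $F_{FPS} \le 8(F^* + 1) + 16 + 1 = 8F^* + 25$. In both cases the FPS stretch is within a constant factor, plus an additive constant, of the optimal stretch.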

It should further be observed that if the ratio $r_e$ is large, $F_{FPS}$ can be much larger than the optimal stretch factor $F^*$ but, on the other hand, $F^*$ is likely to be large as well. Indeed, if at least $k+1$ edges are arbitrarily small and are not "too close" to each other, $F^*$ can be made arbitrarily large; this can be seen by considering the pairs of vertices defined by those small edges.

After discussing related work in Section 2, we prove our two main results, Theorems 1 and 6 in Sections 3 and 4, respectively.

## 2 Related Work

Computing geodesics on polyhedral surfaces is a well-studied problem for which we refer to the recent survey by Bose et al. [2]. In this paper, we restrict geodesics to be shortest paths along edges of the underlying graph.

The FPS algorithm has been used for a variety of isometry-invariant surface processing tasks. The algorithm was first introduced for graph clustering [9], and later independently developed for 2D images [6] and extended to 3D meshes [14]. Ben Azouz et al. [1] and Giard and Macq [8] used this sampling strategy to efficiently compute approximate geodesic distances, while Elad and Kimmel [5] and Mémoli and Sapiro [13] used FPS in the context of shape recognition. Bronstein et al. [3] and Wuhrer et al. [20] used FPS to efficiently compute point-to-point correspondences between surfaces. While it has been shown experimentally that FPS is a good heuristic for isometry-invariant surface processing tasks [1, 3, 5, 8, 13, 20], to the best of our knowledge, the worst-case stretch of the geodesics has not been analyzed theoretically.

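The problem we study is closely related to the $k$-center problem, which aims at finding $k$ centers (or sources) $s_1,\dots,s_k$ such that the maximum distance of any point to its closest center is minimized. With the notation defined above, the $k$-center problem aims at finding $S=\{s_1,\dots,s_k\}$ such that $\max_{p}\min_{s_i\in S} d(p,s_i)$ is minimized. This problem is NP-hard and FPS gives a 2-approximation [9], which means that the $k$ centers found using FPS have the property that $\max_{p}\min_{s_i\in S} d(p,s_i)$ is at most twice the minimum value of this quantity over all choices of $k$ centers.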
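The 2-approximation property can be checked by brute force on small instances. The sketch below is only an illustration under stated assumptions: the graph is a weighted path, so that $d(i,j)$ is the difference of two prefix sums, the edge lengths are random integers, and all function names are hypothetical.

```python
import random
from itertools import combinations

def path_metric(lengths):
    """Distance matrix of a path graph whose consecutive edges have the given lengths."""
    pos = [0]
    for w in lengths:
        pos.append(pos[-1] + w)
    return [[abs(a - b) for b in pos] for a in pos]

def fps_on_metric(dist, k, first):
    """Farthest point sampling driven by a precomputed distance matrix."""
    sources = [first]
    nearest = list(dist[first])
    while len(sources) < k:
        far = max(range(len(dist)), key=nearest.__getitem__)
        sources.append(far)
        nearest = [min(a, b) for a, b in zip(nearest, dist[far])]
    return sources

def radius(dist, sources):
    """k-center objective: largest distance from a vertex to its closest source."""
    return max(min(row[s] for s in sources) for row in dist)

def optimal_radius(dist, k):
    """Exhaustive search over all k-subsets (exponential, small inputs only)."""
    return min(radius(dist, c) for c in combinations(range(len(dist)), k))

def fps_is_2_approx(seed, n=9, k=3):
    rng = random.Random(seed)
    dist = path_metric([rng.randint(1, 9) for _ in range(n - 1)])
    greedy = radius(dist, fps_on_metric(dist, k, first=rng.randrange(n)))
    return greedy <= 2 * optimal_radius(dist, k)

print(all(fps_is_2_approx(seed) for seed in range(50)))
```

The exhaustive search is only there to provide the optimum for comparison; it is not part of FPS and would be infeasible on meshes of realistic size.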

In the context of isometry-invariant shape processing, we are interested in bounding the stretch induced by the approximation rather than ensuring that every point has a close-by source. A related problem that has been studied in the context of networks by Könemann et al. [11] is the edge-dilation $k$-center problem, where every point, $p$, is assigned a source, $s_p$, and the distance between two points $p$ and $q$ is approximated by the length of the path through $s_p$ and $s_q$. The aim is then to find a set of sources that minimizes the worst stretch, and Könemann et al. show that this problem is NP-hard and propose an approximation algorithm to solve the problem.

Könemann et al. [11] also study a modified version of the above problem, which is similar to our problem. In particular, they present an algorithm for computing $k$ sources and claim that it ensures, for our problem, a stretch factor bounded in terms of $F^*$ [11, Theorem 3]; as before, $F^*$ denotes the minimal stretch factor for the $k$-center path-dilation problem. (Theorem 3 in [11] is stated in a slightly different context, but with the notation of that paper the triangle inequality yields the claimed bound.) However, we believe that their proof has gaps. (For a given target stretch factor, their algorithm iteratively includes an endpoint of the shortest edge that cannot yet be approximated within that stretch, until no such edges are left. If the solution contains at most $k$ sources, a solution with the target stretch has been found. Their algorithm then essentially does a binary search on the optimal stretch factor $F^*$. However, this search is done in a continuous interval without stopping criteria. Moreover, since it is a priori possible that, for a given target, their algorithm returns strictly more than $k$ sources, and that $F^*$ may not be exactly reachable by dichotomy, we believe that the claimed stretch factor is not ensured.) It should nonetheless be stressed that our result is independent of whether this bound holds. Indeed, the relevance of our bound on $F_{FPS}$ is to give some theoretical insight on why FPS has been used successfully in heuristics for isometry-invariant shape processing.

## 3 Approximating Geodesics with Farthest Point Sampling

We start this section with some definitions and notation. We consider a connected graph $G=(V,E)$ in which the edges have lengths from a positive and finite interval $[\ell_{\min},\ell_{\max}]$, and $r_e$ denotes the ratio $\ell_{\max}/\ell_{\min}$. We require the graph to be connected so that the distance between any two vertices is finite. In this section, we do not require the graph to satisfy any other criteria, but observe that if it is not planar, the running time of FPS will be $O(k(m+n\log n))$, where $m$ is the number of edges.

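Given $k$ vertices (sources) $s_1,\dots,s_k$ in the graph, let $s_p$ denote the (or a) closest source to a vertex $p$ and let $d(p,s_i,q)$ denote the shortest path length from $p$ to $q$ through any source $s_i$, that is $d(p,s_i,q)=\min_i \bigl(d(p,s_i)+d(s_i,q)\bigr)$. Let $s^*_1,\dots,s^*_k$ be a choice of sources that minimizes the stretch factor $\max_{p,q} d(p,s^*_i,q)/d(p,q)$. Furthermore, let $s'_1,\dots,s'_k$ be a choice of sources that minimizes $\max_p d(p,s'_p)$. In other words, the set of $s^*_i$ is an optimal solution to the $k$-center path-dilation problem and the set of $s'_i$ is an optimal solution to the $k$-center problem.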

In this section, we prove the following theorem.

###### Theorem 1.

Let $s_1,\dots,s_k$ be a set of $k$ sources returned by the FPS algorithm on a connected graph with edge lengths of ratio at most $r_e$. Then

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le 2r_e^2\,\max_{p,q}\frac{d(p,s^*_i,q)}{d(p,q)} + 2r_e^2 + 8r_e + 1.$$

In order to prove this theorem, we first show a somewhat surprising property: for any set of sources, the stretch factor $\max_{p,q} d(p,s_i,q)/d(p,q)$ is realized when $p$ and $q$ are adjacent in the graph (Lemma 2). We use this property to bound this stretch factor in terms of $\max_p d(p,s_p)$ (Lemma 3). On the other hand, we bound the stretch factor of a set of sources returned by FPS in terms of the stretch factor of an optimal set of sources for the $k$-center problem (Lemma 5). We then combine these results to prove Theorem 1.

###### Lemma 2.

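For any $k$ sources $s_1,\dots,s_k$ and any given vertex $q$ in $G$, the maximum ratio $\max_p d(p,s_i,q)/d(p,q)$ is realized for some $p$ that is adjacent to $q$ in $G$. It follows that the maximum ratio $\max_{p,q} d(p,s_i,q)/d(p,q)$ is realized for some $p$ and $q$ that are adjacent in $G$.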

###### Proof.

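For the sake of contradiction, let $q$ be any fixed vertex and let $p$ be a vertex not adjacent to $q$ that realizes the maximum of $d(p,s_i,q)/d(p,q)$ and such that, among all the vertices that realize this maximum, the shortest path from $p$ to $q$ has the smallest number of edges.

Let $\tilde p$ be the immediate neighbor of $p$ along the shortest path from $p$ to $q$. As before, $d(\tilde p,s_j,q)$ denotes the shortest path length from $\tilde p$ to $q$ through any source $s_j$ (we use here the notation $s_j$ instead of $s_i$ in order to avoid confusion with $d(p,s_i,q)$). Let $\ell$ be the length of the edge $p\tilde p$ (see Figure 2). We have $d(\tilde p,s_j,q)\ge d(p,s_i,q)-\ell$, since a path from $p$ to $q$ through a source can be obtained by prepending the edge $p\tilde p$ to any such path from $\tilde p$. Dividing by $d(\tilde p,q)=d(p,q)-\ell$ we get

$$\frac{d(\tilde p,s_j,q)}{d(\tilde p,q)} \ge \frac{d(p,s_i,q)}{d(p,q)-\ell} - \frac{\ell}{d(\tilde p,q)}.$$

On the other hand, since $d(\tilde p,q)=d(p,q)-\ell$, a direct computation gives

$$\frac{d(p,s_i,q)}{d(p,q)-\ell} = \frac{d(p,s_i,q)}{d(p,q)} + \frac{\ell\, d(p,s_i,q)}{d(\tilde p,q)\cdot d(p,q)},$$

and therefore

$$\frac{d(\tilde p,s_j,q)}{d(\tilde p,q)} \ge \frac{d(p,s_i,q)}{d(p,q)} + \frac{\ell}{d(\tilde p,q)}\cdot\left(\frac{d(p,s_i,q)}{d(p,q)}-1\right) \ge \frac{d(p,s_i,q)}{d(p,q)},$$

which contradicts our assumption. Indeed, either the inequality is strict and $p$ was not maximum, or the equality holds and the shortest path from $\tilde p$ to $q$ has fewer edges than the shortest path from $p$ to $q$. ∎

Figure 2: For the proof of Lemma 2.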
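The statement of Lemma 2 can be tested empirically on small random graphs. The following is a brute-force sketch under stated assumptions: graphs are random connected graphs with integer edge lengths between 1 and 5, sources are chosen at random, exact rational arithmetic is used to avoid rounding issues, and all function names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations

def random_connected_graph(n, extra, rng):
    """Random spanning tree plus a few extra edges; returns {(u, v): length}."""
    edges = {}
    for v in range(1, n):
        edges[(rng.randrange(v), v)] = rng.randint(1, 5)
    for _ in range(extra):
        u, v = sorted(rng.sample(range(n), 2))
        edges.setdefault((u, v), rng.randint(1, 5))
    return edges

def all_pairs(n, edges):
    """Floyd-Warshall on an undirected graph with integer edge lengths."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for (u, v), w in edges.items():
        d[u][v] = d[v][u] = w
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
    return d

def ratio(d, sources, p, q):
    """Stretch of the pair (p, q): d(p, s_i, q) / d(p, q) as an exact fraction."""
    return Fraction(min(d[p][s] + d[s][q] for s in sources), d[p][q])

def lemma2_holds(n, extra, k, seed):
    rng = random.Random(seed)
    edges = random_connected_graph(n, extra, rng)
    d = all_pairs(n, edges)
    sources = rng.sample(range(n), k)
    overall = max(ratio(d, sources, p, q) for p, q in combinations(range(n), 2))
    adjacent = max(ratio(d, sources, p, q) for p, q in edges)
    return overall == adjacent

print(all(lemma2_holds(8, 6, 2, seed) for seed in range(30)))
```

A practical consequence is that the stretch factor of a given set of sources can be evaluated by scanning the edges of the graph only, instead of all pairs of vertices.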

The property of the previous lemma that $\max_{p,q} d(p,s_i,q)/d(p,q)$ is realized when $p$ and $q$ are neighbors allows us to bound it as follows.

###### Lemma 3.

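For any $k$ sources $s_1,\dots,s_k$, we have

$$\frac{2}{\ell_{\max}}\max_p d(p,s_p) - 1 \le \max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le \frac{2}{\ell_{\min}}\max_p d(p,s_p) + 1.$$

###### Proof.

For the upper bound, we have $d(p,s_i,q)\le d(p,s_p)+d(s_p,q)\le 2d(p,s_p)+d(p,q)$. Therefore, $\frac{d(p,s_i,q)}{d(p,q)}\le \frac{2d(p,s_p)}{d(p,q)}+1$. This holds for any vertices $p$ and $q$ and thus for those that realize the maximum of $d(p,s_i,q)/d(p,q)$. Furthermore, $d(p,s_p)\le\max_p d(p,s_p)$ and $d(p,q)\ge\ell_{\min}$. Hence,

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le \frac{2}{\ell_{\min}}\max_p d(p,s_p) + 1.$$

For the lower bound, we have by the triangle inequality that, for any $s_i$, $d(p,s_i)\le d(p,q)+d(q,s_i)$. Adding $d(p,s_i)$ on both sides, we get $2d(p,s_i)\le d(p,q)+d(p,s_i)+d(s_i,q)$. By the definition of $s_p$, $d(p,s_p)\le d(p,s_i)$ for any $i$, thus $2d(p,s_p)\le d(p,q)+d(p,s_i)+d(s_i,q)$. This holds for any $i$ and thus for the $i$ such that $d(p,s_i)+d(s_i,q)$ is minimum, hence $2d(p,s_p)\le d(p,q)+d(p,s_i,q)$. Dividing by $d(p,q)$, we get $\frac{2d(p,s_p)}{d(p,q)}-1\le\frac{d(p,s_i,q)}{d(p,q)}$. This holds for any $p$ and $q$ and thus for the vertex $p$ that realizes the maximum of $d(p,s_p)$; let $\bar p$ denote such a vertex. We then have that $\frac{2\max_p d(p,s_p)}{d(\bar p,q)}-1\le\frac{d(\bar p,s_i,q)}{d(\bar p,q)}$. This holds for any $q$ and in particular for the one that realizes $\max_q d(\bar p,s_i,q)/d(\bar p,q)$. By Lemma 2, the maximum is realized for a $q$ that is adjacent to $\bar p$ in $G$, thus, for such a $q$, $d(\bar p,q)\le\ell_{\max}$. It follows that

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \ge \max_q\frac{d(\bar p,s_i,q)}{d(\bar p,q)} \ge \frac{2}{\ell_{\max}}\max_p d(p,s_p) - 1. \qquad ∎$$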
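As a small worked instance of these bounds, consider a hypothetical path with four vertices $a, b, c, d$, unit edge lengths (so $\ell_{\min}=\ell_{\max}=1$), and the single source $b$. The vertex farthest from the source is $d$, so $\max_p d(p,s_p)=2$, and the two bounds read

$$\frac{2}{1}\cdot 2 - 1 = 3 \;\le\; \max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \;\le\; \frac{2}{1}\cdot 2 + 1 = 5.$$

The actual stretch factor is attained by the adjacent pair $(c,d)$ and equals $\bigl(d(c,b)+d(b,d)\bigr)/d(c,d) = (1+2)/1 = 3$, so the lower bound is tight in this example.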

The following lemma bounds the path length between two vertices $u$ and $v$ passing through $s_u$ in terms of the shortest path between $u$ and $v$ through any source.

###### Lemma 4.

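For any $k$ sources $s_1,\dots,s_k$ and vertices $u$, $v$, we have

$$d(u,s_u)+d(s_u,v) \le d(u,s_i,v) + 2d(u,v).$$

###### Proof.

Denote by $s_i$ the source that realizes the minimum $d(u,s_i,v)=d(u,s_i)+d(s_i,v)$. Since by definition $d(u,s_u)\le d(u,s_i)$, we only have to show that $d(s_u,v)\le d(s_i,v)+2d(u,v)$. Using the triangle inequality twice, we have

$$d(v,s_u) \le d(v,u)+d(u,s_u) \le d(v,u)+d(u,s_i) \le d(v,u)+d(u,v)+d(v,s_i),$$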

which concludes the proof. ∎

These results allow us to bound the stretch factor corresponding to the sources returned by the FPS algorithm with respect to the stretch factor corresponding to an optimal choice of sources for the $k$-center problem.

###### Lemma 5.

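Let $s_1,\dots,s_k$ be a set of $k$ sources returned by the FPS algorithm and $s'_1,\dots,s'_k$ be an optimal set of sources for the $k$-center problem. Then

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le 2r_e\,\max_{u,v}\frac{d(u,s'_i,v)}{d(u,v)} + 6r_e + 1.$$

###### Proof.

Since $s_1,\dots,s_k$ is a set of sources returned by the FPS algorithm, this choice of sources provides a 2-approximation for the $k$-center problem compared to an optimal solution $s'_1,\dots,s'_k$; in other words, $\max_p d(p,s_p)\le 2\max_p d(p,s'_p)$.

By definition, $d(p,s_i,q)$ is the minimum over all (fixed) sources $s_i$ of $d(p,s_i)+d(s_i,q)$. Thus, $d(p,s_i,q)\le d(p,s_p)+d(s_p,q)$. Moreover, by the triangle inequality, $d(s_p,q)\le d(s_p,p)+d(p,q)$, thus $d(p,s_i,q)\le 2d(p,s_p)+d(p,q)$. On the other hand, $d(p,s_p)\le\max_p d(p,s_p)$, which is less than or equal to $2\max_p d(p,s'_p)$ by the 2-approximation property. For clarity, denote by $u$ the vertex that realizes the maximum $\max_p d(p,s'_p)$. We then have $d(p,s_i,q)\le 4d(u,s'_u)+d(p,q)$.

Now, by the triangle inequality, $d(u,s'_u)\le d(u,v)+d(v,s'_u)$ for any vertex $v$. Thus $2d(u,s'_u)\le d(u,s'_u)+d(s'_u,v)+d(u,v)$, which implies, by Lemma 4, that $2d(u,s'_u)\le d(u,s'_i,v)+3d(u,v)$. Thus, $d(p,s_i,q)\le 2d(u,s'_i,v)+6d(u,v)+d(p,q)$ and

$$\frac{d(p,s_i,q)}{d(p,q)} \le 2\,\frac{d(u,v)}{d(p,q)}\cdot\frac{d(u,s'_i,v)}{d(u,v)} + 6\,\frac{d(u,v)}{d(p,q)} + 1.$$

This inequality holds for any distinct $p$ and $q$, and any $v$ distinct from $u$ (recall that $u$ is fixed). Thus it holds for the vertices $p$ and $q$ that realize $\max_{p,q} d(p,s_i,q)/d(p,q)$ and for the $v$ that realizes $\max_v d(u,s'_i,v)/d(u,v)$. Such a $v$ is a neighbor of $u$ by Lemma 2, thus it satisfies $d(u,v)\le\ell_{\max}$. Since $d(p,q)\ge\ell_{\min}$ for any distinct $p$ and $q$, and $\max_v d(u,s'_i,v)/d(u,v)\le\max_{u,v} d(u,s'_i,v)/d(u,v)$, we get

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le 2\,\frac{\ell_{\max}}{\ell_{\min}}\,\max_{u,v}\frac{d(u,s'_i,v)}{d(u,v)} + 6\,\frac{\ell_{\max}}{\ell_{\min}} + 1. \qquad ∎$$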

This finally allows us to prove the main theorem.

###### Proof of Theorem 1.

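By Lemma 5 and using the same notation, we have

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le 2r_e\,\max_{u,v}\frac{d(u,s'_i,v)}{d(u,v)} + 6r_e + 1.$$

Using the upper bound in Lemma 3 on $\max_{u,v} d(u,s'_i,v)/d(u,v)$, we have

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le 2r_e\left(\frac{2}{\ell_{\min}}\max_p d(p,s'_p) + 1\right) + 6r_e + 1.$$

By definition, $s'_1,\dots,s'_k$ is an optimal set of sources for the $k$-center problem, that is, $\max_p d(p,s'_p)\le\max_p d(p,s_p)$ for any set of $k$ sources $s_1,\dots,s_k$, and thus $\max_p d(p,s'_p)\le\max_p d(p,s^*_p)$.

We now apply the lower bound of Lemma 3 to $s^*_1,\dots,s^*_k$, which gives

$$\frac{2}{\ell_{\max}}\max_p d(p,s^*_p) - 1 \le \max_{p,q}\frac{d(p,s^*_i,q)}{d(p,q)}$$

and thus

$$\max_{p,q}\frac{d(p,s_i,q)}{d(p,q)} \le 2r_e\left(\frac{\ell_{\max}}{\ell_{\min}}\left(\max_{p,q}\frac{d(p,s^*_i,q)}{d(p,q)} + 1\right) + 1\right) + 6r_e + 1 \le 2r_e^2\,\max_{p,q}\frac{d(p,s^*_i,q)}{d(p,q)} + 2r_e^2 + 8r_e + 1. \qquad ∎$$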
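For readers who want to retrace the last two steps, the algebra can be spelled out as follows, writing $F^*$ as shorthand for $\max_{p,q} d(p,s^*_i,q)/d(p,q)$ and using $r_e=\ell_{\max}/\ell_{\min}$. The lower bound of Lemma 3 applied to the $s^*_i$, together with the optimality of the $s'_i$ for the $k$-center problem, gives

$$\frac{2}{\ell_{\min}}\max_p d(p,s'_p) \;\le\; \frac{2}{\ell_{\min}}\max_p d(p,s^*_p) \;\le\; \frac{\ell_{\max}}{\ell_{\min}}\,(F^*+1) \;=\; r_e\,(F^*+1),$$

and substituting into the bound obtained from Lemma 5 and expanding yields

$$2r_e\bigl(r_e(F^*+1)+1\bigr) + 6r_e + 1 \;=\; 2r_e^2F^* + 2r_e^2 + 2r_e + 6r_e + 1 \;=\; 2r_e^2F^* + 2r_e^2 + 8r_e + 1.$$

The final step is therefore an identity, so the constants in Theorem 1 are exactly those produced by combining Lemmas 3 and 5.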

## 4 The Complexity of the k-Center Path-Dilation Problem

In this section we consider the complexity of the $k$-center path-dilation problem on triangle graphs, i.e., computing an optimal set of $k$ sources that minimizes the stretch factor. The following theorem shows that the decision version of this problem is NP-complete for triangle graphs. Note that this directly yields the NP-completeness for arbitrary graphs (since proving that the problem is in NP is trivial).

###### Theorem 6.

Given a triangle graph $G$, an integer $k$, and a real threshold value, it is NP-complete to determine whether there exists a set of $k$ sources such that the stretch factor is at most this value.

Note that the problem is in NP since, for any set of sources, the stretch factor can be computed in polynomial time. To show the hardness, we provide a reduction from the decision problem related to finding a minimum cardinality vertex cover on planar graphs of maximum vertex degree three [7]. The first step of the reduction uses the following well-known result on embedding planar graphs in integer grids [18].

###### Lemma 7.

A planar graph with maximum degree 4 can be embedded in the plane using $O(n^2)$ area in such a way that its nodes have integer coordinates and its edges are drawn as polygonal line segments that lie on the integer grid (i.e., every edge consists of one or more line segments that lie on lines of the form $x=i$ or $y=j$, where $i$ and $j$ are integers).

Figure 3: (a) A planar graph $G$ and (b) its grid embedding $G_r$. (c) The gadget $\varrho$ replacing the edges of $G_r$ and (d) the resulting graph $G'$.

Consider a planar graph $G=(V,E)$ with maximum degree 3, and let $G_r$ be a planar embedding of $G$ according to Lemma 7, to which we have added, on each edge $e$, an even number of auxiliary nodes with half-integer coordinates and such that every resulting edge in $G_r$ has length $1/2$ or $1$. (We consider the half-integer grid so that we can ensure that we add an even number of auxiliary nodes on every edge of $G$ so that every resulting edge in $G_r$ has length at most 1.) Please refer to Figure 3(a,b) for an illustration. For an edge $e=(u,v)\in E$, we let $\pi_e$ denote the path in $G_r$ replacing the edge $e$, and we write $2m_e$ for the (even) number of auxiliary nodes on $\pi_e$. The endpoints of the paths $\pi_e$ (i.e., the nodes that are not auxiliary) are called regular nodes. Finally, let $K = k + \sum_{e\in E} m_e$. We have the following lemma.

###### Lemma 8.

$G$ has a vertex cover of size $k$ if and only if $G_r$ has a vertex cover of size $K$.

###### Proof.

Any vertex cover of $G$ with size $k$ can be extended to a vertex cover of size $K$ in $G_r$ by including every other auxiliary node on $\pi_e$, for each edge $e\in E$.

Now let $C$ be a vertex cover for $G_r$ of size $K$, and suppose there exists a path $\pi_e$, with $e=(u,v)$, such that neither $u$ nor $v$ belongs to $C$. Then at least $m_e+1$ auxiliary nodes from $\pi_e$ must belong to $C$ in order to cover all the edges of this path. However, by using only $m_e$ auxiliary nodes from $\pi_e$ and adding $u$ or $v$ to $C$, we still have a vertex cover of the same size, which now contains one of the endpoints of $\pi_e$. Continuing this way, we can construct a vertex cover $C$ of size $K$ for $G_r$, which includes at least one endpoint from each $\pi_e$, for all $e\in E$. Therefore, $C$ is a vertex cover for $G$, when restricted to the nodes of $G$ (regular nodes). Since a minimum of $m_e$ auxiliary nodes are needed to cover any path $\pi_e$ (even if both $u$ and $v$ belong to the vertex cover), the number of regular nodes selected is at most $k$. This concludes the proof. ∎
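The counting argument used in this proof can be verified by exhaustive search on a single subdivided edge. The sketch below is an illustration with hypothetical names: the path has nodes $0,1,\dots,$ `num_aux` $+1$, where nodes $0$ and `num_aux` $+1$ play the role of the regular endpoints and the nodes in between are the auxiliary nodes.

```python
from itertools import combinations

def min_aux_cover(num_aux, endpoints_in_cover):
    """Fewest auxiliary nodes needed to cover every edge of the path
    0 - 1 - ... - (num_aux + 1), given which endpoints are already in the cover."""
    last = num_aux + 1
    edges = [(i, i + 1) for i in range(last)]
    auxiliary = range(1, last)
    for size in range(num_aux + 1):
        for chosen in combinations(auxiliary, size):
            cover = set(chosen) | set(endpoints_in_cover)
            if all(a in cover or b in cover for a, b in edges):
                return size
    return None  # unreachable: taking all auxiliary nodes always covers the path

# With an even number 2m of auxiliary nodes: m suffice as soon as one endpoint
# is in the cover, and m + 1 are needed when neither endpoint is.
for m in (1, 2, 3):
    last = 2 * m + 1
    print(m,
          min_aux_cover(2 * m, {0}),
          min_aux_cover(2 * m, {0, last}),
          min_aux_cover(2 * m, set()))
```

This is why an even number of auxiliary nodes is used on every edge: the parity makes a cover that skips both endpoints strictly more expensive, so swapping one auxiliary node for an endpoint never increases the size of the cover.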

Finally, we replace each edge in $G_r$ with a copy of the gadget $\varrho$ illustrated in Figure 3(c), and denote the resulting graph by $G'$ (see Figure 3(d)). (We note that each copy is scaled, while maintaining the proportions, to match the length of the edge it replaces.)

###### Proof of Theorem 6.

Consider the graph $G'$, constructed as above from a planar graph $G$ with maximum degree 3 and with the gadget $\varrho$ of Figure 3(c). The graph $G'$ can be seen as a union of triangles, and it is thus a triangle mesh. We prove in the following that $G$ has a vertex cover of size $k$ if and only if $G'$ has $K$ sources such that its stretch factor is at most the threshold value fixed by the gadget, where $K$ is the size of the corresponding vertex cover of $G_r$ given by Lemma 8. Hence, the vertex cover problem can be reduced in polynomial time to the problem at hand, which concludes the proof.

We first show that if $G$ has a vertex cover of size $k$, then there is a set of $K$ sources in $G'$ whose stretch factor is at most this threshold. If $G$ has a vertex cover of size $k$, then $G_r$ has a vertex cover of size $K$ by Lemma 8. Recall that this vertex cover of $G_r$ can be obtained from the vertex cover of $G$ by adding every other auxiliary node on each edge of $G$. Let this vertex cover of $K$ nodes be the choice of sources in $G'$. Consider a pair $p$ and $q$ of nodes. We consider three cases:

1. $p$ and $q$ belong to the same gadget (the same copy of $\varrho$). Label the nodes of this gadget as illustrated in Figure 3(c), and suppose, without loss of generality, that one of its nodes that is also a node of $G_r$ is selected as a source. Then, $d(p,s_i,q)/d(p,q)$ is equal to 1 if $p$ or $q$ coincides with this source, and for the remaining pairs its value follows from the definition of the gadget (see Figure 4(a)). By the definition of the gadget, the maximum of $d(p,s_i,q)/d(p,q)$, over all pairs in a gadget, is the threshold value.
2. $p$ and $q$ belong to two adjacent gadgets. If the node which belongs to both gadgets is selected as a source then, by symmetry, the analysis of Case 1 yields the same bound. Otherwise, the two other nodes of $G_r$ on these gadgets are sources and, for any two nodes from two different gadgets, the maximum ratio is 2 (the ratio is 1 if $p$ or $q$ is one of the sources and the ratio is 2 in the other cases; see Figure 4(b)). Therefore, again the maximum ratio is at most the threshold value.
3. $p$ and $q$ belong to neither the same gadget nor to two adjacent gadgets. In this case, at least one of the nodes in a shortest path from $p$ to $q$ is selected as a source, and hence their approximate shortest path equals the geodesic shortest path and $d(p,s_i,q)/d(p,q)=1$.

Figure 4: For the proof of Theorem 6.

We thus proved that the stretch factor of $G'$, for the selected sources, is at most the threshold value.

Conversely, we show that if $G'$ has $K$ sources such that its stretch factor is at most the threshold value, then $G$ has a vertex cover of size $k$. Every gadget must contain at least one source since, otherwise, two of the nodes of a gadget with no source would be approximated with a stretch exceeding the threshold. For every gadget in $G'$, if one of its nodes that is also a node of $G_r$ is a source, we select the corresponding vertex in $G_r$, and if one of its other nodes is a source, we select any endpoint of the edge of $G_r$ corresponding to the gadget. Then, at least one vertex from each edge of the graph $G_r$ is selected. Hence, $G_r$ has a vertex cover of size $K$, which implies that $G$ has a vertex cover of size $k$ by Lemma 8. This completes the proof. ∎

## 5 Conclusions

We analyzed the stretch factor $F_{FPS}$ of approximate geodesics computed as distances through at least one of a set of $k$ sources found using farthest point sampling. We showed that $F_{FPS}$ can be bounded by $2r_e^2(F^*+1)+8r_e+1$, where $F^*$ is the stretch factor obtained using an optimal placement of the sources and $r_e$ is the ratio of the lengths of the longest and the shortest edges in the graph. Furthermore, we showed that it is NP-complete to find such an optimal placement of the sources. Note that in many practical applications the ratio $r_e$ is small, which gives some evidence explaining why farthest point sampling has been used successfully for isometry-invariant surface processing.

## Acknowledgments

This research was initiated at Bellairs Workshop on Geometry and Graphs, March 10–15, 2013. The authors are grateful to Prosenjit Bose, Vida Dujmovic, Stefan Langerman, and Pat Morin for organizing the workshop and to the other workshop participants for providing a stimulating working environment.

## References

• [1] Z. Ben Azouz, P. Bose, C. Shu, and S. Wuhrer. Approximations of geodesic distances for incomplete triangular manifolds. In Canadian Conference on Computational Geometry, 2007.
• [2] P. Bose, A. Maheshwari, C. Shu, and S. Wuhrer. A survey of geodesic paths on 3D surfaces. Computational Geometry - Theory and Applications, 44(9):486–498, 2011.
• [3] A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Generalized multidimensional scaling: a framework for isometry-invariant partial surface matching. Proceedings of the National Academy of Sciences, 103(5):1168–1172, 2006.
• [4] E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1:269–271, 1959.
• [5] A. Elad and R. Kimmel. On bending invariant signatures for surfaces. Transactions on Pattern Analysis and Machine Intelligence, 25(10):1285–1295, 2003.
• [6] Y. Eldar, M. Lindenbaum, M. Porat, and Y. Y. Zeevi. The farthest point strategy for progressive image sampling. Transactions on Image Processing, 6(9):1305–1315, 1997.
• [7] M. R. Garey and D. S. Johnson. The rectilinear Steiner tree problem is NP-complete. SIAM Journal on Applied Mathematics, 32(4):826–834, 1977.
• [8] J. Giard and B. Macq. From mesh parameterization to geodesic distance estimation. In European Workshop on Computational Geometry, 2009.
• [9] T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.
• [10] M. Rauch Henzinger, P. N. Klein, S. Rao, and S. Subramanian. Faster shortest-path algorithms for planar graphs. Journal of Computer and System Sciences, 55(1):3–23, 1997.
• [11] J. Könemann, Y. Li, A. Sinha, and O. Parekh. An approximation algorithm for the edge-dilation k-center problem. Operations Research Letters, 32(5):491–495, 2004.
• [12] Y. Lipman and T. Funkhouser. Möbius voting for surface correspondence. Transactions on Graphics, 28(3):72:1–72:12, 2009. Proceedings of SIGGRAPH.
• [13] F. Mémoli and G. Sapiro. Comparing point clouds. In Symposium on Geometry Processing, pages 32–40, 2004.
• [14] C. Moenning and N. A. Dodgson. Fast marching farthest point sampling. In Eurographics Poster Presentation, 2003. Technical Report 562, University of Cambridge Computer Laboratory.
• [15] M. Ruggeri, G. Patane, M. Spagnuolo, and D. Saupe. Spectral-driven isometry-invariant matching of 3D shapes. International Journal of Computer Vision, 89(2,3):248–265, 2010.
• [16] J. Sun, M. Ovsjanikov, and L. Guibas. A concise and provably informative multi-scale signature based on heat diffusion. Computer Graphics Forum, 28(5):1383–1392, 2009. Proceedings of SGP.
• [17] A. Tevs, A. Berner, M. Wand, I. Ihrke, and H.-P. Seidel. Intrinsic shape matching by planned landmark sampling. Computer Graphics Forum, 29(2):543–552, 2011. Proceedings of Eurographics.
• [18] L. G. Valiant. Universality considerations in VLSI circuits. IEEE Transactions on Computers, 30:135–140, 1981.
• [19] S. Wuhrer. Matching and Morphing of Isometric Models. Ph.D. thesis, Carleton University, Canada, 2009.
• [20] S. Wuhrer, C. Shu, Z. Ben Azouz, and P. Bose. Posture invariant correspondence of incomplete triangular manifolds. International Journal of Shape Modeling, 13(2):139–157, 2007.