Fast Fréchet Distance Between Curves With Long Edges

10/28/2017 ∙ by Joachim Gudmundsson, et al. ∙ Tulane University, The University of Sydney, AUT

Computing the Fréchet distance between two curves takes roughly quadratic time. In this paper, we show that for curves with long edges the Fréchet distance computations become easier. Let P and Q be two polygonal curves in R^d with n and m vertices, respectively. We prove four main results for the case when all edges of both curves are long compared to the Fréchet distance between them: (1) a linear-time algorithm for deciding the Fréchet distance between two curves, (2) an algorithm that computes the Fréchet distance in O((n+m) log(n+m)) time, (3) a linear-time √d-approximation algorithm, and (4) a data structure that supports O(m log^2 n)-time decision queries, where m is the number of vertices of the query curve and n the number of vertices of the preprocessed curve.

1 Introduction

Measuring the similarity between two curves is an important problem with applications in many areas, e.g., in morphing [11], movement analysis [12], handwriting recognition [18] and protein structure alignment [16]. The Fréchet distance is one of the most popular similarity measures and has received considerable attention in recent years. Intuitively, it is the minimum length of a leash that connects a man and a dog walking along the two curves without going backward. The classical algorithm for computing the Fréchet distance between two curves with total complexity n runs in O(n^2 log n) time [2]. The major goal of this paper is to compute the Fréchet distance for a reasonable special class of curves in significantly less than quadratic time.

1.1 Related Work

Buchin et al. [7] gave an Ω(n log n) lower bound for computing the Fréchet distance. Bringmann [5] then showed that, assuming the Strong Exponential Time Hypothesis, the Fréchet distance cannot be computed in strongly subquadratic time, i.e., in O(n^(2-δ)) time for any δ > 0. For the discrete Fréchet distance, which considers only distances between the vertices, Agarwal et al. [1] gave an algorithm with a (mildly) subquadratic running time of O(n^2 log log n / log n). Buchin et al. [8] showed that the continuous Fréchet distance can be computed in O(n^2 √(log n) (log log n)^(3/2)) expected time. Bringmann and Mulzer [6] gave an O(n log n + n^2/α)-time algorithm to compute an α-approximation of the discrete Fréchet distance for any integer 1 ≤ α ≤ n. Therefore, an n^ε-approximation for any ε > 0 can be computed in (strongly) subquadratic time.
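To make the discrete variant concrete, the following is a minimal sketch of the textbook quadratic dynamic program for the discrete Fréchet distance. It is not the subquadratic algorithm of Agarwal et al. [1]; the function name `discrete_frechet` and the representation of curves as lists of coordinate tuples are assumptions made for illustration.

```python
from math import dist

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between vertex sequences P and Q.

    Textbook O(len(P) * len(Q)) dynamic program, keeping one row at a time.
    """
    m = len(Q)
    prev = [0.0] * m
    for i in range(len(P)):
        cur = [0.0] * m
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                cur[j] = d
            elif i == 0:
                # Only Q can have advanced.
                cur[j] = max(cur[j - 1], d)
            elif j == 0:
                # Only P can have advanced.
                cur[j] = max(prev[0], d)
            else:
                # Cheapest of: advance P, advance both, advance Q.
                cur[j] = max(min(prev[j], prev[j - 1], cur[j - 1]), d)
        prev = cur
    return prev[m - 1]
```

Because only vertices are matched, the result can exceed the continuous Fréchet distance when edges are long, which is one reason the continuous version is the one studied in this paper.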

For the continuous Fréchet distance, there are also a few subquadratic algorithms known for restricted classes of curves such as κ-bounded, backbone and c-packed curves. Alt et al. [3] considered κ-bounded curves and gave an O(n log n) time algorithm to (κ+1)-approximate the Fréchet distance. A curve P is κ-bounded if for any two points x, y on P, the union of the balls with radii r centered at x and y contains the whole subcurve of P between x and y, where r is equal to κ/2 times the Euclidean distance between x and y. For any ε > 0, Aronov et al. [4] provided a near-linear time (1+ε)-approximation algorithm for the discrete Fréchet distance for so-called backbone curves that have essentially constant edge length and require a minimum distance between non-consecutive vertices. For c-packed curves a (1+ε)-approximation can be computed in O(cn/ε + cn log n) time [10]. A curve P is c-packed if for any ball B, the length of the portion of P contained in B is at most c times the diameter of B.

1.2 Our Contribution

In this paper, we study a new class of curves, namely curves with long edges, and we show that for these curves the Fréchet distance can be computed significantly faster than quadratic time. In a particular application, one might be interested in detecting groups of different movement patterns in migratory birds that fly very long distances. As shown in Fig. 1, different flyways are comparatively straight and the trajectory data of individual birds usually consists of only one GPS sample per day in order to conserve battery power. Infrequent sampling and the straight flyways therefore result in curves with long edges, and it is desirable to compare the routes of different animals in order to identify common flyways.

Figure 1: There are four typical flyways across the U.S. Clustering the trajectories by their similarity helps identify the most common movement patterns [17].

We consider the decision, optimization, approximation and data structure problems for the Fréchet distance between two polygonal curves P and Q in R^d with n and m vertices, respectively, all for the case where all edges of both curves are long compared to the Fréchet distance between them. In Section 3 we present a greedy linear-time algorithm for deciding whether the Fréchet distance is at most ε, as long as all edges in P are at least 2ε long and edges in Q are at least (1+√d)ε long. In Section 4 we give an algorithm for computing the Fréchet distance in O((n+m) log(n+m)) time and a linear-time algorithm to approximate the Fréchet distance up to a factor of √d. In Section 5 we give a data structure that decides the Fréchet distance in O(m log^2 n) query time using O(n log n) space and O(n log^2 n) preprocessing time, where m is the number of vertices of the query curve and n the number of vertices of the preprocessed curve.

2 Preliminaries

In this section we provide notations and definitions that will be required in the next sections. Let and be two polygonal curves with vertices and , respectively. We treat a polygonal curve as a continuous map where for an integer , and the -th edge is linearly parametrized as , for integer and . A re-parametrization of is any continuous, non-decreasing function such that and . We denote a re-parametrization of by .
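The view of a polygonal curve as a continuous map can be illustrated with a short sketch. The indexing is an assumption, since the exact index ranges are not legible here: vertices are numbered from 0, integer parameters hit vertices, and a fractional parameter interpolates linearly along the edge that starts at its integer part. The name `curve_point` is hypothetical.

```python
from math import floor

def curve_point(vertices, t):
    """Point at parameter t on a polygonal curve given as a list of vertices.

    Assumed convention: the domain is [0, len(vertices) - 1], and
    t = i + lam lies on the edge from vertices[i] to vertices[i + 1].
    """
    last = len(vertices) - 1
    if not 0 <= t <= last:
        raise ValueError("parameter outside the curve's domain")
    if last == 0:
        return tuple(vertices[0])
    # Treat the final parameter as the end of the last edge.
    i = min(floor(t), last - 1)
    lam = t - i
    a, b = vertices[i], vertices[i + 1]
    return tuple((1 - lam) * x + lam * y for x, y in zip(a, b))
```

A re-parametrization is then any continuous, non-decreasing function from a common interval onto this domain, so composing it with `curve_point` traverses the curve without going backward.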

We denote the length of the shortest edge in and the length of the shortest edge in by and , respectively. For two points , let denote the Euclidean distance between the points. For , denotes the subcurve of starting in and ending in . Let be a real number. Consider an edge of length . Let be the ball with radius that is centered at a point . The cylinder is the set of points in within distance from , i.e. . is -monotone if (1) and , (2) , and (3) is monotone with respect to the line supporting . A curve is monotone with respect to a line if it intersects any hyperplane perpendicular to that line at most once.
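The monotonicity conditions can be checked vertex by vertex. The sketch below assumes a particular reading of the three conditions, since their symbols are not legible here: (1) the curve starts within ε of the first segment endpoint and ends within ε of the second, (2) the curve stays inside the cylinder of radius ε around the segment, and (3) the projections of the vertices onto the segment direction strictly increase. The names `point_segment_distance` and `is_segment_monotone` are hypothetical.

```python
from math import dist

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment ab."""
    ab = [y - x for x, y in zip(a, b)]
    ap = [y - x for x, y in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Clamp the projection parameter so the closest point stays on the segment.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(u * v for u, v in zip(ap, ab)) / denom))
    closest = [x + t * c for x, c in zip(a, ab)]
    return dist(p, closest)

def is_segment_monotone(vertices, a, b, eps):
    """Check the three assumed conditions for a polygonal curve and segment ab."""
    # (1) Endpoints of the curve lie in the eps-balls around a and b.
    if dist(vertices[0], a) > eps or dist(vertices[-1], b) > eps:
        return False
    # (2) The cylinder is convex, so it contains the curve iff it contains all vertices.
    if any(point_segment_distance(v, a, b) > eps for v in vertices):
        return False
    # (3) Strictly increasing projections onto the direction from a to b.
    ab = [y - x for x, y in zip(a, b)]
    proj = [sum(x * c for x, c in zip(v, ab)) for v in vertices]
    return all(s < t for s, t in zip(proj, proj[1:]))
```

Checking only vertices in step (2) relies on the cylinder being convex, and step (3) relies on each edge being a straight segment, so its projection is monotone exactly when its endpoints are ordered.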

Figure 2: is a free point on a reachable path in , where white space contains free points and gray space contains blocked points. There are a quadratic number of cells containing free space as well as a quadratic number of cells containing blocked space in and all of them may need to be checked to decide reachability for . Note that curves contain short edges as well as long edges compared to and some consecutive purple vertical intervals intersects which cause many possibilities for searching for the reachable path.

2.1 Fréchet Distance and Free-Space Diagram

To compute the Fréchet distance between and , Alt and Godau [2] introduced the notion of free-space diagram. For any , we denote the free-space diagram between and by . This diagram has the domain of and it consists of cells, where each point in the diagram corresponds to two points and . A point in is called free if and blocked, otherwise. The union of all free points is referred to as the free space. A monotone matching between and is a pair of re-parameterizations corresponding to an -monotone path from to within the free space in . The Fréchet distance between two curves is defined as , where is a monotone matching and is called the width of the matching. A monotone matching realizing is called a Fréchet matching. A point is reachable if there exists a Fréchet matching from to in . A Fréchet matching in from to is also called a reachable path for (see Fig. 2). Alt and Godau [2] compute a reachable path by propagating reachable points across free space cell boundaries in a dynamic programming manner, which requires the exploration of the entire and takes time.
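The basic primitive behind the free-space diagram is the free interval on a cell boundary: the part of a segment that lies within distance ε of a fixed point. A minimal sketch, assuming the segment is parametrized by λ in [0, 1] and using the hypothetical name `free_interval`, solves the corresponding quadratic inequality.

```python
from math import sqrt

def free_interval(p, a, b, eps):
    """Sub-interval of [0, 1] where a + lam * (b - a) is within eps of p.

    Returns (lo, hi), or None if no point of the segment is free.
    """
    d = [y - x for x, y in zip(a, b)]   # segment direction b - a
    f = [x - y for x, y in zip(a, p)]   # offset a - p
    A = sum(c * c for c in d)
    B = 2 * sum(u * v for u, v in zip(f, d))
    C = sum(c * c for c in f) - eps * eps
    if A == 0:
        # Degenerate segment: a single point, free or blocked as a whole.
        return (0.0, 1.0) if C <= 0 else None
    disc = B * B - 4 * A * C
    if disc < 0:
        return None
    r = sqrt(disc)
    lo = max((-B - r) / (2 * A), 0.0)
    hi = min((-B + r) / (2 * A), 1.0)
    return (lo, hi) if lo <= hi else None
```

The dynamic program of Alt and Godau [2] computes such intervals on every cell boundary and propagates reachability through them, which is why it touches the whole diagram.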

2.2 The Main Idea

We set out to provide faster algorithms for the Fréchet distance using implicit structural properties of the free-space diagram of curves with long edges. These properties will allow us to develop greedy algorithms that construct valid re-parameterizations by repeatedly computing a maximally reachable subcurve on one of the curves. Like the greedy algorithm proposed by Bringmann and Mulzer [6], we compute prefix subcurves that have a valid Fréchet distance. However, while the approximation ratio of their greedy algorithm is exponential, the approximation ratio of the algorithm we present in Section 4.2 is constant, because we can take advantage of the curves having long edges. Our assumption on edge lengths is more general than backbone curves, since we do not require that non-consecutive vertices be far away from each other and we do not require any upper bound on the length of the edges.

The free space diagram for curves with long edges is simpler, and intuitively seems to have fewer reachable paths (see Fig. 3). In the remainder of this paper we show that indeed we can exploit this simpler structure to compute reachable paths in a simple greedy manner which results in runtimes that are significantly faster than quadratic. We show that by setting up a greedy-based approach the reachable path, if it exists, can be found efficiently.

Figure 3: for curves with long edges results in fewer reachable paths for . In the remainder of this paper we show how to find such a path without requiring to check the entire free space diagram. Consider the vertical (free) intervals on the lines that correspond to vertices of . The intervals shown in purple are the ones that intersect the green reachable path. Since , no consecutive purple intervals intersect which is a property we exploit.

3 A Greedy Decision Algorithm

In this section we give a linear time algorithm for deciding whether the Fréchet distance between two polygonal curves and in with relatively long edges is at most . In Section 3.1, we first prove a structural property for the case that each edge in is longer than and is a single segment. Afterwards in Section 3.2, we consider the extension to the case that and are two polygonal curves with edges longer than and , respectively. In Section 3.3, we present our greedy algorithm, which is based on computing longest reachable prefixes in with respect to each segment in . In Section 3.4, we provide a critical example in which our greedy algorithm fails when the assumption on the edge lengths does not hold.

3.1 A Simple Fréchet Matching for a Single Segment

In this section we start by introducing the crucial notion of orthogonal matching between a polygonal curve and a single line segment . An orthogonal matching projects each point from to its closest point on . In particular, it maps vertices of either orthogonally to the segment or directly to the endpoints of .

[Orthogonal Matching] Let , be a polygonal curve, and be a line segment. A Fréchet matching realizing is called an orthogonal matching of width if and only if for all , ; see Fig. 4(a).
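As an illustration of the closest-point idea behind the orthogonal matching, the sketch below maps every vertex of a curve to its closest point on a segment, which is either its orthogonal projection or a segment endpoint, and reports the largest resulting distance. The name `orthogonal_match` is hypothetical, and the sketch does not verify that the result is a valid monotone matching.

```python
from math import dist

def orthogonal_match(vertices, a, b):
    """Map each vertex to its closest point on segment ab.

    Returns (params, width): the position in [0, 1] of each image on ab,
    and the largest vertex-to-image distance.
    """
    ab = [y - x for x, y in zip(a, b)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        raise ValueError("segment endpoints must differ")
    params, width = [], 0.0
    for v in vertices:
        av = [y - x for x, y in zip(a, v)]
        # Orthogonal projection, clamped to an endpoint when it falls outside ab.
        t = max(0.0, min(1.0, sum(u * w for u, w in zip(av, ab)) / denom))
        image = [x + t * c for x, c in zip(a, ab)]
        params.append(t)
        width = max(width, dist(v, image))
    return params, width
```

Since the distance from a point to a segment is a convex function of the point, its maximum over each edge is attained at a vertex, so the vertex distances bound the distances of all curve points to their images. The returned parameters are non-decreasing exactly when the vertices are visited in order along the segment.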

Figure 4: (a) In this example is -monotone and the green arrows indicate an orthogonal matching between and . (b) Illustration of the case in the proof of Lemma 4 .

Now we state a key lemma that demonstrates that the orthogonal matching between the curve with long edges and the segment exists if and only if , and this is equivalent to being -monotone.

[Orthogonal Matching and Monotonicity] Let , be a polygonal curve such that , and be a line segment. Then the following three are equivalent:

  1. ,

  2. is -monotone,

  3. and admit an orthogonal matching of width at most .

Proof.

For all , define . Because , we know that . Note that (3) implies (1) by Definition 3.1.

To prove (1) (2), if , then it is obvious to see that , , and . It remains to show that is monotone with respect to the line supporting . Let be a monotone matching realizing . For sake of contradiction assume there exists a hyperplane perpendicular to such that intersects in at least two points and , where . Let be the last vertex along , and recall that and are the two vertices of . First assume that . Then lies on the -side of and lies on the -side of . Therefore, because , we know that . Let be two values such that and , where . From and , we know that , which violates the monotonicity of , see  Fig. 4(b). Now consider the case that . Then lies on one side of , and lies entirely on the other side. If , then we know that . But this is not possible since all edges of are longer than . The same argument holds if .

To prove (2) (3), let be an arbitrary hyperplane perpendicular to which splits at some point . Let be the last vertex along . By construction, and are on opposite sides of . Since is on the -side of , must also lie on the -side of , because otherwise would intersect at least twice. On the other hand, , therefore, . Now, let be a matching of width at most such that for all , . Then let be values such that for all . Note that and , therefore, where . This indicates that is a monotone matching of width , therefore is an orthogonal matching width .

In fact Lemma 4 shows that for a curve with long edges, the Fréchet distance to a line segment is determined by examining whether is -monotone or not.

3.2 A Simple Fréchet Matching for More than One Segment

In this section, we extend the matching between a curve and a single line-segment to a matching between two curves and .

[Longest -Prefix] Let , be a polygonal curve, and be a line segment. Define . We call the longest -prefix of with respect to .

We now use the longest -prefix to define an extension of the matching introduced in Definition 3.1. Definition 3.2 is the basis of our greedy algorithm (Algorithm 1) which is presented in the next section. We show that if there exists a matching between two curves, then one can necessarily cut it into orthogonal matchings between each segment in and the corresponding longest -prefix. Before we reach this property, we need the following technical lemma:

[-Ball] Let and let be a polygonal curve such that . Let where . Assume that is the longest -prefix of with respect to , and be a parameter such that is the first point along that intersects . Then .

Proof.

By assumption , we know that , thus exists. Notice that . Let be the hyperplane intersecting that is perpendicular to and is tangent to . Hence splits into two parts, the part on the -side and the part on the -side. Let be the last vertex before . By Definition 3.2, , and ; then, by Lemma 4, is -monotone. Thus must lie on the -side of , and in particular inside the cube enclosing (see Fig. 5). Therefore the maximum possible distance between any point in and is . If , then is a line segment and lies trivially inside . ∎

Figure 5: The farthest point in from must lie inside the cube enclosing .

[(-Ball] Let and let be a polygonal curve. Let where . Assume that is the longest -prefix of with respect to and is the first point along that intersects . Then .

Proof.

Although a similar proof appears as Lemma 11 of Gudmundsson and Smid [14], we describe a slight modification of the proof that is necessary for our setting. Suppose is a Fréchet matching realizing . Let such that is the farthest point from . We need to show that , which implies . Let be two values such that and . Note that there exists some such that . By the triangle inequality we have:

Note that and we can have , hence:

Applying the triangle inequality once more, we have:

Therefore, . ∎

Now we show that if , then the two polygonal curves and admit a piecewise orthogonal matching, which can be obtained by computing longest -prefixes of with respect to each segment of . This lemma is the foundation of our greedy algorithm (Algorithm 1).

Figure 6: Given an arbitrary matching (the concatenation of the light and dark green reachable paths), the orthogonal matching (the brown reachable path) between and exists. We construct a matching realizing as the concatenation of the pink and the dark green reachable paths.

[The Cutting Lemma] Let , and let and be two polygonal curves such that and . If , then as the longest -prefix of with respect to exists, and .

Proof.

Let be any Fréchet matching realizing . This corresponds to a reachable path, which is shown as the concatenation of the light and dark green paths in the example in Fig. 6. Let be the largest value such that , hence . By Definition 3.2, exists with , and . See the brown reachable path corresponding to the orthogonal matching realizing in Fig. 6. In the remainder of this proof we construct a matching to prove that (the concatenation of the pink and dark green paths).

Let be the largest value such that . By Lemma 3.2, . Now let be the smallest value such that . We have , therefore and thus cannot match to any point in . Therefore, , and correspondingly .

Now we construct a new matching realizing as follows: and for all (the dark green reachable path). On the other hand, since (pink point) and (the dark green point), we know that i.e., the pink vertical segment is free. We set, and for all (the pink reachable path). Therefore, we have , which completes the proof. ∎

Now since by Lemma 6 we have ,  Lemma 4 implies that the matching between and is orthogonal. Note that if the last edge in is shorter than , we can adjust the orthogonal matching by simply mapping all points on the last edge to . In addition, if and have long edges then the free-space diagram is simpler than in the general case, since the entire vertical space (the pink segment in Fig. 6) between two points and has to be free and cannot contain any blocked points.

3.3 The Algorithm

In this section we present a linear time algorithm using the properties provided in Section 3.1. At the heart of our decision algorithm is the greedy algorithm presented in Algorithm 1. The input to this DecisionAlgorithm is two polygonal curves and , and . The algorithm assumes that and have long edges. In each iteration the function LongestEpsilonPrefix returns , where is the longest -prefix of with respect to , if it exists. Here, is the parameter of where is the endpoint of the previous longest -prefix with respect to . At any time in the algorithm, if , this means that the corresponding longest -prefix does not exist and then “No” is returned. Otherwise, the next edge of is processed. This continues iteratively until all edges have been processed, or until no exists.

1 DecisionAlgorithm()
        // Assumes
2       for  to  do
3             LongestEpsilonPrefix()
              if  then return “No”
4       if  then return “No”
        return “Yes”
Algorithm 1 Decide whether

Computing the Longest -Prefix: Consider a segment of in the st iteration of Algorithm 1 and let be the value of computed in the -th iteration of the for loop in line 5 of the algorithm. It follows from Lemma 4 that and admit an orthogonal matching if and only if is -monotone, i.e., (1) and , (2) , and (3) is monotone with respect to the line supporting . Lemma 4 can therefore be used to implement the LongestEpsilonPrefix procedure as follows: Determine the first edge on that violates one of the three conditions above with respect to . If the violation occurs before reaching , then obviously . Otherwise we intersect the first violating edge along with the boundary of to find . Clearly, the whole process takes linear time. Now we prove the correctness of our decision algorithm:
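The control flow of the greedy decision procedure can be sketched as follows. This is only a skeleton under stated assumptions: `longest_epsilon_prefix` is a hypothetical callback standing in for the LongestEpsilonPrefix procedure, it is assumed to return the end parameter of the next prefix on the first curve or `None` when no prefix exists, and the final test is assumed to reject when the end of the first curve has not been reached.

```python
def decision_algorithm(num_q_edges, p_end, longest_epsilon_prefix):
    """Greedy loop: extend a prefix of the first curve once per edge of the second.

    num_q_edges            -- number of edges of the second curve
    p_end                  -- final parameter of the first curve
    longest_epsilon_prefix -- callback (start_param, edge_index) -> end_param or None
    """
    gamma = 0.0
    for j in range(num_q_edges):
        gamma = longest_epsilon_prefix(gamma, j)
        if gamma is None:
            # No prefix can be matched to this edge.
            return False
    # All edges were processed; accept only if the whole first curve was consumed.
    return gamma >= p_end
```

Each call resumes from the endpoint of the previous prefix, so a callback that scans forward without revisiting edges keeps the total work linear in the combined size of the two curves.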

[Correctness] Let , and let and be two polygonal curves such that and . Then DecisionAlgorithm() returns “Yes” if and only if

Proof.

If the algorithm returns “Yes” then the sequence for all with and describes a monotone matching that realizes .

If , then we use Lemma 6 to prove by induction on that the algorithm returns “Yes”, i.e., all longest -prefixes of with respect to the corresponding segments of exist. For , following Lemma 6, exists and can be found by the algorithm. For any , the algorithm has determined already and by Lemma 6, . Another application of Lemma 6 yields that and .

In the case that it remains to prove that . For the sake of contradiction, assume . Since is the longest -prefix, there is no other such that . Consequently, and therefore . Applying the contrapositive of Lemma 6 to and yields , which is a contradiction. Therefore and the algorithm returns “Yes” as claimed. ∎

Observation (Piecewise Orthogonal Matching).

If , then the sequence computed by Algorithm 1 induces a Fréchet matching that maps to , for all .  Lemma 4 implies that the matching between and is the orthogonal matching.

We summarize this section with the following theorem: [Runtime] Let , and let and be two polygonal curves such that and . Then there exists a greedy decision algorithm, Algorithm 1, that can determine whether in time.

Proof.

The number of vertices in is at most . The algorithm greedily finds the longest -prefix per edge by calling LongestEpsilonPrefix in time. The for-loop iterates over edges, thus the runtime is .

Our algorithm can also be applied to the case that one curve has arbitrary edge lengths and the other curve has edge length greater than .

[Single Curve with Long Edges] Let , and let and be two polygonal curves such that and . Then there exists a greedy decision algorithm, Algorithm 1, that can determine in time.

Proof.

In the proof of Lemma 6, we can replace Lemma 3.2 with Corollary 5, and realize that Lemma 6 also holds for the case and . However, we cannot implement LongestEpsilonPrefix as stated before because Lemma 4 does not hold if has arbitrary edge lengths. However, we can perform a simple reachability propagation in the free space in , and determine as the rightmost reachable point on the top boundary of the free space. This takes linear time per edge. The rest follows from Theorem 1. ∎

3.4 Necessity of the Assumption

As we have seen so far, Algorithm 1 greedily constructs a feasible Fréchet matching by linearly walking on curve to find all longest -prefixes on it with respect to the corresponding edges of . Unfortunately, this property is not always true for curves with short edges. In general, there can be (combinatorially) quadratically many blocked points (regions) in the free space diagram of two curves; see Fig. 7 as an example of two curves in that have edges of length exactly equal except for some edges of lengths in . This example demonstrates that our simple greedy construction of a Fréchet matching is unlikely to work if the edges are shorter than the assumptions we made. It also shows that our greedy construction does not work if both curves have edge lengths of at least .

Figure 7: An example in which the greedy algorithm fails to realize the Fréchet matching highlighted in green. Here, is the longest -prefix in with respect to , as illustrated by the red reachable path. Also is the longest -prefix in with respect to as illustrated by the blue reachable path. Every edge is long, except for the edges and that have lengths and , respectively. The latter values are still in the range .

4 Optimization and Approximation

In this section, we present two algorithms for computing and approximating the Fréchet distance between two curves with long edges, respectively. First we give an exact algorithm which runs in O((n+m) log(n+m)) time. Afterwards, we present a linear time algorithm which is similar to the greedy decision algorithm, but it uses the notion of minimum prefix to approximate the Fréchet distance.

4.1 Optimization

The main idea of our algorithm is that we compute critical values of the Fréchet distance between two curves and then perform binary search on these to find the optimal value acquired by the decision algorithm. In general, there are a cubic number of critical values, which are candidate values for the Fréchet distance between two polygonal curves. These critical values are those for which or , or when decreasing slightly a free space interval disappears on the boundary of a free space cell or a monotone path in the free space becomes non-monotone. See Alt and Godau [2] for more details on critical values. In our case we can show that it suffices to consider only a linear number of critical values, because the assumption on the edge lengths of the curves implies that a piecewise orthogonal matching exists, which reduces the number of possible critical values. Our optimization algorithm consists of the following three steps:

  1. Run DecisionAlgorithm() with and store all . Only proceed if DecisionAlgorithm() returns ‘Yes’.

  2. Compute , where is the set of all critical values for and . Here, is the parameter such that is the first point along that intersects .

  3. Sort and perform binary search on using DecisionAlgorithm() to find .
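The sort-and-search step can be sketched generically. The sketch assumes that the decision procedure is monotone, that is, once it answers "Yes" for some value it answers "Yes" for every larger value; `smallest_feasible` and `decide` are hypothetical names, and `decide` stands in for a call to the decision algorithm.

```python
def smallest_feasible(candidates, decide):
    """Smallest candidate value accepted by a monotone decision procedure.

    Returns None if even the largest candidate is rejected.
    """
    values = sorted(set(candidates))
    if not values or not decide(values[-1]):
        return None
    lo, hi = 0, len(values) - 1
    # Invariant: values[hi] is accepted, everything before values[lo] is rejected.
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(values[mid]):
            hi = mid
        else:
            lo = mid + 1
    return values[lo]
```

With a linear number of candidates, sorting dominates the cost of generating them, and the search makes only logarithmically many calls to the decision procedure.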

Let , for all . By Observation 3.3, we know that there exists a piecewise orthogonal matching that maps to . Therefore we only need to consider those critical values that occur between each segment in and its corresponding longest -prefix in . However, the value and correspondingly, are not known beforehand. We can show that it suffices to consider instead of when computing critical values with respect to . Here, is defined with respect to . Observe that for all , by definition of and because , see Fig. 8. Therefore, is a subcurve of . Now it can be easily seen that all critical values for and are contained in the set which are the critical values for and .

The orthogonal matching of width between and maps points in such a way that there are three types of point-to-point distances: Some of these distances are obtained by an orthogonal projection of vertices from to , some by mapping vertices from to endpoints in , and some by an orthogonal projection of vertices from to edges in . Let be the hyperplane intersecting that is perpendicular to and tangent to . Similarly, define with respect to , see Fig. 8. For any vertex do the following:

(1) If lies between and , then the orthogonal matching of width maps to its orthogonal projection on . We store the (orthogonal) distance in the set . (2) If lies on the -side of , then an orthogonal matching of width can map either to or to its orthogonal projection on . In this case we therefore store both and in . Similarly, if lies on the -side of then we store and in . Finally, for each edge in , (3) we store .
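A deliberately conservative sketch of the candidate generation is given below. Instead of applying the case analysis, which depends on hyperplanes that are not known in advance, it stores for every vertex both its orthogonal distance to the line supporting the segment and its distances to the two segment endpoints. This yields a superset of the vertex-based values in cases (1) and (2) while keeping the number of candidates linear; the values of case (3) are not covered. The name `candidate_values` is hypothetical.

```python
from math import dist

def candidate_values(vertices, a, b):
    """Sorted candidate distances between curve vertices and segment ab.

    For each vertex: distance to its orthogonal projection on the line
    through a and b, plus its distances to a and to b.
    """
    ab = [y - x for x, y in zip(a, b)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        raise ValueError("segment endpoints must differ")
    values = set()
    for v in vertices:
        av = [y - x for x, y in zip(a, v)]
        # Unclamped parameter: foot of the perpendicular on the supporting line.
        t = sum(u * w for u, w in zip(av, ab)) / denom
        foot = [x + t * c for x, c in zip(a, ab)]
        values.update((dist(v, foot), dist(v, a), dist(v, b)))
    return sorted(values)
```

Extra candidates do not affect correctness of a subsequent binary search over a monotone decision procedure; they only add a constant factor to the number of values to sort.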

Now, the call to DecisionAlgorithm() in step (1) ensures that and satisfy the constraint and . Consequently, for any , we know that and because , and therefore DecisionAlgorithm() in step (3) is runnable. We have the following theorem:

Figure 8: The orthogonal matching of width between and causes three types of point-to-point distances. We show cylinders and , where , but is not known beforehand. (a) Vertex falls into case (2), when the orthogonal matching maps either to ( lies inside