Approximating p-Mean Curve of Large Data-Sets

05/14/2020
by Sepideh Aghamolaei, et al.
Sharif Accelerator

Given p, k and a set of polygonal curves P_1,…,P_L, the p-mean curve M of P_1,…,P_L is the curve with at most k vertices that minimizes the L_p norm of the vector of Fréchet distances between each P_i and M. Also, the p-mean curve is the cluster representative (center) of L_p-based clusterings such as k-center, k-medians, and k-means. For p→∞, this problem is known to be NP-hard, with lower bound 2.25-ϵ on its approximation factor for any ϵ>0. By relaxing the number of vertices to O(k), we were able to get constant factor approximation algorithms for p-mean curve with p=O(1) and p→∞, for curves with few changes in their directions.


1 Introduction

A polygonal curve is a sequence of points, e.g. GPS data, time series, and discretized borders in a map. Curve simplification, clustering, and median curve computation under various distance measures between curves have already been studied; however, most of the existing algorithms are inefficient on massive data sets. In curve simplification, the goal is to find a subcurve of approximately the same number of vertices and minimum error from the original curve. For a set of curves, however, the simplification error is aggregated. We focus on the simplification of a set of curves by finding a representative curve that is a cluster representative of a given complexity, assuming that the input curves have a small Fréchet distance.

Intuitively, the Fréchet distance is the minimum length of the leash between a man walking on one curve from the start to the end, and his dog walking on the other curve from the start to the end, given that neither of them ever goes back. Deciding the Fréchet distance between two curves takes O(n^2) time for curves with n vertices using the free space diagram (FSD) [4]. The Fréchet distance cannot be decided in O(n^{2-ε}) time [6] or even approximated by a factor better than 3 [14], for any ε>0, unless SETH fails. Prior to this result, this lower bound was 1.399 [8]. For L input curves with n vertices, assuming SETH is true, it is not possible to decide the Fréchet distance of the curves in O(n^{L-ε}) time, for all ε>0 [10]. For the special case of c-packed curves, a (1+ε)-approximation decider exists for Fréchet distance [16].
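
To make the leash intuition concrete, the following is a minimal Python sketch of the discrete Fréchet distance, a simpler variant in which the man and the dog only stop at vertices. It is not the free-space-diagram decision procedure of [4]; the function name `discrete_frechet` and the Euclidean point metric are assumptions made for illustration.

```python
from math import dist


def discrete_frechet(P, Q):
    """Discrete Frechet distance between two vertex sequences P and Q.

    P and Q are non-empty lists of points (tuples of equal dimension).
    """
    n, m = len(P), len(Q)
    # ca[i][j] = shortest leash that lets both walkers reach the
    # vertex pair (P[i], Q[j]) without ever stepping backwards.
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Either walker, or both, advanced by one vertex.
                best_previous = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]
```

The table has one entry per pair of vertices, so the sketch runs in time proportional to the product of the two curve sizes. The discrete distance is an upper bound on the continuous Fréchet distance of the same two curves.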

Computing a representative curve is a well-studied problem [11, 25, 21, 15, 24, 3, 12, 13]. For similar (close) monotone trajectories with the same start and end vertices, Buchin et al. [11] presented algorithms for computing the median trajectory and the homotopic median trajectory.

A curve simplification where the points of the simplified curve should be a subset of the vertices of the input curve is called a discrete curve simplification. Discrete curve simplification under Fréchet distance is solvable in time [22], and no algorithm with running time exists for all , assuming SETH holds [10].

Given a curve and an integer k, the local min-ε simplification of the curve is a subsequence of at most k of its vertices, such that the maximum distance between each edge of the simplification and the sub-curve of the input defined by the vertices between the endpoints of that edge is minimized over all such edges. In the global min-ε simplification, the output is a subsequence of the input with at most k vertices that minimizes the distance between the input curve and its simplification as a whole.

The current best exact min-ε simplification algorithms for local Fréchet distance and global Fréchet distance have cubic complexity [7].

In Definition 1, unlike the simplification problem, the vertices of the estimated curve are not required to be a subset of the vertices of the original curve. A similar problem is the k-segment mean curve [29], where a monotone path is simplified into a possibly discontinuous k-piecewise linear function. Also, the problem of min-# estimation, where the distance is given and the goal is to minimize the number of vertices, has been discussed in [23]; however, their algorithm assumes the optimal path goes through a vertex of the FSD grid and therefore shares the same counter-example given in [26] for the simplification algorithm using (continuous) Fréchet distance of [22].

Definition 1 (Min-ε Estimation)

For a curve and an integer k, the min-ε estimation of the curve is a curve with at most k vertices, not restricted to the vertices of the input, that minimizes the distance to the input curve.

Our algorithms require the input to satisfy the property of Definition 2, which can be achieved by slightly perturbing the input points.

Definition 2 (Slope-Based General Position)

A set of segments is in slope-based general position if no two of the segments have the same slope. A set of curves is in slope-based general position if no segment from one curve has the same slope as a segment from another curve.
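
The condition for a set of curves can be tested directly. The sketch below is a minimal check in Python, assuming planar curves given as lists of (x, y) vertices; the function names are hypothetical, and treating all vertical edges as one shared slope class is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import combinations


def edge_slopes(curve):
    """Set of slopes of the edges of a planar polygonal curve.

    Vertical edges are represented by None (an assumption of this sketch).
    """
    slopes = set()
    for (x1, y1), (x2, y2) in zip(curve, curve[1:]):
        if x1 == x2:
            slopes.add(None)
        else:
            # Exact rational arithmetic avoids spurious float mismatches.
            slopes.add(Fraction(y2 - y1) / Fraction(x2 - x1))
    return slopes


def in_slope_general_position(curves):
    """True if no edge of one curve has the same slope as an edge of another."""
    slope_sets = [edge_slopes(curve) for curve in curves]
    return all(a.isdisjoint(b) for a, b in combinations(slope_sets, 2))
```

When the check fails, a small random perturbation of the offending vertices, as suggested above, typically restores the property.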

Approximation algorithms with near linear time exist for local simplification under Fréchet distance [2, 1]. Global discrete curve simplification using Fréchet distance can be solved in time [7]. If conjecture holds, there is no algorithm for global simplification using Fréchet distance with running time , for any  [7].

The combination of the representative curve and curve simplification problems is the -clustering problem, where the cost of clustering a set of curves into clusters with centers , , is to be minimized using curves with complexity as cluster representatives (centers).

Driemel et al. [17] proposed -approximation algorithms for -center and -median clustering of curves in 1D and a -approximation for any dimensions, assuming the complexity of a center and is constant. Buchin et al. [12] presented an algorithm for computing the -center of a set of curves under Fréchet distance, such that the complexity of the representative curves (centers) is fixed, and prove that it is NP-hard to find a polynomial approximation scheme (PTAS) for this problem. They also presented a -approximation algorithm for this problem in the plane and a -approximation for , and proved the lower bound for the discrete Fréchet distance in 2D if .

In our definition, the discrete p-mean curve (Definition 4) chooses an input curve as the representative, whereas the continuous p-mean curve (Definition 3) can be any arbitrary curve in the plane.

Definition 3 (Continuous p-Mean Curve (Continuous p-MC))

Given a set of trajectories P_1,…,P_L and an integer k, the continuous p-mean curve of these trajectories is a curve M with at most k vertices, chosen freely from the ambient space, that minimizes (Σ_{i=1}^{L} d_F(P_i, M)^p)^{1/p}, where d_F denotes the Fréchet distance.

Definition 4 (Discrete p-Mean Curve (Discrete p-MC))

Given a set of trajectories P_1,…,P_L and an integer k, the discrete p-mean curve of these trajectories is the curve M with at most k vertices chosen from the vertices of P_1,…,P_L that minimizes (Σ_{i=1}^{L} d_F(P_i, M)^p)^{1/p}, where d_F denotes the Fréchet distance.

-mean trajectories for using -center [12, 17], and using -median [17] exist. Discrete -MC with is the simplification problem with an arbitrary number of vertices.

We call the objective function of -MC the -norm of the Fréchet distance. Since the root function is monotone for , which are the values that appear in the cost of -based clustering problems, it is sufficient to minimize the -th power of Fréchet distance or . Note that while both -norms and the Fréchet distance satisfy the triangle inequality, their combination does not. For example, for , the inequality becomes , which does not always hold.
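
Both observations in this paragraph can be checked numerically. The sketch below computes the L_p cost of a vector of distances and shows that p-th powers of distances can violate the triangle inequality even when the distances themselves satisfy it. The function names and the choice p = 2 with three collinear points are assumptions made for illustration.

```python
def lp_cost(distances, p):
    """L_p norm of a vector of non-negative distances.

    p = float("inf") gives the maximum distance (the k-center style cost).
    """
    if p == float("inf"):
        return max(distances)
    return sum(d ** p for d in distances) ** (1.0 / p)


def power_cost(distances, p):
    """Sum of p-th powers; it ranks candidates in the same order as lp_cost."""
    return sum(d ** p for d in distances)


# Three collinear points a, b, c at positions 0, 1, 2 on a line.
d_ab, d_bc, d_ac = 1.0, 1.0, 2.0
p = 2

# The distances satisfy the triangle inequality (2 <= 1 + 1) ...
triangle_holds_for_distances = d_ac <= d_ab + d_bc
# ... but their squares do not (4 > 1 + 1).
triangle_holds_for_powers = d_ac ** p <= d_ab ** p + d_bc ** p
```

Because the p-th root is monotone, picking the candidate with the smallest `power_cost` also picks the one with the smallest `lp_cost`, which is why the root can be dropped during optimization.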

The ply of a 1D curve is the maximum number of times a polygonal curve passes through a point in the positive direction of the -axis. Computing the Fréchet distance takes time for curves of ply at most in 1D [9].

Table 1 summarizes the results on p-mean curves. To the best of our knowledge, we are the first to consider the p-MC for most values of p and give approximation algorithms for them. We also extend the definition of ply to general dimensions.

-Mean Time Approx. Reference
Continuous -Mean:
Lower bound [12]
, no simpl. Lower bound [13]
Cor. 1, Thms. 5.1 and 5.3
Discrete -Mean:
, weak simpl.  [12]
Lem. 10 and 11
, vertices Thms. 5.2 and 5.1, and Lem. 11
Table 1: Results on -mean curve in . means for curves whose projections on each of the axes have fixed plies.

2 Preliminaries

A polygonal curve is a sequence of points and the segments connecting each point to its next point in the sequence, , for .

The Fréchet distance of two curves is defined as where and are reparameterizations, i.e., continuous, non-decreasing, bijections from [0,1] to [0,1], and is a point metric. In the Fréchet distance of a set of curves, is the diameter of the mapped points from the input curves and can be computed in time [18]. For , there is a point on an input curve with the Fréchet distance to the other curve. The Fréchet distance of curves is the diameter of their minimum enclosing ball or the -center of the curves using Fréchet distance. Using triangle inequality, the Fréchet distance of the curves is at most twice the distance from -center to the farthest curve.

The free-space diagram (FSD) [4] between two polygonal curves for a constant error , is a 2D region in the joint parameter space of those curves where each dimension is an arc-length parameterization of one of the curves, and the free space (FS) is the set of all points that are within distance of each other: and the rest of the points are non-free. Therefore, each point of FSD defines a mapping/correspondence between a point on and a point on . The Fréchet distance between two curves is at most iff there is an -monotone path in the free space diagram from to . In figures, the free space is usually shown in white, and the non-free regions are shown in gray. The orthogonal lines drawn from the vertices of the input curves build a grid (FSD grid), whose cells are called the FSD cells.
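
The free space of a single FSD cell can be explored by sampling. The sketch below is a minimal Python illustration that assumes the Euclidean point metric and parameterizes each edge by a fraction in [0, 1] instead of arc length; the function names and the grid resolution are hypothetical choices.

```python
from math import dist


def point_on_segment(a, b, t):
    """Point at fraction t in [0, 1] along the segment from a to b."""
    return tuple(x + t * (y - x) for x, y in zip(a, b))


def is_free(seg_p, seg_q, s, t, eps):
    """True if the points at fractions s and t are within distance eps."""
    return dist(point_on_segment(*seg_p, s), point_on_segment(*seg_q, t)) <= eps


def sample_cell(seg_p, seg_q, eps, res=5):
    """Boolean grid over one FSD cell: rows follow seg_q, columns follow seg_p."""
    grid = [i / (res - 1) for i in range(res)]
    return [[is_free(seg_p, seg_q, s, t, eps) for s in grid] for t in grid]
```

For two parallel unit edges at distance 1 and eps = 1, only the diagonal of the sampled cell is free, which corresponds to walking both edges at the same speed.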

The reachable space (RS) [16] is a subset of the free space diagram for the Fréchet distance of two curves that consists of all the points for which a monotone path in the free space of the FSD exists from to .

Definition 5 (Transformed Free-Space Diagram)

An FSD where the edges of the input curves are scaled by arbitrary positive constant factors is a transformed free-space diagram. Since an FSD is constructed from a set of ellipses, the scaling only results in scaled ellipses and does not change the mapping of the points from the Euclidean plane into the parameter space.

A special transformed FSD called the deformed FSD was already defined for a variation of the Fréchet distance called the backward Fréchet distance, assuming the edges of input curves have weights [19].

Two families of curves with near-linear time algorithms for computing or approximating the Fréchet distance exist. A curve is -packed [16] if the total arc length of inside any ball of radius is at most . The time complexity of computing a -approximation of the Fréchet distance between -packed curves is  [16]. -Packed curves also have the property that for a given , the complexity of the RS is within a constant factor of the complexity of the RS for , for any .

The time complexity of computing the Fréchet distance at most between two curves of complexity with long edges is , where a curve with minimum edge-length is a curve with long edges [20].

L_p-based clustering problems are clusterings with cost equal to the L_p-norm of the distances between the points and their corresponding centers. For a real number , the cost of a set of points , is defined as is also the cluster center of in an L_p-based clustering. There is a -approximation algorithm using linear programming for computing the cost for fixed  [27]. For special cases such as and explicit mathematical formulas for exist which can be used to compute in linear time. Constant factor approximations for L_p-based clustering also exist [5].

Given a curve and a set of curves , the -norm of the Fréchet distances is defined as The -norm of the Fréchet distances may not satisfy the triangle inequality for different curves .

Based on this definition, given the optimal mapping between the points of the curves, it is possible to compute the corresponding curve by repeatedly finding the center of the mapped points by solving the cost optimization.

The minimum-link (min-link) path problem in a polygonal domain asks for a minimum-link -path (a path from to ) such that the number of bends is minimized and the path lies inside the polygon and does not go through a set of polygonal holes. This problem can be solved in time, where is the number of edges in the visibility graph of the polygon [28, 30].

3 Continuous Min-ε Estimation

We give a global min-ε estimation (Definition 1) algorithm and define two new concepts, namely the ply of curves and the normalized free-space diagram.

3.1 Extending Ply to General Dimensions

We define the ply of a curve in -dimensions , and compute it using a normalized FSD (Section 3.2).

Definition 6 (Generalized Ply)

For a curve, the directional ply is defined as the sum of the plies of the projections of the curve on each of the coordinate axes. In other words, the directional ply is the total number of times the slopes of the edges of the curve change with respect to the positive direction of each coordinate axis. We define the generalized ply of a curve as the maximum of its directional plies in the directions of its edges and the directions perpendicular to its edges. We denote the generalized ply of a curve by .

In the rest of the paper, we use the word ply to refer to the generalized ply.
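
One way to read the "number of direction changes" in Definition 6 is as the number of sign changes of the coordinate increments along each axis. The sketch below counts these for a polygonal curve given as a list of points. The function names and the decision to skip zero increments are assumptions, and the sketch does not take the maximum over edge directions that defines the generalized ply.

```python
def sign_changes(values):
    """Number of times a sequence switches between increasing and decreasing."""
    # +1 for an increasing step, -1 for a decreasing step, 0 for no movement.
    signs = [(b > a) - (b < a) for a, b in zip(values, values[1:])]
    signs = [s for s in signs if s != 0]  # ignore steps with no movement
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def axis_direction_changes(curve):
    """Sum over all coordinate axes of the direction changes of the projection."""
    dims = len(curve[0])
    return sum(sign_changes([point[i] for point in curve]) for i in range(dims))
```

Under this reading, a curve that is monotone along every axis yields zero, in line with the later remark that curves with ply zero are the monotone curves.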

Lemma 1

The number of holes (connected non-free regions) in a FSD of two curves and with distance is at most , assuming .

Proof

Consider the FSD of and . The holes in the FSD are created when two non-consecutive edges of one of the curves, for example , where , have distance at most from one edge of the other curve . This means that at some point between , had distance more than from and then changed its direction and moved back such that its distance from that same edge has become at most again. Since the slope of is fixed, this means that the slope of has changed. Therefore, the number of holes per edge is at most the number of times the other curve changes its direction. Summing these values gives the following upper bound on the number of holes:

3.2 Normalized Free-Space Diagram

FSD can only decide the Fréchet distance. To decide the existence of a curve with vertices, we introduce a transformed FSD called normalized FSD (Definition 7) such that a curve in FSD corresponds to a segment in the original space if the derivatives of any point on the curve with respect to each of the FSD axes (input curves) is the same. We formalize this in Lemma 2. Since we compute the slopes of the edges of -MC using the slopes of and , all slopes must be distinct and the signs of the slopes of the segments from each curve must be the same.

Definition 7 (Normalized Free-Space Diagram)

A normalized free-space diagram is a transformed free-space diagram in which the constant scaling factors are chosen such that any segment in this diagram corresponds to a segment in the Euclidean plane, unless the sign of the slope of a segment from at least one of the curves changes from positive to negative or vice versa. The distance between the mapped points of the input curves must be at most . A valid reparameterization is a monotone path in the free space that goes from the start vertex to the end vertex of the input curves.

The proof of Lemma 2 discusses how to find the constants in Definition 7, such that any segment in the original space is mapped to a segment in the parameter space.

Lemma 2

A segment in the normalized FSD corresponds to a segment in the original space (Euclidean plane), if the slopes of the edges of each input curve have the same sign, i.e. all positive slopes or all negative.

Proof

Assume and are two input curves, and we want to find a condition on a curve in the parameter space of that guarantees it will correspond to a polygonal curve in the Euclidean plane.

Choose an arbitrary segment from each of these curves. We want to change the mapping of the points of and to the axes of the FSD to keep the slope of constant along different segments. Let be the length of the curve from its start vertex to the point where the length of the curve reaches . So, the domain of is , where is the length of curve . Similarly, define and . Figure 1 shows unit vectors in the direction of for a segment of , respectively.

Figure 1: Unit vectors in the Euclidean plane (original space) on the left, and in the FSD (parameter space) on the right.

For each segment of the curves, we define a reparameterization. Let where , be a segment of curve with slope and -intercept . The reformulation of in terms of is given in the following formula: since using the derivatives of length variables:
Similarly, we reparameterize a segment of curve in terms of its length variable . The axes of the normalized FSD are and . So we need to compute the slope of the line segment from curve in terms of and : and the equation for is similar. This means that scaling each segment of the curve by a factor preserves the slope of in terms of in the Euclidean plane.

After scaling, the slope of will not change if the sign of the slopes of the segments of the input curves remains the same and they are in slope-based general position (Definition 2). However, the ellipses of FSD are scaled. ∎

Figure 2 shows the complexity of a monotone curve in FSD is different from its complexity in the Euclidean plane, as much as the total number of times each curve changes its direction with respect to one of the axes ( or ).

Figure 2: A curve with Fréchet distance at most from and and vertices that has vertices in FSD.

Lemma 3

The complexity of a monotone path with vertices in the normalized FSD of curves is at most in the original space.

Proof

The connectivity of regions in the FSD does not change under scaling transformations. So, if there is a path in the FSD, there is also a path in the normalized FSD or any other version of the FSD where each edge of the curves used to build the FSD is scaled by a positive constant factor. Scaling preserves the sign of the slopes, so the monotonicity of the path is also preserved.

In a normalized FSD, according to Lemma 2, the absolute values of the slopes of the edges are preserved. As a result, the only points of the path in the FSD that correspond to a vertex of the output curve are either the vertices of the path or when the sign of the edges of the original curves changed. There are vertices in the path and sign changes, so the overall complexity of the resulting curve is at most . ∎

Lemma 3 shows that if a monotone path exists in the FSD, it also exists in the normalized FSD and vice versa, so do their corresponding curves in the Euclidean plane. We assumed all curves share the same start and end points.

3.3 Continuous p-Mean Curve of Two Curves

The main idea for constructing a -MC is to build a normalized FSD, then compute a monotone path inside the FS with the minimum number of vertices. Using Lemma 3, this will give a curve with at most vertices. For curves with ply zero, i.e. monotone curves, according to Lemma 1, .

First, we find a set of candidate values for the Fréchet distance between the -MC and the other curves, then we use these values to build a normalized FSD and find a monotone path with vertices in the FS (Algorithm 2).

Lemma 4

The Fréchet reparameterization of curves and minimizes the distances to the -MC of and with .

Proof

Let and be a pair of points mapped to each other in the Fréchet mapping between and . Let be the point on which lies on the -MC of . The goal is to minimize the cost of -MC for these points: Then, we take the derivative of the above cost expression in terms of :
This is a minimum of the function, since for the derivative is positive and for smaller values it is negative. Substituting this value in the cost expression gives . This means that the minimum of also minimizes the cost expression. The maximum of for all pairs is the maximum distance in the reparameterizations of that realizes the Fréchet distance. ∎
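
The role of the middle point between two matched points, which Algorithm 2 uses when it builds the output curve, can be checked numerically. The sketch below assumes the candidate point lies on the segment between the two matched points at fraction t, and it uses p = 2 and p = 3 as example exponents; these choices and the function names are for illustration only.

```python
def pair_cost(t, length, p):
    """Sum of p-th powers of the distances from a point at fraction t
    of a segment of the given length to the two endpoints."""
    return (t * length) ** p + ((1 - t) * length) ** p


def best_fraction(length, p, steps=1000):
    """Grid search for the fraction t in [0, 1] with the smallest pair_cost."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda t: pair_cost(t, length, p))
```

For p = 1 every point on the segment has the same cost, so the midpoint is one of many minimizers in that case; for the larger exponents tried here the grid search returns the midpoint.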

In a simplification algorithm, a shortcut is a segment that replaces a part of the curve starting and ending at vertices of that curve.

Theorem 3.1

The events for the min- local simplification of a curve are a subset of the events used when computing the Fréchet distance between and all the simplifications of with only one shortcut.

Proof

For a curve and its simplification , the Fréchet distance between and is the distance between two points . There are three types of events for computing the Fréchet distance: vertex-vertex events, monotonicity events, and vertex-edge events. The vertices of are a subset of the vertices of , so the vertex-vertex events of are a subset of the vertex-vertex events of . The vertex-edge events where a vertex of and an edge of is chosen are also a subset of the vertex-edge events of and itself. New events are the ones between the vertices of and the edges of . But these events are a subset of the events for the simplifications of with vertices. The monotonicity events happen between two vertices of one curve and a segment of the other curve. Removing a vertex can create a new event between the shortcut and two vertices from the other curve, but these events are a subset of events for curves with one shortcut. ∎

Lemma 5

The time complexity of computing all the events for computing the Fréchet distance between a curve and its simplification is .

Proof

Using Theorem 3.1, other than the events for computing the Fréchet distance between a curve and itself, all the events are from simplification with vertices. The number of simplifications with only a single shortcut is , since we only need to find the endpoints of the shortcut. Let . Computing the events for the Fréchet distance takes time. So, overall, the algorithm takes time and space. ∎

Lemma 6

The events for global simplification and the events for estimation of a curve are the same.

Proof

The proof of Lemma 4 also shows that the mappings for global min-ε simplification and min-ε estimation are the same. Since the mappings are the same, the events that give the optimal cost for these problems are also the same. ∎

Input: A free-space diagram of curves and
Output: A monotone path from to
1: Convert the FS into a polygon by connecting consecutive intersection points of the FS inside each cell with the boundary of the cell, if the segment connecting them falls inside the FS.
2: Connect the vertices of the FS polygon and the holes to each other and extend it until it reaches the boundary of the polygon or the boundary of a hole.
3: Build the intersection graph of the segments from the previous step and add directions to edges in the positive direction of slopes and, in the case of zero slopes, in the direction of increasing indices of the horizontal curve.
4: Compute the unweighted shortest path from the start vertex to the end vertex in the graph from the previous step.
Algorithm 1 Monotone Minimum-Link Path in a Free-Space Diagram

Input: Two trajectories , a constant
Output: A trajectory
1: = The values used in computing the Fréchet distance of (using [4]).
2: repeat binary search on
3:      Build a normalized FSD for using .
4:      Find the min-link shortest path in , using Algorithm 1.
5: until the smallest is found for which a path of length is found.
6: Build a curve by connecting the middle point of the mapped points of the path in the FSD in the Euclidean plane.
Algorithm 2 Continuous p-Mean Curve
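
Two generic building blocks of Algorithms 1 and 2 can be sketched in Python: a breadth-first search for the unweighted shortest path (the last step of Algorithm 1) and a binary search over sorted candidate distances (the loop of Algorithm 2). The graph representation, the function names, and the `feasible` predicate are hypothetical stand-ins; building the normalized FSD and its intersection graph is not shown. The binary search assumes that feasibility is monotone, i.e. once a candidate value admits a short enough path, every larger value does as well.

```python
from collections import deque


def fewest_edges(graph, source, target):
    """Number of edges on an unweighted shortest path, or None if unreachable.

    graph maps each node to an iterable of its out-neighbours (directed).
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return hops[node]
        for neighbour in graph.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return None


def smallest_feasible(candidates, feasible):
    """Smallest value in the sorted list `candidates` with feasible(value) True.

    Returns None if no candidate is feasible. Assumes monotone feasibility.
    """
    lo, hi = 0, len(candidates) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if feasible(candidates[mid]):
            best = candidates[mid]
            hi = mid - 1  # look for an even smaller feasible value
        else:
            lo = mid + 1
    return best
```

In the setting of Algorithm 2, `feasible` would build the normalized FSD for the candidate value and compare the result of `fewest_edges` against the allowed path length.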

In Figure 3, the steps of Algorithm 2 are shown.

Figure 3: An example of Algorithm 2 for two curves on the left and their FSD on the right. The polygon inside the FS is shown in green. Only a subset of the edges of the intersection graph is shown. The dashed curve is in the parameter space.

Lemma 7

The curve returned by Algorithm 2 has at most vertices, where is the complexity of the -MC of and with plies and , respectively.

Proof

Assume is the number of vertices of the optimal -MC. Then, the min-link monotone path in the normalized FSD has complexity at most , since it ignores the changes in the sign of the slope of the edges of the path.

The change in the signature of the slopes of the -MC is at most the sum of the changes in the signature of the slopes of the input curves since either the output will change its direction with one of the input curves or it will not. Each of these events corresponds to at most one extra vertex in the computed curve.

When converting the FS to a polygon with polygonal holes, each curve which is replaced by a segment in the algorithm increases the complexity of the path by at most one, since if the optimal path enters this removed part, it can be replaced by the path that shortcuts the removed part.

In a polygon, for any minimum link path of length , there exists a min-link path of length that passes through the vertices of the polygon or its holes, based on the properties of the visibility graph of the polygon. ∎

Lemma 8

In Algorithm 2, for , the cost of equals the -MC.

Proof

Since there are only two curves, the points of the curve that minimizes the norm of the Fréchet distances to and lie on the line that connects the corresponding points of the curves that are mapped to each other in the optimal reparameterization, as proved in Lemma 4.

Curve is a polygonal curve, since it is the result of connecting the middle point of the points of two segments (one edge from each of the curves). ∎

Lemma 9

Algorithm 2 takes time, where .

Proof

Computing the events of the Fréchet distance between takes time. So, the number of values found for is also . The algorithm computes a normalized FSD for each of the values for , so it takes time.

The overall complexity of the diagram is , which is also the number of vertices in the intersection graph. The number of edges is , so computing the shortest path takes . Computing the intersection of a line with a polygon with vertices takes time, so the time complexity of computing the graph is . ∎

4 p-Mean Trajectory of a Set of Curves

4.1 The Pairwise Algorithm for Discrete p-Mean Curve

The p-MC of a set of curves is not unique. So, dividing the computation of the p-MC with at most k vertices into computing the Fréchet distance of a set of curves and simplifying that curve does not yield the optimal solution. Algorithm 3 computes an approximate p-MC. We use this algorithm in later sections as a subroutine. For the continuous version, we use the algorithm of [18].

Input: A set of trajectories
Output: A p-mean trajectory
1:
2:
3: min-ε simplification of (using [7]).
Algorithm 3 Pairwise Algorithm
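
Following the description in the proof of Lemma 11, the selection step of the pairwise algorithm tries every input curve as the center and keeps the one with the smallest sum of p-th powers of distances. The sketch below assumes a user-supplied `distance` function (for example a Fréchet distance routine) and omits the final simplification step; all names are illustrative and not taken from the paper.

```python
def pairwise_center(curves, distance, p):
    """Index of the input curve minimizing the L_p cost to all inputs.

    Returns (best_index, lp_cost). `distance` is any symmetric function
    on pairs of curves; p is a finite exponent >= 1.
    """
    n = len(curves)
    # Matrix of all pairwise distances between input curves.
    D = [[distance(curves[i], curves[j]) for j in range(n)] for i in range(n)]
    # The p-th root is monotone, so comparing sums of p-th powers suffices.
    power_costs = [sum(d ** p for d in row) for row in D]
    best = min(range(n), key=power_costs.__getitem__)
    return best, power_costs[best] ** (1.0 / p)
```

The selected curve would then be simplified to the allowed number of vertices, which is the last step of Algorithm 3.
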
Lemma 10

Algorithm 3 is a -approximation for discrete -MC.

Proof

Assume is the curve that has the optimal solution as its simplification and let be an optimal min- simplification of . Since is an optimal simplification, then Since -MC is also a simplification for , its distance to is at least as much as the optimal simplification. Using triangle inequality of norms, the approximation factor is proved:

Lemma 11

The time complexity of Algorithm 3 is for discrete simplification.

Proof

Computing the Fréchet distance of two curves takes time. Testing each curve as the center and computing the norm of the Fréchet distance of all curves requires distance computations between each pair of curves. This takes time. Finding the minimum takes time. Computing the simplification with vertices takes time [7]. ∎

Note that in Algorithm 3, while distances in matrix satisfy the triangle inequality, their -th power does not. So, approximation algorithms based on triangle inequality cannot be used to prune away large distances in .

4.2 Continuous p-Mean of Curves

Algorithm 4 finds an approximation of the continuous p-MC.

Input: A set of curves , an integer
Output: A path with vertices
1: .
2: while  do
3:      .
4:      Group the curves of into pairs and run Algorithm 2 on each pair and add their p-MC to .
5:      
6: end while
7: Let be the only member of after running Algorithm 2 with .
Algorithm 4 Continuous p-Mean Curve of curves
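
The pairing loop of Algorithm 4 behaves like a tournament: each round merges curves in pairs until a single curve remains. In the sketch below, `merge_pair` is a hypothetical stand-in for Algorithm 2, and carrying an unpaired curve over to the next round is an assumption, since the handling of odd counts is not spelled out here.

```python
def reduce_pairwise(curves, merge_pair):
    """Repeatedly merge curves in pairs until one representative is left.

    `curves` must be non-empty. `merge_pair(a, b)` returns one curve that
    represents both a and b.
    """
    current = list(curves)
    while len(current) > 1:
        merged = [
            merge_pair(current[i], current[i + 1])
            for i in range(0, len(current) - 1, 2)
        ]
        if len(current) % 2 == 1:
            # Assumption: an unpaired curve advances unchanged.
            merged.append(current[-1])
        current = merged
    return current[0]
```

Each round roughly halves the number of curves, so the number of rounds is logarithmic in the number of input curves and the total number of merges is one less than the number of inputs.
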
Corollary 1

Algorithm 4 is a -approximation, using Lemmas 8 and 13.

Lemma 12

The time complexity of Algorithm 4 is , where .

Proof

The recursion tree of the algorithm has height , and each node runs an instance of Algorithm 2, which according to Lemma 9 takes time. So, there are at most nodes in the tree, and the total running time of the algorithm is .

5 An Algorithm for p-Mean of Curves

Algorithm 5 simplifies the input curves with error less than their distances to the optimal p-MC, then it computes an approximate p-MC.

Input: A set of curves , an integer , an -approximate min-ε simpl./estim. algorithm
Output: An approximate p-mean curve
1: for  do
2:       an approximate min-ε simpl./estim. of .
3: end for
4: a p-MC of using Algorithm 3 (disc.) or Algorithm 4 (cont.).
Algorithm 5 p-Mean Algorithm

Lemma 13

The approximation factor of Algorithm 5 is , if an exact simpl./estim. algorithm is used .

Proof

denotes the -mean of curves computed by the algorithm, and denotes the optimal solution. is a simpl./estim. with error equal to the minimum error of simpl./estim.s of with at most vertices, so it has a distance less than any other curve, including : Since is the -MC with minimum cost for curves , it has a lower cost than . Using triangle inequality of Fréchet distance:

Theorem 5.1

The approximation factor of Algorithm 5 is .

Proof

Replacing approximation factor for computing from in the proof of Lemma 13, gives the approximation factor:

Theorem 5.2

Algorithm 5 takes time for discrete -MC.

Proof

The total time complexity of curve simplification of curves is . Algorithm 3 takes . Summing up the aforementioned times gives . ∎

Theorem 5.3

Algorithm 5 takes time for continuous -MC.

Proof

Based on Lemma 12, Algorithm 4 takes time, which is used times in the first step of the algorithm. Computing the -mean of curves takes . Since , this time is dominated by the time of the first step. So, the running time of the algorithm is . ∎

The Fréchet reparameterization of a curve with long edges can be computed in time via piecewise orthogonal matching [20]. So, the time complexity of Algorithm 5 on such curves reduces to .

Acknowledgements

The authors would like to thank Professor Carola Wenk for reviewing parts of the paper and for the useful discussions.

References

  • [1] Abam, M.A., De Berg, M., Hachenberger, P., Zarei, A.: Streaming algorithms for line simplification. Discrete Comput. Geom. 43(3), 497–515 (2010)
  • [2] Agarwal, P.K., Har-Peled, S., Mustafa, N.H., Wang, Y.: Near-linear time approximation algorithms for curve simplification. Algorithmica 42(3-4), 203–219 (2005)
  • [3] Ahn, H.K., Alt, H., Buchin, M., Oh, E., Scharf, L., Wenk, C.: A middle curve based on discrete Fréchet distance. In: Latin American Symp. Theoret. Informatics. pp. 14–26. Springer (2016)
  • [4] Alt, H., Godau, M.: Computing the Fréchet distance between two polygonal curves. Int. J. of Comput. Geom. Appl. 5(01n02), 75–91 (1995)
  • [5] Bateni, M., Bhaskara, A., Lattanzi, S., Mirrokni, V.: Distributed balanced clustering via mapping coresets. In: Adv. in Neural Info. Process. Syst. pp. 2591–2599 (2014)
  • [6] Bringmann, K.: Why walking the dog takes time: Fréchet distance has no strongly subquadratic algorithms unless SETH fails. In: Annu. IEEE Sympos. Found. Comput. Sci. pp. 661–670. IEEE (2014)
  • [7] Bringmann, K., Chaudhury, B.R.: Polyline simplification has cubic complexity. arXiv preprint arXiv:1810.00621 (2018)
  • [8] Bringmann, K., Mulzer, W.: Approximability of the discrete Fréchet distance. Comput. Geom. 7(2), 46–76 (2015)
  • [9] Buchin, K., Chun, J., Markovic, A., Meulemans, W., Löffler, M., Okamoto, Y., Shiitada, T.: Folding free-space diagrams: computing the Fréchet distance between 1-dimensional curves. In: Proceedings of the 33rd Annu. ACM Sympos. Comput. Geom. Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2017)
  • [10] Buchin, K., Buchin, M., Konzack, M., Mulzer, W., Schulz, A.: Fine-grained analysis of problems on curves. EuroCG, Lugano, Switzerland (2016)
  • [11] Buchin, K., Buchin, M., van Kreveld, M., Löffler, M., Silveira, R.I., Wenk, C., Wiratma, L.: Median trajectories. Algorithmica 66(3), 595–614 (Jul 2013)
  • [12] Buchin, K., Driemel, A., Gudmundsson, J., Horton, M., Kostitsyna, I., Löffler, M., Struijs, M.: Approximating (k,ℓ)-center clustering for curves. In: Proceedings of the 30th ACM-SIAM Sympos. Discrete Algorithms. pp. 2922–2938. SIAM (2019)
  • [13] Buchin, K., Driemel, A., Struijs, M.: On the hardness of computing an average curve. arXiv preprint arXiv:1902.08053 (2019)
  • [14] Buchin, K., Ophelders, T., Speckmann, B.: SETH says: Weak Fréchet distance is faster, but only if it is continuous and in one dimension. In: Proceedings of the 30th Annu. ACM Sympos. Comput. Geom. pp. 2887–2901. SIAM (2019)
  • [15] Chambers, E., Kostitsyna, I., Löffler, M., Staals, F.: Homotopy measures for representative trajectories. In: Inform. Process. Lett. vol. 57. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2016)
  • [16] Driemel, A., Har-Peled, S., Wenk, C.: Approximating the Fréchet distance for realistic curves in near linear time. Discrete Comput. Geom. 48(1), 94–127 (2012)
  • [17] Driemel, A., Krivošija, A., Sohler, C.: Clustering time series under the Fréchet distance. In: Proceedings of the 27th ACM-SIAM Sympos. Discrete Algorithms. pp. 766–785. Society for Industrial and Applied Mathematics (2016)
  • [18] Dumitrescu, A., Rote, G.: On the Fréchet distance of a set of curves. In: Canad. Conf. Computat. Geom. pp. 162–165 (2004)
  • [19] Gheibi, A., Maheshwari, A., Sack, J.R.: Weighted minimum backward Fréchet distance. Theoret. Comput. Sci. 783, 9–21 (2019)
  • [20] Gudmundsson, J., Mirzanezhad, M., Mohades, A., Wenk, C.: Fast Fréchet distance between curves with long edges. In: Proceedings of the 3rd Internat. Workshop on Interactive and Spatial Computing. pp. 52–58. ACM, New York, NY, USA (2018)
  • [21] Har-Peled, S., Raichel, B.: The Fréchet distance revisited and extended. ACM Trans. Algorithms 10(1),  3 (2014)
  • [22] Imai, H., Iri, M.: Polygonal approximations of a curve—formulations and algorithms. In: Machine Intelligence and Pattern Recognition, vol. 6, pp. 71–86. Elsevier (1988)
  • [23] van de Kerkhof, M., Kostitsyna, I., Löffler, M., Mirzanezhad, M., Wenk, C.: Global curve simplification. In: Proceedings of the 27th Annu. European Sympos. Algorithms. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2019)
  • [24] van Kreveld, M., Löffler, M., Staals, F.: Central trajectories. arXiv preprint arXiv:1501.01822 (2015)
  • [25] van Kreveld, M., Wiratma, L.: Median trajectories using well-visited regions and shortest paths. In: Proceedings of the 19th ACM SIGSPATIAL Internat. Conf. Advances Geogr. Inform. Syst. pp. 241–250. ACM (2011)
  • [26] van Kreveld, M., Löffler, M., Wiratma, L.: On optimal polyline simplification using the Hausdorff and Fréchet distance. In: Proceedings of the 34th Annu. ACM Sympos. Comput. Geom. vol. 99, pp. 56:1–56:14. Leibniz International Proceedings in Informatics (LIPIcs) (2018)
  • [27] Lin, J.H., Vitter, J.S.: Approximation algorithms for geometric median problems (1992)
  • [28] Mitchell, J.S., Rote, G., Woeginger, G.: Minimum-link paths among obstacles in the plane. Algorithmica 8(1-6), 431–459 (1992)
  • [29] Rosman, G., Volkov, M., Feldman, D., Fisher III, J.W., Rus, D.: Coresets for k-segmentation of streaming data. In: Adv. in Neural Info. Process. Syst., pp. 559–567. Curran Associates, Inc. (2014)
  • [30] Toth, C.D., O’Rourke, J., Goodman, J.E.: Handbook of discrete and computational geometry. Chapman and Hall/CRC (2017)