1. Introduction
Image vectorization algorithms date back to the early 1990s and are among the core tools in vector processing software including Adobe Illustrator (Live Trace), CorelDRAW (PowerTRACE), and Inkscape. Despite their wide adoption in industry, algorithms for line drawing vectorization remain under active development and still admit major shortfalls [Noris et al., 2013; Favreau et al., 2016]. In several industries where vectorization is heavily needed, including traditional animation and engineering design, this task is frequently done manually, by painstakingly tracing a scanned image with drawing tools. This process is often considered to take less time than editing the automatic result from commercial vectorization tools.
A primary reason for frustration with line drawing vectorization algorithms is incorrect treatment of junctions, resulting in wrong topology, or connectivity (Fig. 2(a,b)). Image understanding and perception rely on junctions and drawing topology to disambiguate depth and other cues [Xia et al., 2014]. In industries such as character animation, incorrect topology yields temporal incoherence and makes modern automatic coloring or inbetweening tools unusable [Whited et al., 2010; Orzan et al., 2013]. In engineering-oriented industries, incorrect topology may be considered an incorrect result overall, because it may not correspond to a physically realizable object.
The main challenge when disambiguating junctions in line drawings is noisy or insufficient local information, even for clean images [Noris et al., 2013]. The presence of noise, such as uneven curve edges, complicates matters even further, and the widely used local approach to resolving junctions based on a one-pixel-wide image skeleton becomes unreliable [Favreau et al., 2016].
A recent method by Favreau et al. [2016] uses global information to resolve ambiguities at junctions. Their method successfully vectorizes sketches with numerous overdrawn strokes, where heavy simplification of the result is needed. Unfortunately, for inputs requiring fidelity, their approach can lead to oversimplified results significantly deviating from the drawn contours (Figs. 2(b) and 16).
While junctions may theoretically have various valences, as noted by previous work [Noris et al., 2013], the vast majority of junctions are X- and T-junctions (Fig. 1). Occlusion contours typically generate T-junctions, making them crucial for 3D shape perception [Kanizsa, 1979; Bessmeltsev et al., 2015]. Hence, correct resolution of X- and T-junctions is a primary concern during image vectorization.
With these challenges in mind, in this paper we propose a robust image tracing method that is true to the image in unambiguous regions, with global treatment of T- and X-junctions even when local information is unclear (Fig. 2, right). Our technical innovation is to use frame fields to guide vectorization. Frame fields attach two pairs of vectors to each point on the plane. They have recently been used to generate anisotropic quadrilateral meshes and to estimate 3D normals from 2D sketches [Panozzo et al., 2014; Iarussi et al., 2015]. Although frame fields are natural for tracking the orientations of curves meeting at sharp junctions, to our knowledge they have never been applied to image vectorization.

Overview.
As illustrated in Figure 1, the general idea of our method is to find a smooth frame field on the image plane, where at least one direction is aligned with nearby contours of the drawing. Around X- or T-shaped junctions, the two directions of the field will be aligned with the two intersecting contours. Then, we extract the topology of the drawing by tracing the frame field and grouping traced curves into strokes. Finally, we create a vectorization aligned with the frame field and with the extracted topology.
2. Related Work
Our work builds upon achievements in three areas: image vectorization, junction detection, and frame fields. A comprehensive review of these areas is outside the scope of this paper; here, we instead highlight work relevant to our proposed pipeline.
Frame fields.
Our algorithm is built upon the construction of frame fields that assign two directions to every point in a planar region; these directions will guide our placement of strokes. Unlike cross fields, frame fields have no constraint on orthogonality or length of the direction vectors. We refer the reader to the recent survey by Vaxman et al. [2016] for a broad discussion.
While cross fields have been extensively used in computer graphics [Kass and Witkin, 1987; Hertzmann and Zorin, 2000; Palacios and Zhang, 2007], representations of frame fields and algorithms for their computation are relatively recent [Panozzo et al., 2014; Diamanti et al., 2015]. They serve as a natural representation of linear transformations on tangent spaces of a surface. Frame fields were originally proposed for guiding anisotropic quad meshing via inversion-free mesh parameterization [Panozzo et al., 2014]. Since then, frame fields have found additional applications, such as inferring 3D normals from a 2D sketch [Iarussi et al., 2015] and recovery of damaged historical documents [Pal et al., 2016].

Our work is driven by the frame field synthesis and interpolation tool set developed in [Panozzo et al., 2014; Diamanti et al., 2015]. Namely, we use their definition and representation of a PolyVector field, as described in Section 3.1. The BendFields algorithm proposed by Iarussi et al. [2015] inspired some aspects of our approach. While their algorithm is targeted at 3D surface reconstruction from curvature lines, they initially generate a frame field aligned to directions in a bitmap image. Their goal, however, is to compute a frame field in the space between
the input curves, while we solve for a frame field defined exclusively on dark pixels. This difference gives our method a significant performance boost by reducing the number of degrees of freedom, and it qualitatively affects the results near junctions with sharp angles. Due to differences in application, the formulation and weighting of their alignment term differ from ours (Fig. 3), and our use of PolyVectors involves only real-valued variables per pixel rather than requiring a mixed-integer solver.

Image vectorization.
Vectorization of bitmapped images has been studied extensively in graphics, vision, and other disciplines. Various input- and application-specific priors guide many vectorization methods, conforming to requirements of end users in medical imaging, road map reconstruction from GPS traces, processing of astronomical imagery, and other tasks [Türetken et al., 2013; Chai et al., 2013; Bo et al., 2016]. These methods are application-specific and cannot be applied directly to vectorization of hand-drawn line drawings. Other vectorization methods deal with shaded images, like photographs or cartoon images [Zhang et al., 2009; Lecot and Lévy, 2006; Orzan et al., 2013]; their focus is to capture an image using simple colored primitives, which are typically assumed to be closed.
We focus on reconstruction of line drawings without shading, where lines may or may not be closed. In this area, strong priors about line shape, e.g. that lines only form circles or straight lines, might bring simplicity to vectorization of technical drawings [Hilaire and Tombre, 2006], but do not apply to freeform line drawings.
For vectorization of curvy line drawings, existing methods vary by the amount of noise allowed in the input. Noisy line drawings with multiple overlapping strokes or hatching patterns require deviation from the drawn image in favor of simplicity [Bartolo et al., 2007; Favreau et al., 2016]. Guided by a similar motivation, De Goes et al. [2011] propose a method to extract a simplified curve network for noisy drawings. While their approach is natural for drawings with very fuzzy lines and significant noise, such behavior may not be desired for higher-quality drawings without overlapping strokes, which require more precise vectorization (Fig. 2(a)).
On the other side of the spectrum is an image vectorization method tailored for clean cartoon drawings by Noris et al. [2013]. Their global approach to topology allows them to, for instance, correctly disambiguate nearby parallel strokes. Their treatment of junctions, however, is still local and may result in incorrect or imprecise connectivity (Fig. 2(b)). Furthermore, the discrete nature of the algorithm renders it unstable in the presence of noise (Fig. 15).
Recent work by Donati et al. [2017] explores accurate vectorization of noisy sketches using Pearson's correlation coefficient with Gaussian kernels. While their method achieves impressive performance and is able to process sketches with multiple overlapping strokes, it makes no effort to correctly disambiguate junctions or parallel lines, or to extract the overall drawing topology. Instead, they rely on the topology of a one-pixel-wide skeleton, which is known to be prone to local artifacts [Favreau et al., 2016]. In contrast, we resolve junctions and parallel lines by generating a frame field, and use it to explicitly extract drawing topology.
A line of work close to our method uses tangent fields for image processing and vectorization [Chen et al., 2013; Kang et al., 2007; Chen et al., 2015]. For instance, Chen et al. [2015] propose an image vectorization method with a global variational approach to disambiguation of junctions. The primary issue with these approaches is the use of tangent fields, which cannot capture the collection of directions present at a junction point (Fig. 4). As a result, the method by Chen et al. [2015] relies on user interaction and arbitrary thresholds to resolve junctions. Their method also does not consider the topology of the drawing, potentially yielding disconnected lines and/or spurious connections.
Building on this work, our method uses a more natural representation to track junctions in the drawings: a frame field defined at each stroke pixel. The two directions of the frame field efficiently disambiguate directions around T- or X-junctions, and the variational nature of our approach makes it resistant to noise.
Corner and junction detection in images.
Corner detection is a basic step in classical computer vision pipelines [Szeliski, 2010]; for example, the well-known Harris corner detector [Harris and Stephens, 1988] is implemented in countless industry-standard vision libraries. The goal of these methods is typically to detect and characterize salient features, while for vectorization it is more important to calculate the exact center of the junction and to robustly estimate the directions of the joining lines. Furthermore, even if it is possible to identify junction points, image gradient directions near corners and junctions are often noisy, making it difficult to estimate the individual directions meeting at a junction point using purely local information.

3. Algorithm
Our system takes as input a grayscale bitmap line drawing and produces a set of strokes aligned to the drawing. We first solve an optimization problem to compute a frame field at each pixel in a narrow band around the set of stroke pixels, designed to capture the directionality of the input and to superpose multiple directions near junctions. We then extract the topology of the drawing by tracing the frame field and grouping curves into strokes. Finally, we compute the final vectorization (Fig. 1).
3.1. Designing Frame Fields
Our vectorization algorithm begins by computing a smooth frame field, such that at every point near a stroke, at least one field direction is aligned with a nearby curve tangent. Near T- and X-junctions, our field will align to both tangent directions present nearby in the image; this provides the flexibility needed to resolve image behavior near junctions (Fig. 5). We formulate computation of the field as a variational problem that, after discretization, can be approached using standard algorithms for nonlinear unconstrained optimization.
Initial steps.
We start by thresholding the image to isolate those pixels involved in the line drawing. In particular, our algorithm will operate on a subset of dark pixels $\bar\Omega$, those with intensity less than a fixed fraction of $I_{\max}$, the maximum image intensity. We additionally estimate a noisy tangent field $g(\mathbf{x})$ from the drawing by taking its Sobel gradient with a kernel of size 3 and rotating it by $90^\circ$. Note that, similarly to the discussion in [Zhang et al., 2007], the direction of this rotation (clockwise vs. counterclockwise) will not affect the computation of the frame field, which always couples forward and backward directions.
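For illustration, this preprocessing step can be sketched as follows, assuming a grayscale image stored as a NumPy array; the threshold fraction `t` below is a placeholder, not the setting used in our experiments.

```python
# Sketch of the preprocessing step: threshold dark pixels and estimate a
# noisy tangent field by rotating the Sobel gradient by 90 degrees.
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def correlate3x3(img, kernel):
    """3x3 correlation with edge replication (no SciPy dependency)."""
    padded = np.pad(img, 1, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0],
                                           dx:dx + img.shape[1]]
    return out

def dark_pixels_and_tangents(img, t=0.5):
    """Return the dark-pixel mask and a complex tangent field."""
    mask = img < t * img.max()
    gx = correlate3x3(img, SOBEL_X)
    gy = correlate3x3(img, SOBEL_Y)
    grad = gx + 1j * gy   # gradient encoded as a complex number
    tangent = 1j * grad   # rotate by 90 degrees; the sign choice is irrelevant
    return mask, tangent
```

The rotation sign is arbitrary here, matching the observation above that the frame field computation couples forward and backward directions.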
Representation and variational problem.
Following Diamanti et al. [2015], we represent the unknown frame field as a PolyVector field. Suppose we are given two directions $u, v$ representing curve tangents of the drawing near a given pixel; we can identify the image plane with the complex numbers and take $u, v \in \mathbb{C}$ as complex vectors. Consider the following complex polynomial $f(z)$:
$$f(z) = (z^2 - u^2)(z^2 - v^2) = z^4 + c_2 z^2 + c_0. \qquad (1)$$
Here, the constants $c_0 = u^2 v^2$ and $c_2 = -(u^2 + v^2)$ determine $u$ and $v$ up to relabeling and sign. That is, every pair $(c_0, c_2)$ uniquely determines a frame $\{\pm u, \pm v\}$, agnostic to the labeling of $u$ vs. $v$ as well as their sign. We use $f_c$ to denote the function above parameterized by the two coefficients $c_0$ and $c_2$.
Recovering the frame directions $u, v$ from $(c_0, c_2)$ is equally straightforward:
$$u^2, v^2 = \frac{-c_2 \pm \sqrt{c_2^2 - 4 c_0}}{2}, \qquad u = \pm\sqrt{u^2}, \quad v = \pm\sqrt{v^2}. \qquad (2)$$
Of course, the relationship on the right is non-unique.
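The correspondence between frame directions and coefficients in (1)-(2) can be checked numerically; below is a small sketch with arbitrary sample directions.

```python
# Round-trip between frame directions (u, v) and PolyVector coefficients
# (c0, c2), following f(z) = (z^2 - u^2)(z^2 - v^2) = z^4 + c2 z^2 + c0.
import cmath

def coeffs_from_frame(u, v):
    """(u, v) -> (c0, c2), by expanding the product of the two quadratics."""
    return (u * u) * (v * v), -(u * u + v * v)

def frame_from_coeffs(c0, c2):
    """Solve w^2 + c2*w + c0 = 0 in w = z^2, then take one square root each.
    The recovered pair matches (u, v) only up to sign and ordering."""
    disc = cmath.sqrt(c2 * c2 - 4 * c0)
    u2, v2 = (-c2 + disc) / 2, (-c2 - disc) / 2
    return cmath.sqrt(u2), cmath.sqrt(v2)

u, v = 1.0 + 0j, cmath.exp(2j)   # two arbitrary sample directions
c0, c2 = coeffs_from_frame(u, v)
ur, vr = frame_from_coeffs(c0, c2)
# (ur, vr) agrees with (u, v) up to relabeling and sign.
```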
Optimizing for a pair $(u, v)$ per pixel induces challenging issues involving labeling and sign; for example, this representation in the BendFields algorithm requires the use of a mixed-integer solver [Iarussi et al., 2015]. To avoid this complexity, we instead optimize for a pair $(c_0, c_2)$ per pixel, which has no sign or ordering ambiguity. That is, the unknown in our optimization technique is a pair of complex-valued functions $c_0(\mathbf{x}), c_2(\mathbf{x})$.
We propose the following variational problem:
$$\min_{c_0, c_2} \int_{\bar\Omega} \Big[ \big| f_{c(\mathbf{x})}(\tau(\mathbf{x})) \big|^2 + s \big( \|\nabla c_0(\mathbf{x})\|^2 + \|\nabla c_2(\mathbf{x})\|^2 \big) + \lambda \big\| c(\mathbf{x}) - \tilde c(\mathbf{x}) \big\|^2 \Big] \, d\mathbf{x}, \qquad (3)$$

where $c = (c_0, c_2)$ and $\tilde c(\mathbf{x})$ denotes the PolyVector coefficients of the frame $(\tau(\mathbf{x}), i\,\tau(\mathbf{x}))$.
Here, $\tau(\mathbf{x})$ encodes the direction of the tangent vector $g(\mathbf{x})$, i.e. in complex language we can write $\tau(\mathbf{x}) = g(\mathbf{x}) / |g(\mathbf{x})|$.
Details about the individual terms in (3) are below, in the order they appear:


Alignment: The first optimization term enforces alignment of the frame field with the tangent directions. This term is small when the polynomial $f_{c(\mathbf{x})}$ has a root near $\tau(\mathbf{x})$, implicitly implying that one of the field directions is aligned with the tangent direction. Since (1) has no odd-degree terms, this term has no dependence on the sign of $\tau(\mathbf{x})$, as desired.
Smoothness: The second optimization term is a Dirichlet energy measuring the smoothness of the functions $c_0$ and $c_2$ as functions of the location in the image. Smoothly-varying pairs $(c_0, c_2)$ imply a smooth set of frame directions. We use the same value of the smoothness coefficient $s$ in all our experiments; while the method is fairly stable to the choice of $s$, larger values may be desirable for particularly noisy inputs.

Regularization: Away from junctions, there is only one prominent direction in the image. To prevent the frame field from collapsing into a line field, the regularization term expresses a slight preference for the field to be aligned with the frame $(\tau(\mathbf{x}), i\,\tau(\mathbf{x}))$ in the absence of other information.
To improve results by attenuating the influence of noisy tangent directions near junctions, we weigh the smoothness term by $w_s(\mathbf{x})$ and the alignment term by $w_a(\mathbf{x})$, where

$$w_a(\mathbf{x}) = \Big\| \int_{N(\mathbf{x})} \tau(\mathbf{y})^2 \, d\mathbf{y} \Big\|, \qquad (4)$$

$$w_s(\mathbf{x}) = 2 - w_a(\mathbf{x}). \qquad (5)$$
Here, $N(\mathbf{x})$ denotes a small (one-pixel) neighborhood of $\mathbf{x}$; in practice, we approximate this integral by averaging the values of $\tau$ at the pixels adjacent to the pixel centered at $\mathbf{x}$.
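As an illustration of this neighborhood averaging, the sketch below computes a coherence-style confidence that is high where nearby tangents agree and low where they conflict (e.g., near junctions). The specific formula is an assumption for illustration only, not the exact weighting used above.

```python
# Illustrative coherence weight: average the *squared* unit tangents over a
# one-pixel neighborhood (squaring removes the +/- sign ambiguity of
# tangents), then take the magnitude of the average. Values near 1 indicate
# locally consistent tangents; small values indicate conflicting directions.
import numpy as np

def alignment_confidence(tangent):
    """tangent: 2D complex array; returns an array of values in [0, 1]."""
    mag = np.abs(tangent)
    safe = np.where(mag > 0, mag, 1.0)
    unit2 = np.where(mag > 0, (tangent / safe) ** 2, 0.0)
    padded = np.pad(unit2, 1, mode='edge')
    h, w = unit2.shape
    # 3x3 box average of the squared unit tangents
    avg = sum(padded[dy:dy + h, dx:dx + w]
              for dy in range(3) for dx in range(3)) / 9.0
    return np.abs(avg)
```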
Optimization.
We apply standard finite-difference discretization to evaluate the objective function (3) on a pixel grid. The end result is a quadratic, unconstrained optimization problem for a pair $(c_0, c_2)$ per pixel. We use "natural" (Neumann) boundary conditions, essentially evaluating the gradient-based smoothness term only on pairs of adjacent pixels that are both included as degrees of freedom in the numerical problem.
We use the L-BFGS algorithm for optimization [Nocedal and Wright, 2006], with a history of 6 iterates for the quasi-Newton Hessian approximation. Our code uses the LBFGS++ implementation described by Qiu et al. [2016]. We start from an initial guess of an axis-aligned cross field. The quadratic nature of our optimization problem would allow for more specialized techniques, e.g. preconditioned conjugate gradients, but as frame field computation is not currently the efficiency bottleneck of our algorithm, we leave tuning of this step to future work.
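For illustration, this optimization step can be sketched with an off-the-shelf L-BFGS routine; below, a stand-in linear least-squares objective replaces (3), and SciPy's `L-BFGS-B` solver is an assumed dependency playing the role of LBFGS++.

```python
# L-BFGS sketch on a stand-in least-squares objective; in the actual
# pipeline the variables would be the per-pixel coefficients (c0, c2)
# packed into one real vector (real and imaginary parts).
import numpy as np
from scipy.optimize import minimize

def objective(x, A, b):
    """Return value and gradient of 0.5 * ||Ax - b||^2."""
    r = A @ x - b
    return 0.5 * r @ r, A.T @ r

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
res = minimize(objective, np.zeros(10), args=(A, b), jac=True,
               method='L-BFGS-B',
               options={'maxcor': 6})  # quasi-Newton history of 6, as above
x_star = res.x
```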
In the end, we only require frame directions on the image pixels corresponding to strokes that we will trace. Hence, to improve optimization efficiency and junction resolution, even for acute angles, we take inspiration from narrow-band level set methods [Adalsteinsson and Sethian, 1995] and only include variables corresponding to pixels in $\bar\Omega$; that is, pixels corresponding to white areas in the input image are ignored. This greatly reduces the number of variables, yielding a significant boost in performance.
The optimization yields two complex scalar fields, $c_0(\mathbf{x})$ and $c_2(\mathbf{x})$. At every dark pixel $\mathbf{x}$, we then use (2) to recover the frame field directions $u(\mathbf{x}), v(\mathbf{x})$.
4. Extracting Drawing Topology
The next step of our algorithm extracts the topology of the drawing from the computed frame field. The key requirement is not only to extract the correct topology, but also to create a structure that allows for subsequent vectorization aligned with the frame field.
Starting from each dark pixel, we trace the frame field (Fig. 7(a)); these curves are grouped locally into curve bundles. Each curve bundle is associated to a vertex in a topological graph $G$, whose adjacency is determined by shared curve segments between different bundles (Fig. 7(b)). After topological simplification (Fig. 7(c)) and disambiguation of parallel strokes, this graph has the topology of the line drawing (Fig. 7(d)) and allows for vectorization by following shared curves connecting each pair of curve bundles.
4.1. Initial Graph Construction
Away from junctions, the largest root of the frame field is reliably aligned with the curve tangent. Therefore, at every dark pixel $\mathbf{x}$, we choose the frame field root with the maximum magnitude; without loss of generality, we will label it $u(\mathbf{x})$. We then trace the frame field starting from $\mathbf{x}$ in both directions $\pm u(\mathbf{x})$, using the simple Euler integration method with a fixed step size, where each pixel is considered to have width 1 (Fig. 7(a)). We stop tracing as soon as a curve leaves the narrow band or comes indistinguishably close (within a small distance in our implementation) to a curve with the same tangent. The latter condition is designed partly to prevent closed curves in the drawing from being traced over multiple times, and partly for efficiency reasons, to avoid overtracing.
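A sketch of this tracing loop, under simplifying assumptions: `field(p)` returning the two frame directions at a point is a hypothetical callback standing in for the optimized per-pixel frame field, and least-angle matching picks the branch to follow at each step.

```python
# Euler tracing sketch: at each step, pick the frame-field root (up to sign)
# best aligned with the current direction and advance by a fixed step.
def trace(field, start, direction, step=0.1, n_steps=200,
          in_band=lambda p: True):
    p, d = start, direction
    curve = [p]
    for _ in range(n_steps):
        u, v = field(p)
        # least-angle matching over the four candidates {+u, -u, +v, -v}
        d = max((s * r for r in (u, v) for s in (1, -1)),
                key=lambda c: (c.conjugate() * d).real / (abs(c) + 1e-12))
        p = p + step * d / (abs(d) + 1e-12)   # unit-speed Euler step
        if not in_band(p):
            break
        curve.append(p)
    return curve
```

For a constant field, the trace is a straight line in the chosen direction; in the full pipeline, tracing would also stop near previously traced curves with the same tangent.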
The integration step yields more curves than will be present in the final traced image. Hence, once all curves are traced, we split up curves into groups corresponding to strokes in the drawing. Since curves may naturally continue past acute junctions or Y-junctions, i.e., valence-3 junctions with three distinct directions, we create a topological graph $G$ by grouping curves locally (Fig. 6). Our goal is to group corresponding curves along the width of the stroke, perpendicular to the centerline; each group forms a vertex of the graph. We only group curves corresponding to the same direction of the frame field, thus separating intersecting strokes.
Starting from each curve endpoint (seed), at each of the 8 neighboring pixels, we select a matching direction of the frame field using the standard least-angle matching criterion [Diamanti et al., 2015]. We then trace a curve perpendicular to this local direction field, extending the field each time we move to a neighboring pixel by the same procedure: We look at the new 8-pixel neighborhood and match the frame field directions. Once the orthogonal test curve is traced, we find its intersection points with the curves with the matching direction. We then form the group of intersection points $P_v$, the curve bundle associated with a graph vertex $v$, in the following way (Fig. 6): Starting with the seed point, we group adjacent intersection points if they are less than one pixel apart. This strategy effectively groups curves along the width of the stroke without relying on an estimate of the stroke width, based only on the simpler assumption that different parallel strokes are separated by at least one pixel (Fig. 6(a,b)). We then add edges between vertices that have at least one pair of intersection points adjacent along their shared curve (Fig. 6(c,d)). To efficiently test for intersections, in our implementation we cache segments overlapping with each pixel; we then test for intersections only with the segments in the pixels overlapping with the test curve.
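The grouping of intersection points can be sketched in one dimension, treating points by their positions along the orthogonal test curve (a simplifying assumption for illustration).

```python
# Bundle-grouping sketch: starting from a seed intersection point, absorb
# adjacent intersection points along the test curve while consecutive
# points are less than `gap` (one pixel) apart.
def group_bundle(positions, seed_index, gap=1.0):
    """positions: 1D arc-length positions of intersection points."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    k = order.index(seed_index)
    lo = k
    while lo > 0 and positions[order[lo]] - positions[order[lo - 1]] < gap:
        lo -= 1
    hi = k
    while hi < len(order) - 1 and positions[order[hi + 1]] - positions[order[hi]] < gap:
        hi += 1
    return {order[i] for i in range(lo, hi + 1)}
```

A point more than one pixel away from the growing group starts a new bundle, mirroring the assumption that parallel strokes are separated by at least one pixel.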
Using this graph, one can define an “induced” vectorization of the drawing with the same topology as the graph. Namely, each edge of the graph defines a set of curve segments connecting adjacent curve bundles. By choosing one of those segments per edge and connecting, if necessary, the ends of those segments with straight lines, we obtain a vectorization with the topology of the graph (Fig. 11). This vectorization is not unique, since edges are typically associated to more than one shared curve (Fig. 11(b)); we discuss the vectorization process further in Section 5.
4.2. Topology Simplification
An uneven narrow band boundary can affect the topology of the graph. In particular, it might introduce extraneous loops and branches (inset, Fig. 7(b), red). To account for this, we perform the topology simplification procedure below (inset, Fig. 7(b)(c)).
Since we expect each loop to correspond to a hole in the narrow band, we contract each loop if its induced vectorization contains fewer than a fixed number of white pixels (inset, top). To distinguish true topological branches, i.e., valence-two paths ending with a leaf node, from extraneous ones, we use the following heuristic. For each vertex with valence greater than two, we repeatedly choose the shortest branch and prune it if its length outside the strokes formed by the other branches is too short (inset, bottom). In our implementation, the threshold for a given branch is a quarter of its full length or one pixel, whichever is greater. To perform this test, we define the stroke width at each vertex as the maximal distance between the intersection points of its curve bundle. Then, for each vertex of the branch, we find the closest vertex not belonging to the branch and test whether the Euclidean distance between them is within the sum of their stroke radii.
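A simplified sketch of the branch-pruning loop on an adjacency-list graph; for illustration, the per-branch threshold and the test against stroke radii described above are collapsed into a single scalar `threshold`.

```python
# Simplified branch pruning: from each vertex of valence > 2, walk every
# valence-2 path ending in a leaf; repeatedly prune the shortest such
# branch when its total length falls below `threshold`.
def prune_branches(adj, lengths, threshold):
    """adj: {v: set of neighbors}; lengths: {(v, w): float}, symmetric."""
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) <= 2:
                continue
            branches = []
            for n in list(adj[v]):
                path, prev, cur = [v], v, n
                dist = lengths[(v, n)]
                while len(adj[cur]) == 2:   # follow the valence-2 chain
                    nxt = next(w for w in adj[cur] if w != prev)
                    dist += lengths[(cur, nxt)]
                    path.append(cur)
                    prev, cur = cur, nxt
                if len(adj[cur]) == 1:      # branch ends in a leaf
                    branches.append((dist, path + [cur]))
            if branches and min(branches)[0] < threshold:
                _, path = min(branches)
                for a, b in zip(path, path[1:]):   # delete the branch edges
                    adj[a].discard(b)
                    adj[b].discard(a)
                changed = True
    return adj
```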
4.3. Disambiguating parallel strokes
In our final stage of topology extraction, we separate parallel strokes that are merged due to a connected narrow band segment (inset and Fig. 7(c)(d)). This happens when two different but nearly parallel strokes touch or overlap (inset, top): In the overlap, the traced curves of the upper stroke will be naturally grouped with the traced curves of the lower stroke, forming the orange vertices. To resolve this, we first find paths of valence-2 vertices connecting pairs of vertices with valence 3 (inset, orange). Edges along these paths are split into two by duplicating their vertices; this procedure "unzips" the path connecting the degree-3 vertices (inset and Fig. 7(d), green vertices). The remaining edges at the degree-3 vertices that were not unzipped are assigned to new neighbors based on the connectivity of the underlying curve bundle.
At Xjunctions, two strokes intersect but they do not share a vertex in our graph construction. Hence, vertices with valence 4 or higher are extremely rare. We split these vertices using the same unzipping technique, effectively treating them as a pair of degree3 vertices connected by an edge of length zero.
4.4. Treating Frame Field Singularities
In contrast to frame fields applied to quad meshing [Diamanti et al., 2015], singularities in our frame field have no meaningful interpretation; they are usually artifacts due to noise. Our insight is that, since singularities occur in areas with significant noise, we can eliminate most of the singular points by relaxing the alignment term in those areas. Therefore, after the frame field optimization, we find singular pixels, set their alignment weight to zero, and rerun the optimization (3). We repeat this process, each time updating the alignment weights, until no more singularities can be resolved this way. Since each iteration either reduces the number of nonzero alignment weights or stops, and since a frame field with no alignment term has no singularities, the process necessarily terminates. In practice, however, the alignment only needs to be relaxed for a small number of pixels, typically less than 1% of dark pixels.
We address the remaining singularities (typically fewer than five singular pixels per image, Fig. 8) using a simple heuristic. First, we stop tracing curves at singular pixels. This ensures that no vertices in the topological graph with inconsistently matched frame field directions are connected. However, this may introduce a gap in a stroke in the final vectorization. To address this, we mark leaf vertices next to a singularity in the topological graph, and greedily connect each to the closest nonadjacent vertex.
5. Vectorization
We use the extracted topological graph to create the final vectorization. The key idea is to extract a vectorization that follows the traced curves as much as possible while having the topology of the graph. Additionally, we would like the vectorized curves to lie close to the centerline of the stroke.
We initialize our procedure for embedding the topological graph by embedding vertices whose degree does not equal two; these vertices correspond to isolated stroke endpoints as well as points where curve segments join together tangentially. Recall that a vertex $v$ in the topological graph represents a bundle $P_v$ of intersection points between the traced curves and an orthogonal test curve (red points in Figure 6). Thus, a natural choice of embedding for each vertex $v$ is one of the points in $P_v$. With this construction in mind, we approximate the drawing centerline at $v$ as the barycenter $b_v$ of $P_v$ (Fig. 9, green points); the point of $P_v$ closest to $b_v$ becomes the assigned position for $v$ in our embedding (Fig. 11(b)).
What remains is to embed the degree-2 vertices and the edges of the graph. Our objective in this step is to embed edges as curves that approximately follow the traced curves without diverging too far from the stroke centerline. Each individual traced curve, even if it started at the stroke center, might drift away from the centerline. To account for that, we select a curve on a per-edge basis; our vectorization can "hop" from one traced curve to another at the vertices of the topological graph, in which case the two curve segments are connected using a straight line segment. The end result is a tracing that is composed piecewise of traced curves connected with short ligaments that subsequently can be smoothed. The details of this procedure are outlined below.
5.1. Auxiliary Graph
We cast the remaining embedding computation as a shortest-path problem over an auxiliary graph $G_a$ constructed as follows:


Vertices ($V_a$): The vertices of the auxiliary graph are defined as the union $\bigcup_v P_v$, corresponding to the set of all intersection points between the traced curves and the test curves (yellow points in Fig. 9).

Edges ($E_a$): Recall the vertices of $G_a$ are clustered into bundles $P_v$. For two vertices $v, w$ connected by an edge in the topological graph $G$ (rather than $G_a$), we insert a bipartite graph of edges connecting all vertices in $P_v$ to all vertices in $P_w$. Symbolically, we can write:

$$E_a = \bigcup_{(v, w) \in E} P_v \times P_w.$$
Intuitively, by following an edge in the auxiliary graph, we connect the two intersection points with a segment of some traced curve shared by their curve bundles and, possibly, a straight line segment.

Edge weights ($\omega$): The edge weight is designed so that the shortest path on the graph produces a curve that is smooth and centered. Thus, the weight of an edge $e$ is a weighted sum of two terms:

$$\omega(e) = \omega_{\mathrm{curve}}(e) + \lambda_c \, \omega_{\mathrm{center}}(e).$$

Roughly, the first term penalizes hopping from one curve to another when they are far apart, in some sense favoring smoother connections; the second term is designed to penalize connecting pairs of vertices that are far from the centerline (Figure 9).
More concretely, for an edge $e = (p, q)$, the first term is computed as a sum of distances to the closest shared curve (for an orange edge in Fig. 9 it is the magenta curve):

$$\omega_{\mathrm{curve}}(p, q) = \min_{\gamma \in \Gamma(p) \cap \Gamma(q)} \big( d(p, \gamma) + d(q, \gamma) \big),$$

where $\Gamma(p)$ is the set of traced curves containing the intersection point $p$ of some curve bundle. The centering term penalizes the distance from a vertex to the corresponding barycenter, i.e. for an edge from bundles $P_v$ and $P_w$ with stroke widths $w_v, w_w$ respectively,

$$\omega_{\mathrm{center}}(p, q) = \frac{\|p - b_v\|}{w_v} + \frac{\|q - b_w\|}{w_w}.$$
The centering weight $\lambda_c$ affects the exact locations of Y-junctions in ambiguous areas (Fig. 10); we use the same value in all our experiments.
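The auxiliary-graph construction can be made concrete with a small sketch: bipartite edges between two adjacent bundles, followed by a standard Dijkstra search. The weight formulas and the `centering` factor below are simplified one-dimensional stand-ins for the terms above, not the exact formulation.

```python
# Auxiliary-graph sketch: bipartite edges between two adjacent curve
# bundles, weighted by an illustrative curve-hopping penalty plus a
# centering penalty, then a shortest-path query over the result.
import heapq

def build_edges(bundle_a, bundle_b, bary_a, bary_b, centering=0.1):
    """Bundles are lists of 1D point positions; barycenters are scalars."""
    edges = {}
    for p in bundle_a:
        for q in bundle_b:
            hop = abs(p - q)                            # stand-in curve term
            center = abs(p - bary_a) + abs(q - bary_b)  # centering term
            edges[(p, q)] = hop + centering * center
    return edges

def dijkstra(neighbors, source, target):
    """neighbors: {node: [(next_node, weight), ...]}; returns min distance."""
    dist = {source: 0.0}
    heap = [(0.0, 0, source)]
    tie = 1   # tiebreaker so heap entries never compare nodes directly
    while heap:
        d, _, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float('inf')):
            continue
        for nxt, w in neighbors.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float('inf')):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, tie, nxt))
                tie += 1
    return float('inf')
```

In the full pipeline, the nodes would be intersection points of curve bundles and the edges would come from the bipartite construction of $E_a$.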
5.2. Extracting the Vectorization
As an initial embedding of the full graph $G$, we simply connect the previously embedded vertices (with degree not equal to two) using shortest paths in this weighted auxiliary graph. This produces a vectorization with the correct topology, which is centered and follows the traced curves (Fig. 11(a,b)).
After this initial pass computes an embedding, we make a second pass to refine the result. Principally, we improve the locations of the valence-3 vertices, which can be suboptimal since they were chosen independently (Fig. 11(b)). Our procedure for moving the valence-3 vertices to improved locations is illustrated in Figure 11 and described below.
Valence-3 vertices typically correspond to acute junctions or Y-junctions. In this stage, we find optimal locations that provide a smooth transition between the joining curves. Thus, we attempt to further improve the total shortest-path cost over the graph while preserving topology. To do so, we allow each degree-3 vertex (inset, yellow) to snap to any barycenter along its outgoing degree-2 chains in $G$ (inset, green) and find the optimal locations for valence-3 vertices minimizing the total shortest-path cost on the auxiliary graph $G_a$ (gray vertices in the inset below).

If two degree-3 vertices are connected by a chain of vertices of valence two, we fix the location of the vertex closest to the middle of the chain that was not split during the procedure described in §4.2; this avoids having to solve a global problem to place all the degree-3 vertices simultaneously and is well-justified since traditional 1-skeleton-based image vectorization methods [Noris et al., 2013] perform well away from junctions. After this step, every valence-3 vertex $v$ is connected, via chains of degree-2 vertices, to vertices with fixed locations $F_v$ (inset). Denoting by $S_v$ the set of all the vertices in the degree-2 chains connecting to $v$, we restrict the set of possible embedding locations for the valence-3 vertex to the set $B_v$ of all curve bundle barycenters of the vertices in $S_v$ (inset, green points). In particular, $\mathbf{x}(v) \in B_v$. We iterate over $B_v$ to solve a discrete problem for the embedding $\mathbf{x}(v)$ of $v$:

$$\mathbf{x}(v) = \operatorname*{arg\,min}_{b \in B_v} \; \sum_{f \in F_v} d_{G_a}(b, f), \qquad (6)$$

where $d_{G_a}$ is the shortest-path distance on $G_a$.
A few postprocessing steps conclude our second pass. Since intersecting strokes are separated in the graph construction and thus are traced independently, they may continue past the points where they should meet (see inset). To prune the resulting curve fragments, we add the intersection points into the graph and repeat the branch pruning procedure (§4.2). Finally, we optionally smooth the curves using Adobe Illustrator’s ‘Simplify Path’ feature with the 95% curve precision and zero angle threshold (Fig. 11, (e) and Fig. 13). Alternatively, one may use the Douglas–Peucker algorithm, followed by Laplacian smoothing; this strategy produces similar results.
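The Douglas-Peucker alternative mentioned above can be sketched as follows; `eps` is the deviation tolerance, and the recursion keeps a point only if it deviates from the chord between kept endpoints by more than `eps`.

```python
# Douglas-Peucker polyline simplification.
def douglas_peucker(points, eps):
    """points: list of (x, y) tuples; returns a simplified sublist."""
    if len(points) < 3:
        return list(points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    # perpendicular distance from each interior point to the chord
    dists = [abs(dy * (x - x0) - dx * (y - y0)) / norm
             for x, y in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= eps:
        return [points[0], points[-1]]   # all points lie close to the chord
    left = douglas_peucker(points[:i + 1], eps)
    right = douglas_peucker(points[i:], eps)
    return left[:-1] + right             # drop the duplicated split point
```

In practice this would be followed by Laplacian smoothing of the retained vertices, as noted above.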
6. Validation and Results
input         res.        dark pixels   Noris et al.   Favreau et al.   ours
Muten         1024×1024   35868         13s            375s             24s
Mouse         1024×1024   50298         17s            341s             64s
Dracolion     1024×1024   39402         15s            415s             25s
Sheriff       1024×1024   50198         19s            437s             49s
Puppy         660×624     29908         26s            224s             41s
Hippo         700×535     25114         24s            120s             43s
Banana Tree   589×865     18619         15s            244s             23s
Penguin       500×714     24134         23s            181s             56s
Kitten        700×554     29023         38s            250s             81s
Elephant      500×753     34569         33s            270s             55s
Qualitative Evaluation
We have automatically generated a number of vectorizations for line drawings of different styles and levels of noise (Figs. 14, 15, 16, 1, and 24, green curves). These include noisy, complex drawings ('Puppy,' 'Elephant,' 'Banana Tree'), some with varying stroke width ('Hippo,' 'Penguin'; from www.easydrawingsandsketches.com, ©Ivan Huska), as well as high-resolution clean digital images ('Sheriff,' 'Dracolion,' 'Muten,' 'Mouse'). For all noisy images from the drawing tutorial website, to simplify line separation from the background, we automatically adjusted contrast in Adobe Photoshop; alternatively, one may use an implementation of histogram equalization.
The puppy example (Fig. 15) has what Noris et al. [2013] call spikes on the sides of the face (Fig. 17). In the presence of noise, distinguishing those from true junctions is problematic, and the heuristic suggested by Noris et al. [2013] breaks down. Instead of relying on similar heuristics, we allow simple user interaction: the user can edit the narrow band as a bitmap image. For this example, the user adjusted the narrow band with a couple of brush strokes, within a few seconds, to achieve the desired effect (Fig. 17 (b)). All other input images were processed fully automatically.
Comparison to Prior Art
We compare our method to the most relevant recent work on vectorization, described in [Noris et al., 2013] and [Favreau et al., 2016] (Figs. 15 and 16).
To run the method of Favreau et al. [2016], we try two sets of input parameters: the defaults in their implementation (maxNumOpenCurves = 0, minLengthOpenCurves = 30, minRegionSize = 7) and a set manually selected to improve results (maxNumOpenCurves = 30, minLengthOpenCurves = 5, minRegionSize = 3); we keep the 'fidelity–simplicity' weight of their formula (2) at its default value of 0.5. To run the method of Noris et al. [2013], we try a range of parameter values, including the defaults in their implementation (all combinations of Maximal Interact Distance, Maximal Active Distance, and Direction Threshold), and choose the best result. We tried both thresholding the initial images using our threshold parameter value and not thresholding. Optionally, we additionally run a post-processing step on the results of Noris et al. [2013] and Favreau et al. [2016]. For their methods, we chose the best result out of all those options on a per-input basis. We run our method with default parameters on all inputs.
On the clean digital inputs, our results are comparable to those of Noris et al. (Fig. 15, left). On the noisy inputs (Fig. 15, right), our variational method reliably disambiguates junctions, even with missing details and varying stroke width. The method of Noris et al. [2013] fails to disambiguate the complicated regions, e.g., the puppy's eyes (red), due to its discrete nature and heavy reliance on image gradients. Our method faithfully captures the principal directions and junctions even in those regions. We provide additional comparison results in the auxiliary materials.
We see the method of Favreau et al. [2016] as complementary to ours: it works best when significant simplification of the curve network is needed, while our goal is to stay true to the drawing even in the presence of noise (Fig. 16). For sketches with multiple overlapping strokes, our method aims to reconstruct all the individual pen strokes, which in some cases may not be the desired behavior (Fig. 24 (c)). For those cases, however, our method may serve as a better input vectorization for further topological simplification [Favreau et al., 2016; Simo-Serra et al., 2016].
Our method is robust to changes in the input bitmap (Fig. 18), due in large part to its variational nature. Note that since drawings with multiple overlapping strokes, such as the inputs of [Favreau et al., 2016], were not the focus of our work, we ran [Bartolo et al., 2007] on the left image while tracing the curves from the dark pixels of the original sketch.
Noise robustness
We evaluated the noise robustness of our algorithm by testing on images polluted by various degrees of Gaussian noise (Fig. 23). In general, our method is robust to Gaussian noise; junction directions are particularly stable.
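A test set of this kind can be generated by adding clamped Gaussian noise to a grayscale input. A small sketch, assuming 8-bit pixels stored as a flat list; names and the clamping convention are illustrative:

```python
import random

def add_gaussian_noise(pixels, sigma, seed=0):
    """Pollute a grayscale image (flat list, values 0-255) with additive
    Gaussian noise of standard deviation sigma, clamped to the valid range."""
    rng = random.Random(seed)  # fixed seed for reproducible test inputs
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in pixels]
```

Sweeping sigma over a range of values then produces a family of progressively noisier inputs against which junction stability can be checked.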
Benefits of individual steps
For completeness, we demonstrate the effects of disabling significant steps of our algorithm in Figure 12.
Parameters and Processing Time
On a 4-core Intel i7-6700 @ 3.4 GHz with 32 GB RAM, our implementation typically takes from about twenty seconds on lower-resolution images to a little over a minute on high-resolution images. Due to the narrow-band optimization, our performance depends not on the image resolution but on the number of dark pixels. Statistics for the images we tested are summarized in Table 1. We use the same parameter values for all the images.
While most parameters in our method have a straightforward and intuitive effect on the result, we have two main nonlinear weights: the frame field smoothness weight and the regularizer weight. In Figure 20, we demonstrate that our method produces reasonable output under significant variations of the smoothness weight. Namely, increasing that weight sharpens the junctions, at the possible cost of losing some fine details in the drawing; higher values may therefore be appropriate for noisier drawings. Our method also produces reasonable outputs for significantly different values of the regularizer weight (Fig. 21).
While we kept this parameter fixed, it could be adjusted for drawings of different resolution or noise structure. One could devise a heuristic to decrease it for lower resolutions, for instance, or use machine learning to compute an optimal value for a class of drawings; we leave this for future work. We did not observe that keeping it fixed caused any issues in our experiments.
Limitations
Like most methods in this category [Noris et al., 2013; Favreau et al., 2016; Bo et al., 2016], our method works best on drawings without shading (Fig. 24 (a,b); the nose of the cat in Fig. 18). On shaded images, user interaction might be necessary to achieve a correct vectorization. Very low-resolution images might also be challenging to vectorize (Fig. 22).
A common alternative to shading is to convey information about shape and lighting via hatching [Kalogerakis et al., 2012]. Such drawings are quite different from typical line drawings: in a technique called cross-hatching, artists often draw ink strokes in three or more distinct hatching directions within one region. Since our frame field captures only two directions at a point, our method is not designed to vectorize natural hatching images. Even so, it is naturally suited to vectorizing hatching examples in which mostly two sets of directions are used in each region (Fig. 25).
7. Conclusion and Future Work
We have presented a novel method for automatically vectorizing raster images based on PolyVector field design. As we demonstrate on a gallery of examples, it reliably and efficiently disambiguates T- and X-junctions in both clean and noisy drawings while staying true to curve shapes and connectivity. Our pipeline finds immediate application in artistic and engineering workflows, automatically providing a high-quality tracing without oversimplification or noise.
The presented method can be naturally extended to image domains where high-valence junctions are common, such as creating maps from GPS traces, by raising the degree of the PolyVector field polynomial (Eq. 1). Our preliminary experiments indicate this is indeed a promising direction, but special care must be taken to find consistent matchings of the frame field roots in the presence of noise.
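To make the degree-raising idea concrete: a PolyVector field encodes directions as roots of a complex polynomial, so a junction with k direction pairs corresponds to a polynomial with 2k roots. A sketch of the conversion between roots and coefficients (pure Python; helper names are ours, not from the paper's implementation):

```python
def poly_from_roots(roots):
    """Coefficients (lowest degree first) of the monic polynomial
    prod_i (z - r_i) with the given complex roots."""
    coeffs = [complex(1)]
    for r in roots:
        new = [complex(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c      # multiply term c_i * z^i by z
            new[i] -= r * c      # multiply term c_i * z^i by -r
        coeffs = new
    return coeffs

def poly_eval(coeffs, z):
    """Evaluate a lowest-degree-first coefficient list at z."""
    return sum(c * z**i for i, c in enumerate(coeffs))
```

For a standard frame field the roots are {u, -u, v, -v}; adding further pairs raises the polynomial degree, and recovering the directions then amounts to root-finding on the stored coefficients, which is where consistent matching under noise becomes delicate.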
Another interesting potential extension is to vectorization of animated sequences: while a temporal coherence term is trivial to add in our frame field design framework, tracing might need to be modified to avoid temporal artifacts.
8. Acknowledgments
The authors acknowledge the generous support of Army Research Office grant W911NF-12-R-0011 (“Smooth Modeling of Flows on Graphs”), of the MIT Research Support Committee (“Structured Optimization for Geometric Problems”), and of the Skoltech–MIT Next Generation Program (“Simulation and Transfer Learning for Deep 3D Geometric Data Analysis”).
References
 Adalsteinsson and Sethian [1995] David Adalsteinsson and James A. Sethian. 1995. A Fast Level Set Method for Propagating Interfaces. J. Comput. Phys. 118, 2 (1995), 269–277. https://doi.org/10.1006/jcph.1995.1098
 Bartolo et al. [2007] Alexandra Bartolo, Kenneth P. Camilleri, Simon G. Fabri, Jonathan C. Borg, and Philip J. Farrugia. 2007. Scribbles to Vectors: Preparation of Scribble Drawings for CAD Interpretation. (2007), 123–130. https://doi.org/10.1145/1384429.1384456
 Bessmeltsev et al. [2015] Mikhail Bessmeltsev, Will Chang, Nicholas Vining, Alla Sheffer, and Karan Singh. 2015. Modeling Character Canvases from Cartoon Drawings. ACM Trans. Graph. 34, 5, Article 162 (Nov. 2015), 16 pages. https://doi.org/10.1145/2801134
 Bo et al. [2016] Pengbo Bo, Gongning Luo, and Kuanquan Wang. 2016. A graph-based method for fitting planar B-spline curves with intersections. Journal of Computational Design and Engineering 3, 1 (2016), 14–23. https://doi.org/10.1016/j.jcde.2015.05.001

 Chai et al. [2013] Dengfeng Chai, Wolfgang Förstner, and Florent Lafarge. 2013. Recovering Line-Networks in Images by Junction-Point Processes. In 2013 IEEE Conference on Computer Vision and Pattern Recognition. 1894–1901. https://doi.org/10.1109/CVPR.2013.247
 Chen et al. [2013] Jiazhou Chen, Gael Guennebaud, Pascal Barla, and Xavier Granier. 2013. Non-oriented MLS Gradient Fields. Computer Graphics Forum (Dec. 2013). https://hal.inria.fr/hal-00857265
 Chen et al. [2015] Jia-Zhou Chen, Qi Lei, Yong-Wei Miao, and Qun-Sheng Peng. 2015. Vectorization of line drawing image based on junction analysis. Science China Information Sciences 58, 7 (2015), 1–14. https://doi.org/10.1007/s11432-014-5246-x
 Diamanti et al. [2015] Olga Diamanti, Amir Vaxman, Daniele Panozzo, and Olga Sorkine-Hornung. 2015. Integrable PolyVector Fields. ACM Trans. Graph. 34, 4, Article 38 (July 2015), 12 pages. https://doi.org/10.1145/2766906
 Donati et al. [2017] Luca Donati, Simone Cesano, and Andrea Prati. 2017. An Accurate System for Fashion Hand-Drawn Sketches Vectorization. In The IEEE International Conference on Computer Vision (ICCV).
 Favreau et al. [2016] Jean-Dominique Favreau, Florent Lafarge, and Adrien Bousseau. 2016. Fidelity vs. Simplicity: a Global Approach to Line Drawing Vectorization. ACM Transactions on Graphics (SIGGRAPH Conference Proceedings) (2016). http://www-sop.inria.fr/reves/Basilic/2016/FLB16
 Goes et al. [2011] Fernando de Goes, David Cohen-Steiner, Pierre Alliez, and Mathieu Desbrun. 2011. An Optimal Transport Approach to Robust Reconstruction and Simplification of 2D Shapes. Computer Graphics Forum (2011). https://doi.org/10.1111/j.1467-8659.2011.02033.x
 Harris and Stephens [1988] Chris Harris and Mike Stephens. 1988. A combined corner and edge detector. In In Proc. of Fourth Alvey Vision Conference. 147–151.
 Hertzmann and Zorin [2000] Aaron Hertzmann and Denis Zorin. 2000. Illustrating Smooth Surfaces. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’00). ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 517–526. https://doi.org/10.1145/344779.345074
 Hilaire and Tombre [2006] Xavier Hilaire and Karl Tombre. 2006. Robust and accurate vectorization of line drawings. IEEE Transactions on Pattern Analysis and Machine Intelligence 28, 6 (June 2006), 890–904. https://doi.org/10.1109/TPAMI.2006.127
 Iarussi et al. [2015] Emmanuel Iarussi, David Bommes, and Adrien Bousseau. 2015. BendFields: Regularized Curvature Fields from Rough Concept Sketches. ACM Trans. Graph. 34, 3, Article 24 (May 2015), 16 pages. https://doi.org/10.1145/2710026
 Kalogerakis et al. [2012] Evangelos Kalogerakis, Derek Nowrouzezahrai, Simon Breslav, and Aaron Hertzmann. 2012. Learning Hatching for Pen-and-Ink Illustration of Surfaces. ACM Transactions on Graphics 31, 1 (2012).
 Kang et al. [2007] Henry Kang, Seungyong Lee, and Charles K. Chui. 2007. Coherent Line Drawing. In Proceedings of the 5th International Symposium on Non-photorealistic Animation and Rendering (NPAR ’07). ACM, New York, NY, USA, 43–50. https://doi.org/10.1145/1274871.1274878
 Kanizsa [1979] Gaetano Kanizsa. 1979. Organization in vision: essays on gestalt perception. Praeger, New York.
 Kass and Witkin [1987] Michael Kass and Andrew Witkin. 1987. Analyzing Oriented Patterns. Comput. Vision Graph. Image Process. 37, 3 (March 1987), 362–385. https://doi.org/10.1016/0734-189X(87)90043-0
 Lecot and Lévy [2006] Gregory Lecot and Bruno Lévy. 2006. Ardeco: Automatic Region Detection and Conversion. In Proceedings of the 17th Eurographics Conference on Rendering Techniques (EGSR ’06). Eurographics Association, Aire-la-Ville, Switzerland, 349–360. https://doi.org/10.2312/EGWR/EGSR06/349-360
 Nocedal and Wright [2006] Jorge Nocedal and Stephen J. Wright. 2006. Numerical Optimization (2nd ed.). Springer, New York.
 Noris et al. [2013] Gioacchino Noris, Alexander Hornung, Robert W. Sumner, Maryann Simmons, and Markus Gross. 2013. Topology-driven Vectorization of Clean Line Drawings. ACM Trans. Graph. 32, 1, Article 4 (Feb. 2013), 11 pages. https://doi.org/10.1145/2421636.2421640
 Orzan et al. [2013] Alexandrina Orzan, Adrien Bousseau, Pascal Barla, Holger Winnemöller, Joëlle Thollot, and David Salesin. 2013. Diffusion Curves: A Vector Representation for Smooth-shaded Images. Commun. ACM 56, 7 (July 2013), 101–108. https://doi.org/10.1145/2483852.2483873
 Pal et al. [2016] Kazim Pal, Nicole Avery, Pete Boston, Alberto Campagnolo, Caroline De Stefani, Helen Matheson-Pollock, Daniele Panozzo, Matthew Payne, Christian Schüller, Chris Sanderson, Chris Scott, Philippa Smith, Rachael Smither, Olga Sorkine-Hornung, Ann Stewart, Emma Stewart, Patricia Stewart, Melissa Terras, Bernadette Walsh, Laurence Ward, Liz Yamada, and Tim Weyrich. 2016. Digitally Reconstructing The Great Parchment Book: 3D recovery of fire-damaged historical documents. Literary and Linguistic Computing: the journal of digital scholarship in the humanities, Oxford University Press (13 Dec. 2016), 1–31.
 Palacios and Zhang [2007] Jonathan Palacios and Eugene Zhang. 2007. Rotational Symmetry Field Design on Surfaces. ACM Trans. Graph. 26, 3, Article 55 (July 2007). https://doi.org/10.1145/1276377.1276446
 Panozzo et al. [2014] Daniele Panozzo, Enrico Puppo, Marco Tarini, and Olga Sorkine-Hornung. 2014. Frame Fields: Anisotropic and Non-orthogonal Cross Fields. ACM Trans. Graph. 33, 4, Article 134 (July 2014), 11 pages. https://doi.org/10.1145/2601097.2601179
 Qiu et al. [2016] Yixuan Qiu, Naoaki Okazaki, and Jorge Nocedal. 2016. LBFGS++, A Header-only C++ Library for the L-BFGS Algorithm. (2016). https://yixuan.cos.name/LBFGSpp/
 Simo-Serra et al. [2016] Edgar Simo-Serra, Satoshi Iizuka, Kazuma Sasaki, and Hiroshi Ishikawa. 2016. Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup. ACM Trans. Graph. 35, 4, Article 121 (July 2016), 11 pages. https://doi.org/10.1145/2897824.2925972
 Szeliski [2010] Richard Szeliski. 2010. Computer Vision: Algorithms and Applications (1st ed.). SpringerVerlag New York, Inc., New York, NY, USA.
 Türetken et al. [2013] Engin Türetken, Fethallah Benmansour, Bjoern Andres, Hanspeter Pfister, and Pascal Fua. 2013. Reconstructing Loopy Curvilinear Structures Using Integer Programming. In 2013 IEEE Conference on Computer Vision and Pattern Recognition. 1822–1829. https://doi.org/10.1109/CVPR.2013.238
 Vaxman et al. [2016] Amir Vaxman, Marcel Campen, Olga Diamanti, Daniele Panozzo, David Bommes, Klaus Hildebrandt, and Mirela Ben-Chen. 2016. Directional Field Synthesis, Design, and Processing. Computer Graphics Forum (2016). http://graphics.tudelft.nl/Publicationsnew/2016/VCDPBHB16
 Whited et al. [2010] Brian Whited, Gioacchino Noris, Maryann Simmons, Robert W. Sumner, Markus Gross, and Jarek Rossignac. 2010. BetweenIT: An Interactive Tool for Tight Inbetweening. Computer Graphics Forum 29, 2 (2010), 605–614. https://doi.org/10.1111/j.1467-8659.2009.01630.x
 Xia et al. [2014] Gui-Song Xia, Julie Delon, and Yann Gousseau. 2014. Accurate Junction Detection and Characterization in Natural Images. Vol. 106. Kluwer Academic Publishers, Hingham, MA, USA. 31–56 pages. https://doi.org/10.1007/s11263-013-0640-1

 Zhang et al. [2007] Eugene Zhang, James Hays, and Greg Turk. 2007. Interactive Tensor Field Design and Visualization on Surfaces. IEEE Transactions on Visualization and Computer Graphics 13, 1 (Jan. 2007), 94–107. https://doi.org/10.1109/TVCG.2007.16
 Zhang et al. [2009] Song-Hai Zhang, Tao Chen, Yi-Fei Zhang, Shi-Min Hu, and Ralph R. Martin. 2009. Vectorizing Cartoon Animations. IEEE Transactions on Visualization and Computer Graphics 15, 4 (July 2009), 618–629. https://doi.org/10.1109/TVCG.2009.9