1 Introduction
Image segmentation is the process of dividing an image into parts, identifying objects or other relevant information (Shapiro and Stockman, 2001). It is one of the most difficult tasks in image processing (Gonzalez and Woods, 2008). Fully automatic segmentation remains very challenging. Many automatic approaches are domain-dependent, often applied in the medical field (Christ et al., 2016; Moeskops et al., 2016; Avendi et al., 2016; Bozkurt et al., 2018; Martinez-Muñoz et al., 2016; Patino-Correa et al., 2014). Therefore, interactive image segmentation, in which a user supplies some information regarding the objects of interest, has attracted increasing interest over the last decades (Boykov and Jolly, 2001; Rother et al., 2004; Blake et al., 2004; Grady, 2006; Ding and Yilmaz, 2010; Price et al., 2010; Li et al., 2010; Artan, 2011; Ding et al., 2012; Ducournau and Bretto, 2014; Breve et al., 2015b, a; Oh et al., 2017; Wang et al., 2018b; Liew et al., 2017; Wang et al., 2018a; Lin et al., 2016; Dong et al., 2016; Wang et al., 2016b, a).
The user interaction may take place in different ways, depending on the method, including loosely tracing the desired boundaries (Blake et al., 2004; Wang et al., 2007b), marking parts of the object(s) of interest and/or the background (Boykov and Jolly, 2001; Li et al., 2004; Grady, 2006; Price et al., 2010; Breve et al., 2015b, a), and loosely placing a bounding box around the objects of interest (Rother et al., 2004; Lempitsky et al., 2009; Pham et al., 2010), among others. In all scenarios, the goal is to allow the user to select the desired objects with minimal effort (Price et al., 2010).
This paper focuses on the second type of approach, in which the user "scribbles" some lines on the object(s) of interest and the background. The "scribbles" are then used as seeds to guide the iterative segmentation process. This approach is popular because it requires only quick and imprecise input: the user can loosely mark broad interior regions instead of finely tracing near borders (Price et al., 2010).
Graph cuts is one of the most popular approaches to seeded segmentation, with numerous methods proposed (Boykov and Jolly, 2001; Boykov and Funka-Lea, 2006; Rother et al., 2004; Blake et al., 2004; Price et al., 2010; Vicente et al., 2008). In graph theory, a cut is a partition of the vertices of a graph into two disjoint subsets (Narkhede, 2013). These methods combine explicit edge-finding and region-modeling components, formulated as the optimization problem of minimizing a cut in a weighted graph so as to separate foreground and background seeds (Price et al., 2010).
Other approaches rely on graph-based machine learning (Grady, 2006; Wang et al., 2007a; Duchenne et al., 2008; Ducournau and Bretto, 2014; Breve et al., 2015b, a; Oh et al., 2017; Wang et al., 2018b; Dong et al., 2016; Wang et al., 2016b, a), in which the image is modeled as an affinity graph whose edges encode similarity between neighboring pixels. The segmentation problem may then be formulated as the minimization of an energy function that is smooth with respect to the underlying graph structure (Ducournau and Bretto, 2014). Some deep-learning approaches were also recently proposed (Liew et al., 2017; Wang et al., 2018a; Lin et al., 2016).
The emergence of graph-based techniques is also due to the development of complex networks theory. In the last decades, network research moved from small graphs to the study of statistical properties of large-scale graphs. It was discovered that a regular network's diameter may be drastically reduced by randomly rewiring a few edges while preserving its local structure, as measured by the clustering coefficient (Watts and Strogatz, 1998). The resulting networks are called small-world networks, and they model some real networks, such as social and linguistic networks. Small-world networks have tightly interconnected clusters of nodes and a mean shortest path length similar to that of a random graph with the same number of nodes and edges (Humphries and Gurney, 2008).
This paper introduces a new graph-based method for interactive segmentation. It is simpler than many other methods: it does not incorporate any specific edge-finding or region-modeling components, and there is no explicit optimization process. The graphs are merely used to propagate the labels from the "scribbles" to unlabeled pixels iteratively, directly through the weighted edges. The method has two consecutive stages. In the first stage, a nearest-neighbors (NN) graph is built based on the similarity among all pixels of a reduced version of the input image, with each node representing a pixel of the reduced image (i.e., a group of pixels of the original image). In the second stage, the full-size image is used: a new graph is built with each node representing a single pixel, connected only to the nodes representing the adjacent pixels in the image. Propagation then occurs only to the nodes that were not confidently labeled during the first stage.
The propagation approach has some similarities with that proposed by Wang et al. (2007a). However, the graph construction is fundamentally different: in the first stage, nodes are not connected in a grid but according to the color components and locations of the pixels they represent. Label propagation is therefore faster, as the graph usually presents the small-world property of complex networks (Watts and Strogatz, 1998).
The graph construction phase shares some similarities with that proposed by Breve (2017). However, that approach uses undirected and unweighted graphs, while the current study uses weighted digraphs. The propagation approach is also completely different: that model uses particles walking through the graph to propagate label information, in a nature-inspired approach of competition and cooperation for territory. The proposed approach is much faster, as label information spreads directly through the graph. Finally, the particle model is stochastic, whereas the proposed model is deterministic.
In spite of its simplicity, the proposed method can perform interactive image segmentation with high accuracy. It was applied to the images from the Microsoft GrabCut dataset (Rother et al., 2004), and the mean error rate achieved is comparable to those obtained by some state-of-the-art methods. Moreover, its computational complexity is only linear, O(n), where n is the number of pixels in the image, in the best-case scenario, and linearithmic, O(n log n), in the worst case. It can also be applied to multiclass problems at no extra cost.
The remainder of this paper is organized as follows. Section 2 describes the proposed model. Section 3 presents some computer simulations to show the viability of the method. Section 4 discusses the time and storage complexity of the algorithm and the small-world property of its networks. In Section 5, the method is applied to the Microsoft GrabCut dataset, and its results are compared to those achieved by some state-of-the-art algorithms. Some parameter analyses are also conducted in that section. Finally, conclusions are drawn in Section 6.
2 Model Description
The proposed algorithm is divided into two stages. In the first stage, the input image is reduced to one ninth of its original size using bicubic interpolation, and a network is built with each node representing a pixel in the downsized image. The edges are built by connecting each node to its k nearest neighbors, in a complex arrangement that considers both pixel location and color. Then, label information is propagated iteratively through this network. Usually, most pixels are confidently labeled in this stage.
In the second stage, the full input image is used. Again, each node represents a single pixel. However, this time, connections are made only from the pixels not confidently labeled in the first stage to the nodes representing adjacent pixels in the image, in a grid arrangement that considers only pixel location. Label information propagates iteratively again, only to the unlabeled nodes. Therefore, the remaining pixels are labeled in this stage.
In both networks, the same set of pixel features, considering both color and location, is extracted to define the edge weights. The whole procedure is detailed in the following subsections.
2.1 The First Stage
In the first stage, the input image is resized to one ninth of its original size (one third in each dimension) using bicubic interpolation. Then, the set of pixels of the resized image is reorganized as $\mathcal{X} = \{x_1, \dots, x_l, x_{l+1}, \dots, x_n\}$, such that $\mathcal{X}_L = \{x_1, \dots, x_l\}$ is the labeled pixel subset and $\mathcal{X}_U = \{x_{l+1}, \dots, x_n\}$ is the unlabeled pixel subset. $L = \{1, \dots, c\}$ is the set containing the labels, and $y: \mathcal{X} \to L$ is the function associating each labeled pixel $x_i \in \mathcal{X}_L$ to its label $y(x_i)$. The proposed model estimates a label $\hat{y}(x_i)$ for each unlabeled pixel $x_i \in \mathcal{X}_U$.
The labels are extracted from an image with the user input ("scribbles"), in which a different color represents each class, and another color is used for the unlabeled pixels. In the first stage, this image is also resized to one ninth of its original size, but using nearest-neighbor interpolation; otherwise, new colors would be introduced and mistakenly interpreted as new classes.
2.1.1 Graph Generation
For each pixel, a set of nine features is extracted. They are shown in Table 1.
#  Feature Description 

1  Pixel row location 
2  Pixel column location 
3  Red (R) component of the pixel 
4  Green (G) component of the pixel 
5  Blue (B) component of the pixel 
6  Value (V) component of the pixel from an RGB-to-HSV transform 
7  Excess Red Index (ExR) of the pixel 
8  Excess Green Index (ExG) of the pixel 
9  Excess Blue Index (ExB) of the pixel 
Table 1: List of features extracted from each image to be segmented
The V component (feature 6) is obtained from the RGB components using the method described by Smith (1978). The ExR, ExG, and ExB indexes (features 7 to 9) are obtained from the RGB components as described in the "Image Segmentation Data Set"¹ (Dheeru and Karra Taniskidou, 2017):

¹ Available at http://archive.ics.uci.edu/ml/datasets/image+segmentation
$\mathrm{ExR} = 2R - (G + B)$  (1)
$\mathrm{ExG} = 2G - (R + B)$  (2)
$\mathrm{ExB} = 2B - (G + R)$  (3)
The Excess Green Index (ExG) and some of its derivatives are commonly employed in the segmentation of agricultural images (Guijarro et al., 2011). These indexes measure the amount of one color component relative to the others. In this paper, they are used because they decrease the distance among pixels of the same segment that receive different amounts of incident light.
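As an illustration, the three color-excess indexes can be computed directly from the RGB channels. The sketch below assumes a floating-point H×W×3 image; the function name is hypothetical, and the formulas follow Eqs. (1) to (3):

```python
import numpy as np

def color_excess_features(rgb):
    """Compute ExR, ExG, and ExB for an H x W x 3 float RGB image."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    exr = 2 * r - (g + b)  # Excess Red Index
    exg = 2 * g - (r + b)  # Excess Green Index
    exb = 2 * b - (g + r)  # Excess Blue Index
    return exr, exg, exb
```

For a predominantly green pixel, ExG is large while ExR and ExB are negative, which is what makes the indexes useful for separating vegetation-like regions.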
This set of features was chosen based on earlier experiments with a preliminary version of the algorithm, which had a single stage and no image-resize step. It was applied to a subset of images from the GrabCut dataset with a larger set of features, including the H and S components obtained with the method described by Smith (1978), and the mean (M) and standard deviation (SD) of the RGB and HSV components in a window around each pixel. All features were normalized to have zero mean and unit variance. After that, each feature was given a weight to be used in the calculation of the distance among pixels. The feature weights were optimized using the Genetic Algorithm from the MATLAB Global Optimization Toolbox, with its default parameters and a fitness function minimizing the error rate, given by the number of mislabeled pixels in relation to all unlabeled pixels. Based on the results shown in Table 2, the H and S features were discarded because of their low relevance in most images. The mean- and standard-deviation-based features were discarded because the current version of the algorithm works on the resized image, so each pixel is already roughly an average of a window around it. The remaining features are those presented in Table 1.

Image / Feature  dog  21077  124084  271008  208001  llama  doll  person7  sheep  teddy  Mean  

Row  
Col  
R  
G  
B  
H  
S  
V  
ExR  
ExB  
ExG  
MR  
MG  
MB  
SDR  
SDG  
SDB  
MH  
MS  
MV  
SDH  
SDS  
SDV 
In the proposed method, the features from Table 1 are normalized to have zero mean and unit variance. After that, the components may be scaled by a vector of weights $\lambda$ to emphasize or de-emphasize each feature during graph generation. However, for simplicity, in all experiments in this paper, only two sets of weights were used. They will be later referenced as:

(4)

Thus, the first set means all features have the same weight, and the second set means the two location features have more weight than the seven color features. While there are many other possible weight combinations, there is no reliable method to set them a priori without relying on the segmentation results.
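The normalization-and-weighting step can be sketched as follows. The function name and the illustrative weight vector are assumptions of this sketch; the actual weight sets are those referenced as Eq. (4):

```python
import numpy as np

def weighted_features(F, lam):
    """Z-score each column of the n x 9 feature matrix F (zero mean,
    unit variance), then scale each column by its weight in lam."""
    Z = (F - F.mean(axis=0)) / F.std(axis=0)
    return Z * lam
```

After scaling, a feature with weight w contributes w times more than a unit-weight feature to the Euclidean distances used in the graph construction.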
A directed and weighted graph is created to represent the image. It is defined as $G = (V, E)$, where $V = \{v_1, \dots, v_n\}$ is the set of nodes and $E$ is the set of edges $(v_i, v_j)$. Each node $v_i$ corresponds to a pixel $x_i$. There is an edge $(v_i, v_j)$ only if $x_i$ is unlabeled ($x_i \in \mathcal{X}_U$) and $v_j$ is among the k nearest neighbors of $v_i$, considering the Euclidean distance between the features of $x_i$ and $x_j$. During development, it was noticed that the default value of k provides reasonable results in most images, as long as representative seeds are provided; but this parameter may be fine-tuned for each specific image to achieve better segmentation results.
For each edge $(v_i, v_j)$, there is a corresponding weight $w_{ij}$, which is defined using a Gaussian kernel:

$w_{ij} = \exp\left(-\dfrac{d(x_i, x_j)^2}{2\sigma^2}\right)$  (5)

where $d(x_i, x_j)$ is the Euclidean distance between the features of $x_i$ and $x_j$. During development, it was noticed that $\sigma$ is not a very sensitive parameter; therefore, it is fixed for all computer simulations in this paper.
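A sketch of the first-stage graph construction follows. It uses a brute-force neighbor search for brevity (the complexity analysis in Section 4 assumes a k-d tree instead); the σ default and function name are assumptions, and the restriction of edges to unlabeled source nodes is omitted:

```python
import numpy as np

def build_knn_graph(features, k, sigma=0.5):
    """For each node, find its k nearest neighbors in feature space and
    compute the Gaussian edge weights of Eq. (5)."""
    # Pairwise Euclidean distances between all feature vectors
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-loops
    idx = np.argsort(d, axis=1)[:, :k]     # k nearest neighbors per node
    dist = np.take_along_axis(d, idx, axis=1)
    weights = np.exp(-dist ** 2 / (2 * sigma ** 2))
    return idx, weights
```

Replacing the O(n²) distance matrix with a k-d tree query brings the construction down to the O(n log n) discussed in Section 4.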
2.1.2 Label Propagation
For each node $v_i$, a domination vector $u_i(t) = \{u_i^{1}(t), \dots, u_i^{c}(t)\}$ is created. Each element $u_i^{\ell}(t)$ corresponds to the domination level of class $\ell$ over the node $v_i$. The sum of the domination vector in each node is always constant:

$\sum_{\ell=1}^{c} u_i^{\ell}(t) = 1$  (6)

Nodes corresponding to labeled pixels are fully dominated by their corresponding class, and their domination vectors never change. On the other hand, nodes corresponding to unlabeled pixels have variable domination vectors. They are initially set in balance among all classes. Thus, for each node $v_i$, each element of the domination vector is set as follows:

$u_i^{\ell}(0) = \begin{cases} 1 & \text{if } x_i \in \mathcal{X}_L \text{ and } y(x_i) = \ell \\ 0 & \text{if } x_i \in \mathcal{X}_L \text{ and } y(x_i) \neq \ell \\ \frac{1}{c} & \text{if } x_i \in \mathcal{X}_U \end{cases}$  (7)
Then, the iterative label propagation process takes place. At each iteration $t$, each unlabeled node gets contributions from all its neighbors to calculate its new domination levels. Therefore, for each unlabeled node $v_i$, the domination levels are updated as follows:

$u_i^{\ell}(t+1) = \dfrac{\sum_{v_j \in N(v_i)} w_{ij}\, u_j^{\ell}(t)}{\sum_{v_j \in N(v_i)} w_{ij}}$  (8)

where $N(v_i)$ is the set of neighbors of $v_i$. In this sense, the new domination vector is the weighted arithmetic mean of the domination vectors of all its neighbors, whether they are labeled or unlabeled.
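One iteration of this update rule can be sketched as follows; per-node neighbor indices and edge weights are assumed precomputed, and the function name is hypothetical:

```python
import numpy as np

def propagate_step(V, neighbors, weights, unlabeled):
    """Replace each unlabeled node's domination vector by the weighted
    arithmetic mean of its neighbors' vectors (Eq. 8)."""
    Vn = V.copy()
    for i in unlabeled:
        w = weights[i][:, None]                       # edge weights as a column
        Vn[i] = (w * V[neighbors[i]]).sum(axis=0) / w.sum()
    return Vn
```

Because each new vector is a convex combination of vectors that sum to 1, the invariant of Eq. (6) is preserved at every iteration.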
The iterative process stops when the domination vectors converge. At this point, the set of domination vectors is reorganized into a bidimensional grid, with each vector in the same position as its corresponding pixel in the resized image. Then, the grid is enlarged to the size of the original input image using bilinear interpolation, so that there is a domination vector for each pixel of the original input image.
When the first stage finishes, most pixels are completely dominated by a single class. The exceptions are usually the pixels at class borders. Thus, for every node $v_i$ whose highest domination level exceeds a confidence threshold, the dominant class is assigned to the corresponding pixel:

$\hat{y}(x_i) = \arg\max_{\ell} u_i^{\ell}$  (9)

where $\hat{y}(x_i)$ is the class assigned to $x_i$. Otherwise, the pixel is left to be labeled in the second stage.
2.2 The Second Stage
In the second stage, nodes that were not labeled in the first stage continue to receive contributions from their neighbors. However, a new graph is built, in which every pixel of the input image becomes a node (no resizing), and each node corresponding to an unlabeled pixel ($x_i \in \mathcal{X}_U$) is connected to the nodes representing the adjacent pixels in the original image; pixels on the image borders simply have fewer adjacent pixels. So, in the second stage, neighbors are defined only by location, but the edge weights are still defined by Eq. (5), using all nine features.
Notice that the domination vectors are not reset before the second stage. The iterative label propagation process in the second stage also uses Eq. (8), and it stops when the domination vectors converge. At this point, all the remaining unlabeled pixels are labeled after the class that dominates their corresponding node:

$\hat{y}(x_i) = \arg\max_{\ell} u_i^{\ell}$  (10)

where $\hat{y}(x_i)$ is the class assigned to $x_i$.
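The second-stage grid adjacency can be sketched as below. The 8-connectivity is an assumption of this sketch, as is the function name; border pixels naturally receive fewer neighbors:

```python
def grid_neighbors(r, c, height, width):
    """Linear indices of the pixels adjacent to (r, c) in the image grid,
    assuming 8-connectivity; border pixels get fewer neighbors."""
    out = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue                      # skip the pixel itself
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width:
                out.append(rr * width + cc)
    return out
```

Since the degree of every node is bounded by a constant, building this graph is linear in the number of pixels, as discussed in Section 4.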
2.3 Stop Criterion
In both stages, convergence is measured through the average maximum domination level, which is defined as follows:

$\langle u^{\max} \rangle(t) = \dfrac{1}{|\mathcal{X}_U|} \sum_{x_i \in \mathcal{X}_U} \max_{\ell} u_i^{\ell}(t)$  (11)

where the sum runs over all nodes representing unlabeled pixels. $\langle u^{\max} \rangle$ is checked at regular intervals of iterations, and the iterations stop when its increase between two consecutive checkpoints falls below a small threshold. The same interval and threshold are used in all computer simulations in this paper.
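The convergence measure of Eq. (11) is the mean of the per-node maxima over the unlabeled nodes; a minimal sketch (function name hypothetical):

```python
import numpy as np

def avg_max_domination(V, unlabeled):
    """Average maximum domination level (Eq. 11): for each unlabeled
    node, take its highest class domination, then average."""
    return V[unlabeled].max(axis=1).mean()
```

The iterative process compares this value between two consecutive checkpoints and stops when the increase drops below the chosen threshold.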
2.4 The Algorithm
Overall, the proposed algorithm can be outlined as shown in Algorithm 1.
3 Computer Simulations
In this section, some experimental results using the proposed model are presented to show its efficacy in the interactive image segmentation task. First, five real-world images were selected to show that the algorithm can split foreground and background. Later, two other real-world images were selected to show the algorithm segmenting multiple objects at once. For all images, the parameters are set to their default values, except for k and $\lambda$: k is tested with a few values within an interval, and $\lambda$ is tested with both weight sets of Eq. (4). Then, the values that produced the best segmentation results are used for each image. Figure 1 shows: (a) the five images selected to show the segmentation into background and foreground; (b) the "scribbles" that represent the user input, shown in different colors for background and foreground over the grayscale image; and (c) the segmentation results achieved using the proposed method, shown as the foreground extracted from the background. Figure 2 shows: (a) the two images selected to show the multiclass segmentation capabilities of the proposed method; (b) the "scribbles" representing the user input, shown in different colors for each object and the background; and (c) the segmentation results achieved using the proposed method, with each object shown separately. Table 3 shows the image sizes and the parameters k and $\lambda$ used for each of them.
Image  Size  

Dog  
Ball  
Flower  
Bird  
Couple  
Cartridges  
Care Bears 
Notice that the algorithm receives the "scribbles" in a different image or layer from the image to be segmented. It considers that each color represents a different segment to be discovered, so the user seeds must be in a different color for each segment. The images to be segmented were added in black-and-white as background to the "scribbles" in Figures 1 and 2 for illustrative purposes only.
By visually analyzing the segmentation results, one can notice that the proposed method was able to interactively segment different kinds of real-world images with few mistakes.
4 Computational Time and Storage Complexity
In this section, time and storage complexity analyses of the algorithm presented in Subsection 2.4 are provided.
4.1 Computational Time Complexity
At the beginning of Algorithm 1, step 1 consists in resizing the input image with bicubic interpolation. This step has complexity order O(n), where n is the number of pixels in the image. Step 2 consists in building the k-NN graph. It is possible to find nearest neighbors in logarithmic time using k-d trees (Friedman et al., 1977); therefore, this step's computational complexity is O(n log n). Step 3 calculates the edge weights. This step depends on the number of edges; each node has k edges, so the computational complexity is O(nk). Step 4 is the initialization of the nodes' domination levels, and it depends on the number of nodes and classes; therefore, its computational complexity is O(nc). These first steps are dominated by step 2, the network construction, as k and c are usually much smaller than n.
Then we have the loops from steps 5 to 8. The instruction in step 8 consists in updating the domination levels of a node. Since each node takes contributions from its k neighbors, the computational complexity order is O(k). The inner loop (step 7) runs over all unlabeled nodes; since most nodes are unlabeled, the inner loop complexity order is O(nk). The outer loop is executed until the algorithm converges. The convergence depends on the network size and connectivity, which are related to n and k. Therefore, a set of experiments is performed, first with increasing image sizes and fixed k, and later with fixed image size and increasing k, to discover how they impact the number of outer-loop executions. These will be presented and discussed later.
Step 9 consists of checking domination levels and labeling some nodes. Step 10 enlarges the domination-levels matrix using bilinear interpolation. Both steps have complexity order O(n), due to the size of the domination-levels matrix. In step 11, another graph is built, but only adjacent nodes are connected, each to a constant number of other nodes (except for nodes representing image-border pixels, which have fewer), so this step has complexity order O(n). Step 12 is similar to step 3, but this time the average node degree is nearly constant, so the complexity order is O(n).
From steps 13 to 16, there is another pair of loops. Step 15 runs in constant time O(1), since all nodes (except those representing image-border pixels) have the same constant degree no matter the image size. The inner loop executes step 15 for each unlabeled node; in most cases, only a small number of nodes are still unlabeled at this point. The outer loop also depends on how many nodes are still unlabeled in the second stage and on the network connectivity. In the typical scenario, there are few unlabeled nodes, and they form isolated subgraphs. Though it is difficult to calculate the exact computational complexity of the second stage, it is lower than that of the first stage in any typical case. The set of experiments also measures the number of outer-loop iterations in the second stage as n and k increase.
Finally, step 17 is similar to step 9. It also has complexity order O(n).
Figure 3 and Tables 4 and 5 show the number of iterations of the outer loops of the first and second stages, and the time required for convergence, when the proposed method is applied to two images from Figure 1: "Dog" and "Bird" (in the first and fourth rows, respectively). Each image and its respective "scribbles" image are resized to a range of fractions of their original size, while k is kept fixed. By analyzing the graphs, it is possible to see that as the image size increases, the execution-time increase is close to linear (O(n)). While the first-stage inner loop grows linearly with n, the number of outer-loop iterations does not increase significantly, which is expected due to the small-world property of complex networks: the nodes stay grouped in clusters, so the label-spreading rate does not change much. The second stage requires only a few iterations of the outer loop, often the minimum possible, since convergence is only checked at fixed intervals. Tables 4 and 5 also show the error rate in each scenario, which is the number of mislabeled pixels in relation to all pixels labeled by the algorithm. Notice that the algorithm labels all image pixels except those already covered by the "scribbles".
Image Size  Width  Height  Tot. Pixels  Ph. 1  Ph. 2  Time (s)  Error Rate  

Image Size  Width  Height  Tot. Pixels  Ph. 1  Ph. 2  Time (s)  Error Rate  

Figure 4 and Tables 6 and 7 show the number of iterations of the outer loops of the first and second stages, and the time required for convergence, when the proposed method is applied to the same two images from Figure 1: "Dog" and "Bird". However, this time the images are not resized, and the out-degree k of the nodes takes increasing values. By analyzing the graphs, it is possible to see that as k increases, the number of iterations of the first-stage outer loop decreases. This is expected, since the network connectivity increases, and thus the labels spread further at each iteration. On the other hand, the execution time increases, because the first-stage inner loop takes longer as k increases; however, this increase is only logarithmic. Tables 6 and 7 also show the error rate in each scenario, which is calculated as previously described.
k  Ph. 1  Ph. 2  Time (s)  Error Rate  

k  Ph. 1  Ph. 2  Time (s)  Error Rate  

It is worth noting that in real-world problems, c and k do not increase proportionally to n as image sizes increase. The number of classes is unrelated to the image size, and the optimal value of k depends on many factors, such as the image structure, the labeled pixels, and the object positions; but even in similar images it does not need to increase linearly with n to keep similar network connectivity, as the small-world property of complex networks applies. Therefore, the average time complexity of the first stage is usually lower than the linearithmic worst case.
In this sense, step 2 would dominate the execution time. However, although step 2 has the highest computational complexity, it ran faster than the first-stage loop in all the experiments presented in this paper. It will undoubtedly dominate the execution time for huge images, but in the typical scenario, the execution time is still dominated by the first stage (steps 5 to 8).
The second-stage execution time is usually negligible; it is very fast compared to the first stage and step 2.
In summary, steps 5 to 8 run in linear time O(n) in the best scenario (fixed k) and linearithmic time O(n log n) in the worst scenario. The first stage therefore dominates the execution time for images of moderate size. Only for huge images would step 2 dominate the execution time, and it runs in linearithmic time O(n log n). Therefore, in most real-world scenarios, a time complexity between O(n) and O(n log n) is expected.
4.2 Storage Complexity
Regarding memory requirements and storage complexity, the proposed algorithm uses the following data structures: the resized image, the features table, the neighbors table, the weights table, the domination vectors, and the labeled output image. The resized image has size O(n). The features table is built from the input image (or the resized image) and is used to build the graph. There are nine features, so the features table has size O(n) in both stages. In the first stage, the neighbors table holds the k nearest neighbors of each node, so its size is O(nk). In the second stage, each node has only a constant number of neighbors or fewer, so the neighbors table has size O(n). The weights table has the same size as the neighbors table in both stages. The domination vectors hold the pertinence of each node to each class, so their size is O(nc) in both stages. Finally, the labeled output image has size O(n). As explained before, in real-world problems, c and k do not increase proportionally to n. So, we may expect all data structures to grow linearly with n, and the storage complexity is O(n).
4.3 LargeScale Networks
The proposed method was also tested on large images to evaluate its behavior on large-scale networks. The source picture of the "Dog" image from Figure 1, a multi-megapixel JPEG photograph, is used in these experiments.
In the first experiment, the picture is progressively enlarged using bicubic interpolation to simulate a much larger picture. After the enlargement, some Poisson noise is added using the "imnoise" function from the MATLAB Image Processing Toolbox to simulate the noise from a camera sensor; otherwise, the enlarged images would look like a set of flat tiles. The same enlargement is applied to the "scribbles" image, but using nearest-neighbor interpolation to avoid the introduction of new colors, which would be mistakenly interpreted as new classes. Figure 5 and Table 8 show the number of iterations of the outer loops of the first and second stages, and the time required for convergence, when the proposed method is applied. Table 8 additionally shows the error rate in each scenario.
Image Size  Width  Height  Tot. Pixels  Ph. 1  Ph. 2  Time (s)  Error Rate  

MP  
MP  
MP  
MP  
MP  
MP  
MP  
MP  
MP  
MP 
In the second experiment, the full-size picture is used without modification, but the out-degree k of the nodes takes increasing values. Figure 6 and Table 9 show the number of iterations of the outer loops of the first and second stages, and the time required for convergence, when the proposed method is applied. Table 9 additionally shows the error rate in each scenario.
k  Ph. 1  Ph. 2  Time (s)  Error Rate  

By analyzing these results, the same patterns seen in the experiments with smaller images are observed. As the network grows, the number of first-phase iterations also increases, and the execution time increases almost linearly. As the connectivity increases, the number of first-phase iterations decreases, and the execution time increases logarithmically.
4.4 Small-Worldness
The efficiency of the proposed method relies heavily on the small-world property of the networks it generates in its first phase. In particular, when the edges are created to the k nearest neighbors of each node, the clustering coefficient is usually high, so label information quickly spreads to the neighborhood.
To verify whether the small-world property is present in a network, Humphries and Gurney (2008) proposed a measure called small-worldness. The small-worldness $S$ of a given graph may be calculated as follows:

$S = \dfrac{C / C_r}{L / L_r}$  (12)

where $C$ and $L$ are the clustering coefficient and the mean shortest path length of the network, respectively, and $C_r$ and $L_r$ are the clustering coefficient and the mean shortest path length observed in equivalent random networks, i.e., networks with the same number of nodes and edges. $S > 1$ indicates the presence of the small-world property.
Unfortunately, $S$ is undefined for disconnected networks, because in those scenarios $L$ diverges to infinity. To overcome this drawback, Zanin (2015) proposed an alternative formulation of small-worldness, which uses the average efficiency of the network instead of the shortest path length, since efficiency is defined even for disconnected networks. The efficiency of a graph is calculated as follows:

$E = \dfrac{1}{n(n-1)} \sum_{i \neq j} \dfrac{1}{d(v_i, v_j)}$  (13)

where $n$ is the total number of nodes in the network and $d(v_i, v_j)$ denotes the length of the shortest path between nodes $v_i$ and $v_j$, with the convention that $1/d(v_i, v_j) = 0$ when no such path exists.
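Eq. (13) can be computed with a breadth-first search from each node; unreachable pairs contribute zero, which is what keeps the measure well defined for disconnected graphs. A sketch for unweighted graphs in adjacency-list form (function name hypothetical):

```python
from collections import deque

def efficiency(adj):
    """Average efficiency of an unweighted graph given as adjacency
    lists; pairs with no connecting path contribute 0 to the sum."""
    n = len(adj)
    total = 0.0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:                          # BFS from source s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))
```

For a fully connected triangle the efficiency is 1, the maximum possible, while a graph with no edges has efficiency 0.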
The new efficiency-based small-worldness is then defined as follows:

$S_E = \dfrac{C / C_r}{E_r / E}$  (14)

where $C$ and $E$ are the clustering coefficient and the average efficiency of the network, respectively, and $C_r$ and $E_r$ are the clustering coefficient and the average efficiency observed in equivalent random networks. $S_E > 1$ indicates the presence of the small-world property.
Notice that disconnected networks are not a problem for the proposed method: an unlabeled node only needs a path to a labeled node to receive label information, and even an unlabeled node with no such path still gets its label in the second stage. Therefore, the efficiency-based small-worldness is used in this paper.
Table 10 shows the small-worldness, clustering coefficient, and efficiency of the networks built during the first phase of the proposed method for the "Dog" image resized from 10% to 100% of its original size with fixed k. Table 11 shows the same measures for the networks built during the first phase for the "Dog" image at its original size, with k ranging from 10 to 250. In both cases, the mean clustering coefficient and the mean average efficiency of random networks with the same number of nodes and edges are also shown for comparison.
By analyzing Tables 10 and 11, one can notice that all the networks have high small-worldness levels, clearly showing that they have the small-world property. In particular, the clustering coefficients are much higher than those of equivalent random networks.
Image Size  S_E  C  E  C_r  E_r  

10%  38.03  0.4301  0.0898  0.0036  0.2816 
20%  62.65  0.3923  0.0740  0.0018  0.2591 
30%  127.09  0.3851  0.0685  0.0009  0.2234 
40%  240.52  0.3739  0.0628  0.0005  0.1874 
50%  418.88  0.3718  0.0603  0.0003  0.1610 
60%  694.56  0.3683  0.0592  0.0002  0.1374 
70%  1045.17  0.3648  0.0567  0.0002  0.1169 
80%  1506.40  0.3625  0.0559  0.0001  0.0993 
90%  2382.55  0.3606  0.0555  0.0001  0.0827 
100%  3943.10  0.3595  0.0544  0.0001  0.0694 
k  S_E  C  E  C_r  E_r  

10  2599.98  0.3595  0.0544  0.0001  0.0695 
20  4490.24  0.3858  0.0780  0.0001  0.0694 
30  6446.55  0.3976  0.0935  0.0001  0.0694 
40  6570.85  0.4061  0.1044  0.0001  0.0696 
50  8705.99  0.4128  0.1134  0.0001  0.0696 
60  8098.44  0.4178  0.1208  0.0001  0.0694 
70  8161.99  0.4219  0.1272  0.0001  0.0695 
80  7588.72  0.4255  0.1331  0.0001  0.0696 
90  10167.96  0.4288  0.1385  0.0001  0.0695 
100  10150.12  0.4319  0.1433  0.0001  0.0694 
110  8386.42  0.4348  0.1475  0.0001  0.0694 
120  9803.54  0.4375  0.1516  0.0001  0.0694 
130  12145.72  0.4401  0.1556  0.0001  0.0695 
140  10484.87  0.4426  0.1592  0.0001  0.0696 
150  13139.53  0.4450  0.1627  0.0001  0.0696 
160  12865.17  0.4471  0.1659  0.0001  0.0694 
170  10555.00  0.4491  0.1689  0.0001  0.0695 
180  11310.11  0.4510  0.1718  0.0001  0.0696 
190  12677.64  0.4529  0.1747  0.0001  0.0695 
200  14342.62  0.4547  0.1774  0.0001  0.0694 
210  13331.34  0.4565  0.1801  0.0001  0.0694 
220  15375.38  0.4583  0.1826  0.0001  0.0697 
230  14912.54  0.4600  0.1851  0.0001  0.0696 
240  15552.47  0.4618  0.1875  0.0001  0.0695 
250  12742.60  0.4635  0.1899  0.0001  0.0694 
5 Benchmark
Figures 1 and 2 show segmentation examples on real-world images, where the user input is limited to a set of “scribbles” on the main object(s) and the background. The results are qualitatively good, as they mostly agree with perceptual boundaries.
For quantitative results, the proposed method is applied to the images of the Microsoft GrabCut dataset (Rother et al., 2004). Though other datasets with ground-truth segmentations are available, this one is, to the best of my knowledge, the only one that also provides seed regions. It is also the only dataset widely used in other papers, which makes a quantitative comparison with state-of-the-art methods possible. Its original seed regions are not presented as “scribbles”; instead, they consist of a large number of labeled pixels and a narrow band of unlabeled pixels around the contours of the objects to be segmented. In spite of that, the proposed method can be applied to them without any modification or extra cost.
Table 12 presents a comparison of the average error rates obtained on the GrabCut dataset (Rother et al., 2004) by the proposed method and other interactive image segmentation methods. The proposed method was first applied to the whole dataset with its default parameters, achieving an error rate of 4.15%. One of its parameters was then optimized for each image, and an error rate of 3.21% was achieved.
Method  Error Rate
sDPMNL (boundary) (Ding et al., 2012)  11.43%
GMMVL (location + color + boundary) (Yi et al., 2004)  10.45%
SVM (location + color + boundary) (Chang and Lin, 2011)  9.21%
GMMRF (Blake et al., 2004)  7.90%
sDPMNL (color) (Ding et al., 2012)  7.65%
Superpixels Hypergraph (Ding and Yilmaz, 2008)  7.30%
Lazy Snapping (Li et al., 2004)  6.65%
Graph Cuts (Boykov and Jolly, 2001)  6.60%
Cost volume filtering (Hosni et al., 2013)  6.20%
Directed Image Neighborhood Hypergraph (Ducournau and Bretto, 2014)  6.15%
Robust P^n (Kohli et al., 2009)  6.08%
GrabCut (Rother et al., 2004)  5.46%
Regularized Laplacian (Duchenne et al., 2008)  5.40%
Grady’s random walker (Grady, 2006)  5.40%
Probabilistic Hypergraph (Ding and Yilmaz, 2010)  5.33%
DPMVL (color + boundary) (Ding et al., 2012)  5.19%
Laplacian Coordinates (Casaca et al., 2014)  5.04%
sDPMVL (color + boundary) (Ding et al., 2012)  4.78%
Sub-Markov Random Walk (Dong et al., 2016)  4.61%
Normalized Lazy Random Walker (Bampis et al., 2017)  4.37%
Normalized Random Walker (Bampis et al., 2017)  4.35%
Nonparametric Higher-Order (Kim et al., 2010)  4.25%
Proposed Method (default parameters)  4.15%
Constrained Random Walks (Yang et al., 2010)  4.08%
Lazy Random Walks (Shen et al., 2014)  3.89%
Robust Multilayer Graph Constraints (Wang et al., 2016a)  3.79%
Texture Aware Model (Zhou et al., 2013)  3.64%
Pairwise Likelihood Learning (Wang et al., 2017)  3.49%
Multilayer Graph Constraints (Wang et al., 2016b)  3.44%
Proposed Method (optimized parameter)  3.21%
Random Walks with Restart (Kim et al., 2008)  3.11%
Normalized Sub-Markov Random Walk (Bampis et al., 2017)  3.10%
Diffusive Likelihood (Wang et al., 2018b)  3.08%
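For concreteness, the error-rate computation behind Table 12 can be sketched as follows, using the trimap gray-level encoding of the GrabCut dataset (0 = ignored background, 64 = background seed, 128 = unlabeled, 255 = foreground seed). The function name and the flat-list representation are illustrative, not taken from the paper's code.

```python
# Trimap gray levels used by the Microsoft GrabCut dataset.
IGNORE, BG_SEED, UNLABELED, FG_SEED = 0, 64, 128, 255

def error_rate(trimap, predicted, truth):
    """Fraction of unlabeled trimap pixels whose predicted label disagrees
    with the ground truth. Seeds and ignored pixels are excluded, so only
    the pixels the method actually had to classify are counted."""
    wrong = total = 0
    for t, p, g in zip(trimap, predicted, truth):
        if t != UNLABELED:
            continue  # skip seeds (64, 255) and ignored background (0)
        total += 1
        wrong += (p != g)
    return wrong / total if total else 0.0
```

For example, with three unlabeled pixels of which two are mislabeled, the function returns 2/3; pixels already labeled in the trimap never affect the score.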
Figure 7 shows some examples of images from the Microsoft GrabCut dataset and the corresponding segmentation results. The first column shows the input images. The second column shows the “trimaps” providing the seed regions: black (0) represents the background, ignored by the algorithm; dark gray (64) is the labeled background; light gray (128) is the unlabeled region, whose labels are estimated by the proposed method; and white (255) is the labeled foreground, which generates the foreground class particles. The error rates in Table 12 are computed as the ratio of the number of incorrectly classified pixels to the total number of unlabeled pixels. The third and fourth columns show the segmentation results obtained by the proposed method with its default parameters and with the parameter optimized for each image, respectively.
5.1 Execution Times
The algorithm was implemented in MATLAB, with the loops in both stages implemented in C (as a MEX function). Segmenting each image of the Microsoft GrabCut dataset took a few hundred milliseconds on average on a computer with an Intel Core i7-4790K CPU and 32GB of RAM.
Wang et al. (2018b) present a comparison of the average running times of representative interactive image segmentation techniques on all test images of the Microsoft GrabCut dataset. They also used an Intel i7 CPU and MATLAB implementations in their tests; therefore, the same test was applied to the proposed method, and the results are shown in Table 13. The proposed method was faster than all the other tested methods.
Method  Time (s)
Nonparametric Higher-Order (Kim et al., 2010)  11.0
Multilayer Graph Constraints (Wang et al., 2016b)  5.4
Sub-Markov Random Walk (Dong et al., 2016)  5.1
Diffusive Likelihood (Wang et al., 2018b)  3.4
Laplacian Coordinates (Casaca et al., 2014)  3.2
Grady’s random walker (Grady, 2006)  0.8
GrabCut (Rother et al., 2004)  0.7
Proposed Method (default parameters)  0.3
5.2 Parameter Analysis
The sensitivity of the proposed method to its parameter values is analyzed using the Microsoft GrabCut dataset. In all scenarios, the images of the dataset are segmented with the default parameters, except for the parameter under analysis. Figures 8(a) and 8(b) show the error rates as the first two parameters vary over their tested ranges, and Figures 8(c) and 8(d) show the error rates and execution times as the stop-criterion parameter varies.
By analyzing those graphs, one can notice which values produced the best results in the parameter analysis. The first two parameters show low sensitivity, each with a fairly broad range of good values. The stop-criterion parameter shows decreasing error rates as it is lowered, until the rates stabilize; however, the execution times grow as it decreases, so its default value offers a good trade-off between execution time and error rate.
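A per-parameter analysis like the one above can be reproduced with a simple sweep harness. The sketch below assumes a hypothetical segment(image, trimap, param=...) routine standing in for the segmentation method (it is not the paper's API) and records the mean error rate and mean execution time for each tested value.

```python
import time

def sweep(segment, images, values):
    """Vary one parameter over `values`, segmenting every (image, trimap,
    truth) triple in `images`, and return a list of
    (value, mean_error_rate, mean_time_seconds) tuples.
    `segment` is a placeholder for the segmentation routine."""
    results = []
    for v in values:
        errors, times = [], []
        for img, trimap, truth in images:
            t0 = time.perf_counter()
            pred = segment(img, trimap, param=v)     # hypothetical call
            times.append(time.perf_counter() - t0)
            # Simple per-pixel disagreement with the ground truth.
            errors.append(sum(p != g for p, g in zip(pred, truth)) / len(truth))
        results.append((v, sum(errors) / len(errors), sum(times) / len(times)))
    return results
```

Plotting the error and time columns against the parameter values then reveals the trade-off discussed above: for a stop-criterion-like parameter, the error curve flattens while the time curve keeps rising.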
5.3 Seed Sensitivity Analysis
The original “trimaps” from the Microsoft GrabCut dataset provide a large number of seeds for interactive image segmentation methods. However, the proposed method does not need all those seeds to produce reasonable segmentation results. Therefore, an experiment was set up in which each seed of each “trimap” from the dataset was randomly erased with a given probability, so the changed pixels would appear unlabeled to the method. By varying this probability, it is possible to generate “trimaps” with roughly 100% down to 1% of the original seeds. Multiple such “trimaps” were generated for each image, one batch for each tested probability, and the proposed method was applied to all of them. The mean error rates for each configuration are presented in Figure 9(a).
Notice that while the error rates decrease as the number of seeds decreases in Figure 9(a), that does not necessarily mean that the segmentation results are better: with fewer seeds there are more unlabeled pixels, so each pixel mislabeled by the algorithm has less impact on the error rate. Thus, Figure 9(b) shows the error rates for each configuration, but excluding from the computation the pixels that were seeds in the original “trimaps”. These results show that the number of seeds may be greatly reduced without much impact on the error rates.
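The seed-erasure step of this experiment can be sketched as follows, reusing the trimap gray levels of the dataset (64 = background seed, 255 = foreground seed, 128 = unlabeled). This is an illustrative sketch of the procedure, not the original experiment code, and the function name is hypothetical.

```python
import random

def erase_seeds(trimap, p, seed=0):
    """Return a copy of `trimap` where each seed pixel (background seed 64
    or foreground seed 255) is independently turned into an unlabeled
    pixel (128) with probability p. Ignored background (0) and pixels
    that are already unlabeled are left untouched."""
    rng = random.Random(seed)  # fixed seed for reproducible trimaps
    out = []
    for v in trimap:
        if v in (64, 255) and rng.random() < p:
            out.append(128)    # the method now sees this pixel as unlabeled
        else:
            out.append(v)
    return out
```

With p = 0 the trimap is unchanged, and with p close to 1 almost all seeds vanish, reproducing the roughly 100%-to-1% seed range used in the sensitivity experiment.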
5.4 Microsoft GrabCut Dataset with “Scribbles”
Andrade and Carrera (2015) present an objective and empirical evaluation method for seed-based interactive segmentation algorithms. They have extended the Microsoft GrabCut dataset by incorporating two sets of “scribbles” for each of the images, available at https://github.com/flandrade/datasetinteractivealgorithms.
The first set of “scribbles” employs four strokes per image: three on the background and one small area on the foreground object. The second set of “scribbles” marks the foreground region in more detail. The two sets reflect two different degrees of user effort.
The proposed method was applied to both sets of “scribbles”. Table 14 presents a comparison of the average error rates obtained by the proposed method and other interactive image segmentation methods. The proposed method was first applied to the whole dataset with its default parameters; later, one of its parameters was optimized for each image. In both scenarios, and for both sets of “scribbles”, the proposed method outperformed the other methods, even with its default parameters.
6 Conclusions
In this paper, a graph-based interactive image segmentation method is proposed. Seeds are provided by the user in the form of “scribbles”, loosely traced over the objects of interest and the background. The method takes advantage of complex-network properties to spread labels quickly, with low time and storage complexity, and it can be applied to multiclass problems at no extra cost.
Despite its simplicity, the method achieves classification accuracy comparable to that of state-of-the-art methods on the Microsoft GrabCut dataset, with the original trimaps used as user input, which is the common benchmark for evaluating and comparing interactive image segmentation methods. It was also the fastest among the compared methods, which include both classic and newer state-of-the-art approaches. Moreover, it achieved the best results when the user input consists of only a few “scribbles”, outperforming other recent approaches.
Though the proposed method has some parameters that can be fine-tuned to achieve better results, usually only one of them has a significant impact on the classification accuracy. The default parameters may be used when time is restricted. The user may also fine-tune parameters while adding more “scribbles” if the current segmentation results are not satisfactory.
The method may also be extended by introducing edge-finding components or edge-related features, to further decrease error rates and to handle more challenging segmentation tasks.
Acknowledgment
The author would like to thank the São Paulo State Research Foundation (FAPESP) [grant 2016/056694].
References

Andrade and Carrera (2015) Andrade, F., Carrera, E.V., 2015. Supervised evaluation of seed-based interactive image segmentation algorithms, in: 2015 20th Symposium on Signal Processing, Images and Computer Vision (STSIVA), pp. 1–7. doi:10.1109/STSIVA.2015.7330447.
Artan (2011) Artan, Y., 2011. Interactive image segmentation using machine learning techniques, in: Computer and Robot Vision (CRV), 2011 Canadian Conference on, pp. 264–269. doi:10.1109/CRV.2011.42.
Avendi et al. (2016) Avendi, M., Kheradvar, A., Jafarkhani, H., 2016. A combined deep-learning and deformable-model approach to fully automatic segmentation of the left ventricle in cardiac MRI. Medical Image Analysis 30, 108–119. URL: http://www.sciencedirect.com/science/article/pii/S1361841516000128, doi:https://doi.org/10.1016/j.media.2016.01.005.
 Bampis et al. (2017) Bampis, C.G., Maragos, P., Bovik, A.C., 2017. Graphdriven diffusion and random walk schemes for image segmentation. IEEE Transactions on Image Processing 26, 35–50. doi:10.1109/TIP.2016.2621663.
 Blake et al. (2004) Blake, A., Rother, C., Brown, M., Perez, P., Torr, P., 2004. Interactive image segmentation using an adaptive gmmrf model, in: Pajdla, T., Matas, J. (Eds.), Computer Vision  ECCV 2004. Springer Berlin Heidelberg. volume 3021 of Lecture Notes in Computer Science, pp. 428–441. URL: http://dx.doi.org/10.1007/9783540246701_33, doi:10.1007/9783540246701_33.
Boykov and Funka-Lea (2006) Boykov, Y., Funka-Lea, G., 2006. Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision 70, 109–131. URL: http://dx.doi.org/10.1007/s1126300679345, doi:10.1007/s1126300679345.
Boykov and Jolly (2001) Boykov, Y., Jolly, M.P., 2001. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, in: Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, pp. 105–112 vol. 1. doi:10.1109/ICCV.2001.937505.
 Bozkurt et al. (2018) Bozkurt, F., Köse, C., Sarı, A., 2018. An inverse approach for automatic segmentation of carotid and vertebral arteries in cta. Expert Systems with Applications 93, 358 – 375. URL: http://www.sciencedirect.com/science/article/pii/S0957417417307236, doi:https://doi.org/10.1016/j.eswa.2017.10.041.
 Breve (2017) Breve, F., 2017. Building networks for image segmentation using particle competition and cooperation, in: Gervasi, O., Murgante, B., Misra, S., Borruso, G., Torre, C.M., Rocha, A.M.A., Taniar, D., Apduhan, B.O., Stankova, E., Cuzzocrea, A. (Eds.), Computational Science and Its Applications – ICCSA 2017, Springer International Publishing, Cham. pp. 217–231.
 Breve et al. (2015a) Breve, F., Quiles, M., Zhao, L., 2015a. Interactive image segmentation of noncontiguous classes using particle competition and cooperation, in: Gervasi, O., Murgante, B., Misra, S., Gavrilova, M.L., Rocha, A.M.A.C., Torre, C., Taniar, D., Apduhan, B.O. (Eds.), Computational Science and Its Applications – ICCSA 2015. Springer International Publishing. volume 9155 of Lecture Notes in Computer Science, pp. 203–216. URL: http://dx.doi.org/10.1007/9783319214047_15, doi:10.1007/978331921404715.
Breve et al. (2015b) Breve, F., Quiles, M.G., Zhao, L., 2015b. Interactive image segmentation using particle competition and cooperation, in: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. doi:10.1109/IJCNN.2015.7280570.
Casaca et al. (2014) Casaca, W., Nonato, L.G., Taubin, G., 2014. Laplacian coordinates for seeded image segmentation, in: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 384–391. doi:10.1109/CVPR.2014.56.
Chang and Lin (2011) Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2, 27.
Christ et al. (2016) Christ, P.F., Elshaer, M.E.A., Ettlinger, F., Tatavarty, S., Bickel, M., Bilic, P., Rempfler, M., Armbruster, M., Hofmann, F., D’Anastasi, M., Sommer, W.H., Ahmadi, S.A., Menze, B.H., 2016. Automatic liver and lesion segmentation in CT using cascaded fully convolutional neural networks and 3D conditional random fields, in: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, Springer International Publishing, Cham. pp. 415–423.
 Dheeru and Karra Taniskidou (2017) Dheeru, D., Karra Taniskidou, E., 2017. UCI machine learning repository. URL: http://archive.ics.uci.edu/ml.
 Ding and Yilmaz (2008) Ding, L., Yilmaz, A., 2008. Image segmentation as learning on hypergraphs, in: Machine Learning and Applications, 2008. ICMLA ’08. Seventh International Conference on, pp. 247–252. doi:10.1109/ICMLA.2008.17.
 Ding and Yilmaz (2010) Ding, L., Yilmaz, A., 2010. Interactive image segmentation using probabilistic hypergraphs. Pattern Recognition 43, 1863 – 1873. URL: http://www.sciencedirect.com/science/article/pii/S0031320309004440, doi:http://dx.doi.org/10.1016/j.patcog.2009.11.025.
 Ding et al. (2012) Ding, L., Yilmaz, A., Yan, R., 2012. Interactive image segmentation using dirichlet process multipleview learning. IEEE Transactions on Image Processing 21, 2119–2129. doi:10.1109/TIP.2011.2181398.
 Dong et al. (2016) Dong, X., Shen, J., Shao, L., Gool, L.V., 2016. Submarkov random walk for image segmentation. IEEE Transactions on Image Processing 25, 516–527. doi:10.1109/TIP.2015.2505184.
 Duchenne et al. (2008) Duchenne, O., Audibert, J.Y., Keriven, R., Ponce, J., Segonne, F., 2008. Segmentation by transduction, in: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. doi:10.1109/CVPR.2008.4587419.
 Ducournau and Bretto (2014) Ducournau, A., Bretto, A., 2014. Random walks in directed hypergraphs and application to semisupervised image segmentation. Computer Vision and Image Understanding 120, 91 – 102. URL: http://www.sciencedirect.com/science/article/pii/S1077314213002038, doi:http://dx.doi.org/10.1016/j.cviu.2013.10.012.
 Friedman et al. (1977) Friedman, J.H., Bentley, J.L., Finkel, R.A., 1977. An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw. 3, 209–226. URL: http://doi.acm.org/10.1145/355744.355745, doi:10.1145/355744.355745.
 Gonzalez and Woods (2008) Gonzalez, R.C., Woods, R.E., 2008. Digital Image Processing (3rd Edition). PrenticeHall, Inc., Upper Saddle River, NJ, USA.
 Grady (2006) Grady, L., 2006. Random walks for image segmentation. Pattern Analysis and Machine Intelligence, IEEE Transactions on 28, 1768–1783. doi:10.1109/TPAMI.2006.233.
 Guijarro et al. (2011) Guijarro, M., Pajares, G., Riomoros, I., Herrera, P., BurgosArtizzu, X., Ribeiro, A., 2011. Automatic segmentation of relevant textures in agricultural images. Computers and Electronics in Agriculture 75, 75 – 83. URL: http://www.sciencedirect.com/science/article/pii/S0168169910001924, doi:https://doi.org/10.1016/j.compag.2010.09.013.
 Hosni et al. (2013) Hosni, A., Rhemann, C., Bleyer, M., Rother, C., Gelautz, M., 2013. Fast costvolume filtering for visual correspondence and beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 504–511. doi:10.1109/TPAMI.2012.156.
 Humphries and Gurney (2008) Humphries, M.D., Gurney, K., 2008. Network ‘smallworldness’: A quantitative method for determining canonical network equivalence. PLOS ONE 3, 1–10. URL: https://doi.org/10.1371/journal.pone.0002051, doi:10.1371/journal.pone.0002051.
 Kim et al. (2008) Kim, T.H., Lee, K.M., Lee, S.U., 2008. Generative image segmentation using random walks with restart, in: Forsyth, D., Torr, P., Zisserman, A. (Eds.), Computer Vision – ECCV 2008, Springer Berlin Heidelberg, Berlin, Heidelberg. pp. 264–275.
 Kim et al. (2010) Kim, T.H., Lee, K.M., Lee, S.U., 2010. Nonparametric higherorder learning for interactive segmentation, in: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3201–3208. doi:10.1109/CVPR.2010.5540078.
 Kohli et al. (2009) Kohli, P., Ladický, L., Torr, P.H.S., 2009. Robust higher order potentials for enforcing label consistency. International Journal of Computer Vision 82, 302–324. URL: https://doi.org/10.1007/s1126300802020, doi:10.1007/s1126300802020.
 Lempitsky et al. (2009) Lempitsky, V., Kohli, P., Rother, C., Sharp, T., 2009. Image segmentation with a bounding box prior, in: 2009 IEEE 12th International Conference on Computer Vision, pp. 277–284. doi:10.1109/ICCV.2009.5459262.
Li et al. (2010) Li, J., Bioucas-Dias, J., Plaza, A., 2010. Semisupervised hyperspectral image segmentation using multinomial logistic regression with active learning. Geoscience and Remote Sensing, IEEE Transactions on 48, 4085–4098. doi:10.1109/TGRS.2010.2060550.
Li et al. (2004) Li, Y., Sun, J., Tang, C.K., Shum, H.Y., 2004. Lazy snapping. ACM Trans. Graph. 23, 303–308. URL: http://doi.acm.org/10.1145/1015706.1015719, doi:10.1145/1015706.1015719.
 Liew et al. (2017) Liew, J., Wei, Y., Xiong, W., Ong, S., Feng, J., 2017. Regional interactive image segmentation networks, in: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2746–2754. doi:10.1109/ICCV.2017.297.
 Lin et al. (2016) Lin, D., Dai, J., Jia, J., He, K., Sun, J., 2016. Scribblesup: Scribblesupervised convolutional networks for semantic segmentation, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3159–3167. doi:10.1109/CVPR.2016.344.
 MartinezMuñoz et al. (2016) MartinezMuñoz, S., RuizFernandez, D., GalianaMerino, J.J., 2016. Automatic abdominal aortic aneurysm segmentation in mr images. Expert Systems with Applications 54, 78 – 87. URL: http://www.sciencedirect.com/science/article/pii/S0957417416000270, doi:https://doi.org/10.1016/j.eswa.2016.01.017.
 Moeskops et al. (2016) Moeskops, P., Viergever, M.A., Mendrik, A.M., de Vries, L.S., Benders, M.J.N.L., Išgum, I., 2016. Automatic segmentation of mr brain images with a convolutional neural network. IEEE Transactions on Medical Imaging 35, 1252–1261. doi:10.1109/TMI.2016.2548501.
 Narkhede (2013) Narkhede, M.H., 2013. A review on graph based segmentation. Journal of innovative research in electrical, electronics, instrumentation and control engineering 1, 3.
 Oh et al. (2017) Oh, C., Ham, B., Sohn, K., 2017. Robust interactive image segmentation using structureaware labeling. Expert Systems with Applications 79, 90 – 100. URL: http://www.sciencedirect.com/science/article/pii/S0957417417301215, doi:https://doi.org/10.1016/j.eswa.2017.02.031.
 PatinoCorrea et al. (2014) PatinoCorrea, L.J., Pogrebnyak, O., MartinezCastro, J.A., FelipeRiveron, E.M., 2014. White matter hyperintensities automatic identification and segmentation in magnetic resonance images. Expert Systems with Applications 41, 7114 – 7123. URL: http://www.sciencedirect.com/science/article/pii/S0957417414003169, doi:https://doi.org/10.1016/j.eswa.2014.05.036.
 Pham et al. (2010) Pham, V.Q., Takahashi, K., Naemura, T., 2010. Boundingbox based segmentation with single mincut using distant pixel similarity, in: 2010 20th International Conference on Pattern Recognition, pp. 4420–4423. doi:10.1109/ICPR.2010.1074.
 Price et al. (2010) Price, B.L., Morse, B., Cohen, S., 2010. Geodesic graph cut for interactive image segmentation, in: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3161–3168. doi:10.1109/CVPR.2010.5540079.
 Rother et al. (2004) Rother, C., Kolmogorov, V., Blake, A., 2004. “grabcut”: Interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23, 309–314. URL: http://doi.acm.org/10.1145/1015706.1015720, doi:10.1145/1015706.1015720.
 Shapiro and Stockman (2001) Shapiro, L., Stockman, G., 2001. Computer Vision. Prentice Hall.
 Shen et al. (2014) Shen, J., Du, Y., Wang, W., Li, X., 2014. Lazy random walks for superpixel segmentation. IEEE Transactions on Image Processing 23, 1451–1462. doi:10.1109/TIP.2014.2302892.
 Smith (1978) Smith, A.R., 1978. Color gamut transform pairs, in: ACM Siggraph Computer Graphics, ACM. pp. 12–19.
 Vicente et al. (2008) Vicente, S., Kolmogorov, V., Rother, C., 2008. Graph cut based image segmentation with connectivity priors, in: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. doi:10.1109/CVPR.2008.4587440.
 Wang et al. (2007a) Wang, F., Wang, X., Li, T., 2007a. Efficient label propagation for interactive image segmentation, in: Sixth International Conference on Machine Learning and Applications (ICMLA 2007), pp. 136–141. doi:10.1109/ICMLA.2007.54.
Wang et al. (2018a) Wang, G., Zuluaga, M.A., Li, W., Pratt, R., Patel, P.A., Aertsen, M., Doel, T., David, A.L., Deprest, J., Ourselin, S., Vercauteren, T., 2018a. DeepIGeoS: A deep interactive geodesic framework for medical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1–1. doi:10.1109/TPAMI.2018.2840695.
 Wang et al. (2007b) Wang, J., Agrawala, M., Cohen, M.F., 2007b. Soft scissors: An interactive tool for realtime high quality matting. ACM Trans. Graph. 26. URL: http://doi.acm.org/10.1145/1276377.1276389, doi:10.1145/1276377.1276389.
 Wang et al. (2018b) Wang, T., Ji, Z., Sun, Q., Chen, Q., Ge, Q., Yang, J., 2018b. Diffusive likelihood for interactive image segmentation. Pattern Recognition 79, 440 – 451. URL: http://www.sciencedirect.com/science/article/pii/S0031320318300761, doi:https://doi.org/10.1016/j.patcog.2018.02.023.
 Wang et al. (2016a) Wang, T., Ji, Z., Sun, Q., Chen, Q., Jing, X., 2016a. Interactive multilabel image segmentation via robust multilayer graph constraints. IEEE Transactions on Multimedia 18, 2358–2371. doi:10.1109/TMM.2016.2600441.
Wang et al. (2017) Wang, T., Sun, Q., Ge, Q., Ji, Z., Chen, Q., Xia, G., 2017. Interactive image segmentation via pairwise likelihood learning, in: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, pp. 2957–2963. URL: