Texture segmentation fits within the larger area of image segmentation, with a particular focus on images that contain textures. The goal is to partition the image, i.e., to group the pixels, into differently textured regions. Texture segmentation, and image segmentation more generally, is an important task in computer vision and pattern recognition, and is widely applied in areas such as scene understanding, remote sensing, and autonomous driving (Pal and Pal, 1993; Zhang, 2006; Reed and Dubuf, 1993; Liu et al., 2019).
At least in recent decades, texture segmentation methods have almost invariably been based on extracting local features around each pixel, such as SIFT (Lowe, 1999), which are then fed into a clustering algorithm, such as k-means. An emblematic approach in this context is that of Shi and Malik (2000), who used the pixel value as the feature, arguably the simplest possible choice, and applied a form of spectral clustering to group the pixels. The process is similar to what is done in the adjacent area of texture classification, the main difference being that a classification method is used instead of a clustering algorithm (Varma, 2004; Randen and Husoy, 1999a).
Although this basic approach has remained essentially unchanged, the process of extracting features has undergone some important changes over the years, ranging from the use of sophisticated systems from applied harmonic analysis such as Gabor filters or wavelets (Dunn and Higgins, 1995; Grigorescu et al., 2002; Jain and Farrokhnia, 1991; Unser, 1995; Weldon and Higgins, 1996; Randen and Husoy, 1999b) to multi-resolution or multiscale aggregation approaches (Galun et al., 2003; Mao and Jain, 1992), among others (Malik et al., 2001; Hofmann et al., 1998; Ronneberger et al., 2015; Milletari et al., 2016; Badrinarayanan et al., 2017). See Humeau-Heurtier (2019) for a recent survey.
While the vast majority of the work in texture segmentation, as in image processing at large, is applied, we contribute some theory by establishing the consistency of the basic approach described above. We do so in a stylized setting which is nonetheless a reasonable mathematical model for the problem of texture segmentation. Markov random fields (MRF) are common models for textures (Cross and Jain, 1983; Geman and Graffigne, 1986), and arguably the most popular in theoretical texture analysis (Rue and Held, 2005; Arias-Castro et al., 2018; Verzelen, 2010a, b; Verzelen and Villers, 2009). This is the model that we use. Although supplanted by the more recent feature extraction methods mentioned above, which in recent years are invariably nonparametric, Gaussian MRF in particular remain the most commonly used parametric model for textures, and were used in the development of methodology not too long ago (Chellappa and Chatterjee, 1985; Zhu et al., 1998; Manjunath and Chellappa, 1991; Paciorek and Schervish, 2006). When textures are modeled by stationary Gaussian MRF, what characterizes them is the covariance structure. Accordingly, when the textures are assumed stationary, the feature we extract is the (local) covariance. When textures are not assumed stationary, we also incorporate location as an additional feature, as the covariance structure may change within a textured region.
The basic approach calls for applying a clustering algorithm to the extracted features. Features are typically represented by (possibly high-dimensional) feature vectors, as is the case with the features that we work with, and thus a large number of clustering methods are applicable. Some of them come with theoretical guarantees, such as k-means (Arthur and Vassilvitskii, 2007), among other methods (Vempala and Wang, 2004; Hsu and Kakade, 2013; Dasgupta, 2010), including single-linkage clustering (Arias-Castro, 2011) and spectral clustering (Ng et al., 2002). In this paper, we use k-means in the context of stationary textures and single-linkage clustering in the context of non-stationary textures.
The paper is organized as follows. In Section 2, we consider stationary textures, for which we extract local second-moment information on patches and apply k-means. In Section 3, we consider non-stationary textures, where we also include location as a feature and apply single-linkage clustering instead. In Section 4, we present the results of some numerical experiments, mostly to illustrate the theory developed in the main part of the paper. Both synthetic and natural textures are considered.
2 Stationary Textures
In this section we consider textures to be stationary. The model we adopt and the method we implement are introduced in Section 2.1 and Section 2.2. We then establish in Section 2.3 the consistency of a simple incarnation of the basic approach.
We have a pixel image of size , which we assume is partitioned into two sub-regions and by a curve . is a stationary Gaussian Markov random field with mean and autocovariance matrix . is a stationary Gaussian Markov random field with mean and autocovariance matrix . In image , we pick pixels at equal intervals and obtain observations
To estimate the curve, we need to cluster the pixels into two groups.
We define scanning patches as follows. To simplify the presentation, assume is the square of an integer (namely for some integer m). For , pick the patch of size ,
Next, the autocovariance is defined based on the scanning patches. For and , define the true autocovariance and the sample autocovariance as follows
Denote the vectorizations of and to be and respectively. Here is the true feature of pixel and is the observed feature of pixel .
Also based on the scanning patches, we define the following three sets
Here and are both stationary fields, so all elements in set are the same and we denote it as . Similarly, all elements in set are the same and we denote it as . Define template autocovariance .
We then introduce the membership matrix. Define the true membership matrix such that for ,
Also define the set of membership matrices as follows
Based on the above definitions, we define the k-means clustering estimator as
where is the row of matrix .
In practice, the k-means problem cannot be solved exactly in general. However, there exist polynomial-time algorithms that return an approximation () satisfying equation (11) below, such as the -approximate method of Kumar et al. (2004).
where and . We thus cluster the pixels into two groups using the estimated membership matrix . The procedure is summarized in Algorithm 1.
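The pipeline described above (local sample autocovariances on patches, then two-group k-means) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's Algorithm 1: the function names, the non-overlapping patch grid, the zero-mean convention, and the `max_lag` parameter are all hypothetical choices, and plain Lloyd iterations with random restarts stand in for the approximation method of Kumar et al. (2004).

```python
import numpy as np

def patch_autocov_features(img, patch, max_lag):
    """Vectorized sample autocovariances, one feature vector per patch.

    Assumptions (not from the paper): zero-mean field, non-overlapping
    square patches, lags (di, dj) with |di|, |dj| <= max_lag.
    """
    H, W = img.shape
    feats = []
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            P = img[i:i + patch, j:j + patch]
            f = []
            for di in range(-max_lag, max_lag + 1):
                for dj in range(-max_lag, max_lag + 1):
                    # a[x] = P[x + (di, dj)] and b[x] = P[x] on the overlap
                    a = P[max(0, di):patch + min(0, di),
                          max(0, dj):patch + min(0, dj)]
                    b = P[max(0, -di):patch + min(0, -di),
                          max(0, -dj):patch + min(0, -dj)]
                    f.append(np.mean(a * b))
            feats.append(f)
    return np.array(feats)

def kmeans_two_groups(X, n_init=10, n_iter=100, seed=0):
    """Lloyd's heuristic with restarts; keeps the lowest-cost labelling."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        C = X[rng.choice(len(X), size=2, replace=False)].copy()
        for _ in range(n_iter):
            d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            # keep the old center if a cluster happens to be empty
            new_C = np.array([X[labels == k].mean(axis=0)
                              if np.any(labels == k) else C[k]
                              for k in range(2)])
            if np.allclose(new_C, C):
                break
            C = new_C
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        cost = d.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels

# Synthetic two-texture image: white noise (left) next to a
# 3x3 moving average of white noise, rescaled to unit variance (right).
rng = np.random.default_rng(0)
side = 256
white = rng.standard_normal((side, side))
z = rng.standard_normal((side, side))
smooth = sum(np.roll(z, (a, b), axis=(0, 1))
             for a in (-1, 0, 1) for b in (-1, 0, 1)) / 3.0
img = np.hstack([white, smooth])

X = patch_autocov_features(img, patch=64, max_lag=1)
labels = kmeans_two_groups(X)
```

Both synthetic textures have unit variance, so the two groups differ only in their nonzero-lag autocovariances, which is the situation the covariance feature is meant to handle.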
Define the set of permutation matrices
Next, define the set of mistakenly clustered elements as follows
The clustering error rate is then
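An error rate of this kind is invariant to relabelling of the clusters, which is why permutation matrices enter the definition. A minimal sketch, assuming the usual form (fraction of mislabelled items, minimized over label permutations) and using a hypothetical function name:

```python
from itertools import permutations

import numpy as np

def clustering_error_rate(true_labels, est_labels, n_clusters=2):
    """Fraction of mislabelled items, minimized over label permutations."""
    true_labels = np.asarray(true_labels)
    est_labels = np.asarray(est_labels)
    best = 1.0
    for perm in permutations(range(n_clusters)):
        # relabel the estimated clusters according to this permutation
        relabeled = np.array(perm)[est_labels]
        best = min(best, float(np.mean(relabeled != true_labels)))
    return best
```

With two clusters there are only two permutations to check, so a labelling that is exactly the swap of the truth has error rate zero.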
We first introduce the following assumptions.
Both and are wide-sense stationary Gaussian Markov random fields.
Let . For ,
Define for and for . With the same in Assumption 2,
Before stating the main result, we give an error bound for .
Under Assumption 1, for and , there exists a constant such that
First let be the vectorization of , then is a vector of length
Then for , there exists a matrix such that
where is a matrix with elements and , and the number of 1 is less than .
Since the field is stationary, suppose , where is non-negative. Let be the spectral decomposition of . Define
By the Hanson-Wright inequality (Rudelson and Vershynin, 2013), for , there exist constants and such that
Next we focus on and . Since
is an orthogonal matrix,
Then for , there exist constants , and such that
So when is small enough, we have
By the union bound, for ,
Let . Define , and , we have . Then all the elements in are clustered correctly.
On the one hand, for and , by contradiction, if ,
which is a contradiction, so . On the other hand, suppose or . Arguing by contradiction, if , then has at least three distinct rows. However, according to the structure of , it has exactly two distinct rows, which is a contradiction. So . Thus, all the elements in are clustered correctly. ∎
Next, we state the main result for the k-means clustering algorithm.
where , . Without loss of generality, set , then
On the one hand, by the triangle inequality,
On the other hand,
then we have
By Lemma 2.2,
So for , as ,
Thus, for stationary Gaussian random fields, we obtain the error bound for the k-means clustering algorithm. ∎
3 Non-stationary Textures
In this section we consider textures to be non-stationary. Here both and are non-stationary Gaussian Markov random fields with mean . We take location information into account and cluster the pixels into two groups with a single-linkage algorithm. The algorithm is described in Section 3.1. We then show the consistency of a simple incarnation of the basic approach in Section 3.2.
Pick pixels at equal intervals from , where
For , it is obvious that or . So there is no overlap between and .
Next for , add location information into its true feature . Denote as the new true feature of pixel as follows
where is a vector of length . Similarly, denote as the new observed feature of pixel as follows
where also is a vector of length .
The single-linkage algorithm proceeds in the following steps. First, among , connect all pairs with , where . Next, for , assign all the other pixels in to the same cluster as . Then, for any pixel that is still not clustered, find the pixel in with the smallest distance to , and assign pixel to the same cluster as . This yields the final clustering.
The procedure is summarized in Algorithm 2.
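The first and last steps above amount to taking connected components of a threshold graph and then attaching leftover points to their nearest clustered point. A minimal sketch under stated assumptions follows: it is not the paper's Algorithm 2, the function names are hypothetical, and the feature vectors are taken as given (how the covariance and location parts are combined or scaled is not specified here).

```python
import numpy as np

def threshold_single_linkage(features, threshold):
    """Label points by connected components of the graph that links
    every pair at Euclidean distance strictly below `threshold`."""
    X = np.asarray(features, dtype=float)
    n = len(X)
    parent = list(range(n))

    def find(i):
        # union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] < threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri
    roots = [find(i) for i in range(n)]
    # map arbitrary root indices to consecutive labels 0, 1, ...
    _, labels = np.unique(roots, return_inverse=True)
    return labels

def assign_to_nearest(points, anchors, anchor_labels):
    """Give each point the label of its nearest already-clustered anchor."""
    P = np.asarray(points, dtype=float)
    A = np.asarray(anchors, dtype=float)
    d = ((P[:, None, :] - A[None, :, :]) ** 2).sum(axis=2)
    return np.asarray(anchor_labels)[d.argmin(axis=1)]

# Toy usage: two well-separated chains of points on a line.
pts = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [5.0, 0.0], [5.5, 0.0]]
chain_labels = threshold_single_linkage(pts, threshold=0.6)
extra_labels = assign_to_nearest([[0.9, 0.0], [4.0, 0.0]], pts, chain_labels)
```

The number of clusters is not fixed in advance in this sketch; it is determined by the threshold, so obtaining exactly two groups relies on the threshold separating the two regions.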
Define the following set
In the next section, we show that all the pixels in can be clustered correctly with probability going to . Thus, all pixels in can be clustered correctly with probability going to .
We first introduce two assumptions on the degree of non-stationarity of the fields.
For any in the same sub-region,
where is the distance between two pixels and
For any in different sub-regions, if , there exists a constant such that
Next, we show that the single-linkage algorithm described in the previous section is consistent.
Set threshold value , where . For , denote as the pixel bordering and above in , and denote as the pixel bordering and below in , then
By the union bound,
We bound the above probability in three steps. First, under Assumption 4, for ,