Some Theory for Texture Segmentation

10/31/2020 ∙ by Lin Zheng, et al.

In the context of texture segmentation in images, we provide some theoretical guarantees for the prototypical approach, which consists in extracting local features in the neighborhood of a pixel and then applying a clustering algorithm to group the pixels according to these features. On the one hand, for stationary textures, which we model with Gaussian Markov random fields, we construct the feature for each pixel by calculating the sample covariance matrix of its neighborhood patch, and cluster the pixels by applying k-means to group the covariance matrices. We show that this generic method is consistent. On the other hand, for non-stationary fields, we include the location of the pixel as an additional feature and apply single-linkage clustering. We again show that this generic and emblematic method is consistent. We complement our theory with some numerical experiments performed on both generated and natural textures.




1 Introduction

Texture segmentation fits within the larger area of image segmentation, with a particular focus on images that contain textures. The goal, then, is to partition the image, i.e., group the pixels, into differently textured regions. Texture segmentation, and image segmentation more generally, is an important task in computer vision and pattern recognition, being widely applied to areas such as scene understanding, remote sensing and autonomous driving

(Pal and Pal, 1993; Zhang, 2006; Reed and Dubuf, 1993; Liu et al., 2019).

At least in recent decades, texture segmentation methods have almost invariably been based on extracting local features around each pixel, such as SIFT (Lowe, 1999), which are then fed into a clustering algorithm, such as k-means. An emblematic approach in this context is that of Shi and Malik (2000)

, who used the pixel value as the feature, arguably the simplest possible choice, and applied a form of spectral clustering to group the pixels. The process is similar to what is done in the adjacent area of texture classification, the main difference being that a classification method is used in place of a clustering algorithm

(Varma, 2004; Randen and Husoy, 1999a).

Although this basic approach has remained essentially unchanged, the process of extracting features has undergone some important changes over the years, ranging from the use of sophisticated systems from applied harmonic analysis such as Gabor filters or wavelets (Dunn and Higgins, 1995; Grigorescu et al., 2002; Jain and Farrokhnia, 1991; Unser, 1995; Weldon and Higgins, 1996; Randen and Husoy, 1999b) to multi-resolution or multiscale aggregation approaches (Galun et al., 2003; Mao and Jain, 1992), among others (Malik et al., 2001; Hofmann et al., 1998)

, to the use of deep learning, particularly in the form of convolutional neural networks (CNNs), whose success is attributed to their capability to learn a hierarchical representation of the raw input data

(Long et al., 2015; Ronneberger et al., 2015; Milletari et al., 2016; Badrinarayanan et al., 2017). See (Humeau-Heurtier, 2019) for a recent survey.

While the vast majority of the work in texture segmentation, as in image processing at large, is applied, we contribute some theory by establishing the consistency of the basic approach described above. We do so in a stylized setting which is nonetheless a reasonable mathematical model for the problem of texture segmentation. Markov random fields (MRF) are common models for textures (Cross and Jain, 1983; Geman and Graffigne, 1986), and arguably the most popular in theoretical texture analysis (Rue and Held, 2005; Arias-Castro et al., 2018; Verzelen, 2010a, b; Verzelen and Villers, 2009)

. This is the model that we use. Although supplanted by the more recent feature extraction methods mentioned above, which in recent years are invariably nonparametric, Gaussian MRFs in particular remain the most commonly used parametric model for textures, and were used in the development of methodology not too long ago

(Chellappa and Chatterjee, 1985; Zhu et al., 1998; Manjunath and Chellappa, 1991; Paciorek and Schervish, 2006). When textures are modeled by stationary Gaussian MRFs, what characterizes them is the covariance structure; hence, in congruence with adopting Gaussian MRFs as models for textures, when stationarity is assumed the feature we extract is the (local) covariance. When textures are not assumed stationary, we also incorporate location as an additional feature, since the covariance structure may change within a textured region.

The basic approach calls for applying a clustering algorithm to the extracted features. Features are typically represented by (possibly high-dimensional) feature vectors, as is the case with the features that we work with, and thus a large number of clustering methods are applicable, some of them coming with theoretical guarantees, such as k-means

(Arthur and Vassilvitskii, 2007)

, Gaussian mixture models

(Dasgupta, 1999; Vempala and Wang, 2004; Hsu and Kakade, 2013)

, hierarchical clustering

(Dasgupta and Long, 2005; Dasgupta, 2010), including single-linkage clustering (Arias-Castro, 2011), and spectral clustering (Ng et al., 2002). In this paper, we use k-means in the context of stationary textures and single-linkage clustering in the context of non-stationary textures.

The paper is organized as follows. In Section 2, we consider stationary textures, extracting local second moment information on patches and applying k-means. In Section 3, we consider non-stationary textures, where we also include location as a feature and instead apply single-linkage clustering. In Section 4, we present the results of some numerical experiments, mostly there to illustrate the theory developed in the main part of the paper. Both synthetic and natural textures are considered.

2 Stationary Textures

In this section we consider textures to be stationary. The model we adopt and the method we implement are introduced in Sections 2.1 and 2.2, respectively. We then establish in Section 2.3 the consistency of a simple incarnation of the basic approach.

2.1 Model

We have a pixel image of size , which we assume is partitioned into two sub-regions and by a curve . One sub-region is a stationary Gaussian Markov random field with mean and autocovariance matrix ; the other is a stationary Gaussian Markov random field with mean and autocovariance matrix . From the image , we sample pixels at equal intervals, obtaining observations


To estimate the curve , we need to cluster the pixels into two groups.
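As an illustration of this model, the following sketch synthesizes a small image split by a vertical boundary into two stationary Gaussian textures with different autocovariance structures. This is only a stand-in for the paper's Gaussian MRF specification (whose parameters are not reproduced here); the image size, filter length, and straight boundary are illustrative assumptions.

```python
import numpy as np

def make_two_texture_image(n=64, seed=0):
    """Synthesize an n x n image split by a vertical boundary into two
    stationary Gaussian textures (an illustrative stand-in for the
    paper's GMRFs). Each texture is white noise smoothed along a
    different direction, giving the two regions different
    autocovariance structures."""
    rng = np.random.default_rng(seed)
    noise1 = rng.standard_normal((n, n + 4))
    # Texture 1: horizontal moving average -> correlation along rows.
    tex1 = sum(noise1[:, k:k + n] for k in range(5)) / np.sqrt(5)
    noise2 = rng.standard_normal((n + 4, n))
    # Texture 2: vertical moving average -> correlation along columns.
    tex2 = sum(noise2[k:k + n, :] for k in range(5)) / np.sqrt(5)
    left = np.arange(n)[None, :] < n // 2
    img = np.where(left, tex1, tex2)
    labels = (~left).astype(int) * np.ones((n, n), dtype=int)
    return img, labels
```

The clustering task is then to recover `labels` from `img` alone.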

2.2 Methods

We define scanning patches as follows. To simplify the presentation, assume that is the square of an integer (namely, for some integer m). For , pick the patch of size ,


Next, the autocovariance is defined based on the scanning patches. For and , define the true autocovariance and the sample autocovariance as follows


Denote the vectorizations of and by and , respectively. Here is the true feature of pixel and is the observed feature of pixel .
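A minimal sketch of this feature extraction step, assuming non-overlapping square patches and a small illustrative set of lags (the paper's exact lag set and patch geometry are elided in the text and not reproduced here):

```python
import numpy as np

def patch_autocovariance_features(img, patch=8,
                                  lags=((0, 0), (0, 1), (1, 0), (1, 1))):
    """For each non-overlapping patch x patch block of img, estimate the
    sample autocovariance at a few small lags and return the vectorized
    features, one row per patch, along with the patch centers."""
    n = img.shape[0]
    feats, centers = [], []
    for i in range(0, n - patch + 1, patch):
        for j in range(0, n - patch + 1, patch):
            block = img[i:i + patch, j:j + patch]
            block = block - block.mean()  # center the patch
            f = []
            for (di, dj) in lags:
                a = block[:patch - di, :patch - dj]
                b = block[di:, dj:]
                # Sample autocovariance at lag (di, dj).
                f.append((a * b).mean())
            feats.append(f)
            centers.append((i + patch // 2, j + patch // 2))
    return np.array(feats), np.array(centers)
```

Each row of `feats` plays the role of the observed feature of the corresponding pixel.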

Also based on the scanning patches, we define the following three sets


Here and are both stationary fields, so all elements in the set are the same, and we denote this common value by . Similarly, all elements in the set are the same, and we denote this common value by . Define the template autocovariance .

Then we introduce the membership matrix. Define the true membership matrix such that for ,


Also define the set of membership matrices as follows


Based on the above calculations and definitions, we define the k-means clustering estimate as


where is the row of matrix .

In practice, k-means cannot be solved exactly; however, there exist polynomial-time algorithms that obtain an approximation () satisfying the following equation (11), such as the -approximate method of Kumar et al. (2004).


where and . Thus we cluster the pixels into two groups via the estimated membership matrix . As a summary, we provide the procedure of the k-means algorithm in Algorithm 1.

0:  Observations , approximation error .
0:  Membership matrix estimation .
1:  For , pick up patch .
2:  Calculate sample autocovariance and obtain observed features .
3:  Define template autocovariance and the set of membership matrices .
4:  Obtain k-means approximation solution () which satisfies (11).
5:  return  .
Algorithm 1 Texture Segmentation with K-means Algorithm
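A plain-numpy sketch of the clustering step of Algorithm 1, using Lloyd's iterations with a deterministic farthest-point initialization in place of the approximate k-means solver of Kumar et al. (2004) cited in the text (the initialization and iteration count are illustrative assumptions):

```python
import numpy as np

def kmeans_two_groups(feats, iters=50):
    """Lloyd's iterations with k = 2 on the observed feature vectors.
    Returns a 0/1 label per row; the membership matrix of the text
    corresponds to the one-hot encoding np.eye(2)[labels]."""
    feats = np.asarray(feats, dtype=float)
    # Farthest-point initialization: first row, then the row farthest
    # from it (a simple deterministic stand-in for approximate seeding).
    c0 = feats[0]
    c1 = feats[np.argmax(((feats - c0) ** 2).sum(1))]
    centers = np.stack([c0, c1])
    for _ in range(iters):
        # Assign each feature to its nearest center.
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Recompute each center as the mean of its cluster.
        for k in range(2):
            if (labels == k).any():
                centers[k] = feats[labels == k].mean(0)
    return labels
```

On well-separated features (as the theory below requires), a few iterations already recover the two groups.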

Define the set of permutation matrices


then calculate


Next, define the set of mistakenly clustered elements as follows


then the clustering error rate is


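The error rate just defined, i.e., the fraction of mistakenly clustered elements minimized over relabelings of the estimated clusters (the permutation-matrix minimization above), can be computed with a small helper; it is written for general k although only k = 2 is needed here:

```python
import numpy as np
from itertools import permutations

def clustering_error_rate(est, truth, k=2):
    """Fraction of mistakenly clustered elements, minimized over the
    k! relabelings of the estimated clusters."""
    est, truth = np.asarray(est), np.asarray(truth)
    best = 1.0
    for perm in permutations(range(k)):
        relabeled = np.array(perm)[est]  # apply the permutation to labels
        best = min(best, float((relabeled != truth).mean()))
    return best
```

For example, an estimate that swaps the two group names entirely has error rate 0.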
2.3 Theory

First, we introduce the following assumptions.

Assumption 1.

Both and are wide-sense stationary Gaussian Markov random fields.

Assumption 2.

Let . For ,

Assumption 3.

Define for and for . With the same as in Assumption 2,


Next, before introducing the main result, we state the error bound of .

Lemma 2.1.

Under Assumption 1, for and , there exists a constant such that


First, let be the vectorization of ; then is a vector of length


Then for , there exists a matrix such that


where is a matrix with elements and , and the number of 1s is less than .

Since the field is stationary, suppose , where is non-negative. Let be the spectral decomposition of . Define






By the Hanson–Wright inequality of Rudelson and Vershynin (2013), for , there exist constants and such that


Next we focus on and . Since is an orthogonal matrix,




Then for , there exist constants , and such that


So when is small enough, we have


Next since


by the union bound, for ,


Lemma 2.2.

Let . Define and ; we have . Then all the elements in are clustered correctly.


On the one hand, for and , we argue by contradiction: if ,


which contradicts itself, so . On the other hand, suppose or . Again by contradiction, if , then has at least three distinct rows; however, according to the structure of , it has exactly two distinct rows, a contradiction. So . Thus, all the elements in are clustered correctly. ∎

Next, we state the theorem for the k-means clustering algorithm.

Theorem 2.3.

Under Assumptions 1, 2 and 3, consider the k-means clustering in Algorithm 1. For , there exists a constant such that, as ,


where is the clustering error rate. Here as .


By Algorithm 1 in Section 2.2, we have


where , . Without loss of generality, set , then


On the one hand, by the triangle inequality,


In addition, by Assumption 3 and (47),


On the other hand,


then we have


By Lemma 2.2,


then by Lemma 2.1 and (54), there exists a constant such that


So for , as ,


Thus, for stationary Gaussian random fields, we obtain the error bound for the k-means clustering algorithm. ∎

3 Non-stationary Textures

In this section we consider textures to be non-stationary. Here both and are non-stationary Gaussian Markov random fields with mean . We add location information and cluster the pixels into two groups with the single-linkage algorithm. The algorithm is described in Section 3.1. We then show the consistency of a simple incarnation of the basic approach in Section 3.2.

3.1 Method

We sample pixels at equal intervals from , where


Here is a subset of . Similar to (2) in Section 2.2, for , pick the patch as follows


For , clearly or , so there is no overlap between and .

Next, for , we add location information to its true feature . Denote the new true feature of pixel by , defined as follows


where is a vector of length . Similarly, denote the new observed feature of pixel by , defined as follows


where also is a vector of length .

We apply the single-linkage algorithm in the following steps. First, among , connect all pairs with , where . Next, for , assign all the other pixels in to the same cluster as . Then, for any pixel that is still not clustered, find the pixel in with the smallest distance to it, and assign it to the same cluster as that pixel. Finally, we obtain the clustering result.

As a summary, we provide the procedure of the single-linkage algorithm in Algorithm 2.

0:  Observations .
0:  Single-linkage clustering result.
1:  For , pick up patch .
2:  Calculate new observed features .
3:  Apply single-linkage algorithm based on .
4:  return  Single-linkage clustering result.
Algorithm 2 Texture Segmentation with Single-linkage Algorithm
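A self-contained sketch of the clustering step of Algorithm 2: connect all feature pairs within a threshold, take connected components, keep the two largest components as cluster cores, and assign every remaining point to the cluster of its nearest clustered point. The threshold value and the "two largest components" rule are illustrative assumptions standing in for the paper's tuned cut-off.

```python
import numpy as np

def single_linkage_two_groups(feats, threshold):
    """Single-linkage clustering into two groups via union-find:
    points within `threshold` of each other are merged, the two
    largest components become the clusters, and leftover points are
    attached to their nearest clustered neighbor."""
    n = len(feats)
    parent = list(range(n))
    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    # Pairwise distances between feature vectors.
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] <= threshold:
                parent[find(i)] = find(j)  # merge components
    roots = np.array([find(i) for i in range(n)])
    # Keep the two largest components as the cluster cores.
    uniq, counts = np.unique(roots, return_counts=True)
    cores = uniq[np.argsort(counts)[-2:]]
    labels = np.full(n, -1)
    for lab, r in enumerate(cores):
        labels[roots == r] = lab
    # Attach leftover points to the cluster of the nearest core point.
    for i in np.where(labels == -1)[0]:
        j = np.argmin(np.where(labels >= 0, d[i], np.inf))
        labels[i] = labels[j]
    return labels
```

In the paper's setting, `feats` would be the location-augmented observed features, so that both covariance and position drive the linkage.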

Define the following set




In the next section, we show that all the pixels in can be clustered correctly with probability tending to . Thus, all pixels in can be clustered correctly with probability tending to .

3.2 Theory

First, we introduce two assumptions on the degree of non-stationarity of the fields.

Assumption 4.

For any in the same sub-region,


where is the distance between the two pixels and .

Assumption 5.

For any in different sub-regions, if , there exists a constant such that


Next, we show that the single-linkage algorithm of the previous section is consistent.

Theorem 3.1.

Under Assumptions 4 and 5, consider the single-linkage clustering in Algorithm 2 with threshold value , where . Then as ,


Set the threshold value , where . For , denote by the pixel bordering and above in , and by the pixel bordering and below in ; then


By the union bound,


We bound the above probability in three steps. First, under Assumption 4, for ,