Robust Periocular Recognition By Fusing Sparse Representations of Color and Geometry Information

09/11/2013 ∙ by Juan C. Moreno, et al. ∙ University of Missouri

In this paper, we propose a re-weighted elastic net (REN) model for biometric recognition. The new model is applied to data separated into geometric and color spatial components. The geometric information is extracted using a fast cartoon - texture decomposition model based on a dual formulation of the total variation norm, allowing us to carry information about the overall geometry of images. Color components are defined using linear and nonlinear color spaces, namely the red-green-blue (RGB), chromaticity-brightness (CB) and hue-saturation-value (HSV) spaces. Next, according to a Bayesian fusion scheme, sparse representations for classification purposes are obtained. The scheme is numerically solved using a gradient projection (GP) algorithm. In the empirical validation of the proposed model, we have chosen the periocular region, which is an emerging trait known for its robustness against low quality data. Our results were obtained on the publicly available UBIRIS.v2 data set and show consistent improvements in recognition effectiveness when compared to related state-of-the-art techniques.


1 Introduction

Biometrics attempts to recognize human beings according to their physical or behavioral features [15]. In the past, various traits were used for biometric recognition, out of which iris and face are the most popular [32, 38, 17, 26]. Based on the pioneering work of Wright et al. [44], sparse representation theory is emerging as a popular method in the biometrics field and is considered especially suitable for handling degraded data acquired under uncontrolled acquisition protocols [31, 37].

1.1 Sparse Representation

Model selection in high-dimensional problems has been gaining interest in the statistical signal processing community [10, 4]. Using convex optimization models, the main problem is recovering a sparse solution of an underdetermined system of the form $y = Ax$, given a vector $y \in \mathbb{R}^n$ and a matrix $A \in \mathbb{R}^{n \times m}$. There is a special interest in signal recovery when the number of predictors is much larger than the number of observations ($m \gg n$). A direct solution to the problem is to select a signal whose measurements are equal to those of $y$, with smallest sparsity, by solving a minimization problem based on the $\ell_0$-norm:

$$\min_{x} \|x\|_0 \quad \text{subject to} \quad Ax = y, \tag{1}$$

where $\|x\|_0$ counts the number of nonzero entries of $x$, being a direct approach to seek the sparsest solution. Problem (1) is proved to be NP-hard and difficult to approximate, since it involves non-convex minimization [5]. An alternative method is to relax problem (1) by means of the $\ell_1$-norm ($\|x\|_1 = \sum_{j=1}^{m} |x_j|$). Hence problem (1) can be replaced by the following $\ell_1$-minimization problem:

$$\min_{x} \|x\|_1 \quad \text{subject to} \quad Ax = y,$$

which can be solved by standard linear programming methods [8]. In practice, signals are rarely exactly sparse and may often be corrupted by noise. Under noise, the new problem is to reconstruct a sparse signal $x$ from $y = Ax + \eta$, where $\eta$ is white Gaussian noise with zero mean and variance $\sigma^2$. In this case the associated $\ell_1$-minimization problem adopts the unconstrained form:

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1, \tag{2}$$

where $\lambda$ is a nonnegative parameter and $\|\cdot\|_2$ denotes the $\ell_2$-norm ($\|x\|_2^2 = \sum_{j=1}^{m} x_j^2$). The convex minimization problem (2) is known as the least absolute shrinkage and selection operator (LASSO) [40].
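As an illustration of problem (2), the following minimal sketch recovers a sparse signal with ISTA (iterative soft-thresholding), a standard proximal-gradient solver for the LASSO; it is not the solver used in this paper, and the problem sizes and parameter values are illustrative.

```python
# Minimal ISTA sketch for the LASSO problem (2):
#   min_x 0.5*||y - Ax||_2^2 + lam*||x||_1
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (component-wise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_lasso(A, y, lam, n_iter=500):
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)         # gradient of the smooth data term
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(1)
n, m = 50, 200                            # m >> n: underdetermined system
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[[5, 60, 150]] = [2.0, -1.5, 1.0]   # 3-sparse ground truth
y = A @ x_true + 0.01 * rng.standard_normal(n)
x_hat = ista_lasso(A, y, lam=0.02)
print(np.flatnonzero(np.abs(x_hat) > 0.5))  # support of the recovery
```

With a small regularization parameter and low noise, the recovered support coincides with that of the ground-truth signal, while the coefficient magnitudes are slightly shrunk, which is the LASSO bias discussed next.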

Although sparsity of representation seems to be well established by means of the LASSO approach, some limitations were remarked by Hastie et al. [47]. The LASSO model tends to select at most $n$ variables before it saturates, and in case predictors are highly correlated, LASSO usually selects one variable from a group, ignoring the others. In order to overcome these difficulties, Hastie et al. [47] proposed the elastic net (EN) model as a new regularization technique outperforming LASSO in terms of prediction accuracy. The elastic net is characterized by the presence of a ridge regression term ($\ell_2$-norm) and is defined by the following convex minimization problem:

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda_1 \|x\|_1 + \lambda_2 \|x\|_2^2, \tag{3}$$

where $\lambda_1$ and $\lambda_2$ are non-negative parameters. An improvement of the EN model was proposed in [48], where a combination of the $\ell_2$-penalty and an adaptive version of the $\ell_1$-norm is implemented by considering the minimization problem

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda_1 \sum_{j=1}^{m} w_j |x_j| + \lambda_2 \|x\|_2^2, \tag{4}$$

where the adaptive weights $w_j$ are computed using a solution given by the EN minimization problem (3). If we let the solution of EN be $\hat{x}^{\mathrm{EN}}$, then the weights are given by the equation $w_j = |\hat{x}_j^{\mathrm{EN}}|^{-\gamma}$, where $\gamma$ is a positive constant. A variant of the above model was proposed in [14] by incorporating the adaptive weight matrix in the $\ell_2$-penalty term:

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda_1 \sum_{j=1}^{m} w_j |x_j| + \lambda_2 \sum_{j=1}^{m} w_j x_j^2. \tag{5}$$

In this paper we use a re-weighted elastic net regularization model for a periocular recognition application.

1.2 Summary of Contributions

The main contribution of this paper is to propose a re-weighted elastic net (REN) regularization model that enhances the sparsity of the solutions found. The proposed REN model is a regularization and variable selection method that enjoys sparsity of representation, particularly when the number of predictors is much larger than the number of observations. The weights are computed such that larger weights encourage small coordinates by means of the $\ell_1$-norm, and smaller weights encourage large coordinates due to the $\ell_2$-norm. Our model differs from the schemes in [48] and [14] (see equations (4) and (5) above), since the $\ell_1$ and $\ell_2$ terms are automatically balanced by weights which are continuously updated, in the spirit of the iterative re-weighting of [7], with a positive stability parameter. We also provide a concise proof of the existence of a solution for the proposed model, as well as of its accuracy property. A complete presentation of the numerical implementation of the REN model using a gradient projection (GP) method [12], seeking sparse representations along certain gradient directions, is given in this paper using a reformulation of the REN model as a quadratic programming (QP) problem.

As the main application of our model, we consider the periocular recognition problem. The periocular region has been regarded as a trade-off between using the entire face or only the iris in biometrics, and is particularly suitable for recognition under visible-wavelength light and uncontrolled acquisition conditions [28, 42, 27]. We enhance periocular recognition through the sparsity-seeking property of our REN model over different periocular sectors, which are then fused according to a Bayesian decision-based scheme. The main idea is to benefit from the information in each sector, each of which should contribute to overall recognition robustness. Two different domains are considered for this purpose: (1) geometry and (2) color. Full geometry information is accessed by decomposing a given image into its cartoon - texture components by means of a dual formulation of the weighted total variation (TV) scheme [34]. For color, a key contribution is the use of nonlinear features such as chromaticity and hue components, which are thought to improve image geometry information according to human perception [20]. Our methodology is inspired by two related works: 1) Wright et al. [44], which introduced the concept of sparse representation for classification (SRC) purposes; and 2) Pillai et al. [31], which used an SRC model for disjoint sectors of the iris and fused results at the score level, according to a confidence score estimated from each sector. Our experiments are carried out on periocular images of the UBIRIS.v2 data set [33]: images were acquired at visible wavelengths, from 4 to 8 meters away from the subjects, under uncontrolled acquisition conditions. Varying gazes, poses and amounts of occlusion (due to glasses and reflections) are evident in this data set and make the recognition task harder; see Figure 1. The results obtained using our model show consistent increases in performance when compared to the classical SRC model and other important approaches (e.g., Wright et al. [44] and Pillai et al. [31]). Also, it should be stressed that such increases in performance were obtained without a significant overload in the computational burden of the recognition process.

Figure 1: Examples of periocular images of different subjects and varying gazes, containing the corneal, eyebrows and skin regions.

The remainder of the paper is organized as follows. Section 2 summarizes the work most relevant to the scope of this paper concerning penalized feature selection for sparse representation. The re-weighted elastic net (REN) model is introduced together with a statistical motivation ensuring high prediction rates, and an algorithm based on gradient projection (GP) for the REN model is also introduced. Section 3 describes the different geometrical information extracted from periocular images for performing recognition, based on cartoon - texture and chromaticity features in a total variation framework. Section 4 describes the experimental validation procedure carried out, together with comparisons. Finally, Section 5 concludes the paper.

2 The Re-weighted Elastic Net Model for Classification

2.1 The LASSO Model for Recognition

We first briefly describe the sparse representation based classification (SRC) framework, which is a precursor to our REN based approach. Given a set of labeled training samples ($n_i$ samples from the $i$-th subject), they are arranged as columns of a matrix $A_i = [a_{i,1}, \ldots, a_{i,n_i}]$. A dictionary $A$ results from the concatenation of all samples of all $k$ classes:

$$A = [A_1, A_2, \ldots, A_k].$$

The key insight is that any probe $y$ can be expressed as a linear combination of the elements of $A$. As the data acquisition process often induces noisy samples, it turns out to be practical to make use of the LASSO model. In this case it is assumed that the observation model has the form $y = Ax + \eta$.

Classification is based on the observation that the high-valued coefficients in the solution $\hat{x}$ are associated with the columns of $A$ of a single class, corresponding to the identity of the probe. A residual score per class is defined as $r_i(y) = \|y - A\,\delta_i(\hat{x})\|_2$, where $\delta_i$ is an indicator function that sets the values of all coefficients to 0, except those associated to the $i$-th class. Over this setting, the probe is then reconstructed by $\hat{y}_i = A\,\delta_i(\hat{x})$, and the minimal reconstruction error between $y$ and $\hat{y}_i$ is deemed to correspond to the identity of the probe:

$$\operatorname{identity}(y) = \arg\min_{i}\; r_i(y).$$

In [44] a sparsity concentration index (SCI) is used to accept/reject the response given by the LASSO model. The SCI of a coefficient vector $\hat{x}$ corresponds to:

$$\mathrm{SCI}(\hat{x}) = \frac{k \cdot \max_{i} \|\delta_i(\hat{x})\|_1 / \|\hat{x}\|_1 - 1}{k - 1} \in [0, 1].$$

If $\mathrm{SCI}(\hat{x}) \geq \tau$, for a chosen threshold $\tau$, the computed signal is considered to be acceptably represented by samples from a single class. Otherwise, the sparse coefficients spread evenly across all classes and a reliable identity for that probe cannot be given.
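The accept/reject rule above can be sketched in a few lines; the SCI formula is the standard one from Wright et al. [44], while the dictionary sizes and coefficient values below are purely illustrative.

```python
# Sketch of the sparsity concentration index (SCI) of Wright et al. [44]:
#   SCI(x) = (k * max_i ||delta_i(x)||_1 / ||x||_1 - 1) / (k - 1)
import numpy as np

def sci(x, labels, k):
    """SCI of coefficient vector x; labels[j] is the class of column j."""
    per_class = np.array([np.abs(x[labels == i]).sum() for i in range(k)])
    return (k * per_class.max() / np.abs(x).sum() - 1.0) / (k - 1.0)

labels = np.array([0, 0, 1, 1, 2, 2])                      # 3 classes, 2 atoms each
x_concentrated = np.array([0.9, 0.8, 0.0, 0.05, 0.0, 0.0])  # mass on class 0
x_spread = np.array([0.3, 0.3, 0.3, 0.3, 0.3, 0.3])         # evenly spread
print(sci(x_concentrated, labels, k=3))   # near 1: accept the decision
print(sci(x_spread, labels, k=3))         # near 0: reject as unreliable
```

A coefficient vector concentrated on one class scores close to 1, while a perfectly spread one scores 0, matching the acceptance rule described above.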

The recognition model proposed by Pillai et al. [31] obtains separate sparse representations from disjoint regions of an image and fuses them by considering a quality index from each region. Let $k$ be the number of classes, with labels $\{1, \ldots, k\}$. A probe is divided into sectors, each one described by the SRC algorithm. SCI values are obtained over each sector, allowing us to reject those with quality below a threshold. Let $c_1, \ldots, c_S$ represent the class labels of the $S$ retained sectors, and let $P(c_j \mid c)$ be the probability that the $j$-th sector returns a label $c_j$ when the true class is $c$:

$$P(c_j \mid c) = \begin{cases} p, & c_j = c, \\ q, & c_j \neq c, \end{cases}$$

with $p$ and $q$ constants such that $p + (k - 1)q = 1$. According to a maximum a posteriori (MAP) estimate of the class label, the response corresponds to the class having the highest accumulated SCI:
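The fusion rule just described can be sketched as follows; the function name, the quality threshold and the per-sector values are hypothetical, and the sketch only illustrates the "highest accumulated SCI over retained sectors" decision.

```python
# Illustrative sketch of sector-level fusion: each sector whose SCI
# exceeds a quality threshold votes for its class with weight equal to
# its SCI, and the response is the class with the highest total score.
import numpy as np

def fuse_sectors(sector_labels, sector_scis, k, tau=0.2):
    """MAP-style response: class with the highest accumulated SCI."""
    scores = np.zeros(k)
    for label, s in zip(sector_labels, sector_scis):
        if s >= tau:                 # reject low-quality sectors
            scores[label] += s
    return int(np.argmax(scores))

sector_labels = [2, 2, 0, 2, 1]           # per-sector class decisions
sector_scis = [0.6, 0.5, 0.9, 0.4, 0.1]   # per-sector confidence (SCI)
print(fuse_sectors(sector_labels, sector_scis, k=3))
```

Here class 2 wins with an accumulated score of 1.5, even though one individual sector (with SCI 0.9) voted for class 0, showing how fusion favors agreement across several moderately confident sectors.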

2.2 The Re-weighted Elastic Net (REN) Method

The proposed REN model is a sparsity-of-representation approach that balances the LASSO shrinkage term ($\ell_1$-norm) and the strength of the quadratic regularization ($\ell_2$-norm) of the coefficients by the following minimization problem:

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda \sum_{j=1}^{m} \left[ w_j |x_j| + (1 - w_j)\, x_j^2 \right], \tag{6}$$

where the $w_j$ are positive weights taking values in $[0, 1]$. The REN-penalty is strictly convex and it is a compromise between the ridge regression penalty and the LASSO. The convex combination in the REN-penalty term is natural in the sense that both the $\ell_1$ and $\ell_2$ norms are balanced by weights controlling the amount of sparsity versus smoothness expected from the minimization scheme. As in [7], the weights are chosen such that they are inversely related to the computed signal, according to the equation $w_j = \varepsilon / (|x_j| + \varepsilon)$, with $\varepsilon$ a positive parameter. Under this setting, large weights encourage small coordinates with respect to the REN-penalty term, whereas small weights allow large coordinates, respectively. It is then seen that the new model simultaneously combines a continuous shrinkage and an automatic variable selection approach. We next consider the existence of a solution and the sign recovery property of the REN model.

2.3 Existence of Solution

We state necessary and sufficient conditions for the existence of a solution for the proposed model (6). We follow the notation used in [41, 16]. In terms of the $\ell_1$ and $\ell_2$ norms, we rewrite the minimization problem in (6) as

$$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda \left( \|Wx\|_1 + \|(I - W)^{1/2} x\|_2^2 \right), \tag{7}$$

where $W = \operatorname{diag}(w_1, \ldots, w_m)$.

Let us denote by $x^*$ and $\hat{x}$ the real and the estimated solutions of (7), respectively. We define the block-wise form matrix

where each block is a matrix formed by concatenating the corresponding columns, and the leading block is assumed to be invertible.

First we assume that there exists $\hat{x}$ satisfying (7). Let us define the set,

From the Karush-Kuhn-Tucker (KKT) conditions we obtain

which can be rewritten as,

(8)

for some multiplier, after substituting the equality. From the above Eqn. (8) the following two equations arise:

(9)
(10)

Solving the unknown in (9) and replacing it in (10) leaves us with

(11)
(12)

From (11) and (12), we finally get the next two equations:

(13)

and

(14)

for .

Now, let us assume that equations (13) and (14) both hold. It will be proved that there exists a solution satisfying the optimality conditions. We first set a candidate satisfying

which guarantees the required equality due to (13). In the same manner, we define a second candidate satisfying

implying from (14) the corresponding inequality, and therefore the desired bound. From the above, we have found points satisfying (9) and (10) respectively, or equivalently (8). Moreover, we also have the required equality. Under these assertions we can prove the sign recovery property of our model, as illustrated next.

2.4 Sign Recovery Property

Under some regularity conditions on the proposed REN model, we intend to estimate the probability with which the event $\operatorname{sign}(\hat{x}) = \operatorname{sign}(x^*)$ holds. Following notation similar to [48, 46], we intend to prove that our model enjoys the following probabilistic property:

(15)

For theoretical analysis purposes, the problem (6) is written as

The following regularity conditions are also assumed:

  • Denoting by $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ the minimum and maximum eigenvalues of a symmetric matrix, we assume the following inequalities hold:

    where the two bounding constants are positive.

  • for some

  • .

Let

(16)

By using the above definitions, the next two inequalities arise

(17)

and

(18)

The combination of equations (17) and (18) gives

(19)

On the other hand

(20)

By combining equations (19) and (20) we get

which together with the identity

allows us to prove

(21)

Let us notice that

(22)

From equations (21) and (22) we conclude that

(23)

Let and . Because of (21),

Then

(24)

Now, we notice that

Since

and as long as , it follows that

(25)

By using (23), we derive

(26)

Substituting (25) and (26) in (24) allows us to conclude that

Then (15) holds.

Remark 1.

There is special interest in applying the REN model when the data satisfies $m \gg n$. For the LASSO model, it was suggested in [6] to make use of the Dantzig selector, which can achieve the ideal estimation up to a logarithmic factor. In [11], a screening procedure called Sure Independence Screening (SIS) was introduced in order to reduce the ultra-high dimensionality. We remark that the SIS technique can be combined with the REN model (6) to deal with the case $m \gg n$; the previous computations can then still be applied to reach the sign recovery property.
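A minimal sketch of correlation screening in the spirit of SIS [11] is given below: the $d$ predictors most correlated with the response are retained before running a sparse solver. The function name and problem sizes are hypothetical; this is a standard reduction, not the paper's exact procedure.

```python
# Correlation screening sketch (SIS-style): rank columns of A by the
# magnitude of their correlation with y, and keep the top d of them.
import numpy as np

def sis_screen(A, y, d):
    """Return the sorted indices of the d columns with largest |A_j^T y|."""
    scores = np.abs(A.T @ y)
    return np.sort(np.argsort(scores)[-d:])

rng = np.random.default_rng(4)
n, m = 30, 500                        # ultra-high dimension: m >> n
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[[12, 300]] = [3.0, -3.0]       # two strong true predictors
y = A @ x_true
kept = sis_screen(A, y, d=25)         # 500 -> 25 candidate predictors
print(12 in kept and 300 in kept)
```

After screening, the REN model only needs to be solved over the retained columns, which restores the regime where the earlier analysis applies.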

Next we describe an algorithm for the REN model allowing us to deal directly with this case. It turns out that our REN model can be expressed as a quadratic program (QP), thus allowing us to apply a gradient projection approach to perform the sparse reconstruction.

Figure 2: Sparse signal reconstruction with the REN, adaptive EN and LASSO models. (a) Original sparse signal. (b)-(e) Response signals computed with the proposed re-weighted elastic net, [48], [14] and LASSO, respectively.

2.5 Numerical Implementation

The algorithm alternates between computing the signal and redefining the weights, as follows:

  1. Choose initial weights $w_j$, $j = 1, \ldots, m$.

  2. Find the solution $x$ of the problem

    $$\min_{x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda \sum_{j=1}^{m} \left[ w_j |x_j| + (1 - w_j)\, x_j^2 \right]. \tag{27}$$

  3. Update the weights: for each $j$,

    $$w_j = \frac{\varepsilon}{|x_j| + \varepsilon},$$

    where $\varepsilon$ is a positive stability parameter.

  4. Terminate on convergence or when a specific number of iterations is reached. Otherwise, go to step 2.
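The alternating scheme above can be sketched as follows. Two assumptions are made explicit in the comments: the penalty takes the convex-combination form $\lambda \sum_j [w_j|x_j| + (1-w_j)x_j^2]$ and the weights are updated as $w_j = \varepsilon/(|x_j|+\varepsilon)$, both reconstructed from the text; moreover, step 2 is solved here with a simple proximal-gradient inner loop rather than the paper's gradient projection solver.

```python
# Sketch of the re-weighted elastic net loop (assumed penalty form
# lam * sum_j [w_j|x_j| + (1 - w_j) x_j^2], assumed weight update
# w_j = eps / (|x_j| + eps); inner solver is plain proximal gradient).
import numpy as np

def ren_solve(A, y, lam=0.02, eps=0.1, outer=5, inner=300):
    m = A.shape[1]
    L = np.linalg.norm(A, 2) ** 2
    w = np.full(m, 0.5)                        # step 1: initial weights
    x = np.zeros(m)
    for _ in range(outer):
        for _ in range(inner):                 # step 2: weighted EN solve
            z = x - A.T @ (A @ x - y) / L      # gradient step
            x = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
            x /= 1.0 + 2.0 * lam * (1.0 - w) / L   # weighted ridge shrink
        w = eps / (np.abs(x) + eps)            # step 3: re-weight
    return x

rng = np.random.default_rng(2)
n, m = 50, 200
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[[10, 80, 140]] = [2.0, -1.5, 1.0]
y = A @ x_true + 0.01 * rng.standard_normal(n)
x_hat = ren_solve(A, y)
print(np.flatnonzero(np.abs(x_hat) > 0.5))
```

The per-coordinate update in step 2 is the exact proximal operator of the assumed penalty: a soft-threshold with level $\lambda w_j / L$ followed by a multiplicative ridge shrinkage, so large coordinates (small $w_j$) are barely thresholded while small ones (large $w_j$) are driven to zero.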

Note that our REN problem in (27) can also be expressed as a quadratic program [13], by splitting the variable into its positive and negative parts. That is, $x = u - v$ with $u, v \geq 0$, where $u$ and $v$ are the vectors that collect the positive and negative coefficients of $x$, respectively. Then, we handle the minimization problem

$$\min_{z}\; F(z) = \frac{1}{2} z^{\top} B z + c^{\top} z \quad \text{subject to} \quad z \geq 0, \tag{28}$$

where $z = \begin{bmatrix} u \\ v \end{bmatrix}$, and the matrix $B$ and vector $c$ are assembled from $A$, $y$, the weights $w_j$ and the regularization parameter.

The minimization problem (28) can then be solved using the Barzilai-Borwein gradient projection algorithm [36]. Under this approach, the iterate is given by the projected gradient step

$$z^{(k+1)} = \max\!\left( z^{(k)} - \alpha_k \nabla F(z^{(k)}),\; 0 \right),$$

where $\alpha_k$ is the step size computed as

$$\alpha_k = \operatorname{mid}\!\left( \alpha_{\min},\; \frac{\|\delta^{(k)}\|_2^2}{(\delta^{(k)})^{\top} B\, \delta^{(k)}},\; \alpha_{\max} \right), \quad \text{with} \quad \delta^{(k)} = z^{(k)} - z^{(k-1)}.$$

The operator $\operatorname{mid}(\cdot, \cdot, \cdot)$ is defined as the middle value of its three scalar arguments, and $\alpha_{\min}$ and $\alpha_{\max}$ are two given parameters.
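A minimal sketch of this solver is given below, shown on the LASSO special case of the QP, where $B$ and $c$ take the well-known forms $B = \begin{bmatrix} A^\top A & -A^\top A \\ -A^\top A & A^\top A \end{bmatrix}$ and $c = \lambda \mathbf{1} + \begin{bmatrix} -A^\top y \\ A^\top y \end{bmatrix}$ from the gradient projection literature [36]; the REN version would only change how $B$ and $c$ are assembled. Names and parameter values are illustrative.

```python
# Barzilai-Borwein gradient projection sketch for the nonnegative QP
#   min_z 0.5 z^T B z + c^T z,  z >= 0,
# with the BB step clipped to [a_min, a_max] via the mid(.,.,.) rule.
import numpy as np

def gp_bb(B, c, n_iter=400, a_min=1e-8, a_max=1e8):
    z = np.zeros(len(c))
    alpha = 1.0
    for _ in range(n_iter):
        grad = B @ z + c
        z_new = np.maximum(z - alpha * grad, 0.0)   # projected step
        dz = z_new - z
        denom = dz @ (B @ dz)
        # BB step: mid(a_min, ||dz||^2 / (dz^T B dz), a_max)
        alpha = np.clip((dz @ dz) / denom, a_min, a_max) if denom > 0 else a_max
        z = z_new
    return z

rng = np.random.default_rng(3)
n, m = 40, 100
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[[7, 55]] = [1.5, -2.0]
y = A @ x_true
lam = 0.05
AtA, Aty = A.T @ A, A.T @ y
B = np.block([[AtA, -AtA], [-AtA, AtA]])            # LASSO QP data
c = lam * np.ones(2 * m) + np.concatenate([-Aty, Aty])
z = gp_bb(B, c)
x_hat = z[:m] - z[m:]                               # recover x = u - v
print(np.flatnonzero(np.abs(x_hat) > 0.5))
```

Note that the matrix $B$ is only positive semi-definite, so the BB ratio is guarded against a zero denominator; in practice the matrix-vector products with $B$ are computed via $A$ and $A^\top$ rather than forming $B$ explicitly.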

The performance of the REN minimization, along with comparisons, is shown in Figure 2 for a sparse signal. We want to reconstruct a sparse signal (in the canonical basis) from a much smaller number of noisy observations. The matrix $A$ is built with independent samples of a standard Gaussian distribution, followed by ortho-normalization of its rows, while the original signal contains 160 randomly placed spikes and the observation is defined as $y = Ax + \eta$, with $\eta$ Gaussian noise. The reconstruction of the original signal via the REN minimization problem produces a much lower mean squared error (MSE $= \|\hat{x} - x\|_2^2 / m$, with $\hat{x}$ an estimate of $x$) than the adaptive elastic net models proposed in [14] and [48] and LASSO. Therefore, the proposed REN approach does an excellent job at locating the spikes.

Figure 3: Cartoon - texture components for grayscale periocular images using the weighted TV model of Section 3.1. (a) Grayscale periocular images. (b)-(c) Cartoon - texture decomposition with 80 iterations. (d)-(e) Cartoon - texture decomposition with 400 iterations.

3 Geometric and Color Spaces for Image Decomposition

3.1 Cartoon + Texture (CT) Space

Periocular images contain cartoon (smooth) parts and texture parts (small-scale oscillations), which can be obtained effectively using the total variation (TV) model [34]. In this setting, the grayscale version of a periocular image is divided into two components representing the geometrical and texture parts. The TV based decomposition model is defined as an energy minimization problem,

$$\min_{u} \int_{\Omega} g\,|\nabla u| \, dx + \frac{\lambda}{2} \int_{\Omega} (f - u)^2 \, dx,$$

where $f$ is the input grayscale image and $g$ is an edge indicator type function. Following [3], we use a splitting with an auxiliary variable $v$ to obtain the following relaxed minimization,

$$\min_{u, v} \int_{\Omega} g\,|\nabla u| \, dx + \frac{1}{2\theta} \int_{\Omega} (u - v)^2 \, dx + \frac{\lambda}{2} \int_{\Omega} (f - v)^2 \, dx, \tag{29}$$

where $\theta$ is a small positive parameter. After a solution is computed, it is expected to get the representation $f = u + w$, where the function $u$ represents the geometric cartoon part, the function $w = f - u$ contains texture information, and the function $g$ represents edges. The minimization (29) is achieved by solving the following alternating sub-problems based on the dual minimization technique: