With the development of information technology, the number of images in people’s daily lives is increasing rapidly. In computer vision, the extraction of image descriptors has been one of the most important tasks. Shape and color are two basic types of information in human visual cognition and play very important roles in image analysis and understanding. Shape features reflect the position, size, and shape of an object, while color features reflect the spectral reflectance attributes of the object’s surface, as shown in Fig.1.
The real imaging environment is intricate: it is influenced by scene illumination, camera sensors, and the reflective characteristics of the objects. Images captured from real scenes are always degraded, hence their color and shape are not consistent. That is, the colors of the images differ, and geometric deformations such as scaling, rotation, and skewing occur. Various approaches have been proposed to recognize images of the same objects under such geometric and photometric deformations. An effective way is to extract invariant features.
The 2-D geometric moment invariants were first proposed by Hu in 1962 for character recognition [hu1962visual]; they are invariant to similarity transformations. The seven invariants can describe some features of shape and have played important roles in pattern recognition. In [flusser1993pattern], Suk and Flusser extended the moment invariants from similarity transformations to affine transformations. The main advantage of invariant features is their invariance under given transformations, so there is no need to model the corresponding deformations in the imaging process. Hence, it was even argued that object recognition is the search for invariants [weiss1993geometric].
Color information is useful for object recognition. Many approaches to extracting color information are based on color histograms [healey1994using; funt1995color] and color moments [stricker1995similarity]. They make full use of color information but lack color constancy, so they are sensitive to illumination changes. Some other color constant descriptors in [li2009illumination; gevers1999color] can deal with illumination degradation. Gong et al. [gong2013moment] proposed a kind of color affine moment invariants which are applicable to more complicated color variations and robust to shape deformation to a certain extent. The limitation of these color descriptors is that they do not exploit any spatial information of the object, so vital information is lost.
Much effort has been devoted to invariant features that can deal with both shape and color deformations. In common practice, shape and color descriptors are extracted independently, as shown in Fig.2. Consequently, they are not very robust in real conditions. It is difficult to combine the two types of information effectively in one descriptor. The conventional method is a simple linear combination of the two factors, but the manual selection of weights is a complex issue. Wang et al. proposed a kind of modified Hu invariant moments [wang2008moment], which contain both color information and shape information. However, this method is only applicable to gray-level degradation. Mindru et al. proposed a kind of generalized color moment invariants [mindru2004moment], extending the moment invariants obtained by the Lie group methods detailed in [moons1995foundations; van1995vision], which consider both shape and color deformations. The generalized color moment invariants do not change under geometric affine and photometric diagonal transformations, and they perform well on the experimental datasets. For the diagonal transformation of a color image with 3 bands (R,G,B), the transformation parameters of the three color bands can be separated easily, hence the invariants can be constructed from an arbitrary combination of color bands. However, for the affine transformation the parameters cannot be separated by color band, so this method cannot be extended to the color affine transformation.
Many works apply invariants to shape contours to extract shape invariant features [mundy1992goemetric; mundy1994applications]. The limitation of this kind of method is that the object contours must be extracted robustly, which is difficult to achieve in real scenes. Alferez and Wang proposed a method considering both geometric and illumination deformations [alterez1999geometric]. These invariants are based on sampled curves extracted from the image, which can be the contours of imaged objects or some characteristic curves. This method performs well on the experimental dataset. The limitation is that the objects’ contours must be properly extracted in advance, and the weighted average between geometry and illumination invariants also needs manual intervention [alterez1999geometric].
This paper studies Shape-Color Affine Moment Invariants (SCAMI). The construction method is based on a multiple integration framework, which can be extended easily by using various integral kernels. The integral kernel is defined as the product of the shape and color invariant cores, which are built from geometric invariant primitives.
The main advantage of SCAMI is that they naturally and intrinsically unify the shape and color factors; this is the first time that an invariant to shape-color dual-affine transformations has been derived directly. Furthermore, the manual selection of weights is no longer necessary. In addition, they can be extended easily to higher orders and dimensions.
The paper is organized as follows. In Section 2, the background for constructing SCAMI is presented, including the geometric model and the illumination model. In Section 3, the invariant construction framework is presented and the SCAMI are derived. Several instances are listed in Section 4 and their independence is tested. In Section 5, the experimental results and conclusion are presented.
The Shape-Color Affine Moment Invariants (SCAMI) are invariant under the shape affine transformation and the color affine transformation. In the real imaging process, the imaging system and conditions are uncertain: they are influenced by scene illumination, camera sensors, and many other factors. Images captured from real scenes are always degraded, hence their color and shape are not consistent. In other words, complex imaging conditions and different imaging systems introduce color and shape degradation into the image. In order to deal with these degradations, transformation models should be established.
2.1 Shape Transformation
Similar to the imaging mechanism of the human eye, the camera model is a projective transformation from 3D to 2D. When a camera takes images of the same object from different viewpoints, the images are related by a projective transformation.
Fortunately, when the distance between the camera and the object is large enough, the projective transformation can be well approximated by an affine transformation, as shown in Fig.3.
The transformation from to is projective transformation. And the similar transformation from to is affine transformation. Therefore, we take affine geometric transformation into consideration. Assume spatial coordinates in an image corresponds to coordinates in another, the expression of the affine transformation can be written as:
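As a minimal sketch of such a mapping, the snippet below assumes the usual form of a 2-D affine transformation, x' = A x + t with a non-singular 2x2 matrix A and an offset t. The function name `affine_2d` and the numeric values are illustrative assumptions, not taken from the paper.

```python
def affine_2d(point, a, t):
    """Map (x, y) to A * (x, y) + t for a 2x2 matrix A and an offset t."""
    x, y = point
    return (a[0][0] * x + a[0][1] * y + t[0],
            a[1][0] * x + a[1][1] * y + t[1])


# Hypothetical parameters: horizontal scaling plus skew, det(A) = 2.
A = [[2.0, 1.0],
     [0.0, 1.0]]
T = (3.0, -1.0)

# Corners of the unit square, listed counter-clockwise.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
warped = [affine_2d(p, A, T) for p in square]

# Opposite edges stay parallel: an affine map sends a square to a parallelogram.
bottom_edge = (warped[1][0] - warped[0][0], warped[1][1] - warped[0][1])
top_edge = (warped[2][0] - warped[3][0], warped[2][1] - warped[3][1])
print(warped, bottom_edge, top_edge)
```

The preserved parallelism is what distinguishes the affine approximation from a general projective map, under which parallel lines may converge.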
2.2 Color Transformation
The illumination model describes the interaction of light with a surface. In real scenes this interaction is too complex to describe exactly, so illumination models range from simple to complex.
The Kubelka-Munk theory is commonly used as the model to describe the imaging process of a color image. It assumes that the propagation of light within the material is isotropic, so that the properties of the material surface can be described by a scattering coefficient and an absorption coefficient. This theory can be described as [finlayson2005convex]:
represents a position of the imaging plane;
is the wavelength;
is the distribution of light spectrum;
is the spectral reflectance in the position x;
is the body reflectance of the object;
is the spectrum distribution in the viewing direction, which can be interpreted as color distribution of the image.
When the optical resolution is high enough, it can be described in a simplified form. Some symbols can be defined as:
Then the equation can be simplified as:
This model is the well-known Dichromatic Reflection Model (DRM), which was proposed by Shafer in [shafer1985using]. The model is also commonly used in computer vision and is presented in Fig.4. The reflected light can be split into two parts: one reflected by the surface and the other by the body. In the equation, subscript represents the surface’s part and b the body’s part.
In [finlayson2005convex], by adding some constraints to the equation, Finlayson proposed that the transformation between images of the same scene under two different illuminations can be described with the diagonal-offset model, which can be expressed as:
Where is a 3D diagonal matrix and , which maps the image color captured under illumination to that under another . represents the offset of each channels. In the color space, this model can be expressed as:
Although the diagonal-offset model performs well in describing indoor scenes, more complicated models are still needed for more complex scenes. In [geusebroek2001color], Geusebroek et al. proved that, statistically, the affine transformation model is the best linear model for describing outdoor scenes, which can be described as:
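The two color models differ only in the matrix applied to the (R, G, B) vector: a diagonal matrix scales each channel independently, while a full 3x3 matrix also mixes channels. The sketch below illustrates this under the assumption that both models have the form c' = M c + o. The function names and all numeric values are hypothetical.

```python
def diagonal_offset(rgb, d, o):
    """Diagonal-offset model: each channel is scaled and shifted independently."""
    return tuple(d[k] * rgb[k] + o[k] for k in range(3))


def color_affine(rgb, m, o):
    """Color affine model: a full 3x3 matrix mixes the channels before the offset."""
    return tuple(sum(m[k][j] * rgb[j] for j in range(3)) + o[k] for k in range(3))


pixel = (100.0, 50.0, 20.0)
offset = (10.0, 0.0, -5.0)

# Hypothetical per-channel gains, and the same gains written as a diagonal matrix.
gains = (1.5, 1.0, 0.5)
m_diag = [[1.5, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 0.5]]

# Hypothetical full matrix with cross-channel terms.
m_full = [[1.0, 0.5, 0.0],
          [0.25, 1.0, 0.25],
          [0.0, 0.5, 1.0]]

print(diagonal_offset(pixel, gains, offset))
print(color_affine(pixel, m_diag, offset))   # same result as the line above
print(color_affine(pixel, m_full, offset))   # channels now influence each other
```

The diagonal-offset model is therefore the special case of the affine model in which all off-diagonal entries are zero.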
3 The Construction Framework of SCAMI
The general method to construct SCAMI is introduced in this section. For convenience, some basic terms used in the method are introduced first. The general idea of the construction framework is to create invariant geometric primitives that serve as the integral kernel, or invariant core. Within the multiple integration framework, the SCAMI can then be obtained theoretically. The key point of this method is the selection of the geometric invariant core.
3.1 Definition of Shape-Color Moment
For a piecewise continuous image function , the geometric moment of order can be defined as:
where represents the gray-level density. For color image , the shape-color moment, which is called the generalized color moment in [mindru2002model], can be defined as follows:
In the shape-color moment form, the geometric center of the image can be expressed as:
If the density function of each color channel is piecewise continuous, the moments of all orders exist. are the mean values of each color channel, which can be calculated with equation (11).
Then the central shape-color moment can be defined as follows:
The advantage of the central moment over the general moment is its invariance to translation. Consequently, invariants constructed from central moments are translation-invariant. In the following, a set of SCAMIs is presented in this explicit central shape-color moment form.
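The definitions above can be illustrated with a discrete sketch. It assumes that the shape-color moment has the generalized color moment form, a sum over pixels of x^p y^q R^a G^b B^c, and that the central version subtracts the spatial centroid and the channel means before raising to the powers. Both forms are assumptions, since the equations are not reproduced in this text. The function names and the 2x2 test image are hypothetical, and the pixel sum stands in for the integral.

```python
from fractions import Fraction


def shape_color_moment(image, p, q, a, b, c, center=(0, 0, 0, 0, 0)):
    """Sum of (x-x0)^p (y-y0)^q (R-r0)^a (G-g0)^b (B-b0)^c over all pixels."""
    x0, y0, r0, g0, b0 = center
    total = 0
    for y, row in enumerate(image):
        for x, (r, g, bl) in enumerate(row):
            total += ((x - x0) ** p * (y - y0) ** q *
                      (r - r0) ** a * (g - g0) ** b * (bl - b0) ** c)
    return total


def centers(image):
    """Spatial centroid and channel means as first-order over zeroth-order moments."""
    m0 = shape_color_moment(image, 0, 0, 0, 0, 0)
    first_orders = [(1, 0, 0, 0, 0), (0, 1, 0, 0, 0),
                    (0, 0, 1, 0, 0), (0, 0, 0, 1, 0), (0, 0, 0, 0, 1)]
    return tuple(Fraction(shape_color_moment(image, *o), m0) for o in first_orders)


# Hypothetical 2x2 RGB image, indexed as image[y][x].
image = [[(10, 0, 0), (20, 0, 0)],
         [(30, 10, 0), (40, 10, 4)]]
center = centers(image)

# Adding a constant offset to every channel is a translation in color space.
shifted = [[(r + 7, g + 7, b + 7) for (r, g, b) in row] for row in image]

mu = shape_color_moment(image, 1, 0, 1, 0, 0, center)
mu_shifted = shape_color_moment(shifted, 1, 0, 1, 0, 0, centers(shifted))
print(center, mu, mu_shifted)
```

Exact rational arithmetic is used here only to make the equality checks unambiguous; a practical implementation would work with floating-point arrays.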
3.2 Shape Invariant Primitive
In the 2-D geometric space, the central affine transformation matrix can be defined as:
The area of a triangle is a relative invariant under the affine transformation in equation (13). The area of the triangle formed by two points and the origin can be written as:
The constant coefficient in the area formula has no effect on the definition of the invariants, so for convenience it can be ignored.
After affine transformation in equation (13), the can be described as:
Where is the Jacobi determinant of the affine transformation in equation (13).
It follows that the triangle area is a relative invariant under the affine transformation; it is called the shape primitive. The shape-invariant core can then be constructed from primitives formed by different points. The primitives are combined by multiplication, as presented in equation (18).
Where is the number of participating points in the shape primitives, is the number of the primitives and is the multiplicity of the i-th point appearing in the shape core. And are the points with .
The shape core is a relative invariant under the affine transformation.
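The relative invariance of the shape primitive and of a shape core can be checked numerically. The sketch below assumes that the primitive of two points is the cross product x1*y2 - x2*y1, that is, twice the signed triangle area with the constant coefficient dropped. The matrix, the points, and the example core S(1,2)^2 * S(1,3) are hypothetical choices.

```python
def shape_primitive(p, q):
    """Cross product of two points: twice the signed area of the triangle (origin, p, q)."""
    return p[0] * q[1] - q[0] * p[1]


def transform(a, p):
    """Apply the 2x2 central affine matrix a to the point p."""
    return (a[0][0] * p[0] + a[0][1] * p[1],
            a[1][0] * p[0] + a[1][1] * p[1])


A = [[2, 1],
     [1, 3]]
det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # Jacobian determinant, here 5

p1, p2, p3 = (1, 2), (3, -1), (0, 4)
q1, q2, q3 = (transform(A, p) for p in (p1, p2, p3))

# A single primitive picks up one factor of det(A).
s_before = shape_primitive(p1, p2)
s_after = shape_primitive(q1, q2)

# A core built from three primitives picks up det(A) to the third power.
core_before = shape_primitive(p1, p2) ** 2 * shape_primitive(p1, p3)
core_after = shape_primitive(q1, q2) ** 2 * shape_primitive(q1, q3)
print(s_before, s_after, core_before, core_after)
```

The exponent of the determinant equals the number of primitives in the core, which is the factor the normalization of an invariant has to cancel.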
3.3 Color Invariant Primitive
Similarly, in the 3-D geometric space, the central affine transformation matrix can be expressed as:
The volume of a parallelepiped is relatively invariant under the affine transformation in equation (20).
As mentioned above, in the 3-D color space of image, the image pixel can be expressed as . For brevity, the pixel can be written as . In the color space, the volume of a parallelepiped determined by three points and the origin point can be written as:
Then the can be defined as the corresponding points after the affine transformation in equation (20). And the volume after transformation becomes as follow:
Where is the Jacobi determinant of the affine transformation in equation (20).
It is obvious that the volume is relatively invariant under the affine transformation; it is called the color primitive. The color-affine core can then be constructed in the same way as the shape-affine core, and can be written as:
Similarly, is the number of points participating in the color primitives, is the numbers of color primitives and is the multiplicity of the -th pixel appearing in the color core. And are the points with .
If we set as the corresponding core of the under the affine transformation, according to the equation (22), we can get that
So the color core is also a relative invariant under the affine transformation.
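The same check can be made in color space. The sketch below assumes that the color primitive of three pixels is the 3x3 determinant of their (R, G, B) vectors, which is the signed volume of the parallelepiped they span with the origin. The matrix and pixel values are hypothetical.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w (signed parallelepiped volume)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def transform3(m, c):
    """Apply the 3x3 central affine matrix m to the color vector c."""
    return tuple(sum(m[k][j] * c[j] for j in range(3)) for k in range(3))


B = [[1, 1, 0],
     [0, 2, 0],
     [1, 0, 3]]
det_b = det3(*B)   # Jacobian determinant of the color transformation, here 6

c1, c2, c3 = (1, 0, 0), (0, 1, 0), (1, 1, 1)
d1, d2, d3 = (transform3(B, c) for c in (c1, c2, c3))

v_before = det3(c1, c2, c3)
v_after = det3(d1, d2, d3)
print(det_b, v_before, v_after)
```

As with the shape primitive, the volume is multiplied by exactly one factor of the determinant, so a product of several color primitives scales by the corresponding power.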
Based on the structure of geometric invariant primitives, this part introduces the general method to construct SCAMIs. For the integral kernel , the multiple integral can be defined as:
Where is an expression of points and are the coordinates of the -th points. And this is a -ple integral for the expression . Based on the multiple integral and the geometric invariant primitives, the integral-based method to construct Shape-Color Affine Moment Invariants is proposed. In the Fig.5, the method is presented visually.
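A discrete analogue may help to see how a multiple integral over a core turns into an expression in moments. The sketch below is a simplification under stated assumptions: it uses only a shape core, S(1,2)^2, gives every point unit weight, and replaces the double integral with a double sum over a small point set. It is not the SCAMI formula itself. Expanding the square shows that the double sum equals 2(m20*m02 - m11^2), where the m's are raw moments of the point set.

```python
from itertools import product


def cross(p, q):
    """Shape primitive of two points (cross product)."""
    return p[0] * q[1] - q[0] * p[1]


def double_sum(points):
    """Discrete stand-in for the double integral of the kernel S(1,2)^2."""
    return sum(cross(p, q) ** 2 for p, q in product(points, repeat=2))


def raw_moment(points, i, j):
    """Raw geometric moment of order (i, j) of a unit-weight point set."""
    return sum(x ** i * y ** j for x, y in points)


def transform(a, p):
    """Apply the 2x2 central affine matrix a to the point p."""
    return (a[0][0] * p[0] + a[0][1] * p[1],
            a[1][0] * p[0] + a[1][1] * p[1])


points = [(1, 0), (0, 2), (3, 1), (-1, -1)]

lhs = double_sum(points)
rhs = 2 * (raw_moment(points, 2, 0) * raw_moment(points, 0, 2)
           - raw_moment(points, 1, 1) ** 2)

# Under a central affine map the double sum scales by det(A)^2.
A = [[2, 1],
     [1, 3]]                                   # det(A) = 5
moved = [transform(A, p) for p in points]
scaled = double_sum(moved)
print(lhs, rhs, scaled)
```

In the continuous setting each integration variable contributes an additional Jacobian factor, which is why the invariant needs a normalizing denominator; the plain sum used here omits that factor.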
Based on it, the theorem is shown as:
The shape-color moment invariants are constructed as equation (26).
Under the transformations of and , the denominator of equation (26) will be
Therefore, equation (26) is invariant to the dual affine transformations of shape and color, which completes the proof.
Theorem 1 gives the general form of the SCAMI. It is the first time that an invariant to dual affine transformations of shape and color, the most complicated linear transformation, has been derived directly. In this framework, the invariants can be extended to higher orders and higher dimensions, and in theory an infinite number of SCAMIs can be generated. The unlimited supply of invariant cores makes the method easy to extend. In order to instantiate the invariants, various invariant cores are substituted into equation (26), as demonstrated in Section 4.
4 SCAMI24 in Lower Order
4.1 The Shape-Color Invariant Cores
According to the definition of moment invariants, the high-order moments involved in the invariant functions are susceptible to noise in the images. This has also been illustrated in many works [mindru2004moment; mindru2002model]. Another limitation of high-order invariants is their computational complexity. With these considerations in mind, we choose all the shape cores whose orders are no more than 4 and whose number of participating points is also no more than 4. The orders of the color cores are no more than 2, and the number of participating points is limited to 3.
Points 1, 2, and 3 denote the participating points in the shape cores, and an edge between two points denotes the primitive constructed from its two end points. Thus, Fig 6 represents the shape core , Fig 6 represents the shape core . All the shape cores whose orders are no more than 4 and whose number of participating points is no more than 4 can then be represented as graphs, as shown in Fig 7. There are 50 shape cores in total.
Finally, the 50 shape cores are combined with the 2 color cores, forming a total of 100 shape-color invariant cores. Each of the 100 shape-color cores is substituted into equation (26) to obtain 100 shape-color dual-affine invariants.
4.2 Independence of the SCAMIs
With the selection of the shape-color cores, we obtain all the low-order SCAMIs under the stated limits. It is easy to expand them into rational expressions of the shape-color moments in equation (9). Unfortunately, some SCAMIs are useless because they are equal to zero. Another important factor in the selection of SCAMIs is the dependency among the invariants. It is important for applications to verify the independence of the invariants, and this is a complicated task because the invariants are high-degree polynomials in the moments. There are several criteria for the selection of SCAMIs, which we group into three classes.
Zero invariants. In the first case, the expansions of some SCAMIs are identically zero. These invariants cannot be used as image descriptors, because they take the same value for every image to which they are applied. There are 30 SCAMIs of this kind in total.
In another case, as demonstrated in [suk2004graph], when central moments are used, all first-order moments are zero by definition and, consequently, invariants built on them are zero too. This also holds for the shape-color moments in equation (9): when central moments are used, the first-order shape-color moments are equal to zero, whether they are first-order shape moments or first-order color moments. In this case, 32 SCAMIs are equal to zero in total.
Linear combinations. Dependencies among invariants are common: some invariants may be equal to another, or equal to a linear combination of some other invariants. After excluding the invariants equal to zero, 38 SCAMIs are left. All possible linear combinations of the SCAMIs are checked. The number of possible linear combinations is so large that it is difficult to enumerate them. Fortunately, a necessary condition for linear dependence is that all invariants involved have the same numbers of shape-color moments of the same order, which greatly reduces the number of cases to check. Finally, four linear dependencies are found in total. Hence 34 linearly independent SCAMIs remain.
Functional dependencies. “Linearly independent” does not mean “absolutely independent”. There may exist functional relationships, in the form of higher-order polynomial dependencies, among these SCAMIs. Verifying their independence is important for applications, but it is a complicated task, because the invariants are high-degree polynomials in the moments.
Fortunately, there is a theorem as follows:
The condition for dependence of n functions of n+p variables is that every determinant of order n formed from the matrix of the first partial derivatives vanish identically [brown1935functional].
This means that the n functions are independent as long as at least one determinant of order n formed from that matrix does not vanish identically.
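The criterion amounts to asking whether the Jacobian matrix has full row rank. The toy sketch below applies it to two functions of three variables (n = 2, p = 1). The functions, the evaluation point, and the helper names are hypothetical, and the Jacobians are written out by hand. Full rank at a single point is enough to show independence, whereas a rank deficiency at one point only suggests dependence and would need to hold identically.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def jacobian_dependent(a, b, c):
    """f1 = a + b + c and f2 = (a + b + c)^2, so f2 = f1^2."""
    s = a + b + c
    return [[1, 1, 1],
            [2 * s, 2 * s, 2 * s]]


def jacobian_independent(a, b, c):
    """f1 = a + b + c and f2 = a*b + c, with no functional relation."""
    return [[1, 1, 1],
            [b, a, 1]]


point = (2, 3, 5)
rank_dep = rank(jacobian_dependent(*point))
rank_ind = rank(jacobian_independent(*point))
print(rank_dep, rank_ind)
```

For the 34 SCAMIs the same test is applied to a 34 x 150 Jacobian, where computing the rank directly avoids enumerating individual determinants.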
In the definition of the 34 linearly independent SCAMIs, there are 150 shape-color moments, like , , and , in total. So 34 functions with 150 variables are established. And a matrix of the first partial derivatives needs to be checked. The complexity to calculate an -order determinant is , and the number of -order sub-matrix is . Thus the complexity of this problem is . In our experimental environment, the problem is too complex to solve within an acceptable time.
Fortunately, not every SCAMI contains all 150 variables; most of them contain 20 to 60. Only three pairs of SCAMIs contain exactly the same variables. Therefore, the matrix is very sparse, with most entries equal to zero, and the computational complexity is greatly reduced. The rank of the matrix is calculated in Maple 2015 and the result is 34. We can therefore conclude that the 34 invariants above are independent.
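To give a feeling for the size of this search, the snippet below counts the square submatrices of order 34 in a 34 x 150 matrix, which is the number of ways to choose 34 of the 150 columns. Treating this binomial coefficient as the count referred to above is an assumption, since the expressions are not reproduced in this text.

```python
import math

n_functions = 34    # linearly independent SCAMIs
n_variables = 150   # shape-color moments appearing in them

# Each order-34 minor corresponds to a choice of 34 columns out of 150.
minors = math.comb(n_variables, n_functions)
digits = len(str(minors))
print(f"number of order-{n_functions} minors has {digits} decimal digits")
```

Even before the cost of each determinant is taken into account, a count of this magnitude rules out checking the minors one by one.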
4.3 Invariant Capability Analysis
In order to verify the performance of each invariant, we designed synthetic experiments under different transformations. The experiments aim at evaluating the robustness and discriminative power of the SCAMIs under deviations from ideal conditions. Artificial shape and color transformations were applied to sample images, including rotation, scaling, affine, and color affine transformations, which are shown in Fig. (8).
In addition, we generate 400 deformed images with multiple transformations as a final evaluation. The mean relative error (MRE) is calculated for each SCAMI on all datasets. The results of these five groups of experiments are listed in Table 1.
The results show the characteristics of each SCAMI under the different deformations. For the shape deformations listed in the first three columns of the table, the MRE of most SCAMIs is less than 10%. Several SCAMIs are not accurate enough for particular transformations. For example, SCAMI(7), SCAMI(20), and SCAMI(12) have a high MRE under scaling but a low MRE under the other transformations. This is determined by the characteristics of the invariant cores that make up the SCAMIs. For the color deformations listed in the fourth column of the table, the SCAMIs are not as accurate as under shape deformations. This is mainly due to the discrete calculation of color affine transformations. The image values are distributed between 0 and 255, and a color affine transformation may map different values to the same value, or even outside of the range. This results in a certain loss of information, especially under severe transformations as shown in Fig. 8. Nevertheless, most of the MREs are less than 20%, which does not affect the effectiveness of the SCAMIs. Their advantage over other color descriptors will be demonstrated in the experiments.
For a comprehensive evaluation, the results with multiple deformations, listed in the fifth column, are taken as the criterion. Finally, 24 effective SCAMIs, whose MRE is no more than 10% in the experiment with multiple deformations, are selected; they can be used as basic descriptors in image analysis and understanding.
For the sake of clarity, the serial numbers are rearranged from 1 to 24. The invariant cores chosen for the 24 SCAMIs are presented in Table 2, where is the volume of the parallelepiped consisting of 3 points, which is clarified in equation (21).
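The text does not spell out how the MRE is computed. A minimal sketch, assuming the common definition (the average over all deformed images of the absolute deviation from the value on the original image, divided by the magnitude of that original value), could look as follows. The function name and the sample values are hypothetical.

```python
def mean_relative_error(reference, observed):
    """Average of |value - reference| / |reference| over the deformed images."""
    if reference == 0:
        raise ValueError("relative error is undefined for a zero reference value")
    return sum(abs(v - reference) / abs(reference) for v in observed) / len(observed)


# Hypothetical values of one invariant: on the original image and on four deformed copies.
original_value = 4.0
deformed_values = [4.0, 5.0, 3.0, 4.5]

mre = mean_relative_error(original_value, deformed_values)
print(f"MRE = {mre:.2%}")
```

The guard against a zero reference value mirrors the earlier observation that identically zero invariants cannot serve as descriptors.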
|SCAMI||Color Core||Shape Core|