The world we live in is rich and varied, yet there must be invariant factors that allow people to identify the same object despite such changes. Among the complex transformations in nature, affine transformation is one of the most basic and common issues in computer vision. It is the largest set of linear transformations and includes rotation, translation, scaling, et cetera. Geometric affine transformation in shape space can be used to approximate the projective transformations in images where the camera undergoes a slight change of viewing perspective, which is the most common situation in the imaging process. Photometric affine transformation in channel space is the best linear model to describe the color degradation of outdoor scenes, as reported in Geusebroek et al. (2001). Affine transformations often occur in both shape and color space simultaneously, which can be termed Dual-Affine Transformation (DAT). DAT is a useful and common transformation model for describing the degradations of visual information from our diverse and colorful world, since it considers both shape and color degradation. Invariants under DAT are an important kind of image feature that can be used for object classification and object retrieval.
With the diversification of visual data acquisition methods, the formats of visual data have also diversified. Data exist in different dimensions and with different numbers of color channels, such as 2D color images, 3D color objects, 4D movies of a 3D color object, and even higher-dimensional data and multi-spectral images with many channels. Other data organized in channels (such as vector fields) can also be included in this work. Some examples are shown in Figure 1.
In general, dual-affine moment invariants are derived separately for objects of a particular dimension with a certain number of channels. To the best of our knowledge, there is no general framework to derive invariants for all of these data formats. In this paper, we systematically analyze the properties of invariants under DAT for different data dimensions and present a general framework for deriving invariants under DAT for objects in $M$-dimensional shape space with $N$ color channels.
1.1 Object definition in $M$-dimensional space with $N$ channels
An object in $M$-dimensional ($M$-D) shape space with $N$ channels can be described as Equation (1):
where represents the coordinate in $M$-D space, and represent the functions in the different channels. The object can also be defined in another form, as Equation (2):
where represents the index of pixels of object . represents the coordinate in shape space and represents the values in channel space.
Many kinds of data are organized in channels, which carry different meanings in different data, such as colors in color objects and vectors in vector fields. By setting $M$ and $N$ to different values, we obtain different data formats. For example, a gray image is an object in 2D space with 1 channel, where $M=2$ and $N=1$. A 2D vector field can be regarded as an object in 2D space with 2 channels, where $M=2$ and $N=2$; the values of the 2 channels represent the vector at each point. A color image is an object in 2D space with 3 channels, where $M=2$ and $N=3$. Both a 3D vector field and a 3D point cloud are objects in 3D space with 3 channels. The difference is that the values of the 3 channels represent the vector at each point in a 3D vector field and the color at each point in a 3D point cloud. Some objects in different formats are shown in Figure 1.
1.2 Dual-Affine Transformation (DAT)
The definition of -D Affine Transformation Matrix (ATM) is listed in Equation (3), which is a matrix:
After -D affine transformation, the coordinates of object pixels in space should be transformed to , which can be expressed by Equation (4):
After -D affine transformation, the pixel values at different channels will be transformed to in the same way as Equation (5):
Affine transformations in shape space and channel space often occur simultaneously, which can be called Dual-Affine Transformation (DAT). DAT is the transformation we focus on in this article. The formal definition of DAT for $M$-D objects with $N$ channels is given in Equation (6):
where is the Spatial Transformation and is the Channel Transformation. We depict the relationship between them with DAT in Figure 2. DAT is equivalent to applying a Spatial Transformation and a Channel Transformation on the object independently.
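The independence of the two transformations can be illustrated with a minimal NumPy sketch. It assumes that an object is stored as two arrays, `coords` of shape (K, M) and `values` of shape (K, N), and that each affine transformation has the usual form of a linear part plus a translation. The function name `apply_dat` and the parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np

def apply_dat(coords, values, A_s, t_s, A_c, t_c):
    """Apply a spatial affine map to coords (K, M) and a channel affine map to values (K, N).

    Assumed form: x' = A_s x + t_s in shape space, c' = A_c c + t_c in channel space.
    """
    new_coords = coords @ A_s.T + t_s   # Spatial Transformation
    new_values = values @ A_c.T + t_c   # Channel Transformation
    return new_coords, new_values

# Toy 3D color object: K points, M = 3 coordinates, N = 3 channels.
rng = np.random.default_rng(0)
K, M, N = 100, 3, 3
coords = rng.normal(size=(K, M))
values = rng.uniform(size=(K, N))

A_s, t_s = rng.normal(size=(M, M)), rng.normal(size=M)
A_c, t_c = rng.normal(size=(N, N)), rng.normal(size=N)

new_coords, new_values = apply_dat(coords, values, A_s, t_s, A_c, t_c)
```

Because the two maps act on separate arrays, applying only the spatial part (with an identity channel map) gives the same coordinates as the joint call.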
One typical case of DAT is for color images, where $M=2$ and $N=3$. The 2D geometric affine transformation in shape space represents the deformation of the object's shape in the images, and the 3D photometric affine transformation in channel space represents the deformation of the object's color. Moment invariants under DAT of color images have been fully discussed in Gong et al. (2017). Another case is invariants under DAT in 2D vector fields, which have been presented in Schlemmer et al. (2007); Kostková et al. (2018); Kostkova et al. (2019). There is also research on 3D moment invariants under a single affine transformation for 3D objects with one channel in Lo and Don (1989); Xu and Li (2008). However, with the continuous advancement of data acquisition methods, more and more forms of data have emerged, such as 3D color objects, 4D or higher-dimensional data, and multi-spectral images with more than 3 channels. Invariant descriptors under dual-affine transformations in these domains are highly desirable.
The main contribution of this paper is a general framework for the derivation of moment invariants under DAT for $M$-dimensional objects with $N$ channels. With this framework, we obtain a set of independent and usable Dual-Affine Moment Invariants (DAMI) for 3D color objects with 3 channels. The paper is organized as follows. First, we survey the literature relevant to DAMI in Section 2. In Section 3, we explain the DAMI derivation framework in detail and obtain the DAMI generating formula for objects in any data format. In Section 4, we instantiate DAMI with the 3D color point cloud data format as an example. Through a visual approach, we generate a complete set of DAMI equations with . We perform an independence analysis on all DAMI equations and obtain a set of independent DAMI. In Section 5, we evaluate the ability of DAMI through numerical experiments on several data sets.
2 Related Works
Moment invariants were introduced into visual pattern recognition by Hu (1962) for character recognition; they are invariant under 2D similarity transformation. These seven invariants are regarded as shape descriptors and play an important role in the field of pattern recognition. Since then, various kinds of moment invariants have been proposed for different data formats. Researchers in this field mainly focus on the following three directions: (i) extending the transform group of invariants to more complex ones; (ii) deriving invariants for data of different dimensions or formats; (iii) deriving invariants that take both shape and color degradations into consideration.
For gray-scale images, Flusser and Suk (1993) extended Hu's moment invariants to affine transformation, which is more common in nature than similarity transformation. Related works are presented in Flusser and Suk (1994); Suk and Flusser (2004a, 2011). Invariants under projective transformation, which is the exact transformation model of the imaging process, have been analyzed in Weiss (1988); Suk and Flusser (2004b); Li et al. (2018). There are also other interesting transformations that apply not only to image data. Invariants under these transformations are also an important direction in the field, such as chiral invariants in Zhang et al. (2017), invariants under Möbius transformation in Zhang et al. (2018), and research on invariants under conformal, quasi-conformal, and diffeomorphic transformations in White (1973); Willmore (2000); Zeng and Gu (2011); Rustamov et al. (2013). The invariant primitive is the most important part of the derivation of invariants and captures the essential invariant property under a certain transform group. Invariant construction frameworks are always based on primitives. Under affine transformation in 2D space, the cross-product of the coordinates of two points is the covariant primitive, which is used to derive affine invariants for gray-scale images in Suk and Flusser (2004a). In Xu and Li (2008), the distances, areas, and volumes formed by the coordinates of points are considered as the covariant primitives for different transformations. In Mamistvalov (1998), the essential primitive under affine transformation in $n$-D space is taken to be the determinant of the coordinates of points. It is the same quantity, in a different form, as the area of a triangle in 2D space and the volume of a tetrahedron in 3D space. Li et al. (2017) analyze the generating formula of invariants under different transformations in detail, which can be considered as the Shape-DNA.
Some studies extend moment invariants to different data formats. Some works analyze shape features by extracting invariants for curves in 2D space, such as Mundy et al. (1992); Zhao and Chen (1997). For 3D objects with one channel, research on 3D affine moment invariants can be found in Lo and Don (1989); Novotni and Klein (2003); Xu and Li (2008). Invariants for n-D space have been studied in Mamistvalov (1998).
Color information is important for the recognition of objects. Some color descriptors are simple and useful in applications, such as the color histogram in Funt and Finlayson (1995) and the color moment in Stricker and Orengo (1995), but they do not consider the color degradation model found in reality. Other color descriptors consider the degradation model of illumination, such as the works in Gevers and Smeulders (1999); Li et al. (2009); Gong et al. (2013). Many works consider shape and color degradation in the same framework. Mindru et al. (2004) proposed a kind of color moment invariant with the Lie group method for the color diagonal transformation only. Gong et al. (2017) proposed moment invariants under dual-affine transformation for 2D color images. Other invariants for color images can be found in Alferez and Wang (1999); Mindru et al. (2004); Suk and Flusser (2009); Mo et al. (2017). Affine transformation is considered the best model to describe outdoor color degradation statistically in Geusebroek et al. (2001). In this paper, we present a general framework for the derivation of moment invariants under dual-affine transformation for $M$-D objects with $N$ channels. With the 3D color point cloud as an example, we obtain a set of independent and usable DAMI with .
3 Dual-Affine Moment Invariants (DAMI)
The flowchart of the derivation, instantiation, and usage of DAMI is depicted in Figure 3. We detail the DAMI derivation framework in this section and leave the instantiation and usage parts to Sections 4 and 5. The general derivation framework for moment invariants is to first find the covariant primitive under a given transformation, and then derive the invariants in the global domain through the integral framework and the normalization factor. First, we give the definition of the covariant primitive (Step 1) in Section 3.1 and the corresponding Space Kernel and Channel Kernel under affine transformation (Step 2) in Section 3.2. With the proposed Space Kernel and Channel Kernel, we derive the moment covariants under DAT (Step 3) through integration in Section 3.3. In Section 3.4, we obtain the generating formula of DAMI after applying normalization (Step 4). Finally, we analyze the null space and a special case of DAMI in Sections 3.5 and 3.6.
3.1 Covariant primitive
Theorem 1. The determinant of the matrix of coordinates of $M$ points in $M$-D space is an affine covariant primitive.
The determinant of the coordinates of $M$ points in $M$-D space, , , …, , is the covariant primitive in shape space, as Equation (7):
After the $M$-D affine transformation , the transformed becomes Equation (8):
where is the matrix of the transformed coordinates as Equation (9):
From the result we can see that the determinant before and after the transformation differs only by a scalar coefficient, which is the determinant of the transformation matrix . Therefore, is a covariant primitive in $M$-D space, and Theorem 1 is proved.
Similarly, in $N$-D channel space, the determinant of the matrix of $N$ points with $N$ channels, , , …, , is the covariant primitive in channel space, as Equation (10):
where is the matrix of the channel values of the points. After the $N$-D affine transformation, the transformed becomes Equation (11):
From the results, we can see that the matrix determinant is covariant under affine transformation in $N$-D channel space.
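The covariance of the determinant can be checked numerically. The following sketch assumes that $M$ points are stacked as the rows of an $M \times M$ matrix and that only the linear part of the affine map acts on them, that is, the translation is assumed to have been removed by placing the origin at the centroid. The function name `primitive` is hypothetical.

```python
import numpy as np

def primitive(points):
    """Determinant of an (M, M) array whose rows are M points in M-D space."""
    return np.linalg.det(points)

rng = np.random.default_rng(1)
M = 3
points = rng.normal(size=(M, M))   # M points, one per row
A = rng.normal(size=(M, M))        # linear part of the affine transformation

before = primitive(points)
after = primitive(points @ A.T)    # every point x is mapped to A x

# The two determinants differ only by the scalar factor det(A).
factor = np.linalg.det(A)
```

The same check applies unchanged in channel space, with $N$ points and an $N \times N$ channel transformation matrix.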
3.2 Space Kernel and Channel Kernel
In which, and represent or different points and different and represent different combinations of points. All points involved in building constitute the Point Set . And all points involved in building constitute the Point Set . The number of points in the point set, and , are called the degree. And and are the order of corresponding primitives separately. And the and are called the order.
3.3 Covariant through integration
To define the moment covariant on object , we integrate the Space Kernel and the Channel Kernel over the object, as Equation (3.3):
The relationship between before and after the $M$-D shape affine and $N$-D channel affine transformations is given in Equation (3.3):
We can see that the difference between and is only a scalar coefficient, which depends on the transformation matrices and . The is covariant under DAT.
3.4 Invariant through normalization
To achieve invariance on object , we should eliminate the covariant scalar of the moment covariant. We use a normalization factor consisting of some special moment covariants to divide the moment covariant in Equation (3.3). We then obtain the DAMI under DAT in $M$-D space with $N$ channels, as Equation (3.4):
The degree is , and the order is . The normalization factor can be any other one that eliminates the covariant scalar in the numerator.
This definition can be proved in Equation (3.4):
The invariants before and after the transformation are exactly equal, so the quantity defined above is invariant under DAT in $M$-D space with $N$ channels and can be called the Dual-Affine Moment Invariant (DAMI).
3.5 Null space of DAMI
As we can see from the definitions of the covariant primitives in Equations (7) and (10), when the determinants of all primitives are equal to 0, the DAMI is undefined. This case can be called the null space of DAMI.
When the null space occurs in the primitives in shape space, the matrix of the coordinates is singular, and some axis is a linear combination of the other axes. The object then lies in a lower-dimensional hyperplane of the original space, where the DAMI is undefined. For this kind of object, we can first find all the independent axes, say there are , and then derive DAMI on these axes. That makes the DAMI work again.
When the null space occurs in the primitives in channel space, the situation is the same as in shape space: some channels are linearly dependent on others. We can find all the independent channels, say there are , and then derive DAMI on these independent channels. That makes the DAMI work again. Kostková and Flusser (2019) discuss the null space of color space to some extent. This situation is rarely seen in RGB images, but may exist in multi-spectral images.
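One way to detect the null space in practice is to compute the rank of the centered coordinate (or channel) matrix. The sketch below is an illustrative check rather than the procedure used in this paper. The function name `independent_dimension` and the tolerance are assumptions.

```python
import numpy as np

def independent_dimension(data, tol=1e-8):
    """Number of linearly independent axes (or channels) of a (K, D) array.

    The data are centered first so that a pure translation does not hide
    a degenerate configuration.
    """
    centered = data - data.mean(axis=0)
    return int(np.linalg.matrix_rank(centered, tol=tol))

rng = np.random.default_rng(2)

# Generic 3D points: all three axes are independent.
full = rng.normal(size=(200, 3))

# Points lying on the plane z = 2x + 3y: the third axis depends on the others.
xy = rng.normal(size=(200, 2))
planar = np.column_stack([xy, 2.0 * xy[:, 0] + 3.0 * xy[:, 1]])

rank_full = independent_dimension(full)      # 3
rank_planar = independent_dimension(planar)  # 2
```

When the returned rank is smaller than the number of columns, the invariants would be derived on the reduced set of independent axes or channels, as described above.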
3.6 Special case
When $M=2$ and $N=2$, DAMI can be used on 2D vector fields with 2 channels. The Spatial Transformation is the coordinate transformation of the vector field, and the Channel Transformation is the vector transformation at each point. The DAMI is then given by Equation (3.6):
4 Instantiation of DAMI
In this section, we use the DAMI for 3D color objects as an example to demonstrate how to instantiate DAMI.
To instantiate the DAMI, we can select different Space Kernels and Channel Kernels with different combinations of points. To make the selection clearer and more organized, we use a visualization method, as shown in Figure 4.
In the figure, we use to represent the indexes of different points, and each triangle is a covariant primitive. is the degree of the DAMI, which represents how many points participate in the construction in total. The primitives constructed with these points can then be represented with , , …, and so on. Each combination of these triangles (with specified order) is a Space Kernel or Channel Kernel of DAMI. There is a limited number of triangles in the figure, and all combinations of Space Kernels and Channel Kernels in each subfigure make up a Complete Set under the specific degree and specific order . Note that the Complete Set with a lower degree is included in that with a higher degree. All that remains is to specify the and as different indexes of points.
Here, we only consider Space Kernels and Channel Kernels with order and degree both not greater than 4, which correspond to Figure 4(b). There are four triangles in total, , , and . We use them to construct the kernels and specify different orders not greater than 4. We then obtain different DAMIs with different triangle combinations and different parameters . All of the combinations are listed in Table 1. Note that some combinations with other point indexes do not appear in the table, because they have the same topological structure as one already in the table. One example is , which does not appear in the table but has the same topological structure as the first element in the table, . These two yield the same invariant, so only one of them is listed. Finally, we obtain a Complete Set of all combinations with both degree and order not greater than 4. Each one has a specific index in Table 1. This index, in the first column of the table, is used only to make our enumeration more convenient and orderly.
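The count of four triangles follows from choosing 3 of the 4 points. As a small illustration, the primitives of degree 4 in 3D can be enumerated as below. Labeling each triangle by a tuple of point indexes is an assumption made for this sketch.

```python
from itertools import combinations

degree = 4     # number of points taking part in the construction
dimension = 3  # each primitive in 3D uses 3 points (one triangle)

# Every choice of 3 point indexes out of {1, 2, 3, 4} gives one triangle.
triangles = list(combinations(range(1, degree + 1), dimension))
# triangles == [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
```

Kernels are then formed as products of these primitives raised to chosen orders, and combinations that differ only by a relabeling of the points yield the same invariant.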
To calculate the DAMI in practice, we expand it into the form of moments. The basic geometric moment in 2D shape space is given in Equation (20).
The central moment is obtained by replacing the origin with the centroid, as Equation (21):
where and represent the centroid of the image along the corresponding axes.
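The effect of moving the origin to the centroid can be seen in a short NumPy sketch. It assumes the standard discrete definitions, a sum of $x^p y^q f(x, y)$ over the pixels for the geometric moment and the same sum with centered coordinates for the central moment. The function names are hypothetical.

```python
import numpy as np

def _grid(img):
    """Pixel coordinate grids (x along columns, y along rows) as floats."""
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    return xs.astype(float), ys.astype(float)

def geometric_moment(img, p, q):
    """Sum of x^p * y^q * f(x, y) over all pixels."""
    xs, ys = _grid(img)
    return float(np.sum(xs ** p * ys ** q * img))

def central_moment(img, p, q):
    """Same sum with the origin moved to the intensity centroid."""
    xs, ys = _grid(img)
    m00 = geometric_moment(img, 0, 0)
    x_bar = geometric_moment(img, 1, 0) / m00
    y_bar = geometric_moment(img, 0, 1) / m00
    return float(np.sum((xs - x_bar) ** p * (ys - y_bar) ** q * img))

# The same patch placed at two different positions on an empty canvas.
rng = np.random.default_rng(3)
patch = rng.uniform(0.1, 1.0, size=(5, 5))
canvas_a = np.zeros((12, 12))
canvas_a[1:6, 2:7] = patch
canvas_b = np.zeros((12, 12))
canvas_b[5:10, 4:9] = patch
```

The geometric moments of the two canvases differ, while their central moments agree, which is why central moments remove the translation part of the affine transformation.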
In 3D shape-color space, we define the moment as Equation (22):
The corresponding central moment is given in Equation (23):
We can expand all the DAMIs into the form of central moments. Note that some expand to 0, such as numbers 2, 3, 8, 9, 13, 17, 21, and 24 in Table 1. Several examples of expanded DAMIs are listed in Equations (24)-(26). We use the index listed in Table 1 to refer to the 29 DAMIs.
For brevity, the others are not listed here.
5 Experiments for DAMI
As listed in Table 1, excluding the 8 DAMIs that expand to 0, we obtain 21 usable DAMIs. To give a comprehensive analysis of these DAMIs, we conduct several experiments in this section. First, we select a typical object and design an experiment to evaluate the characteristics of DAMI under different types of transformations in Section 5.1. Then, in Section 5.2, we conduct experiments on some special objects with one or more planes of symmetry, which may affect the stability of some DAMIs. Through these cases, we obtain the characteristics of each DAMI. Based on this analysis, we give some guidelines for selecting high-order invariant kernels. Finally, we conduct a classification experiment on a dataset with dual-affine transformations in Section 5.3.
5.1 Invariance to different transformations
To evaluate the invariance of the DAMI, we use one object from the 3D-MNIST dataset (https://www.kaggle.com/daavoo/3d-mnist), which is in 3D point cloud format. We then assign a color to every point based on its coordinates and generate one 3D color object, as shown in Figure 5(a). We transform it with shape affine transformations (translation, rotation, scaling, and general affine), channel affine transformations, and dual-affine transformations. Examples of the 3D color object and its transformed versions are shown in Figure 5.
We calculate the 21 DAMIs for these objects. To measure the stability of the DAMIs, we calculate the coefficient of variation (CV), as Equation (27), for every DAMI under each type of transformation. The results are listed in Table 2. The smaller the CV, the more robust the invariant; 0 is the best.
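As a minimal sketch of this stability measure, the CV of one invariant over a set of transformed versions of an object can be computed as follows. It assumes the common definition, the standard deviation divided by the absolute value of the mean, which may differ in detail (for example, a percentage scale) from Equation (27). The function name is hypothetical.

```python
import numpy as np

def coefficient_of_variation(samples):
    """Population standard deviation divided by the absolute mean."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return float(samples.std() / abs(mean))

# A perfectly stable invariant takes the same value for every transformed object.
cv_constant = coefficient_of_variation([2.0, 2.0, 2.0])

# A value that changes across transformations gives a larger CV.
cv_spread = coefficient_of_variation([1.0, 2.0, 3.0])
```

A perfectly invariant quantity gives a CV of 0, and the CV grows as the values computed from the transformed objects spread out.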