A Survey of Orthogonal Moments for Image Representation: Theory, Implementation, and Evaluation

Image representation is an important topic in computer vision and pattern recognition. It plays a fundamental role in a range of applications towards understanding visual contents. Moment-based image representation has been reported to be effective in satisfying the core conditions of semantic description due to its beneficial mathematical properties, especially geometric invariance and independence. This paper presents a comprehensive survey of orthogonal moments for image representation, covering recent advances in fast/accurate calculation, robustness/invariance optimization, and definition extension. We also create a software package for a variety of widely used orthogonal moments and evaluate these methods on the same base. The presented theory analysis, software implementation, and evaluation results can support the community, particularly in developing novel techniques and promoting real-world applications.






I Introduction

In the mid-twentieth century, American mathematician Claude Elwood Shannon published his paper A mathematical theory of communication [1], which marked the creation of information theory. As a landmark contribution, information theory is the theoretical foundation of information storage, processing, and transmission in modern computer systems. On the other hand, it also dictates that the raw signal (i.e., digital data) of multimedia (e.g., image) is not semantic in nature. Therefore, one of the main requirements in many computer vision and pattern recognition applications is to have a "meaningful representation" in which semantic characteristics of the digital image are readily apparent, as shown in Fig. 1. This process is often termed image representation. For example, for recognition, the representation should highlight the most salient semantics; for denoising, it should efficiently distinguish between signal (semantically relevant) and noise (semantically irrelevant); and for compression, it should capture the most semantic information using the fewest coefficients.

Fig. 1: Image representation is an important topic in computer vision and pattern recognition.

Mathematically, a basic idea of image representation is to project the original image function onto a space formed by a set of specially designed basis functions and obtain the corresponding feature vector. For many years, dictionary (i.e., the set of basis functions) design has been pursued by many researchers along roughly two different paths: hand-crafted design and deep learning [2].

Recently, deep learning techniques, represented by Convolutional Neural Networks (CNN), have led to very good performance on a variety of problems in computer vision and pattern recognition. Deep learning based image representations are formed by the composition of multiple nonlinear transformations, mapping raw image data directly into abstract semantic representations without manual intervention (i.e., the end-to-end paradigm). For such representation learning methods, the dictionary can be considered as a composite function and is trained/learned by back-propagating error. Deep learning based image representations offer great flexibility and the ability to adapt to specific signal data. Due to their data-driven nature, this line of approaches is strongly influenced by the latest advances in optimization algorithms, computing equipment, and training data. As a result, deep learning approaches exhibit limitations in the following three aspects [3]: 1) The quality of the representation depends heavily on the completeness of the training data, i.e., a large and diverse training set is required. 2) The time/space cost of these approaches is often very high, which prevents them from being used in time-critical applications. 3) The robustness to geometric transformation is limited, requiring data augmentation to enhance geometric invariance, but at the cost of time/space complexity. In contrast, hand-crafted image representations are still competitive in the above three aspects. It is also worth mentioning that the successful experiences behind hand-crafted features are instructive for the design of deep learning methods such as PCANet [4] and spatial pyramid pooling [5].

In the pre-CNN era, hand-crafted representations and feature engineering made important contributions to the development of computer vision and pattern recognition. At the current stage, hand-crafted features still cannot be completely replaced, considering that some limitations of deep learning are precisely the strengths of hand-crafted representations [6]. The existing hand-crafted image representation methods can be roughly divided into four categories [7]:

  • Frequency transform – such as Fourier Transform, Walsh Hadamard Transform, and Wavelet Transform;

  • Texture – such as Scale Invariant Feature Transform (SIFT), Gradient Location and Orientation Histogram (GLOH), and Local Binary Patterns (LBP);

  • Dimensionality reduction – such as Principal Component Analysis (PCA), Singular Value Decomposition (SVD), and Locally Linear Embedding (LLE);

  • Moments and moment invariants – such as Zernike Moments, Legendre Moments, and Polar Harmonic Transforms.

Starting from the semantic nature of the representation task, it is clear that image representation should satisfy the following basic conditions [8]:

  • Discriminability – the representation reflects inter-class variations, i.e., two objects from two different classes have different features;

  • Robustness – the representation is not influenced by intra-class variations, i.e., two objects from one class have the same features.

Hand-crafted representations based on frequency transform, texture, and dimensionality reduction have been widely used in real-world applications. However, due to the inherent semantic gap between low-level descriptors and high-level visual concepts, these methods have flaws in one or both of robustness and discriminability. One example is that SIFT features exhibit synonymy and polysemy [9], caused by poor discriminability and robustness, respectively. In contrast, moments and moment invariants perform better in overcoming the semantic gap due to their beneficial mathematical properties:

  • Independence – the orthogonality of basis functions ensures no information redundancy in moment set, which in turn leads to better discriminability in image representation;

  • Geometric invariance – the invariants w.r.t. geometric transformation, e.g., rotation, scaling, translation, and flipping, can be derived from the moment set, meaning better robustness in image representation.

Moments and moment invariants were introduced to the pattern recognition community in 1962 by Hu [10]. Since then, over almost 60 years of research, numerous moment-based techniques have been developed for image representation with varying degrees of success. In 1998, Mukundan et al. [11] surveyed the main publications proposed until then and summarized the theoretical aspects of several classical moment functions. In 2006, Pawlak [12] gave a comprehensive survey on the reconstruction and calculation aspects of moments, with great emphasis on accuracy/error analysis. In 2007, Shu et al. [13–15] provided a brief literature review of the mathematical definitions, invariants, and fast/accurate calculations of the classical moments, respectively. In 2009, Flusser et al. [8] presented a unique overview of moment-based pattern recognition methods with significant contributions to the theory of moment invariants. The substantial expansion [16] of this book includes a more detailed analysis of invariant 3D object representation. In 2011, Hoang [17] reviewed unit disk-based orthogonal moments in his doctoral dissertation, covering theoretical analysis, mathematical properties, and specific implementation. Most of the above reviews do not cover the state-of-the-art methods of the past 10 years. In 2014, Papakostas et al. [18] gave a global overview of the milestones in 50 years of research and highlighted the recent rising topics in this field; however, the theoretical basis for these latest research directions is rarely introduced. More recently, in 2019, Kaur et al. [19] provided a comparative review of many classical and new moments; although it covers almost all the main literature, it still lacks an overall analysis of the current research progress in the various directions. A common weakness of all these works is that almost no software packages are made available, restricting further development in the community.

The significant contribution of this paper is to give a systematic investigation of orthogonal moments based image representation along with an open-source implementation, which we believe is a useful complement to [17–19]. For completeness, this paper starts with some basic theories and classical methods in the area of orthogonal moments. Different from the mentioned reviews, we pay special attention to the motivation and successful experiences behind these traditional works. Furthermore, we organize a theoretical discussion of the recent advances of orthogonal moments in different research directions, including fast/accurate calculation, robustness/invariance optimization, and definition extension. Such an overall analysis of the state-of-the-art research progress is mostly ignored in previous studies. In addition, we report a performance evaluation of widely used orthogonal moments in terms of moment calculation, image reconstruction, and pattern recognition. To embrace the concept of reproducible research, the corresponding software package is available online. In the end, several promising directions for future research are given along with some initial discussions/suggestions.

The rest of this paper is organized as follows. In Section II, we first give the basic idea of image moments and divide existing methods into different categories. In Section III, we further review in detail the unit disk-based orthogonal moments that are most relevant to image representation. Then, the recent advances of orthogonal moments within each research direction are reviewed and analyzed in Section IV. Furthermore, Section V reports the comparative results of state-of-the-art orthogonal moments along with an open-source implementation. Section VI gives a conclusion and highlights some promising directions in this field.

Fig. 2: A classification of image moments.

II Overview

Mathematically, the image moment $\langle f, V_{nm} \rangle$ is generally defined as the inner product of the image function $f$ and the basis function $V_{nm}$ of order $(n, m)$ on the domain $D$ [8]:

$$\langle f, V_{nm} \rangle = \iint_D f(x, y) V_{nm}^*(x, y) \, dx \, dy, \quad (1)$$

where the asterisk denotes the complex conjugate. The direct geometric interpretation of the image moment set is that it is the projection of $f$ onto a subspace formed by the set of basis functions $\{V_{nm}\}$ [17]. Because there are an infinite number of basis function sets, it is often necessary to manually design a set of special basis functions with beneficial properties that meet the semantic requirements. According to the mathematical properties of the basis functions, the family of image moments can be divided into different categories, as shown in Fig. 2.

Firstly, depending on whether the basis functions satisfy orthogonality, the image moments can be classified into orthogonal moments and non-orthogonal moments. Orthogonality means that any two different basis functions $V_{nm}$ and $V_{n'm'}$ from the basis function set are uncorrelated, i.e., they are "perpendicular" in geometric terms, leading to no redundancy in the moment set. Mathematically, $V_{nm}$ and $V_{n'm'}$ are orthogonal when the following condition is satisfied:

$$\iint_D V_{nm}(x, y) V_{n'm'}^*(x, y) \, dx \, dy = \delta_{nn'} \delta_{mm'}, \quad (2)$$

where $\delta$ is the Kronecker delta function defined as

$$\delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j. \end{cases} \quad (3)$$
Some of the most popular non-orthogonal moments are geometric moments [8], rotational moments [20], complex moments [21], and generic Fourier descriptor [22]. Due to the non-orthogonality of the basis functions, high information redundancy exists in such moments. This further leads to difficulties in image reconstruction and poor discriminability/robustness in image representation. Therefore, it is a natural requirement to satisfy orthogonality when designing basis functions.

Secondly, according to whether the basis function is continuous, the image moments can be divided into continuous moments and discrete moments. In the case of two-dimensional (2D) images, the basis functions for continuous and discrete moments are generally defined in the 2D real-valued space and the 2D digital image space, i.e., the domains $D \subseteq \mathbb{R}^2$ and $D \subseteq \mathbb{Z}^2$, respectively. When it is necessary to calculate the continuous moments of a digital image, a suitable discretization approximation of the continuous integral is often introduced, with corresponding computational errors. In Section IV-A, we will further describe the causes of and solutions to such errors. On the contrary, discrete moments, such as Tchebichef moments [23], Krawtchouk moments [24], Hahn moments [25], dual Hahn moments [26], and Racah moments [27], do not involve any approximation errors. Thus they are more suitable for high-precision image processing tasks, e.g., image reconstruction, compression, and denoising.

Finally, depending on the coordinate system that defines the basis functions, image moments can be grouped into Cartesian moments and circular moments. In the case of continuous moments, the basis functions of Cartesian moments are defined over a square domain in Cartesian coordinates $(x, y)$, while the domain for the circular moments is the polar-coordinate domain $\{(r, \theta) : 0 \leq r \leq 1, \, 0 \leq \theta < 2\pi\}$ (i.e., the unit disk). According to the proof of Bhatia et al. [28], the basis function will be invariant in form w.r.t. rotations of axes about the origin if and only if, when expressed in polar coordinates $(r, \theta)$, it is of the form:

$$V_{nm}(r, \theta) = R_n(r) A_m(\theta), \quad (4)$$

with angular basis function $A_m(\theta) = \exp(jm\theta)$; the radial basis function $R_n(r)$ can be of any form [17]. Let $f'(r, \theta) = f(r, \theta - \phi)$ be the rotated version of the original image $f(r, \theta)$. When $V_{nm}$ conforms to the form of (4), there must be a function of the rotation angle $\phi$ such that

$$\langle f', V_{nm} \rangle = \langle f, V_{nm} \rangle \exp(-jm\phi), \quad (5)$$

i.e., the moment magnitudes satisfy the rotation invariance $|\langle f', V_{nm} \rangle| = |\langle f, V_{nm} \rangle|$. Therefore, the Cartesian moments, such as Legendre moments [29], Gaussian-Hermite moments [30], Gegenbauer moments [31], Chebyshev moments [32], and the discrete moments listed above, have difficulties in achieving the rotation invariance. As for the calculation of circular moments, since digital images are generally defined in a Cartesian coordinate system, an appropriate coordinate transformation is often introduced, with corresponding computational errors. In Section IV-A, we will further describe the causes of and solutions to such errors.

From the above theoretical analysis, it is clear that the circular orthogonal moments are generally better suited than other kinds of moments as far as image representation tasks are concerned. Therefore, great scientific interest has been devoted to the circular orthogonal moments, mainly the unit disk-based orthogonal moments. As the most relevant works, existing unit disk-based orthogonal moments will be reviewed in the next section.

III Classical Orthogonal Moments

It can be checked from (4) that the basis functions of unit disk-based orthogonal moments are separable, i.e., they decompose into the product of radial basis functions and angular basis functions. Therefore, (2) can be rewritten as

$$\int_0^{2\pi} \int_0^1 R_n(r) \exp(jm\theta) R_{n'}^*(r) \exp(-jm'\theta) \, r \, dr \, d\theta = \delta_{nn'} \delta_{mm'}. \quad (6)$$

Since $\int_0^{2\pi} \exp(j(m - m')\theta) \, d\theta = 2\pi \delta_{mm'}$, the radial basis function should satisfy the following weighted orthogonality condition:

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (7)$$

(7) is a general requirement that must be considered when designing unit disk-based orthogonal moments; it ensures that the designed basis function set has the beneficial orthogonality property. The angular basis functions have a fixed form due to the proof of Bhatia et al. [28], which means that existing methods differ only in the definition of the radial basis functions. In this regard, there are mainly three types of orthogonal functions used in the definitions: Jacobi polynomials, harmonic functions, and eigenfunctions. Next, we will briefly introduce the specific methods in these three groups and give their radial basis function definitions in a unified, normalized form satisfying (7).
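As a quick numerical sanity check of this weighted orthogonality condition, the sketch below evaluates $\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr$ by midpoint quadrature for one simple harmonic-type family, $R_n(r) = \exp(j2\pi n r)/\sqrt{2\pi r}$ (an illustrative choice of the kind reviewed in Section III-B; the normalization here is our assumption):

```python
import numpy as np

# Radial kernel R_n(r) = exp(j*2*pi*n*r) / sqrt(2*pi*r): one family that
# satisfies the weighted orthogonality condition (7) on (0, 1].
def radial(n, r):
    return np.exp(2j * np.pi * n * r) / np.sqrt(2 * np.pi * r)

# Midpoint-rule quadrature of  int_0^1 R_n(r) R_n'(r)^* r dr
M = 100000
r = (np.arange(M) + 0.5) / M
dr = 1.0 / M
gram = np.array([[np.sum(radial(a, r) * np.conj(radial(b, r)) * r) * dr
                  for b in range(4)] for a in range(4)])

# Off-diagonal entries vanish; diagonal entries equal 1/(2*pi).
print(np.allclose(gram, np.eye(4) / (2 * np.pi), atol=1e-4))  # True
```

The diagonal value $1/2\pi$ (rather than 1) is exactly the weight left over after the angular factor contributes its $2\pi$.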

III-A Jacobi Polynomials

III-A1 Zernike Moments (ZM) [29]

The radial basis function of the ZM, $R_{nm}(r)$, is defined as

$$R_{nm}(r) = \sqrt{\frac{n+1}{\pi}} \sum_{k=0}^{(n-|m|)/2} \frac{(-1)^k (n-k)!}{k! \left( \frac{n+|m|}{2}-k \right)! \left( \frac{n-|m|}{2}-k \right)!} \, r^{n-2k}, \quad (8)$$

where $n \geq 0$ and $m$ satisfies $|m| \leq n$ and $n - |m|$ even. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_{nm}(r) R_{n'm}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (9)$$
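For reference, the standard (unnormalized) Zernike radial polynomial can be evaluated directly from its factorial sum; a minimal sketch (the normalized version discussed here differs only by a constant factor, and direct factorial evaluation is numerically stable only for moderate orders, see Section IV-A):

```python
from math import factorial

def zernike_radial(n, m, r):
    """Standard (unnormalized) Zernike radial polynomial R_nm(r);
    requires |m| <= n and n - |m| even."""
    m = abs(m)
    assert m <= n and (n - m) % 2 == 0, "invalid (n, m) pair"
    return sum(
        (-1) ** k * factorial(n - k)
        / (factorial(k)
           * factorial((n + m) // 2 - k)
           * factorial((n - m) // 2 - k))
        * r ** (n - 2 * k)
        for k in range((n - m) // 2 + 1))

print(zernike_radial(2, 0, 0.5))   # R_20(r) = 2r^2 - 1  ->  -0.5
```

A handy property for testing such implementations is that $R_{nm}(1) = 1$ for every valid $(n, m)$ pair.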


III-A2 Pseudo-Zernike Moments (PZM) [20]

The radial basis function of the PZM, $R_{nm}(r)$, is defined as

$$R_{nm}(r) = \sqrt{\frac{n+1}{\pi}} \sum_{k=0}^{n-|m|} \frac{(-1)^k (2n+1-k)!}{k! \, (n-|m|-k)! \, (n+|m|+1-k)!} \, r^{n-k}, \quad (10)$$

where $n \geq 0$ and $m$ satisfies $|m| \leq n$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_{nm}(r) R_{n'm}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (11)$$


III-A3 Orthogonal Fourier-Mellin Moments (OFMM) [33]

The radial basis function of the OFMM, $R_n(r)$, is defined as

$$R_n(r) = \sqrt{\frac{n+1}{\pi}} \sum_{k=0}^{n} \frac{(-1)^{n+k} (n+k+1)!}{(n-k)! \, k! \, (k+1)!} \, r^k, \quad (12)$$

where $n \geq 0$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (13)$$


III-A4 Chebyshev-Fourier Moments (CHFM) [34]

The radial basis function of the CHFM, $R_n(r)$, is defined as

$$R_n(r) = \frac{2}{\pi} \, r^{-1/4} (1-r)^{1/4} \, U_n(2r-1), \quad (14)$$

where $n \geq 0$ and $U_n$ denotes the Chebyshev polynomial of the second kind. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (15)$$


III-A5 Pseudo Jacobi-Fourier Moments (PJFM) [35]

The radial basis function of the PJFM, $R_n(r)$, is defined as

where $n \geq 0$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (17)$$


III-A6 Jacobi-Fourier Moments (JFM) [36]

The radial basis function of the JFM, $R_n(r)$, is defined as

where $n \geq 0$ and the parameters $(p, q)$ must fulfill $p - q > -1$ and $q > 0$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (19)$$


It is worth mentioning that the radial basis function of JFM is constructed directly from the original Jacobi polynomials [28], while the radial basis functions of ZM, PZM, OFMM, CHFM, and PJFM are all special cases of the Jacobi polynomials. Thus, JFM is, in fact, a generic expression of the above famous methods. By properly setting the values of the parameters $p$ and $q$, the radial basis functions of OFMM, CHFM, and PJFM can be directly obtained from that of JFM. The relationship between JFM and ZM/PZM is more complicated, and we refer readers to the work of Hoang et al. [37] for more details. Here, the parameter setting is listed below:

  • ZM – $p = |m| + 1$ and $q = |m| + 1$ (up to a variable substitution);

  • PZM – $p = 2|m| + 2$ and $q = 2|m| + 2$;

  • OFMM – $p = 2$ and $q = 2$;

  • CHFM – $p = 2$ and $q = 3/2$;

  • PJFM – $p = 4$ and $q = 3$.

Remark: The unit disk-based orthogonal moments using Jacobi polynomials, especially the landmark Zernike moments, have a long history in optical physics, digital image processing, and pattern recognition. As one can note from the formulas listed above, however, the definitions of these radial basis functions rely on factorial/gamma terms and summations, which leads to high computational complexity. In addition, the factorial and gamma functions tend to cause calculation errors, mainly numerical instability. In Sections IV-A and IV-B, we will further describe the causes of and solutions to these issues.

Fig. 3: Illustrations of radial basis functions of unit disk-based orthogonal moments: (a) ZM, (b) PZM, (c) OFMM, (d) CHFM, (e) PJFM, (f) JFM, (g) RHFM, (h) EFM (real part), (i) PCET (real part), (j) PCT, (k) PST, (l) BFM.

III-B Harmonic Functions

III-B1 Radial Harmonic Fourier Moments (RHFM) [38]

The radial basis function of the RHFM, $R_n(r)$, is defined as

$$R_n(r) = \begin{cases} \sqrt{\frac{1}{2\pi r}}, & n = 0 \\ \sqrt{\frac{1}{\pi r}} \sin((n+1)\pi r), & n \text{ odd} \\ \sqrt{\frac{1}{\pi r}} \cos(n\pi r), & n \text{ even}, \ n > 0 \end{cases} \quad (20)$$

where $n \geq 0$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (21)$$


III-B2 Exponent-Fourier Moments (EFM) [39]

The radial basis function of the EFM, $R_n(r)$, is defined as

$$R_n(r) = \sqrt{\frac{1}{2\pi r}} \exp(j 2\pi n r), \quad (22)$$

where $n \in \mathbb{Z}$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (23)$$


III-B3 Polar Harmonic Transforms (PHT) [40]

The PHT consists of three different transformations: the Polar Complex Exponential Transform (PCET), the Polar Cosine Transform (PCT), and the Polar Sine Transform (PST). The radial basis function of the PCET, $R_n(r)$, is defined as

$$R_n(r) = \sqrt{\frac{1}{\pi}} \exp(j 2\pi n r^2), \quad (24)$$

where $n \in \mathbb{Z}$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (25)$$


The radial basis function of the PCT, $R_n(r)$, is defined as

$$R_n(r) = \begin{cases} \sqrt{\frac{1}{\pi}}, & n = 0 \\ \sqrt{\frac{2}{\pi}} \cos(\pi n r^2), & n > 0 \end{cases} \quad (26)$$

where $n \geq 0$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (27)$$


The radial basis function of the PST, $R_n(r)$, is defined as

$$R_n(r) = \sqrt{\frac{2}{\pi}} \sin(\pi n r^2), \quad (28)$$

where $n \geq 1$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (29)$$


Remark: It can be seen that the radial basis functions of RHFM, EFM, and PHT are all based on the harmonic functions commonly used in Fourier analysis, i.e., complex exponential functions $\exp(jx)$ and trigonometric functions $\sin(x)$ and $\cos(x)$. Therefore, the definitions of the above methods are closely related, and the radial basis functions can be transformed into each other via Euler's formula and the variable substitution $r \to r^2$. More details on this will be given in Section IV-D. As for the calculation, compared to Jacobi polynomials, the orthogonal moments using only harmonic functions do not involve any complicated factorial/gamma terms or long summations, meaning better time complexity and numerical stability.
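This kinship can be verified mechanically; the sketch below splits a PCET-style complex-exponential kernel into PCT-like cosine and PST-like sine parts via Euler's formula, and relates an EFM-style kernel to it via the substitution $r \to r^2$ (all kernels unnormalized, for illustration only):

```python
import numpy as np

r = np.linspace(0.01, 1.0, 500)
n = 3

# PCET-style kernel and its Euler decomposition into cosine/sine parts
pcet = np.exp(2j * np.pi * n * r ** 2)
pct_like = np.cos(2 * np.pi * n * r ** 2)
pst_like = np.sin(2 * np.pi * n * r ** 2)
print(np.allclose(pcet, pct_like + 1j * pst_like))  # True

# Substituting r -> r^2 turns an EFM-style kernel exp(j*2*pi*n*r)
# into the PCET-style one above
efm = lambda k, x: np.exp(2j * np.pi * k * x)
print(np.allclose(pcet, efm(n, r ** 2)))            # True
```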

Method | Order Range | Complexity | Stability | Number of Zeros | Distribution of Zeros
ZM | $n \geq 0$, $|m| \leq n$, $n-|m|$ even | high | poor | $(n-|m|)/2$ | biased
PZM | $n \geq 0$, $|m| \leq n$ | high | poor | $n-|m|$ | biased
OFMM | $n \geq 0$ | high | poor | $n$ | basically uniform
CHFM | $n \geq 0$ | high | poor | $n$ | basically uniform
PJFM | $n \geq 0$ | high | poor | $n$ | basically uniform
JFM | $n \geq 0$ | high | poor | $n$ | basically uniform
RHFM | $n \geq 0$ | low | medium | $n$ | uniform
EFM | $n \in \mathbb{Z}$ | low | medium | $2|n|$ | uniform
PCET | $n \in \mathbb{Z}$ | low | good | $2|n|$ | biased
PCT | $n \geq 0$ | low | good | $n$ | biased
PST | $n \geq 1$ | low | good | $n-1$ | biased
BFM | $n \geq 1$ | very high | medium | $n-1$ | basically uniform
TABLE I: Mathematical Properties of Radial Basis Functions of Unit Disk-Based Orthogonal Moments

III-C Eigenfunctions

At the current stage, there are still relatively few orthogonal moments based on eigenfunctions, and the most representative one is the Bessel-Fourier Moments (BFM) [41]. The radial basis function of the BFM, $R_n(r)$, is defined as

$$R_n(r) = \frac{1}{\sqrt{\pi} \, |J_{v+1}(\lambda_n)|} \, J_v(\lambda_n r), \quad (30)$$

where $n \geq 1$. $J_v$ is the Bessel function of the first kind, which yields the set of eigenfunctions of the Laplacian on the unit disk, and its closed-form definition is

$$J_v(r) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(v+k+1)} \left( \frac{r}{2} \right)^{v+2k}, \quad (31)$$

where $v$ is a real constant. The items $\lambda_n$ and $J_{v+1}(\lambda_n)$ in (30) are both defined by $v$: $\lambda_n$ is the $n$-th positive zero of $J_v(r)$, and $J_{v+1}$ is the Bessel function of order $v+1$. This normalized radial basis function directly satisfies the orthogonality condition in (7):

$$\int_0^1 R_n(r) R_{n'}^*(r) \, r \, dr = \frac{1}{2\pi} \delta_{nn'}. \quad (32)$$


Remark: It is observed that (31) relies on an infinite series and factorial/gamma terms, and (30) requires a root-finding algorithm for Bessel functions to determine $\lambda_n$. Therefore, compared with the Jacobi polynomials and harmonic functions, the orthogonal moments based on eigenfunctions have significantly higher complexity in theory.
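With SciPy's Bessel routines, a BFM-style radial kernel and its weighted orthogonality can be checked numerically; the choice $v = 1$ and the normalization below are our assumptions for illustration:

```python
import numpy as np
from scipy.special import jv, jn_zeros

def bfm_radial(n, r, v=1):
    """Sketch of a normalized Bessel-Fourier radial kernel J_v(lam_n * r),
    where lam_n is the n-th positive zero of J_v (the root-finding step is
    the 'very high' complexity cost), scaled so int_0^1 R^2 r dr = 1/(2*pi)."""
    lam = jn_zeros(v, n)[-1]                      # n-th positive zero of J_v
    return jv(v, lam * r) / (np.sqrt(np.pi) * abs(jv(v + 1, lam)))

# Midpoint-quadrature check of the weighted orthogonality
M = 200000
r = (np.arange(M) + 0.5) / M
dr = 1.0 / M
g12 = np.sum(bfm_radial(1, r) * bfm_radial(2, r) * r) * dr
g11 = np.sum(bfm_radial(1, r) ** 2 * r) * dr
print(abs(g12) < 1e-4, abs(2 * np.pi * g11 - 1) < 1e-4)   # True True
```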

Fig. 4: The research directions of image moments.

III-D Summary and Discussion

For better perception, illustrations of the radial basis functions of unit disk-based orthogonal moments are summarized in Fig. 3.

Furthermore, we summarize the mathematical properties of these methods in Table I, including the radial basis functions' order range, computational complexity, numerical stability, number of zeros, and distribution of zeros. Computational complexity is graded as low, high, or very high based on whether the definition involves factorial/gamma terms, summation/series operations, and root-finding processes. Numerical stability is graded as poor, medium, or good depending on whether the function includes factorial/gamma terms or very large absolute values (mainly unbounded terms). The number of zeros of a radial kernel is related to the ability of the moments to capture high-frequency image information [17, 40]. Besides the quantity, another key attribute of the radial kernel is the distribution of its zeros, because it determines the description emphasis of the moments in the image plane [17, 40]. When the essential discriminative information is distributed uniformly in the spatial domain, unfair emphasis of the extracted moments on certain regions has been shown to have a negative impact on the discrimination quality, termed the information suppression problem [21].

It can be seen from Table I that, among these unit disk-based orthogonal moments, almost no method simultaneously offers low complexity, good stability, and a large number of unbiased (uniformly distributed) zeros. This observation strongly motivates the design of improvement strategies that address the above common shortcomings, which will be discussed in Section IV.

IV Research Directions and Recent Advances

In addition to defining new image moments, existing work mainly focuses on optimizing the classical moments listed in Sections II and III. As shown in Fig. 4, the current research directions in this field generally include: accurate calculation, fast calculation, robustness/invariance optimization, and definition extension. This section will introduce the above directions in detail along with the recent advances.

IV-A Accurate Calculation

As the mathematical background of Sections IV-A and IV-B, the general procedure for calculating image moments is first given.

Going back to (1), for computing the image moments of a digital image $f(i, j)$ defined on the discrete Cartesian grid $\{0, 1, \ldots, N-1\} \times \{0, 1, \ldots, N-1\}$, it is first necessary to unify the domains of $f$ and $V_{nm}$, i.e., to design a mapping between the two:

$$(i, j) \to (x_i, y_j) \in D, \quad (33)$$

where a pixel region centered at $(i, j)$ is mapped into a region $[x_i - \frac{\Delta x}{2}, x_i + \frac{\Delta x}{2}] \times [y_j - \frac{\Delta y}{2}, y_j + \frac{\Delta y}{2}]$ centered at $(x_i, y_j)$. By the coordinate mapping (33), (1) can be written down in discrete form as follows:

$$\langle f, V_{nm} \rangle \cong \sum_i \sum_j f(i, j) \, h_{nm}(x_i, y_j), \quad (34)$$

where $h_{nm}(x_i, y_j)$ is the integral value of the basis function over the mapped pixel region, defined as:

$$h_{nm}(x_i, y_j) = \int_{y_j - \Delta y/2}^{y_j + \Delta y/2} \int_{x_i - \Delta x/2}^{x_i + \Delta x/2} V_{nm}^*(x, y) \, dx \, dy. \quad (35)$$
In general, the calculation of unit disk-based orthogonal moments suffers from geometric error, numerical integration error, and representation error (mainly numerical instability). These errors can severely restrict the quality of image representation, especially when high-order moments are required to better describe the image. Hence, accurate computation strategies for moments are vital in practice.
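The zero-order version of this pipeline can be sketched in a few lines; the incircle-style pixel mapping and the PCT-style radial kernel used below are illustrative assumptions (both are discussed elsewhere in this survey), not a prescribed implementation:

```python
import numpy as np

N = 64
f = np.random.default_rng(0).random((N, N))      # toy digital image

# Map pixel (i, j) into [-1, 1]^2 and keep only pixels inside the unit disk
i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
x = (2 * i + 1 - N) / N
y = (2 * j + 1 - N) / N
dx = dy = 2.0 / N
r = np.hypot(x, y)
theta = np.arctan2(y, x)
inside = r <= 1

def moment(n, m):
    """Zero-order approximation of <f, V_nm> for a PCT-style kernel (n > 0):
    the basis is sampled at pixel centers and weighted by the pixel area."""
    Rn = np.sqrt(2 / np.pi) * np.cos(np.pi * n * r ** 2)   # radial part
    V_conj = Rn * np.exp(-1j * m * theta)                  # conjugate basis
    return np.sum(f[inside] * V_conj[inside]) * dx * dy

print(abs(moment(2, 3)))
```

Note that about a quarter of the pixels fall outside the inscribed disk and are simply discarded here, which is precisely the geometric error discussed next.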

IV-A1 Geometric Error

Geometric error may occur when mapping the image domain into the basis function domain, i.e., in (33). Such errors are common in the calculation of unit disk-based orthogonal moments, because digital images are generally defined over a square region in a Cartesian coordinate system, rather than over the unit disk.

As for mapping between the square region and the unit disk, there are naturally, as shown in Fig. 5, the incircle mapping [42]:

$$x_i = \frac{2i + 1 - N}{N}, \quad y_j = \frac{2j + 1 - N}{N}, \quad \Delta x = \Delta y = \frac{2}{N}, \quad (36)$$

and the circumcircle mapping [43]:

$$x_i = \frac{2i + 1 - N}{\sqrt{2} N}, \quad y_j = \frac{2j + 1 - N}{\sqrt{2} N}, \quad \Delta x = \Delta y = \frac{\sqrt{2}}{N}. \quad (37)$$
Fig. 5: The coordinate mapping between the square region of the digital image and the unit disk: (a) incircle, (b) circumcircle.

When the incircle mapping is used, there exist pixels $(i, j)$ such that $x_i^2 + y_j^2 > 1$, i.e., the pixels that are mapped to the outside of the unit disk are not counted, and there exist pixels $(i, j)$ whose mapped pixel regions only partially intersect the unit disk. In both cases, geometric errors occur.

As for the circumcircle mapping, it is able to completely avoid these two cases, so there will be no geometric errors. However, this comes at the cost of representation capability [17], as the mapped regions containing image information occupy only $2/\pi \approx 64\%$ of the entire unit disk.
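The two mappings can be compared directly in code; the index-to-coordinate formulas below are the common forms from the literature, used here as a sketch:

```python
import numpy as np

def map_square_to_disk(N, mode="incircle"):
    """Map an N x N pixel grid to unit-disk coordinates.
    incircle: the disk is inscribed in the image square, so corner pixels
    fall outside the disk and are discarded (geometric error).
    circumcircle: the image square is inscribed in the disk, so no pixel is
    lost, but the image covers only 2/pi of the disk area."""
    i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    x = (2 * i + 1 - N) / N
    y = (2 * j + 1 - N) / N
    if mode == "circumcircle":
        x, y = x / np.sqrt(2), y / np.sqrt(2)
    return x, y

frac = {}
for mode in ("incircle", "circumcircle"):
    x, y = map_square_to_disk(256, mode)
    frac[mode] = np.mean(x ** 2 + y ** 2 <= 1)    # pixels landing inside
print(frac["incircle"], frac["circumcircle"])      # ~pi/4 vs exactly 1.0
```

With the incircle mapping roughly $1 - \pi/4 \approx 21\%$ of the pixels are discarded, while the circumcircle mapping keeps them all at the cost of disk coverage.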

IV-A2 Numerical Integration Error

Numerical integration error may occur when calculating the integral of continuous basis functions, i.e., in (35). Here, for discrete moments, the basis functions are constant over each mapped pixel region. Thus, it is easy to check that $h_{nm}(x_i, y_j) = V_{nm}^*(x_i, y_j) \Delta x \Delta y$ holds exactly, meaning discrete moments typically do not involve numerical integration errors [23]. As for continuous moments, since the basis functions vary continuously over each pixel region, solving (35) requires some integration techniques.

In the simplest case, there is an analytical solution to $h_{nm}$, i.e., (35) can be solved directly by the Newton-Leibniz formula [44, 45]. Such calculations also do not involve numerical integration errors.

As the more general case, considering the complicated definitions of many basis functions, it is often difficult to determine the analytical solution, and some approximate algorithms are needed for achieving the numerical solution of $h_{nm}$. The most commonly used approximation algorithm is the Zero-Order Approximation (ZOA) [42], which imitates the calculation of discrete moments, as follows:

$$h_{nm}(x_i, y_j) \cong V_{nm}^*(x_i, y_j) \, \Delta x \Delta y. \quad (38)$$

Generally, the accuracy of the numerical integration method is inversely proportional to the integration interval area. Therefore, by further dividing a single pixel region into multiple smaller integration intervals (e.g., $k \times k$ sub-intervals), higher accuracy can be easily obtained. This strategy is often called pseudo up-sampling [46, 47]. When the ZOA is used in these sub-intervals, the integral can be expressed as

$$h_{nm}(x_i, y_j) \cong \sum_u \sum_v V_{nm}^*(x_u, y_v) \, \frac{\Delta x}{k} \frac{\Delta y}{k}, \quad (39)$$

where $(x_u, y_v)$ is the sampling point at the center of the $(u, v)$-th sub-interval, with $u, v \in \{1, 2, \ldots, k\}$. In addition to this simple approach, other more complex high-precision numerical integration strategies [48–50], such as the Gaussian quadrature rule and Simpson's rule, can also be used in the computation of (35). Their general definition is

$$h_{nm}(x_i, y_j) \cong \sum_u \sum_v w_{uv} \, V_{nm}^*(x_u, y_v), \quad (40)$$

where $w_{uv}$ is the weight corresponding to the sampling point $(x_u, y_v)$.
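The inverse relation between accuracy and interval area is easy to demonstrate; the smooth integrand below is an arbitrary stand-in for a continuous basis function, integrated over one pixel cell with the ZOA on $k \times k$ sub-intervals:

```python
import numpy as np

# Integrate g over one 'pixel' cell [0.5, 0.5+h]^2: zero-order approximation
# on k x k sub-intervals vs. the closed-form value.
g = lambda x, y: np.cos(4 * x) * np.sin(4 * y)
h = 0.1
exact = ((np.sin(4 * (0.5 + h)) - np.sin(2.0)) / 4
         * (np.cos(2.0) - np.cos(4 * (0.5 + h))) / 4)

def zoa(k):
    s = h / k                                   # sub-interval width
    pts = 0.5 + (np.arange(k) + 0.5) * s        # midpoints of sub-intervals
    X, Y = np.meshgrid(pts, pts, indexing="ij")
    return np.sum(g(X, Y)) * s * s

errors = {k: abs(zoa(k) - exact) for k in (1, 2, 4, 8)}
print(errors)   # error shrinks roughly as 1/k^2
```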

It is worth noting that all of the above discussion in Section IV-A relies on the Cartesian coordinate system, which we call the Cartesian-based calculation method. Correspondingly, there also exists a calculation method based on the polar coordinate system for unit disk-based orthogonal moments, often referred to as polar pixel tiling [51, 52].

Fig. 6: The coordinate mapping in polar pixel tiling scheme.

It first resamples the digital image to a discrete polar grid $\{(r_u, \theta_v)\}$, as shown in Fig. 6, and then performs the calculation in a manner similar to (34) and (35):

$$\langle f, V_{nm} \rangle \cong \sum_u \sum_v f(r_u, \theta_v) \, h_{nm}(r_u, \theta_v), \quad (41)$$

$$h_{nm}(r_u, \theta_v) = \int_{\theta_v - \Delta\theta/2}^{\theta_v + \Delta\theta/2} \int_{r_u - \Delta r/2}^{r_u + \Delta r/2} R_n^*(r) \exp(-jm\theta) \, r \, dr \, d\theta. \quad (42)$$

Note that (42) has a distinct advantage, i.e., the computation of $h_{nm}(r_u, \theta_v)$ can be separated into two independent parts [53]: 1) the radial integral $\int R_n^*(r) \, r \, dr$ can be approximately integrated by pseudo up-sampling and the Gaussian quadrature rule; 2) the angular integral $\int \exp(-jm\theta) \, d\theta$ can be exactly integrated by the Newton-Leibniz formula, as follows:

$$\int_{\theta_v - \Delta\theta/2}^{\theta_v + \Delta\theta/2} \exp(-jm\theta) \, d\theta = \begin{cases} \frac{j}{m} \left[ \exp\!\left(-jm\left(\theta_v + \frac{\Delta\theta}{2}\right)\right) - \exp\!\left(-jm\left(\theta_v - \frac{\Delta\theta}{2}\right)\right) \right], & m \neq 0 \\ \Delta\theta, & m = 0. \end{cases} \quad (43)$$
In addition to the above advantage, polar pixel tiling has properties similar to the Cartesian-based calculation method in terms of geometric error and numerical integration error, which will not be repeated here.
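The exact angular factor of this separable scheme can be sketched as follows; the cell bounds are arbitrary, and the closed form follows from the Newton-Leibniz formula:

```python
import numpy as np

# Newton-Leibniz (closed-form) integral of exp(-j*m*theta) over an angular
# cell, as used for the angular factor in the separable polar computation.
def angular_integral(m, t0, t1):
    if m == 0:
        return t1 - t0
    return (1j / m) * (np.exp(-1j * m * t1) - np.exp(-1j * m * t0))

# Cross-check against brute-force midpoint quadrature
M = 200000
dt = 0.6 / M
t = 0.3 + (np.arange(M) + 0.5) * dt
brute = np.sum(np.exp(-1j * 5 * t)) * dt
print(np.allclose(angular_integral(5, 0.3, 0.9), brute, atol=1e-8))  # True
```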

IV-A3 Representation Error

Representation error is caused by the finite precision of the numerical computing systems, which occurs in all processes of the calculation, i.e., (33), (34), and (35). The representation error can be further divided into overflow error, underflow error, and roundoff error [17]. The common numerical instability is mainly attributable to roundoff error and overflow error.

The basis functions based on Jacobi polynomials contain factorial/gamma terms. When the order is large, the actual values of these coefficients may exceed the representation range of the numerical computing system, resulting in roundoff error or even overflow error. To avoid direct calculation of the factorial/gamma terms, recursive strategies [54–56] and numerical approximation algorithms [57, 58] are often used. The recursive method relies on recurrence relations of the basis functions with respect to the order $n$ and repetition $m$, using several low-order basis functions to directly derive the high-order ones; this process does not involve factorials/gammas of large numbers. On another path, the numerical approximation method achieves factorial-free calculation by resorting to a suitable approximation for factorials, such as Stirling's formula.
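As a small illustration of the numerical-approximation path, a Zernike-style factorial coefficient can be evaluated in log-space via the log-gamma function, avoiding large intermediate factorials (a sketch; the recursive strategies in [54–56] are a different, exact route):

```python
import math

def coeff_direct(n, m, k):
    # (-1)^k (n-k)! / (k! ((n+|m|)/2-k)! ((n-|m|)/2-k)!) --
    # the intermediate factorials grow huge for large n
    return (-1) ** k * math.factorial(n - k) / (
        math.factorial(k)
        * math.factorial((n + m) // 2 - k)
        * math.factorial((n - m) // 2 - k))

def coeff_lgamma(n, m, k):
    # Same coefficient via log-gamma: sums of logs instead of products
    # of factorials, so no large intermediate values
    lg = (math.lgamma(n - k + 1) - math.lgamma(k + 1)
          - math.lgamma((n + m) // 2 - k + 1)
          - math.lgamma((n - m) // 2 - k + 1))
    return (-1) ** k * math.exp(lg)

print(math.isclose(coeff_direct(20, 4, 3), coeff_lgamma(20, 4, 3),
                   rel_tol=1e-9))   # True
```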

In addition to factorial/gamma terms, the basis functions of some orthogonal moments have very large (or even infinite) absolute values at certain points, which may also exceed the representation range of the numerical computing system. For example, although the definitions of RHFM and EFM do not involve any factorial/gamma terms, their basis functions are unbounded at the origin (see also Fig. 3), which will cause roundoff error or even overflow error.

IV-A4 Recent Advances in Accurate Calculation

In the field of accurate calculation for unit disk-based orthogonal moments, the polar pixel tiling of Xin et al. [51] may be the most influential and representative work. Recent papers typically combine polar pixel tiling with other techniques, including circumcircle mapping, pseudo up-sampling, high-precision numerical integration, the Newton-Leibniz formula, and recursive strategies, to achieve state-of-the-art performance.

In this way, Sáez-Landete [59] integrated the work of Camacho-Bello et al. [53] and Upneja et al. [56], giving an accurate calculation algorithm for the JFM; it also applies to other Jacobi polynomial based moments such as OFMM and CHFM. Owing to the polar pixel tiling, the complicated radial parts can be evaluated using Gaussian quadrature, pseudo up-sampling, and recurrence relations, while the angular parts can be evaluated exactly by the Newton-Leibniz formula. A somewhat similar method was proposed by Hosny et al. [60] for the harmonic function based PHT. The main difference is that, owing to the simple definition of the radial basis function, the radial parts can also be integrated exactly by the Newton-Leibniz formula, just like the angular parts.

These two recent works provide near-perfect accuracy for calculating Jacobi polynomial based moments and harmonic function based moments, respectively. To be critical, the only flaw may be the error introduced when resampling the image to a discrete polar grid [51]. If nonlinear interpolation methods, such as the bicubic method, are used, the interpolation error is negligible for image representation tasks. On the other hand, if a given task is sensitive to such errors, other more complex mathematical tools, such as pseudo-polar sampling [61], are instructive for the design of accurate calculations [62].

IV-B Fast Calculation

In the implementation of orthogonal moments, the overall complexity may become excessively high when 1) a large number of moments is needed, 2) the image has a high resolution, 3) many images need to be processed, or 4) high-precision computation is required. Since these requirements are common in practical applications, fast calculation of orthogonal moments is in strong demand. Depending on the application scenario, optimization efforts for computational speed can be divided into two categories: global calculation and local calculation.

IV-B1 Global Calculation

Global calculation computes the image moments over the entire image. In this scenario, the number of moments and the resolution of the image are the main factors affecting the time complexity. Taking the simplest ZOA-based direct calculation as an example, the time cost of a single image moment comes from:

  • For (35), it is necessary to evaluate the values of the basis function at all sampling points. The time complexity depends on the definition of the basis functions: basis functions using Jacobi polynomials require additional additions for the polynomial summation, whereas basis functions using harmonic functions do not involve such additions.

  • For (34), it is necessary to compute the inner product of the basis function and the digital image over all sampling points, which costs one multiplication and one addition per sampling point.

Note that the calculations listed above are required for only one moment; if all moments with orders up to some constant are computed, the total complexity grows proportionally to the number of moments. Moreover, the computational cost of (35) may rise sharply if a high-precision numerical integration strategy, such as Gaussian quadrature, is adopted. To reduce the complexity of (34) and (35), existing methods are designed at the function level and the pixel level.
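To make the cost structure of (34) and (35) concrete, here is a minimal NumPy sketch of ZOA-based direct computation for a generic unit disk basis of the form R_n(r) exp(jm·theta); the radial kernel passed in is a placeholder, not a specific moment family:

```python
import numpy as np

def moment_direct(img, radial_fn, n, m):
    """ZOA-based direct computation of a single unit disk moment:
    inner product of the image with the conjugated basis, one multiply
    and one add per sampling point."""
    N = img.shape[0]
    # Map pixel centers of the N x N grid into [-1, 1] x [-1, 1].
    coords = (2 * np.arange(N) + 1 - N) / N
    x, y = np.meshgrid(coords, coords)
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = r <= 1.0                                   # drop pixels outside the disk
    basis = radial_fn(r, n) * np.exp(1j * m * theta)    # V_nm(r, theta)
    dA = (2.0 / N) ** 2                                 # area of one pixel (ZOA weight)
    return np.sum(img[inside] * np.conj(basis[inside])) * dA

# Placeholder radial kernel R_n(r) = r**n (hypothetical):
img = np.random.default_rng(0).random((64, 64))
Mnm = moment_direct(img, lambda r, nn: r**nn, n=2, m=3)
```

Evaluating the basis at every pixel corresponds to (35), and the final weighted sum corresponds to (34); both scale with the pixel count, which is what the fast methods below attack.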

  • Function level: For orthogonal moments based on Jacobi polynomials, if the recursive method or numerical approximation (see also Section IV-A3) is used to evaluate the basis functions, the number of additions (from the summation) and multiplications (from the factorials) in the calculation of (35) can be reduced [54–58]; the recursive scheme requires far fewer additions than direct computation, and the saving grows when polynomials of many orders are required. For orthogonal moments based on harmonic functions, in addition to adopting similar recursive forms [63–66], a more effective strategy is to exploit their inherent relationship with the Fourier transform [67–71]. Once the explicit relationship between the two is determined, the Fast Fourier Transform (FFT) algorithm can be used to calculate the moments. Note that the FFT reduces the number of the most time-consuming multiplications in (34) from quadratic to quasi-linear in the transform length.

  • Pixel level: The most common strategy in this path is to simplify the calculation by exploiting the symmetry and anti-symmetry of the basis functions over the domain [72, 73]. More specifically, the basis function values at all sampling points can be completely derived from the basis function values at a few special sampling points, thus reducing the complexity of (35) and (34). In Fig. 7, we give an illustration of symmetrical points in the unit disk. For a given point, there are seven symmetrical points w.r.t. the coordinate axes, the origin, and the diagonals. Their Cartesian coordinates and polar coordinates are listed in Table II. It can be seen that all these points have the same radial coordinate and related angular coordinates. Based on the mathematical properties of the complex exponential function, i.e., the trigonometric identities, this correlation of coordinates converts directly into a correlation of basis function values. Hence, this observation reduces the computational complexity of (35) and (34) by a factor of approximately eight. Considering that all unit disk-based orthogonal moments employ complex exponential angular basis functions, the above symmetry and anti-symmetry properties are maintained in all these methods. As a result, this fast calculation strategy is generic and easy to combine with other fast algorithms.
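The FFT-based strategy from the function-level path can be sketched as follows, assuming the image has already been resampled to a polar grid (function names, grid layout, and quadrature weights are illustrative, not taken from any cited method):

```python
import numpy as np

def harmonic_moments_fft(f_polar, radial_kernels, r, dr):
    """Given f_polar of shape (R, T), sampled at radii r[k] and T uniform
    angles, compute moments for every angular order m in one pass.
    One FFT along the angular axis evaluates the angular sums
    sum_t f * exp(-1j*2*pi*m*t/T) for all m simultaneously."""
    R_, T = f_polar.shape
    ang = np.fft.fft(f_polar, axis=1) * (2 * np.pi / T)  # angular integrals, all m
    w = r * dr                                           # radial quadrature weights
    # For each radial order n: sum_k kernel_n(r_k) * ang[k, m] * r_k * dr
    return {n: Rn(r) @ (ang * w[:, None]) for n, Rn in radial_kernels.items()}
```

Each entry of the returned dictionary holds the moments of one radial order for all angular orders at once, which is where the quasi-linear multiplication count comes from.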

Fig. 7: Illustration of symmetrical points in the unit disk.
Symm. Point   Symmetry        Cartesian Coord.   Polar Coord.
P1            x-axis          (x, -y)            (r, -θ)
P2            y-axis          (-x, y)            (r, π-θ)
P3            origin          (-x, -y)           (r, π+θ)
P4            line y = x      (y, x)             (r, π/2-θ)
P5            line y = -x     (-y, -x)           (r, 3π/2-θ)
P6            rotation +π/2   (-y, x)            (r, θ+π/2)
P7            rotation -π/2   (y, -x)            (r, θ-π/2)
TABLE II: Cartesian Coordinates and Polar Coordinates of the Symmetrical Points of a Point (x, y) = (r, θ)

IV-B2 Local Calculation

Local calculation computes image moments for a part of the image, such as dense image blocks or the interest regions of keypoints. In this application scenario, the resolution of each image patch is generally small, so a patch can be described well using a few moments. The main factor pushing up the time complexity is the number of such patches, which is typically very large, especially for dense blocks. Therefore, speed-optimization methods designed for global calculation often do not solve the local calculation problem well. At present, the problem of calculating local image moments has not been fully explored. The most promising path is to find and eliminate redundant operations in the loop-style processing of image patches. In this regard, useful properties of basis functions (such as the shift properties in [74, 75]) and special data structures (such as the complex-valued integral images in [76]) have been explored to improve efficiency.
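As a sketch of the integral-image idea (generic, not the exact structure of [76]), the following snippet precomputes 2-D prefix sums of the pixel-wise product of the image and a conjugated kernel, so the "moment" of any rectangular patch reduces to four table lookups:

```python
import numpy as np

def integral_image(a):
    """2-D prefix sums, zero-padded so block sums need no bounds checks."""
    S = np.zeros((a.shape[0] + 1, a.shape[1] + 1), dtype=a.dtype)
    S[1:, 1:] = np.cumsum(np.cumsum(a, axis=0), axis=1)
    return S

def block_sum(S, top, left, h, w):
    """Sum of a[top:top+h, left:left+w] in O(1) via inclusion-exclusion."""
    return (S[top + h, left + w] - S[top, left + w]
            - S[top + h, left] + S[top, left])

# Precompute one integral image per (weighted) kernel product; every
# patch sum then costs four lookups instead of an O(h*w) inner product.
img = np.arange(36, dtype=np.complex128).reshape(6, 6)
basis = np.exp(-1j * 0.3 * np.arange(36)).reshape(6, 6)  # hypothetical kernel
S = integral_image(img * np.conj(basis))
patch_moment = block_sum(S, 1, 2, 3, 3)  # sum over the 3x3 patch at (1, 2)
```

Note that this only pays off when the kernel values are shared across patch positions (e.g., via shift properties); otherwise one integral image per patch-dependent kernel would be needed.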

IV-B3 Recent Advances in Fast Calculation

For the fast global calculation of Jacobi polynomial based orthogonal moments, state-of-the-art methods usually combine recursion and symmetry. A recent work along this line was proposed by Upneja et al. [56] for the JFM (see also [53, 59]), which reduces the numbers of additions and multiplications needed for computing all moments up to a given order. Considering that multiplication is usually much more expensive than addition on modern computing systems, the total elapsed time is mainly determined by the multiplications. In this case, it is clear that when a higher image resolution or more moments are required, the computational cost still increases quadratically. This is mainly due to the complicated form of the Jacobi polynomials, which so far precludes a fast implementation with sub-quadratic complexity.

For the fast global calculation of harmonic function based orthogonal moments, state-of-the-art performance is offered by FFT-based fast implementations. The earliest idea along this line can be traced back to 2014, when Ping et al. [70] introduced the FFT through the relationship between the EFM and the Fourier transform. Subsequently, similar ideas were used for the RHFM [68] and the PCET [69]. An important work was recently proposed by Yang et al. [67], which generalized such FFT-based techniques to a generic version of harmonic function based orthogonal moments (see Section IV-D3). This generic fast implementation, like its earlier special cases, exhibits the multiplicative complexity of