# Design of a Simple Orthogonal Multiwavelet Filter by Matrix Spectral Factorization

We consider the design of an orthogonal symmetric/antisymmetric multiwavelet from its matrix product filter by matrix spectral factorization (MSF). As a test problem, we construct a simple matrix product filter with desirable properties, and factor it using Bauer's method, which in this case can be done in closed form. The corresponding orthogonal multiwavelet function is derived using algebraic techniques which allow symmetry to be considered. This leads to the known orthogonal multiwavelet SA1, which can also be derived directly. We also give a lifting scheme for SA1, investigate the influence of the number of significant digits in the calculations, and show some numerical experiments.


## 1 Introduction and Problem Formulation

A digital filter bank is a collection of $M$ filters which split a discrete-time signal into $M$ subbands. These subband signals are usually decimated by a factor of $M$. Such a system is called a maximally decimated analysis bank. It can be used in applications such as signal, image, and video coding. For example, a scalar wavelet system is based on a single scaling function and mother wavelet; see Strang and Nguyen (1996). In multiwavelet theory there are several scaling functions and mother wavelets Cooklev et al. (1996). This provides more degrees of freedom in the design, and makes it possible to simultaneously achieve several useful properties, such as symmetry, orthogonality, short support, and a higher number of vanishing moments Strang and Strela (1994).

The first multiscaling function (GHM) was constructed in 1994 by Geronimo, Hardin, and Massopust Geronimo et al. (1994), based on fractal interpolation. The multiwavelet function with which the GHM multiscaling function forms a perfect reconstruction pair was constructed in a separate paper by Strang and Strela Strang and Strela (1995).

Cooklev et al. Cooklev et al. (1996) investigated the general theory of multifilter banks, and showed that real orthonormal multifilter banks and multiwavelets can be obtained from orthonormal complex-coefficient scalar filter banks. This ensures a number of good properties, such as orthogonality, regularity, symmetry, and a lattice-structure implementation. However, the limit functions in this case tend to have long support.

The design of multifilter banks and multiwavelets remains a significant problem. In the scalar case, spectral factorization of a half-band filter that is positive definite on the unit circle in the complex plane was, in fact, the first design technique for perfect reconstruction filter banks, suggested by Smith and Barnwell Smith and Barnwell (1986). This provides motivation to extend Cooklev’s original idea Cooklev (1995) to the multiwavelet case, where spectral factorization is more challenging.

We consider a new approach for obtaining an orthogonal multiwavelet which possesses symmetry/antisymmetry and regularity, from a matrix product filter. This immediately leads to three main problems:

1. How can we obtain an appropriate matrix product filter which satisfies strong requirements?

2. How do we perform Matrix Spectral Factorization (MSF)?

3. How do we improve the accuracy of the computed factors?

We will consider these three problems in more detail, as well as giving a brief description of some basic results.

Based on the previous theory and the many applications of the SA1 multiwavelet, it is useful to develop a lifting scheme for it. Many of these applications can then be implemented more efficiently in software or hardware.

The paper is organized as follows. After a brief review of multiwavelet background, we begin by constructing a half-band product filter with good smoothness properties. We then find a multiscaling function by using matrix spectral factorization. To complete the design by finding a multiwavelet function, we use additional algebraic techniques based on QR decomposition. The resulting multiwavelet is called SA1. It has orthogonality, symmetry/antisymmetry, piecewise smooth functions, and short support. We then construct a lifting scheme. Finally, some experiments are shown with respect to this simple multiwavelet. This includes investigating the influence of the number of significant digits used in the calculations, as well as the performance of the SA1 multiwavelet in terms of coding gain.

The main novel contributions of this paper are the following:

1. We introduce for the first time the product multifilter approach for designing a general MIMO filter bank;

2. We obtain for the first time an orthogonal multiscaling function by using the closed form of the matrix spectral factorization algorithm of Youla and Kazanjian Youla and Kazanjian (1978);

3. We obtain the complementary orthogonal multiwavelet function;

4. We construct a lifting scheme for the obtained simple orthogonal multiwavelet;

5. We investigate the influence of the number of significant digits used for the coefficient $\sqrt3$, in 1D and 2D signal processing.

## 2 Multiwavelet Theory

We use the following notation conventions, illustrated with the letter 'a':

• $a$ – lowercase letters refer to scalars or scalar functions;

• $\mathbf{a}$ – lowercase bold letters are vectors or vector-valued functions;

• $A$ – uppercase letters are matrices;

• $\mathbf{A}$ – uppercase bold letters are matrix-valued functions.

$I$ and $0$ are identity and zero matrices of appropriate size. $A^T$ denotes the transpose of the matrix $A$, and $A^*$ is the complex conjugate transpose. We are only considering real matrices in this paper. However, the variable $z$ used in matrix polynomials lies on the unit circle in the complex plane, so that $z^{-1} = \bar z$. Thus, the complex conjugate transpose of

$$\mathbf{A}(z) = \sum_k A_k z^k$$

is given by

$$\mathbf{A}(z)^* = \sum_k A_k^T z^{-k}.$$

The inner product of two vector-valued functions $\mathbf{f}$ and $\mathbf{g}$ in the $r$-dimensional Hilbert space of square integrable functions is given by

$$\langle \mathbf{f}, \mathbf{g} \rangle = \int \mathbf{f}(t)\, \mathbf{g}(t)^*\, dt,$$

which is an $r \times r$ matrix.

### 2.1 Multiwavelets and the Discrete Multiwavelet Transform

Multiscaling and multiwavelet functions are vectors of $r$ scaling functions and $r$ wavelet functions, respectively:

$$\boldsymbol\varphi(t) = \begin{bmatrix} \varphi_0(t) \\ \varphi_1(t) \\ \vdots \\ \varphi_{r-1}(t) \end{bmatrix}, \qquad \boldsymbol\psi(t) = \begin{bmatrix} \psi_0(t) \\ \psi_1(t) \\ \vdots \\ \psi_{r-1}(t) \end{bmatrix}.$$

They belong to $L^2(\mathbb{R})^r$, and satisfy the matrix dilation equations Strang and Nguyen (1996)

$$\boldsymbol\varphi(t) = \sqrt2 \sum_{k=0}^{m} H_k\, \boldsymbol\varphi(2t-k), \qquad \boldsymbol\psi(t) = \sqrt2 \sum_{k=0}^{m} G_k\, \boldsymbol\varphi(2t-k). \tag{1}$$

Multiscaling and multiwavelet functions together are referred to as a multiwavelet. All of the functions have support contained in the interval $[0,m]$ (see Wasin and Jianzhong (1997)).

We only consider orthonormal multiwavelets, where for any integer $\ell$,

$$\langle\boldsymbol\varphi(t), \boldsymbol\varphi(t-\ell)\rangle = \langle\boldsymbol\psi(t), \boldsymbol\psi(t-\ell)\rangle = \delta_{0\ell} I, \qquad \langle\boldsymbol\varphi(t), \boldsymbol\psi(t-\ell)\rangle = 0.$$

Equivalently, the matrix coefficients $H_k$ and $G_k$ satisfy the orthonormality conditions

$$\sum_k H_k H_{k+2\ell}^T = \sum_k G_k G_{k+2\ell}^T = \delta_{0\ell} I, \qquad \sum_k H_k G_{k+2\ell}^T = 0. \tag{2}$$

In the Discrete Multiwavelet Transform (DMWT), the input signal $\mathbf{s}^0$, at level $0$, is a sequence of $r$-vectors in $\ell^2$. For example, in the 2-channel case,

$$\mathbf{s}^0 = \{\mathbf{s}^0_k\}_{k=-\infty}^{\infty}, \qquad \mathbf{s}^0_k = \begin{bmatrix} s^0_0[k] \\ s^0_1[k] \end{bmatrix}, \qquad \text{with } \sum_k \|\mathbf{s}^0_k\|^2 < \infty.$$

The DMWT of $\mathbf{s}^0$ consists of computing recursively, for levels $j = 1, 2, \dots$, the approximation signal

$$\mathbf{s}^j = \{\mathbf{s}^j_\ell\}_{\ell=-\infty}^{\infty}, \qquad \mathbf{s}^j_\ell = \sum_k H_{k-2\ell}\, \mathbf{s}^{j-1}_k \tag{3}$$

and the detail signal

$$\mathbf{d}^j = \{\mathbf{d}^j_\ell\}_{\ell=-\infty}^{\infty}, \qquad \mathbf{d}^j_\ell = \sum_k G_{k-2\ell}\, \mathbf{s}^{j-1}_k. \tag{4}$$

We call $\mathbf{s}^j$ the scaling function coefficients, and $\mathbf{d}^j$ the wavelet coefficients.

Because of orthogonality, the inverse can be computed iteratively in a straightforward way:

$$\mathbf{s}^{j-1}_k = \sum_\ell H_{k-2\ell}^T\, \mathbf{s}^j_\ell + \sum_\ell G_{k-2\ell}^T\, \mathbf{d}^j_\ell. \tag{5}$$

The matrix filter coefficients $H_k$, $G_k$ completely characterize the multiwavelet filter bank.
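The analysis and synthesis recursions (3)-(5) are short enough to sketch directly. The following minimal NumPy sketch is our own illustration (the helper names `dmwt_step`/`idmwt_step` are hypothetical, and we use the SA1 coefficients from section 2.3 with periodic extension); it performs one level of the DMWT and confirms perfect reconstruction:

```python
import numpy as np

s2, r3 = np.sqrt(2.0), np.sqrt(3.0)
# SA1 filter coefficients (eqs. (9)-(10))
H = [s2/4 * np.array([[2.0, 0.0], [ r3, 1.0]]),
     s2/4 * np.array([[2.0, 0.0], [-r3, 1.0]])]
G = [s2/4 * np.array([[0.0,  2.0], [-1.0, r3]]),
     s2/4 * np.array([[0.0, -2.0], [ 1.0, r3]])]

def dmwt_step(s):
    """One analysis level, eqs. (3)-(4): s^j_l = sum_k H_{k-2l} s^{j-1}_k."""
    n = len(s)
    sj = [sum(H[k] @ s[(2*l + k) % n] for k in range(2)) for l in range(n // 2)]
    dj = [sum(G[k] @ s[(2*l + k) % n] for k in range(2)) for l in range(n // 2)]
    return sj, dj

def idmwt_step(sj, dj):
    """One synthesis level, eq. (5)."""
    n = 2 * len(sj)
    out = [np.zeros(2) for _ in range(n)]
    for l in range(len(sj)):
        for k in range(2):
            out[(2*l + k) % n] += H[k].T @ sj[l] + G[k].T @ dj[l]
    return out

rng = np.random.default_rng(1)
x = [rng.standard_normal(2) for _ in range(8)]
sj, dj = dmwt_step(x)
xr = idmwt_step(sj, dj)
err = max(float(np.abs(a - b).max()) for a, b in zip(x, xr))
```

Because the SA1 filters satisfy the orthonormality conditions (2), the reconstruction error is at machine-precision level.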

We can also describe the filters and the wavelet coefficients of the signal by their symbols

$$H(z) = \sum_k H_k z^{-k}, \qquad G(z) = \sum_k G_k z^{-k}, \qquad \mathbf{s}^j(z) = \sum_k \mathbf{s}^j_k z^k, \qquad \mathbf{d}^j(z) = \sum_k \mathbf{d}^j_k z^k.$$

In symbol notation, one step of DMWT and inverse DMWT is described by

$$\begin{aligned} \mathbf{s}^j(z^2) &= \tfrac12\left[H(z)\,\mathbf{s}^{j-1}(z) + H(-z)\,\mathbf{s}^{j-1}(-z)\right],\\ \mathbf{d}^j(z^2) &= \tfrac12\left[G(z)\,\mathbf{s}^{j-1}(z) + G(-z)\,\mathbf{s}^{j-1}(-z)\right],\\ \mathbf{s}^{j-1}(z) &= H(z)^*\,\mathbf{s}^j(z^2) + G(z)^*\,\mathbf{d}^j(z^2). \end{aligned} \tag{6}$$

Decomposition and reconstruction of the DMWT for a scalar input signal is represented in fig. 1. Since the input signal is scalar-valued, it is necessary to vectorize it to produce $\mathbf{s}^0$. A prefiltering step is inserted at the beginning of the analysis filter bank. A corresponding postfilter is incorporated at the end, which is the inverse of the prefiltering step.

Figure 1: Two-channel critically sampled analysis and synthesis filter banks, with prefilter M and postfilter N (↓2 and ↑2 represent decimation and zero-padding).

The result of executing the steps in eq. (6) is

$$\hat{\mathbf{s}}^0(z) = \tfrac12\left[H(z)^* H(z) + G(z)^* G(z)\right]\mathbf{s}^0(z) + \tfrac12\left[H(z)^* H(-z) + G(z)^* G(-z)\right]\mathbf{s}^0(-z).$$

If we want perfect reconstruction (that is, $\hat{\mathbf{s}}^0 = \mathbf{s}^0$), we need

$$H(z)^* H(z) + G(z)^* G(z) = 2I, \qquad H(z)^* H(-z) + G(z)^* G(-z) = 0. \tag{7}$$

The orthogonality relations (2), expressed in terms of symbols, are

$$\begin{aligned} H(z)H(z)^* + H(-z)H(-z)^* &= 2I,\\ G(z)G(z)^* + G(-z)G(-z)^* &= 2I,\\ H(z)G(z)^* + H(-z)G(-z)^* &= 0. \end{aligned} \tag{8}$$

It is easy to show that eqs. (7) and (8) are equivalent.

### 2.2 Multiwavelet Denoising

One of the most interesting signal processing applications is denoising. In most real applications the original signals are corrupted by noise. In addition, truncation error in the computation and in other processing introduces more errors. These errors can be interpreted as additive white Gaussian noise (AWGN) added to the original signal. A denoising technique is a method of removing the noise while retaining important features of the signal. We consider denoising methods based on the multiwavelet transform.

The wavelet transform concentrates the energy of the true signal in a few wavelet coefficients, while the noise energy will be scattered among all coefficients. The wavelet coefficients of Gaussian noise are still Gaussian. Shrinking the small coefficients towards zero will eliminate most of the noise, while not affecting the signal very much.

We use the following notation. The original signal is $s$. This could be a one-dimensional signal $s(t)$, or an image $s(x,y)$. The noisy signal is $\tilde s$. The denoised signal is $\hat s$. We likewise use tildes and hats to denote the wavelet coefficients of $\tilde s$, $\hat s$.

#### 2.2.1 Stochastic Sequences and Covariance Matrix

Assume that the input is a noisy signal $\tilde s$. This is sampled and preprocessed into a vector sequence. The preprocessing is described by a linear operator $M$. We also assume that postprocessing is the inverse of preprocessing: $N = M^{-1}$. The discrete multiwavelet transform (DMWT) is described by a linear map $\Theta$. For orthogonal multiwavelet filter banks we have $\Theta^{-1} = \Theta^T$.

It is important to note the influence of the type of transform (orthonormal or biorthogonal) on the variance of stochastic sequences subjected to preprocessing and the DMWT. If the preprocessing $M$ and the transform $\Theta$ are orthonormal, then the energy (i.e., the sum of squares of the elements) in the sequence is unchanged: $\|\Theta M s\|^2 = \|s\|^2$. For biorthogonal transforms, both preprocessing and matrix transformation will change the input energy.

The noisy signal is the sum $\tilde s = s + e$, where $s$ is the true (deterministic, noise free) signal, and $e$ is the noise, with noise power $\sigma_e^2$. The goal of signal denoising is to minimize the mean square error $\|\hat s - s\|^2$, subject to the condition that the denoised signal $\hat s$ is at least as smooth as $s$. Due to better signal representation by the multiwavelet transform, noise and signal can be separated much more easily. Multiwavelet denoising using a multivariate shrinkage operator effectively exploits the statistical information of the transformed wavelet coefficient vectors of noise, which improves the denoising performance. As a result of the decomposition and downsampling, the dependence of coefficients is reduced with increasing levels. The multiwavelet transform with prefilter of the noisy signal

$$\Theta M \tilde s = \Theta M s + \Theta M e$$

produces the vector wavelet coefficients

$$\tilde{\mathbf{d}}^j_k = \mathbf{d}^j_k + \mathbf{f}^j_k,$$

which are correlated with their neighbors.

The scaling function coefficients of the signal are fairly large, and do not need to be thresholded. We can concentrate on the wavelet coefficients $\tilde{\mathbf{d}}^j_k$ of the signal, and $\mathbf{f}^j_k$ of the noise.

The sequence of wavelet coefficients at level $j$ of the stochastic noise has a multivariate normal (Gaussian) distribution with a covariance matrix

$$\operatorname{cov}(\mathbf{f}^j) = \Theta_j M_j \Sigma M_j^T \Theta_j^T,$$

where $\Sigma$ is the covariance matrix of the noise. The matrices $Y_j$ are blocks indicating the correlation between the vector coefficients inside each single decomposition level Bacchelli and Papi (2003). In the case of AWGN noise $\Sigma = \sigma_e^2 I$,

$$\operatorname{cov}(\mathbf{f}^j) = \Theta_j\left[\sigma_e^2 M M^T\right]\Theta_j^T = \sigma_e^2\, \Theta_j\left[M M^T\right]\Theta_j^T,$$

so the covariance of the stochastic part depends only on the preprocessing $M$ and the choice of multiwavelet transform $\Theta$. If the preprocessing matrix satisfies $MM^T = I$, the covariance matrix is $\sigma_e^2 I$. In the general case, it is a symmetric block Toeplitz matrix.

Assuming that the blocks $Y_j$ describing the correlation between the vector coefficients are known, the nonnegative scalar values

$$w^j_k = \sqrt{(\tilde{\mathbf{d}}^j_k)^T\, Y_j^{-1}\, \tilde{\mathbf{d}}^j_k}$$

can be used for thresholding, as proposed in Bacchelli and Papi (2003). In Downie and Silverman (1998) it is shown that it is these values that should be thresholded; the wavelet coefficient vectors can then be adapted accordingly.

#### 2.2.2 Thresholding Rules

There exist threshold and non-threshold wavelet shrinkage methods. A non-threshold wavelet shrinkage method is Besov shrinkage Dechevsky et al. (2009, 2007). It is known that the Besov scale is closed under real interpolation, and it is highly adaptive to the local smoothness.

In this paper, we consider threshold methods in more detail. This is a general wavelet denoising procedure Rioul and Vetterli (1991); Smith et al. (2008). Wavelet shrinkage denoising has been proven to be nearly optimal in the mean square error (MSE) sense, and performs well in simulation studies.

Thresholding methods can be grouped into two categories Soman and Ramachandran (2004): global thresholds and level-dependent thresholds. The first means that we choose a single value for the threshold, to be applied globally to all wavelet coefficients; the second means that a possibly different threshold value is chosen for each resolution level $j$. In what follows, we consider a global threshold $\lambda$.

The threshold distinguishes between insignificant coefficients attributed to noise, and significant coefficients that encode important signal features. Signals with sparse or near sparse representation, where only a small subset of the coefficients represent all or most of the signal energy, are fit for thresholding. The noise is estimated by a properly chosen threshold after the multiwavelet transform.

Filtering of additive Gaussian noise (zero mean and standard deviation $\sigma_e$) via the thresholding of wavelet coefficients was pioneered by Donoho Donoho (1995). A wavelet coefficient is compared with a universal threshold $\lambda$. If the coefficient has a magnitude smaller than the threshold, it is considered to be noise, and is set to zero. Otherwise, the coefficient is kept or modified, depending on the thresholding rule.

In this paper hard and soft thresholding methods will be considered. The vector wavelet coefficients thresholded by the soft thresholding rule are

$$\hat{\mathbf{d}}^j_k = \begin{cases} \tilde{\mathbf{d}}^j_k\, \dfrac{w^j_k - \lambda}{w^j_k}, & w^j_k \ge \lambda,\\[4pt] 0, & w^j_k < \lambda. \end{cases}$$

For the hard thresholding rule, they are

$$\hat{\mathbf{d}}^j_k = \begin{cases} \tilde{\mathbf{d}}^j_k, & w^j_k \ge \lambda,\\ 0, & w^j_k < \lambda. \end{cases}$$
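As a sketch of how these shrinkage rules act on vector coefficients (our own illustration; `shrink` is a hypothetical helper, and we take $Y_j = I$, i.e. uncorrelated unit-variance noise, for simplicity):

```python
import numpy as np

def shrink(d, Y, lam, rule="soft"):
    """Apply hard/soft shrinkage to a list of 2-vector coefficients d_k,
    using the scalar magnitudes w_k = sqrt(d_k^T Y^{-1} d_k)."""
    Yinv = np.linalg.inv(Y)
    out = []
    for dk in d:
        w = float(np.sqrt(dk @ Yinv @ dk))
        if w < lam:
            out.append(np.zeros_like(dk))     # below threshold: treated as noise
        elif rule == "soft":
            out.append(dk * (w - lam) / w)    # shrink towards zero
        else:
            out.append(dk.copy())             # hard rule: keep unchanged
    return out

d = [np.array([3.0, 0.5]), np.array([0.2, -0.1])]
hard = shrink(d, np.eye(2), lam=1.0, rule="hard")
soft = shrink(d, np.eye(2), lam=1.0, rule="soft")
```

The small second coefficient is zeroed by both rules; the large first coefficient survives the hard rule unchanged and is shrunk by the soft rule.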

Another method, semisoft thresholding, has been proposed Niu and Shen (2007); Zhang et al. (2017), but so far only for the scalar wavelet setting.

The vector wavelet coefficients could be thresholded using either the hard or soft thresholding rules. These thresholding rules have strong connections with model selection in the traditional linear models. The trick of thresholding is to include these irregularities while excluding the noise Donoho et al. (1995).

In general, in terms of visual quality, the noise reduction effect of soft thresholding is the best, followed by the improved semisoft shrinkage method; the hard thresholding method is the worst.

Hard shrinkage produces smaller bias but larger variance than soft shrinkage, and empirically performs better than soft shrinkage in minimax estimation Bruce and Gao (1996). It better retains local characteristics of the image edges Donoho (1995); Strela (1998), and is suitable for denoising of noise with sudden changes. However, it introduces sudden jumps in the multiwavelet domain that in turn lead to oscillations in the reconstructed images. These oscillations are very close to the Gibbs phenomenon (see fig. 3 in Durand and Froment (2001)) and are called pseudo-Gibbs phenomena. They affect the visual quality of the denoising, and manifest themselves as annoying artifacts in the denoised image (see Fig. 2(a)).

The soft thresholding method smoothes the jumps in the wavelet domain by shrinking also the coefficients near the jumps. The denoised signals end up relatively smooth, and the pseudo-Gibbs phenomenon is reduced to some extent.

Better results are expected in the future by using the multiple description approach for the multiwavelet transformation shown in Wang et al. (2019).

#### 2.2.3 Universal Threshold

The threshold value is a very important parameter, which directly influences the performance of wavelet denoising. It can be determined by estimating the variance of the noise, but this value is often unknown in applications.

Using a universal threshold value, instead of a different threshold at each level, is attractively simple, but is strictly suitable only when thresholding AWGN wavelet coefficients of variance $\sigma_e^2$ and mean zero. The basic idea is that if the signal component is in fact zero, then with high probability the combination of (zero) signal plus noise should not exceed the threshold level $\lambda = \sigma_e \sqrt{2 \ln N}$, where $N$ is the length of the signal. The universal threshold Donoho (1995); Donoho et al. (1995) pays attention to smoothness rather than to minimizing the mean square error (MSE).

This threshold is asymptotically optimal Donoho and Johnstone (1994), and comes close to the minimax threshold, i.e. the threshold that minimizes the worst case MSE Donoho and Johnstone (1998). The trivial extension of the universal threshold to multiple wavelet coefficients is to use the threshold above applied to each coefficient element.
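The universal threshold is a one-line computation. A minimal sketch (our own; `universal_threshold` is a hypothetical name, and in practice $\sigma_e$ must itself be estimated when it is unknown):

```python
import math

def universal_threshold(n, sigma):
    """Donoho's universal threshold: lambda = sigma * sqrt(2 ln N)."""
    return sigma * math.sqrt(2.0 * math.log(n))

# for N = 1024 samples of unit-variance noise, lambda is about 3.72
lam = universal_threshold(1024, 1.0)
```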

In the numerical examples in this paper, we are using the following Algorithm 1. It implements the algorithm described in the text, adapted to two-dimensional signals (images).

Algorithm 1: Image Denoising Algorithm Based on Multiwavelet Transform
Step 1 Apply the pre-filtering $M$ to all rows, then all columns (of previously row pre-filtered pixels) of the noisy image
Step 2 Apply the 2D forward multiwavelet transform row by row and column by column, up to level $J$
Step 3 Choose the universal threshold $\lambda$ and apply a (hard or soft) thresholding procedure to the resulting multiwavelet coefficients to obtain the denoised wavelet coefficients
Step 4 Apply the 2D backward multiwavelet transform up to level $J$
Step 5 Apply the post-filtering $N = M^{-1}$, producing a denoised image

### 2.3 The Multiwavelet SA1

In this section, we describe a simple multiwavelet that will be used in the remainder of this paper to illustrate our approach. We are interested in supercompact multiwavelets, that is, multiwavelets with support in $[0,1]$. Such functions have many advantages:

• Application to functions defined on a finite interval does not require any special treatment at the boundaries of the interval Beam and Warming (2000);

• Interpolation with non-equally spaced data is possible Daubechies et al. (1999); Hornbeck (1975);

• Application to functions that are only piecewise continuous (internal boundaries) can be efficiently implemented Beam and Warming (2000); Mohamed and Torky (2014); Prandoni and Vetterli (1999);

• It is easy to construct a nodal basis with (spectral) finite elements Baran and Stoyan (2007); Sadov and McGreer (1999);

• Polynomials are reconstructed exactly Beam and Warming (2000);

• Discontinuities in the signal are easy to detect Whipple (1910), and the size of jumps at element boundaries can be measured, in 1D Vuik (2014) and 2D Vuik and Ryan (2016);

• Construction of the operation matrices in numerical methods is simple Abbas et al. (2009); Devi and Verma (2017);

• Supercompact multiwavelets with the vanishing moment property are suitable for adaptive solvers of PDEs subject to boundary conditions, as well as for solving of a wide class of integro-differential operators having sparse representation in these bases.

Alpert Alpert (1990) already observed that a multiscaling function whose entries form a basis of polynomials of degree less than $r$ on $[0,1]$ will always be refinable. For $r = 1$, this leads to the Haar wavelet. For $r = 2$, we can take $\varphi_0$ = the Haar scaling function, and $\varphi_1$ = any non-constant linear polynomial. If we want $\varphi_1$ to be orthogonal to $\varphi_0$ and normalized, we find that for $t \in [0,1]$,

$$\varphi_0(t) = 1, \qquad \varphi_1(t) = \sqrt3\,(1 - 2t).$$

This choice is unique, up to sign. The two components of $\boldsymbol\varphi$ are shown in fig. 2(a).

Figure 2: The scaling and wavelet functions of SA1; (a) the multiscaling function φ(t)=[φ0(t),φ1(t)]T; (b) the multiwavelet function ψ(t)=[ψ0(t),ψ1(t)]T.

The refinement coefficients are easy to derive with elementary algebra: we find

$$H_0 = \frac{\sqrt2}{4}\begin{bmatrix} 2 & 0\\ \sqrt3 & 1 \end{bmatrix}, \qquad H_1 = \frac{\sqrt2}{4}\begin{bmatrix} 2 & 0\\ -\sqrt3 & 1 \end{bmatrix}. \tag{9}$$

The completion with a corresponding multiwavelet function is not unique, but by imposing orthogonality and symmetry constraints we find an essentially unique completion with recursion coefficients

$$G_0 = \frac{\sqrt2}{4}\begin{bmatrix} 0 & 2\\ -1 & \sqrt3 \end{bmatrix}, \qquad G_1 = \frac{\sqrt2}{4}\begin{bmatrix} 0 & -2\\ 1 & \sqrt3 \end{bmatrix}. \tag{10}$$

Details are given in section 4.2. The corresponding functions are described by

$$\psi_0(t) = \begin{cases} \sqrt3\,(1-4t), & t \in [0,\tfrac12],\\ \sqrt3\,(4t-3), & t \in [\tfrac12,1], \end{cases} \qquad \psi_1(t) = \begin{cases} 1-6t, & t \in [0,\tfrac12],\\ 5-6t, & t \in [\tfrac12,1]. \end{cases}$$

They are shown in fig. 2(b).

We call this multiwavelet SA1.
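The dilation equations (1) can be checked numerically for SA1. The sketch below is our own illustration (the piecewise definitions follow the formulas above, with the half-open support convention $[0,1)$); it evaluates both sides of (1) on a grid:

```python
import numpy as np

r3, s2 = np.sqrt(3.0), np.sqrt(2.0)

def phi(t):
    """SA1 multiscaling function, supported on [0,1)."""
    t = np.asarray(t)
    inside = (t >= 0) & (t < 1)
    return np.vstack([np.where(inside, 1.0, 0.0),
                      np.where(inside, r3 * (1 - 2*t), 0.0)])

def psi(t):
    """SA1 multiwavelet function, piecewise linear on [0,1)."""
    t = np.asarray(t)
    left, right = (t >= 0) & (t < 0.5), (t >= 0.5) & (t < 1)
    p0 = np.where(left, r3*(1 - 4*t), 0.0) + np.where(right, r3*(4*t - 3), 0.0)
    p1 = np.where(left, 1 - 6*t, 0.0) + np.where(right, 5 - 6*t, 0.0)
    return np.vstack([p0, p1])

H = [s2/4 * np.array([[2.0, 0.0], [ r3, 1.0]]),
     s2/4 * np.array([[2.0, 0.0], [-r3, 1.0]])]
G = [s2/4 * np.array([[0.0,  2.0], [-1.0, r3]]),
     s2/4 * np.array([[0.0, -2.0], [ 1.0, r3]])]

t = np.linspace(0.01, 0.99, 197)
err_phi = np.abs(phi(t) - s2 * sum(Hk @ phi(2*t - k) for k, Hk in enumerate(H))).max()
err_psi = np.abs(psi(t) - s2 * sum(Gk @ phi(2*t - k) for k, Gk in enumerate(G))).max()
```

Both residuals vanish to machine precision, confirming that the piecewise formulas and the coefficients (9)-(10) are consistent.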

## 3 Matrix Product Filters

### 3.1 Theory of Matrix Product Filters

Given a matrix filter

$$H(z) = \sum_{k=0}^{m} H_k z^{-k},$$

its matrix product filter is defined as

$$P(z) = H(z)\,H(z)^* = \sum_{k=-m}^{m} P_k z^k,$$

where

$$P_k = H_0 H_k^T + \dots + H_{m-k-1} H_{m-1}^T + H_{m-k} H_m^T.$$

In particular for ,

 P0=H0HT0+⋯+Hm−1HTm−1+HmHTm.

The Laurent polynomial matrix $P(z)$ is a discrete-time para-Hermitian matrix; it lies in the ring of matrices whose elements are polynomials in $z$ and $z^{-1}$, with complex coefficients Bose (2017). By the first equation in (8), it is a half-band filter Cooklev et al. (1996), which means it satisfies the equation

$$P(z) + P(-z) = 2I. \tag{11}$$

This implies

$$P_0 = I, \qquad P_{2k} = 0 \quad \text{for } k \ne 0.$$

We can similarly define the half-band filter

$$Q(z) = G(z)\,G(z)^*.$$

Every orthogonal multiscaling filter is a spectral factor of some half-band multifilter. We use this fact as the starting point for constructing multiwavelets: construct a suitable $P(z)$ first, and factor it to obtain $H(z)$.

In the scalar case, spectral factorization is one of the main design techniques Bhati et al. (2018); Strang and Nguyen (1996). For $r > 1$, matrix spectral factorization is required, which is more challenging.

### 3.2 Matrix Spectral Factorization

A critical component in classical spectral decomposition is the Fejér-Riesz theorem for positive definite functions Carathéodory (1907). Fejér Fejér (1916) was the first to note the importance of the class of trigonometric polynomials that assume only non-negative real values; the theorem was considered and proved by Riesz Riesz (1916).

The Fejér-Riesz theorem in one dimension considers a trigonometric polynomial expressed in the form

$$\nu(z) = \sum_{k=-N}^{N} \nu_k z^k.$$

When $\nu(z)$ is real for all $|z| = 1$, the coefficients satisfy $\nu_{-k} = \overline{\nu_k}$ for all $k$. The theorem states that if $\nu(z) \ge 0$ for all $|z| = 1$, such a $\nu$ is expressible in the form

$$\nu(z) = p(z)\,p(z)^*$$

for some polynomial $p(z)$. As noted in Ephremidze et al. (2007), a spectral factor is unique up to a constant right unitary multiplier $U$, i.e.

$$p_{\text{new}}(z) = p(z)\,U.$$

The Fejér-Riesz theorem does not extend to factorization of multivariate trigonometric polynomials (see Dritschel (2004) for some counterexamples). Relevant key theorems are (Rudin, 1963, Theorem 3.2) in the 1D case, (Dritschel, 2004, Theorem 6.2) for the 2D case, with examples and necessary and sufficient conditions in Geronimo and Woerdeman (2004), and (Rudin, 1963, Theorem 3.1) for arbitrary dimensions.

However, the theorem can be extended to the case of univariate polynomials with matrix coefficients. This is Matrix Spectral Factorization (MSF) (see Vostry (1972); Callier (1985); Ephremidze et al. (2018); Youla and Kazanjian (1978)).

The MSF problem plays a crucial role in the solution of various applied problems for MIMO systems in communications and control engineering Bose (2017); Kalathil and Elias (2015); Wang et al. (2015); Yin et al. (2015).

### 3.3 Bauer’s Method

A well-known method for MSF is Bauer’s method Bauer (1956). This method has been successfully applied in Charoenlarpnopparut (2007); Du et al. (2013); Fischer (2005); Hansen et al. (2010). Details of the algorithm are given in section 4.1 below.

Bauer's method, in the implementation of Youla and Kazanjian Malyshev and Sadkane (2018); Malyshev (2013); Moir (2018); Youla and Kazanjian (1978), has the advantage over other approaches that it can handle zeros of the determinant of $P(z)$ on the unit circle. Unfortunately, the presence of these zeros affects the accuracy and speed of convergence. In its original form, the method only gives us a low precision spectral factor, and convergence is very slow.

The approaches in Ephremidze et al. (2018) for finding high-precision spectral factors can be used instead. For instance, the Janashia-Lagvilava MSF method applied to the orthogonal SA4 multiwavelet in Ephremidze et al. (2018) is exact.

In this paper, we achieve an exact MSF leading to a simple orthogonal multiwavelet by the closed form of Bauer’s method for short product filters. This is possible for some types of short support multiwavelets, especially the supercompact multiwavelets Beam and Warming (2000).

### 3.4 A Simple Matrix Product Filter

The simplest non-scalar example is for $r = 2$, $m = 1$. In this case, $P(z)$ has the form

$$P(z) = P_1^T z^{-1} + P_0 + P_1 z. \tag{12}$$

From the half-band condition (11) we see that $P_0$ is the identity matrix.

The smoothness of the multiscaling and multiwavelet functions is improved if the determinant of the product filter has a higher-order zero at $z = -1$ (see Vetterli (1986)). Therefore, we choose

$$\det(P(z)) = \left(\frac{1+z^{-1}}{2}\right)^{k} q(z), \tag{13}$$

where $q(z)$ is a linear phase polynomial in $z$, and $k$ is necessarily even.

The simplest case corresponds to $k = 4$ and $q(z) = z^2$, where

$$\det(P(z)) = \left(\frac{1+z^{-1}}{2}\right)^4 z^2 = \frac{1}{16}\left(z^{-2} + 4z^{-1} + 6 + 4z + z^2\right).$$

Setting

$$P_1 = \begin{bmatrix} a & b\\ c & d \end{bmatrix},$$

we find, by matching the coefficients of $\det(P(z))$ in (13) and requiring the symmetry-compatible structure $c = -b$, the equations

$$a + d = \frac14, \qquad ad - bc = \frac1{16}, \qquad 1 + 2ad - b^2 - c^2 = \frac38.$$

These equations have precisely four solutions: either $a = \frac12$, $d = -\frac14$, or $a = -\frac14$, $d = \frac12$, and in both cases we can choose $b = -c = \pm\frac{\sqrt3}{4}$. However, these are all essentially equivalent. The different choices just correspond to interchanging $\varphi_0$ and $\varphi_1$, or changing the sign of one of the functions.

We choose

$$a = \frac12, \qquad b = -\frac{\sqrt3}{4}, \qquad c = \frac{\sqrt3}{4}, \qquad d = -\frac14.$$

The resulting product filter has coefficient

$$P_1 = \frac14\begin{bmatrix} 2 & -\sqrt3\\ \sqrt3 & -1 \end{bmatrix}. \tag{14}$$
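The constructed product filter can be checked directly. A small NumPy sketch (our own illustration) verifies the half-band condition (11) and the determinant condition (13) at sample points on the unit circle:

```python
import numpy as np

r3 = np.sqrt(3.0)
P0 = np.eye(2)
P1 = 0.25 * np.array([[2.0, -r3], [r3, -1.0]])   # eq. (14)

def P(z):
    """Evaluate the product filter, eq. (12)."""
    return P1.T / z + P0 + P1 * z

hb_err = det_err = 0.0
for theta in np.linspace(0.1, np.pi, 12):
    z = np.exp(1j * theta)
    # half-band condition (11): P(z) + P(-z) = 2I
    hb_err = max(hb_err, float(np.abs(P(z) + P(-z) - 2*np.eye(2)).max()))
    # determinant condition (13): det P(z) = ((1 + 1/z)/2)^4 z^2
    det_err = max(det_err, abs(np.linalg.det(P(z)) - ((1 + 1/z)/2)**4 * z**2))
```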

## 4 Design of Orthogonal Multiwavelet Filter

### 4.1 Finding H(z)

For simplicity, we only describe the details of Bauer's method for the case considered in this paper. The equation $P(z) = H(z)H(z)^*$ is equivalent to the fact that the infinite block tridiagonal matrix

$$P = \begin{bmatrix} \ddots & \ddots & \ddots & & \\ & P_1^T & P_0 & P_1 & \\ & & P_1^T & P_0 & P_1 \\ & & & \ddots & \ddots \end{bmatrix}$$

can be factored as $P = HH^T$, where $H$ is the infinite block bidiagonal matrix

$$H = \begin{bmatrix} \ddots & \ddots & & \\ & H_1 & H_0 & \\ & & H_1 & H_0 \\ & & & \ddots \end{bmatrix}.$$

For a chosen integer $n$, we truncate $P$ to the matrix $P^{(n)}$ of $n \times n$ blocks, and do a Cholesky factorization $P^{(n)} = H^{(n)}[H^{(n)}]^T$. The last row of $H^{(n)}$ contains the blocks $H_1^{(n)}$, $H_0^{(n)}$.

The Youla-Kazanjian theory says that $H_0^{(n)} \to H_0$ as $n \to \infty$, and likewise $H_1^{(n)} \to H_1$. The computation can be streamlined, by defining

$$X^{(k)} = H_0^{(k)}\left[H_0^{(k)}\right]^T, \qquad k = 0, 1, \dots.$$

The Cholesky factorization algorithm is equivalent to

$$X^{(0)} = P_0, \qquad X^{(k+1)} = P_0 - P_1^T\left[X^{(k)}\right]^{-1} P_1. \tag{15}$$

The limit matrix satisfies

$$X = P_0 - P_1^T X^{-1} P_1. \tag{16}$$

We can use fixed point iteration based on (15) to avoid forming the large matrix $P^{(n)}$. Once $X$ has been found to sufficient accuracy, we can find $H_0$ as the Cholesky factor of $X$, and then $H_1 = P_1^T H_0^{-T}$.
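For the product filter (14), this iteration is easy to carry out numerically. The sketch below is our own illustration; convergence is slow because $\det P(z)$ vanishes on the unit circle, so we run many iterations and only expect a few correct digits. The result approaches the factor (9):

```python
import numpy as np

r3 = np.sqrt(3.0)
P0 = np.eye(2)
P1 = 0.25 * np.array([[2.0, -r3], [r3, -1.0]])

X = P0.copy()                              # X^(0) = P0
for _ in range(20000):                     # fixed point iteration (15)
    X = P0 - P1.T @ np.linalg.solve(X, P1)

H0 = np.linalg.cholesky(X)                 # lower-triangular Cholesky factor of X
H1 = P1.T @ np.linalg.inv(H0).T            # from H1 H0^T = P1^T

H0_exact = np.sqrt(2)/4 * np.array([[2.0, 0.0], [ r3, 1.0]])
H1_exact = np.sqrt(2)/4 * np.array([[2.0, 0.0], [-r3, 1.0]])
err = max(float(np.abs(H0 - H0_exact).max()), float(np.abs(H1 - H1_exact).max()))
```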

Even better, in the simple case considered in this paper, the Symbolic Math Toolbox in Matlab is able to solve equation (16) in closed form. On a MacBook Air with a 2.2 GHz Intel i7 processor, using Matlab 9.1, the computational time is under 0.8 sec. We get the exact matrix spectral factor with coefficients

$$H_0 = \frac{\sqrt2}{4}\begin{bmatrix} 2 & 0\\ \sqrt3 & 1 \end{bmatrix}, \qquad H_1 = \frac{\sqrt2}{4}\begin{bmatrix} 2 & 0\\ -\sqrt3 & 1 \end{bmatrix}. \tag{17}$$

This is precisely the multiscaling function of SA1, introduced in section 2.3.

The spectral factor obtained by our closed form of Bauer's method is precise. This multiscaling function can also be obtained from the 4×4 Type-II DCT matrix Gan and Ma (2005) and by using the Janashia-Lagvilava method Ephremidze et al. (2018) in exact form. It can also be obtained approximately by the Janashia-Lagvilava matrix spectral factorization method (see procedure 2 in Janashia et al. (2011)), with a small error in the spectral factors.

### 4.2 Finding G(z)

The orthogonality conditions (2) in the case $m = 1$ simply state that the matrix

$$\begin{bmatrix} H_0 & H_1\\ G_0 & G_1 \end{bmatrix}$$

must be orthogonal. We can find an initial orthogonal completion from a QR-decomposition of $[H_0\;H_1]^T$. Matlab returns

$$[\hat G_0, \hat G_1] = \frac{\sqrt2}{4}\begin{bmatrix} 1 & -\sqrt3 & -1 & -\sqrt3\\ 0 & 2 & 0 & -2 \end{bmatrix}.$$

Any other orthogonal completion has to be of the form

$$[G_0, G_1] = U \cdot [\hat G_0, \hat G_1]$$

for an orthogonal matrix $U$. We want to choose a $U$ that introduces symmetry.

A function $f$ is symmetric about a point $c$ if $f(c+t) = f(c-t)$. It is antisymmetric about $c$ if $f(c+t) = -f(c-t)$. For multiscaling and multiwavelet functions, we only consider the simplest case, where each component $\varphi_i$ or $\psi_i$ is symmetric or antisymmetric about the midpoint $m/2$ of the support. Thus,

$$\boldsymbol\varphi\!\left(\frac m2 + t\right) = S \cdot \boldsymbol\varphi\!\left(\frac m2 - t\right), \tag{18}$$

where $S$ is a diagonal matrix with entries $1$ (symmetry) or $-1$ (antisymmetry). Likewise,

$$\boldsymbol\psi\!\left(\frac m2 + t\right) = T \cdot \boldsymbol\psi\!\left(\frac m2 - t\right).$$

It is not hard to show that $\boldsymbol\varphi$, $\boldsymbol\psi$ have the symmetry given by $S$, $T$ if and only if

$$H_k = S H_{m-k} S, \qquad G_k = T G_{m-k} S, \qquad k = 0, \dots, m. \tag{19}$$

In our example, we observe that the original factors (17) already satisfy

$$H_1 = S H_0 S$$

for $S = \operatorname{diag}(1, -1)$. That is, component $\varphi_0$ is symmetric, $\varphi_1$ is antisymmetric.

The symmetry conditions (19) for $G_k$ are

$$U\hat G_0 = T U \hat G_1 S, \qquad U\hat G_1 = T U \hat G_0 S,$$

which combine to

$$U = T U \hat G_1 S \hat G_0^{-1} = -TUS.$$

It is easy to check that $T = \pm I$ is not possible, so $T = \pm S$. We choose $T = S$, which means that we want $\psi_0$ to be symmetric, $\psi_1$ antisymmetric.

Any of the form

$$U = \begin{bmatrix} 0 & \pm1\\ \pm1 & 0 \end{bmatrix}$$

is then suitable. The choice of signs corresponds to changing the sign of $\psi_0$ and/or $\psi_1$. Using

$$U = \begin{bmatrix} 0 & 1\\ -1 & 0 \end{bmatrix}$$

leads to the same $G_0$, $G_1$ as in (10).

Choosing $T = -S$ would interchange the roles of $\psi_0$ and $\psi_1$, and give us

$$U = \begin{bmatrix} \pm1 & 0\\ 0 & \pm1 \end{bmatrix},$$

which again corresponds to sign changes.
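These completion steps are easy to reproduce numerically. The sketch below is our own illustration: it computes an orthogonal completion of $[H_0\;H_1]^T$ via NumPy's QR (which may differ from Matlab's result by signs or an orthogonal factor), and then checks that the coefficients (9)-(10) form an orthogonal matrix satisfying the symmetry conditions (19) with $S = T = \operatorname{diag}(1,-1)$:

```python
import numpy as np

r3, s2 = np.sqrt(3.0), np.sqrt(2.0)
H0 = s2/4 * np.array([[2.0, 0.0], [ r3, 1.0]])
H1 = s2/4 * np.array([[2.0, 0.0], [-r3, 1.0]])
G0 = s2/4 * np.array([[0.0,  2.0], [-1.0, r3]])
G1 = s2/4 * np.array([[0.0, -2.0], [ 1.0, r3]])

# an orthogonal completion: last two columns of the full Q of [H0 H1]^T
A = np.hstack([H0, H1]).T
Q, _ = np.linalg.qr(A, mode="complete")
Ghat = Q[:, 2:].T       # rows are orthonormal and orthogonal to the rows of [H0 H1]
comp_err = max(float(np.abs(Ghat @ A).max()),
               float(np.abs(Ghat @ Ghat.T - np.eye(2)).max()))

# the symmetric completion (10): [[H0,H1],[G0,G1]] orthogonal, G0 = S G1 S
W = np.vstack([np.hstack([H0, H1]), np.hstack([G0, G1])])
S = np.diag([1.0, -1.0])
orth_err = float(np.abs(W @ W.T - np.eye(4)).max())
sym_err = max(float(np.abs(H0 - S @ H1 @ S).max()),
              float(np.abs(G0 - S @ G1 @ S).max()))
```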

## 5 Efficient Implementation

We give here a brief overview of the lifting scheme (LS) and some rounding operations. Lifting steps are an efficient implementation of filtering operations (see fig. 3(a)). One lifting step for a filter bank consists of a Split operation, a prediction $P$, and an update $U$.

One of the great advantages of LS is that it decomposes a multiwavelet filter into simple invertible steps Calderbank et al. (1998); Sweldens (1998). The inverse LS can be obtained from the forward LS by reversing the order of the operations, inverting the signs in the lifting steps, and replacing the Split by a Merge step (see fig. 3(b)). Figure 3: One lifting step; (a) analysis part; (b) synthesis part.

If a floor (or round or ceiling) operator is placed in each lifting step, the filter banks can now map integers to integers with perfect reconstruction, regardless of whether the lifting step has integer coefficients or not.

Moreover, if the factors are chosen to be dyadic, multiplication can be implemented by addition and binary bit shift. If the integer mantissa $k$ in a lifting step with coefficient $k/2^n$ is written in binary notation as

$$k = \sum_i k_i 2^i, \qquad k_i = 0 \text{ or } 1,$$

then the multiplication of a number $x$ by $k/2^n$ can be implemented by adding copies of $x$, left-shifted by $i$ bits, for each $k_i = 1$, and a final right-shift by $n$ bits.

This approach leads to an implementation with solely binary shifts, preventing bit-depth expansion in intermediate results. With all scaling factors set to unity, multiplierless filter banks can be easily constructed using this method. If the scaling factors are powers of two, we can still retain the multiplierless feature on filter banks.
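The shift-and-add scheme can be sketched in a few lines (our own illustration, using Python integers; for nonnegative inputs the final right-shift gives floor semantics):

```python
def mul_dyadic(x, k, n):
    """Multiply integer x by the dyadic coefficient k / 2^n using only
    shifts and adds: add x << i for every set bit k_i of k, then shift right by n."""
    acc = 0
    i = 0
    while k:
        if k & 1:
            acc += x << i
        k >>= 1
        i += 1
    return acc >> n

# e.g. multiplying by 55/32 = 1.71875, a 6-bit dyadic approximation of sqrt(3)
y = mul_dyadic(100, 55, 5)
```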

The lifting scheme can be done in place: we never need samples other than the output of the previous lifting step, and therefore we can replace the old stream by the new stream at every summation point.

### 5.1 The Lifting Scheme

Based on matrix factorization theory Hao and Shi (2001), a nonsingular matrix can be factored into a product of at most three triangular elementary PLUS reversible matrices She and Hao (2005); Wang et al. (2009); Jing et al. (2010). If the diagonal elements of the matrices are ones, a reversible integer-to-integer transform can be realized by LS.

In order to speed up the implementation, the elementary triangular factorization should be optimized by minimizing either the number of factor matrices or the computational complexity of each step. The factorization is not unique, and different factorizations lead to different errors between the dyadic approximation transform and the original transform.
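For orientation, a plain pivoted LU factorization (a standard Doolittle sketch, not the PLUS factorization of She and Hao, which further splits the factors into elementary reversible matrices) already exhibits the P, L, U structure used below:

```python
import numpy as np

def plu(a):
    """Doolittle LU with partial pivoting: returns P, L, U with a = P @ L @ U.

    A minimal textbook sketch for illustration only."""
    a = a.astype(float).copy()
    n = a.shape[0]
    perm = np.arange(n)          # perm[i] = original row now in position i
    l = np.eye(n)
    for j in range(n - 1):
        k = j + np.argmax(np.abs(a[j:, j]))   # partial pivot
        if k != j:
            a[[j, k]] = a[[k, j]]
            perm[[j, k]] = perm[[k, j]]
            l[[j, k], :j] = l[[k, j], :j]     # swap computed multipliers too
        for i in range(j + 1, n):
            l[i, j] = a[i, j] / a[j, j]
            a[i] -= l[i, j] * a[j]
    p = np.eye(n)[:, perm]       # permutation matrix with P @ (L @ U) == a
    return p, l, np.triu(a)
```

Here L is unit lower triangular and U upper triangular; making the factors have unit diagonals (or absorbing the diagonal into a separate scaling matrix, as done for U below) is what allows a reversible lifting realization.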

For input vector x and output vector y, the lifting implementation of the forward DMWT (6) is

$$ y = Hx = (PLU)x = \begin{bmatrix} 1&0&0&0\\ 0&0&0&1\\ 0&1&0&0\\ 0&0&1&0 \end{bmatrix} \begin{bmatrix} 1&0&0&0\\ 0&1&0&0\\ -\tfrac12&\tfrac{\sqrt3}{2}&1&0\\ \tfrac{\sqrt3}{2}&\tfrac12&-\sqrt3&1 \end{bmatrix} \begin{bmatrix} 1&0&1&0\\ 0&1&0&-1\\ 0&0&1&\sqrt3\\ 0&0&0&4 \end{bmatrix} \begin{bmatrix} x_0\\ x_1\\ x_2\\ x_3 \end{bmatrix}. \tag{20} $$

This leads to the synthesis implementation

$$ H^{-1} = \tfrac{1}{2}\, U^{-1} L^{-1} P^{T}. \tag{21} $$

We include the factor of 1/2 with the reconstruction instead of the decomposition (see (6)).

In our implementation, we write the L and U matrices as products of elementary triangular and/or diagonal reversible matrices (see Strang (2006), page 34), for faster evaluation.

$$ L = \begin{bmatrix} 1&0&0&0\\ 0&1&0&0\\ -\tfrac12&\tfrac{\sqrt3}{2}&1&0\\ 0&0&0&1 \end{bmatrix} \begin{bmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ \tfrac{\sqrt3}{2}&\tfrac12&-\sqrt3&1 \end{bmatrix}, \qquad U = \begin{bmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&4 \end{bmatrix} \begin{bmatrix} 1&0&0&0\\ 0&1&0&-1\\ 0&0&1&\sqrt3\\ 0&0&0&1 \end{bmatrix} \begin{bmatrix} 1&0&1&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{bmatrix} $$
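As a quick numerical check (not part of the original derivation), the factors of (20) and the elementary factors above can be verified with NumPy; the product PLU is orthogonal up to a factor of 2, so it is inverted by its transpose divided by 2:

```python
import numpy as np

s3 = np.sqrt(3)

# The three factors from eq. (20).
P = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]], float)
L = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
              [-0.5, s3 / 2, 1, 0], [s3 / 2, 0.5, -s3, 1]])
U = np.array([[1, 0, 1, 0], [0, 1, 0, -1], [0, 0, 1, s3], [0, 0, 0, 4]])

H = P @ L @ U

# H is orthogonal up to a factor of 2, so H.T / 2 inverts it.
assert np.allclose(H @ H.T, 2 * np.eye(4))
assert np.allclose((H.T / 2) @ H, np.eye(4))

# The elementary factors of L and U from the display above.
L1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [-0.5, s3 / 2, 1, 0], [0, 0, 0, 1]])
L2 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [s3 / 2, 0.5, -s3, 1]])
U1 = np.diag([1.0, 1, 1, 4])
U2 = np.array([[1, 0, 0, 0], [0, 1, 0, -1], [0, 0, 1, s3], [0, 0, 0, 1]])
U3 = np.array([[1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
assert np.allclose(L1 @ L2, L)
assert np.allclose(U1 @ U2 @ U3, U)
```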

The implementation of the lifting scheme is shown in fig. 4.

Figure 4: The PLUS matrices implementation of H; (a) forward; (b) backward.

### 5.2 Multiplierless architecture of the coefficient √3

For the transform to be reversible, it must be able to exactly reconstruct the input from the output. Low computational complexity is another desirable property. To avoid floating-point multiplications, we approximate the entries of the lifting matrices by dyadic numbers.

The lifting scheme described in subsection 5.1 uses only a single non-dyadic number, which is √3. We want to approximate this coefficient by a number of the form k/2^n with integer k. Since √3 is between 1 and 2, this corresponds to an approximation with n+1 bits. For example, for n = 5, √3 is approximated by the 6-bit number 55/32 = 1.71875, with an error of approximately 0.013.
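The nearest dyadic values are easy to recompute; the snippet below regenerates the approximations of √3 for several n (values recomputed here, not copied from table 1):

```python
import math

# Nearest dyadic approximation k / 2**n of sqrt(3) for several n.
for n in range(1, 7):
    k = round(math.sqrt(3) * 2**n)       # nearest (n+1)-bit mantissa
    err = abs(math.sqrt(3) - k / 2**n)   # quantization error
    print(f"n={n}: {k}/{2**n} = {k / 2**n:<9} error {err:.5f}")
```

For n = 1 this gives 3/2 (the 2-bit case studied below), and the error decreases roughly by half per extra bit.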

The dyadic approximations of the coefficient √3 and its quantization errors are shown in table 1. The table also shows the number of adders required to implement the multiplierless structure.

Figure 5: Floating-point arithmetic (solid line) in comparison with fixed-point arithmetic (dash-dot line) of the impulse responses of the SA1 orthogonal multiwavelet, with 2-bit dyadic approximation of the coefficient √3 (that is, √3≈3/2); (a) ϕ0; (b) ϕ1; (c) ψ0; (d) ψ1.

We illustrate the effect of quantization for n = 1 in more detail. This is the case of 2-bit approximation, with √3 ≈ 3/2.

The true recursion coefficients of SA1 are given in eqs. (9) and (10). The actual recursion coefficients used in the 2-bit approximation are

$$ \tilde H_0 = \frac{\sqrt2}{4}\begin{bmatrix} 2&0\\ 1.5&1 \end{bmatrix}, \quad \tilde H_1 = \frac{\sqrt2}{4}\begin{bmatrix} 2&0\\ -1.5&1 \end{bmatrix}, \quad \tilde G_0 = \frac{\sqrt2}{4}\begin{bmatrix} 0&2\\ -1&1.5 \end{bmatrix}, \quad \tilde G_1 = \frac{\sqrt2}{4}\begin{bmatrix} 0&-2\\ 1&1.5 \end{bmatrix}. \tag{22} $$

These approximate matrix coefficients (22) satisfy

$$ \tilde H(z)\tilde H(z)^* + \tilde H(-z)\tilde H(-z)^* = \begin{bmatrix} 2&0\\ 0&\tfrac{13}{8} \end{bmatrix}, \qquad \tilde G(z)\tilde G(z)^* + \tilde G(-z)\tilde G(-z)^* = \begin{bmatrix} 2&0\\ 0&\tfrac{13}{8} \end{bmatrix}, \qquad \tilde H(z)\tilde G(z)^* + \tilde H(-z)\tilde G(-z)^* = 0 $$

instead of the true orthogonality conditions (8).
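These relations are easy to check numerically from the rounded coefficients in (22); assuming the two-coefficient convention H(z) = H0 + H1 z^{-1}, each left-hand side reduces to twice the sum of the products of the matrix coefficients with their transposes:

```python
import numpy as np

c = np.sqrt(2) / 4

# Rounded SA1 coefficients from eq. (22), with sqrt(3) replaced by 1.5.
H0t = c * np.array([[2, 0], [1.5, 1]])
H1t = c * np.array([[2, 0], [-1.5, 1]])
G0t = c * np.array([[0, 2], [-1, 1.5]])
G1t = c * np.array([[0, -2], [1, 1.5]])

# For two-tap filters, H(z)H(z)* + H(-z)H(-z)* = 2*(H0 H0^T + H1 H1^T).
print(2 * (H0t @ H0t.T + H1t @ H1t.T))   # diag(2, 13/8) instead of 2I
print(2 * (G0t @ G0t.T + G1t @ G1t.T))   # diag(2, 13/8) instead of 2I
print(2 * (H0t @ G0t.T + H1t @ G1t.T))   # zero matrix: cross-term still holds
```

Only the second diagonal entry is perturbed (13/8 instead of 2), since rounding touches only the sloped second row of each coefficient matrix.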

This leads to a decrease of the CG and to non-perfect reconstruction, which is undesirable in many mathematical and engineering applications.

The effect of rounding can also be illustrated by its effect on the multiscaling and multiwavelet functions. It is easy to verify that the functions which satisfy the rounded recursion relations corresponding to (1) are

$$ \tilde\phi_0(t) = 1, \qquad \tilde\phi_1(t) = \tfrac{3}{2}(1-2t) $$

and

$$ \tilde\psi_0(t) = \begin{cases} \tfrac{3}{2}(1-4t), & t\in[0,\tfrac12],\\[2pt] \tfrac{3}{2}(4t-3), & t\in[\tfrac12,1], \end{cases} \qquad \tilde\psi_1(t) = \begin{cases} \tfrac{5}{8}-\tfrac{9}{2}t, & t\in[0,\tfrac12],\\[2pt] \tfrac{31}{8}-\tfrac{9}{2}t, & t\in[\tfrac12,1]. \end{cases} $$

This is illustrated in fig. 5. The biggest differences are at the end points, which are marked by squares in the figure.

Obviously, the dyadic approximation strongly influences the sloped parts in the time domain, which leads to smaller magnitudes for the scaling function ϕ1, as well as for both wavelet functions ψ0 and ψ1.

## 6 Performance Analysis

### 6.1 Comparative Analysis

The objective of this section is to evaluate numerically the performance of the SA1 multiwavelet, and compare it to some well-known orthogonal and biorthogonal multiwavelets, with the criteria being coding gain (CG) Soman and Vaidyanathan (1993); Thyagarajan (2006), Sobolev smoothness (S) Jiang (1998), symmetry/antisymmetry (SA), and length. The CG for orthogonal transforms is a good indication of the performance in signal processing. It is the ratio of the arithmetic and geometric means of the channel variances

$$ \mathrm{CG} = \frac{\frac{1}{r}\sum_{i=1}^{r}\sigma_i^2}{\left(\prod_{i=1}^{r}\sigma_i^2\right)^{1/r}}. \tag{23} $$

Coding gain is one of the most important factors to be considered in many applications. A transform with higher coding gain compacts more energy into a smaller number of coefficients. A comparison of CGs is given in table 2. Sobolev regularity, symmetry/antisymmetry properties, and the length of the given multiwavelets are also included. The CG is computed using a first-order Markov model AR(1) with intersample autocorrelation coefficient ρ Vaidyanathan (1998). Despite its short length (only two matrix coefficients), the simple multiwavelet SA1 has a better CG than most of the orthogonal and biorthogonal multiwavelets.
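Equation (23) is straightforward to implement; in the sketch below the channel variances are hypothetical, chosen only to exercise the formula (they are not the SA1 values from table 2):

```python
import numpy as np

def coding_gain_db(variances):
    """Eq. (23): ratio of arithmetic to geometric mean of the channel
    variances, expressed in dB."""
    v = np.asarray(variances, float)
    cg = v.mean() / np.exp(np.log(v).mean())  # geometric mean via logs
    return 10 * np.log10(cg)

# Equal variances give 0 dB (no energy compaction); spread-out variances
# give a positive gain.
print(coding_gain_db([1.0, 1.0, 1.0, 1.0]))
print(coding_gain_db([4.0, 1.0, 0.5, 0.25]))
```

By the AM-GM inequality the gain is always at least 0 dB, and it grows as the transform concentrates energy into fewer channels.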

### 6.2 Pre- and Postfilters

In the case of a scalar-valued input signal to the multiple-input multiple-output (MIMO) filter bank, it is necessary to vectorize the input. In other words, a pre-processing step (called prefiltering) must be inserted at the beginning of the analysis multifilter; symmetrically, a postfilter must be incorporated at the end of the synthesis filter bank. In practical applications, a popular choice is based on the Haar transform Cotronei et al. (1998); Tan et al. (1999).

Haar pre- and postfilters have the advantage of simultaneously possessing symmetry and orthogonality, and no multiplication is needed if one ignores the scaling factor. The prefilters should be as short as possible to preserve the time localization of the wavelet decomposition. The choice of prefilter is often critical for the results, and should be made depending on the application at hand and the type of signal to be processed.

The postfilter that accompanies the prefilter must invert it, so that the post/pre combination is the identity. The matrix coefficients of the pre- and postfilters we used are shown in table 3.
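As a sketch of the Haar case (the 2×2 matrix below is the standard orthogonal Haar transform, assumed here for illustration; the coefficients actually used are those of table 3), the inversion condition is immediate:

```python
import numpy as np

# Standard orthogonal Haar prefilter (an assumption for this sketch).
pre = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# The postfilter must invert the prefilter: post @ pre = I.
post = np.linalg.inv(pre)

assert np.allclose(post @ pre, np.eye(2))
# Since the Haar matrix is orthogonal, the postfilter is just its transpose,
# so both filters share the same symmetry and multiplierless structure.
assert np.allclose(post, pre.T)
```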