Inhomogenous Regularization with Limited and Indirect Data

by   Jihun Han, et al.

For an ill-posed inverse problem, particularly with incomplete and limited measurement data, regularization is an essential tool for stabilizing the inverse problem. Among various forms of regularization, the lp penalty term provides a suite of regularization of various characteristics depending on the value of p. When there are no explicit features to determine p, a spatially varying inhomogeneous p can be incorporated to apply different regularization characteristics that change over the domain. This approach has been investigated and used for denoising problems where the first or the second derivatives of the true signal provide information to design the spatially varying exponent p distribution. This study proposes a strategy to design the exponent distribution when the first and second derivatives of the true signal are not available, such as in the case of indirect and limited measurement data. The proposed method extracts statistical and patch-wise information using multiple reconstructions from a single measurement, which assists in classifying each patch to predefined features with corresponding p values. We validate the robustness and effectiveness of the proposed approach through a suite of numerical tests in 1D and 2D, including a sea ice image recovery from partial Fourier measurement data. Numerical tests show that the exponent distribution is insensitive to the choice of multiple reconstructions.



page 16

page 17

page 18

page 19

page 20

page 21

page 22


Accelerated Variance Based Joint Sparsity Recovery of Images from Fourier Data

Several problems in imaging acquire multiple measurement vectors (MMVs) ...

l_p regularization for ensemble Kalman inversion

Ensemble Kalman inversion (EKI) is a derivative-free optimization method...

Regularization of backward time-fractional parabolic equations by Sobolev equations methods

It is well-known that backward parabolic equations in which the initial ...

Efficient edge-preserving methods for dynamic inverse problems

We consider efficient methods for computing solutions to dynamic inverse...

Signal retrieval with measurement system knowledge using variational generative model

Signal retrieval from a series of indirect measurements is a common task...

Whole Brain Susceptibility Mapping Using Harmonic Incompatibility Removal

Quantitative susceptibility mapping (QSM) uses the phase data in magneti...

Regularization for the inversion of Fibre Bragg Grating spectra

Fibre Bragg Gratings have become widespread measurement devices in engin...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Inverse problems aim to identify an underlying signal from measurement data. The measurement data is typically limited and incomplete, and thus the inverse problem is often formulated as an ill-posed problem. A regularization is an essential tool in stabilizing an ill-posed inverse problem. Regularization imposes an additional structure in the solution or restricts the solution to a particular space so that there is a unique solution to the problem. A general approach of regularization adds a penalty term to an objective function that measures the data misfit. Tikhonov, or regularization, adds an

norm penalty term so that all components of the solution are equally regularized. In the Bayesian context, the Tikhonov regularization is equivalent to prior knowledge that the uncertainty of the solution follows a Gaussian distribution, which helps recover a smooth solution.

Another widely used class of regularization includes regularization to recover a sparse solution. regularization has been used as a convex relaxation of the penalty that imposes a constraint in the number of nonzero components of the solution. By avoiding the combinatorial complexity related to penalty, regularization is computationally efficient in recovering a sparse signal from a limited measurement data that is much smaller than the unknown variable’s dimension [6]. In particular, the total variation (TV) regularization [20], which is related to an

norm of the gradient, shows robust performance in recovering jump-discontinuities or edges by incorporating the sparsity in the gradient domain.

When there are no explicit features, such as sparsity for or smooth variations for , or when the features are mixed, none of the single regularizations provide robust performance. regularization smoothes out sharp edges or discontinuities. On the other hand, or TV gives rise to an unfavorable staircase effect [12] that yields piecewise constant recovery of smooth or oscillatory signals. There are approaches to combine both the Tikhonov and TV regularizations [17, 15], which utilize the trade-off characteristics of both regularizations. The approach in [17] decomposes the signal into a piecewise constant component and a smooth component and penalizes two components with TV and Tikhonov regularizations, respectively. Another approach proposed in [15]

uses a weighted sum of both regularizations, in which the balance is determined by the local behavior estimated from gradient information.

There are several classes of methods to recover signals with mixed features. The weighted TV [7, 10, 18, 13, 14] changes the balance between the fidelity and the regularization terms over components so that the method locally controls the regularization effect. High-order TV [2, 16, 8, 5, 21] takes -th order derivatives into account to obtain piece-wise constant ()-th order derivatives. A jump discontinuity becomes large for a high-order derivative compared to a low-order derivative, and thus the penalty can focus more on discontinuity regions by using a high-order derivative. The order of derivatives can be constant [2, 16] or of various orders [8, 5, 21]. An inhomogeneous regularization is another class of regularization to recover signals with mixed features. The inhomogeneous regularization uses a non-constant for the regularization to seamlessly utilize the characteristics of various values, which was first proposed in [3] for image denoising application. Compared to the weighted TV, the inhomogeneous regularization changes the shape of the regularization constraint, while the weighted TV changes only the scale, not the geometry of the regularization.

There are strategies to assign the spatially varying exponent over the signal using different references. The method used in [3] designs as a function of the pixel gradient value where it is close to 1 for a large gradient pixel to simulate the TV regularization at sharp edges, and 2 for a small gradient pixel to behave as Tikhonov regularization at smooth regions. Another method proposed in [11] assigns as a function of the second derivative called the difference curvature distinguishing the edge from flat and ramp regions. In the design of the inhomogeneous exponent distribution

, the strategies using the first and the second derivatives of the true signal are sensitive to the quality of the derivative estimation. In the denoising application with a large signal-to-noise ratio (SNR), the measurement data provides a robust estimation for the derivative. However, in a wide range of applications, for example, remote sensing, such high-quality information of the true signal is often unavailable, and thus the design of the inhomogeneous regularization exponent remains as a challenge.

In this work, we propose a method to design a spatially varying

for the inhomogeneous regularization from the incomplete and indirect measurement, such as partial incomplete Fourier measurement data. The proposed method utilizes the statistical information of the true signal. From a single measurement, the method generates a set of samples from a standard reconstruction method. Each reconstruction is not accurate, but the set of samples provides robust statistical information to design the inhomogeneous exponent distribution. A similar idea has been used in the variance based joint sparsity (VBJS)

[1] where point-wise variance statistics is utilized to estimate the weights of the weighted regularization. Our method is different from other methods in that patches are used to extract statistical information instead of point-wise or pixel-wise information. Also, the proposed method uses interrelation on a patch neighborhood in addition to the point-wise gradient value. This strategy improves the stability and reduces the uncertainty in estimating the local characteristics of the signal that assist in classifying each patch to predefined features with corresponding values.

In the assignment of in each patch, the range of is restricted to as in [3]. The range of could extend to or for increased flexibility in the design of the inhomogeneous regularization exponent. -regularization, , is known to be superior to -regularization in capturing sparsity features. -regularization, , , is satisfactory to recover very smooth regions. However, , leads to the non-convexity of the objective function and thus requires a delicate optimization method. Also, , can suffer from numerical instability while the gain is marginal compared to . For this consideration, we restrict to , maintaining a convex optimization framework.

The rest of the paper is organized as follows: Section 2 details the problem setup and briefly reviews the inhomogeneous regularization, which is with a spatially varying . In section 3, we propose an intrusive method to design the spatially varying distribution from indirect and limited data. In Section 4, we modify ADMM [4] as an efficient optimization method to solve the inhomogeneous regularization problem. Section 5 provides 1D and 2D numerical experiments validating the effectiveness and robustness of the proposed method, including recovery of sea ice from incomplete Fourier measurement data. Finally, we conclude this paper with discussions about the limitation and future directions of the current study in Section 6.

2 Inhomogeneous regularization

In this section, we formulate the inverse problem of our interest and briefly review the standard regularization method using a penalty term, such as and regularizations. When the signal has mixed features, such as sparsity and oscillatory behaviors, a homogeneous type regularization can suffer from inaccuracy due to inappropriate characterization of signal characteristics that change over the domain. As a strategy to reconstruct such signals with mixed features, we review the inhomogeneous regularization [3], which is an with a spatially varying . We also discuss how to design the distribution of for denoising applications and its limitations in other applications.

2.1 Problem setup

We are interested in recovering a signal from a measurement . In particular, we assume that and satisfy the following relation


Here is a linear operator.

is a measurement error vector that is assumed to be Gaussian with zero mean. We further assume that the measurement error is uncorrelated across different components, and thus

has a diagonal covariance. The measurement operator can describe a wide range of measurement operators, such as deburring using a Gaussian kernel. In the case of image denoising,

is the identity matrix, and the inverse problem is interested in recovering

from . In this study, we are particularly interested in the discrete Fourier measurement that finds applications in magnetic resonance (MRI) or Synthetic Aperture Radar (SAR) for sea ice imaging.

Under the aforementioned setup, the reconstruction of from is obtained by solving the following optimization problem


When the measurement data is limited or incomplete, there is insufficient information to perfectly reconstruct . That is, the reconstruction of from is ill-conditioned when . Regularization plays an essential role in stabilizing the inverse problem. The regularization forces constraints or restricts to a particular space reflecting prior information. The inverse problem is typically regularized by adding a penalty term


where the regularization parameter determines the balance between the penalty and the fidelity terms. Tikhonov regularization uses the norm of to recover the smoothly varying signal, while regularization is useful to reconstruct a signal with sparse structures. The sparsity can be extended to the gradient space of ; the total variation (TV) regularization, which uses the norm of the gradient for , can reconstruct a signal with sharp edges or discontinuities.

TV is beneficial for capturing edge-like jump discontinuities while it has a limited application for smoothly varying signals due to the staircase or blocky effect. When the sparsity is not a dominant feature of the signal, it is necessary to incorporate other features in the local scale.

2.2 Inhomogeneous regularization using spatially varying

When the signal has mixed features, for example, edges and oscillations, an application of with a homogeneous regularization will suffer from performance degradation. For example, TV regularization for a smooth oscillatory signal will generate staircase effects while regularization fails to capture sparse/discontinuous feature. To address such issues, [3] proposed an inhomogeneous regularization, which is an with a spatially varying


where is a discrete differential operator, for example, the forward Euler discrete gradient operator (which can be read as 1-dimensional or 2-dimensional depending on the unknown signal). Instead of the homogeneous in the standard TV regularization, the inhomogeneous regularization uses the exponent vector to adaptively use the characteristics of for various values, while each , is bounded in due to the computational benefits discussed in the previous section. [3] designed the exponent as a function of the pixel gradient value, , (i.e., ) for image denoising application. As the function satisfies and , it adaptively simulates the TV regularization at sharp gradients and Tikhonov regularization at smoother regions.

We note that the inhomogeneous regularization is different from the weighted TV


using a weight vector . The weighted TV changes the scale of each , while preserving the geometry of the constraint. On the other hand, the inhomogeneous regularization changes the geometry of the constraint while maintaining the convexity. It is natural to combine the two methods to utilize the characteristics of each method


using an exponent vector and a weight vector . A PDE-based denoising method [11] considered both the spatially varying and the balance, in which pointwise weight is applied to the corresponding fidelity term compared to Eq. (6). Exponents and weights are designed as functions of the second derivative based edge indicator called difference curvature. The edge indicator is the difference between second directional derivatives in tangential and normal directions of the gradient, which has better performance than the gradient and other curvatures in distinguishing edges from flat and ramp regions and isolated noise. The authors assign exponents and weights negatively and positively proportional to the square root of the difference curvature following the similar idea in [3] mentioned above.

The focus of the current study is the application of inhomogeneous regularization to incomplete and indirect data. The aforementioned strategies to design the exponent vector requires the first and/or second order derivatives of the true signal. In the denoising problem, as the measurement operator is the identity, such information of the unknown signal can be estimated when the signal-to-noise is low. However, if the measurement operator is limited and indirect, no such information is available. In particular, there is no robust exponent design strategy for an incomplete measurement of the Fourier data.

3 Construction of the exponent distribution in the inhomogeneous regularization

In this section, we propose a strategy to design the exponent distribution of the inhomogeneous regularization (4), using incomplete indirect data. Our numerical experiments in the next section show the robust performance of the inhomogeneous regularization without weights, and thus we focus on the inhomogeneous exponent in the current study. We leave the weight design as future work. In the design of the inhomogeneous exponent following the ideas in [3, 11], the main issue of the indirect and limited measurement data is the lack of an accurate estimate of the first and/or the second derivative information. We address this issue by estimating the exponent based on statistical information of the unknown signal. In particular, we generate samples using multiple reconstructions from the standard regularization, such as TV or . Any single reconstruction of them is not necessarily accurate as the regularization can be inappropriate. However, using patches instead of pointwise values, we can extract robust statistical information to design the inhomogeneous exponent. That is, we identify the local characteristic of the signal within a small patch rather than a single point. The rationale of the patches is that the signal characteristics to design , such as sparsity and oscillation, rely on the interrelation with neighborhoods.

Let be the set of non-overlapping patches that covers the domain of the signal. Then, the regularization functional (4) can be rewritten as


where is the index set corresponding to the patch , and is an exponent that is constant in . Here, we presume that the regularization effect is local within a patch as inter-patch interactions do not diffuse very far [19]. For a 2D image, we can consider the total variation as either isotropic TV, , or the anisotropic TV, . One difference between the two candidates is the rotational invariant property of the isotropic TV, and each type is employed in favor of applications. In this work, we consider the isotropic TV for its convenience in the adaptation in ADMM, which is explained in Section 4. Our construction of the exponent distribution consists of three sub-steps; i) generating multiple samples from a single measurement data, ii) classifying each patch to predefined features, and iii) assigning the exponent value based on the classification.

3.1 Multiple reconstructions from a single data

From a single measurement data , we reconstruct multiple samples by changing the balance between the fidelity and the homogeneous regularization penalty terms. That is, we solve the following standard homogeneous regularization using different values of


where is a constant vector. This idea of generating multiple measurement samples has been used in [1] to design the weight of the weighted regularization, where only one value of

has been used. In our study, we use variance-based statistics to extract features selectively from two sets of samples; discontinuity from TV-regularized samples (

) and oscillation from Tikhonov-regularized samples (). For each or , we use a logarithmically-spaced distribution to draw . The rationale behind the logarithmically-spaced distribution is to have samples not necessarily biased by either the fidelity or the regularization terms. By drawing multiple values, we solve the standard optimization (8), which yields two sets of reconstructed samples, and , from the TV and Tikhonov regularizations, respectively.

3.2 Patch classification

From the multiple reconstructions, we classify each patch into three categories, i) discontinuity, ii) oscillation, and iii) smoothness. First, we estimate the gradient of the unknown signal using the averaged gradient of the reconstructed samples and


To classify the patches in an unsupervised way, we compute the patch-wise variance of the gradient called variance pooling


and normalize the variance map with Min-Max normalization (denoted as ). The variance is expected to be low if a patch lies on a smooth region and high if there is discontinuity or oscillation. We identify the smooth region by introducing the threshold value on the variance. The patch with a variance greater than the threshold is classified as either discontinuity or oscillation. We distinguish the discontinuity patch from the oscillation patch by observing the neighborhood patches.

We first define the -sized directional neighborhoods for given patch (1D) or (2D) as


Note that the centered patch is exclusive in each neighborhood, and four directional neighborhoods (i.e., horizontal, vertical, and two diagonal directions) are considered in a 2D image. We utilize the different characteristics of two classes in the neighborhoods that a discontinuity patch is isolated from the surrounding smooth patches; on the other hand, an oscillatory patch is encompassed by other oscillatory patches. Thus, a discontinuity patch has a directional patch-neighborhood with low variance patches, which is not the case for oscillation. To capture such behavioral difference, we develop a filter on the variance-pooled gradient map,


which computes the minimum among the maximums of each directional neighborhood. Here and .

If the filtered value is less than the smooth threshold used above, the patch is classified as discontinuity as it implies there is a smooth region around. Otherwise, the patch is identified as oscillation as non-smooth regions surround the patch. To identify discontinuity, -statistics are preferred to -statistics in that the first have the higher value on the discontinuous patch and the smooth neighborhoods are well-identified with lower value since the second is contaminated by the Gibb’s phenomena around the discontinuity. On the other hand, -statistics are more stable for identifying oscillation since the -statistics have a chance to mislead the oscillatory feature by the staircase effect (a numerical test for comparison between and statistics in variance pooling

1:patch size , variance threshold , MinMax-neighborhood filter size
3:Choose the set of regularization parameters . Recover the unknown signal from TV and Tikhonov regularizations;
4:Compute the gradient statistics from and ;
5:Split the gradient map and to -sized patches, . Apply the variance pooling in each patch, ;
where is the index set of the patch .
6:Classify the patches into categories;
and is the Min-Max normalization of , and is read in Section 3.
Algorithm 1 Local feature classification

maps are presented in Fig. 1-(a) and (b) in Section 5). The complete local feature classification algorithm is summarized in Algorithm 1.

3.3 Exponent distribution

We interpret

as an interpolation balancing between

and regularizations. As the exponent converges to the lower bound , the regularization behaves in the TV manner, which is favorable to recover the sparse feature, such as jump discontinuity. In other perspectives, the sparsity is relaxed as the exponent is far away from , and the regularization weights more on smoothness as the exponent gets close to in the sense of Tikhonov regularization. Following this interpretation, we assign for the exponent on a discontinuity patch. For smooth and oscillatory patches, we adaptively set the exponent in reference to the patch-wise average of the gradient

1:patch size , variance threshold , MinMax-neighborhood filter size , power distribution constant
3:Choose the set of regularization parameters . Recover the unknown signal from TV and Tikhonov regularizations;
4:Compute the gradient statistics from and ;
5:Split the gradient map and to -sized patches, . Classify patches with Algorithm 1.
6:Apply the average pooling on and maps;
where is the index set of the patch .
7:Distribute the exponent on each patch;
where is Min-Max normalization of .
8:Recover the unknown signal with inhomogeneous exponent distribution ,
Algorithm 2 Signal recovery with the inhomogenous regularization

(called average pooling),


The distribution of the exponent is given as a function of Min-Max normalization of the average pooling, (denoted as ), which is designed to be an increasing function as a larger exponent induces less penalization on the gradient. sets the exponent close to on the flat region (i.e., zero gradient), while it sets the exponent close to on the oscillatory patch with a high averaged gradient. We design to be


From the numerical experiment in Section 5 (Fig. 1 (c) and (d)), we observe that on smooth patches, -average pooling is more stable under the choice of reconstructions in comparison with -average pooling (and vice versa on oscillatory patches). Accordingly, we assign the on a smooth patch and on an oscillatory patch. The complete procedure is presented in Algorithm 2.

4 ADMM for inhomogeneous regularization

As the exponents of the inhomogeneous regularization remain bounded in , a convex optimization method can be utilized to solve the problem. Among other convex optimization methods, we find that the inhomogeneous regularization can be efficiently solved using ADMM [4] with appropriate problem transformation. The ADMM is a variant of the augmented Lagrangian method, which is flexible in decomposing the original optimization into small local subproblems with computational benefits. For instance, it reduces the complexity by avoiding the joint (sub)optimizations, or is suitable for parallel computing. We take such advantage of the ADMM to solve the regularized optimization by decoupling the regularization term from the fidelity term and handling both separately. Among many possibilities to transform the problem to fit in the framework, we design the transformation to make the following subproblems efficiently solved. Motivated by the group Lasso regularization [22], we reduce the suboptimization corresponding to the inhomogeneous regularization separable, which is computationally beneficial.

We rewrite the problem (8) by introducing the auxiliary variable and the constraint as following;

minimize (16)
subject to (17)

where is the discrete gradient operator, is either or depending on whether the unknown is 1D signal or 2D image, is the auxiliary variable, and the objective functions and are


We note that the constraint Eq. (17) is read as , for the case of 1D signal, and , for the case of 2D image. The augmented Lagrangian corresponding to the optimization Eq. (16) and Eq. (17) is


where is the Lagrange multiplier and is the penalty parameter. The ADMM consists of the iterations with dual updates


The suboptimizations, Eq. (21) and Eq. (22) are known as proximal operators, which, in general, have either closed-form or implicit form depending on the objective terms. The -optimization Eq. (21), as a quadratic objective form, is known to have the closed-form


For the proximal operator in -optimization Eq. (22), the objective function is separable, so is the optimization. For ,


in which the last equality holds by the norm composition rule of the proximal operator. We can easily derive the proximal operator, in Eq. (27) as


where is the soft thresholding or shrinkage operator defined as . We note that has the unique zero as it is an increasing function. We find the root of with the Chadrupatla’s method [9], which is superior to find the root of a function with stiff regions, as does near , than secant methods.

The algorithm is also applied for the case of anisotropic TV by replacing in Eq. (19) to . The -proximal operator is invariant, but the -proximal operator corresponding to Eq. (26) should be directly solved by an iterative method due to the limitation of composition rule for -norm.

5 Numerical Results

We validate the robustness and effectiveness of the proposed method to design the inhomogeneous regularization through a suite of numerical tests, including sea ice reconstruction. In particular, as an incomplete and indirect measurement, we consider partial Fourier measurement used in many applications, such as magnetic resonance (MRI) or Synthetic Aperture Radar (SAR) imaging. The measurement is characterized as a global operator in that each Fourier coefficient depends on the global shape of the signal, which contrasts to the local measurement such as blurred or noisy signal.

We conduct a numerical experiment to investigate the consistency of the exponent design under various sets of reconstructed samples. We particularly analyze the distributions of variance and average pooling maps from multiple sets of reconstruction samples, each set of which corresponds to the logarithmically-spaced regularization parameters on a randomly generated interval. In generating multiple samples, we can easily parallelize to solve multiple optimization problems with different regularization parameters which are not coupled with each other. In the ADMM

Figure 1: The distributions of the pooling maps () from multiple sets of reconstruction samples. The first column ((a) and (c)) describes the pooling maps from and the second column ((b) and (d)) corresponds to . The first row ((a) and (b)) and the second row ((c) and (d)) describe the variance and average pooling maps, respectively. The empty dot represents the mean of pooled samples and the bar corresponds to the range of pooled values on each patch.

framework, a regularization parameter is only engaged in -optimization, Eq. (22), in the dual update scheme, which is also separable. We compute the -update in parallel in both regularization parameters and components by Eq. (28). In addition, we use the penalty parameter in the ADMM for all examples.

We measure the performance of the proposed method using the point-wise and -errors, for , comparing with three other regularizations; TV, Tikhonov, and the inhomogeneous regularization with the curvature-based exponent design [11] where the curvature is calculated using the average of the samples


5.1 Synthetic 1D signal recovery

The first test is a synthetic 1D signal. The signal is generated by combining two signals with different characteristics


where the signal has a sparse gradient feature on and highly oscillatory behavior on . We discretize the signal with uniform grid points in , which yields a 200-dimensional true signal.

The measurement is the first

of the smallest wavenumber components of the Fourier transform of the true signal. As each coefficient is a complex number, we transform it to the real-valued vectors by separating the real and imaginary parts (i.e., Fourier cosine and sine coefficients); thus, the total number of measurement is

of the signal length (i.e., ). To focus on the design of the inhomogeneous regularization while minimizing the effect from the measurement error, we do not contaminate the measurement with the Gaussian noise in this test.

Fig. 1 shows the distribution of variance and average pooling maps generated from 1000 TV and Tikhonov samples, and , . Each regularization parameter set, , is logarithmically-spaced values in a random interval from with random length in the range from to . We compare two distributions of variance pooling maps (Fig. 1 (a) and (b)) in the aspect of predefined features for the classification. Smooth regions consistently have low variance in both distributions, which are distinguished from the discontinuity and oscillatory regions with high variance. The discontinuity feature is more discernible in TV-regularized reconstructions (Fig. 1-(a)) in that a patch with a discontinuity is well-isolated from low variance patches identified as the smooth class, and is stable in high variance value. On the other hand, variance values on oscillatory regions are higher in average and more consistent in Tikhonov-regularized reconstructions ( Fig. 1-(b)). We note that the identification for smooth regions is a baseline for our feature classification algorithm, and the variance statistic has more consistent distribution on smooth regions than the average statistic. Moreover, a discontinuity is more distinguishable in both magnitude and the isolation feature in the variance pooling maps. In such consideration, we adapt the variance statistics for feature classification rather than average correspondence. Two distributions of average pooling maps are consistent in the different regions as the smooth patch has smaller average ranges in the TV-regularized reconstruction (Fig. 1-(c)) and an oscillatory patch corresponds to the Tikhonov-regularized reconstruction (Fig. 1-(d)). As adaptively referring to two average pooling maps, we reduce the uncertainty of the exponent distribution from the choice of reconstruction samples.

Figure 2:

Local feature classification (a) and exponent distribution (b) with the hyperparameters

in algorithm 2

The exponent is assigned as on the discontinuity regions, on the oscillatory regions, and the nontrivial exponents ()) are distributed on the smooth regions based on

Figure 3: 1-dimensional signal recovery from 5 different regularizations; (a) homogeneous , (b) homogeneous , (c) curvature-based regularization, (d), on and on , (e) proposed regularization

the estimated gradient information from reconstruction samples (Fig. 2-(b)). Note that it is not uniform exponent distribution on the flat (i.e., gradient-free) regions on the sparse gradient domain () by reflecting the behavior of reconstruction samples. Also, the two tails of the oscillation are not classified as oscillations but have relatively large exponents proportional to the gradient values among the smooth regions.

The signals recovered from different regularizations are presented in Fig. 3 alongside their point-wise errors in Fig. 4. The recovery from the classical TV (homogeneous ) performs well in capturing the discontinuities but is contaminated on the oscillation by flattening the peaks of oscillation as known as the staircase effect (Fig. 3-a). The case of homogeneous is the other way around as the oscillation is clearly recovered, but the discontinuity is deteriorated by Gibbs phenomena (Fig. 3-b). The curvature-based regularization reduces the Gibbs phenomena near the jump discontinuity but is still limited to recover the oscillation fully (Fig. 3-c).

As the interpolation idea of and , the proposed regularization adapts the favorable exponents responding signal features, and the corresponding recovery is accurate through all the regions (Fig. 3-e). We also conduct the experiment that we assign the homogeneous on and

Figure 4: Pointwise errors of 5 signal recoveries presented in Fig. 3

homogeneous on as the exponents are favorable to each separated signal (Fig. 3-d). The recovery is comparable to our reconstruction on the oscillatory region; on the sparse gradient domain

, our nonuniform exponent distribution outperforms the uniform distribution

(see Fig. 4). As the Fourier measurement is a global operator, the reconstructions on different regions are interrelated. The behavior of reconstruction samples, though possibly inaccurate, provides a meaningful guideline to design the exponent distribution.

5.2 Synthetic 2D image recovery

We extend our experiment to a 2D image. We generate the test image


where the features of oscillation, jump discontinuity, and ramp are observed. We consider the discretization with uniform grid on (i.e., ). We assume that of original Fourier coefficients are given as the measurement. The coefficients are sampled from -direction, and, as in the previous test, we choose the first coefficients in increasing order of wavelength. Also, the measurement is contaminated by Gaussian noise with corresponding to the signal-to-noise ratio (SNR) .

Fig. 5 presents the feature classification (Fig. 5-(b)) and the exponent distribution (Fig. 5-(c)) from algorithms 1 and 2. The smooth region is identified by the collection of patches with low variance in gradient. Among the patches with high gradient variance, the discontinuity class is distinguished from the oscillation by observing the directional neighborhoods with smooth patches. Even under the noisy measurement, the variance statistics are consistent with identifying the features of regions. In reference to the classification, the exponents are distributed along the discontinuity curve with the patch-sized neighborhood, and the circular-shaped oscillatory region is covered by the exponents closed to regarding high gradient distribution in there. In other ramp regions, the exponents are adaptively assigned proportionally to the estimated gradient values, which are close to in this example.

The recovery approximations from the different regularizations are presented in Fig. 6, and each pointwise error and pairwise comparisons are summarized in Fig. 7 and Fig. 8. The homogeneous (classical TV) is well-performed in the recovery of jump discontinuity, and the homogeneous is corresponding to the oscillatory region. Fig. 8-(a) presents the comparison of two regularizations, and the advantage of each regularization is distinguishable. The curvature-based regularization has the trade-off performance on the oscillatory region and other regions including discontinuity and ramps in comparison to homogeneous regularizations (Fig. 8-(b),(c)). The proposed regularization accepts the benefits of two homogeneous regularizations selectively as it is superior to homogeneous on the oscillatory region (Fig. 8-(d)) and homogeneous on the discontinuity (Fig. 8-(e)). On the ramp regions, the recovery is comparable to the homogeneous as the exponent distribution is close to (Fig. 8-(d)). Compared to the curvature-based regularization, both regularizations are comparable on the ramp regions, and the proposed regularization has better performance on the regions of the discontinuity and the oscillation (Fig. 8-(f)). Overall, as the signal has oscillatory features in , discontinuity in , and the ramp in over the domain, the proposed regularization improves the homogeneous with in -error ( in -error), the homogeneous with in -error ( in -error) (Fig. 7).

Figure 5: True image (a), local feature classification (b), exponent distribution (c) with the hyperparameters in algorithm 2
Figure 6: Synthetic 2D image recovery from 4 different regularizations; (a) homogeneous , (b) homogeneous , (c) curvature-based regularization, (d) proposed regularization
Figure 7: Pointwise errors of image recovery from 4 different regularizations; (a) homogeneous , (b) homogeneous , (c) curvature-based regularization, (d) proposed regularization. The darker color indicates less accurate approximation.
Figure 8: Pointwise error comparison; each plot presents the subtraction between pointwise errors of two recoveries (a) homogeneous subtracts homogeneous , (b) curvature-based regularization subtracts homogeneous , (c) curvature-based regularization subtracts homogeneous , (d) proposed regularization subtracts homogeneous , (e) proposed regularization subtracts homogeneous , (f) proposed regularization subtracts curvature-based regularization. The color blue indicates the minuend has less pointwise error than the subtrahend, and the color red indicates contrariwise.

5.3 Real 2D image recovery

Finally, we test our algorithm for real image reconstruction. The image is employed from the dataset of Navy Ice Camp See Dragon (ICEX2020). We downsample the original image and use the part with the resolution , which corresponds to the sea area of size (Fig. 9-(a) or Fig. 10-(a)). The measurement is the downsampling from the original Fourier coefficients by choosing every three coefficients in -direction. Also, the measurement is assumed to have additive Gaussian noise with corresponding to the signal-to-noise ratio (SNR) 50.

The region classification and the exponent distribution are presented in Fig. 9. The classification map is less readable than the case of the synthetic image, as the features of interest are not simply discernible from the given image. The classification indicates discontinuity regions and oscillatory regions over the given image.

Fig. 10 displays the reconstruction from different regularizations, and the pointwise errors of each approximation are summarized in Fig. 11. The recovery from the proposed regularization has resulted in overall improvement as indicated in and errors. In particular, it is better in capturing the underlying features of sea ice textures, which is especially discernible as illustrated by the pointwise error on the region (see Fig. 11).

Figure 9: True image (a), local feature classification (b), exponent distribution (c) with the hyperparameters in algorithm 2
Figure 10: Real 2D image recovery from 4 different regularizations; (a) true image, (b) homogeneous , (c) homogeneous , (d) curvature-based regularization, (e) proposed regularization
Figure 11: Pointwise errors of image recovery from 4 different regularizations; (a) homogeneous , (b) homogeneous , (c) curvature-based regularization, (d) proposed regularization

6 Conclusions

For an underdetermined inverse problem, regularization plays an important role in stabilizing the inverse problem. For denoising applications, the inhomogeneous regularization, which is the regularization of the gradient with a spatially varying exponent has been investigated for the reconstruction of a signal with mixed features, such as sparsity, edges, oscillation, etc. The current study has focused on the design of the inhomogeneous exponent when the first and the second derivative of the true signal are not available due to indirect and incomplete measurement data.

The proposed method generates multiple reconstructions using standard homogeneous regularization to extract statistical information that assists the design of the inhomogeneous exponent. The local characteristics of the unknown signal are estimated on a set of patches for the average and the variance pooling of the gradient. The proposed method classifies each patch in an unsupervised way where the exponent value is assigned based on the classification. We also modified the ADMM method to efficiently handle the inhomogeneous regularization while maintaining the computational efficiency of ADMM. We have tested the robustness and effectiveness of the proposed method through a suite of image reconstruction problems in 1D and 2D images, including sea ice reconstruction from partial Fourier measurement data. The numerical results show the robust performance of the proposed method in comparison with the standard regularization method, including the inhomogeneous regularization using the exponent design based on curvature information.

In the current study, we focused on incomplete Fourier measurements, which is a global operator. Although each patch is classified independently, signal components in different patches are connected through the measurement operation. We expect that this implicit connection across different patches will affect the patch-wise exponent estimation. We plan to investigate the effect of varying patch sizes and shapes to handle the interconnection across patches and other measurement types, which we leave as future work.

In our numerical tests, the inhomogeneous regularization shows robust performance without weighting. However, it is natural to investigate the interplay between the weighted and the inhomogeneous regularizations and develop a strategy to design the weights. The inhomogeneous regularization changes the shape of the regularization constraint. On the other hand, the weighted regularization changes the length of a simplex while maintaining the geometry of the simplex. As these two approaches have different geometrical interpretations, we expect that each approach has intrinsic performance difference for a certain class of problems, will be reported in another place.


The authors thank Chris Polashenski for providing the sea ice data. Yoonsang Lee is supported in part by NSF DMS-1912999 and ONR MURI N00014-20-1-2595.


  • [1] B. Adcock, A. Gelb, G. Song, and Y. Sui (2019) Joint sparse recovery based on variances. SIAM Journal on Scientific Computing 41 (1), pp. A246–A268. Cited by: §1, §3.1.
  • [2] R. Archibald, A. Gelb, and R. B. Platte (2016) Image reconstruction from undersampled fourier data using the polynomial annihilation transform. Journal of Scientific Computing 67 (2), pp. 432–452. Cited by: §1.
  • [3] P. Blomgren, T. F. Chan, and P. Mulet (1997) Extensions to total variation denoising. In Advanced Signal Processing: Algorithms, Architectures, and Implementations VII, Vol. 3162, pp. 367–375. Cited by: §1, §1, §1, §2.2, §2.2, §2, §3.
  • [4] S. Boyd, N. Parikh, and E. Chu (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Now Publishers Inc. Cited by: §1, §4.
  • [5] K. Bredies, K. Kunisch, and T. Pock (2010) Total generalized variation. SIAM Journal on Imaging Sciences 3 (3), pp. 492–526. Cited by: §1.
  • [6] E.J. Candes, J. Romberg, and T. Tao (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory 52 (2), pp. 489–509. External Links: Document Cited by: §1.
  • [7] E. J. Candes, M. B. Wakin, and S. P. Boyd (2008) Enhancing sparsity by reweighted ℓ 1 minimization. Journal of Fourier analysis and applications 14 (5-6), pp. 877–905. Cited by: §1.
  • [8] T. Chan, A. Marquina, and P. Mulet (2000) High-order total variation-based image restoration. SIAM Journal on Scientific Computing 22 (2), pp. 503–516. Cited by: §1.
  • [9] T. R. Chandrupatla (1997) A new hybrid quadratic/bisection algorithm for finding the zero of a nonlinear function without using derivatives. Advances in Engineering Software 28 (3), pp. 145–149. Cited by: §4.
  • [10] R. Chartrand and W. Yin (2008) Iteratively reweighted algorithms for compressive sensing. In 2008 IEEE international conference on acoustics, speech and signal processing, pp. 3869–3872. Cited by: §1.
  • [11] Q. Chen, P. Montesinos, Q. S. Sun, P. A. Heng, et al. (2010) Adaptive total variation denoising based on difference curvature. Image and vision computing 28 (3), pp. 298–306. Cited by: §1, §2.2, §3, §5.
  • [12] D. C. Dobson and F. Santosa (1996) Recovery of blocky images from noisy and blurred data. SIAM Journal on Applied Mathematics 56 (4), pp. 1181–1198. Cited by: §1.
  • [13] A. El Hamidi, M. Menard, M. Lugiez, and C. Ghannam (2010) Weighted and extended total variation for image restoration and decomposition. Pattern Recognition 43 (4), pp. 1564–1576. Cited by: §1.
  • [14] A. Gelb and T. Scarnati (2019) Reducing effects of bad data using variance based joint sparsity recovery. Journal of Scientific Computing 78 (1), pp. 94–120. Cited by: §1.
  • [15] A. Gholami and S. M. Hosseini (2013) A balanced combination of tikhonov and total variation regularizations for reconstruction of piecewise-smooth signals. Signal Processing 93 (7), pp. 1945–1960. Cited by: §1.
  • [16] S. Lefkimmiatis, A. Bourquard, and M. Unser (2011) Hessian-based norm regularization for image restoration with biomedical applications. IEEE Transactions on Image Processing 21 (3), pp. 983–995. Cited by: §1.
  • [17] K. Liu, J. Tan, and B. Su (2014) An adaptive image denoising model based on tikhonov and tv regularizations. Advances in Multimedia 2014. Cited by: §1.
  • [18] Y. Liu, J. Ma, Y. Fan, and Z. Liang (2012) Adaptive-weighted total variation minimization for sparse data toward low-dose x-ray computed tomography image reconstruction. Physics in Medicine & Biology 57 (23), pp. 7923. Cited by: §1.
  • [19] C. Louchet and L. Moisan (2011) Total variation as a local filter. SIAM Journal on Imaging Sciences 4 (2), pp. 651–694. Cited by: §3.
  • [20] L. I. Rudin, S. Osher, and E. Fatemi (1992) Nonlinear total variation based noise removal algorithms. Physica D: nonlinear phenomena 60 (1-4), pp. 259–268. Cited by: §1.
  • [21] S. Setzer, G. Steidl, and T. Teuber (2011) Infimal convolution regularizations with discrete ℓ1-type functionals. Communications in Mathematical Sciences 9 (3), pp. 797–827. Cited by: §1.
  • [22] M. Yuan and Y. Lin (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 (1), pp. 49–67. Cited by: §4.