# Minimax adaptive wavelet estimator for the anisotropic functional deconvolution model with unknown kernel

In the present paper, we consider the estimation of a periodic two-dimensional function f(·,·) based on observations of its noisy convolution when the convolution kernel g(·,·) is itself unknown. We derive minimax lower bounds for the mean squared error under the assumption that f belongs to a certain Besov space and the kernel function g satisfies some smoothness properties. We construct an adaptive hard-thresholding wavelet estimator that is asymptotically near-optimal within a logarithmic factor over a wide range of Besov balls. The proposed estimation algorithm implements a truncation to estimate the wavelet coefficients, in addition to the conventional hard thresholds. A limited simulation study confirms the theoretical claims of the paper.


## 1 Introduction.

We consider the problem of estimating an unknown function $f$ from observations contaminated by Gaussian white noise in the following convolution model:

$$y(t,u)=\int_0^1 f(x,u)\,g(t-x,u)\,dx+\epsilon\,z^{(1)}(t,u),\qquad g_\delta(t,u)=g(t,u)+\delta\,z^{(2)}(t,u), \qquad (1)$$

where $g(\cdot,\cdot)$ is the unknown blurring function with observations $g_\delta(\cdot,\cdot)$, and $z^{(1)}$ and $z^{(2)}$ are two independent two-dimensional Gaussian white noises with covariance function

$$\mathbb{E}\big[z^{(k)}(t_1,u_1)\,z^{(k)}(t_2,u_2)\big]=\delta(u_1-u_2)\,\delta(t_1-t_2),\qquad k=1,2, \qquad (2)$$

where $\delta(\cdot)$ in (2) denotes the Dirac delta function. The noise parameters $\epsilon$ and $\delta$ are positive and tend to zero asymptotically.
The discrete version of model (1), when $y$ and $g_\delta$ are observed at points $(t_i,u_l)$, $i=1,\dots,N$, $l=1,\dots,M$, is as follows:

$$y(t_i,u_l)=\int_0^1 f(x,u_l)\,g(t_i-x,u_l)\,dx+\sigma_1 z^{(1)}_{li},\qquad g_\delta(t_i,u_l)=g(t_i,u_l)+\sigma_2 z^{(2)}_{li}, \qquad (3)$$

where $\sigma_1$ and $\sigma_2$ are two positive constants independent of $N$ and $M$. The quantities $z^{(k)}_{li}$, $k=1,2$, are zero-mean i.i.d. standard normal random variables, and $z^{(1)}_{li}$ and $z^{(2)}_{li}$ are independent of each other.
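Before the analysis, it may help to see how data from the discrete model (3) can be produced. The sketch below simulates (3) with a circular (periodic-in-$t$) convolution; the grid sizes, the test function and kernel, and the noise constants are all illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 256, 64              # numbers of t- and u-grid points (illustrative)
sigma1, sigma2 = 0.1, 0.05  # noise constants (illustrative)

t = np.arange(N) / N        # sampling points t_i in [0, 1)
u = np.arange(M) / M        # sampling points u_l in [0, 1)

# Illustrative periodic test function f(x, u) and kernel g(t, u).
f = np.outer(np.sin(2 * np.pi * t), 1.0 + 0.5 * np.cos(2 * np.pi * u))
g = np.outer(1.0 + np.cos(2 * np.pi * t), np.ones(M))

def circular_convolve(f, g):
    """Riemann-sum approximation of int_0^1 f(x, u) g(t - x, u) dx,
    computed for every u-column at once via the FFT (periodicity in t)."""
    N = f.shape[0]
    return np.real(np.fft.ifft(np.fft.fft(f, axis=0) * np.fft.fft(g, axis=0), axis=0)) / N

# Observations according to the discrete model (3).
y = circular_convolve(f, g) + sigma1 * rng.standard_normal((N, M))
g_delta = g + sigma2 * rng.standard_normal((N, M))
```

The FFT route exploits the periodicity of the model: a circular convolution in $t$ is a pointwise product of Fourier coefficients, which is also what makes the Fourier-domain analysis in Section 2 natural.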

The deconvolution model has generated a considerable number of publications since the late 1980s, and Donoho (1995) was the first to devise a wavelet solution to the problem. The list also includes Abramovich and Silverman (1998), Pensky and Vidakovic (1999), Walter and Shen (1999), Johnstone, Kerkyacharian, Picard and Raimondo (2004), Donoho and Raimondo (2004), and Pensky and Sapatinas (2009). Functional deconvolution was investigated in Benhaddou et al. (2013), who considered model (1) with $\delta=0$, which corresponds to the case when the kernel is known. This model is motivated by experiments in which one needs to recover a two-dimensional function using observations of its convolutions along profiles. This situation occurs, for example, in seismic inversion (see Robinson (1999)). Recently, Benhaddou, Pensky and Rajapakshage (2019) investigated functional Laplace deconvolution where the function under consideration is not periodic.

In the present setting, the convolution kernel is unknown, but observations $g_\delta$ of it are available. This problem is referred to as blind deconvolution. The inverse problem with unknown operators was studied in its general form by Hoffmann and Reiss (2008), who proposed two nonlinear methods for estimating a one-dimensional function. Delattre et al. (2012) considered the blind deconvolution problem and applied a block SVD procedure to construct a wavelet estimator of a one-dimensional function belonging to a Sobolev space. Benhaddou (2018a) considered the blind deconvolution model under fractional Gaussian noise, where the function under consideration is univariate and periodic. The common feature between Hoffmann and Reiss (2008) and Benhaddou (2018a) is that both implement preliminary thresholding procedures that eliminate estimated wavelet coefficients judged to be too large.

The objective of the paper is to construct an adaptive hard-thresholding wavelet estimator for model (1). We focus on regular-smooth convolution. A preliminary stabilizing thresholding procedure is applied to the functional Fourier coefficients of the "data" to estimate the wavelet coefficients, taking advantage of the flexibility of the Meyer wavelet basis in the Fourier domain. More specifically, we apply the Meyer wavelet transform in the Fourier domain and, for each resolution level $j$, we truncate the estimated wavelet coefficients at values that are close to zero. We show that the proposed approach is asymptotically near-optimal over a wide range of Besov balls under the $L^2$-risk. In addition, we show that the convergence rates are expressed as the maximum of two terms, reflecting the two noise sources. Similar behavior has been pointed out in Hoffmann and Reiss (2008), Vareschi (2015) and Benhaddou (2018a, 2018b). It should be noted that when the kernel is known ($\delta=0$), our convergence rates coincide with those in Benhaddou et al. (2013), and in the corresponding special cases our rates match those in Benhaddou (2017) and are comparable to those in Benhaddou (2018a) and Benhaddou (2018b).

The rest of the paper is organized as follows. In Section 2, we describe the estimation procedure in detail. In Section 3, we study the asymptotic performance of the proposed estimator in terms of its minimax squared loss. In Section 4, we present a limited simulation study to assess the performance of our estimator when the sample size is finite. Finally, Section 5 contains the proofs of our theoretical findings.

## 2 Estimation algorithm

Let $\langle\cdot,\cdot\rangle$ denote the inner product in the Hilbert space $L^2([0,1])$, i.e., $\langle f,g\rangle=\int_0^1 f(t)\overline{g(t)}\,dt$ for $f,g\in L^2([0,1])$, where $\overline{g}$ is the complex conjugate of $g$. Let $e_m(t)$, $m\in\mathbb{Z}$, denote a Fourier basis on the interval $[0,1]$. Let $y_m(u)$, $f_m(u)$, $g_m(u)$ and $g_{\delta m}(u)$ be the functional Fourier coefficients of the functions $y$, $f$, $g$ and $g_\delta$, respectively. Applying the Fourier transform to equation (1), we get

$$y_m(u)=f_m(u)\,g_m(u)+\epsilon\,z^{(1)}_m(u),\qquad g_{\delta m}(u)=g_m(u)+\delta\,z^{(2)}_m(u). \qquad (4)$$

Consider a bounded-bandwidth periodized wavelet basis (Meyer-type) and a finitely supported periodized regular wavelet basis (Daubechies-type). Denote the wavelet functions of these two bases by $\psi_{j,k}(t)$ and $\eta_{j',k'}(u)$, respectively, and let $m_0$ and $m_0'$ be the lowest resolution levels of the two bases. Then the function $f$ has the wavelet series representation

$$f(t,u)=\sum_{j=m_0-1}^{\infty}\;\sum_{j'=m_0'-1}^{\infty}\;\sum_{k=0}^{2^j-1}\;\sum_{k'=0}^{2^{j'}-1}\beta_{j,k,j',k'}\,\psi_{j,k}(t)\,\eta_{j',k'}(u), \qquad (5)$$

with

$$\beta_{j,k,j',k'}=\Big\langle\big\langle f(t,u),\psi_{j,k}(t)\big\rangle,\eta_{j',k'}(u)\Big\rangle=\sum_{m\in W_j}\int_0^1 f_m(u)\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}, \qquad (6)$$

where $\psi_{j,k,m}=\langle e_m,\psi_{j,k}\rangle$ are the Fourier coefficients of $\psi_{j,k}$. It is well known (see, e.g., Johnstone et al. (2004), Section 3.1) that, in the Fourier domain and for any $j\geq m_0$, one has

$$W_j=\{m:\psi_{j,k,m}\neq 0\}\subseteq \frac{2\pi}{3}\Big(\big[-2^{j+2},-2^{j}\big]\cup\big[2^{j},2^{j+2}\big]\Big), \qquad (7)$$

due to the fact that Meyer wavelets are band-limited, and the cardinality of $W_j$ is $O(2^j)$. If $g$ were known, the problem would reduce to the regular functional deconvolution model studied in Benhaddou et al. (2013). However, since $g$ is unknown and contaminated with Gaussian white noise, a preliminary thresholding procedure must be applied to obtain a stable estimate of $1/g_m(u)$. Let us define the quantities

$$\widetilde{\left(\frac{1}{g_{\delta m}(u)}\right)}=\begin{cases}\dfrac{1}{g_{\delta m}(u)}, & \text{if } |g_{\delta m}(u)|>\kappa\,\delta\sqrt{\ln(1/\delta)},\\[4pt] 0, & \text{if } |g_{\delta m}(u)|\leq\kappa\,\delta\sqrt{\ln(1/\delta)},\end{cases} \qquad (8)$$

where $\kappa$ is a positive constant independent of $\epsilon$ and $\delta$. Then, using (6) and (8), and applying the Plancherel formula, a truncated estimator for $\beta_{j,k,j',k'}$ is given by

$$\hat{\beta}_{j,k,j',k'}=\sum_{m\in W_j}\int_0^1 y_m(u)\,\widetilde{\left(\frac{1}{g_{\delta m}(u)}\right)}\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}. \qquad (9)$$
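To illustrate, the stabilized inversion (8) inverts a noisy Fourier coefficient only when its modulus clears the cutoff $\kappa\,\delta\sqrt{\ln(1/\delta)}$, and returns zero otherwise. The following sketch is a direct transcription; the function name, sample coefficients and value of kappa are hypothetical.

```python
import numpy as np

def stabilized_reciprocal(g_delta_m, delta, kappa=1.0):
    """Truncated reciprocal of noisy Fourier coefficients, in the spirit
    of (8): invert only where |g_delta_m| > kappa*delta*sqrt(ln(1/delta)),
    and set the reciprocal to zero elsewhere."""
    thresh = kappa * delta * np.sqrt(np.log(1.0 / delta))
    out = np.zeros_like(g_delta_m, dtype=complex)
    keep = np.abs(g_delta_m) > thresh
    out[keep] = 1.0 / g_delta_m[keep]
    return out

# Coefficients below the cutoff are zeroed out rather than inverted,
# which is what keeps the deconvolution stable.
g_delta_m = np.array([1.0, 0.5, 1e-4, 0.0, -0.3 + 0.2j])
inv = stabilized_reciprocal(g_delta_m, delta=0.01)
```

Small coefficients are exactly where naive inversion would blow the noise up; replacing their reciprocal by zero trades a bias term for a bounded variance.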

Bear in mind that the above thresholding of the Fourier coefficients is what enables a stable inversion. Now, define

$$\Omega(J,J')=\big\{\omega=(j,k;j',k'):\ m_0\leq j\leq J-1,\ m_0'\leq j'\leq J'-1,\ k=0,\dots,2^j-1,\ k'=0,\dots,2^{j'}-1\big\}. \qquad (10)$$

Then, consider the hard-thresholding estimator of $f$ given by

$$\hat{f}(t,u)=\sum_{j=m_0-1}^{J-1}\;\sum_{j'=m_0'-1}^{J'-1}\;\sum_{k=0}^{2^j-1}\;\sum_{k'=0}^{2^{j'}-1}\hat{\beta}_{\omega}\,\mathbb{1}\big(|\hat{\beta}_{\omega}|>\lambda_{j}^{\epsilon,\delta}\big)\,\psi_{j,k}(t)\,\eta_{j',k'}(u), \qquad (11)$$

where $J$ and $J'$ will be determined later, and the thresholds $\lambda_{j}^{\epsilon,\delta}$ will be chosen based on the variance of $\hat{\beta}_{\omega}$.
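In code, the hard-thresholding step in (11) amounts to multiplying each estimated coefficient by an indicator; a minimal sketch, where the coefficient values and the threshold are illustrative:

```python
import numpy as np

def hard_threshold(beta_hat, lam):
    """Hard thresholding: keep an estimated wavelet coefficient only when
    its magnitude strictly exceeds the level-dependent threshold lam,
    as the indicator in (11) prescribes."""
    return np.where(np.abs(beta_hat) > lam, beta_hat, 0.0)

# One resolution level with an illustrative threshold lambda_j = 0.5.
beta_hat = np.array([0.9, -0.2, 0.51, -0.5, 0.05])
kept = hard_threshold(beta_hat, 0.5)
```

Coefficients exactly at the threshold are set to zero, matching the strict inequality in (11); surviving coefficients are kept at their estimated values rather than shrunk, which is what distinguishes hard from soft thresholding.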
Assumption A.1. Assume that the functional Fourier coefficients $g_m(u)$ are uniformly bounded from below and above; that is, there exist positive constants $v$, $C_1$ and $C_2$, all independent of $m$ and $u$, such that

$$C_1\,|m|^{-2v}\leq |g_m(u)|^2\leq C_2\,|m|^{-2v}. \qquad (12)$$

The next step is to evaluate the variance of (9). Define, for $0<\rho<1$ and the positive constant $\kappa$ of (8), the sets $\Omega_1$ and $\Omega_2$ such that

$$\Omega_1=\big\{m\in W_j:\ |g_{\delta m}(u)|>\kappa\,\delta\sqrt{\ln(1/\delta)}\big\}, \qquad (13)$$

$$\Omega_2=\big\{m\in W_j:\ |\delta z^{(2)}_m(u)|<\rho\,\kappa\,\delta\sqrt{\ln(1/\delta)}\big\}, \qquad (14)$$

and denote $\Omega_0=\Omega_1\cap\Omega_2$.

###### Lemma 1

Let the constant $\rho$ defined in (14) be such that $0<\rho<1/2$. Then on $\Omega_0$, one has

$$\frac{1-2\rho}{1-\rho}\,|g_m(u)|\leq |g_{\delta m}(u)|\leq \frac{1}{1-\rho}\,|g_m(u)|. \qquad (15)$$
###### Lemma 2 (Upper bound for $\operatorname{Var}(\hat{\beta}_{\omega})$)

Let $\hat{\beta}_{\omega}$ be defined by (9), and let the constant $\rho$ be such that $0<\rho<1/2$. Then on $\Omega_0$, one has

$$\operatorname{Var}\big(\hat{\beta}_{\omega}\big)=O\Big(\max\big\{\epsilon^2\,2^{2jv},\ \delta^2\,2^{2jv}\big\}\Big), \qquad (16)$$

and on $\Omega_0$,

$$\mathbb{E}\big|\hat{\beta}_{\omega}-\beta_{\omega}\big|^4=O\Big(\max\big\{\epsilon^4\,2^{4jv+j'},\ \delta^4\,2^{4jv+j'}\big\}\Big). \qquad (17)$$

Now, to determine the choice of the thresholds $\lambda_j^{\epsilon,\delta}$, let us define the quantity

$$S(g_{\delta m})_j=\int_0^1\sum_{m\in\Omega_0}|g_{\delta m}(u)|^2\,du. \qquad (18)$$

Then by (7) and (12), we obtain

$$\int_0^1\sum_{m\in\Omega_0}|g_{\delta m}(u)|^2\,du \asymp \sum_{m\in\Omega_0}|m|^{-2v}\asymp 2^{j}\,2^{-2jv}. \qquad (19)$$

Hence, we choose our thresholds of the form

$$\lambda_j^{\epsilon,\delta}=2^{j/2}\big[S(g_{\delta m})_j\big]^{-1/2}\max\Big\{\gamma_1\,\epsilon\sqrt{\ln(1/\epsilon)},\ \gamma_2\,\delta\ln(1/\delta)\Big\}. \qquad (20)$$

Finally, choose $J$ and $J'$ such that

$$J=\max\Big\{j\in\mathbb{Z}:\ \big[S(g_{\delta m})_j\big]^{-1}\leq 2^{-2j}\big(\max\{A^{-2}\epsilon^2,\ A^{-2}\delta^2\}\big)^{-1}\Big\}, \qquad (21)$$

$$2^{J'}\asymp\big(\max\{\epsilon^2,\delta^2\}\big)^{-1}. \qquad (22)$$
###### Lemma 3

For J defined in (21), one has

$$2^{J}\asymp\big(\max\{A^{-2}\epsilon^2,\ A^{-2}\delta^2\}\big)^{-\frac{1}{2v+1}}. \qquad (23)$$
###### Remark 1

Notice that our choices of the thresholds $\lambda_j^{\epsilon,\delta}$ and of the finest resolution levels $J$ and $J'$ do not depend on any unknown parameter, so estimator (11) is adaptive. It remains to investigate how the estimator performs. Next, we evaluate the lower and upper bounds for the $L^2$-risk.

## 3 Minimax rates of convergence

To construct minimax lower bounds for the $L^2$-risk, we define the $L^2$-risk over a set $V$ as

$$R(V)=\inf_{\tilde{f}}\,\sup_{f\in V}\,\mathbb{E}\big\|\tilde{f}-f\big\|_2^2, \qquad (24)$$

where $\|\cdot\|_2$ is the $L^2$-norm of a function and the infimum is taken over all possible estimators $\tilde{f}$ of $f$.
Assumption A.2. Let $s_i^*=s_i+1/2-1/p$, $i=1,2$. Assume that $f$ belongs to a two-dimensional Besov ball, i.e., its wavelet coefficients satisfy

$$B^{s_1,s_2}_{p,q}(A)=\left\{f\in L^2(U):\left(\sum_{j,j'}2^{(js_1^*+j's_2^*)q}\Big(\sum_{k,k'}|\beta_{\omega}|^p\Big)^{q/p}\right)^{1/q}\leq A\right\}. \qquad (25)$$
###### Theorem 1

Let $s_1,s_2>0$, $1\leq p,q\leq\infty$, and let $s_1^*=s_1+1/2-1/p$. Then under conditions (12) and (25), as $\epsilon,\delta\to 0$ simultaneously,

$$R_{\epsilon,\delta}\big(B^{s_1,s_2}_{p,q}(A)\big)\geq C A^2\left(\frac{\max\{\epsilon^2,\delta^2\}}{A^2}\right)^{d}, \qquad (26)$$

where

$$d=\begin{cases}\dfrac{2s_2}{2s_2+1}, & \text{if } s_2(2v+1)\leq s_1,\\[6pt]\dfrac{2s_1}{2s_1+2v+1}, & \text{if } (2v+1)\Big(\dfrac{1}{p}-\dfrac{1}{2}\Big)<s_1<s_2(2v+1),\\[6pt]\dfrac{2s_1^*}{2s_1^*+2v}, & \text{if } s_1\leq (2v+1)\Big(\dfrac{1}{p}-\dfrac{1}{2}\Big).\end{cases} \qquad (27)$$

Before we derive the minimax upper bounds for the $L^2$-risk, we first prove the next lemma, which provides large-deviation inequalities for the wavelet coefficient estimators.

###### Lemma 4

Let $\hat{\beta}_{\omega}$ and $\lambda_j^{\epsilon,\delta}$ be defined by (9) and (20), respectively. Define, for $\alpha>0$, the set

$$\Theta_{\omega,\alpha}=\big\{\theta:\ |\hat{\beta}_{\omega}-\beta_{\omega}|>\alpha\,\lambda_j^{\epsilon,\delta}\big\}. \qquad (28)$$

Then, on $\Omega_0$ and under condition (12), as $\epsilon,\delta\to 0$ simultaneously, one has

$$P(\Theta_{\omega,\alpha})=O\big((\epsilon^2)^{\tau_1}+(\delta^2)^{\tau_2}\big), \qquad (29)$$

where

$$\tau_1=\frac{C_1(1-2\rho)^2\alpha^2\gamma_1^2}{256\,\pi^2 C_2}\quad\text{and}\quad \tau_2=\frac{1}{4}\left[\frac{\alpha\gamma_2(1-\rho)\sqrt{C_1}}{8\pi\sqrt{C_2}}-\frac{\sqrt{C_2}}{4\pi\rho^2\kappa^2\sqrt{C_1}(1-2\rho)}\right]^2, \qquad (30)$$

$C_1$ and $C_2$ appear in (12), and $\rho$, $\kappa$ appear in (14).

Then, the following statement is true.

###### Theorem 2

Let $\hat{f}$ be the wavelet estimator defined in (11), with $J$ and $J'$ given by (21) and (22). Let conditions (12) and (25) hold, and choose $\gamma_1$ and $\gamma_2$ in (29) large enough. Then, as $\epsilon,\delta\to 0$ simultaneously,

$$\sup_{f\in B^{s_1,s_2}_{p,q}(A)}\mathbb{E}\|\hat{f}-f\|_2^2\leq C A^2\left\{\left(\frac{\epsilon^2}{A^2}\ln(1/\epsilon)\right)^{d}[\ln(1/\epsilon)]^{d_1}\vee\left(\frac{\delta^2}{A^2}\ln^2(1/\delta)\right)^{d}[\ln(1/\delta)]^{d_1}\right\}, \qquad (31)$$

where $d$ is defined in (27) and

$$d_1=\mathbb{1}\Big(s_1=(2v+1)\big(\tfrac{1}{p}-\tfrac{1}{2}\big)\Big)+\mathbb{1}\big(s_1=s_2(2v+1)\big). \qquad (32)$$
###### Remark 2

Theorems 1 and 2 imply that, under the $L^2$-risk, estimator (11) is asymptotically quasi-optimal within a logarithmic factor of $1/\epsilon$ or $1/\delta$ over a wide range of anisotropic Besov balls $B^{s_1,s_2}_{p,q}(A)$.

###### Remark 3

The convergence rates are expressed as the maximum of two terms, reflecting the two sources of noise. Similar behavior has been pointed out in Hoffmann and Reiss (2008), Vareschi (2015) and Benhaddou (2018a, 2018b). These rates depend on a delicate balance between the parameters of the Besov ball, the smoothness of the convolution kernel, and the noise parameters $\epsilon$ and $\delta$.

###### Remark 4

When the kernel is known ($\delta=0$), our convergence rates coincide with those in Benhaddou et al. (2013), and in the corresponding special cases they match those in Benhaddou (2017). In addition, if we fix the variable $u$, our rates are comparable to those in Benhaddou (2018a) in their univariate case, and match those in Benhaddou (2018b) in their univariate but multichannel case.

## 4 Simulation Study

In this section, we carry out a limited simulation study in order to investigate the finite-sample performance of our estimator. The first step is to provide the sample equivalents of equations (4), (8), (9), (13)-(14) and (20)-(22). Indeed, applying the Fourier transform to (3) yields

$$y_m(u_l)=f_m(u_l)\,g_m(u_l)+\sigma_1 z^{(1)}_m(u_l),\qquad g_{\delta m}(u_l)=g_m(u_l)+\sigma_2 z^{(2)}_m(u_l). \qquad (33)$$

Then, the discrete version of (9) is given by

$$\hat{\beta}_{j,k,j',k'}=\frac{1}{M}\sum_{l=1}^{M}\sum_{m\in W_j} y_m(u_l)\,\widetilde{\left(\frac{1}{g_{\delta m}(u_l)}\right)}\,\eta_{j',k'}(u_l)\;\overline{\psi}_{j,k,m}. \qquad (34)$$

In addition, models (3) and (1) are equivalent if we set

$$\epsilon^2=\frac{\sigma_1^2}{MN},\qquad \delta^2=\frac{\sigma_2^2}{MN}. \qquad (35)$$

The sets $\Omega_1$ and $\Omega_2$ are now of the form

$$\Omega_1=\Big\{m\in W_j:\ \min_{l\leq M}|g_{\delta m}(u_l)|^2>\kappa^2\,\frac{\sigma_2^2}{MN}\,\ln(MN)\Big\}, \qquad (36)$$

$$\Omega_2=\Big\{m\in W_j:\ \max_{l\leq M}|z^{(2)}_m(u_l)|^2<\rho^2\kappa^2\,\ln(MN)\Big\}, \qquad (37)$$

and (18) has the sample counterpart

$$S(g_{\delta m})_j=\frac{1}{M}\sum_{l=1}^{M}\sum_{m\in\Omega_0}|g_{\delta m}(u_l)|^2. \qquad (38)$$

Finally, the threshold and the finest resolution levels are given by

$$\lambda_{j,M,N}=\gamma\,2^{j/2}\big[S(g_{\delta m})_j\big]^{-1/2}\max\left\{\left[\frac{\sigma_1^2\ln(MN)}{MN}\right]^{1/2},\ \left[\frac{\sigma_2^2\ln(MN)}{MN}\right]^{1/2}\right\}, \qquad (39)$$

$$J=\max\left\{j\in\mathbb{Z}:\ \big[S(g_{\delta m})_j\big]^{-1}\leq 2^{-2j}\left(\max\Big\{\frac{\sigma_1^2}{MN},\ \frac{\sigma_2^2}{MN}\Big\}\right)^{-1}\right\}, \qquad (40)$$

$$2^{J'}=\left(\max\Big\{\frac{\sigma_1^2}{MN},\ \frac{\sigma_2^2}{MN}\Big\}\right)^{-1}. \qquad (41)$$
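The data-driven quantities (38)-(40) can be computed directly from the Fourier coefficients of the observed kernel. The sketch below is an illustrative implementation: it uses simplified frequency bands $2^j\leq|m|<2^{j+2}$ as a stand-in for the exact Meyer supports $W_j$ in (7), and a single tuning constant `gamma`; both are assumptions, not the paper's exact construction.

```python
import numpy as np

def sample_S(g_delta_m, W_j):
    """Sample counterpart (38): average over the u-grid of the summed
    squared moduli of the kernel Fourier coefficients over the band W_j."""
    # g_delta_m has shape (n_freq, M): rows index frequencies m, columns u_l.
    return np.sum(np.abs(g_delta_m[W_j, :]) ** 2) / g_delta_m.shape[1]

def thresholds_and_levels(g_delta_m, sigma1, sigma2, M, N, gamma=1.0):
    """Thresholds in the spirit of (39) and finest levels (40)-(41),
    computed from the data; the bands W_j here are simplified."""
    n_freq = g_delta_m.shape[0]
    noise = max(sigma1 ** 2, sigma2 ** 2) / (M * N)
    J_max = int(np.log2(n_freq // 2))
    lambdas, J = {}, 0
    for j in range(1, J_max):
        m = np.arange(n_freq)
        m_signed = np.where(m <= n_freq // 2, m, m - n_freq)  # signed frequencies
        W_j = np.where((np.abs(m_signed) >= 2 ** j) & (np.abs(m_signed) < 2 ** (j + 2)))[0]
        S_j = sample_S(g_delta_m, W_j)
        lambdas[j] = gamma * 2 ** (j / 2) * S_j ** (-0.5) * np.sqrt(noise * np.log(M * N))
        if 1.0 / S_j <= 2 ** (-2 * j) / noise:   # criterion (40)
            J = j
    J_prime = int(np.log2(1.0 / noise))          # criterion (41)
    return lambdas, J, J_prime
```

Because $S(g_{\delta m})_j$ shrinks as $j$ grows for an ill-posed kernel, criterion (40) automatically stops the reconstruction at the level where kernel information is exhausted by the noise budget.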

The simulation was implemented based on the above equations, in MATLAB using the WaveLab toolbox. As in Benhaddou et al. (2013), degree-3 Meyer wavelets and degree-6 Daubechies wavelets were used for the wavelet transform. We generated our data by equation (33) with a testing kernel, various testing functions, and different combinations of values of $M$, $N$, $\sigma_1$ and $\sigma_2$. In particular, we generated $f$ as the product of a quadratic function scaled to have unit norm and one of the routinely used test functions blip, bumps, heavy sine and doppler, and we rescaled $f$ to have unit norm. As noted in Benhaddou et al. (2013), our method does not know that $f$ was generated as a product of two univariate functions, and thus cannot take advantage of this information. The choices of $\sigma_1$ and $\sigma_2$ depend on the signal-to-noise ratios: $\sigma_1$ was calibrated to the signal-to-noise ratio of the first equation in (33), and $\sigma_2$ to that of the second.
We evaluated the mean integrated squared error (MISE) $\mathbb{E}\|\hat{f}-f\|_2^2$ of the functional deconvolution estimator. Figure 1 reports the averages of these errors over 100 simulation repetitions, together with their standard deviations (in parentheses).
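The reported error is the Monte-Carlo approximation of the MISE; a minimal sketch of the computation, where the grid-based Riemann sum and the toy "estimates" are illustrative:

```python
import numpy as np

def mise(estimates, f_true, N, M):
    """Monte-Carlo MISE: average over simulation repetitions of the
    discretized integrated squared error ||f_hat - f||_2^2, approximated
    by a Riemann sum over the N x M grid. Returns (mean, std)."""
    errs = [np.sum((f_hat - f_true) ** 2) / (N * M) for f_hat in estimates]
    return np.mean(errs), np.std(errs)

# Illustrative check with two fake "estimates" of a known function:
# each is off by exactly +/-1 everywhere, so each ISE equals 1.0.
f_true = np.zeros((4, 4))
reps = [f_true + 1.0, f_true - 1.0]
m, s = mise(reps, f_true, 4, 4)
```

Dividing the squared-error sum by $NM$ makes the discrete sum approximate the integral over $[0,1]^2$, so the numbers are comparable across grid sizes.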

The simulation is aimed at studying the effect of two components: the sample budget $MN$ and the noise levels $\sigma_1$ and $\sigma_2$. The results confirm the theory and are consistent with previous findings in the literature: as the sample size increases, the estimation performance improves, while as the noise level increases, the performance deteriorates.

## 5 Proofs

Proof of Lemma 1. From (1), one has that

$$|g_m(u)|=\big|g_{\delta m}(u)-\delta z^{(2)}_m(u)\big|=\left|g_{\delta m}(u)\left(1-\frac{\delta z^{(2)}_m(u)}{g_{\delta m}(u)}\right)\right|=|g_{\delta m}(u)|\left|1-\frac{\delta z^{(2)}_m(u)}{g_{\delta m}(u)}\right|. \qquad (42)$$

Therefore,

$$|g_{\delta m}(u)|=|g_m(u)|\left|\frac{1}{1-\dfrac{\delta z^{(2)}_m(u)}{g_{\delta m}(u)}}\right|=|g_m(u)|\left|\sum_{l=0}^{\infty}\left(\frac{\delta z^{(2)}_m(u)}{g_{\delta m}(u)}\right)^{l}\right|. \qquad (43)$$

This is because, on $\Omega_0$, one has

$$0<\left|\frac{\delta z^{(2)}_m(u)}{g_{\delta m}(u)}\right|<\frac{\rho\,\kappa\,\delta\sqrt{\ln(1/\delta)}}{\kappa\,\delta\sqrt{\ln(1/\delta)}}=\rho<1. \qquad (44)$$

Thus, using (43) and (44) yields

$$|g_{\delta m}(u)|=|g_m(u)|\left|\sum_{l=0}^{\infty}\left(\frac{\delta z^{(2)}_m(u)}{g_{\delta m}(u)}\right)^{l}\right|\leq |g_m(u)|\,\frac{1}{1-\rho}. \qquad (45)$$

On one hand, one has

$$|g_m(u)|>|g_{\delta m}(u)|-\big|\delta z^{(2)}_m(u)\big|>(1-\rho)\,\kappa\,\delta\sqrt{\ln(1/\delta)}. \qquad (46)$$

On the other hand,

$$|g_{\delta m}(u)|=\big|g_m(u)+\delta z^{(2)}_m(u)\big|=\left|g_m(u)\left(1+\frac{\delta z^{(2)}_m(u)}{g_m(u)}\right)\right|. \qquad (47)$$

Combining (14) and (46), and using $0<\rho<1/2$, one has

$$\left|\frac{\delta z^{(2)}_m(u)}{g_m(u)}\right|<\frac{\rho\,\kappa\,\delta\sqrt{\ln(1/\delta)}}{(1-\rho)\,\kappa\,\delta\sqrt{\ln(1/\delta)}}=\frac{\rho}{1-\rho}<1. \qquad (48)$$

Now, using (47) and (48), one obtains

$$|g_m(u)|=|g_{\delta m}(u)|\left|\sum_{l=0}^{\infty}\left(-\frac{\delta z^{(2)}_m(u)}{g_m(u)}\right)^{l}\right|\leq |g_{\delta m}(u)|\,\frac{1-\rho}{1-2\rho}. \qquad (49)$$

Hence, combining (45) and (49) completes the proof.
Proof of Lemma 2. Note that $I(\Omega_1)=I(\Omega_1)I(\Omega_2)+I(\Omega_1)I(\Omega_2^c)$; thus the variance of (9) satisfies

$$\mathbb{E}\big|\hat{\beta}_{\omega}-\beta_{\omega}\big|^2 \leq 2\,\mathbb{E}\left|\sum_{m\in W_j}\int_0^1\frac{\epsilon z^{(1)}_m(u)-\delta f_m(u)\,z^{(2)}_m(u)}{g_{\delta m}(u)}\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}\,I(\Omega_1)I(\Omega_2)\right|^2 + 2\,\mathbb{E}\left|\sum_{m\in W_j}\int_0^1\frac{\epsilon z^{(1)}_m(u)-\delta f_m(u)\,z^{(2)}_m(u)}{g_{\delta m}(u)}\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}\,I(\Omega_1)I(\Omega_2^c)\right|^2 := E_1+E_2. \qquad (50)$$

The term $E_1$ can be partitioned into $E_{11}$ and $E_{12}$, where

$$E_{11}=\mathbb{E}\left|\sum_{m\in W_j}\int_0^1\frac{\epsilon z^{(1)}_m(u)}{g_{\delta m}(u)}\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}\,I(\Omega_1)I(\Omega_2)\right|^2, \qquad (51)$$

$$E_{12}=\mathbb{E}\left|\sum_{m\in W_j}\int_0^1\frac{\delta f_m(u)\,z^{(2)}_m(u)}{g_{\delta m}(u)}\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}\,I(\Omega_1)I(\Omega_2)\right|^2, \qquad (52)$$

$$E_{21}=\mathbb{E}\left|\sum_{m\in W_j}\int_0^1\frac{\epsilon z^{(1)}_m(u)}{g_{\delta m}(u)}\,\eta_{j',k'}(u)\,du\;\overline{\psi}_{j,k,m}\,I(\Omega_1)I(\Omega_2^c)\right|^2,$$