Deep Plug-and-play Prior for Low-rank Tensor Completion

05/11/2019 ∙ Wen-Hao Xu et al. ∙ Hong Kong Baptist University; NetEase, Inc.

Tensor image data sets such as color images and multispectral images are highly correlated, and they contain a lot of image details. The main aim of this paper is to propose and develop a regularized tensor completion model for tensor image data completion. In the objective function, we adopt the newly emerged tensor nuclear norm (TNN) to characterize the global structure of such tensor image data sets. Also, we formulate an implicit regularizer to plug in a convolutional neural network (CNN) denoiser, which is believed to express the image prior learned from a large number of natural images. The resulting model can be solved efficiently via an alternating direction method of multipliers (ADMM) algorithm. Experimental results (on color images, videos, and multispectral images) are presented to show that both the image global structure and the details can be recovered very well, and to illustrate that the performance of the proposed method is better than that of the competing methods in terms of PSNR and SSIM.



I Introduction

A tensor is an extension of a matrix that provides a richer and more natural representation for many kinds of data. Due to inevitable degradation in the acquisition process, an observed tensor is often incomplete. Tensor completion aims at estimating the missing entries from the observed tensor, and it is widely used in image and video recovery [1, 2, 3], hyperspectral image (HSI) and multispectral image (MSI) data recovery [4, 5, 6, 7], and background subtraction [8]. To solve the tensor completion problem, the low-rankness of real-world data has been widely exploited to capture global information. Mathematically, a low-rank tensor completion (LRTC) model is generally formulated as:

$$\min_{\mathcal{X}} \ \operatorname{rank}(\mathcal{X}) \quad \mathrm{s.t.} \quad \mathcal{P}_{\Omega}(\mathcal{X}) = \mathcal{P}_{\Omega}(\mathcal{O}), \tag{1}$$

where $\mathcal{X}$ is the underlying tensor, $\mathcal{O}$ is the observed tensor, $\Omega$ is the index set corresponding to the observed entries, and $\mathcal{P}_{\Omega}$ is the projection operator that keeps the entries in $\Omega$ while setting the others to zero.
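For concreteness, the sampling operator can be realized with a boolean mask. The following is a minimal sketch in Python/NumPy (our illustration, not the authors' code; names are ours):

```python
# A minimal sketch of the sampling operator P_Omega in (1), assuming the
# index set Omega is stored as a boolean mask of the same shape as the tensor.
import numpy as np

def p_omega(X, omega):
    """Keep the entries of X indexed by omega and set the others to zero."""
    return np.where(omega, X, 0.0)

# Example: a 4 x 4 x 3 tensor observed at a 30% sampling rate.
rng = np.random.default_rng(0)
X_true = rng.standard_normal((4, 4, 3))
omega = rng.random((4, 4, 3)) < 0.3   # randomly sampled index set
O = p_omega(X_true, omega)            # the observed (incomplete) tensor
```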

Fig. 1: The recovered results by HaLRTC, TNN, and DPLR on the color image Starfish for the sampling rate 5% (panels: Observed, HaLRTC, TNN, DPLR).
Fig. 2: Top row: (a) the ground truth, (b) the result by TNN, and (c) the residual between the ground truth and the TNN result. Bottom row: (d)-(f) the corresponding histograms of the ground truth, the TNN result, and the residual, respectively.

Different from the matrix case, the definition of tensor rank is not unique. As one of the most popular definitions, the CANDECOMP/PARAFAC (CP) rank of a tensor is defined based on the CP decomposition. For a third-order tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, its CP decomposition is

$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r, \tag{2}$$

where "$\circ$" denotes the vector outer product, $R$ is a positive integer, and $\mathbf{a}_r \in \mathbb{R}^{n_1}$, $\mathbf{b}_r \in \mathbb{R}^{n_2}$, and $\mathbf{c}_r \in \mathbb{R}^{n_3}$. Then, the minimum integer $R$, i.e., the minimum number of rank-one tensors required to express $\mathcal{X}$ [9], is called the CP rank of $\mathcal{X}$. This definition is intuitive and similar to the definition of matrix rank, but computing the CP rank is an NP-hard problem. Moreover, the CP rank has no tractable convex relaxation, which limits its application. Another popular definition is the $n$-rank based on the Tucker decomposition. The Tucker decomposition of a tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is

$$\mathcal{X} = \mathcal{G} \times_1 U_1 \times_2 U_2 \times_3 U_3, \tag{3}$$

where "$\times_n$" denotes the mode-$n$ product, $\mathcal{G}$ is called the core tensor, and $U_1$, $U_2$, and $U_3$ are factor matrices [10, 11]. Then, the $n$-rank is defined as the vector $\big(\operatorname{rank}(X_{(1)}), \operatorname{rank}(X_{(2)}), \operatorname{rank}(X_{(3)})\big)$, where $X_{(n)}$ denotes the mode-$n$ unfolding of $\mathcal{X}$. As the $n$-rank relies on the matrix rank, its calculation is relatively simple. Because the nuclear norm is the tightest convex surrogate of the matrix rank, Liu et al. introduced the sum of nuclear norms (SNN) as a relaxation of the $n$-rank to characterize the low-rankness of a tensor along all modes [12]. Then, the LRTC model can be rewritten as:

$$\min_{\mathcal{X}} \ \sum_{n=1}^{3} \alpha_n \left\| X_{(n)} \right\|_{*} \quad \mathrm{s.t.} \quad \mathcal{P}_{\Omega}(\mathcal{X}) = \mathcal{P}_{\Omega}(\mathcal{O}), \tag{4}$$

where the $\alpha_n$'s are nonnegative weights satisfying $\sum_{n=1}^{3} \alpha_n = 1$, $X_{(n)}$ is the mode-$n$ unfolding of $\mathcal{X}$, and $\|\cdot\|_{*}$ is the matrix nuclear norm.

A novel definition of tensor rank, called the multi-rank, based on the tensor singular value decomposition (t-SVD) was proposed in [13, 14, 15, 16]. For a third-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, the t-SVD performs a one-dimensional Fourier transformation along each tube to get $\bar{\mathcal{A}}$, and then decomposes each frontal slice of $\bar{\mathcal{A}}$ in the matrix SVD format. The multi-rank is correspondingly defined as $\operatorname{rank}_m(\mathcal{A}) = \big(\operatorname{rank}(\bar{A}^{(1)}), \ldots, \operatorname{rank}(\bar{A}^{(n_3)})\big)$, where $\bar{A}^{(k)}$ is the $k$-th frontal slice of $\bar{\mathcal{A}}$ (see more details in Section II-B). Similar to SNN, the tensor nuclear norm (TNN) [17] is used as a convex surrogate for the multi-rank, and the LRTC model is rewritten as:

$$\min_{\mathcal{X}} \ \|\mathcal{X}\|_{\mathrm{TNN}} \quad \mathrm{s.t.} \quad \mathcal{P}_{\Omega}(\mathcal{X}) = \mathcal{P}_{\Omega}(\mathcal{O}), \tag{5}$$

where $\|\mathcal{X}\|_{\mathrm{TNN}}$ is the TNN of the tensor $\mathcal{X}$ and is defined as $\|\mathcal{X}\|_{\mathrm{TNN}} = \sum_{k=1}^{n_3} \|\bar{X}^{(k)}\|_{*}$. The TNN model has shown its capability of characterizing the overall structure of multi-dimensional data [18, 19].

Although enhancing the global low-rankness has shown its effectiveness for tensor completion, these methods suffer from two drawbacks. Firstly, much real-world multi-dimensional imaging data maintains not only the global correlation but also abundant details. The details of the data are unavoidably erased when minimizing the TNN or other tensor rank relaxations. Secondly, when the sampling rate is extremely low, the observed entries are not sufficient to support the recovery of the whole data. This phenomenon can be observed in Fig. 1. The results by HaLRTC [12], which minimizes the SNN, and by TNN [20] are of low quality when the sampling rate is 5%.

Therefore, as compensation, many LRTC methods have also taken additional local/non-local prior knowledge into consideration for better reconstruction performance on multi-dimensional imaging data. For example, local continuity/smoothness has received much attention in [21, 22, 23, 24, 25]. Particularly, within the t-SVD framework, Jiang et al. [23] proposed to incorporate an anisotropic total variation into tensor completion, which focuses on exploiting the local information of the piecewise smooth structures in the spatial domain. Meanwhile, many methods utilize the abundant non-local self-similarity [2] and obtain outstanding results when handling regular and repetitive patterns.

In this paper, instead of investing efforts in designing hand-crafted regularizers to introduce additional prior knowledge, we employ the plug-and-play (PnP) prior framework [26, 27, 28, 29, 30, 31, 32] and add an implicit regularizer expressing the image prior to (5). After variable splitting by the alternating direction method of multipliers (ADMM), we directly use an off-the-shelf convolutional neural network (CNN) based single image denoiser, i.e., FFDNet, to solve the prior-associated subproblem. Three facts motivate us to adopt the CNN denoiser, which was originally designed for single image Gaussian noise removal. Firstly, from Fig. 2, we can see that the histogram of the residual between the ground truth and the result recovered by TNN is consistent with a Gaussian distribution. Secondly, the effectiveness of the denoising-prior-based PnP framework has been validated in many single image inverse problems, such as deblurring [33], inpainting [34], and super-resolution [35]. The CNN denoiser is believed to express the image prior learned from a large number of natural images, with very efficient inference on GPUs. Thirdly, it is noteworthy that multi-dimensional imaging data consists of single images; for example, each frame of a video is indeed an image. Thus, we believe that bringing the image prior into the LRTC model would give rise to promising performance.

To sum up, there are two regularizers in our tensor completion model: the low-rank part and the deep PnP prior part. These two regularizers are organically combined and complement each other. On the one hand, the TNN guarantees the global low-rankness, which compensates for the CNN denoiser's deficiency that its receptive field cannot cover the whole data of arbitrary size. On the other hand, the CNN denoiser brings in the external image prior and helps preserve the details.

Actually, following the research line in [36], we could also unroll the ADMM iterations into a CNN architecture and conduct end-to-end denoising-prior-driven tensor completion. However, the network in [36] for a single image is already very large, and the scale of a similar CNN architecture for multi-dimensional imaging data would be too large to be handled efficiently. Therefore, the optimization-CNN hybrid structure in our model is a sensible choice. Meanwhile, to the best of our knowledge, this is the first attempt at introducing the CNN denoising prior into the tensor completion task.

The rest of this paper is organized as follows. Section II presents some preliminary knowledge, i.e., the t-SVD, FFDNet, and the PnP framework. Section III gives the CNN-based learning prior tensor completion model and the corresponding solving algorithm. Section IV evaluates the performance of the proposed method and compares the results with state-of-the-art competing methods. Section V discusses some details about DPLR. Section VI concludes this paper.

II Preliminaries

II-A Notation

In this subsection, we give the basic notations and briefly introduce some definitions. We denote vectors as bold lowercase letters (e.g., $\mathbf{x}$), matrices as uppercase letters (e.g., $X$), and tensors as calligraphic letters (e.g., $\mathcal{X}$). For a third-order tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, with the MATLAB notation, we denote its $(i,j,k)$-th element as $x_{ijk}$ or $\mathcal{X}(i,j,k)$, and its mode-1, mode-2, and mode-3 fibers as $\mathcal{X}(:,j,k)$, $\mathcal{X}(i,:,k)$, and $\mathcal{X}(i,j,:)$, respectively. We use $\mathcal{X}(i,:,:)$, $\mathcal{X}(:,j,:)$, and $\mathcal{X}(:,:,k)$ to denote the $i$-th horizontal, $j$-th lateral, and $k$-th frontal slices of $\mathcal{X}$, respectively. More compactly, $X^{(k)}$ and the tube $\mathbf{x}_{ij}$ are used to represent the frontal slice $\mathcal{X}(:,:,k)$ and the mode-3 fiber $\mathcal{X}(i,j,:)$, respectively. The Frobenius norm of $\mathcal{X}$ is defined as $\|\mathcal{X}\|_F = \sqrt{\sum_{i,j,k} |x_{ijk}|^2}$. We use $\bar{\mathcal{X}}$ to denote the tensor generated by performing the discrete Fourier transformation (DFT) along each tube of $\mathcal{X}$, i.e., $\bar{\mathcal{X}} = \mathrm{fft}(\mathcal{X}, [\,], 3)$ in MATLAB notation.
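In NumPy, the mode-3 DFT $\bar{\mathcal{X}} = \mathrm{fft}(\mathcal{X}, [\,], 3)$ corresponds to an FFT along the third axis; a small sketch of this convention (our illustration, with variable names of our choosing):

```python
# The mode-3 DFT used throughout: an FFT along each tube (third axis).
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5, 8))
X_bar = np.fft.fft(X, axis=2)                           # bar{X} = fft(X, [], 3)
assert np.allclose(np.fft.ifft(X_bar, axis=2).real, X)  # the DFT is invertible
```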

II-B t-SVD and TNN

For a third-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, the block circulation operation [37] is defined as

$$\operatorname{bcirc}(\mathcal{A}) := \begin{bmatrix} A^{(1)} & A^{(n_3)} & \cdots & A^{(2)} \\ A^{(2)} & A^{(1)} & \cdots & A^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ A^{(n_3)} & A^{(n_3-1)} & \cdots & A^{(1)} \end{bmatrix} \in \mathbb{R}^{n_1 n_3 \times n_2 n_3}.$$

The block diagonalization operation and its inverse operation are defined as

$$\operatorname{bdiag}(\mathcal{A}) := \begin{bmatrix} A^{(1)} & & \\ & \ddots & \\ & & A^{(n_3)} \end{bmatrix}, \quad \operatorname{bdfold}(\operatorname{bdiag}(\mathcal{A})) := \mathcal{A}.$$

The block vectorization operation and its inverse operation are defined as

$$\operatorname{unfold}(\mathcal{A}) := \begin{bmatrix} A^{(1)} \\ A^{(2)} \\ \vdots \\ A^{(n_3)} \end{bmatrix}, \quad \operatorname{fold}(\operatorname{unfold}(\mathcal{A})) := \mathcal{A}.$$

Definition 1 (t-product [16])

The t-product between two third-order tensors $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{B} \in \mathbb{R}^{n_2 \times n_4 \times n_3}$ is defined as

$$\mathcal{A} * \mathcal{B} := \operatorname{fold}\big(\operatorname{bcirc}(\mathcal{A}) \cdot \operatorname{unfold}(\mathcal{B})\big) \in \mathbb{R}^{n_1 \times n_4 \times n_3}.$$

There is an important property that a block circulant matrix can be block diagonalized [13]:

$$(F_{n_3} \otimes I_{n_1}) \cdot \operatorname{bcirc}(\mathcal{A}) \cdot (F_{n_3}^{-1} \otimes I_{n_2}) = \operatorname{bdiag}(\bar{\mathcal{A}}), \tag{6}$$

where $\otimes$ denotes the Kronecker product, $F_{n_3}$ is the $n_3 \times n_3$ DFT matrix, and $I_n$ is the $n \times n$ identity matrix. With equation (6), the t-product in the spatial domain corresponds to the matrix multiplication of the frontal slices in the Fourier domain:

$$\overline{(\mathcal{A} * \mathcal{B})}^{(k)} = \bar{A}^{(k)} \bar{B}^{(k)}, \quad k = 1, \ldots, n_3, \tag{7}$$

which greatly simplifies the process of the corresponding algorithms.
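A minimal sketch of the t-product computed through (7), i.e., slice-wise matrix products in the Fourier domain (our illustration; function and variable names are ours):

```python
# t-product via (7): FFT along mode 3, frontal-slice products, inverse FFT.
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) and B (n2 x n4 x n3)."""
    n1, _, n3 = A.shape
    n4 = B.shape[1]
    A_bar = np.fft.fft(A, axis=2)
    B_bar = np.fft.fft(B, axis=2)
    C_bar = np.empty((n1, n4, n3), dtype=complex)
    for k in range(n3):                      # matrix product per Fourier slice
        C_bar[:, :, k] = A_bar[:, :, k] @ B_bar[:, :, k]
    return np.real(np.fft.ifft(C_bar, axis=2))
```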

Definition 2 (special tensors [16])

The conjugate transpose of a third-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, denoted as $\mathcal{A}^H \in \mathbb{R}^{n_2 \times n_1 \times n_3}$, is the tensor obtained by conjugate transposing each of the frontal slices and then reversing the order of the transposed frontal slices 2 through $n_3$. The identity tensor $\mathcal{I}$ is the tensor whose first frontal slice is the identity matrix and whose other frontal slices are all zeros. A third-order tensor $\mathcal{Q}$ is orthogonal if $\mathcal{Q}^H * \mathcal{Q} = \mathcal{Q} * \mathcal{Q}^H = \mathcal{I}$. A third-order tensor is f-diagonal if each of its frontal slices is a diagonal matrix.

Theorem 1 (t-SVD [16])

Let $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ be a third-order tensor; then it can be factored as

$$\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^H,$$

where $\mathcal{U} \in \mathbb{R}^{n_1 \times n_1 \times n_3}$ and $\mathcal{V} \in \mathbb{R}^{n_2 \times n_2 \times n_3}$ are orthogonal tensors, and $\mathcal{S} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is an f-diagonal tensor.

The t-SVD can be efficiently obtained by computing a series of matrix SVDs in the Fourier domain, as in the sketch below. The corresponding tensor tubal rank and multi-rank are defined next.
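The following minimal sketch (ours, assuming real input tensors) computes the t-SVD exactly this way:

```python
# t-SVD via Theorem 1: a matrix SVD for each frontal slice in the Fourier
# domain, then an inverse FFT back to the spatial domain.
import numpy as np

def t_svd(A):
    """Return U, S, V with A = U * S * V^H, where S is f-diagonal."""
    n1, n2, n3 = A.shape
    r = min(n1, n2)
    A_bar = np.fft.fft(A, axis=2)
    U_bar = np.zeros((n1, n1, n3), dtype=complex)
    S_bar = np.zeros((n1, n2, n3), dtype=complex)
    V_bar = np.zeros((n2, n2, n3), dtype=complex)
    for k in range(n3):
        u, s, vh = np.linalg.svd(A_bar[:, :, k])
        U_bar[:, :, k] = u
        S_bar[np.arange(r), np.arange(r), k] = s   # singular values on the diagonal
        V_bar[:, :, k] = vh.conj().T
    ifft = lambda T: np.real(np.fft.ifft(T, axis=2))
    return ifft(U_bar), ifft(S_bar), ifft(V_bar)
```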

Definition 3 (tensor tubal rank and multi-rank [17])

Let $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ be a third-order tensor. The tensor multi-rank of $\mathcal{A}$ is a vector $\operatorname{rank}_m(\mathcal{A}) = \big(\operatorname{rank}(\bar{A}^{(1)}), \ldots, \operatorname{rank}(\bar{A}^{(n_3)})\big)$, whose $k$-th element is the rank of the $k$-th frontal slice of $\bar{\mathcal{A}}$, where $k = 1, \ldots, n_3$. The tubal rank of $\mathcal{A}$, denoted as $\operatorname{rank}_t(\mathcal{A})$, is defined as the number of non-zero tubes of $\mathcal{S}$, where $\mathcal{S}$ comes from the t-SVD of $\mathcal{A}$: $\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^H$. That is, $\operatorname{rank}_t(\mathcal{A}) = \max_k \operatorname{rank}(\bar{A}^{(k)})$.

Definition 4 (tensor nuclear norm (TNN) [17])

The tensor nuclear norm of a tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, denoted as $\|\mathcal{A}\|_{\mathrm{TNN}}$, is defined as the sum of the singular values of all the frontal slices of $\bar{\mathcal{A}}$, i.e.,

$$\|\mathcal{A}\|_{\mathrm{TNN}} := \sum_{k=1}^{n_3} \|\bar{A}^{(k)}\|_{*},$$

where $\bar{A}^{(k)}$ is the $k$-th frontal slice of $\bar{\mathcal{A}}$, and $k = 1, \ldots, n_3$.
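Under Definition 4, the TNN can be evaluated directly from the Fourier-domain slices; a minimal sketch (ours):

```python
# TNN per Definition 4: the sum of the singular values of all frontal
# slices of the tensor in the Fourier domain.
import numpy as np

def tnn(A):
    A_bar = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(A_bar[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2]))
```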

II-C CNN Denoiser and PnP Framework

Recently, the emergence of CNNs has dramatically influenced the field of image processing, and CNNs have shown remarkable performance in different image restoration tasks. A CNN can exploit spatially local correlation by enforcing a local connectivity pattern between neurons of adjacent layers [38]. Most CNN-based methods directly learn mapping functions from low-quality images to the desired high-quality images, which can be considered as minimizing a loss function with the help of a data-driven prior. Compared with carefully hand-crafted priors, e.g., TV and framelets, the prior from a CNN has an implicit form and may be stronger [39]. Recent studies on PnP ADMM show that any off-the-shelf denoising method can be directly utilized as a prior in place of the ADMM step that calculates the proximal operator of the specified regularizer [26, 40, 41]. Discriminative learning methods have shown better performance than model-based optimization methods in most image denoising problems. However, most learning methods need to learn multiple models for different noise levels. For flexible, fast, and effective image denoising, Zhang et al. [42] proposed a discriminative learning method called FFDNet. FFDNet takes a tunable noise level map $M$ as an additional input to make the denoising model flexible to noise levels, which is formulated as $\hat{x} = \mathcal{F}(y, M; \Theta)$, where the model parameters $\Theta$ are invariant to the noise level. To improve efficiency, FFDNet introduces a reversible downsampling operator that reshapes an input image of size $W \times H \times C$ into four downsampled sub-images of size $\frac{W}{2} \times \frac{H}{2}$, stacked into a $\frac{W}{2} \times \frac{H}{2} \times 4C$ input (a sketch follows below). Moreover, to robustly control the trade-off between noise reduction and detail preservation, FFDNet adopts the orthogonal initialization method for the convolution filters. The experimental results demonstrate that FFDNet produces state-of-the-art performance in image denoising. By applying the FFDNet denoiser in the PnP framework, the proposed method can achieve excellent performance on the tensor completion problem.
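The reversible downsampling can be sketched as a pixel-unshuffle; the following is our illustration of the idea, not FFDNet's actual implementation:

```python
# A sketch of FFDNet-style reversible downsampling: a W x H x C image is
# rearranged into four W/2 x H/2 sub-images, stacked into W/2 x H/2 x 4C.
import numpy as np

def pixel_unshuffle(img):
    """Assumes the spatial dimensions W and H are even."""
    return np.concatenate([img[0::2, 0::2], img[0::2, 1::2],
                           img[1::2, 0::2], img[1::2, 1::2]], axis=2)

def pixel_shuffle(sub):
    """Inverse of pixel_unshuffle, recovering the original image."""
    W2, H2, C4 = sub.shape
    C = C4 // 4
    img = np.empty((2 * W2, 2 * H2, C))
    img[0::2, 0::2], img[0::2, 1::2] = sub[..., :C], sub[..., C:2 * C]
    img[1::2, 0::2], img[1::2, 1::2] = sub[..., 2 * C:3 * C], sub[..., 3 * C:]
    return img
```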

III The Proposed Model and Algorithm

We combine the FFDNet denoiser prior with the TNN model to formulate the DPLR model as:

$$\min_{\mathcal{X}} \ \|\mathcal{X}\|_{\mathrm{TNN}} + \lambda \, \Phi(\mathcal{X}) \quad \mathrm{s.t.} \quad \mathcal{P}_{\Omega}(\mathcal{X}) = \mathcal{P}_{\Omega}(\mathcal{O}), \tag{8}$$

where $\Phi(\cdot)$ is the implicit FFDNet denoiser prior and $\lambda$ is a trade-off parameter. The DPLR model (8) has two regularization terms. The term $\|\mathcal{X}\|_{\mathrm{TNN}}$ is the low-rank prior, which preserves the spatial relationship among entries; with its help, the DPLR model can better capture the global information of the underlying tensor $\mathcal{X}$. The other term is the denoiser regularization term.

After denoting the indicator function of the feasible set $\mathbb{S} = \{\mathcal{X} : \mathcal{P}_{\Omega}(\mathcal{X}) = \mathcal{P}_{\Omega}(\mathcal{O})\}$ as

$$\iota_{\mathbb{S}}(\mathcal{X}) := \begin{cases} 0, & \mathcal{X} \in \mathbb{S}, \\ +\infty, & \text{otherwise}, \end{cases} \tag{9}$$

and introducing two auxiliary variables $\mathcal{Y}$ and $\mathcal{Z}$, we consider the augmented Lagrangian function of (8):

$$L(\mathcal{X}, \mathcal{Y}, \mathcal{Z}, \Lambda_1, \Lambda_2) = \|\mathcal{Y}\|_{\mathrm{TNN}} + \lambda \, \Phi(\mathcal{Z}) + \iota_{\mathbb{S}}(\mathcal{X}) + \frac{\beta}{2} \left\| \mathcal{Y} - \mathcal{X} + \frac{\Lambda_1}{\beta} \right\|_F^2 + \frac{\beta}{2} \left\| \mathcal{Z} - \mathcal{X} + \frac{\Lambda_2}{\beta} \right\|_F^2 + C, \tag{10}$$

where $\Lambda_1$ and $\Lambda_2$ are the Lagrangian multipliers, $\beta > 0$ is the penalty parameter, and $C$ collects the terms independent of $(\mathcal{X}, \mathcal{Y}, \mathcal{Z})$.

According to the framework of ADMM, the solution of (8) can be found by solving a sequence of subproblems.

In Step 1, we need to solve the $(\mathcal{Y}, \mathcal{Z})$-subproblem. Since the variables $\mathcal{Y}$ and $\mathcal{Z}$ are decoupled, their optimal solutions can be calculated separately.

1) The $\mathcal{Y}$-subproblem can be written as:

$$\mathcal{Y}^{k+1} = \arg\min_{\mathcal{Y}} \ \|\mathcal{Y}\|_{\mathrm{TNN}} + \frac{\beta}{2} \left\| \mathcal{Y} - \mathcal{X}^{k} + \frac{\Lambda_1^{k}}{\beta} \right\|_F^2. \tag{11}$$

Let $\mathcal{U} * \mathcal{S} * \mathcal{V}^H$ be the t-SVD of $\mathcal{X}^{k} - \Lambda_1^{k}/\beta$ and $\bar{\mathcal{S}}$ be the result of the DFT of $\mathcal{S}$ along the tubes. Then each entry of the singular tubes of $\bar{\mathcal{S}}$ is multiplied by $\big(1 - \frac{1/\beta}{\bar{\mathcal{S}}(i,i,k)}\big)_{+}$ [43], where "$(\cdot)_{+}$" denotes keeping the positive part. In other words, the closed-form solution of problem (11) can be obtained by the tensor singular value thresholding (SVT) operator as

$$\mathcal{Y}^{k+1} = \mathcal{U} * \mathcal{S}_{1/\beta} * \mathcal{V}^H, \tag{12}$$

where $\mathcal{S}_{1/\beta}$ is an f-diagonal tensor whose frontal slices in the Fourier domain are $\bar{S}_{1/\beta}^{(k)} = \max(\bar{S}^{(k)} - 1/\beta, 0)$. A minimal sketch is given below.
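The sketch below (ours) implements the tensor SVT operator in (12); `tau` plays the role of $1/\beta$:

```python
# Tensor SVT for (12): soft-threshold the singular values of each frontal
# slice in the Fourier domain, then transform back.
import numpy as np

def t_svt(A, tau):
    A_bar = np.fft.fft(A, axis=2)
    Y_bar = np.empty_like(A_bar)
    for k in range(A.shape[2]):
        u, s, vh = np.linalg.svd(A_bar[:, :, k], full_matrices=False)
        Y_bar[:, :, k] = (u * np.maximum(s - tau, 0.0)) @ vh  # u diag((s - tau)_+) vh
    return np.real(np.fft.ifft(Y_bar, axis=2))
```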

2) The $\mathcal{Z}$-subproblem is

$$\mathcal{Z}^{k+1} = \arg\min_{\mathcal{Z}} \ \lambda \, \Phi(\mathcal{Z}) + \frac{\beta}{2} \left\| \mathcal{Z} - \mathcal{X}^{k} + \frac{\Lambda_2^{k}}{\beta} \right\|_F^2. \tag{13}$$

Letting $\sigma = \sqrt{\lambda/\beta}$, (13) can be rewritten as

$$\mathcal{Z}^{k+1} = \arg\min_{\mathcal{Z}} \ \frac{1}{2\sigma^2} \left\| \mathcal{Z} - \left( \mathcal{X}^{k} - \frac{\Lambda_2^{k}}{\beta} \right) \right\|_F^2 + \Phi(\mathcal{Z}). \tag{14}$$

Treating $\mathcal{X}^{k} - \Lambda_2^{k}/\beta$ as the "noisy" image, (14) can be regarded as minimizing the residual between the "noisy" image and the "clean" image using the prior $\Phi(\cdot)$. With this idea, in the PnP framework, we utilize FFDNet as the denoiser to solve the related subproblem. Letting $\mathcal{X}^{k} - \Lambda_2^{k}/\beta$ be the input of FFDNet with noise level $\sigma$, we get

$$\mathcal{Z}^{k+1} = \mathrm{FFDNet}\left( \mathcal{X}^{k} - \frac{\Lambda_2^{k}}{\beta}, \ \sigma \right). \tag{15}$$
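In code, the $\mathcal{Z}$-update reduces to a single denoiser call. The sketch below assumes a hypothetical wrapper `ffdnet_denoise(img, sigma)` around a pretrained FFDNet; any Gaussian denoiser with a tunable noise level would fit the same PnP slot:

```python
# Z-update (15): denoise the "noisy" tensor X - Lambda2 / beta at noise
# level sigma = sqrt(lambda / beta) with a plugged-in denoiser.
import numpy as np

def update_Z(X, Lambda2, beta, lam, ffdnet_denoise):
    noisy = X - Lambda2 / beta           # the "noisy" image in (14)
    sigma = np.sqrt(lam / beta)          # effective noise level
    return ffdnet_denoise(noisy, sigma)  # hypothetical FFDNet wrapper
```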

In Step 2, we need to solve the $\mathcal{X}$-subproblem:

$$\mathcal{X}^{k+1} = \arg\min_{\mathcal{X}} \ \iota_{\mathbb{S}}(\mathcal{X}) + \frac{\beta}{2} \left\| \mathcal{Y}^{k+1} - \mathcal{X} + \frac{\Lambda_1^{k}}{\beta} \right\|_F^2 + \frac{\beta}{2} \left\| \mathcal{Z}^{k+1} - \mathcal{X} + \frac{\Lambda_2^{k}}{\beta} \right\|_F^2. \tag{16}$$

It is easy to solve (16) as:

$$\mathcal{X}^{k+1} = \mathcal{P}_{\Omega}(\mathcal{O}) + \mathcal{P}_{\Omega^c}\left( \frac{1}{2} \left( \mathcal{Y}^{k+1} + \mathcal{Z}^{k+1} + \frac{\Lambda_1^{k} + \Lambda_2^{k}}{\beta} \right) \right), \tag{17}$$

where $\Omega^c$ denotes the complementary set of $\Omega$.
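With the boolean-mask convention of the earlier sketch, (17) is a one-line projection (ours):

```python
# X-update (17): keep the observed entries of O on Omega; elsewhere, average
# the two auxiliary variables together with their scaled multipliers.
import numpy as np

def update_X(Y, Z, Lambda1, Lambda2, beta, O, omega):
    avg = 0.5 * (Y + Z + (Lambda1 + Lambda2) / beta)
    return np.where(omega, O, avg)
```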

In Step 3, we update the multipliers $\Lambda_1$ and $\Lambda_2$ as:

$$\Lambda_1^{k+1} = \Lambda_1^{k} + \beta \left( \mathcal{Y}^{k+1} - \mathcal{X}^{k+1} \right), \quad \Lambda_2^{k+1} = \Lambda_2^{k} + \beta \left( \mathcal{Z}^{k+1} - \mathcal{X}^{k+1} \right). \tag{18}$$

The overall algorithm is summarized in Algorithm 1.

1: Input: the observed tensor $\mathcal{O}$; the index set $\Omega$ of the observed entries; the parameters $\lambda$ and $\beta$.
2: Initialization: $\mathcal{X}^0 = \mathcal{P}_{\Omega}(\mathcal{O})$, $\Lambda_1^0 = \Lambda_2^0 = 0$; max iteration number $K$.
3: while not converged and $k \leq K$ do
4:     Update $\mathcal{Y}^{k+1}$ by (12),
5:     Update $\mathcal{Z}^{k+1}$ by (15),
6:     Update $\mathcal{X}^{k+1}$ by (17),
7:     Update the multipliers $\Lambda_1^{k+1}$ and $\Lambda_2^{k+1}$ by (18).
8: end while
9: Output: the recovered tensor $\mathcal{X}$.
Algorithm 1 The ADMM algorithm for solving (8)
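Putting the pieces together, a minimal sketch of Algorithm 1 under the splitting assumed above, reusing `t_svt`, `update_Z`, and `update_X` from the earlier sketches (`ffdnet_denoise` remains a hypothetical wrapper, and the tolerance and iteration defaults are illustrative):

```python
# A sketch of Algorithm 1: ADMM with a t-SVT step, a PnP denoising step,
# a projection step, and multiplier updates, stopped by relative change.
import numpy as np

def dplr_admm(O, omega, lam, beta, ffdnet_denoise, max_iter=100, tol=1e-4):
    X = np.where(omega, O, 0.0)               # initialize with observed entries
    L1 = np.zeros_like(X)
    L2 = np.zeros_like(X)
    for _ in range(max_iter):
        Y = t_svt(X - L1 / beta, 1.0 / beta)            # step (12)
        Z = update_Z(X, L2, beta, lam, ffdnet_denoise)  # step (15)
        X_new = update_X(Y, Z, L1, L2, beta, O, omega)  # step (17)
        L1 += beta * (Y - X_new)                        # step (18)
        L2 += beta * (Z - X_new)
        rel_cha = np.linalg.norm(X_new - X) / max(np.linalg.norm(X), 1e-12)
        X = X_new
        if rel_cha < tol:                               # RelCha stopping criterion
            break
    return X
```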

IV Numerical Experiments

In this section, the performance of DPLR is first evaluated by experiments on color images. Although the FFDNet prior is trained on natural color images, DPLR extends well to real-world videos and MSI data. DPLR is compared with the baseline method HaLRTC [12], the TNN-based method [17], and TNN-3DTV [23].

The peak signal-to-noise ratio (PSNR) in dB and the structural similarity index (SSIM) are chosen as the performance evaluation indices. Since the data are third-order, the mean PSNR and mean SSIM over all bands or frames are reported. The relative change (RelCha) is adopted as the stopping criterion of all methods, which is defined as

$$\mathrm{RelCha} = \frac{\left\| \mathcal{X}^{k+1} - \mathcal{X}^{k} \right\|_F}{\left\| \mathcal{X}^{k} \right\|_F}.$$

In all experiments, the tolerance is set to a fixed small value. All parameters involved in the different methods are manually selected from a candidate set to achieve the highest PSNR value.

All the experiments are implemented on Windows 10 and MATLAB (R2018a) with an Intel(R) Core(TM) i5-4590 CPU at 3.30 GHz, 16 GB RAM, and an NVIDIA GeForce GTX 1060 6GB GPU.

IV-A Color Image Completion and Inpainting

Image | SR | PSNR: HaLRTC / TNN / TNN-3DTV / DPLR | SSIM: HaLRTC / TNN / TNN-3DTV / DPLR
Starfish | 10% | 18.48 / 19.47 / 22.59 / 28.06 | 0.3617 / 0.3007 / 0.6345 / 0.8270
Starfish | 20% | 22.27 / 22.55 / 24.85 / 31.59 | 0.5476 / 0.5052 / 0.7519 / 0.9143
Starfish | 30% | 24.66 / 25.71 / 26.52 / 34.49 | 0.6883 / 0.6685 / 0.8217 / 0.9533
Airplane | 10% | 20.91 / 21.94 / 23.46 / 31.92 | 0.6706 / 0.6623 / 0.8414 / 0.9471
Airplane | 20% | 24.81 / 25.17 / 25.83 / 37.68 | 0.8279 / 0.8242 / 0.9220 / 0.9709
Airplane | 30% | 27.20 / 27.90 / 28.45 / 39.42 | 0.9060 / 0.9014 / 0.9531 / 0.9766
Baboon | 10% | 17.43 / 17.83 / 19.56 / 22.90 | 0.4077 / 0.3959 / 0.5835 / 0.7798
Baboon | 20% | 19.34 / 20.07 / 20.67 / 24.60 | 0.5974 / 0.6005 / 0.7271 / 0.8433
Baboon | 30% | 21.17 / 21.66 / 21.93 / 25.86 | 0.7308 / 0.7346 / 0.8083 / 0.9118
Fruits | 10% | 20.73 / 20.72 / 24.81 / 31.21 | 0.6046 / 0.5646 / 0.8363 / 0.9362
Fruits | 20% | 24.23 / 24.23 / 27.31 / 34.67 | 0.7777 / 0.7510 / 0.9122 / 0.9621
Fruits | 30% | 26.90 / 26.99 / 29.22 / 35.97 | 0.8689 / 0.8541 / 0.9434 / 0.9723
Lena | 10% | 21.43 / 21.89 / 25.96 / 31.01 | 0.6415 / 0.6177 / 0.8396 / 0.9241
Lena | 20% | 24.98 / 25.68 / 28.41 / 33.17 | 0.8034 / 0.7888 / 0.9069 / 0.9423
Lena | 30% | 27.71 / 28.06 / 30.07 / 36.91 | 0.8844 / 0.8719 / 0.9380 / 0.9575
Watch | 10% | 22.47 / 23.01 / 26.27 / 34.47 | 0.7128 / 0.7490 / 0.8863 / 0.9825
Watch | 20% | 25.64 / 26.61 / 28.46 / 37.63 | 0.8641 / 0.8923 / 0.9466 / 0.9888
Watch | 30% | 28.37 / 29.70 / 30.31 / 41.15 | 0.9332 / 0.9502 / 0.9709 / 0.9971
Opera | 10% | 24.23 / 25.05 / 25.56 / 31.18 | 0.7499 / 0.7486 / 0.8075 / 0.9300
Opera | 20% | 27.55 / 28.24 / 29.13 / 34.35 | 0.8649 / 0.8734 / 0.9132 / 0.9628
Opera | 30% | 29.80 / 30.89 / 31.14 / 36.99 | 0.9188 / 0.9323 / 0.9447 / 0.9812
Water | 10% | 20.20 / 21.00 / 22.57 / 27.56 | 0.5860 / 0.6126 / 0.7756 / 0.9178
Water | 20% | 22.75 / 23.37 / 24.29 / 29.14 | 0.7790 / 0.6685 / 0.8822 / 0.9472
Water | 30% | 24.67 / 25.76 / 25.88 / 31.96 | 0.8767 / 0.8922 / 0.9302 / 0.9795
Average | 10% | 20.74 / 21.36 / 23.85 / 29.79 | 0.5919 / 0.5814 / 0.7756 / 0.9056
Average | 20% | 23.95 / 24.49 / 26.12 / 32.85 | 0.7578 / 0.7380 / 0.8703 / 0.9415
Average | 30% | 26.31 / 27.08 / 27.94 / 35.34 | 0.8509 / 0.8507 / 0.9138 / 0.9662
TABLE I: Quantitative comparison of the results by different methods on color images (PSNR in dB; SR is the sampling rate). The best and second best values are highlighted in bold and underline, respectively.

Fig. 3: The recovered color images by HaLRTC, TNN, TNN-3DTV, and DPLR (panels, left to right: Observed, HaLRTC, TNN, TNN-3DTV, DPLR, Ground truth).

In this experiment, the different methods are applied to 8 color images, e.g., Starfish, Airplane, Baboon, Fruits, etc. The observed incomplete tensors are generated by randomly sampling elements (the same holds in the video and MSI completion experiments). Each image is composed of red, green, and blue channels and rescaled to a common size. For each image, we test color image completion with the sampling rates 10%, 20%, and 30%. For color images, we directly use the network trained for color images to perform DPLR.

Tab. I presents the PSNR and SSIM values of the recovered results on the test color images by the different methods. It shows that the TNN-based method obtains better performance than HaLRTC, and that TNN-3DTV achieves the second highest PSNR and SSIM values with the power of the TV prior. Meanwhile, DPLR gets the highest PSNR and SSIM values: the margins between the results by DPLR and TNN-3DTV exceed 3 dB in PSNR, with correspondingly large gains in SSIM.

For visualization, we display the recovered images in Fig. 3. It is easy to observe that the images recovered by the proposed method are clearer and more similar to the original images than those of the compared methods.

Fig. 4: The inpainting results for color images with different masks by HaLRTC, TNN, TNN-3DTV, and DPLR (panels, left to right: Observed, HaLRTC, TNN, TNN-3DTV, DPLR, Ground truth).

Different from random sampling, we also evaluate the performance of the different methods under manual sampling. Fig. 4 displays the inpainting results by the different methods on 3 color images: Fruits, Lena, and Sailboat. These three images are painted with three different kinds of masks: letters, graffiti, and grids, respectively. It is easy to observe that the proposed method is superior to the others: HaLRTC and TNN leave much of the masks visible, and TNN-3DTV blurs many details around the masks, while DPLR gives the clearest results in the inpainting experiment.

IV-B Video Completion

Video | SR | PSNR: HaLRTC / TNN / TNN-3DTV / DPLR | SSIM: HaLRTC / TNN / TNN-3DTV / DPLR
Akiyo | 5% | 20.18 / 28.33 / 28.76 / 30.59 | 0.5976 / 0.8630 / 0.8949 / 0.9294
Akiyo | 10% | 23.64 / 31.16 / 31.97 / 33.33 | 0.7340 / 0.9272 / 0.9452 / 0.9623
Akiyo | 20% | 27.54 / 34.82 / 36.06 / 36.96 | 0.8642 / 0.9674 / 0.9777 / 0.9816
Suzie | 5% | 20.84 / 26.37 / 27.39 / 29.37 | 0.5944 / 0.7203 / 0.7989 / 0.8373
Suzie | 10% | 24.40 / 28.41 / 29.27 / 31.80 | 0.7046 / 0.7976 / 0.8513 / 0.8899
Suzie | 20% | 28.12 / 31.14 / 31.95 / 34.75 | 0.8189 / 0.8739 / 0.9123 / 0.9358
Container | 5% | 19.81 / 26.27 / 26.63 / 26.87 | 0.6446 / 0.8305 / 0.8611 / 0.8695
Container | 10% | 22.26 / 29.53 / 30.05 / 30.44 | 0.7413 / 0.9023 / 0.9272 / 0.9230
Container | 20% | 25.56 / 33.65 / 34.69 / 34.82 | 0.8495 / 0.9553 / 0.9677 / 0.9654
News | 5% | 18.34 / 26.65 / 27.20 / 28.29 | 0.5610 / 0.8182 / 0.8738 / 0.9023
News | 10% | 21.47 / 30.31 / 30.56 / 31.96 | 0.6955 / 0.9106 / 0.9285 / 0.9473
News | 20% | 25.00 / 33.82 / 34.21 / 35.55 | 0.8256 / 0.9554 / 0.9665 / 0.9740
Bus | 5% | 15.79 / 18.30 / 18.36 / 20.78 | 0.3300 / 0.3331 / 0.3973 / 0.5932
Bus | 10% | 17.70 / 19.61 / 19.66 / 22.83 | 0.4174 / 0.4421 / 0.5083 / 0.7222
Bus | 20% | 19.86 / 21.74 / 21.91 / 25.48 | 0.5560 / 0.5993 / 0.6778 / 0.8359
Average | 5% | 18.99 / 25.18 / 25.67 / 27.18 | 0.5455 / 0.7130 / 0.7652 / 0.8263
Average | 10% | 21.89 / 27.80 / 28.30 / 30.07 | 0.6586 / 0.7960 / 0.8321 / 0.8889
Average | 20% | 25.22 / 31.03 / 31.76 / 33.51 | 0.7828 / 0.8703 / 0.9004 / 0.9385
TABLE II: Quantitative comparison of the results by different methods on videos (PSNR in dB; SR is the sampling rate). The best and second best values are highlighted in bold and underline, respectively.

Fig. 5: The 20-th frame of the recovered videos by HaLRTC, TNN, TNN-3DTV, and DPLR (panels, left to right: Observed, HaLRTC, TNN, TNN-3DTV, DPLR, Ground truth).

In this subsection, 5 videos of different sizes are chosen as the ground truth data. For video data, when the sampling rate is sufficiently high, all the methods achieve high performance; thus, we exhibit the recovered results with the sampling rates 5%, 10%, and 20%. Since the FFDNet denoiser can only handle third-order tensors whose third dimension is 1 or 3 bands, we unfold the video data into a matrix and then process it through the network trained for grayscale images (we do the same in the MSI completion experiments).

Tab. II lists the PSNR and SSIM values of the recovered results on the test videos by the different methods with different sampling rates. It is easy to observe that DPLR obtains the results with the highest performance evaluation indices. For visualization, we show the 20-th frame of the recovered videos in Fig. 5. We can see that the video recovered by DPLR is clearer than those of the other methods.

IV-C MSI Completion

In this subsection, we test 7 MSIs from the CAVE database, whose wavelengths range from 400 nm to 700 nm at an interval of 10 nm (31 bands).

Tab. III exhibits the PSNR and SSIM values of all results by the different methods. DPLR achieves the highest PSNR and SSIM values, while TNN-3DTV gets the second highest. Fig. 6 illustrates the pseudo-color images (composed of the 1st, 2nd, and 31st bands) of the results by the different methods. DPLR avoids the artifacts arising in the results by TNN and TNN-3DTV, and obtains the results that are the most similar to the ground truth.

MSI | SR | PSNR: HaLRTC / TNN / TNN-3DTV / DPLR | SSIM: HaLRTC / TNN / TNN-3DTV / DPLR
Balloons | 5% | 24.29 / 31.35 / 32.66 / 39.04 | 0.8232 / 0.8670 / 0.9381 / 0.9843
Balloons | 10% | 31.27 / 35.63 / 37.66 / 42.98 | 0.9242 / 0.9380 / 0.9735 / 0.9937
Balloons | 20% | 37.34 / 41.11 / 42.77 / 47.54 | 0.9737 / 0.9803 / 0.9904 / 0.9975
Beads | 5% | 16.50 / 19.86 / 21.27 / 22.53 | 0.3259 / 0.4569 / 0.6506 / 0.6912
Beads | 10% | 17.99 / 22.92 / 23.95 / 26.08 | 0.4191 / 0.6512 / 0.7907 / 0.8369
Beads | 20% | 21.34 / 27.61 / 27.78 / 30.20 | 0.6404 / 0.8420 / 0.9024 / 0.9319
CD | 5% | 22.19 / 26.51 / 29.35 / 30.22 | 0.8497 / 0.8029 / 0.9251 / 0.9597
CD | 10% | 25.98 / 29.54 / 31.92 / 35.30 | 0.9018 / 0.8871 / 0.9550 / 0.9790
CD | 20% | 30.85 / 33.22 / 36.04 / 39.35 | 0.9498 / 0.9447 / 0.9791 / 0.9890
Clay | 5% | 28.83 / 33.62 / 34.25 / 39.89 | 0.9224 / 0.9052 / 0.9487 / 0.9680
Clay | 10% | 35.78 / 38.22 / 39.66 / 43.80 | 0.9726 / 0.9657 / 0.9819 / 0.9907
Clay | 20% | 41.58 / 43.69 / 44.69 / 48.02 | 0.9895 / 0.9813 / 0.9890 / 0.9952
Face | 5% | 25.64 / 32.32 / 32.70 / 36.82 | 0.8346 / 0.8981 / 0.9381 / 0.9663
Face | 10% | 30.51 / 36.50 / 37.52 / 39.66 | 0.9125 / 0.9548 / 0.9736 / 0.9829
Face | 20% | 36.20 / 41.33 / 42.00 / 42.96 | 0.9664 / 0.9842 / 0.9778 / 0.9916
Feathers | 5% | 20.45 / 27.55 / 28.23 / 31.14 | 0.6565 / 0.7694 / 0.8712 / 0.9375
Feathers | 10% | 24.58 / 31.19 / 31.83 / 34.59 | 0.7925 / 0.8717 / 0.9308 / 0.9656
Feathers | 20% | 29.30 / 35.86 / 36.73 / 38.68 | 0.8995 / 0.9481 / 0.9726 / 0.9838
Flowers | 5% | 20.47 / 26.75 / 27.90 / 30.34 | 0.6293 / 0.7153 / 0.8288 / 0.8894
Flowers | 10% | 24.88 / 30.36 / 31.37 / 33.70 | 0.7605 / 0.8357 / 0.9025 / 0.9419
Flowers | 20% | 29.49 / 35.33 / 36.25 / 37.66 | 0.8804 / 0.9342 / 0.9621 / 0.9745
Average | 5% | 22.62 / 28.28 / 29.48 / 32.85 | 0.7202 / 0.7735 / 0.8715 / 0.9138
Average | 10% | 27.28 / 32.05 / 33.42 / 36.59 | 0.8119 / 0.8720 / 0.9297 / 0.9558
Average | 20% | 32.30 / 36.88 / 38.04 / 40.63 | 0.9000 / 0.9450 / 0.9676 / 0.9805
TABLE III: Quantitative comparison of the results by different methods on MSI data (PSNR in dB; SR is the sampling rate). The best and second best values are highlighted in bold and underline, respectively.

Fig. 6: The pseudo-color images (composed of the 1st, 2nd, and 31st bands) of the recovered MSI data by HaLRTC, TNN, TNN-3DTV, and DPLR (panels, left to right: Observed, HaLRTC, TNN, TNN-3DTV, DPLR, Ground truth).

V Discussion

V-A Convergence

Fig. 7: The convergence curves with respect to the iteration number on the video Akiyo for the sampling rates 5%, 10%, and 20%, respectively. The red dot in each curve denotes the actual iteration number at which DPLR stops.

The numerical experiments have shown the remarkable performance of DPLR. However, it is still an open question whether an algorithm with an implicit regularizer in the PnP framework has good convergence properties. In Fig. 7, we plot the relative change curves with respect to the iteration number for DPLR on the video Akiyo with the sampling rates 5%, 10%, and 20%, respectively. From Fig. 7, we can observe that the algorithm stops the iteration at step 94, which indicates the numerical convergence of DPLR.

V-B Parameter Analysis


Fig. 8: The PSNR and SSIM values of the results with respect to the penalty parameter $\beta$ and the noise level $\sigma$ on the color image Starfish, respectively.

In this subsection, we analyze the effect of the ADMM penalty parameter $\beta$ and the noise level $\sigma$. In Fig. 8, we show the PSNR and SSIM values of the results by DPLR on the color image Starfish with respect to $\beta$ and $\sigma$, respectively. We can see that DPLR attains its best performance at particular values of $\beta$ and $\sigma$.

V-C Not a Post-Processing

Fig. 9: The recovered results on the color image Starfish for the sampling rate 20%: (a) FFDNet (11.59 dB), (b) TNN (22.55 dB), (c) FFDNet applied after TNN (23.74 dB), and (d) DPLR (31.59 dB).

DPLR is not simply a post-processing of TNN but a systematic integration of TNN with FFDNet. In Fig. 9, we demonstrate the results on the color image Starfish recovered by FFDNet, TNN, FFDNet applied directly after TNN, and DPLR with the sampling rate 20%. FFDNet alone gets the worst performance, because it is trained for color image denoising, not tensor completion. The result obtained by directly performing FFDNet after TNN is slightly better than that by TNN; this direct post-processing does work, but the improvement is not remarkable. Meanwhile, the PSNR value of the result by DPLR is nearly 8 dB higher than that of the post-processing. This margin shows that DPLR is not a simple post-processing, and also confirms that using FFDNet to reduce the residual is the right choice.

VI Conclusions

In this paper, we proposed a hybrid tensor completion model, in which the TNN is utilized to capture the global information and a data-driven learned prior is used to express the local information. The proposed model thus combines a model-based optimization method with a CNN-based method, taking into consideration both the global tensor structure and detail preservation. An efficient ADMM algorithm is developed to solve the proposed model. Numerical experiments on different types of multi-dimensional imaging data illustrate the superiority of our method on the tensor completion problem.

References

  • [1] X. Zhang, “A nonconvex relaxation approach to low-rank tensor completion,” IEEE Transactions on Neural Networks and Learning Systems, 2018.
  • [2] Q. Xie, Q. Zhao, D. Meng, and Z. Xu, “Kronecker-basis-representation based tensor sparsity and its applications to tensor recovery,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 8, pp. 1888–1902, 2018.
  • [3] D. Yang and J. Sun, “BM3D-Net: A convolutional neural network for transform-domain collaborative filtering,” IEEE Signal Processing Letters, vol. 25, no. 1, pp. 55–59, 2018.
  • [4] Y. Wang, J. Peng, Q. Zhao, Y. Leung, X.-L. Zhao, and D. Meng, “Hyperspectral image restoration via total variation regularized low-rank tensor decomposition,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 4, pp. 1227–1243, 2018.
  • [5] X. Fu, W.-K. Ma, J. M. Bioucas-Dias, and T.-H. Chan, “Semiblind hyperspectral unmixing in the presence of spectral library mismatches,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 9, pp. 5171–5184, 2016.
  • [6] Y. Chang, L. Yan, H. Fang, and C. Luo, “Anisotropic spectral-spatial total variation model for multispectral remote sensing image destriping,” IEEE Transactions on Image Processing, vol. 24, no. 6, pp. 1852–1866, 2015.
  • [7] Q. Zhao, D. Meng, X. Kong, Q. Xie, W. Cao, Y. Wang, and Z. Xu, “A novel sparsity measure for tensor recovery,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 271–279.
  • [8] W. Cao, Y. Wang, J. Sun, D. Meng, C. Yang, A. Cichocki, and Z. Xu, “Total variation regularized tensor rpca for background subtraction from compressive measurements,” IEEE Transactions on Image Processing, vol. 25, no. 9, pp. 4075–4090, 2016.
  • [9] T. G. Kolda and B. W. Bader, “Tensor decompositions and applications,” SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
  • [10] L. R. Tucker, “Some mathematical notes on three-mode factor analysis,” Psychometrika, vol. 31, no. 3, pp. 279–311, 1966.
  • [11] N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis, and C. Faloutsos, “Tensor decomposition for signal processing and machine learning,” IEEE Transactions on Signal Processing, vol. 65, no. 13, pp. 3551–3582, 2017.
  • [12] J. Liu, P. Musialski, P. Wonka, and J. Ye, “Tensor completion for estimating missing values in visual data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 208–220, 2013.
  • [13] M. E. Kilmer and C. D. Martin, “Factorization strategies for third-order tensors,” Linear Algebra and its Applications, vol. 435, no. 3, pp. 641–658, 2011.
  • [14] K. Braman, “Third-order tensors as linear operators on a space of matrices,” Linear Algebra and its Applications, vol. 433, no. 7, pp. 1241–1253, 2010.
  • [15] C. D. Martin, R. Shafer, and B. LaRue, “An order-p tensor factorization with applications in imaging,” SIAM Journal on Scientific Computing, vol. 35, no. 1, pp. A474–A490, 2013.
  • [16] M. E. Kilmer, K. Braman, N. Hao, and R. C. Hoover, “Third-order tensors as operators on matrices: A theoretical and computational framework with applications in imaging,” SIAM Journal on Matrix Analysis and Applications, vol. 34, no. 1, pp. 148–172, 2013.
  • [17] Z. Zhang, G. Ely, S. Aeron, N. Hao, and M. Kilmer, “Novel methods for multilinear data completion and de-noising based on tensor-SVD,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3842–3849.
  • [18] C. Lu, J. Feng, Y. Chen, W. Liu, Z. Lin, and S. Yan, “Tensor robust principal component analysis: Exact recovery of corrupted low-rank tensors via convex optimization,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 5249–5257.
  • [19] O. Semerci, N. Hao, M. E. Kilmer, and E. L. Miller, “Tensor-based formulation and nuclear norm regularization for multienergy computed tomography,” IEEE Transactions on Image Processing, vol. 23, no. 4, pp. 1678–1693, 2014.
  • [20] Z. Zhang and S. Aeron, “Exact tensor completion using t-SVD,” IEEE Transactions on Signal Processing, vol. 65, pp. 1511–1526, 2017.
  • [21] T. Yokota, Q. Zhao, and A. Cichocki, “Smooth parafac decomposition for tensor completion,” IEEE Transactions on Signal Processing, vol. 64, no. 20, pp. 5423–5436, 2016.
  • [22] X. Li, Y. Ye, and X. Xu, “Low-rank tensor completion with total variation for visual data inpainting,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2017, pp. 2210–2216.
  • [23] F. Jiang, X.-Y. Liu, H. Lu, and R. Shen, “Anisotropic total variation regularized low-rank tensor completion based on tensor nuclear norm for color image inpainting,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2018, pp. 1363–1367.
  • [24] M. Imaizumi and K. Hayashi, “Tensor decomposition with smoothness,” in Proceedings of the 34th International Conference on Machine Learning-Volume 70, 2017, pp. 1597–1606.
  • [25] C. Paris, J. Bioucas-Dias, and L. Bruzzone, “A novel sharpening approach for superresolving multiresolution optical images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 3, pp. 1545–1560, 2019.
  • [26] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in Proceedings of the IEEE Global Conference on Signal and Information Processing, 2013, pp. 945–948.
  • [27] A. Danielyan, V. Katkovnik, and K. Egiazarian, “BM3D frames and variational image deblurring,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1715–1728, 2012.
  • [28] D. Zoran and Y. Weiss, “From learning models of natural image patches to whole image restoration,” in Proceedings of the IEEE International Conference on Computer Vision, 2011, pp. 479–486.
  • [29] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84–98, 2017.
  • [30] A. M. Teodoro, J. M. Bioucas-Dias, and M. A. Figueiredo, “A convergent image fusion algorithm using scene-adapted gaussian-mixture-based denoising,” IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 451–463, 2019.
  • [31] K. Zhang, W. Zuo, S. Gu, and L. Zhang, “Learning deep CNN denoiser prior for image restoration,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 3929–3938.
  • [32] X. Wang and S. H. Chan, “Parameter-free plug-and-play ADMM for image restoration,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2017, pp. 1323–1327.
  • [33] T. Tirer and R. Giryes, “Image restoration by iterative denoising and backward projections,” IEEE Transactions on Image Processing, vol. 28, no. 3, pp. 1220–1234, 2019.
  • [34] T. Meinhardt, M. Moller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 1781–1790.
  • [35] K. Zhang, W. Zuo, and L. Zhang, “Deep plug-and-play super-resolution for arbitrary blur kernels,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • [36] W. Dong, P. Wang, W. Yin, and G. Shi, “Denoising prior driven deep neural network for image restoration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
  • [37] M. Cheng, L. Jing, and M. K. Ng, “Tensor-based low-dimensional representation learning for multi-view clustering,” IEEE Transactions on Image Processing, vol. 28, no. 5, pp. 2399–2414, 2019.
  • [38] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, 2017.
  • [39] L. Zhang and W. Zuo, “Image restoration: From sparse and low-rank priors to deep priors [lecture notes],” IEEE Signal Processing Magazine, vol. 34, no. 5, pp. 172–179, 2017.
  • [40] P. Mu, J. Chen, R. Liu, X. Fan, and Z. Luo, “Learning bilevel layer priors for single image rain streaks removal,” IEEE Signal Processing Letters, vol. 26, no. 2, pp. 307–311, 2019.
  • [41] W. He, Q. Yao, C. Li, N. Yokoya, and Q. Zhao, “Non-local meets global: An integrated paradigm for hyperspectral denoising,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • [42] K. Zhang, W. Zuo, and L. Zhang, “FFDNet: Toward a fast and flexible solution for CNN-based image denoising,” IEEE Transactions on Image Processing, vol. 27, no. 9, pp. 4608–4622, 2018.
  • [43] J.-F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.