Missing data recovery is a fundamental problem in imaging science and data analysis. It can be formulated as function interpolation in multi-dimensional spaces. Let $u$ be an unknown function. We would like to acquire its values on a set of points $\Omega$. However, due to practical limitations, we are only able to observe its values on a subset $\overline{\Omega}\subset\Omega$. The goal of missing data recovery is to reconstruct the missing values of $u$ based on the values observed on $\overline{\Omega}$. In this paper, we focus on two typical and important tasks of missing data recovery, namely semi-supervised learning and image inpainting, though the proposed approach applies to other related tasks as well.
Since the problem of missing data recovery is an under-determined inverse problem, we can only hope to recover the missing values of $u$ if we have certain prior knowledge on $u$, e.g. that it belongs to a certain function class or has certain mathematical or statistical properties. Successful models include the Rudin-Osher-Fatemi (ROF) model and its variants [23, 4, 12]; applied harmonic analysis models such as wavelets [39, 17], curvelets, shearlets [20, 29] and wavelet frames [2, 9, 11, 10, 42, 19]; the Bayesian statistics based methods [33, 35, 43]; and the list goes on.
More recently, people have started to use low-dimensional manifolds to describe the underlying relationship between data points, which serves as an effective geometric prior on the interpolant. For example, [31, 32] observed that image patches, regarded as data points in a high-dimensional space, often lie on a low-dimensional manifold; and [14, 44] allowed the data to lie near (but not necessarily on) a certain low-dimensional manifold.
In , the authors gave a geometric view of the Dirichlet regularizer. They showed that when the $\alpha_j$ are the coordinate functions of a manifold $\mathcal{M}$ embedded in $\mathbb{R}^d$, we have $\dim(\mathcal{M})(\mathbf{x})=\sum_{j=1}^{d}\|\nabla_{\mathcal{M}}\alpha_j(\mathbf{x})\|^2$. This means that we can minimize the Dirichlet energy to enforce a penalty on the (local) dimension of the underlying manifold. As a result, the authors referred to their method as the low dimensional manifold model (LDMM). To recover missing data, they proposed to minimize the Dirichlet energy subject to the constraints $u(\mathbf{x})=b(\mathbf{x})$, $\mathbf{x}\in\overline{\Omega}$, where $b$ denotes the observed part of the underlying function $u$.
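In the discrete setting, this constrained minimization reduces to a harmonic extension on a graph: the Dirichlet energy is minimized over the unlabeled points while the labeled values are held fixed. A minimal numpy sketch (the function and variable names are ours, not from the paper):

```python
import numpy as np

def harmonic_interpolation(W, labeled_idx, labeled_vals):
    """Minimize the discrete Dirichlet energy sum_ij w_ij (u_i - u_j)^2
    subject to u = b on the labeled set (a common discrete analogue of
    the constrained Dirichlet minimization described above).
    Assumes labeled_idx is sorted in increasing order."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W      # unnormalized graph Laplacian
    mask = np.zeros(n, dtype=bool)
    mask[labeled_idx] = True
    # Split L into blocks over unlabeled (u) and labeled (l) points,
    # then solve L_uu u_u = -L_ul b for the unlabeled values.
    L_uu = L[~mask][:, ~mask]
    L_ul = L[~mask][:, mask]
    u = np.empty(n)
    u[mask] = labeled_vals
    u[~mask] = np.linalg.solve(L_uu, -L_ul @ np.asarray(labeled_vals))
    return u
```

For a three-node path graph with end values 0 and 1, the harmonic extension assigns the middle node the average of its neighbors, as expected.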
1.1 Higher Order Regularization
However, the low-dimensional structure of the manifold alone does not ensure smoothness of the reconstructed manifold, which can lead to unsatisfactory results. As a simple demonstration, we show in Figure 1 a degenerate interpolation result from the two data points labeled in red. Although the interpolated surface is also a low-dimensional manifold, it is certainly not a smooth interpolation.
In this paper, we overcome this problem by assuming not only low dimensionality of the manifold but also its smoothness. To this end, in addition to the Dirichlet energy, we introduce a CUrvature REgularization (CURE) term via biharmonic extension. The proposed CURE energy reads as follows
where the LDMM term is given by (1). Note that regularizing curvature by introducing higher order energy terms has already been proposed in image processing . However, to the best of our knowledge, we are the first to promote curvature-like regularization for nonlocal image processing. Furthermore, inspired by the weighted nonlocal Laplacian (WNLL) method proposed by , which preserves the symmetry of the Laplace operator, we propose a weighted CURE (WeCURE) model which significantly improves upon CURE. To demonstrate the effectiveness of CURE and WeCURE, we test the models on semi-supervised learning and image inpainting tasks. Numerical results show that CURE/WeCURE produces significantly better results than LDMM/WNLL in both tasks. A glimpse of the results for image inpainting is shown in Figure 2, where we can see the significant improvement of CURE over LDMM and of WeCURE over WNLL. More details and numerical results can be found in later sections.
1.2 Other Related Works
Nonlocal patch-based image restoration methods [15, 16, 7, 6, 23] have achieved great success in the literature. In addition, [21, 3, 18] introduced different graph Laplacian based regularizations on manifolds and graphs. Our method, however, focuses on both smoothness and sparsity of the underlying data manifold. The work most similar to ours is , where the authors also introduced a higher order regularization for semi-supervised learning. The difference is threefold. First, we extend the method to image inpainting in addition to semi-supervised learning. Secondly, we introduce a curvature perspective on the higher order regularization. Last but not least, the proposed weighted version of CURE, i.e. WeCURE, achieves a significant performance boost over CURE in both image inpainting and semi-supervised learning.
Another approach to regularizing the dimension of the manifold is through low rank matrix completion [24, 25]. The basic idea is to group patches by similarity and penalize the rank/nuclear norm of the matrix obtained by stacking the similar patches. The work in this paper reveals a benefit of PDE-based approaches: higher order information, such as curvature, can be naturally incorporated into the model.
1.3 Organization of the Paper
The paper is organized as follows. The proposed CURE and WeCURE models are introduced in Section 2. Numerical comparisons of CURE and WeCURE with LDMM and WNLL for semi-supervised learning and image inpainting are presented in Section 3 and Section 4, respectively. Conclusions and a summary are given in Section 5.
2 Curvature Regularization (CURE): Model and General Algorithm
In this section, we first propose the CURE model and a weighted version of CURE. Then, we will discuss how (We)CURE can be applied to missing data recovery in general.
Let $\mathcal{M}$ be a smooth manifold embedded in $\mathbb{R}^d$ and locally parameterized as
where $k$ is the local dimension of $\mathcal{M}$ at the given point. Let $u$ be the coordinate function on $\mathcal{M}$.
To enforce smoothness of the underlying manifold, we further regularize the curvature of the manifold. Recall that the mean curvature of a manifold is defined as the trace of the second fundamental form $\mathrm{II}$, i.e. $H=\mathrm{tr}(\mathrm{II})$. Here
is the metric tensor. If the coordinate function is an isometric immersion, the mean curvature can be calculated as $H=\Delta_{\mathcal{M}}u$ (see  for details).
Now, we are ready to introduce the CURE energy in the continuum setting:
where the Dirichlet term is given by (1). In the discrete setting, the gradient is commonly approximated by the nonlocal gradient
$$\nabla_w u(\mathbf{x},\mathbf{y})=\sqrt{w(\mathbf{x},\mathbf{y})}\,\big(u(\mathbf{y})-u(\mathbf{x})\big),\qquad \mathbf{x},\mathbf{y}\in P,$$
where $P$ is a finite collection of points sampled on the manifold $\mathcal{M}$. Then,
Here, $w(\mathbf{x},\mathbf{y})$ is a given symmetric weight function, often chosen to be the Gaussian weight $w(\mathbf{x},\mathbf{y})=\exp\!\left(-\|\mathbf{x}-\mathbf{y}\|^2/\sigma^2\right)$, where $\sigma$ is a parameter and $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$. The negative of the first variation of the energy takes the form of the graph Laplacian familiar from the spectral graph theory and machine learning literature [13, 45]. To simplify notation, we use $L$ to denote the graph Laplacian [28, 40, 41]:
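As a concrete illustration, the graph Laplacian can be assembled directly from the Gaussian weights. The following numpy sketch (our own illustration, not the paper's code) builds $L=D-W$ and exhibits its basic properties:

```python
import numpy as np

def gaussian_weights(X, sigma):
    # w(x, y) = exp(-||x - y||^2 / sigma^2), the Gaussian weight above
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma ** 2)

def graph_laplacian(W):
    # (L u)_i = sum_j w_ij (u_i - u_j); in matrix form, L = D - W
    # with D the diagonal degree matrix.
    return np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(0)
X = rng.random((30, 2))               # 30 sample points in the plane
L = graph_laplacian(gaussian_weights(X, sigma=0.5))
# L is symmetric, its rows sum to zero, and it is positive semi-definite
```

These three properties (symmetry, zero row sums, positive semi-definiteness) are exactly what make the constrained minimization problems below well posed.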
Now, the proposed CURE model can be cast as the following optimization problem in the discrete setting:
In , a weighted nonlocal Laplacian (WNLL) method was introduced to balance the loss at labeled and unlabeled points and, at the same time, to preserve the symmetry of the Laplace operator. Let $S$ be the set of labeled points. The WNLL model in the discrete setting is given by
and similarly for .
Following a similar idea to WNLL, we propose the weighted CURE model (WeCURE) in the discrete setting:
and similarly for .
2.2 CURE for Missing Data Recovery
For missing data recovery, we can simply minimize the CURE or WeCURE energy subject to the constraints $u(\mathbf{x})=b(\mathbf{x})$, $\mathbf{x}\in\overline{\Omega}$, where $b$ is the observed part of the underlying function to be recovered. We discuss in detail how this can be done with WeCURE; the algorithm for CURE is just a special case.
Recall the definition of the WeCURE energy (3). The WeCURE model for missing data recovery can then be rewritten as
where the weight takes different values on $\overline{\Omega}$ and its complement, and $L$ is the graph Laplacian matrix. The first variation of (4) is
Then, the problem (4) can be solved by solving the following Euler-Lagrange equation
where the matrices are defined as above. This linear system of equations is symmetric positive definite, and can be solved by an iterative solver such as the conjugate gradient method. We note that, for the (non-weighted) CURE method, we only need to replace the weight matrix above with the identity matrix. We summarize the (We)CURE algorithm for missing data recovery in Algorithm 1.
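Since the system is symmetric positive definite, a plain conjugate gradient iteration suffices. The following is a minimal, self-contained sketch (in practice one would use a library solver; the function name is ours):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, the situation
    arising from the Euler-Lagrange system of (We)CURE."""
    x = np.zeros_like(b)
    r = b - A @ x        # initial residual
    p = r.copy()         # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)      # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:  # residual small enough: done
            break
        p = r + (rs_new / rs) * p  # conjugate update of direction
        rs = rs_new
    return x
```

For the sparse graph Laplacian systems encountered here, each iteration costs one sparse matrix-vector product, which is what makes CG attractive at this scale.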
3 CURE for Semi-Supervised Learning
Semi-supervised learning is a challenging and frequently encountered machine learning task. It can be formulated as a missing data recovery problem. Given a data set, we assume the samples belong to a fixed number of different classes. Let $S$ be the subset of labeled samples, i.e.
where is the subset with label . It is typical for semi-supervised learning that the number of labeled samples is far less than the total number of samples. The objective of semi-supervised learning is to extend the labels to the entire data set. Our algorithm is summarized in Algorithm 2.
We test LDMM, WNLL, CURE, and WeCURE on the MNIST dataset of handwritten digit classification. Some sample images from the dataset are shown in Figure 3. The MNIST dataset contains 70,000 gray scale images of size 28 × 28 with 10 classes of digits going from 0 to 9. Each class contains 7,000 images. Each image can be seen as a point in a 784-dimensional Euclidean space.
The weight function is constructed as
where $\sigma(\mathbf{x})$ is chosen to be the distance between $\mathbf{x}$ and its 20th nearest neighbor.
To make the weight matrix sparse, the weight is truncated to the 50 nearest neighbors.
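The weight construction described above can be sketched as follows. Here `k_sigma` and `k_nn` stand for the 20 and 50 used in the experiments, and the brute-force distance computation is for illustration only (the actual experiments use approximate nearest neighbor search):

```python
import numpy as np

def knn_gaussian_weights(X, k_sigma=20, k_nn=50):
    """Sparse Gaussian weights: sigma(x)^2 uses the distance from x to
    its k_sigma-th nearest neighbor, and each row keeps only the k_nn
    nearest neighbors (a sketch of the construction described above)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    order = np.argsort(d2, axis=1)     # order[i, 0] is the point itself
    # squared distance to the k_sigma-th nearest neighbor (excluding self)
    sigma2 = np.take_along_axis(d2, order[:, k_sigma:k_sigma + 1], axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = order[i, :k_nn]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / sigma2[i])
    # symmetrize so the resulting graph Laplacian is symmetric
    return np.maximum(W, W.T)
```

The final symmetrization is one common convention; taking the average of `W` and its transpose would serve the same purpose.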
In our test, we choose five different sampling rates to form the training set: labeling 700, 100, 70, 50, and 35 images in each class at random. For each sampling rate, we repeat the test 10 times. Figure 4 shows the success rates of the WNLL, CURE, and WeCURE methods. The first five images of Figure 4 show the success rate for each sampling rate, while the last image shows the average success rate for each of the five sampling rates. It can be clearly observed that the proposed CURE and WeCURE outperform WNLL in all tested cases. At high sampling rates, WeCURE becomes closer to CURE and the two have comparable performance, whereas WeCURE outperforms CURE at lower sampling rates. In terms of average success rate, both CURE and WeCURE outperform WNLL. We also compare (We)CURE with WNLL and Weighted Nonlocal Total Variation (WNTV)  in Table 1. It can be seen that (We)CURE significantly outperforms both WNLL and WNTV in the cases with lower sample rates (50/70000, 100/70000).
4 CURE for Image Inpainting
In this section, we apply the CURE method to the reconstruction of images with partially observed pixels. To apply (We)CURE, we adopt the assumption that image patches lie on a low-dimensional and smooth manifold. Given an image, for any pixel, we define an image patch as
where we assume the patch sizes are odd integers, and we adopt reflective boundary conditions for pixels near the image boundary. Define the patch set as the collection of all patches:
Define a function on the patch set as
where the function value is the intensity of the image at the corresponding pixel.
Now, suppose we only observe the image on a subset of pixels. We would like to recover the entire image from the observed data. This problem can be recast as the interpolation of a function on the patch set, with the values given on the observed patches. This falls into the general algorithmic framework of (We)CURE for missing data recovery (Algorithm 1). Notice, however, that the patch set itself is unknown. Thus, we need to update the patch set iteratively. We summarize the (We)CURE algorithm for this problem in Algorithm 3.
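The patch-set construction with reflective boundary conditions can be sketched as follows (the patch sizes `s1`, `s2` and the row-major ordering are illustrative choices of ours):

```python
import numpy as np

def patch_set(f, s1=11, s2=11):
    """Collect all s1 x s2 patches of image f (odd patch sizes,
    reflective boundary), one patch per pixel, as rows of an
    (m*n, s1*s2) array."""
    r1, r2 = s1 // 2, s2 // 2
    # reflective padding handles patches that overhang the boundary
    g = np.pad(f, ((r1, r1), (r2, r2)), mode='reflect')
    m, n = f.shape
    P = np.empty((m * n, s1 * s2))
    for i in range(m):
        for j in range(n):
            # patch centered at pixel (i, j), flattened into a row
            P[i * n + j] = g[i:i + s1, j:j + s2].ravel()
    return P
```

Each row of `P` is one point in the patch space on which the manifold assumption is imposed; the center entry of the patch recovers the pixel intensity itself.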
The weight function is chosen as in (6), where the patches are semi-local patches and $\sigma(\mathbf{x})$ is chosen to be the distance between $\mathbf{x}$ and its 20th nearest neighbor. To make the weight matrix sparse, the weight is truncated to the 50 nearest neighbors. In the semi-local patches, the local coordinate is normalized to have the same amplitude as the image intensity,
where the normalizing factors are the sizes of the image. The purpose of introducing semi-local patches is to constrain the search space to a local area. A larger coordinate weight leads to a smaller search space and faster search, while a smaller weight leads to a more global search and more accurate results. Thus, following , we gradually reduce the weight during the iterations.
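One possible way to form a semi-local patch, under our reading of the description above (the weighting scheme, the coordinate normalization, and the intensity amplitude are all assumptions of this sketch), is to append the patch-center coordinates, normalized by the image size and rescaled to the intensity range:

```python
import numpy as np

def semilocal_patch(patch, i, j, m, n, lam, amplitude=255.0):
    """Append the normalized pixel coordinates (i/m, j/n), rescaled to
    the image intensity range and weighted by lam, to an image patch.
    The parameter names and normalization here are illustrative, not
    taken from the paper."""
    coords = lam * amplitude * np.array([i / m, j / n])
    return np.concatenate([patch.ravel(), coords])
```

With this construction, two patches that are similar in appearance but far apart in the image have a large distance in patch space, which is what restricts the nearest neighbor search to a local area.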
We apply our algorithm to 12 widely used testing images. In our experiments, we select the patch size to be
. For each patch, the nearest neighbors are obtained using an approximate nearest neighbor (ANN) search algorithm; we use a k-d tree together with ANN search to reduce the computational cost. The linear systems in the weighted nonlocal Laplacian and the graph Laplacian are solved by the conjugate gradient method. We use the solution of WNLL after 6 steps as the initialization of our algorithm, in order to obtain a proper initial guess of the similarity relationships between different patches. The initial image for WNLL is obtained by filling the missing pixels with random numbers drawn from a Gaussian distribution whose mean and standard deviation match those of the observed pixels.
The PSNR, defined as follows, is used to measure the accuracy of the results:
$$\mathrm{PSNR}(u,g)=10\log_{10}\left(\frac{255^2}{\frac{1}{mn}\sum_{i,j}\left(u_{ij}-g_{ij}\right)^2}\right),$$
where $g$ is the ground truth.
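For completeness, a sketch of the PSNR computation for 8-bit images (assuming a peak intensity of 255):

```python
import numpy as np

def psnr(u, g, peak=255.0):
    """PSNR between reconstruction u and ground truth g: the ratio of
    the squared peak intensity to the mean squared error, in decibels."""
    u = np.asarray(u, dtype=float)
    g = np.asarray(g, dtype=float)
    mse = np.mean((u - g) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

Note that PSNR diverges as the reconstruction approaches the ground truth exactly (the MSE tends to zero), so it is only meaningful for imperfect reconstructions.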
SSIM is based on the computation of three terms, namely the luminance term, the contrast term, and the structural term; the overall index is a multiplicative combination of the three. It can be written as
$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+C_1)(2\sigma_{xy}+C_2)}{(\mu_x^2+\mu_y^2+C_1)(\sigma_x^2+\sigma_y^2+C_2)},$$
where $\mu_x,\mu_y$, $\sigma_x,\sigma_y$, and $\sigma_{xy}$ are the local means, standard deviations, and cross-covariance for images $x$ and $y$.
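As a simplified sketch, the SSIM expression can be evaluated from global image statistics; the standard index instead averages the same expression over local windows, which the version below deliberately omits:

```python
import numpy as np

def ssim_global(x, y, peak=255.0):
    """SSIM computed from global (whole-image) statistics: a simplified
    sketch of the index described above, not the windowed standard."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2   # usual stabilizers
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()                # cross-covariance
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

By construction the index equals 1 when the two images coincide, and decreases as their luminance, contrast, or structure diverge.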
The numerical results are shown in Table 2 and Table 3. For qualitative comparisons, Figure 6 and Figure 7 show the inpainting results of 3 images from the Set12 dataset at two different sample rates. As we can see, WeCURE gives much better results than WNLL, both visually and numerically in terms of PSNR and SSIM. The proposed method recovers better texture, although it may generate some artifacts that break smooth regions. At the same time, (We)CURE also produces sharper edges in the restored images. As a result, (We)CURE consistently and significantly outperforms WNLL in terms of SSIM.
5 Conclusion and Future Work
In this paper, we proposed to use both the low dimensionality and the smoothness of the underlying data manifold as a regularizer for missing data recovery. To that end, we introduced curvature regularization (CURE) and a weighted version of it (WeCURE). Compared to related models such as LDMM, WNLL, and WNTV, the new regularization proved more effective on several datasets for semi-supervised learning and image inpainting.
There is still much to be studied further. On the modeling side, a natural question is whether other curvatures can also serve as smoothness regularizers for data manifolds, and how they differ from the one we chose for CURE. Can these curvatures be easily computed? How does CURE work for other missing data recovery tasks? Furthermore, convergence analysis of solving the biharmonic equation (5) on manifolds also needs to be studied. Our current limited understanding of numerical methods for the biharmonic equation has so far prevented us from generalizing CURE to generic inverse problems.
Bin Dong is supported in part by NSFC 11671022 and Beijing Natural Science Foundation (Z180001). Yiping Lu is supported by the Elite Undergraduate Training Program of the School of Mathematical Sciences at Peking University. We thank Dr. Wei Zhu for kindly sharing his valuable comments and the code of both LDMM and LDMM+WGL for comparisons.
-  S. Agarwal, K. Branson, and S. Belongie, Higher order learning with graphs, in Proceedings of the 23rd international conference on Machine learning, ACM, 2006, pp. 17–24.
-  C. Bao, B. Dong, L. Hou, Z. Shen, X. Zhang, and X. Zhang, Image restoration by minimizing zero norm of wavelet frame coefficients, Inverse problems, 32 (2016), p. 115004.
-  A. L. Bertozzi and A. Flenner, Diffuse interface models on graphs for classification of high dimensional data, Multiscale Modeling & Simulation, 10 (2012), pp. 1090–1118.
-  K. Bredies, K. Kunisch, and T. Pock, Total generalized variation, SIAM Journal on Imaging Sciences, 3 (2010), pp. 492–526.
-  A. Buades, B. Coll, and J.-M. Morel, Neighborhood filters and PDE's, Numerische Mathematik, 105, pp. 1–34.
-  A. Buades, B. Coll, and J.-M. Morel, A review of image denoising algorithms, with a new one, Multiscale Modeling & Simulation, 4, pp. 490–530.
-  A. Buades, B. Coll, and J.-M. Morel, A non-local algorithm for image denoising, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005.
-  C. Burges, Y. LeCun, and C. Cortes, MNIST database.
-  J.-F. Cai, R. H. Chan, and Z. Shen, Simultaneous cartoon and texture inpainting, Inverse Probl. Imaging, 4 (2010), pp. 379–395.
-  J.-F. Cai, S. Osher, and Z. Shen, Split bregman methods and frame based image restoration, Multiscale modeling & simulation, 8 (2009), pp. 337–369.
-  R. H. Chan, T. F. Chan, L. Shen, and Z. Shen, Wavelet algorithms for high-resolution image reconstruction, SIAM Journal on Scientific Computing, 24 (2003), pp. 1408–1432.
-  T. Chan, A. Marquina, and P. Mulet, High-order total variation-based image restoration, SIAM Journal on Scientific Computing, 22 (2000), pp. 503–516.
-  F. Chung, Spectral graph theory, American Mathematical Society, 1997.
-  R. R. Coifman and S. Lafon, Diffusion maps, Applied and computational harmonic analysis, 21 (2006), pp. 5–30.
-  K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, Image denoising with block-matching and 3d filtering, in Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning, vol. 6064, International Society for Optics and Photonics, 2006, p. 606414.
-  A. Danielyan, V. Katkovnik, and K. Egiazarian, Bm3d frames and variational image deblurring, IEEE Transactions on Image Processing, 21 (2012), pp. 1715–1728.
-  I. Daubechies, Ten lectures on wavelets, vol. 61, Siam, 1992.
-  B. Dong, Sparse representation on graphs by tight wavelet frames and applications, Applied and Computational Harmonic Analysis, 42 (2017), pp. 452–479.
-  B. Dong and Z. Shen, Mra-based wavelet frames and applications: Image segmentation and surface reconstruction, in Independent Component Analyses, Compressive Sampling, Wavelets, Neural Net, Biosystems, and Nanoengineering X, vol. 8401, International Society for Optics and Photonics, 2012, p. 840102.
-  G. Easley, D. Labate, and W.-Q. Lim, Sparse directional image representations using the discrete shearlet transform, Applied and Computational Harmonic Analysis, 25 (2008), pp. 25–46.
-  G. Gilboa and S. Osher, Nonlocal linear image regularization and supervised segmentation, Multiscale Model. Simul, 6, p. 595–630.
-  G. Gilboa and S. Osher, Nonlocal operators with applications to image processing, Multiscale Modeling & Simulation, 7 (2008), pp. 1005–1028.
-  S. Gu, L. Zhang, W. Zuo, and X. Feng, Weighted nuclear norm minimization with application to image denoising, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2862–2869.
-  R. Lai and J. Li, Manifold based low-rank regularization for image restoration and semi-supervised learning, Journal of Scientific Computing, 74 (2018), pp. 1241–1263.
-  Y. LeCun, The mnist database of handwritten digits, http://yann. lecun. com/exdb/mnist/, (1998).
-  H. Li, Z. Shi, and X.-P. Wang, Weighted nonlocal total variation in image processing, arXiv preprint, arXiv:1801.10441, (2019).
-  Z. Li and Z. Shi, A convergent point integral method for isotropic elliptic equations on point cloud, Multiscale Modeling & Simulation, 14, pp. 874–905.
-  W.-Q. Lim, The discrete shearlet transform: a new directional transform and compactly supported shearlet frames., IEEE Trans. Image Processing, 19 (2010), pp. 1166–1180.
-  F. Manfio and F. Vitório, Minimal immersions of riemannian manifolds in products of space forms, Journal of Mathematical Analysis and Applications, 424 (2015), pp. 260–268.
-  S. Osher, Z. Shi, and W. Zhu, Low dimensional manifold model for image processing, Technical Report, CAM Report 16-04, UCLA.
-  G. Peyré, Manifold models for signals and images, Computer Vision and Image Understanding, 113 (2009), pp. 249–260.
-  S. Roth and M. J. Black, Fields of experts: A framework for learning image priors, in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 2, IEEE, 2005, pp. 860–867.
-  L. I. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D: nonlinear phenomena, 60 (1992), pp. 259–268.
-  Q. Shan, J. Jia, and A. Agarwala, High-quality motion deblurring from a single image, in Acm transactions on graphics (tog), vol. 27, ACM, 2008, p. 73.
-  J. Shen, S. H. Kang, and T. F. Chan, Euler’s elastica and curvature-based inpainting, SIAM journal on Applied Mathematics, 63 (2003), pp. 564–592.
-  Z. Shi, S. Osher, and W. Zhu, Weighted nonlocal laplacian on interpolation from sparse data, Journal of Scientific Computing, 73 (2017), pp. 1164–1177.
-  J.-L. Starck, E. J. Candès, and D. L. Donoho, The curvelet transform for image denoising, IEEE Transactions on image processing, 11 (2002), pp. 670–684.
-  S. Mallat, A wavelet tour of signal processing: the sparse way, (1999).
-  N. G. Trillos and D. Slepčev, Continuum limit of total variation on point clouds, Archive for rational mechanics and analysis, 220 (2016), pp. 193–241.
-  N. G. Trillos and D. Slepčev, A variational approach to the consistency of spectral clustering, Applied and Computational Harmonic Analysis, 45 (2018), pp. 239–281.
-  Y. Zhang, B. Dong, and Z. Lu, $\ell_0$ minimization for wavelet frame based image restoration, Mathematics of Computation, 82 (2013), pp. 995–1015.
-  S. C. Zhu and D. Mumford, Prior learning and gibbs reaction-diffusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19 (1997), pp. 1236–1250.
-  W. Zhu, Q. Qiu, J. Huang, R. Calderbank, G. Sapiro, and I. Daubechies, Ldmnet: Low dimensional manifold regularized neural networks, in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
-  X. Zhu, Z. Ghahramani, and J. Lafferty, Semi-supervised learning using gaussian fields and harmonic functions, in Proceedings of the 20th International Conference on Machine Learning (ICML), 2003, pp. 912–919.