1 Introduction
The analysis and enhancement of high-resolution digital acquisitions of paintings is becoming a popular field of research [1, 2]. Prior work includes the removal of canvas artifacts in high-resolution photographs [3], the removal of cradling in X-ray images of paintings on panel [4], as well as the detection and digital removal of cracks [5].
In this work, we propose a novel framework to separate X-ray images taken from double-sided paintings. A famous piece of art that contains panels painted on both sides is the Ghent Altarpiece (1432), created by Jan and Hubert van Eyck. In preparation for its restoration, the masterpiece was digitized by means of various modalities: visual photography, infrared photography and reflectography, and X-radiography [6]. The latter is a powerful tool for art investigation, as it reveals information about the structural condition of the painting. However, X-ray scans of double-sided paintings are very cluttered, which makes their reading by art experts difficult. The reason is that these images contain information from both sides of the painting as well as its support (wood structure or canvas).
Prior work on unmixing signals focuses mostly on the blind source separation (BSS) problem, where the task is to retrieve the different signal sources from one or more linear mixtures. Independent component analysis (ICA) [7], in which the sources are assumed to be statistically independent, and nonnegative matrix factorization, in which the sources are considered or transformed into a nonnegative representation [8], are representative methods for solving the BSS problem. Alternative solutions adhere to a Bayesian formulation, via, for example, Markov random fields [9]. Sparsity is another source prior heavily exploited in BSS problems [10, 11], with morphological component analysis (MCA) being a state-of-the-art method. The assumption in MCA is that each source has a different morphology; namely, it has a sparse representation over a set of bases, also called dictionaries, while being non-sparse over the other dictionaries. The dictionaries can be predefined, for instance, the wavelet or the discrete cosine transform (DCT), or learned from a set of training signals. Seminal dictionary learning works include the method of optimal directions (MOD) [12] and the K-SVD algorithm [13], both utilizing the orthogonal matching pursuit (OMP) [14] method to perform sparse signal decomposition. Recently, MCA has been combined with K-SVD, thus enabling dictionaries to be learned during separation [15].

The assumptions in previous source separation methods do not fit our problem, as both sources have similar morphological and statistical traits. In this work, we propose a novel method to separate X-ray images of paintings by using images of another modality as side information. Our approach consists of two steps: 1) learning multiscale dictionaries from photographs and X-rays of single-sided panels (in which the X-rays are not mixed), and 2) separating the given mixed X-ray from a double-sided panel, using those dictionaries and the photographs from each side. Previous work has used coupled dictionary learning to address problems in audiovisual analysis [16], image super-resolution [17], photo-sketch synthesis [18], and human pose estimation [19]. Besides the application domain, our method differs from prior work in the way we model the correlation between the sources. Experimental evidence shows that our method is superior to the state-of-the-art MCA technique, configured either with fixed or trained dictionaries.

2 Image Separation with Side Information
We start by describing MCA, the state-of-the-art sparsity-based source separation method, and afterwards we introduce the proposed method which, unlike the former, makes use of side information. First, let us denote by $x_1$ and $x_2$ two vectorized X-ray image patches that we wish to separate from a given X-ray scan patch $x = x_1 + x_2$.

Morphological Component Analysis. Assume that each $x_i$ admits a sparse decomposition in a different overcomplete dictionary $\Phi_i \in \mathbb{R}^{n \times d}$, $i = 1, 2$; namely, each component can be expressed as $x_i = \Phi_i \alpha_i$, where $\alpha_i$ is a sparse vector comprising a few non-zero coefficients: $\|\alpha_i\|_0 \le s$, with $\|\cdot\|_0$ denoting the $\ell_0$ pseudo-norm. MCA [10, 11] decomposes the mixture by approximately solving the following optimization problem:

$$\min_{\alpha_1, \alpha_2} \; \|\alpha_1\|_0 + \|\alpha_2\|_0 \quad \text{s.t.} \quad x = \Phi_1 \alpha_1 + \Phi_2 \alpha_2. \tag{1}$$

A typical approximation consists of replacing the $\ell_0$ pseudo-norm with the $\ell_1$ norm.
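As an illustration of the $\ell_1$-relaxed form of (1), the sketch below separates a synthetic 1-D mixture of a DCT-sparse source and a spike-sparse source with ISTA (iterative soft thresholding). The dictionaries, signal sizes, and regularization weight are illustrative choices, not the paper's settings.

```python
import numpy as np

n = 64

# Two dictionaries with different morphologies (stand-ins for Phi1, Phi2):
# an orthonormal DCT basis (smooth atoms) and the identity (spike atoms).
j = np.arange(n)
Phi1 = np.cos(np.pi * (2 * j[:, None] + 1) * j[None, :] / (2 * n))
Phi1 /= np.linalg.norm(Phi1, axis=0)
Phi2 = np.eye(n)

# Synthesize a mixture x = Phi1 a1 + Phi2 a2 with sparse codes a1, a2.
a1 = np.zeros(n); a1[[2, 7]] = [1.0, -0.8]
a2 = np.zeros(n); a2[[20, 45]] = [0.9, 0.7]
x = Phi1 @ a1 + Phi2 @ a2

# l1-relaxed MCA: min_a 0.5||x - D a||^2 + lam * ||a||_1 with D = [Phi1 Phi2],
# solved by ISTA (gradient step followed by soft thresholding).
D = np.hstack([Phi1, Phi2])
L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
lam, a = 1e-3, np.zeros(2 * n)
for _ in range(3000):
    a = a - (D.T @ (D @ a - x)) / L
    a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)

# Estimated morphological components of the mixture.
x1_hat, x2_hat = Phi1 @ a[:n], Phi2 @ a[n:]
```

Because the two dictionaries are mutually incoherent, the $\ell_1$ penalty assigns the smooth content to the DCT atoms and the spikes to the identity atoms; this is exactly the morphological-diversity assumption that fails for two painted X-ray sources.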
Source Separation with Side Information.
The use of side information has proven beneficial in various inverse problems [20, 21, 22]. Adhering to this logic, we show how side information can help separate mixtures whose sources have similar characteristics. In our particular problem, we consider side information signals $y_1$ and $y_2$, formed by the co-located visual image patches of the front and the back of the painting. Both the X-ray and the side information signals admit sparse decompositions in given dictionaries, namely,

$$x_j = \Psi z_j + \Theta u_j, \quad j = 1, 2, \tag{2}$$

and

$$y_j = \Phi z_j, \quad j = 1, 2, \tag{3}$$

where $z_j$, $j = 1, 2$, denotes the sparse component that is common to the visual and X-ray images with respect to the dictionaries $\Phi$ and $\Psi$, and $u_j$, $j = 1, 2$, denotes the sparse innovation component of the X-ray image, obtained with respect to the dictionary $\Theta$. The common components express the structure underlying both the X-ray and natural images, while the innovation component captures X-ray-specific parts of the signal (e.g., traces of the wooden panel). The separation problem is now formulated as:

$$\min_{z_1, z_2, u_1, u_2} \; \sum_{j=1}^{2} \left( \|z_j\|_0 + \|u_j\|_0 \right) \quad \text{s.t.} \quad x = \Psi(z_1 + z_2) + \Theta(u_1 + u_2), \;\; y_1 = \Phi z_1, \;\; y_2 = \Phi z_2. \tag{4}$$

The relaxed version of Problem (4), obtained by replacing the $\ell_0$ pseudo-norms with $\ell_1$ norms, boils down to Basis Pursuit, which is solved by convex optimization tools, e.g., [23].
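To make the role of the side information concrete, the sketch below solves the $\ell_1$-relaxed version of Problem (4) as a Basis Pursuit linear program on synthetic data. The dictionaries `Phi`, `Psi`, `Theta` are random stand-ins for learned ones, and all sizes and sparsity levels are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, d = 16, 32  # patch dimension and atoms per dictionary (hypothetical)

def dic(m, k):
    """Random dictionary with unit-norm columns (stand-in for a learned one)."""
    D = rng.standard_normal((m, k))
    return D / np.linalg.norm(D, axis=0)

Phi, Psi, Theta = dic(n, d), dic(n, d), dic(n, d)

def sparse_vec(k, s):
    v = np.zeros(k)
    v[rng.choice(k, s, replace=False)] = rng.standard_normal(s)
    return v

# Ground-truth common and innovation codes for the two sides.
z1, z2 = sparse_vec(d, 2), sparse_vec(d, 2)
u1, u2 = sparse_vec(d, 1), sparse_vec(d, 1)
y1, y2 = Phi @ z1, Phi @ z2                 # side information (visuals)
x = Psi @ (z1 + z2) + Theta @ (u1 + u2)     # mixed X-ray patch

# Stack the constraints of (4) into A a = b with a = [z1; z2; u1; u2],
# then solve basis pursuit min ||a||_1 as an LP via the split a = p - q.
O = np.zeros((n, d))
A = np.block([[Phi, O, O, O], [O, Phi, O, O], [Psi, Psi, Theta, Theta]])
b = np.concatenate([y1, y2, x])
m = A.shape[1]
res = linprog(np.ones(2 * m), A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * m))
a = res.x[:m] - res.x[m:]
x1_hat = Psi @ a[:d]                        # separated X-ray component
```

Note how the rows $y_1 = \Phi z_1$ and $y_2 = \Phi z_2$ pin down which common components belong to which side; without them the bottom block alone is the symmetric, ill-posed mixture.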
3 Coupled Dictionary Learning Algorithm
We train coupled dictionaries $\Phi, \Psi, \Theta \in \mathbb{R}^{n \times d}$ by using image patches sampled from registered visual and X-ray images of single-sided panels, which do not suffer from superposition phenomena. Let $Y = [y^1, \dots, y^N]$ and $X = [x^1, \dots, x^N]$ represent a set of co-located vectorized visual and X-ray patches, each containing $n$ pixels. We assume that the columns of $Y$ and $X$ can be decomposed as in (2) and (3), and we collect their common components into the columns of the matrix $Z$ and their innovation components into the columns of $U$. We formulate the coupled dictionary learning problem as

$$\min_{\Phi, \Psi, \Theta, Z, U} \; \|Y - \Phi Z\|_F^2 + \|X - \Psi Z - \Theta U\|_F^2 \quad \text{s.t.} \quad \|z^i\|_0 \le s_z, \; \|u^i\|_0 \le s_u, \;\; i = 1, \dots, N, \tag{5}$$

where $z^i$ and $u^i$ are the sparse columns of the matrices $Z$ and $U$, the index $i$ runs over the columns of $Z$ and $U$, and $s_z$, $s_u$ are thresholds on the sparsity level. Given initial estimates for the dictionaries (we use the overcomplete DCT to initialize them), Problem (5) is solved by iterating between a sparse-coding step, where the dictionaries are fixed, and a dictionary-update step, in which the coefficients are fixed, as in [13, 12].
Given fixed dictionaries, the sparse-coding problem decomposes into $N$ problems that can be solved in parallel:

$$\min_{z^i, u^i} \; \|y^i - \Phi^{(t)} z^i\|_2^2 + \|x^i - \Psi^{(t)} z^i - \Theta^{(t)} u^i\|_2^2 \quad \text{s.t.} \quad \|z^i\|_0 \le s_z, \; \|u^i\|_0 \le s_u, \tag{6}$$

where we used $y^i$, $x^i$, $z^i$, and $u^i$ to represent column $i$ of $Y$, $X$, $Z$, and $U$, respectively, and $t$ counts the iterations. To address each of the $N$ subproblems in (6), we propose a greedy algorithm that constitutes a modification of the OMP method [see Algorithm 1]. Our method adapts OMP [14] to solve:

$$\min_{a} \; \|b - D a\|_2^2 \quad \text{s.t.} \quad \|a_\Omega\|_0 \le s_z, \; \|a_\Gamma\|_0 \le s_u, \tag{7}$$

where $a_\Omega$ (resp., $a_\Gamma$) denotes the components of the vector $a$ indexed by the index set $\Omega$ (resp., $\Gamma$), with $\Omega \cap \Gamma = \emptyset$ and $\Omega \cup \Gamma = \{1, \dots, 2d\}$. Each subproblem in (6) translates to (7) by replacing $b = [y^{i\top} \, x^{i\top}]^\top$, $D = \big[\begin{smallmatrix} \Phi^{(t)} & 0 \\ \Psi^{(t)} & \Theta^{(t)} \end{smallmatrix}\big]$, and $a = [z^{i\top} \, u^{i\top}]^\top$.
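A possible sketch of the modified OMP for (7): standard OMP atom selection, except that the number of atoms drawn from each index set is capped by its own sparsity budget. This is one plausible reading of the algorithm described above (Algorithm 1 itself is not reproduced here), not a verbatim implementation.

```python
import numpy as np

def omp_two_budgets(D, b, idx_z, idx_u, s_z, s_u):
    """Greedy solver for min ||b - D a||_2^2 with separate sparsity
    budgets: at most s_z atoms from idx_z and at most s_u from idx_u."""
    m = D.shape[1]
    idx_z, idx_u = set(idx_z), set(idx_u)
    support, counts = [], {"z": 0, "u": 0}
    r = b.copy()
    while counts["z"] < s_z or counts["u"] < s_u:
        corr = np.abs(D.T @ r)
        corr[support] = -np.inf               # never pick an atom twice
        if counts["z"] >= s_z:                # z-budget exhausted
            corr[list(idx_z)] = -np.inf
        if counts["u"] >= s_u:                # u-budget exhausted
            corr[list(idx_u)] = -np.inf
        k = int(np.argmax(corr))              # most correlated feasible atom
        support.append(k)
        counts["z" if k in idx_z else "u"] += 1
        # Least-squares fit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], b, rcond=None)
        r = b - D[:, support] @ coef
    a = np.zeros(m)
    a[support] = coef
    return a
```

With a dictionary whose columns are orthonormal, the routine recovers an exactly representable signal, picking atoms in order of coefficient magnitude while respecting the two budgets.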
Given fixed sparse coefficients, the dictionary-update problem decouples into two independent least-squares problems, that is,

$$\min_{\Phi} \; \|Y - \Phi Z\|_F^2 \quad \text{and} \quad \min_{V} \; \|X - V W\|_F^2,$$

where $V = [\Psi \; \Theta]$ and $W = [Z^\top \, U^\top]^\top$. Each of these problems has a closed-form solution.
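Both least-squares subproblems admit the MOD-style closed form $D = S C^{+}$ (data matrix times the pseudoinverse of the code matrix). A minimal numpy sketch, with random stand-ins for the training patches and the fixed sparse codes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, N = 16, 24, 400   # patch dim, atoms per dictionary, training patches

# Fixed sparse codes from the sparse-coding step (random ~10%-dense stand-ins).
Z = rng.standard_normal((d, N)) * (rng.random((d, N)) < 0.1)
U = rng.standard_normal((d, N)) * (rng.random((d, N)) < 0.1)
Y = rng.standard_normal((n, N))          # visual training patches (stand-in)
X = rng.standard_normal((n, N))          # X-ray training patches (stand-in)

# Closed-form updates: D = argmin ||S - D C||_F^2  =>  D = S C^+.
Phi = Y @ np.linalg.pinv(Z)              # visual dictionary
W = np.vstack([Z, U])                    # stacked codes [Z; U]
PsiTheta = X @ np.linalg.pinv(W)         # concatenation [Psi  Theta]
Psi, Theta = PsiTheta[:, :d], PsiTheta[:, d:]
```

The normal equations guarantee that the residual of each update is orthogonal to the rows of its code matrix, which is a quick correctness check; in practice the updated atoms are also re-normalized before the next sparse-coding step.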
4 X-ray Image Separation Method
For complexity reasons, dictionaries are learned for small image patches; namely, we adhere to a local sparsity prior. However, due to the high resolution of the images, patches of that size cannot fully capture large structures. Hence, we propose a multiscale image separation approach that is based on a pyramid decomposition of the images. Our multiscale strategy is as follows: the images at scale $s$ (where we write $x^s$, $y_1^s$, $y_2^s$ for the mixed X-ray and the two visuals, respectively) are divided into overlapping patches of size $n_s \times n_s$ pixels. Each patch has top-left coordinates

$$(i q_s, \, j q_s), \quad i = 0, \dots, \lfloor (H_s - n_s)/q_s \rfloor, \;\; j = 0, \dots, \lfloor (W_s - n_s)/q_s \rfloor,$$

where $q_s$ is the overlap stepsize, and $H_s$ and $W_s$ are the height and width of the image decomposition at scale $s$. The DC value is extracted from each patch, thereby constructing the high-frequency band of the image at scale $s$. The aggregated DC values comprise the low-pass component of the image, whose resolution is determined by the patch grid. The low-pass component is then decomposed further at the subsequent scale ($s+1$). The texture of the mixed X-ray image at scale $s$ is separated patch-per-patch by solving Problem (4). The texture of each separated patch is then reconstructed as $\hat{x}_1 = \Psi \hat{z}_1$ and $\hat{x}_2 = \Psi \hat{z}_2$; namely, we omit the innovation component [see (2)] during reconstruction, as this is common to the two X-rays (experimental observation revealed that including the innovation component leads to poorer visual quality of the separation). The separated X-ray images are finally reconstructed by following the reverse operation: descending the pyramid, the separated component at the coarser level is upsampled and added to the separated component of the finer scales.
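One level of the pyramid decomposition described above can be sketched as follows; the patch size and stepsize are illustrative values only, since the paper's settings are scale-dependent and not reproduced here.

```python
import numpy as np

def decompose_scale(img, patch=8, step=4):
    """One (hypothetical) pyramid level: split img into overlapping
    patch x patch blocks whose top-left corners lie on a step grid,
    strip each patch's DC value, and collect the DC values into the
    low-pass image that is passed on to the next, coarser scale."""
    H, W = img.shape
    rows = range(0, H - patch + 1, step)
    cols = range(0, W - patch + 1, step)
    textures = {}                            # (i, j) -> zero-mean patch
    lowpass = np.empty((len(rows), len(cols)))
    for a, i in enumerate(rows):
        for b, j in enumerate(cols):
            p = img[i:i + patch, j:j + patch]
            dc = p.mean()
            lowpass[a, b] = dc               # DC goes to the coarser scale
            textures[(i, j)] = p - dc        # high-frequency band at scale s
    return textures, lowpass
```

The separation of Problem (4) would then be applied to the zero-mean texture patches, while `lowpass` is recursively decomposed; reconstruction reverses the process by upsampling the coarser result and adding back the finer textures.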
As a final note, the dictionary learning process is applied per scale, yielding a triplet of coupled dictionaries $(\Phi^s, \Psi^s, \Theta^s)$ for each scale $s$. Due to the lack of training data at the coarser scales, dictionaries are typically learned at the finer scales and then reused at the coarsest scale.
5 Experiments
We assess our method on different crops taken from the digital acquisitions [6] of one double-sided panel of the Ghent Altarpiece (1432). An example X-ray image we aim to separate and the two corresponding visual images from each side of the panel are depicted in Fig. 1. We apply the multiscale framework with three scales. Dictionary triplets $(\Phi^s, \Psi^s, \Theta^s)$ are trained for the first two scales, and the dictionaries of the second scale are extrapolated to the third. The training patches are drawn from digital acquisitions of the single-sided panels of the altarpiece.
To demonstrate the benefit of using side information, we compare our method against two configurations of MCA [10, 11]. In the first, we use the discrete wavelet and curvelet transforms on image blocks [4]; the low-frequency content is divided between the two components. In the second configuration, we use K-SVD to train two dictionaries: one on X-ray images depicting cloth and the other on images depicting faces, content also found in the X-ray mixtures. The K-SVD method is extended with our multiscale strategy, and the same parameters are used. As no ground-truth data is available, we first resort to visual comparisons. The results, depicted in Fig. 2, clearly show that MCA with fixed dictionaries can only separate based on morphological properties; for example, the wood grain of the panel is captured entirely by the curvelets and not by the wavelets. It is, however, ill-suited to separating painted content. MCA with K-SVD dictionaries is also unable to separate the X-ray content, as the dictionaries are not sufficiently discriminative. The results using our method show the benefit of incorporating side information in the separation problem. Towards a more objective comparison, we measure the structural similarity (SSIM) [24] index between the two separated components, where a low SSIM value indicates less similarity between them and hence a better separation. The results on two additional X-ray scans from the same painting, reported in Table 1, confirm the better separation performance of our method, as indicated by the lowest SSIM values.
Table 1: SSIM between the two separated components.

                     MCA fixed   MCA trained   Proposed
X-ray mixture 1      0.9249      0.7385        0.1681
X-ray mixture 2      0.9603      0.8341        0.6034
6 Conclusion
We have proposed a novel sparsity-based regularization method for source separation guided by side information. Our method is based on a new multiscale algorithm that learns dictionaries coupling multimodal data. We apply the proposed method to separate X-ray images of paintings with content on both sides of their panel, where photographs of each side are used as side information. Experiments with real data from digital acquisitions of the Ghent Altarpiece (1432) demonstrate the superiority of our method compared to the state-of-the-art MCA technique [10, 11, 15].
References

[1] L. van der Maaten and R. G. Erdmann, “Automatic thread-level canvas analysis: A machine-learning approach to analyzing the canvas of paintings,” IEEE Signal Process. Mag., vol. 32, no. 4, pp. 38–45, July 2015.
 [2] N. van Noord, E. Hendriks, and E. Postma, “Toward discovery of the artist’s style: Learning to recognize artists by their artworks,” IEEE Signal Process. Mag., vol. 32, no. 4, pp. 46–54, July 2015.
 [3] B. Cornelis, A. Dooms, J. Cornelis, and P. Schelkens, “Digital canvas removal in paintings,” Signal Process., vol. 92, no. 4, pp. 1166–1171, 2012.
 [4] R. Yin, D. Dunson, B. Cornelis, B. Brown, N. Ocon, and I. Daubechies, “Digital cradle removal in X-ray images of art paintings,” in IEEE ICIP, 2014, pp. 4299–4303.
 [5] B. Cornelis, T. Ružić, E. Gezels, A. Dooms, A. Pižurica, L. Platiša, J. Cornelis, M. Martens, M. De Mey, and I. Daubechies, “Crack detection and inpainting for virtual restoration of paintings: The case of the Ghent Altarpiece,” Signal Process., 2012.
 [6] A. Pizurica, L. Platisa, T. Ruzic, B. Cornelis, A. Dooms, M. Martens, H. Dubois, B. Devolder, M. De Mey, and I. Daubechies, “Digital image processing of the Ghent Altarpiece: Supporting the painting’s study and conservation treatment,” IEEE Signal Process. Mag., vol. 32, no. 4, pp. 112–122, 2015.
 [7] A. Hyvärinen, J. Karhunen, and E. Oja, Independent component analysis, vol. 46, John Wiley & Sons, 2004.
 [8] P. Smaragdis, C. Févotte, G. Mysore, N. Mohammadiha, and M. Hoffman, “Static and dynamic source separation using nonnegative factorizations: A unified view,” IEEE Signal Process. Mag., vol. 31, no. 3, pp. 66–75, 2014.
 [9] K. Kayabol, E. Kuruoğlu, and B. Sankur, “Bayesian separation of images modeled with MRFs using MCMC,” IEEE Trans. Image Process., vol. 18, no. 5, pp. 982–994, 2009.
 [10] J. Bobin, J.-L. Starck, J. Fadili, and Y. Moudden, “Sparsity and morphological diversity in blind source separation,” IEEE Trans. Image Process., vol. 16, no. 11, pp. 2662–2674, 2007.
 [11] M. Zibulevsky and B. Pearlmutter, “Blind source separation by sparse decomposition in a signal dictionary,” Neural computation, vol. 13, no. 4, pp. 863–882, 2001.
 [12] K. Engan, S. O. Aase, and J. Hakon Husoy, “Method of optimal directions for frame design,” in IEEE ICASSP, 1999, pp. 2443–2446.
 [13] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311–4322, Nov. 2006.
 [14] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, 2007.
 [15] V. Abolghasemi, S. Ferdowsi, and S. Sanei, “Blind separation of image sources via adaptive dictionary learning,” IEEE Trans. Image Process., vol. 21, no. 6, pp. 2921–2930, 2012.
 [16] G. Monaci, P. Jost, P. Vandergheynst, B. Mailhe, S. Lesage, and R. Gribonval, “Learning multimodal dictionaries,” IEEE Trans. Image Process., vol. 16, no. 9, pp. 2272–2283, 2007.
 [17] J. Yang, Z. Wang, Z. Lin, S. Cohen, and T. Huang, “Coupled dictionary training for image super-resolution,” IEEE Trans. Image Process., vol. 21, no. 8, pp. 3467–3478, 2012.
 [18] S. Wang, L. Zhang, Y. Liang, and Q. Pan, “Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis,” in IEEE CVPR, 2012, pp. 2216–2223.
 [19] Y. Jia, M. Salzmann, and T. Darrell, “Factorized latent spaces with structured sparsity,” in Advances in Neural Information Processing Systems, 2010, pp. 982–990.
 [20] N. Vaswani and W. Lu, “Modified-CS: Modifying compressive sensing for problems with partially known support,” IEEE Trans. Signal Process., vol. 58, no. 9, pp. 4595–4607, 2010.
 [21] J. F. C. Mota, N. Deligiannis, and M. R. D. Rodrigues, “Compressed sensing with prior information: Optimal strategies, geometry, and bounds,” arXiv preprint arXiv:1408.5250, 2014.
 [22] J. F. C. Mota, N. Deligiannis, and M. R. D. Rodrigues, “Compressed sensing with side information: Geometrical interpretation and performance bounds,” in IEEE GlobalSIP, 2014, pp. 512–516.
 [23] E. van den Berg and M. P. Friedlander, “Probing the Pareto frontier for basis pursuit solutions,” SIAM Journal on Scientific Computing, vol. 31, no. 2, pp. 890–912, 2008.
 [24] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004.