1 Introduction
Colorization consists in adding color information to a grayscale image. This technique is used, for instance, by the cinema industry to make old productions more attractive. As usual, we can consider a grayscale image as the luminance channel Y of an RGB image. The Y channel is defined as a weighted average of the RGB channels.
In addition to the luminance channel, two chrominance channels, called U and V, make it possible to recover the RGB image. Recovering an RGB image from the luminance channel alone is an ill-posed problem and requires additional information. This information is provided in the literature in two different ways, namely by manual or exemplar-based methods. In the first one, the user augments the image with some color strokes that serve as a basis for the algorithm to compute the color of each pixel. The colorization of a complex scene by a manual prior can be tedious work for the user. In the second approach, a color image is used as a source of information. Here the results strongly depend on the choice of the image. Therefore it is often called semi-automatic.
In this paper we focus on the exemplar-based methods. A common backbone of these techniques is the matching of the images. First, the source image is transformed to a grayscale image, which is compared with the input grayscale image called target. The main issue of exemplar-based colorization consists in matching the pixels of the source and the target grayscale images. The basic hypothesis in the literature is that the color content is similar in similar texture patches. The main challenge is then the choice of appropriate texture descriptors.
In the literature, the exemplar-based methods come along with various matching procedures. Some of them are done after an automatic segmentation, whereas others use local information. The seminal paper on exemplar-based colorization by Welsh et al. [22] was inspired by the texture synthesis of Efros and Leung [11] and uses basic descriptors for image patches (intensity and standard deviation) to describe the local texture. Pierre et al. [21] proposed an exemplar-based framework based on various metrics between patches to produce a couple of colorization results. Then a variational model with total variation like regularization is applied to choose between the different results in one pixel with a spatial regularity assumption. The approach of Irony et al. [15] is built on the segmentation of the images by a mean-shift algorithm. The matching between segments of the images is computed from DCT descriptors which analyse the textures. The method of Gupta et al. [12] is rather similar. Here, an over-segmentation (SLIC, see, e.g., [1]) is used instead of the mean-shift algorithm. The comparison between textures is done by SURF and Gabor features. Chen et al. [7] proposed a segmentation approach based on Bayesian image matching which can also deal with smooth images including faces. The authors pointed out that the colorization of faces is a particularly hard problem. However, their approach uses a manual matching between objects to skirt the problem of smooth parts. Charpiat et al. [5] ensured spatial coherency without segmenting, but their method involves many complex steps. The texture discrimination is mainly based on SURF descriptors. In the method of Chia et al. [8], the user has to segment and label the objects manually, and the algorithm finds similar segments in a set of images available on the internet. Recently a convolutional neural network (CNN) has been used for colorization by Zhang et al. [24] with promising results. Here the colorization is computed from a local description of the image. However, no regularization is applied to ensure spatial coherence. This produces "halo effects" near strong contours. All the described methods efficiently distinguish textures and possibly correct them with variational approaches, but fail when similar textures have to be colorized with different colors. This case arises naturally for face images. Here the smooth skin is considered nearly as a constant part. Thus, when the target image contains constant parts outside the face, the texture-based methods fail.

In this paper we propose a new technique for the colorization of face images guided by image morphing. Our framework relies on the hypothesis that the global shape of faces is similar. The matching of the two images is performed by computing a morphing map between the target image and the grayscale version of the source image. The morphing or metamorphosis approach was first proposed by Miller and Younes [17]. In this paper we build on a time-discrete version suggested by Berkels et al. [3]. In contrast to these authors, we apply a finite difference approach which solves a registration problem in each iteration step. Having the morphing map available, the chrominance channels can be transported by this map to the target image, while preserving its luminance channel. In our experiments this gives very good results and outperforms state-of-the-art methods. For some images we can further improve the quality by applying a variational postprocessing step. This was done in two of our numerical examples. Our variational model incorporates a total variation like regularization term which takes the edges of the target image into account. This was also proposed by one of the authors of this paper in [21]. Here, however, the method is complemented by an adaptation of a procedure of Deledalle et al. [9], which keeps the contrast that the otherwise biased total variation method would lose.
The outline of the paper is as follows: In Section 2 we sketch the ideas of the morphing approach. In particular, we show how the morphing map is computed with an alternating minimization algorithm. Section 3 deals with the color transfer. Having the morphing map at hand, the transfer of the chrominance values is described in Subsection 3.1. Sometimes it is useful to apply a variational model with a modified total variation regularization as a postprocessing step to remove possible spatial inconsistencies. This step is described in Subsection 3.2. Numerical experiments in Section 4 demonstrate the very good performance of our algorithm. The paper ends with conclusions in Section 5.
2 Image Morphing
Our colorization method is based on image morphing, also known as image metamorphosis, which we briefly explain in this section. For a detailed description we refer to [19]. The morphing or metamorphosis approach was first proposed by Miller and Younes [17], see also [23], with a comprehensive analysis by Trouvé and Younes [2]. The basic idea consists in considering images as objects lying on a Riemannian manifold with a Riemannian metric that takes diffeomorphisms between images into account. Then the source and the target image are the starting and end points, respectively, of curves lying on the manifold. The method aims to find the shortest path between the images, i.e., the geodesic joining them, by minimizing the path energy. By definition of the Riemannian metric, this provides not only a sequence of images along the path but also the diffeomorphism which moves the image intensities pixelwise along the path. So far we are in a continuous setting, i.e., images are real-valued functions on a continuous domain and the curves evolve continuously in time. Recently a time-discrete path approach was proposed by Berkels et al. [3]. This is the starting point of our work, which we describe next.
Let be an open, bounded domain with Lipschitz continuous boundary. We are given a grayvalue template image and a target image which are supposed to be continuously differentiable and compactly supported. For set
(1) 
In our application, the template image will be the luminance channel of the color source image and the target image the grayscale image we want to colorize. We want to find a sequence of images together with a sequence of diffeomorphisms on ,i.e.,
such that
see Figure 1, and the deformations have a small linearized elastic potential defined below. To this end, we suppose for that is related to the displacement by
(2) 
and set
. The (Cauchy) strain tensor of the displacement is defined by
where denotes the Jacobian of . The linearized elastic potential is given by
(3) 
where . Then we want to minimize
(4) 
This functional was introduced and analyzed by Berkels et al. in [3]. The minimizer of the functional provides us with both a sequence of images along the approximate geodesic path and a sequence of displacements managing the transport of the gray values through this image sequence. Note that the elastic term may be complemented by a higher-order derivative term, which ensures in the time-continuous setting that the deformation is indeed a diffeomorphism, see [10].
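To make the ingredients of the elastic potential more tangible, the following Python sketch evaluates the linearized strain tensor, i.e. the symmetric part of the displacement Jacobian, at a single point in 2D. The energy density used here is the standard isotropic form of linearized elasticity; that (3) uses exactly this weighting is an assumption, and the parameter names `mu` and `lam` are hypothetical.

```python
def strain_tensor(jac):
    """Linearized strain: symmetric part (J + J^T) / 2 of a 2x2 Jacobian."""
    return [[0.5 * (jac[i][j] + jac[j][i]) for j in range(2)] for i in range(2)]


def elastic_density(jac, mu=1.0, lam=1.0):
    """Assumed density: mu * trace(eps^T eps) + lam / 2 * trace(eps)^2."""
    eps = strain_tensor(jac)
    frob_sq = sum(eps[i][j] ** 2 for i in range(2) for j in range(2))
    trace = eps[0][0] + eps[1][1]
    return mu * frob_sq + 0.5 * lam * trace ** 2


# An antisymmetric Jacobian (infinitesimal rotation) carries no strain,
# whereas a uniform stretch is penalized.
rotation_like = [[0.0, -0.3], [0.3, 0.0]]
stretch = [[0.5, 0.0], [0.0, 0.5]]
print(elastic_density(rotation_like), elastic_density(stretch))
```

The sketch shows why such a potential is a natural regularizer for registration: it ignores infinitesimal rotations and penalizes only genuine local stretching and shearing.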
To find a minimizer of (4) we alternate the minimization over the image sequence and the displacements, as also proposed in [3]:
1. Fixing the image sequence and minimizing over the displacements leads to the following single registration problems:
(5)
where each deformation is related to its displacement by (2).
2. Fixing the displacements, resp. the deformations, leads to solving the following image sequence problem:
(6)
This can be done via the linear system of equations arising from the Euler-Lagrange equation of the functional, which we describe in Appendix A.
Dealing with digital images, which are defined on a rectangular image grid rather than on a continuous domain, we need a spatial discretization of the functional (4). The authors of [3] propose to use a finite element approach for the spatial discretization. Finite element methods are highly flexible and can also be applied, e.g., for shape metamorphosis. However, having the rectangular structure of the image grid in mind, we propose to use finite differences for the spatial discretization. This also has the advantage that, in the registration step 1 of our alternating algorithm, we can build on methods proposed by Haber and Modersitzki [13], see also [18]. As usual in optical flow and image registration, we work on a staggered grid and apply a coarse-to-fine strategy. For the concrete spatial discretization of the differential operators in (4), the solution of the registration problem and detailed numerical issues concerning the computation of the morphing map we refer to [19].
3 Face Colorization
In this section, we describe a method to colorize a grayscale image based on the morphing map between the luminance channel of a source image (template) and the present image (target). The approach is based on the transfer of the chrominance channels from the source image to the target one. Further, we propose a variational method with a total variation based regularization as a postprocessing step to remove possible artifacts.
3.1 Color Transfer
The color transfer will be done in the YUV color space. While in RGB images the color channels are highly correlated, the YUV space shows a suitable decorrelation between the luminance channel Y and the two chrominance channels U and V. The transformation from the RGB space to the YUV space is given by
(7) 
Most of the colorization approaches are based on the hypothesis that the target image is the luminance channel of the desired image. Thus, the image colorization process is based on the computation of the unknown chrominance channels.
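The following Python sketch illustrates such a luminance/chrominance split and its inverse for a single pixel. The constants are the classical analog YUV values (BT.601 luma weights) and are an assumption here; the coefficients in (7) may be rounded differently. The function names are hypothetical.

```python
# Assumed constants: classical YUV with BT.601 luma weights.
WR, WG, WB = 0.299, 0.587, 0.114
U_SCALE, V_SCALE = 0.492, 0.877


def rgb_to_yuv(r, g, b):
    """Split an RGB pixel into luminance Y and chrominance (U, V)."""
    y = WR * r + WG * g + WB * b
    return y, U_SCALE * (b - y), V_SCALE * (r - y)


def yuv_to_rgb(y, u, v):
    """Exact inverse of rgb_to_yuv for the constants above."""
    r = y + v / V_SCALE
    b = y + u / U_SCALE
    g = (y - WR * r - WB * b) / WG
    return r, g, b


# Colorizing a gray value: keep its luminance, attach chrominance
# taken from some colored pixel.
_, u_src, v_src = rgb_to_yuv(0.8, 0.5, 0.4)
colored = yuv_to_rgb(0.6, u_src, v_src)
print(colored)
```

By construction, the recolored pixel has exactly the prescribed luminance, which is the property the colorization hypothesis above relies on. Note that the resulting RGB values are not clipped to the displayable range in this sketch.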
Luminance Normalization
The first step of the algorithm consists in transforming the RGB source image to the YUV image by (7). The range of the target gray-value image and the Y channel of the source image may differ, which makes a meaningful comparison between these images impossible.
To tackle this issue, most state-of-the-art methods use a technique called luminance remapping, which was introduced in [14]. This affine mapping between images, which aims to fit the average and the standard deviation of the target and the template images, is defined as
(8) 
where is the average of the pixel values, and
is the empirical variance.
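A minimal Python sketch of such an affine remapping is given below. It assumes that the source luminance is the image being remapped and that the variance is normalized by the number of pixels; both conventions, as well as the function names, are assumptions and may differ from (8).

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def remap_luminance(source_y, target):
    """Affinely rescale source_y so its mean and variance match the target's."""
    var_s = variance(source_y)
    if var_s == 0.0:
        raise ValueError("source luminance is constant; scale is undefined")
    scale = math.sqrt(variance(target) / var_s)
    m_s, m_t = mean(source_y), mean(target)
    return [scale * (y - m_s) + m_t for y in source_y]


# Images are flattened to lists of pixel values for simplicity.
source_y = [0.1, 0.2, 0.4, 0.7]
target = [0.3, 0.5, 0.6, 0.9]
template = remap_luminance(source_y, target)
print(mean(template), variance(template))
```

After the remapping, the template and the target share the same first two moments, so that intensity differences in the registration step reflect structure rather than a global brightness or contrast offset.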
Chrominance Transfer by the Morphing Maps
Next we compute the morphing map between the two grayscale images, i.e., the template and the target, with model (4). This results in a sequence of deformations, which yields the resulting map from the template image to the target one by concatenation
(9) 
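The concatenation can be sketched in Python as a plain composition of point maps. The ordering used here, in which the last deformation of the sequence is applied first, is an assumption based on the convention that each image approximates its predecessor composed with the corresponding deformation; the helper name `compose` is hypothetical.

```python
def compose(deformations):
    """Return phi_1 o phi_2 o ... o phi_K as one map acting on points (x, y)."""
    def total(point):
        # The innermost map phi_K acts first, phi_1 last.
        for phi in reversed(deformations):
            point = phi(point)
        return point
    return total


# Two toy deformations: a scaling and a shift.
phi_1 = lambda p: (2.0 * p[0], 2.0 * p[1])
phi_2 = lambda p: (p[0] + 1.0, p[1])
morphing_map = compose([phi_1, phi_2])
print(morphing_map((3.0, 0.0)))
```

Since the composition of maps is not commutative, reversing the list generally gives a different result, so the ordering convention should be checked against (9).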
Due to the discretization of the images, the map is defined, for images of size , on the discrete grid :
(10) 
where is the position in the source image which corresponds to the pixel in the target image. Now we colorize the target image by computing its chrominance channels, denoted by at position as
(11) 
The chrominance channels of the target image are defined on the image grid , but usually . Therefore the values of the chrominance channels at
have to be computed by interpolation. In our algorithm we use just bilinear interpolation which is defined for
with , by(12) 
Finally, we compute a colorized RGB image from its luminance and the chrominance channels (11) by the inverse of (7):
(13) 
for all . Figure 2 summarizes our color transfer method. (Footnote 1: Image segments with unified background of George W. Bush https://commons.wikimedia.org/wiki/File:GeorgeWBush.jpeg and Barack Obama https://commons.wikimedia.org/wiki/File:Barack_Obama.jpg.)
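The chrominance transfer of this subsection can be sketched in a few lines of Python. The sketch assumes that images are stored as lists of rows, that the morphing map is given as one source position `(x, y)` per target pixel with `x` the column and `y` the row coordinate, and that positions outside the grid are clamped to the border; these conventions and all names are hypothetical.

```python
import math


def bilinear(channel, x, y):
    """Bilinearly interpolate channel[row][col] at column x and row y."""
    rows, cols = len(channel), len(channel[0])
    x = min(max(x, 0.0), cols - 1.0)  # clamp to the grid
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    sx, sy = x - x0, y - y0
    return ((1 - sx) * (1 - sy) * channel[y0][x0]
            + sx * (1 - sy) * channel[y0][x1]
            + (1 - sx) * sy * channel[y1][x0]
            + sx * sy * channel[y1][x1])


def transfer_chrominance(u_src, v_src, mapping):
    """Sample the source chrominance at the mapped position of each target pixel."""
    u_tar = [[bilinear(u_src, x, y) for (x, y) in row] for row in mapping]
    v_tar = [[bilinear(v_src, x, y) for (x, y) in row] for row in mapping]
    return u_tar, v_tar


u_src = [[0.0, 1.0], [2.0, 3.0]]
v_src = [[1.0, 1.0], [0.0, 0.0]]
mapping = [[(0.5, 0.5), (1.0, 0.0)]]  # a 1 x 2 target image
print(transfer_chrominance(u_src, v_src, mapping))
```

The transferred chrominance values are then combined with the untouched luminance of the target image and converted back to RGB.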
3.2 Variational Methods for Chrominance Postprocessing
Sometimes the color transfer computed from the morphing map can be disturbed by artifacts. To improve the results, postprocessing steps are usually applied in the literature.
Variational approaches are frequently applied in image colorization either directly or as a postprocessing step, see, e.g., [16, 21, 20]. For instance, the technique of Gupta et al. [12] uses the chrominance diffusion approach of Levin et al. [16].
In this paper, we build on the model [21] suggested by one of the authors of this paper. This variational model uses a functional with a specific regularization term to avoid "halo effects". More precisely, the authors considered the minimizer of
(14) 
with
(15) 
The first term in (14) is a coupled total variation term which forces the chrominance channels to have a contour at the same location as the target gray-value image. The data fidelity term is the classical squared norm of the differences of the given and the desired chrominance channels. Note that the model in [21] contains an additional box constraint. We apply the primal-dual Algorithm 1 to find the minimizer of the strictly convex model (14). It uses an update on the step time parameters and , as proposed by Chambolle and Pock [4], as well as a relaxation parameter to speed up the convergence. Here we use the abbreviation and . Further, is the dual variable which is pixelwise in . The parameters and are internal time step sizes. The operator stands for the discrete divergence and for the discrete gradient. Further, the proximal mapping is given pixelwise, for by
(16) 
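To illustrate the structure of such a primal-dual iteration, the following Python sketch solves a much simpler relative of (14): scalar total variation denoising of a 1D signal with a quadratic data term. It is not the coupled model of this paper. The step size update follows the accelerated scheme of Chambolle and Pock [4] for strongly convex data terms; the weight `lam`, the initial step sizes and all function names are assumptions.

```python
import math


def grad(u):
    """Forward differences of a 1D signal."""
    return [u[i + 1] - u[i] for i in range(len(u) - 1)]


def grad_adjoint(p, n):
    """Adjoint of grad, i.e. the negative discrete divergence."""
    out = [0.0] * n
    for i, p_i in enumerate(p):
        out[i] -= p_i
        out[i + 1] += p_i
    return out


def energy(u, f, lam):
    tv = sum(abs(d) for d in grad(u))
    return tv + 0.5 * lam * sum((a - b) ** 2 for a, b in zip(u, f))


def tv_denoise(f, lam, iterations=300):
    n = len(f)
    u, u_bar, p = list(f), list(f), [0.0] * (n - 1)
    tau = sigma = 0.45  # tau * sigma * ||grad||^2 <= 1, since ||grad||^2 <= 4
    for _ in range(iterations):
        # Dual step: gradient ascent followed by projection onto [-1, 1].
        p = [max(-1.0, min(1.0, p_i + sigma * g_i))
             for p_i, g_i in zip(p, grad(u_bar))]
        # Primal step: gradient descent followed by the prox of the data term.
        k_t_p = grad_adjoint(p, n)
        u_new = [(u_i - tau * k_i + tau * lam * f_i) / (1.0 + tau * lam)
                 for u_i, k_i, f_i in zip(u, k_t_p, f)]
        # Step size update and over-relaxation.
        theta = 1.0 / math.sqrt(1.0 + 2.0 * lam * tau)
        tau, sigma = theta * tau, sigma / theta
        u_bar = [a + theta * (a - b) for a, b in zip(u_new, u)]
        u = u_new
    return u


noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smooth = tv_denoise(noisy, lam=0.5)
print(energy(noisy, noisy, 0.5), energy(smooth, noisy, 0.5))
```

In the model of this section, the projection in the dual step is replaced by the proximal mapping (16), and the dual variable additionally involves the gradient of the target luminance.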
As mentioned in the work of Deledalle et al. [9], the minimization of the TV model produces a biased result. This bias causes a loss of contrast in the case of grayscale images, whereas it is visible as a loss of colorfulness in the case of model (14). The authors of [9] describe an algorithm to remove such bias. In this paper, we propose to modify this method for our model (14) in order to enhance the result of Algorithm 1. The final algorithm is summarized in Algorithm 2. Note that it uses the result of Algorithm 1 as an input. The proximal mapping within the algorithm is defined pixelwise, for variables and , as
(17) 
where and are defined as in (16).
The results obtained at the different steps of the workflow are presented for a particular image in Figure 3. First we demonstrate in Figure 2 (c) that simply transforming the channels via the morphing map gives no meaningful results. Letting our morphing map act on the chrominance channels of our source image and applying (13) with the luminance of our target image, we get, e.g., Figure 2 (d), which is already a good result. However, the forehead of Obama contains an artifact: an unsuitable gray color is visible here. After postprocessing the chrominance channels with our variational method, the artifacts disappear, as can be seen in Figure 2 (e).
4 Numerical Examples
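Table 1: Parameters used in the numerical examples.
Image | (morphing) | K (morphing) | (postprocessing) | (postprocessing)
Fig. 4, row 1 | 0.025 | 24 | 50 | 0.005
Fig. 4, row 2 | 0.05 | 24 | 25 | 0.005
Fig. 4, row 3 | 0.05 | 12 | - | -
Fig. 5, row 1 | 0.005 | 32 | - | -
Fig. 5, row 2 | 0.0075 | 18 | 50 | 0.05
Fig. 5, row 3 | 0.04 | 24 | - | -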
In this section, we compare our method with
- the patch-based algorithm of Welsh et al. [22],
- the patch-based method of Pierre et al. [21], and
- the segmentation approach of Gupta et al. [12].
We implemented our morphing algorithm in Matlab 2016b and used the built-in Matlab function for the bilinear interpolation. The parameters are summarized in Table 1. Here the first two parameters belong to the morphing step, while the last two appear in the variational model for postprocessing.
First we compare the portraits in Figure 4, starting with the modern photographs in the first row. (Footnote 2: Image segments of Steve Jobs https://commons.wikimedia.org/wiki/File:Steve_Jobs_Headshot_2010CROP.jpg, Albert Einstein https://commons.wikimedia.org/wiki/File:Einstein_Portr_05936.jpg, Catherine Deneuve https://commons.wikimedia.org/wiki/File:Catherine_Deneuve_2010.jpg and Renée Deneuve https://commons.wikimedia.org/wiki/File:Ren%C3%A9e_Deneuve.jpg.) The approach of Welsh et al. [22] is based on a patch matching between images. The patch comparison is done with basic texture descriptors (intensity of the central pixel and standard deviation of the patches). Since the background as well as the skin are smooth, the distinction between them is unreliable if their intensities are similar. Moreover, since no regularization is used after the color transfer, some artifacts occur. For instance, some blue points appear on Obama's face, see Figure 4, first row. The approach of Pierre et al. [21] is based on more sophisticated texture features for patches and applies a variational model with total variation like regularization. With this approach the artifacts mentioned above are less visible. Nevertheless, the forehead of Obama is purple, which is not realistic. The method of Gupta et al. [12] uses texture descriptors after an over-segmentation, see, e.g., SLIC [1]. The texture descriptors are based on SURF and Gabor features. In the case of the Obama image, the descriptors are not able to distinguish the skin and other smooth parts, leading to a background color different from the source image.
The second and the third rows of Figure 4 focus on the colorization of old photographs. This challenging problem is a real application of image colorization which is sought, e.g., by the cinema industry. Note that the textures of old images are disturbed by the natural grain, which is not the case in modern photography. Thus, the texture comparison is unreliable for this application. This issue is visible in all the comparison methods. For the portrait of Einstein, the background is not colorized with the color of the source. Moreover, the color of the skin is different from that of the person in the source image. For the picture of Deneuve, the color of her lips is not transferred to the target image (Deneuve's mother) with the state-of-the-art texture-based algorithms. With our morphing approach, the shapes of the faces are mapped. Thus, the lips as well as the eyes and the skin are well colorized with a color similar to the source image.
In Figure 5 we provide results including painted images. (Footnote 3: Image segments of self-portraits of Vincent van Gogh https://commons.wikimedia.org/wiki/File:Vincent_Willem_van_Gogh_102.jpg and https://commons.wikimedia.org/wiki/File:Vincent_van_Gogh__SelfPortrait__Google_Art_Project.jpg, a photograph of a woman https://pixabay.com/en/womanportraitfacemodelcanon659352/, a drawing of a woman https://pixabay.com/en/blackandwhitedrawingwomanactor1724363/, a color image of Marilyn Monroe https://www.flickr.com/photos/7477245@N05/5272564106 created by Luiz Fernando Reis, and a drawing of Marilyn Monroe https://pixabay.com/en/marilynmonroeartdrawmarilyn885229/.) Note that we use the same Van Gogh self-portraits as in [3]. Due to the low contrast of the ear and suit to the background, we add here the same segmentation information as in [3], which means that our images are two-dimensional during the calculation for the results shown in the first row of Figure 5.
In these examples the similarity of the shapes between the source and target images is again more reliable than the matching of the textures, so that only our morphing approach produces suitable results. Consider in particular the lips of the woman in the second and third rows. A result without postprocessing for the woman in the second row is shown in Figure 6. Comparing the two images, we see that only small details change and that most of the colorization is done by the morphing.
[Figure columns, from left to right: Source, Target, Welsh et al. [22], Gupta et al. [12], Pierre et al. [21], Ours.]
As the morphing approach calculates an image path and a diffeomorphism path at the same time, we can show not only the final colorization result, but also the way the color is transported along this path. We illustrate this by the second row of Figure 5, where every second image along the path is shown. In Figure 6 we see how the UV channels are transported via the diffeomorphism path of the Y channels of the source and the target images. We see that even though the right part of the images undergoes large transformations, the eyes and the mouth are moved to the correct places. Note that the final image has not been postprocessed, in contrast to the one in the second row of Figure 5. However, the result is already quite good.
5 Conclusion
In this paper, we propose a finite difference method to compute a morphing map between a source and a target gray-value image. This map enables us to transfer the color from the source image to the target one, based on the shapes of the faces. An additional postprocessing step, based on a variational model with a luminance-guided total variation regularization and an update to make the result unbiased, may be added to remove some possible artifacts. The results are very convincing and outperform state-of-the-art approaches on historical photographs.
Let us notice some special issues of our approach in view of an integration into a more global framework for exemplar-based image and video colorization. First of all, our method works well on faces and on objects with similar shapes, but when this hypothesis is not fulfilled, some artifacts can appear. Therefore, instead of using our algorithm on full images, a face detection algorithm can be used to focus on the face colorization. Let us remark that faces in images can be detected by efficient, recent methods, see, e.g., [6]. In future work, the method will be integrated into a complete framework for exemplar-based image and video colorization. Second, the morphing map has to be computed between images of the same size. This issue can easily be solved with a simple interpolation of the images. As long as the ratio between the width and the height of the faces is kept similar, the distortion produced by such an interpolation is small enough to support our method.
Appendix A
Fixing the displacements, resp. the deformations, leads to the image sequence problem (6). In the following we show how this can be solved via the linear system of equations arising from the Euler-Lagrange equation of the functional.
The terms in containing are
(18) 
Noting that is a diffeomorphism, the variable transform gives
(19)  
(20) 
Setting
(21) 
the EulerLagrange equations read for as
(22)  
(23) 
We introduce
which can be computed for each since the , are given. Then (23) can be rewritten for as
(24) 
In matrixvector form, this reads with
for fixed and as
(25) 
Assuming that , which is the case in practical applications, the matrix is irreducibly diagonally dominant and thus invertible.
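As an illustration of how a diagonally dominant system of this kind can be solved, the following Python sketch implements the Thomas algorithm. It assumes that the matrix in (25) is tridiagonal, which would be the case if each image of the sequence is coupled only to its two neighbours in time; this structure, the toy coefficients and the function name are assumptions and are not taken from the derivation above.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for tridiagonal A via the Thomas algorithm.

    lower[i] = A[i][i-1] (lower[0] is unused),
    diag[i]  = A[i][i],
    upper[i] = A[i][i+1] (upper[-1] is unused).
    No pivoting is done, which is safe for diagonally dominant matrices.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Toy system with the matrix [[2, -1, 0], [-1, 2, -1], [0, -1, 2]].
x = solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                      [-1.0, -1.0, 0.0], [1.0, 0.0, 1.0])
print(x)
```

The cost is linear in the number of unknowns, so a small system of this type could be solved independently for every pixel trajectory.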
Acknowledgments
Funding by the German Research Foundation (DFG) within the project STE 571/131 is gratefully acknowledged.
References
 [1] Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11):2274–2282, 2012.
 [2] Alain Trouvé and Laurent Younes. Local geometry of deformable templates. SIAM Journal on Mathematical Analysis, 37(2):17–59, 2005.
 [3] Benjamin Berkels, Alexander Effland, and Martin Rumpf. Time discrete geodesic paths in the space of images. SIAM Journal on Imaging Sciences, 8(3):1457–1488, 2015.
 [4] Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011.

 [5] G. Charpiat, M. Hofmann, and B. Schölkopf. Automatic image colorization via multimodal predictions. In European Conference on Computer Vision, pages 126–139. Springer, 2008.
 [6] Dong Chen, Gang Hua, Fang Wen, and Jian Sun. Supervised transformer network for efficient face detection. In European Conference on Computer Vision, pages 122–138. Springer, 2016.
 [7] Tongbo Chen, Yan Wang, Volker Schillings, and Christoph Meinel. Grayscale image matting and colorization. In Asian Conference on Computer Vision, pages 1164–1169, 2004.
 [8] Alex Yong-Sang Chia, Shaojie Zhuo, Raj Kumar Gupta, Yu-Wing Tai, Siu-Yeung Cho, Ping Tan, and Stephen Lin. Semantic colorization with internet images. In ACM SIGGRAPH ASIA, 2011.
 [9] Charles-Alban Deledalle, Nicolas Papadakis, Joseph Salmon, and Samuel Vaiter. CLEAR: Covariant least-square refitting with applications to image restoration. SIAM Journal on Imaging Sciences, 10(1):243–284, 2017.
 [10] Paul Dupuis, Ulf Grenander, and Michael I. Miller. Variational problems on flows of diffeomorphisms for image matching. Quarterly of Applied Mathematics, 56(3):587–600, 1998.
 [11] Alexei A. Efros and Thomas K. Leung. Texture synthesis by non-parametric sampling. In IEEE International Conference on Computer Vision, volume 2, pages 1033–1038, 1999.
 [12] Raj Kumar Gupta, Alex Yong-Sang Chia, Deepu Rajan, Ee Sin Ng, and Huang Zhiyong. Image colorization using similar images. In ACM International Conference on Multimedia, pages 369–378, 2012.
 [13] Eldad Haber and Jan Modersitzki. A multilevel method for image registration. SIAM Journal on Scientific Computing, 27(5):1594–1607, 2006.
 [14] Aaron Hertzmann, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin. Image analogies. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pages 327–340. ACM, 2001.
 [15] Revital Irony, Daniel Cohen-Or, and Dani Lischinski. Colorization by example. In Eurographics Conference on Rendering Techniques, pages 201–210. Eurographics Association, 2005.
 [16] Anat Levin, Dani Lischinski, and Yair Weiss. Colorization using optimization. In ACM Transactions on Graphics, volume 23–3, pages 689–694, 2004.
 [17] Michael I. Miller and Laurent Younes. Group actions, homeomorphisms, and matching: A general framework. International Journal of Computer Vision, 41(1-2):61–84, 2001.
 [18] Jan Modersitzki. Numerical Methods for Image Registration. Oxford University Press on Demand, 2004.
 [19] Johannes Persch and Gabriele Steidl. Morphing of manifold-valued images. In preparation, 2017.
 [20] Pascal Peter, Lilli Kaufhold, and Joachim Weickert. Turning diffusion-based image colorization into efficient color compression. IEEE Transactions on Image Processing, 2016.
 [21] Fabien Pierre, Jean-François Aujol, Aurélie Bugeau, Nicolas Papadakis, and Vinh-Thong Ta. Luminance-chrominance model for image colorization. SIAM Journal on Imaging Sciences, 8(1):536–563, 2015.
 [22] Tomihisa Welsh, Michael Ashikhmin, and Klaus Mueller. Transferring color to greyscale images. In ACM Transactions on Graphics, volume 21–3, pages 277–280, 2002.
 [23] Laurent Younes. Shapes and Diffeomorphisms. Springer-Verlag, Berlin, 2010.
 [24] Richard Zhang, Phillip Isola, and Alexei A Efros. Colorful image colorization. In European Conference on Computer Vision, pages 1–16. Springer, 2016.