Decomposing multispectral face images into diffuse and specular shading and biophysical parameters

02/18/2019 ∙ by Sarah Alotaibi, et al. ∙ 0

We propose a novel biophysical and dichromatic reflectance model that efficiently characterises spectral skin reflectance. We show how to fit the model to multispectral face images enabling high quality estimation of diffuse and specular shading as well as biophysical parameter maps (melanin and haemoglobin). Our method works from a single image without requiring complex controlled lighting setups yet provides quantitatively accurate reconstructions and qualitatively convincing decomposition and editing.


1 Introduction

Understanding the appearance of the human face is of considerable interest in computer vision [1] and graphics [2] and has been studied broadly in many works. One approach to understanding appearance is to estimate a physically meaningful decomposition. Existing methods for decomposing face appearance often rely on complex equipment and lab conditions. For example, Ma et al. [3] use polarised spherical gradient illumination to capture geometry and diffuse and specular reflectance maps. Donner et al. [4] use polarised, multispectral light and an assumption of a planar sample to estimate biophysical skin parameter maps. Gitlina et al. [5] essentially combine these two setups using multispectral illumination in a lightstage. In a medical setting, Claridge et al. [6] use RGB plus NIR measurements with polarised illumination, again assuming a planar sample, to estimate skin biophysical parameters for the purpose of identifying abnormalities.

Highly ambitious recent work uses deep learning to decompose face appearance from single, uncontrolled images. Kim et al. [7] and Tewari et al. [8] use self-supervision to learn to fit 3D morphable models [9] to single images (providing an estimate of geometry, illumination and reflectance). Intrinsic image based methods [1, 10] do not rely on a face-specific statistical model but learn the whole task from scratch.

We consider the challenging problem of decomposing a single image into diffuse and specular shading and the distribution maps of biophysical parameters that explain skin colouration. To make the problem tractable we work with multispectral images but do not require the complex and controlled illumination conditions of previous methods. To do so, we propose a novel combination of a biophysical and dichromatic reflectance model that efficiently characterises spectral skin reflectance. Previously, the dichromatic model has primarily been used in a setting where the light source colour (or, more generally, spectral power distribution) is unknown but the object under study is assumed to have uniform or piecewise uniform [11] colour (more precisely, diffuse albedo or spectral reflectance). We show that, with known light source spectra, we can estimate four spatially varying reflectance parameters by fitting our model to multispectral data using nonlinear least squares. We show how the decomposed parameter maps can be edited in an intuitive way and validate our approach quantitatively on a database of skin reflectance spectra and qualitatively on a set of multispectral face images.

2 Biophysical dichromatic skin model

In this section we introduce our multispectral skin reflectance model. We assume that skin reflects light according to the dichromatic model [12] and that diffuse albedo (i.e. subsurface absorption) can be explained using a biophysical model.

2.1 Multispectral dichromatic model

The dichromatic model assumes that wavelength-dependent scene radiance, $i(\lambda)$, with $\lambda$ the wavelength, is a sum of body (diffuse) and surface (specular) reflected components. Further, it divides each source of radiance into a part that depends on geometry (informally “shading”) and a wavelength-dependent part (informally “colour”). The body reflection arises from subsurface scattering and modifies the spectral power distribution (SPD) of the light through absorption, whereas the surface reflection happens at the interface and does not, meaning the model can be written as: $i(\lambda) = e(\lambda)\,(d\,r(\lambda) + s)$, where $e(\lambda)$ is the SPD of the illuminant, $d$ and $s$ are the diffuse and specular shading respectively and $r(\lambda)$ is the spectral reflectance of the diffusely reflected light. We discretise at $n$ evenly spaced wavelengths such that we can write the vector of spectral scene radiance, $\mathbf{i} \in \mathbb{R}^n$, as:

$$\mathbf{i} = (d\,\mathbf{r} + s\,\mathbf{1}) \odot \mathbf{e} \qquad (1)$$

where $\mathbf{e}$ and $\mathbf{r}$ are the wavelength-discrete illuminant SPD and spectral reflectance respectively, $\mathbf{1}$ is the all-ones vector and $\odot$ denotes the element-wise product.
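As a minimal sketch of the wavelength-discrete dichromatic model described above, the following NumPy function evaluates radiance as the illuminant SPD multiplied element-wise by (diffuse shading × reflectance + specular shading). The function name, argument names and the four-band example values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dichromatic_radiance(illuminant_spd, reflectance, diffuse, specular):
    """Wavelength-discrete dichromatic model: e * (d * r + s), element-wise."""
    illuminant_spd = np.asarray(illuminant_spd, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    return illuminant_spd * (diffuse * reflectance + specular)

# Hypothetical 4-band example (values are illustrative only).
e = np.array([1.0, 2.0, 2.0, 1.0])   # illuminant SPD
r = np.array([0.1, 0.2, 0.4, 0.5])   # diffuse spectral reflectance
radiance = dichromatic_radiance(e, r, diffuse=0.5, specular=0.1)

# With no diffuse term the radiance keeps the colour of the illuminant.
specular_only = dichromatic_radiance(e, r, diffuse=0.0, specular=0.1)
```

Note that the specular term is spectrally flat before multiplication by the illuminant, which is what lets the two shading terms be separated once the illuminant SPD is known.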

2.2 Biophysical skin reflectance model

We now replace generic spectral reflectance with a biophysical model for skin. This skin model has only two free parameters, meaning the dichromatic model has in total only four unknowns per pixel. Our biophysical spectral reflectance model for skin follows a number of previous models [6, 2, 13, 14], though we focus on simplicity and limiting the number of free parameters. Specifically, our model allows only the melanin and haemoglobin concentration to vary spatially whereas all other parameters are based on measured data, validated approximation functions or average values [15, 16, 17, 18, 2, 19, 20, 13]. The free parameters have physical meaning and can therefore be constrained to the range of values observed in healthy skin.

Human skin tissue has a complicated layered structure. We consider only two layers (Fig. 1(a)). The outer layer, the epidermis, contains melanin and is responsible for absorption of the short wavelengths of the visible spectrum; the remainder of the light is mostly forward scattered. The dermis has blood vessels that contain the haemoglobin pigment, which absorbs light in the green and blue wavelengths; the remainder of the light is primarily reflected back through the epidermis, where the melanin pigment absorbs light again. Therefore, our skin spectral reflectance model is written as:

$$r(f_{\text{mel}}, f_{\text{blood}}, \lambda) = T_{\text{epidermis}}(f_{\text{mel}}, \lambda)^2 \, R_{\text{dermis}}(f_{\text{blood}}, \lambda)$$

where $f_{\text{mel}}$ is the epidermal melanosome volume fraction and lies in the range $[f_{\text{mel}}^{\min}, f_{\text{mel}}^{\max}]$, $f_{\text{blood}}$ is the dermal blood volume fraction and lies in the range $[f_{\text{blood}}^{\min}, f_{\text{blood}}^{\max}]$, $T_{\text{epidermis}}^2$ is the proportion of light transmitted through the epidermis (twice) and is modelled using the Lambert-Beer law, and $R_{\text{dermis}}$ is the proportion of light reflected from the dermis and is modelled by Kubelka-Munk theory. In Fig. 1(b) we visualise the range of colours that can be obtained by our skin reflectance model when transformed to RGB space as described in Sec. 4.3. In wavelength-discrete terms once again, we write $\mathbf{r}(f_{\text{mel}}, f_{\text{blood}}) \in \mathbb{R}^n$ as the vector of diffuse spectral reflectance, which can be substituted into (1).

Figure 1: (a) Layered skin reflectance model. (b) sRGB render (see Sec. 4.3) of the skin colouration model.
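The two-layer structure described in this section (a double pass through the epidermis followed by reflection from the dermis) can be sketched as below. This is only an illustration under strong assumptions: the absorption and scattering spectra are synthetic placeholders rather than measured data, layer absorption is taken to be a volume-fraction-weighted mix of pigment and baseline tissue, and the Kubelka-Munk term uses the closed form for an infinitely thick layer, which may differ from the formulation actually used by the authors.

```python
import numpy as np

def epidermis_transmittance(mu_a, thickness_cm):
    """Lambert-Beer transmittance for one pass through a layer."""
    return np.exp(-mu_a * thickness_cm)

def dermis_reflectance(k, s):
    """Kubelka-Munk reflectance of an infinitely thick layer (simplification)."""
    a = k / s
    return 1.0 + a - np.sqrt(a * a + 2.0 * a)

def skin_reflectance(f_mel, f_blood, spectra, thickness_cm=0.006):
    """Two-layer skin reflectance: squared epidermal transmittance times dermal reflectance."""
    # Assumed mixing rule: volume-fraction-weighted absorption.
    mu_epi = f_mel * spectra["mel"] + (1.0 - f_mel) * spectra["base"]
    k_derm = f_blood * spectra["blood"] + (1.0 - f_blood) * spectra["base"]
    t = epidermis_transmittance(mu_epi, thickness_cm)
    return t ** 2 * dermis_reflectance(k_derm, spectra["scatter"])

# Synthetic placeholder spectra (NOT measured data), 400-700 nm in 10 nm steps.
wavelengths = np.arange(400.0, 701.0, 10.0)
toy_spectra = {
    "mel": 400.0 * (wavelengths / 400.0) ** -3,
    "base": np.full_like(wavelengths, 0.5),
    "blood": 50.0 + 200.0 * np.exp(-((wavelengths - 550.0) / 30.0) ** 2),
    "scatter": 100.0 * (wavelengths / 400.0) ** -1,
}
light_skin = skin_reflectance(0.05, 0.03, toy_spectra)
dark_skin = skin_reflectance(0.30, 0.03, toy_spectra)
flushed_skin = skin_reflectance(0.05, 0.06, toy_spectra)
```

Even with placeholder spectra, the sketch reproduces the qualitative behaviour the model relies on: more melanin or more blood lowers reflectance at every wavelength, with a different spectral signature for each pigment.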

3 Model fitting

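We now assume that we are provided with a multispectral measurement of scene radiance, $\mathbf{i}_{\text{obs}} \in \mathbb{R}^n$, at a point on the skin surface. We assume that scene illumination is spectrally uniform and that its SPD, $\mathbf{e}$, is known. In this case, our model has four unknown parameters (the melanin and haemoglobin fractions $f_{\text{mel}}$ and $f_{\text{blood}}$, and the diffuse and specular shading $d$ and $s$) and we pose model fitting as a nonlinear least squares problem whose objective is:

$$\varepsilon(f_{\text{mel}}, f_{\text{blood}}, d, s) = \left\| \mathbf{i}_{\text{obs}} - \left(d\,\mathbf{r}(f_{\text{mel}}, f_{\text{blood}}) + s\,\mathbf{1}\right) \odot \mathbf{e} \right\|^2 \qquad (2)$$

All four parameters are subject to constraints. The biophysical parameters must lie in their plausible ranges while the diffuse and specular shading must be positive. This leads to a constrained optimisation problem:

$$\min_{f_{\text{mel}}, f_{\text{blood}}, d, s} \varepsilon(f_{\text{mel}}, f_{\text{blood}}, d, s) \quad \text{s.t.} \quad f_{\text{mel}}^{\min} \le f_{\text{mel}} \le f_{\text{mel}}^{\max}, \;\; f_{\text{blood}}^{\min} \le f_{\text{blood}} \le f_{\text{blood}}^{\max}, \;\; d > 0, \;\; s > 0.$$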

In practice, we reparameterise each of the four unknowns to obtain an unconstrained optimisation problem:

(3)

where $\sigma$ is the logistic function: $\sigma(x) = 1/(1 + e^{-x})$. We minimise (3) using the trust-region-reflective algorithm. Note that, in the case of multispectral images, we solve the optimisation problem independently at each pixel, i.e. we do not require any spatial smoothness priors.
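The per-pixel fit can be sketched with SciPy's `least_squares`, whose `method="trf"` option is a trust-region-reflective solver. Everything specific here is an assumption made for illustration: the parameter ranges and absorption curves are placeholders, `toy_reflectance` is a stand-in for the biophysical model, and the exponential mapping used to keep the shading terms positive is one possible choice (the text only states that the bounded parameters use a logistic mapping).

```python
import numpy as np
from scipy.optimize import least_squares

WAVELENGTHS = np.linspace(400.0, 700.0, 31)
# Illustrative placeholders, NOT the paper's ranges or measured spectra.
F_MEL_RANGE = (0.01, 0.45)
F_BLOOD_RANGE = (0.01, 0.10)
A_MEL = 8.0 * (WAVELENGTHS / 400.0) ** -3
A_BLOOD = 2.0 + 20.0 * np.exp(-((WAVELENGTHS - 550.0) / 30.0) ** 2)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def toy_reflectance(f_mel, f_blood):
    """Stand-in for the biophysical skin reflectance model."""
    return np.exp(-f_mel * A_MEL) * np.exp(-f_blood * A_BLOOD)

def decode(theta):
    """Map four unconstrained variables to the constrained parameters."""
    lo, hi = F_MEL_RANGE
    f_mel = lo + (hi - lo) * logistic(theta[0])
    lo, hi = F_BLOOD_RANGE
    f_blood = lo + (hi - lo) * logistic(theta[1])
    d, s = np.exp(theta[2]), np.exp(theta[3])  # assumed positivity mapping
    return f_mel, f_blood, d, s

def residuals(theta, measured, illuminant):
    f_mel, f_blood, d, s = decode(theta)
    model = illuminant * (d * toy_reflectance(f_mel, f_blood) + s)
    return model - measured

def fit_pixel(measured, illuminant):
    """Fit one pixel independently; no spatial prior is used."""
    theta0 = np.zeros(4)
    result = least_squares(residuals, theta0, method="trf",
                           args=(measured, illuminant))
    return decode(result.x), result

# Synthetic pixel generated from the same toy model.
illuminant = np.ones_like(WAVELENGTHS)
measured = illuminant * (0.8 * toy_reflectance(0.2, 0.05) + 0.05)
(f_mel, f_blood, d, s), result = fit_pixel(measured, illuminant)
initial_cost = 0.5 * np.sum(residuals(np.zeros(4), measured, illuminant) ** 2)
```

Because the constraints are absorbed into `decode`, the solver itself runs unconstrained and every iterate corresponds to a physically admissible parameter set. For a full image, `fit_pixel` would simply be called once per skin pixel.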

Figure 2: Col. 1: sRGB render of multispectral images. Col. 2-5: decomposition into: specular and diffuse shading, melanin and haemoglobin. Col. 6: albedo from biophysical maps. Col. 7: sRGB render of reconstructed radiance.

4 Model-based editing

Once a face image has been decomposed into the four semantically meaningful parameter maps, intuitive editing becomes possible. After editing one or more of the maps, we recompute scene radiance with (1), blend with unedited background radiance using a skin mask and then transform to sRGB.

4.1 Learning skin segmentation

To produce convincing edited images we require a skin probability mask to blend the edited skin radiance with the background. We train a multilayer perceptron that computes the probability that a single pixel belongs to the skin region. The network has five fully connected layers with ReLU activation (number of channels = 64/64/128/128/2). The input for a pixel is a vector $\mathbf{x} \in \mathbb{R}^{n+4}$ comprising the original measured radiance and the four fitted parameters: $\mathbf{x} = [\mathbf{i}_{\text{obs}}^\top, f_{\text{mel}}, f_{\text{blood}}, d, s]^\top$.

The output of the final fully connected layer is passed through a per-pixel classification log loss and the network predicts the skin probability $p$ for every pixel. We train the network by manually selecting skin (total 1.1M pixels) and non-skin (total 750k pixels) regions from the data in [21].
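A forward pass through a per-pixel classifier with the layer widths quoted above can be sketched in NumPy as follows. The weight initialisation, the absence of ReLU on the final layer, the softmax output, the choice of class index 1 for skin and the 31-band input are all assumptions for illustration; the weights are random, so the outputs are not meaningful predictions.

```python
import numpy as np

LAYER_WIDTHS = [64, 64, 128, 128, 2]  # channel counts given in the text

def init_mlp(input_dim, rng):
    """Random (untrained) weights and zero biases for each layer."""
    params, fan_in = [], input_dim
    for width in LAYER_WIDTHS:
        w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, width))
        params.append((w, np.zeros(width)))
        fan_in = width
    return params

def skin_probability(features, params):
    """features: (num_pixels, n_bands + 4). Returns P(skin) per pixel."""
    h = features
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)           # ReLU on hidden layers (assumed)
    h = h - h.max(axis=1, keepdims=True)     # numerically stable softmax
    p = np.exp(h)
    p /= p.sum(axis=1, keepdims=True)
    return p[:, 1]                           # assume class index 1 = skin

def log_loss(p_skin, labels, eps=1e-12):
    """Mean binary cross-entropy between P(skin) and 0/1 labels."""
    p = np.clip(p_skin, eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))

rng = np.random.default_rng(0)
n_bands = 31                                  # hypothetical band count
params = init_mlp(n_bands + 4, rng)
features = rng.random((10, n_bands + 4))      # radiance + 4 fitted parameters
p_skin = skin_probability(features, params)
loss = log_loss(p_skin, labels=np.array([1, 0] * 5))
```

Since each pixel is classified from its own feature vector only, the same function applies unchanged to a whole image reshaped to `(H * W, n_bands + 4)`.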

4.2 Blending skin and background edits

Our biophysical model is able to describe only skin spectral reflectance, so other facial features (eyes, teeth, facial hair) and the image background are not well explained. For this reason, we use the estimated skin probability $p$ to blend the original spectral radiance $\mathbf{i}_{\text{obs}}$ for non-skin regions with our edited biophysical spectral radiance $\mathbf{i}_{\text{edit}}$ for skin regions: $\mathbf{i}_{\text{blend}} = p\,\mathbf{i}_{\text{edit}} + (1 - p)\,\mathbf{i}_{\text{obs}}$.
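The blending step amounts to a per-pixel convex combination of edited and original radiance, weighted by the skin probability. A minimal sketch, with array shapes and names chosen here for illustration:

```python
import numpy as np

def blend_radiance(edited, original, skin_prob):
    """Blend spectral radiance cubes of shape (H, W, n) using an (H, W) skin probability map."""
    p = skin_prob[..., None]          # broadcast over the spectral axis
    return p * edited + (1.0 - p) * original

# Tiny hypothetical 1x3 image with 2 spectral bands.
original = np.zeros((1, 3, 2))
edited = np.ones((1, 3, 2))
skin_prob = np.array([[0.0, 0.5, 1.0]])
blended = blend_radiance(edited, original, skin_prob)
```

Using the soft probability rather than a binary mask gives a gradual transition at the skin boundary.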

4.3 Spectral radiance to nonlinear sRGB

For visualisation, a spectral radiance vector resulting from an edit must be rendered to a tristimulus RGB image. This arises from an integration over wavelength of the product of scene radiance (itself the product of illumination and reflectance spectra) and camera spectral sensitivity:

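$$c_k = \int e(\lambda)\, r(\lambda)\, q_k(\lambda)\, \mathrm{d}\lambda \qquad (4)$$

where $e(\lambda)$ is the SPD of the illuminant, $r(\lambda)$ the spectral reflectance of the surface and $q_k(\lambda)$ the spectral sensitivity of the camera in colour channel $k$. In our wavelength-discrete formulation, with $\mathbf{i}$ the spectral radiance vector, this becomes:

$$\mathbf{c}_{\text{sRGB}} = \left(\mathbf{T}\,\mathbf{Q}^\top \mathbf{i}\right)^{1/\gamma} \qquad (5)$$

where $\mathbf{Q} \in \mathbb{R}^{n \times 3}$ contains the wavelength-discrete versions of the spectral sensitivities of the camera, $\mathbf{T} \in \mathbb{R}^{3 \times 3}$ is a matrix that performs white balancing and colour transformation to sRGB space and $\gamma$ controls the nonlinear gamma.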
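The rendering step can be sketched as a projection onto the camera sensitivities, a 3×3 colour transform and a gamma nonlinearity. The simple power-law gamma with exponent `1/gamma`, the clipping to [0, 1] and the identity matrices in the example are assumptions; the official sRGB transfer function is piecewise, and real sensitivities and colour matrices would come from camera calibration data.

```python
import numpy as np

def radiance_to_srgb(radiance, sensitivities, colour_matrix, gamma=2.2):
    """Render spectral radiance (..., n) to nonlinear RGB (..., 3).

    sensitivities: (n, 3) camera spectral sensitivities.
    colour_matrix: (3, 3) white balance and colour transform.
    """
    raw = radiance @ sensitivities            # discrete integration over wavelength
    linear = raw @ colour_matrix.T            # white balance + colour transform
    linear = np.clip(linear, 0.0, 1.0)        # keep values displayable
    return linear ** (1.0 / gamma)            # simple power-law gamma (assumed)

# Hypothetical 3-band case where each band maps to exactly one channel.
sensitivities = np.eye(3)
colour_matrix = np.eye(3)
radiance = np.array([[0.25, 0.5, 1.0]])
rgb = radiance_to_srgb(radiance, sensitivities, colour_matrix)
rgb_linear = radiance_to_srgb(radiance, sensitivities, colour_matrix, gamma=1.0)
```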

5 Experiments

Figure 3: Qualitative comparison to [22], from left to right: sRGB render of multispectral image, specular shading, diffuse shading, melanin map, haemoglobin map and the reconstructed sRGB. The top row is our result, bottom is [22].

We begin by quantitatively evaluating how well our model and model fitting method are able to explain measured data. We use the 25 faces from the ISET database [21], compute the best fitting parameters for skin labelled pixels and then compute the mean relative absolute error in the reflectance spectra over all pixels in all images, yielding a 7.4% error.
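One plausible reading of the error measure used here is the absolute difference between reconstructed and measured spectra, divided by the measured value and averaged over all bands and pixels. The sketch below implements that reading; the exact normalisation used by the authors is not stated, so this definition and the example values are assumptions.

```python
import numpy as np

def mean_relative_absolute_error(measured, reconstructed, eps=1e-12):
    """Mean of |reconstructed - measured| / |measured| over all entries."""
    measured = np.asarray(measured, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    rel = np.abs(reconstructed - measured) / np.maximum(np.abs(measured), eps)
    return float(rel.mean())

# Two hypothetical pixels with two bands each.
measured = np.array([[0.2, 0.4], [0.5, 1.0]])
reconstructed = np.array([[0.22, 0.4], [0.45, 1.0]])
error = mean_relative_absolute_error(measured, reconstructed)  # about 0.05, i.e. 5%
```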

In Fig. 2 we show representative qualitative results from this experiment. Input, albedo and reconstruction RGB renderings use the mean camera spectral sensitivity from [23] and assume daylight (D65) illumination. The specular shading maps clearly pick up the specular reflections. Note also that the specular shading contains fine surface detail whereas the diffuse shading is blurred due to subsurface scattering (consistent with [3]). The diffuse albedo is computed directly from the biophysical spectral reflectance and shows that shading effects are not transferred to the biophysical parameters. Note that lips and flushed cheeks are clearly visible in the haemoglobin maps and that the overall melanin value accurately reflects skin colour. All results are masked to the skin region by binarising the skin probability maps.

There are no existing methods solving the same task, but for comparison we adapt the method of Tsumura et al. [22] to multispectral data. This is a statistical approach based on Independent Component Analysis (ICA), so the parameter maps have no exact physical meaning. However, as seen in Fig. 3, the maps do seem to have an approximate correspondence to our physically meaningful ones, though it is clear that our decomposition and reconstruction are more plausible.

Figure 4: Editing the decomposed maps. Rows 1-3 show the results of editing the melanin map. Row 4 presents the result of increasing the haemoglobin. The last row shows an application of specularity removal.

Finally, in Fig. 4 we present editing results using the method in Sec. 4. In each case we show the original RGB and one of the estimated maps, then the edited map and finally the edited RGB result. In the first row we scale the melanin by 0.75, giving the appearance of lighter skin. In the second row, we show freckle removal by applying a 2D median filter to the melanin map. In the third row we increase the melanin concentration by 0.2, which gives the face the appearance of darker skin, as if it were sun-tanned. In the fourth row we increase the haemoglobin map by 0.3. This gives the face a flushed appearance, as if it were overheated. Finally, in the fifth row we demonstrate specularity removal by setting the specular map to a constant.
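The five edits described above are simple per-map operations, sketched below in NumPy. Several details are assumptions: the text does not say whether the 0.2 and 0.3 increases are absolute or in normalised units, so the maps here are treated as normalised to [0, 1] and clipped after an additive offset; the median filter window size and the constant used for specularity removal are also arbitrary choices.

```python
import numpy as np

def median_filter_2d(image, size=3):
    """2D median filter with edge padding (size must be odd)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

def lighten(melanin, factor=0.75):
    return melanin * factor                        # row 1: scale melanin

def remove_freckles(melanin, size=3):
    return median_filter_2d(melanin, size)         # row 2: median filter

def tan(melanin, offset=0.2):
    return np.clip(melanin + offset, 0.0, 1.0)     # row 3: more melanin

def flush(haemoglobin, offset=0.3):
    return np.clip(haemoglobin + offset, 0.0, 1.0) # row 4: more haemoglobin

def remove_specular(specular, value=0.0):
    return np.full_like(specular, value)           # row 5: constant specular map

# Hypothetical 5x5 maps: uniform melanin with a single "freckle".
melanin = np.full((5, 5), 0.4)
melanin[2, 2] = 0.9
haemoglobin = np.full((5, 5), 0.2)
specular = np.linspace(0.0, 1.0, 25).reshape(5, 5)

lighter = lighten(melanin)
smooth = remove_freckles(melanin)
darker = tan(np.full((5, 5), 0.4))
flushed = flush(haemoglobin)
matte = remove_specular(specular)
```

After any of these edits, the modified maps would be pushed back through the reflectance model, blended with the unedited background and rendered to sRGB as described in Sec. 4.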

6 Conclusion

We have presented a novel hybrid of a spectral biophysical and dichromatic reflectance model and shown how to fit the model to multispectral face images. Our results show that the model is able to accurately explain multispectral skin reflectance and that the estimated maps provide a highly plausible decomposition. The surprising conclusion of our work is that there is sufficient information in a multispectral image to render the decomposition task well-posed. This is possible only by introducing the constraint of a biophysical model. In principle, our model could be fitted to four channel data so it would be interesting to see whether we can obtain similar results using RGB + NIR images. Our inverse rendered results could be used to learn statistical models of the variation in intrinsic biophysical face parameters [16].

References

  • [1] Z. Shu, E. Yumer, S. Hadap, K. Sunkavalli, E. Shechtman, and D. Samaras, “Neural face editing with intrinsic image disentangling,” in Proc. CVPR, 2017.
  • [2] J. Jimenez, T. Scully, N. Barbosa, C. Donner, X. Alvarez, T. Vieira, P. Matts, V. Orvalho, D. Gutierrez, and T. Weyrich, “A practical appearance model for dynamic facial color,” ACM Trans. Graphic., vol. 29, no. 6, pp. 141:1–141:10, 2010.
  • [3] W.-C. Ma, T. Hawkins, P. Peers, C.-F. Chabert, M. Weiss, and P. Debevec, “Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination,” in Proc. Eurographics Conference on Rendering Techniques, 2007, pp. 183–194.
  • [4] C. Donner, T. Weyrich, E. d’Eon, R. Ramamoorthi, and S. Rusinkiewicz, “A layered, heterogeneous reflectance model for acquiring and rendering human skin,” in ACM Trans. Graphics, 2008, vol. 27, p. 140.
  • [5] X. Huang G. C. Guarnera Y. Gitlina, D. S. J. Dhillon and A. Ghosh, “Practical multispectral imaging of human skin using RGBW+ illumination,” in CVMP Short Papers, 2018.
  • [6] E. Claridge, S. Cotton, P. Hall, and M. Moncrieff, “From colour to tissue histology: Physics-based interpretation of images of pigmented skin lesions,” Medical Image Analysis, vol. 7, no. 4, pp. 489–502, 2003.
  • [7] H. Kim, M. Zollhöfer, A. Tewari, J. Thies, C. Richardt, and C. Theobalt, “InverseFaceNet: Deep monocular inverse face rendering,” in Proc. CVPR, 2018.
  • [8] A. Tewari, M. Zollhofer, H. Kim, P. Garrido, F. Bernard, P. Perez, and C. Theobalt, “MoFA: Model-based deep convolutional face autoencoder for unsupervised monocular reconstruction,” in Proc. ICCV, 2017.
  • [9] Volker Blanz and Thomas Vetter, “A morphable model for the synthesis of 3d faces,” in Proc. SIGGRAPH, 1999, pp. 187–194.
  • [10] Soumyadip Sengupta, Angjoo Kanazawa, Carlos D. Castillo, and David W. Jacobs, “SfSNet: Learning shape, reflectance and illuminance of faces in the wild,” in Proc. CVPR, 2018.
  • [11] Cong Phuoc Huynh and Antonio Robles-Kelly, “A solution of the dichromatic model for multispectral photometric invariance,” International Journal of Computer Vision, vol. 90, no. 1, pp. 1–27, 2010.
  • [12] Steven A Shafer, “Using color to separate reflection components,” Color Research & Application, vol. 10, no. 4, pp. 210–218, 1985.
  • [13] Aravind Krishnaswamy and Gladimir VG Baranoski, “A biophysically-based spectral model of light interaction with human skin,” Computer Graphics Forum, vol. 23, no. 3, pp. 331–340, 2004.
  • [14] S. J. Preece and E. Claridge, “Spectral filter optimization for the recovery of parameters which describe human skin,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 7, pp. 913–922, July 2004.
  • [15] S. L. Jacques, “Skin optics summary,” Oregon Medical Laser Center News, January 1998.
  • [16] S. Alotaibi and W. A. P. Smith, “A biophysical 3D morphable model of face appearance,” in Proc. ICCV, 2017.
  • [17] R. Rox Anderson and John A. Parrish, “The optics of human skin,” Journal of Investigative Dermatology, vol. 77, no. 1, pp. 13–19, 1981.
  • [18] A. J. Thody, E. M. Higgins, K. Wakamatsu, S. Ito, S. A. Burchill, and J. M. Marks, “Pheomelanin as well as eumelanin is present in human epidermis,” Journal of Investigative Dermatology, vol. 97, no. 2, pp. 340–344, 1991.
  • [19] Scott Prahl, “Optical absorption of hemoglobin,” Oregon Medical Laser Center News, December 1999.
  • [20] Ross Flewelling, “Noninvasive Optical Monitoring,” in The Biomedical Engineering Handbook, Second Edition. 2 Volume Set, Electrical Engineering Handbook. CRC Press, dec 2000.
  • [21] ImageVal Consulting, “Hyperspectral Scene Data (415 – 915 nm): High Resolution Faces,” http://www.imageval.com/scene-database-4-faces-1-meter/, 2012.
  • [22] N. Tsumura, N. Ojima, K. Sato, M. Shiraishi, H. Shimizu, H. Nabeshima, S. Akazaki, K. Hori, and Y. Miyake, “Image-based skin color and texture analysis/synthesis by extracting hemoglobin and melanin information in the skin,” ACM Trans. Graph., vol. 22, no. 3, pp. 770–779, 2003.
  • [23] J. Jiang, D. Liu, J. Gu, and S. Süsstrunk, “What is the space of spectral sensitivity functions for digital color cameras?,” in 2013 IEEE Workshop on Applications of Computer Vision (WACV), Jan 2013, pp. 168–179.