Ear-to-ear Capture of Facial Intrinsics

09/08/2016 ∙ Alassane Seck et al. ∙ University of York, Aberystwyth University, University of Twente

We present a practical approach to capturing ear-to-ear face models comprising both 3D meshes and intrinsic textures (i.e. diffuse and specular albedo). Our approach is a hybrid of geometric and photometric methods and requires no geometric calibration. Photometric measurements made in a lightstage are used to estimate view dependent high resolution normal maps. We overcome the problem of having a single photometric viewpoint by capturing in multiple poses. We use uncalibrated multiview stereo to estimate a coarse base mesh to which the photometric views are registered. We propose a novel approach to robustly stitching surface normal and intrinsic texture data into a seamless, complete and highly detailed face model. The resulting relightable models provide photorealistic renderings in any view.


1 Introduction

2 Pipeline

The polarisation properties of light have been widely used as a cue to study surface shape and reflectance properties. One of the best known effects is that specular reflectance from a dielectric material preserves the plane of polarisation of linearly polarised incident light whereas the diffuse reflectance loses it. This allows separation of specular and diffuse reflectance using a cross-polarisation technique [29]. However, while this technique is relatively straightforward to calibrate, it is, unfortunately, view dependent.

We overcome this problem by capturing a face multiple times in different poses relative to the calibrated viewpoint, e.g. frontal and two profile views. Together, these three photometric views provide full ear-to-ear coverage of the face. We augment the photometric camera with additional cameras providing multiview, single-shot images captured in sync with a reference frame of the photometric sequence (the diffuse constant image). We position these additional cameras to provide overlapping coverage of the face. Since we do not rely on a fixed calibration, their exact positioning is unimportant and we allow the cameras to autofocus between captures. In our setup, we use 7 such cameras in addition to the photometric view giving a total of 8 simultaneous views. Since we repeat the capture three times, we have 24 effective views. A complete dataset for a face is shown in Figure 2.

In order to merge these views and to provide a rough base mesh, we perform a multiview reconstruction using all 24 views. Solving this uncalibrated multiview reconstruction problem provides both the base mesh and also intrinsic and extrinsic camera parameters for the three photometric views. These form the input to our stitching process. Note that since the three photometric views are not acquired simultaneously, there is likely to be non-rigid deformation of the face between these views. For this reason, in Section 5 we propose a robust algorithm for stitching the views without blurring potentially misaligned features.

Our complete pipeline is summarised as follows:

  1. Uncalibrated multiview stereo: We commence by applying structure-from-motion followed by dense multiview stereo to all 24 viewpoints.

  2. Photometric capture: For the three photometric viewpoints, we capture a 14-image spherical gradient illumination sequence, comprising the 7 gradient conditions each captured with crossed and parallel polarising filter orientations on the camera.

  3. Per-view alignment and bias removal: For each photometric viewpoint, we compensate for subject motion using the photometric alignment technique described in Section 4 and estimate diffuse and specular surface normal maps. We perform bias removal for each view, accounting for the pose-dependency of light source discretisation on the estimated normals.

  4. Stitching photometric views: Finally, we stitch the diffuse and specular albedo and surface normals onto the base mesh. The surface normal stitching is done in the mesh domain so that the detail is transferred to the vertices simultaneously with stitching the normals.

Step 1 is now a well studied problem and with high resolution face images, satisfactory results can be obtained using existing methods such as the Bundler SFM tool [21] and PMVS for multiview stereo [22]. We use the commercial tool Agisoft PhotoScan (www.agisoft.com). For step 2 we use an opto-electrical polarising filter to allow diffuse/specular separation without the need for mechanical filter rotation. Such filters form part of active 3D projection systems and are available cheaply. Optionally, for reasons of efficiency it may be desirable to decimate the final mesh and store texture and shape detail in 2D maps. This can be done as a post-processing step to our pipeline using any existing surface parameterisation and decimation algorithms.

3 Spherical Gradient Photometric Stereo

Spherical Gradient Photometric Stereo was introduced by Ma et al. [1] and refined by Wilson et al. [6]. The idea amounts to something very simple: estimate the first moment (centre of mass) of the reflectance lobe at a point by illuminating that point with a linear spherical gradient. For a Lambertian surface, this direction coincides with the surface normal and, for a specular surface, with the reflection direction (from which the surface normal can be calculated).

3.1 The Lambertian case

Let $L_x$, $L_y$, $L_z$ and $L_c$ denote respectively the measured Lambertian radiances under the $x$-gradient, $y$-gradient, $z$-gradient and constant illuminations. Ma et al. [1] established the relation between the surface normal $\mathbf{n}$ and the measured Lambertian radiances as follows:

$$\mathbf{n} = \frac{1}{\alpha}\left(2L_x - L_c,\; 2L_y - L_c,\; 2L_z - L_c\right)^T \qquad (1)$$

where $\alpha$ is a normalizing constant.

3.2 The specular case

In the specular case, Ma et al. [1] show that it is easier to estimate, from the measured specular radiances, the specular reflection vector than the surface normal directly. If $S_x$, $S_y$, $S_z$ and $S_c$ denote respectively the measured specular radiances under the $x$-gradient, $y$-gradient, $z$-gradient and constant illuminations, the specular reflection vector $\mathbf{r}$ is given by:

$$\mathbf{r} = \frac{1}{\beta}\left(2S_x - S_c,\; 2S_y - S_c,\; 2S_z - S_c\right)^T \qquad (2)$$

where $\beta$ is a normalizing constant.

As the surface normal corresponds to the direction half-way between the view vector $\mathbf{v}$ (which is $[0, 0, 1]^T$ in our case) and its specular reflection $\mathbf{r}$, it can be obtained by:

$$\mathbf{n} = \frac{\mathbf{r} + \mathbf{v}}{\left\|\mathbf{r} + \mathbf{v}\right\|} \qquad (3)$$
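For illustration, a minimal numpy sketch of this per-pixel computation (Eqs. (2)-(3) as reconstructed above; function and array names are illustrative, not from the paper) might read:

```python
import numpy as np

def specular_normals(S_x, S_y, S_z, S_c, eps=1e-6):
    """Estimate per-pixel surface normals from the specular component of
    polarised spherical gradient images. Inputs are HxW float arrays;
    returns an HxWx3 unit normal map."""
    # Specular reflection vector from gradient/constant measurements (Eq. 2).
    r = np.stack([2.0 * S_x - S_c,
                  2.0 * S_y - S_c,
                  2.0 * S_z - S_c], axis=-1)
    r /= np.linalg.norm(r, axis=-1, keepdims=True) + eps
    # The normal is the half vector between r and the view direction v = [0,0,1] (Eq. 3).
    v = np.array([0.0, 0.0, 1.0])
    n = r + v
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + eps
    return n
```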

3.3 Complement Gradient Illumination

Wilson et al. [6] proposed an improved method for calculating the surface normals from Spherical Gradient Illumination. The authors exploit spherical gradient images obtained under complementary lighting conditions, i.e. for which the lighting coordinate system is reversed. Thus, in addition to the four gradient images $L_x$, $L_y$, $L_z$ and $L_c$ proposed by Ma et al. [1], they capture three others $\bar{L}_x$, $\bar{L}_y$ and $\bar{L}_z$ such that:

$$L_i + \bar{L}_i = L_c, \qquad i \in \{x, y, z\} \qquad (4)$$

From (4), (1) and a re-normalization, they obtain:

$$\mathbf{n} = \frac{\left(L_x - \bar{L}_x,\; L_y - \bar{L}_y,\; L_z - \bar{L}_z\right)^T}{\left\|\left(L_x - \bar{L}_x,\; L_y - \bar{L}_y,\; L_z - \bar{L}_z\right)^T\right\|} \qquad (5)$$

This method is proven to improve the quality of the calculated normals and is more robust than the original method of Ma et al. [1]. This is explained by the fact that the dark regions in one gradient image are likely to be well lit in the complement image, hence improving signal to noise ratio.
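A corresponding sketch for the complement-based diffuse normals of Eq. (5), again with illustrative array names and assuming the six float-valued gradient/complement images are in register, is:

```python
import numpy as np

def diffuse_normals_complement(L_x, L_y, L_z, L_x_bar, L_y_bar, L_z_bar, eps=1e-6):
    """Diffuse surface normals from gradient images and their complements
    (Wilson et al. style, Eq. 5 as reconstructed above).
    Inputs are HxW float arrays; returns an HxWx3 unit normal map."""
    n = np.stack([L_x - L_x_bar, L_y - L_y_bar, L_z - L_z_bar], axis=-1)
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + eps)
```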

4 Photometric Alignment

Since spherical gradient photometric stereo requires a set of images to be captured in series, the images within a sequence may not be in perfect alignment due to subject motion. In the context of estimating fine scale shape, these small misalignments lead to a blurring of detail. Since inter-frame motion is likely to be very small (perhaps sub-pixel) and visibility is unlikely to change between views, the obvious solution is to use optical flow to align each image to a reference frame. However, due to illumination changes in each frame the usual brightness constancy constraint does not apply (we expect the brightness of a given point on the face to vary dramatically as illumination changes).

Wilson et al. [6] overcame this problem by exploiting a property of the complement images. Assuming no motion, the sum of a gradient image and its complement is equal to the constant image. Hence, an alternative brightness constancy constraint can be written down. For example, for the $x$-gradient images:

$$L_x(\mathbf{p} + \mathbf{u}) + \bar{L}_x(\mathbf{p} + \bar{\mathbf{u}}) = L_c(\mathbf{p}) \qquad (6)$$

where $\mathbf{u}$ and $\bar{\mathbf{u}}$ are the optical flow fields aligning the gradient and complement images to the constant image. This involves solving for the optical flow vectors for both gradient and complement images in one go. Wilson et al. [6] propose an iterative approach to this problem where they initially compute the flow from $L_x$ to $L_c - \bar{L}_x$, followed by the flow from $\bar{L}_x$ to $L_c - W(L_x)$, where $W(L_x)$ is the gradient image warped with the flow computed at the previous step. It is proposed that iterating these two steps converges towards the correct flow for both images.

A weakness of their approach is that $L_c - \bar{L}_x$ is not necessarily a good target for warping. As $L_c$ and $\bar{L}_x$ are not aligned, taking their difference leads to a blurring of features to which $L_x$ is unlikely to be satisfactorily warped. In an extreme case, it can be shown that this method can fail completely.

We propose an alternative that is both more efficient and more robust. We note that changing the spherical illumination pattern affects only intensity and not colour. Thus we use colour space transformations to extract intensity-free information from images under different illumination conditions. Hue-Saturation-Value (HSV) and normalized-RGB colour spaces are known to be efficient ways of separating intrinsic colour from shading-related intensity [23]. For an image $I$ with channels $(R, G, B)$, we combine the Hue component $H(I)$ of the HSV space with the normalized-RGB channels to produce an illumination-independent image $I^*$:

$$I^* = \left( H(I),\; \frac{R}{R+G+B},\; \frac{G}{R+G+B} \right) \qquad (7)$$
Fig. 3: Illumination-independent images for photometric alignment (X-gradient and constant illumination).

Figure 3 shows two images in different spherical gradient lighting patterns (X-gradient and Constant) and the corresponding illumination-independent images.

However, while allowing good alignment of the global shape, the colour transformation tends to smooth out fine details, which can lead to local misalignments. These are more significant because the motion is non-rigid. We correct this by employing our method to initialize Wilson’s method: we use the flow computed between the illumination-independent images to pre-align $L_x$ and $\bar{L}_x$ before computing the target $L_c - \bar{L}_x$. In practice, our experiments show that only one iteration after this initialization is enough to obtain very good alignment. Figure 4 compares normal maps obtained when the photometric images are aligned with 3 iterations of Wilson’s method (a), with our initialization alone (b), and with 1 iteration after our initialization (c).

Fig. 4: Normal maps obtained with different alignment strategies. (a) Wilson’s method after 3 iterations. (b) Our method after 0 iterations (initialization only). (c) Our method after 1 iteration.
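As a rough illustration of this alignment step, the sketch below uses OpenCV to build an illumination-invariant image and warp a gradient frame into the constant-image reference frame. The particular channel combination, the Farneback flow parameters and all names are our assumptions for the sketch, not the paper's exact implementation.

```python
import cv2
import numpy as np

def illumination_invariant(img_bgr):
    """Combine hue with normalised RGB to suppress shading (cf. Eq. 7).
    img_bgr: float32 BGR image in [0, 1]. Returns a uint8 single-channel
    image suitable for dense optical flow."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    hue = hsv[..., 0] / 360.0                        # hue scaled to [0, 1]
    s = img_bgr.sum(axis=-1, keepdims=True) + 1e-6
    nrgb = img_bgr / s                               # normalised RGB
    inv = 0.5 * hue + 0.5 * nrgb[..., 2]             # blend with normalised red (assumed)
    return (255.0 * inv / inv.max()).astype(np.uint8)

def align_to_reference(moving_bgr, reference_bgr):
    """Warp `moving_bgr` into the frame of `reference_bgr` using dense optical
    flow computed on the illumination-invariant images."""
    ref = illumination_invariant(reference_bgr)
    mov = illumination_invariant(moving_bgr)
    flow = cv2.calcOpticalFlowFarneback(ref, mov, None,
                                        0.5, 4, 21, 5, 7, 1.5, 0)
    h, w = ref.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (gx + flow[..., 0]).astype(np.float32)
    map_y = (gy + flow[..., 1]).astype(np.float32)
    return cv2.remap(moving_bgr, map_x, map_y, cv2.INTER_LINEAR)
```

The warped gradient and complement frames can then be fed into a single iteration of Wilson-style alignment, as described above.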

4.1 Bias Removal

The surface normals estimated using spherical gradient photometric stereo are subject to low frequency bias caused by a number of factors. Perhaps most significant is the discretisation of the illumination environment. The analysis above is based on the assumption that the gradient illumination is a continuous field of illumination. In practice, we use 41 LEDs distributed over a geodesic dome. Second, the method assumes that the light sources are distant and so attenuation effects are constant over the face surface. This is not the case as a head is fairly large (approximately 15cm wide) relative to the size of the geodesic dome (diameter 1.8m). Third, there is also an assumption of no occlusions. Hence, for concave regions of the face, the surface normal is biased towards the unoccluded directions. Other sources of noise such as errors in the light source positions, imperfect specular/diffuse separation and camera nonlinearities further bias the estimated normals.

Many of these effects are pose dependent. For example, the light source positions relative to the face (and hence the discretisation effects) change as the pose of the face changes. For this reason, we perform low frequency bias removal for each photometric view independently prior to stitching. To do so, we use the base mesh provided by multiview stereo to project a depth map into each photometric view. We then combine the high frequency components of the photometric normals with the low frequency components of the depth map normals using the method of Nehab et al. [11]. Finally, we transform the corrected normals into world coordinates by applying a rotation based on the extrinsic parameters estimated for that view by structure from motion.
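The low/high frequency combination can be illustrated with a simplified frequency-splitting stand-in for Nehab et al.'s optimisation [11]: take the low frequencies from the normals of the base mesh depth map and the high frequencies from the photometric normals. This is only a sketch under our own simplification, not the method of [11].

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def remove_low_frequency_bias(n_photo, n_mvs, sigma=15.0, eps=1e-6):
    """Combine low frequencies of the multiview-stereo normals with high
    frequencies of the photometric normals (simplified stand-in for [11]).
    n_photo, n_mvs: HxWx3 unit normal maps rendered in the same view."""
    def blur(n):
        return np.stack([gaussian_filter(n[..., k], sigma) for k in range(3)], axis=-1)
    low_mvs = blur(n_mvs)                 # low-frequency shape from the base mesh
    high_photo = n_photo - blur(n_photo)  # high-frequency detail from photometrics
    n = low_mvs + high_photo
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + eps)
```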

5 Stitching photometric views

In this section, we describe a method for seamlessly stitching the intrinsic textures and normal maps from each photometric view onto the base mesh obtained with multiview stereo. This is a non-trivial problem.

The constraints of linear polarisation necessitate that the three photometric image sets are taken at different times (the subjects rotate themselves to allow capture of one frontal and two profile views). Hence, the face is likely to have changed shape between views, meaning that there is no single correct shape and that correspondence between images and mesh is imperfect. Moreover, certain reflectance effects introduce a view-dependency on the intrinsic textures. For example, Fresnel gain means that specular albedo is unreliable close to occluding boundary (see Figure 1). Applying a baseline texture stitching algorithm (such as back-projection and averaging) to such data leads to blurring of misaligned features, visible seams between textures taken from different views and inclusion of unwanted specular effects. In addition, we are not aware of any previous work that tackles the problem of stitching normal maps.

To address these problems, we propose a unified approach that allows us to stitch both intrinsic textures and shape. Our approach is based on Poisson blending using non-conservative guidance fields. The guidance fields are either in the form of texture gradients or photometric surface normals. Our approach uses overlapping patches. Within a patch, the guidance field is taken from the single best view (the one with least average viewing angle). In overlap regions, we make per-vertex (for shape) or per-triangle (for texture) selections. The majority of texture stitching algorithms are vertex- or face-based strategies with additional heuristics for robustness. We expect a patch-based approach to improve robustness by allowing selection criteria to be aggregated over a patch. Also, since a patch is taken from a single view, there will be no blending artefacts within a patch while the patch overlaps provide a means to blend between textures taken from different views.

Our stitching pipeline is as follows. We begin by sampling the photometric observations onto the base mesh provided by multiview stereo. For each view, we determine the set of visible vertices on the mesh. Each of the intrinsic textures (diffuse/specular albedo and normal map) is then sampled onto the mesh by back-projection for the visible vertices, using bilinear interpolation within the pixel grid. Additionally, the viewing angles for each face and vertex are computed as part of the process and stored for later use. We then segment the base mesh into overlapping, uniformly sized patches. Finally, we perform stitching using two techniques based on Poisson blending.

5.1 Mesh segmentation

We achieve mesh segmentation with a classical farthest-point strategy [24], enhanced with an original patch-growing scheme to form an overlapping structure. We consider a triangular base shape mesh $\mathcal{M}$ and assume it describes a 2D manifold. The connectivity is given by a simplicial complex whose elements are vertices $v_i$, edges $e_{ij}$ or faces $f_{ijk}$, with indices $i, j, k \in \{1, \ldots, N\}$, where $N$ is the number of vertices. We write a vertex $v_i$ as $i$ for simplicity.

We first select vertices iteratively by adding a new sample one at a time. Our mesh is equipped with a geodesic distance map $d(\cdot, \cdot)$. Denoting by $D_k(i) = \min_{1 \le l \le k} d(i, s_l)$ the geodesic distance map to the first $k$ selected samples $s_1, \ldots, s_k$, we select sample $s_{k+1}$ as the vertex that maximizes $D_k$. The distance map can simply be updated as the minimum between $D_k$ and $d(\cdot, s_{k+1})$. We continue this process until a desired number $P$ of vertices have been sampled.
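As an illustration, a simple implementation of this farthest-point sampling, approximating geodesic distances by shortest paths along mesh edges (an approximation we introduce here for the sketch; the paper does not specify the geodesic computation), could look like:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def farthest_point_samples(vertices, edges, n_samples, seed=0):
    """Iterative farthest-point sampling on a mesh.
    vertices: (N,3) float array; edges: (M,2) int array of vertex indices;
    returns sample indices and the final distance map."""
    n = len(vertices)
    lengths = np.linalg.norm(vertices[edges[:, 0]] - vertices[edges[:, 1]], axis=1)
    rows = np.concatenate([edges[:, 0], edges[:, 1]])
    cols = np.concatenate([edges[:, 1], edges[:, 0]])
    graph = csr_matrix((np.concatenate([lengths, lengths]), (rows, cols)), shape=(n, n))

    samples = [seed]
    dist = dijkstra(graph, indices=seed)       # distance map to the first sample
    for _ in range(n_samples - 1):
        s = int(np.argmax(dist))               # farthest vertex from current samples
        samples.append(s)
        dist = np.minimum(dist, dijkstra(graph, indices=s))
    return np.array(samples), dist
```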

Patches $\{\mathcal{P}_p\}_{p=1}^P$ are then obtained via the geodesic Voronoi tessellation based on the samples. The segmentation thus defines a dual graph $G = (V_G, E_G)$, where $V_G = \{1, \ldots, P\}$, and $(p, q) \in E_G$ if $\mathcal{P}_p$ and $\mathcal{P}_q$ are neighbors, i.e., are connected by an edge of the mesh. To grow a patch $\mathcal{P}_p$, we consider separately each of its neighbor patches $\mathcal{P}_q$ with $(p, q) \in E_G$, and define thresholds as follows:

$$\tau_{pq} = \rho \, \max_{i \in \mathcal{P}_q} d_{pq}(i, \mathcal{P}_p) \qquad (8)$$

where $\rho$ is set by the user and can be seen as an overlap ratio or factor, and the geodesic distance $d_{pq}$ is restricted to the union $\mathcal{P}_p \cup \mathcal{P}_q$ of the reference patch and considered neighbor. The overlap $\mathcal{O}_{pq}$ of $\mathcal{P}_p$ onto $\mathcal{P}_q$ is then constructed by geodesic projections:

$$\mathcal{O}_{pq} = \left\{\, i \in \mathcal{P}_q \;:\; d_{pq}(i, \mathcal{P}_p) \le \tau_{pq} \,\right\} \qquad (9)$$

A given grown patch $\widetilde{\mathcal{P}}_p$ is eventually constructed by concatenation of the reference patch with the respective overlaps:

$$\widetilde{\mathcal{P}}_p = \mathcal{P}_p \cup \bigcup_{q : (p,q) \in E_G} \mathcal{O}_{pq} \qquad (10)$$
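The Voronoi assignment and overlap growth could then be sketched as follows. Note that the growth criterion below is a simplification of Eqs. (8)-(10): it grows each patch by a fraction of its own geodesic radius rather than using the boundary-restricted geodesic projection, and all names are illustrative.

```python
import numpy as np
from scipy.sparse.csgraph import dijkstra

def voronoi_patches_with_overlap(graph, samples, overlap_ratio=0.3):
    """Geodesic Voronoi segmentation followed by simplified patch growing.
    graph: sparse vertex adjacency with edge lengths (as built above);
    samples: (P,) seed vertex indices; returns labels and per-patch vertex sets."""
    d = dijkstra(graph, indices=samples)       # (P, N) distances to each seed
    labels = np.argmin(d, axis=0)              # Voronoi assignment per vertex
    patches = []
    for p in range(len(samples)):
        core = np.where(labels == p)[0]
        radius = d[p, core].max()              # geodesic "radius" of the core patch
        grown = np.where(d[p] <= (1.0 + overlap_ratio) * radius)[0]
        patches.append(np.union1d(core, grown))
    return labels, patches
```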
Fig. 5: Mesh segmentation with different numbers of sampled vertices (left: 100; right: 400).

5.2 Poisson Blending

Blending in the gradient domain via solution of a Poisson equation was first proposed by Pérez et al. [25] for 2D images. The motivation is that second-order variations in texture are the most significant perceptually whereas low-frequency variations have a barely noticeable effect. The same argument can be made for texture and geometry on a mesh. The approach allows us to avoid visible seams where texture or geometry from different views are inconsistent.

Hence, the idea is to form a guidance field of texture gradients $\mathbf{w}$ selected from the source images and then solve for the texture $f$ whose gradients best match the guidance field:

$$\min_f \int_{\Omega} \left\|\nabla f - \mathbf{w}\right\|^2 \qquad (11)$$

This minimisation problem can be solved by solving the Poisson equation:

$$\Delta f = \nabla \cdot \mathbf{w} \qquad (12)$$

where $\Delta$ is the Laplace operator and $\nabla \cdot$ is the divergence operator. For non-conservative guidance fields, an exact solution is not possible so Poisson’s equation is usually solved in a least squares sense. In our case, the function $f$ is defined over the mesh surface so $\Delta$ is the Laplace-Beltrami operator. In the case of stitching shape, $f$ becomes the mesh coordinate function.

5.3 Discrete differential operators

In order to solve a Poisson equation over a triangle mesh, we need to define discrete counterparts to the Laplace and divergence operators. Since we rely on discrete differential operators defined intrinsically on the mesh surface, our approach exactly preserves conservative vector fields, unlike the extrinsic 3D finite elements used in [26]. This makes our approach more natural from a theoretical perspective, even though the guidance fields formed in practice are non-conservative.

A discrete vector field $\mathbf{w}$ is a piecewise constant vector function defined for each triangle $T$ by a coplanar vector $\mathbf{w}_T$. A discrete potential field is a piecewise linear function $f(\mathbf{x}) = \sum_i f_i \phi_i(\mathbf{x})$ on the mesh surface, where $\phi_i$ is the piecewise linear basis function valued $1$ at vertex $i$ and $0$ at other vertices, and $f_i$ specifies the value of $f$ at vertex $i$. The discrete gradient of $f$ for triangle $T$ is $\nabla f|_T = \sum_i f_i \nabla \phi_i|_T$, where $\nabla \phi_i|_T$ is the gradient of $\phi_i$ within $T$. The divergence of $\mathbf{w}$ at vertex $i$ is $(\nabla \cdot \mathbf{w})_i = \sum_{T \in \mathcal{T}_i} A_T \, \nabla \phi_i|_T \cdot \mathbf{w}_T$, where $\mathcal{T}_i$ is the set of triangles sharing vertex $i$ and $A_T$ is the area of triangle $T$. Writing Poisson’s equation in this framework leads to a linear system of equations $A\mathbf{f} = \mathbf{b}$ for the unknown potential values $f_i$, where:

$$A_{ij} = \sum_{T \in \mathcal{T}_i \cap \mathcal{T}_j} A_T \, \nabla \phi_i|_T \cdot \nabla \phi_j|_T, \qquad b_i = \sum_{T \in \mathcal{T}_i} A_T \, \nabla \phi_i|_T \cdot \mathbf{w}_T \qquad (13)$$

This system is sparse since the coefficient $A_{ij}$ is non-null iff $(i, j)$ is an edge of the mesh. The sum is then simply over the triangles (two if not a boundary edge, one otherwise) sharing this edge.

This equation can be interpreted as seeking a potential field $f$ whose gradient matches the guide vector field $\mathbf{w}$. If $\mathbf{w}$ is conservative, i.e., it is the gradient of an existing potential field $g$, then $f = g$ is the exact solution (up to a constant). Otherwise, a more general minimizer can still be obtained by least squares but its gradient differs from $\mathbf{w}$. In addition, we regularize the minimization via screening:

$$\min_f \int_{\mathcal{M}} \left\|\nabla f - \mathbf{w}\right\|^2 + \lambda \int_{\mathcal{M}} (f - g)^2 \qquad (14)$$

where $\lambda > 0$ and $g = \sum_i g_i \phi_i$ defines a guide potential field.
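For concreteness, the following sketch assembles and solves this screened Poisson system with scipy sparse linear algebra, once per colour channel (or per coordinate when stitching shape). It uses a lumped mass matrix for the screening term and our own variable names; it is a minimal sketch, not the paper's implementation.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def solve_screened_poisson(verts, faces, guide_field, guide_potential, lam=1e-3):
    """Least-squares fit of a per-vertex potential f whose gradient matches a
    per-triangle guide vector field, screened towards a guide potential.
    verts: (N,3), faces: (F,3) int, guide_field: (F,3), guide_potential: (N,)."""
    n_v = len(verts)
    rows, cols, vals = [], [], []
    b = np.zeros(n_v)
    area_v = np.zeros(n_v)                               # lumped vertex areas
    for f, tri in enumerate(faces):
        p = verts[tri]
        nrm = np.cross(p[1] - p[0], p[2] - p[0])
        area = 0.5 * np.linalg.norm(nrm)
        nrm = nrm / (2.0 * area)                         # unit triangle normal
        # Gradients of the three hat functions within this triangle.
        grads = [np.cross(nrm, p[(i + 2) % 3] - p[(i + 1) % 3]) / (2.0 * area)
                 for i in range(3)]
        for i in range(3):
            b[tri[i]] += area * grads[i].dot(guide_field[f])   # divergence term b_i
            area_v[tri[i]] += area / 3.0
            for j in range(3):
                rows.append(tri[i]); cols.append(tri[j])
                vals.append(area * grads[i].dot(grads[j]))     # stiffness term A_ij
    A = sp.csr_matrix((vals, (rows, cols)), shape=(n_v, n_v))
    M = sp.diags(area_v)                                       # lumped mass matrix
    return spsolve(A + lam * M, b + lam * M.dot(guide_potential))
```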

5.4 Texture blending

We apply this to solve for texture by considering each colour channel independently as a potential field $f$. For each view $k$, we compute the mean viewing angle of the vertices in the different patches. Unobserved vertices, due either to occlusion or missing information, are assigned a worst-case viewing angle. Hence, patches with unobserved data are penalized and no distinction is made on the nature of the non-observability. For each patch, we then select texture from the view in which the patch has the smallest mean viewing angle. For unobserved vertices, we also select texture from subsequent sorted views. We end up with partial textures that we stitch in the overlaps by Poisson blending. To build up the guide vector field $\mathbf{w}$, we select local texture gradients by least angle for each triangle $T$:

$$\mathbf{w}_T = \nabla I_{k^*(T)}\big|_T \qquad (15)$$

where $k^*(T)$ is the view whose viewing angle is minimal for triangle $T$. We also fill in unobserved faces simply by setting their gradients to zero for smoothness. Screening is done via a rough estimate $g$ obtained by averaging the textures, unobserved textures being discarded from the regularization. We use a small penalty $\lambda$ to remove colour offset indeterminacies since we did not observe dramatic colour bleeding issues compared to [26]. We show in Figure 6 the results of the gradient stitching on the diffuse and specular textures.
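A minimal sketch of this per-triangle guide selection, assuming the per-view colour samples, per-view viewing angles and hat-function gradients have already been computed (the data layout and names are ours), might look as follows:

```python
import numpy as np

def select_guide_gradients(per_view_colors, per_view_angles, tri_grads, faces):
    """Per-triangle guide gradients for one colour channel.
    per_view_colors: (V, N) per-vertex colour samples for V views;
    per_view_angles: (V, F) viewing angle per view and triangle (inf if unobserved);
    tri_grads: (F, 3, 3) hat-function gradients per triangle; faces: (F, 3)."""
    n_views, n_faces = per_view_angles.shape
    guide = np.zeros((n_faces, 3))
    best = np.argmin(per_view_angles, axis=0)          # least-angle view per face
    for f in range(n_faces):
        k = best[f]
        if np.isfinite(per_view_angles[k, f]):         # observed in view k?
            c = per_view_colors[k, faces[f]]           # colours at the 3 vertices
            guide[f] = c @ tri_grads[f]                # gradient of sum_i c_i * phi_i
    return guide                                       # unobserved faces keep zero gradient
```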

Fig. 6: Results of our patch-based gradient stitching for (a) diffuse and (b) specular albedo.

5.5 Surface normal blending

Ultimately, our goal is to transfer the detail from the photometric normal maps to the mesh surface. One approach to this problem would be to start by stitching the normal maps from each view into a seamless and complete normal map for the whole face using the texture stitching approach above. Then, the normals could be embossed onto the mesh using an algorithm such as Nehab’s [11]. There are two drawbacks to this approach. First, since normal maps are fields of unit vectors, the stitching must preserve unit length. Hence, the linear least squares solution used for textures would need to include quadratic equality constraints. This amounts to a quadratically constrained quadratic program which is no longer a convex optimisation problem. Second, the stitched texture will not necessarily correspond to a real surface. That is to say, the normals would not satisfy an integrability constraint.

We solve both of these problems by proposing a method to simultaneously stitch the normals and transfer the detail to the mesh. We do so using the same patch-based approach as for texture data and hence provide a unifying framework for Poisson blending both texture and shape using patches.

Instead of stitching in the surface normal domain, we solve for the mesh whose surface normals best fit those selected from the photometric normal maps by the patch-based selection approach. Our guidance field takes the form of per-vertex surface normals. We begin by writing the Laplace-Beltrami operator as applied to the mesh coordinate function at a vertex and note the relationship to the surface normal direction:

$$\Delta \mathbf{x}_i = \frac{1}{2A_i} \sum_{j \in N(i)} \left(\cot\alpha_{ij} + \cot\beta_{ij}\right)\left(\mathbf{x}_j - \mathbf{x}_i\right) = -2H_i \mathbf{n}_i \qquad (16)$$

where the weights are $w_{ij} = \cot\alpha_{ij} + \cot\beta_{ij}$, $A_i$ is the area of the Voronoi cell of vertex $i$, $H_i$ is the mean curvature at vertex $i$, and $\alpha_{ij}$ and $\beta_{ij}$ are the two angles opposite the edge $(i, j)$.

We cannot directly apply mesh editing techniques to our problem. If the mean curvature normal was known at each vertex, our problem would simply be a Laplacian mesh editing problem. Instead, we know only the unit surface normal. However, we propose a linearisation inspired by the direct linear transformation (DLT) algorithm [27]. We can obtain a linear system of equations by noting that the Laplacian coordinates and surface normal differ only by a scale factor:

$$\Delta \mathbf{x}_i \simeq \mathbf{n}_i \qquad (17)$$

where $\simeq$ denotes equality up to a non-zero scalar multiplication. Such sets of relations can be solved using the DLT. The idea is to minimise the cross product between the differential coordinates and the target normals. This has the nice property of giving higher weight to regions of high curvature so our method will seek to preserve high frequency detail. Accordingly, we write

$$\left[\mathbf{n}_i\right]_\times \Delta \mathbf{x}_i = \mathbf{0} \qquad (18)$$

where $\mathbf{n}_i = (n_x, n_y, n_z)^T$ and $\left[\mathbf{n}_i\right]_\times$ is the cross product matrix:

$$\left[\mathbf{n}_i\right]_\times = \begin{pmatrix} 0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0 \end{pmatrix} \qquad (19)$$

Hence, each vertex normal contributes three linear equations, leading to a large, sparse system of linear equations. The solution is ambiguous up to a scale factor and we wish to retain the low frequency characteristics of the base shape. Hence, as for texture, we add a screening term that penalises departure from the vertex positions of the base mesh. In practice, we give the screening term a very low weight to maximise detail transfer from the normal maps.
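A condensed numpy/scipy sketch of this normal-driven refinement is given below, under our own simplifications: a basic cotangent Laplacian without Voronoi-area normalisation (which the cross-product constraint tolerates, being invariant to per-vertex scale), a single global screening weight, and illustrative function names.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def cotan_laplacian(verts, faces):
    """Cotangent-weighted Laplacian L; L @ X is parallel (up to scale and sign)
    to the Laplace-Beltrami of the coordinate function."""
    n = len(verts)
    I, J, W = [], [], []
    for tri in faces:
        for k in range(3):
            i, j, o = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
            u, v = verts[i] - verts[o], verts[j] - verts[o]
            cot = u.dot(v) / (np.linalg.norm(np.cross(u, v)) + 1e-12)
            I += [i, j]; J += [j, i]; W += [0.5 * cot, 0.5 * cot]
    L = sp.csr_matrix((W, (I, J)), shape=(n, n))
    return sp.diags(np.asarray(L.sum(axis=1)).ravel()) - L

def refine_with_normals(verts, faces, target_normals, lam=1e-2):
    """Solve for vertex positions whose Laplacian coordinates are parallel to
    the stitched target normals (DLT-style cross-product constraints, Eq. 18),
    screened towards the base mesh (lam small to maximise detail transfer)."""
    n = len(verts)
    L = cotan_laplacian(verts, faces)
    blocks = [np.array([[0, -nv[2], nv[1]],
                        [nv[2], 0, -nv[0]],
                        [-nv[1], nv[0], 0]]) for nv in target_normals]
    Nx = sp.block_diag(blocks, format='csr')             # stacked [n_i]_x matrices
    A = Nx @ sp.kron(L, sp.eye(3), format='csr')         # rows enforce [n_i]_x (L X)_i = 0
    S = np.sqrt(lam) * sp.eye(3 * n, format='csr')       # screening towards base mesh
    rhs = np.concatenate([np.zeros(3 * n), np.sqrt(lam) * verts.ravel()])
    sol = lsqr(sp.vstack([A, S]), rhs)[0]
    return sol.reshape(n, 3)
```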

In Figure 7 we compare Poisson blending of the surface normals with naive back projection of the least angle patch. Without the blending, the seams are clearly visible at patch boundaries when the view from which they are selected changes. After Poisson blending, we show the normals of the refined mesh where it is clear the transition between views is smooth.

Fig. 7: Normal stitching before (a) and after (b) Poisson blending

5.6 View-Dependent Fresnel Gain

The measured specular albedo has a view dependency which makes it unreliable when the viewing angle is large, particularly close to glancing angles. This is because the proportion of light that is specularly reflected depends on the angle of incidence following Fresnel’s equations. The effect is that the specular albedo appears amplified towards the occluding boundary, the so-called “Fresnel gain”.

In applications where the whole specular albedo map is to be used, it is important to correct this Fresnel effect to achieve multi-view photometric consistency. Previous work has either simply cropped the face to exclude regions of unreliable specular reflectance [28] or attempted a data-driven correction process. Ghosh et al. [19] took this latter approach. They bin specular albedo values into a histogram as a function of viewing angle and fit a smooth function to the measured data. Specular albedo at each vertex is then scaled down to the average gain at zero viewing angle. The problem with this approach is that it assumes all specular parameters, including specular albedo, are constant over the face surface. Although the approach succeeds in removing extreme values close to the boundary, its accuracy is questionable and the resulting specular albedo is unlikely to be an accurate measurement of an intrinsic property.

We take an alternative pragmatic approach which in practice yields seamless specular albedo maps without the lack of physical motivation or the fragility of applying a correction function. Since our patch selection is based on least viewing angle, the patches selected by our stitching method typically have a small average viewing angle, well away from grazing angles. At such viewing angles, Fresnel gain is negligible, and the selection therefore prevents choosing patches containing regions viewed at grazing angles. This is an advantage of using multiple photometric views: we ensure that all regions of the face are observed at small viewing angles in at least one image. A stitched specular albedo map is shown in Figure 6(b). Artefacts due to Fresnel gain at the boundary are successfully removed.
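To see the magnitude of the effect, a small numerical illustration using Schlick's approximation to the Fresnel equations (our choice for the sketch; the paper does not prescribe it) shows how reflectance grows towards grazing angles:

```python
import numpy as np

def fresnel_reflectance(view_angle_deg, f0=0.028):
    """Schlick approximation of Fresnel reflectance; f0 ~ 0.028 corresponds
    to a skin-like refractive index of about 1.4 (assumed value)."""
    cos_t = np.cos(np.radians(view_angle_deg))
    return f0 + (1.0 - f0) * (1.0 - cos_t) ** 5

for a in [0, 30, 60, 80, 89]:
    gain = fresnel_reflectance(a) / fresnel_reflectance(0)
    print(f"{a:2d} deg -> relative specular gain {gain:6.1f}x")
```

The sharp rise beyond roughly 60 degrees is exactly the regime our least-angle patch selection avoids.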

6 Experimental Results

We now present results of applying our face capture pipeline to a set of faces of varying age, ethnicity and gender. Our results are obtained using a custom built light stage comprising 41 ultra bright white LEDs mounted on a geodesic dome of diameter 1.8m. The photometric camera is a Nikon D200 in front of which we mount an LC-Tec FPM-L-AR optoelectric polarising filter. Each LED has a rotatable linear polarising filter in front of it. Their orientation is tuned by placing a sphere of low diffuse albedo and high specular albedo (a black snooker ball) in the centre of the dome and adjusting the filter orientation until the specular reflection is completely cancelled in the camera’s view. LED brightness is controlled via PWM from an MBED micro controller which also controls camera shutters and the polarisation state of the photometric camera. The multiview cameras are Canon 7Ds. A complete capture sequence takes around 3 seconds and this is repeated for three poses.

Fig. 8: Specular normals and corresponding detail renderings of different facial regions.

6.1 Intrinsic texture stitching

In Figure 9 we show the results of our patch-based texture stitching process. In this case, we show the results of stitching diffuse albedo maps from the three photometric views. In all results we use a segmentation consisting of 100 patches. On the left, we show a result where non-overlapping patches are copied directly from the best view without blending. In the middle, we show a result where patches overlap but no blending is performed (textures are averaged in the overlapping regions). There are clear artefacts associated with boundaries between patches and (in the middle column) loss of detail due to averaging. On the right, our stitched result contains no patch boundary artefacts yet retains sharp detail over the whole face surface.

Fig. 9: Patch-based texture stitching with different configurations: no blending and no overlap (left); overlap without blending (middle); overlap with blending (right).
Fig. 10: Normal stitching/mesh refinement: base mesh (left), refinement using diffuse normals (middle column) and specular normals (right column).

6.2 Normal stitching and detail transfer

To refine the base mesh, either the diffuse or the specular surface normals can be used. Figure 10 shows the results of using each. Note that the base mesh provided by multiview stereo is coarse and noisy. Once the mesh has been refined to match the stitched normals, it is evident that the diffuse normal maps tend to produce a smoother mesh while the specular normal maps yield more surface detail. This is consistent with the findings of Ma et al. [29] and is explained by the fact that the surface reflectance from which the normals are estimated has different characteristics depending on whether it is diffuse or specular. In the diffuse case, the reflected light is considerably affected by subsurface scattering. We note that our result with the specular normals achieves a very high level of detail over the whole face surface.

6.3 Rendering

Finally, we present rendering results to demonstrate the quality of our captured face models. We render using the Cook-Torrance model [30] and the hybrid normals technique proposed by Ma et al. [29], in which the estimated specular and diffuse surface normals are used to shade respectively the specular and diffuse components of the BRDF. We use a mixture of two Beckmann functions (two roughness slopes) to model the micro-facet distribution. In this work we assume constant roughness parameters and refraction index across the face. In Figure 8 we show renderings of face details which highlight the successful capture of face microgeometry along with the reflectance properties necessary for a photorealistic effect. In Figure 11 we show the captured geometry and renderings using the captured reflectance properties for a range of faces. In spite of using a simple reflectance model and making strong assumptions about specular parameters being fixed over the face surface, we are still able to achieve highly realistic appearance. Our approach is also able to cope with facial hair.
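A per-pixel sketch of this shading model is given below, under stated simplifications of our own: a single Beckmann lobe rather than the two used in the paper, a constant roughness m, and Schlick's Fresnel approximation with a skin-like F0. All names are illustrative.

```python
import numpy as np

def shade_pixel(n_d, n_s, rho_d, rho_s, l, v, m=0.3, f0=0.028):
    """Cook-Torrance shading with hybrid normals: the diffuse normal n_d
    shades the Lambertian term, the specular normal n_s the specular lobe.
    n_d, n_s, l, v: unit 3-vectors; rho_d, rho_s: diffuse/specular albedo."""
    h = (l + v) / np.linalg.norm(l + v)
    nl, nv, nh, vh = n_s.dot(l), n_s.dot(v), n_s.dot(h), v.dot(h)
    diffuse = rho_d / np.pi * max(n_d.dot(l), 0.0)
    if nl <= 0.0 or nv <= 0.0:
        return diffuse
    # Beckmann micro-facet distribution, Cook-Torrance geometry, Schlick Fresnel.
    cos2 = nh * nh
    D = np.exp((cos2 - 1.0) / (cos2 * m * m)) / (np.pi * m * m * cos2 * cos2)
    G = min(1.0, 2.0 * nh * nv / vh, 2.0 * nh * nl / vh)
    F = f0 + (1.0 - f0) * (1.0 - vh) ** 5
    specular = rho_s * D * G * F / (4.0 * nl * nv)
    return diffuse + specular * nl
```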

7 Conclusions

In this work we present a practical 3D face acquisition approach that allows the capture of an ear-to-ear mesh along with the skin micro-geometry and reflectance properties. Our system requires no prior geometric calibration. The camera parameters are obtained by structure-from-motion and are used to estimate a base mesh which is further refined using the recovered photometric surface normals. To achieve ear-to-ear coverage of the face and overcome the problem of a fixed photometric viewpoint inherent in polarised spherical gradient illumination, we capture the face in three poses and robustly stitch the corresponding normal maps and intrinsic textures into a seamless, complete and detailed face model.

While providing a practical way to bypass the view-dependency issue inherent to polarised spherical gradient illumination, our multi-pose approach requires capturing the subject in three different poses, which slightly lengthens the capture process and could be a handicap for tasks such as expression or performance capture.

We aim to tackle this issue in future work by augmenting our setup with two additional photometric views matching the two profile poses. At each vertex of the dome, the number of polarised lights can be increased locally so that the incident illumination from each vertex can be produced independently by more than one source. This would differ from the longitude/latitude approach proposed by Ghosh et al. [19] in that, instead of using a locally orthogonal polarisation pattern, we aim simply to assign to each view an independent group of polarised lights. This would allow exact separation of diffuse and specular reflections to be retained. Such an approach would also enable multiview photometric constraints to be exploited in the shape reconstruction process.

Acknowledgments

We are grateful to Hadi Dahlan for assistance with data collection and to Fufu Fang for assistance with lightstage calibration and programming.

Fig. 11: Geometry mesh and renderings of different subjects

References

  • [1] W.-C. Ma, T. Hawkins, P. Peers, C.-F. Chabert, M. Weiss, and P. Debevec, “Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination,” in Proc. Eurographics Symposium on Rendering, 2007.
  • [2] V. Blanz and T. Vetter, “A morphable model for the synthesis of 3D faces,” in Proc. SIGGRAPH, 1999, pp. 187–194.
  • [3] R. Basri and D. W. Jacobs, “Lambertian reflectance and linear subspaces,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 2, pp. 218–233, 2003.
  • [4] V. Blanz and T. Vetter, “Face recognition based on fitting a 3D morphable model,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 9, pp. 1063–1074, 2003.
  • [5] F. Jiang, L. Dricot, V. Blanz, R. Goebel, and B. Rossion, “Neural correlates of shape and surface reflectance information in individual faces,” Neuroscience, vol. 163, no. 4, pp. 1078–1091, 2009.
  • [6] C. A. Wilson, A. Ghosh, P. Peers, J.-Y. Chiang, J. Busch, and P. Debevec, “Temporal upsampling of performance geometry using photometric alignment,” ACM Trans. Graphic. (Proceedings of SIGGRAPH), vol. 29, no. 2, 2010.
  • [7] J. D. Bustard and M. S. Nixon, “3D morphable model construction for robust ear and face recognition,” in Proc. CVPR, 2010, pp. 2582–2589.
  • [8] A. Ghosh, T. Chen, P. Peers, C. A. Wilson, and P. Debevec, “Estimating specular roughness and anisotropy from second order spherical gradient illumination,” Computer Graphics Forum (Proceedings of EGSR), vol. 28, no. 4, pp. 1161–1170, 2009.
  • [9] ——, “Circularly polarized spherical illumination reflectometry,” ACM Trans. Graphic. (Proc. of SIGGRAPH Asia), vol. 29, no. 6, 2010.
  • [10] S. Zafeiriou, M. Hansen, G. Atkinson, V. Argyriou, M. Petrou, M. Smith, and L. Smith, June 2011, pp. 132–139.
  • [11] D. Nehab, S. Rusinkiewicz, J. E. Davis, and R. Ramamoorthi, “Efficiently combining positions and normals for precise 3D geometry,” ACM Trans. Graphic. (Proceedings of SIGGRAPH), vol. 24, no. 3, pp. 536–543, 2005.
  • [12] R. J. Woodham, “Photometric method for determining surface orientation from multiple images,” Opt. Eng., vol. 19, no. 1, pp. 139–144, 1980.
  • [13] Y. Furukawa and J. Ponce, “Dense 3d motion capture from synchronized video streams,” in Proc. CVPR, 2008.
  • [14] C. Zhang, Q. Cai, P. A. Chou, Z. Zhang, and R. Martin-Brualla, “Viewport: A distributed, immersive teleconferencing system with infrared dot pattern,” MultiMedia, IEEE, vol. 20, no. 1, pp. 17–27, 2013.
  • [15] B. Moghaddam, J. Lee, H. Pfister, and R. Machiraju, “Model–based 3–D face capture with shape–from–silhouettes,” in Proc. IEEE Work. Analysis and Modeling of Faces and Gestures, 2003, pp. 20–27.
  • [16] T. Beeler, B. Bickel, P. Beardsley, B. Sumner, and M. Gross, “High-quality single-shot capture of facial geometry,” ACM Trans. Graphic. (Proceedings of SIGGRAPH), vol. 29, no. 3, 2010.
  • [17] C. Wu, Y. Liu, Q. Dai, and B. Wilburn, “Fusing multiview and photometric stereo for 3D reconstruction under uncalibrated illumination,” IEEE Trans. Vis. Comp. Gr., vol. 17, no. 8, pp. 1082–1095, 2011.
  • [18] J. Park, S. N. Sinha, Y. Matsushita, Y.-W. Tai, and I. S. Kweon, “Multiview photometric stereo using planar mesh parameterization,” in Proc. ICCV, December 2013. [Online]. Available: http://research.microsoft.com/apps/pubs/default.aspx?id=207997
  • [19] A. Ghosh, G. Fyffe, B. Tunwattanapong, J. Busch, X. Yu, and P. Debevec, “Multiview face capture using polarized spherical gradient illumination,” ACM Trans. Graphic., vol. 30, no. 6, p. 129, 2011.
  • [20] C. A. Wilson, A. Ghosh, P. Peers, J.-Y. Chiang, J. Busch, and P. Debevec, “Temporal upsampling of performance geometry using photometric alignment,” ACM Trans. Graph., vol. 29, no. 2, pp. 1–11, 2010.
  • [21] N. Snavely, S. M. Seitz, and R. Szeliski, “Photo Tourism: Exploring Photo Collections in 3D,” ACM Trans. Graph., vol. 25, no. 3, pp. 835–846, Jul. 2006. [Online]. Available: http://doi.acm.org/10.1145/1141911.1141964
  • [22] Y. Furukawa and J. Ponce, “Accurate, Dense, and Robust Multiview Stereopsis,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 32, no. 8, pp. 1362–1376, Aug 2010.
  • [23] S. Mallick, T. Zickler, D. Kriegman, and P. Belhumeur, “Beyond Lambert: reconstructing specular surfaces using color,” in Proc. CVPR, vol. 2, June 2005, pp. 619–626.
  • [24] G. Peyré and L. D. Cohen, “Geodesic remeshing using front propagation,” Int. J. Comput. Vis., vol. 69, no. 1, pp. 145–156, Aug. 2006.
  • [25] P. Pérez, M. Gangnet, and A. Blake, “Poisson image editing,” ACM Trans. Graphic. (Proceedings of SIGGRAPH), vol. 22, no. 3, pp. 313–318, 2003.
  • [26] M. Chuang, L. Luo, B. J. Brown, S. Rusinkiewicz, and M. Kazhdan, “Estimating the Laplace-Beltrami operator by restricting 3D functions,” Comput. Graph. Forum, vol. 28, no. 5, pp. 1475–1484, Jul. 2009.
  • [27] R. Hartley and A. Zisserman, Multiple view geometry in computer vision.   Cambridge university press, 2003.
  • [28] G. Stratou, A. Ghosh, P. Debevec, and L. Morency, “Effect of illumination on automatic expression recognition: a novel 3d relightable facial database,” in Automatic Face & Gesture Recognition and Workshops (FG 2011), 2011 IEEE International Conference on.   IEEE, 2011, pp. 611–618.
  • [29] W. C. Ma, T. Hawkins, P. Peers, C. F. Chabert, M. Weiss, and P. Debevec, “Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination,” Eurographics Symposium on Rendering, pp. 183–194, 2007.
  • [30] R. L. Cook and K. E. Torrance, “A reflectance model for computer graphics,” ACM Trans. Graph., vol. 1, no. 1, pp. 7–24, Jan. 1982.