Photometric stereo for strong specular highlights

by Maryam Khanian et al.

Photometric stereo (PS) is a fundamental technique in computer vision known to produce 3-D shape with high accuracy. The setting of PS is defined by using several input images of a static scene taken from one and the same camera position but under varying illumination. The vast majority of studies in this 3-D reconstruction method assume orthographic projection for the camera model. In addition, they mainly consider the Lambertian reflectance model as the way that light scatters at surfaces. So, providing reliable PS results from real world objects still remains a challenging task. We address 3-D reconstruction by PS using a more realistic set of assumptions combining for the first time the complete Blinn-Phong reflectance model and perspective projection. To this end, we will compare two different methods of incorporating the perspective projection into our model. Experiments are performed on both synthetic and real world images. Note that our real-world experiments do not benefit from laboratory conditions. The results show the high potential of our method even for complex real world applications such as medical endoscopy images which may include high amounts of specular highlights.




1 Introduction

The reconstruction of three-dimensional (3-D) information from two-dimensional images is a classic problem in computer vision. Many approaches exist to tackle the task, as documented by a rich literature and a number of excellent monographs, among them [14, 43, 45]. Let us also mention the survey [16] on 3-D reconstruction methods, which may be more oriented towards the computer graphics community. Following [45], one may distinguish approaches based on the point spread function, as in depth from focus or defocus [49], triangulation-based methods such as stereo vision [8] or structure from motion [41], and intensity-based or photometric methods such as shape from shading or photometric stereo [14]. An abundance of specific approaches exists that may be roughly classified along these lines. Generally speaking, they may be distinguished by the type of image data, the number of acquired input images, or whether the camera or objects in the scene may move. As an example, let us mention techniques based on specular flow [2, 10, 32], which rely on relative motion between a specular object and its environment.

Focusing on photometric approaches, as mentioned by Woodham [46] and Ihrke et al. [16], these typically employ a static viewpoint and variations in illumination to obtain the 3-D structure. While shape from shading is the corresponding photometric technique that classically makes use of just one input image, cf. [14], photometric stereo (PS) allows the depth map of a static scene to be reconstructed from several input images taken from a fixed viewpoint under different illumination conditions. The pioneer of the PS problem was Woodham in 1978 [46], see also Horn et al. [15]. Woodham derived the underlying image irradiance equation as a relation between the image intensity and the reflectance map. It has been shown that the orientation of a Lambertian surface can be uniquely determined from the resulting appearance variations, provided that the surface is illuminated by at least three known, non-coplanar light sources, one for each input image [47].

As is also recognized in [16], most later approaches have followed Woodham's idea and kept two simplifying assumptions. The first, of particular importance, supposes that the surface reflects light according to Lambert's law [18]. This simple reflectance model can still be a reasonable assumption for certain types of materials, namely when the scene is composed of matte surfaces, but it fails for shiny objects that concentrate the reflected light. Such surfaces are common in real-world situations. It is well established that a rough surface illuminated by a light source reflects a significant part of the light as described by a non-Lambertian reflectance model [35, 6, 4]. In such models the intensity of reflected light depends not only on the light direction but also on the viewing angle, and the light is reflected in a mirror-like way accompanied by a specular lobe. The second assumption in classic PS models is that scene points are projected orthographically during the photographic process. This is reasonable if objects are far away from the camera, but not if they are close, in which case perspective effects become important. The importance of perspective projection in such a situation has been demonstrated in the computer vision literature; in the context of photometric methods, let us refer for instance to [39], where a corresponding example is discussed in detail.

Many studies in PS considered non-Lambertian effects as outliers and tried to remove them. Mukaigawa et al. [26] suggested a random sample consensus based approach in which only diffuse reflection is selected from among the candidates. Mallick et al. [20] introduced a rotation that transforms the RGB color space into an SUV color space with a specular channel S and diffuse channels UV; the specular channel S is then used for removing specularities. Chanki et al. [50] introduced a strategy based on a maximum feasible subsystem approach. In their method, the maximum subset of images satisfying the Lambertian constraint is extracted from the whole set of PS images, which includes non-Lambertian effects such as specularities. A median filtering technique is used by Miyazaki et al. [25] to evade the influence of specular reflections, which they consider as outliers. Another method relying on this concept is presented by Tang et al. [36], who proposed a coupled Markov random field that treats specularities and shadows as noise. Wu et al. [48] addressed the 3-D recovery problem with a convex optimization technique that separates specularities as deviations from the basic Lambertian assumption in the objective function. Smith and Fang [33] used a model-based approach that excludes observations that do not fit the Lambertian image formation model. Hertzmann and Seitz [13] employed reference objects that are considered to be of homogeneous material for simplicity, meaning that purely specular or purely diffuse materials are addressed. In some other works, more complex appearance models are fitted to the estimated data, relying for example on a convex combination of a small number of known materials, as in the work of Goldman et al. [11], or on a probabilistic formulation that links geometry and lighting estimation by introducing priors, as in the paper of Oxholm and Nishino [27].

Figure 1: Highly specular photometric stereo setup illustrated by a complex synthetic experiment. In the real world, surfaces show both specular and diffuse reflections. This surface is illuminated by three non-coplanar light sources, each emitting both specular and diffuse light. The shading due to each light is captured by a perspective CCD camera. As can be seen, considering all stated assumptions, we are able to recover the shape with a high degree of surface detail.

Regarding the perspective projection, one of the first works combining this technique with PS was carried out by Galo and Tozzi [9]. Their work considers point light sources close to the illuminated object surface. A perspective PS model based on Lambertian reflection was also proposed by Tankus and Kiryati [37]. A technically different perspective method for Lambertian PS using hyperbolic partial differential equations (PDEs) was presented by Mecca et al. [24]. Turning to the use of non-Lambertian surface reflectance to account for specular highlights in photometric methods, we note that a shape-from-shading method using the Phong model has been shown to give very reasonable results when employed within a suitable process chain [44]. It therefore seems plausible that an extension to PS making use of a similar image irradiance equation may yield even better results, given that PS has more input images at hand than shape from shading.

Concerning perspective PS techniques that may also deal with non-Lambertian effects, the recent works of Mecca et al. [23, 22] should be mentioned. In these approaches, an individual model for PS is suggested by considering separated purely specular and purely Lambertian reflections using five and ten input images, respectively. The separate processing of the reflectance models requires input images with the minimum value of saturation [42], which may lead to cumbersome limitations in some real-world applications, for example in the case of spatially varying materials. When solving the resulting hyperbolic PDEs, Mecca et al. [23] rely on the fast marching method. In order to apply this technique, the depth value of a certain surface point must be given in advance. However, this information is not always available, especially in real-world applications. Let us note that an approach similar to that of Mecca and his co-authors is also applied in the orthographic PS method of [42], which is based on dividing the surface into purely specular and purely diffuse parts. This is a difficult task, as is also mentioned in [42].

Our contributions.

The novel method we propose combines the conceptual advantages of considering perspective projection and non-Lambertian reflectance simultaneously, based on the complete Blinn-Phong model known from computer graphics [5, 30]. By taking the complete reflection model into account, our method does not rely on a separation of specular and diffuse reflection at any stage of the computation. In particular, no division of the surface or scene and no prior knowledge of the depth of the scene is required. As a side effect, our method is inherently able to handle objects with spatially varying materials without modification. In addition, it is worth noting that we use three input images in all our experiments, which is the minimum number of inputs required by the classic orthographic PS framework with the Lambertian reflectance model. In recent work it has been discussed that the use of three input images can be advantageous [42].

The mentioned model assumptions lead to a concrete PS algorithm as sketched in Fig. 1. Combining the mentioned benefits, we propose a method that is more robust and easier to use than those in the previous literature. As a side note, since the complete Blinn-Phong model we employ is extensively studied in computer graphics, the surface reflectance in the input images as well as the expected computational results are potentially easier to interpret than in methods that rely on complex preprocessing steps. Conceptually, our work extends the approach presented by Khanian et al. [17]. A main point of the latter conference article is to study the effect of lighting directions on numerical stability, while the presentation there is restricted to one spatial dimension.

In addition to presenting an appealing alternative PS approach, we investigate two different methods of realizing the perspective projection. The first method is to compute the normal field and then modify the gradient field based on the perspective projection, as also proposed in [31, 28]. As it manipulates the normal vectors, we refer to this technique as the perspective projection based on normal field (PPN) method. The second method is to consider a perspective parameterization of the photographed object surface in order to obtain the gradient field of the surface. We call this approach the perspective projection based on surface parameterization (PPS) method. Furthermore, we investigate the effect of modeling a camera with a charge-coupled device sensor (CCD camera) on the reconstruction process and the quality of the results.

2 Perspective Projection

In this section we introduce two different techniques applied for obtaining the perspective projection. As we will also consider for experimental comparison a Lambertian perspective PS model, we also recall its construction here. A Lambertian scene with albedo is illuminated from directions , where by corresponding point light sources at infinity, with diffuse intensity so that it satisfies the following reflectance equation [14]:


where is the diffuse material parameter, and are the intensity and surface normal at pixel , respectively.
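To make the Lambertian model concrete, the following minimal sketch assumes the standard form in which the intensity in image k equals the albedo times the light intensity times the dot product of the unit light direction and the unit surface normal. The function names, the single shared light intensity and the absence of shadows are assumptions made for illustration, not the paper's notation.

```python
import math


def solve3(A, b):
    """Solve the 3x3 linear system A x = b by Cramer's rule."""
    def det(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
                - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
                + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

    d = det(A)
    if abs(d) < 1e-12:
        raise ValueError("light directions are (nearly) coplanar")
    x = []
    for j in range(3):
        # Replace column j of A by the right-hand side b.
        Aj = [[b[i] if c == j else A[i][c] for c in range(3)] for i in range(3)]
        x.append(det(Aj) / d)
    return x


def lambertian_ps_pixel(lights, intensities, light_intensity=1.0):
    """Recover (albedo, unit normal) at one pixel from three intensities.

    Assumed model: I_k = albedo * light_intensity * dot(L_k, N).
    """
    b = [i / light_intensity for i in intensities]
    m = solve3(lights, b)                      # m = albedo * N
    albedo = math.sqrt(sum(v * v for v in m))
    normal = [v / albedo for v in m]
    return albedo, normal


# Usage with three non-coplanar unit light directions (illustrative values).
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
albedo, normal = lambertian_ps_pixel(lights, [0.9, 0.8, 0.7])
```

The albedo falls out as the length of the solved vector because the normal has unit length; this is the usual reason why three images suffice in the Lambertian case.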

2.1 Modifying Normal Vectors

The first perspective projection method deals with processing the field of normal vectors . Once the normal map is reconstructed from the orthographic image irradiance equations, the depth map is recovered by giving the following components in Eq. (2) and Eq. (3) to the integrator:


where for a camera with the focal length is:


In what follows, we denote the perspective projection realised via projection of the normal vector with PPN.
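The conversion from a normal field to a perspective gradient field can be sketched as follows. This is one common formulation, assuming the pinhole parameterization S(x, y) = z(x, y) (x/f, y/f, 1) and the logarithm of the depth as the integrated unknown; the sign and axis conventions, as well as the function names, are assumptions and may differ from those of Eq. (2) and Eq. (3).

```python
def ppn_gradient(normal, x, y, f):
    """Log-depth gradient (p, q) from a normal at image point (x, y).

    Assumes p = d(log z)/dx and q = d(log z)/dy for a pinhole camera with
    focal length f. The result does not depend on the length or sign of
    the normal, since numerator and denominator scale together.
    """
    n1, n2, n3 = normal
    d = x * n1 + y * n2 + f * n3
    if abs(d) < 1e-12:
        raise ValueError("normal is orthogonal to the viewing ray")
    return -n1 / d, -n2 / d


def perspective_normal(p, q, x, y, f):
    """Unnormalised normal of S = z * (x/f, y/f, 1) for log-depth gradient (p, q)."""
    return (-f * p, -f * q, 1.0 + x * p + y * q)


# Round trip: a normal built from (p, q) yields the same (p, q) again.
n = perspective_normal(0.3, -0.2, 0.5, -0.4, 2.0)
p, q = ppn_gradient(n, 0.5, -0.4, 2.0)
```

Under these assumptions the depth itself would be obtained by exponentiating the integrated log-depth.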

Figure 2: Perspective projection of the real point to the image plane .

2.2 Direct Perspective Surface Parameterization

Another approach to apply the perspective projection is via corresponding surface parameterization, shown in Fig. 2. In order to project the real-world point to the point on the image plane , we will consider the Thales theorem in both horizontal red and vertical blue triangles. So, we will have:


On the other hand, in reality the image plane lies behind the lens. Therefore, the surface is parameterized using the following formulation, where is the focal length.
For all points in as the image plane:


From this surface parameterization, we can extract the partial derivatives of the surface:


Finally, we get the surface normal vector as the cross product of the partial derivatives of the surface:


So, in this case, the obtained surface normal (8) will be used in image irradiance equation. We recall here the Lambertian perspective image irradiance equation [37], as this will be extended in our model.

In order to remove the dependency of the image irradiance equation on the unknown depth , it will be substituted by , , so that we have to apply to obtain the depth out of our new unknown . This yields:


A closed form solution for the gradient field is obtained in [37]. For completeness of the presentation, we now recall the main points in its construction. Let us consider three input images (the minimum needed inputs in classic PS). By finding from the first image irradiance equation in (9), and replacing it in the second and third image irradiance equation, a linear system of equations should be solved for obtaining the unknown vector :


where, we have with :


The explicit solutions are:


Now, we can obtain the albedo of the surface by plugging the resultant gradient vector for instance into the following equation:


2.3 Sensitivity of the Solution

Let us try to assess the sensitivity of the solution with respect to the lighting directions, which may lead to conditions on the illumination. To this end, the non-singularity condition of the matrix of coefficients introduced in the previous paragraph should be explored. After computing the determinant of this matrix and requiring that it does not vanish, non-singularity can be assured in virtually all cases by ensuring that the contributing terms are not zero. This idea leads to the indicator:


The first three expressions imply the linear independence of the light directions, which can also be obtained from the non-singularity condition of the light directions matrix. The other resulting expressions are different, and satisfying all of them may not be an easy task. Consequently, the sensitivity of the solution to the lighting can be higher than in the PPN approach.
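The part of the condition that concerns the linear independence of the light directions can be checked directly. The sketch below only tests the determinant of the 3x3 light directions matrix; the additional expressions of the PPS indicator are not covered, and the function names and tolerance are assumptions.

```python
def light_determinant(lights):
    """Determinant of the 3x3 matrix whose rows are the light directions."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = lights
    return (a1 * (b2 * c3 - b3 * c2)
            - b1 * (a2 * c3 - a3 * c2)
            + c1 * (a2 * b3 - a3 * b2))


def lights_are_independent(lights, tol=1e-6):
    """True if the three light directions are not (nearly) coplanar."""
    return abs(light_determinant(lights)) > tol


# The third direction below is the sum of the first two, hence coplanar.
coplanar = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
ok = lights_are_independent(coplanar)
```

A determinant close to zero signals an ill-conditioned lighting setup even when the directions are not exactly coplanar, so in practice the tolerance would be chosen relative to the noise level.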

Figure 3: Left and middle: Specular and diffuse reflections. Both reflections can exist simultaneously in different parts of a real-world object. Right: Vectors and angles applied in the Blinn-Phong model.

3 Blinn-Phong Reflectance Model

Let us introduce the Blinn-Phong reflectance model for addressing specular reflections of non-Lambertian materials. A useful approximation of real-world surface reflectance considers, additionally to the Lambertian reflectance, a specular term as introduced in the Blinn-Phong model [5, 30]. We stress the word "additionally" since in reality most objects show both of these reflections in different areas, cf. Fig. 3. Therefore, they include both reflection models at the same time. In the Blinn-Phong model, the angle of incidence and the angle between the surface normal and the halfway vector of the light and viewing directions are used, as shown in Fig. 3. Now we consider the Blinn-Phong model under perspective projection. To this end, we again apply the two perspective approaches described above. The basic and complete Blinn-Phong image irradiance equation is defined as:


Here is the specular material parameter. is the specular light source intensity and the exponent is also called the specular sharpness or shininess.
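As an illustration, the Blinn-Phong intensity at a single surface point can be evaluated as in the following minimal sketch. It assumes the common graphics form, a diffuse term proportional to the cosine of the angle of incidence plus a specular term given by the cosine between the normal and the halfway vector raised to the shininess. The parameter names and the clamping of negative cosines to zero are assumptions.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def blinn_phong_intensity(normal, light, view, k_d, k_s, l_d, l_s, alpha):
    """Diffuse plus specular intensity at one surface point.

    k_d, k_s: diffuse and specular material parameters (assumed names).
    l_d, l_s: diffuse and specular light intensities (assumed names).
    alpha:    shininess exponent.
    """
    n, l, v = normalize(normal), normalize(light), normalize(view)
    h = normalize([a + b for a, b in zip(l, v)])      # halfway vector
    diffuse = k_d * l_d * max(dot(n, l), 0.0)
    specular = k_s * l_s * max(dot(n, h), 0.0) ** alpha
    return diffuse + specular


# Light and viewer aligned with the normal: both terms reach their maximum.
peak = blinn_phong_intensity([0, 0, 1], [0, 0, 1], [0, 0, 1],
                             k_d=0.6, k_s=0.4, l_d=1.2, l_s=1.2, alpha=50)
```

Setting the specular material parameter to zero reduces the sketch to the Lambertian model, which is why no separation into diffuse and specular images is needed when the complete model is used.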

A corresponding orthographic model has been investigated in the shape-from-shading context in [7]. To develop the perspective Blinn-Phong PS model, we focus on the surface parameterization and plug in the perspective normal (8) in (19).

Considering input images for corresponding lighting directions, this yields after some computation the perspective Blinn-Phong reflectance equations as:




3.1 Numerical approach

Now, we present the numerical procedure which can be applied for addressing such a highly nonlinear system of equations. Recalling the description of a system of equations as , where is a given function by the equations from (20), we will discuss our solution procedure.

In order to cope with such a nonlinear system of equations, we applied the Levenberg-Marquardt method introduced in [19, 21] as a combination of the Gauß-Newton method and steepest descent direction technique. In this method, if is the point at iteration , the next iteration can be computed as:


with .

The matrix is positive definite and is well-defined. In addition, this method does not need the conditions such as the invertibility of Jacobian matrix or Hessian matrix or .
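A generic Levenberg-Marquardt iteration of this kind can be sketched as follows. The sketch solves a small toy system rather than the PS equations, and the damping update (halve on success, multiply by ten on failure) is a simple heuristic assumed here; it is not necessarily the damping rule used in the paper.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def levenberg_marquardt(F, J, x0, mu=1e-2, tol=1e-12, max_iter=200):
    """Minimise ||F(x)||^2 with steps s solving (J^T J + mu I) s = -J^T F."""
    x = list(x0)
    cost = sum(v * v for v in F(x))
    for _ in range(max_iter):
        if cost < tol:
            break
        r, Jx = F(x), J(x)
        m, n = len(r), len(x)
        # Damped normal equations; the matrix is positive definite for mu > 0.
        A = [[sum(Jx[k][i] * Jx[k][j] for k in range(m)) + (mu if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        g = [sum(Jx[k][i] * r[k] for k in range(m)) for i in range(n)]
        step = solve_linear(A, [-gi for gi in g])
        x_new = [xi + si for xi, si in zip(x, step)]
        new_cost = sum(v * v for v in F(x_new))
        if new_cost < cost:          # accept the step, trust the model more
            x, cost, mu = x_new, new_cost, mu * 0.5
        else:                        # reject the step, increase the damping
            mu *= 10.0
    return x


# Toy system: x0^2 + x1^2 = 4 and x0 = x1, with solution (sqrt(2), sqrt(2)).
F = lambda x: [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]
J = lambda x: [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]
root = levenberg_marquardt(F, J, [1.0, 2.0])
```

Because the damping term keeps the system matrix positive definite, the step is well defined even where the Jacobian is rank deficient, which matches the property stated above.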

Our numerical approach for the PPS method is based on the following problem formulation. Recalling the perspective Blinn-Phong reflectance equations (20), and dividing three equations (, , , corresponding to the three used images in our method) leads to a non-linear system of equations, with the equations like the following equation (27) as obtained for dividing the and images:




It should be noted that even in this case of existing specularities and in the process of solving the perspective PS system for the Blinn-Phong model (20), we will still follow Woodham and make use of only three input images.

Furthermore, as for the case of Lambertian PS, we will also deal with the Blinn-Phong model using the perspective version based on transforming the normal vectors (PPN method), i.e. after orthographic Blinn-Phong PS. Finally, the obtained gradient fields are processed by the Poisson integrator, see e.g. [3] for a recent account of surface normal integration.

Figure 4: Comparison of the surface reconstruction techniques. Left: Input image. Middle: Our 3-D reconstruction using orthographic projection. Right: Our 3-D reconstruction by perspective projection. It can be observed that the perspective approach is able to generate a more compatible result with respect to the original image.
Figure 5: Comparing two described perspective methods regarding their depth reconstructions.

4 CCD Cameras

We will also investigate the modeling of the CCD camera. In the case of CCD cameras, the following projection mapping is used as presented in [12]. The matrix


contains the intrinsic parameters of the camera, namely the focal lengths in the two coordinate directions, expressed with the help of the sensor sizes, and the principal point (or focal point). The remaining parameter is called the skew parameter. Here, we neglect it since it is zero for most ordinary cameras [12]. Using this matrix, we introduce the following transformation to convert the dimensionless pixel coordinates to image coordinates:


By applying the above-mentioned transformation, the following representation for the projected point will be obtained:


The effect of this modeling can be potentially interesting, since this information is not always accessible. The above transformation is called centerizing in the experiments.
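A minimal sketch of such a centering step is given below. It assumes a standard zero-skew intrinsic matrix with focal lengths f/h_x and f/h_y in pixel units and principal point (u0, v0); these names and the example values are assumptions for illustration.

```python
def pixel_to_image(u, v, f, h_x, h_y, u0, v0):
    """Map pixel indices (u, v) to metric image-plane coordinates (x, y).

    Assumes zero skew and the intrinsic matrix
    K = [[f/h_x, 0, u0], [0, f/h_y, v0], [0, 0, 1]].
    """
    fx, fy = f / h_x, f / h_y        # focal lengths in pixel units
    x_norm = (u - u0) / fx           # first component of K^{-1} (u, v, 1)^T
    y_norm = (v - v0) / fy           # second component of K^{-1} (u, v, 1)^T
    return f * x_norm, f * y_norm    # point on the image plane at distance f


# The principal point maps to the origin of the image coordinate system.
x, y = pixel_to_image(320, 240, f=8.0, h_x=0.01, h_y=0.01, u0=320, v0=240)
```

With zero skew the mapping simply shifts the origin to the principal point and rescales by the sensor element size, which is the sense in which the coordinates are centered.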

Figure 6: Set of three test images used for our 3-D reconstruction: (a) Real scene used for reprojecting; (b) and (c) are rendered images used for our 3-D reconstruction in presence of specularity.
Figure 7: An account of reprojected Beethoven images. Left: Second input image for PS. Middle: Reprojected second image obtained from PPN method. Right: Reprojected second image using the PPS technique.
| Perspective method | MSE for 1st input | MSE for 2nd input | MSE for 3rd input |
|---|---|---|---|
| PPN method | 0.004239 | 0.003297 | 0.007535 |
| PPS method | 0.008042 | 0.021409 | 0.007644 |

Table 1: Comparison between the MSE of the reprojected Beethoven images obtained with the two described perspective methods, PPN and PPS.

5 Experiments

This section describes the experiments performed with the proposed model and approaches. In a first test we confirm the finding of Tankus et al. [39] that the use of an orthographic camera model may yield apparent distortions in the reconstruction, while a perspective model takes the geometry better into account, see the experiment documented in Fig. 4. This justifies the use of the perspective camera model. Note that in the figure the object of interest is relatively close to the camera.

In a series of tests we now turn to quantitative evaluations of the proposed computational models. To this end we consider the set of test images for use in the next experiments as shown in Fig. 6. The Beethoven test images (which depict a real-world scene) and the Sphere images are of the size . The Stanford Bunny test images have a resolution of . Both Bunny and Sphere are rendered using Blender. The 3-D model of Stanford Bunny is obtained from the Stanford 3-D scanning repository [1]. The 3-D model of the face presented in Fig. 9 is taken from [34] with the size of . For comparing our results, the ground truth depth maps are extracted, and we will make use of the Mean Squared Error (MSE) showing the accuracy.
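The error measure can be written down in a few lines. The sketch below computes the MSE between two depth maps stored as nested lists; whether and how the depth maps are normalized or aligned before the comparison is not specified here and is left to the caller.

```python
def mean_squared_error(estimate, reference):
    """Mean of squared differences between two equally sized 2-D arrays."""
    if len(estimate) != len(reference) or any(
            len(a) != len(b) for a, b in zip(estimate, reference)):
        raise ValueError("depth maps must have the same shape")
    total, count = 0.0, 0
    for row_e, row_r in zip(estimate, reference):
        for e, r in zip(row_e, row_r):
            total += (e - r) ** 2
            count += 1
    return total / count


# One of four pixels differs by 2, so the MSE is 4 / 4 = 1.0.
mse = mean_squared_error([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 6.0]])
```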

After considering the mentioned test settings, we demonstrate the applicability of our method on real-world medical test images from gastro endoscopy and discuss its superior reconstruction capabilities compared to previous models.

5.1 Tests of accuracy

In the first evaluation, we compare the results of the two perspective techniques, PPN and PPS, applied to the specular Sphere in Fig. 6 (c) with different values of the focal length. The MSE results of these 3-D reconstructions are shown in Fig. 5. While the results of the two perspective methods are close to each other for some low values of the focal length, the PPN strategy outperforms PPS as the focal length increases.

In the second experiment concerned with the Beethoven image set, we investigate the difference between two mentioned perspective approaches on a more complex real-world object scene. To this end, we give in Table 1 the MSE comparing gray value data of the reprojected and input images. Since in this case the ground truth depth map is not available, we reconstruct the reprojected images by obtaining the gradient fields from the mentioned perspective approaches and replacing them in the Lambertian reflectance equation. It can be deduced from Table 1 that reprojecting from PPS method reaches a close accuracy regarding the third input image, while the PPN approach achieves higher accuracy in terms of the first and especially second input image.

As the reprojected images in Fig. 7 show, the difference between these methods as given in Table 1 can be quite significant. Furthermore, it indicates a higher sensitivity of the PPS method to the lighting than the PPN approach.

5.2 Perspective methods and CCD camera model

Table 2 and Fig. 8 present the results of our 3-D reconstructions for highly specular input images as shown in Fig. 6 (b) and Fig. 6 (c), respectively. In order to produce such images, we set non-zero intensities for both diffuse and specular light. Furthermore, the objects include both diffuse and specular reflections.

Figure 8: First and second row: Left: Ground truth. Middle: Depth reconstruction from the complete Blinn-Phong model with the PPN approach. Right: Depth reconstruction from the complete Blinn-Phong model with the PPS approach. These results demonstrate the proficiency of the proposed method in reconstructing images that include strong specularities. In addition, the PPN approach achieves more faithful reconstructions. Last row: Depth reconstruction from the Lambertian model in the presence of specularity, combined with the different perspective projections. Left: PPN approach. Right: PPS method. As can be seen, the Lambertian model is not able to provide a faithful reconstruction of the specular surface.
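| Reconstruction by perspective Blinn-Phong PS | Diffuse material parameter | Specular material parameter | Diffuse light intensity | Specular light intensity | Shininess | MSE with centerizing | MSE without centerizing |
|---|---|---|---|---|---|---|---|
| PPN method for Bunny | 0.6 | 0.4 | 1.2 | 1.2 | 50 | 0.006355 | 0.042082 |
| PPS method for Bunny | 0.6 | 0.4 | 1.2 | 1.2 | 50 | 0.012318 | 0.011318 |
| PPN method for Sphere | 0.5 | 0.5 | 1.2 | 1.2 | 150 | 0.008264 | 0.022568 |
| PPS method for Sphere | 0.5 | 0.5 | 1.2 | 1.2 | 150 | 0.008431 | 0.007716 |
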
Table 2: MSE of the depth reconstructed from images with specularities by the two perspective methods, PPN and PPS. We consider 3-D reconstruction in the presence of simultaneous diffuse and specular reflection from the surface, which involves both the diffuse and the specular material parameters and requires the complete Blinn-Phong model. In addition, we applied both diffuse and specular light. Finally, we extended our model to different perspective projection techniques.
Figure 9: First row: Four purely specular input images as used in the purely specular model of [23], and the 3-D reconstruction of the approach of [23], which shows deviations especially around the highly specular areas. Second row: Three ordinary input images including both diffuse and specular components as the input of our method, and our 3-D reconstruction. Note that our method does not need the decomposition of the input images into purely diffuse and purely specular components, which is a very difficult task even for synthetic images.
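| Depth reconstruction approach | Diffuse material parameter | Specular material parameter | Diffuse light intensity | Specular light intensity | Shininess | MSE |
|---|---|---|---|---|---|---|
| Proposed method | 0.3 | 0.7 | 1.2 | 1.2 | 50 | 0.004019 |
| Mecca et al. [23] | 0 | 0.7 | 0 | 1.2 | 50 | 0.056586 |

Table 3: MSE of the depth reconstructed from the images with high specularities shown in Fig. 9.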
Figure 10: Test images with high specularity used in a realistic real-world scenario. These images were produced in an endoscopy experiment, so they neither benefit from laboratory facilities nor conform to controlled setup conditions.

Values of MSE for 3-D reconstruction show the high accuracy of our depth reconstructions by applying the complete Blinn-Phong model which is accompanied by two presented perspective schemes. On the other hand, while results of the recovered depth map for the sphere are close to some extent, the outcome of the computed depth map for Bunny based on the PPN method obtains higher accuracy. However, the table also illustrates the higher sensitivity of the PPN perspective scheme to centerizing transformation than the PPS perspective method.

Finally, we compare our approach with the Lambertian model, which is the most common model applied in PS, and with the method presented by Mecca et al. [23]. Fig. 8 shows the outcome of applying the Lambertian model. The deviation from a faithful reconstruction over the specular area of the surface can be seen clearly.

The comparison between our approach and [23] is shown in Fig. 9. As already indicated, our method applies the complete perspective Blinn-Phong model to three images including both diffuse and specular reflections and lights, while the method in [23] uses the specular term of the Blinn-Phong model to handle four purely specular images. The result of the proposed method presented in Fig. 9 (b), obtained for a high degree of specularity without any visible deviation or artifact, shows that the proposed method outperforms state-of-the-art approaches such as [23]. The MSE values of the 3-D reconstructions associated with the experiments in Fig. 9 are listed in Table 3.

5.3 Tests of applicability on real-world test images

This section describes experiments conducted with the proposed approach on realistic images. Let us first turn to some real-world medical test images. We call these images realistic because we did not benefit from a controlled setup or additional laboratory facilities. We used just the images that are available, as in any kind of medical (or many other real-world) experiments. Let us note that experiments with endoscopic images are well known to be a challenging test for photometric methods, and they are widely accepted as indicating possible medical applications of photometric approaches, see e.g. [38, 40]. As for our work, the usefulness of the computational results for the indicated, concrete medical application is confirmed via collaboration with specialized medical doctors. (Footnote: Let us mention as a reference the collaboration with Dr. Mohammad Karami H., who is a gastroenterologist and internal medicine specialist at Shahrekord University of Medical Science (Iran).)

We have performed trials on endoscopic images in which existence of high specularities is unavoidable. Input images are presented in Fig. 10 (a) and Fig. 10 (b) which are endoscopies of the upper gastrointestinal system. Their 3-D reconstructions are represented in Fig. 11 and Fig. 12.

As in all previous experiments, only three input images are used. All outputs are displayed from an identical viewpoint to enable their visual comparison. The first column in Fig. 11 shows the deviation in the Lambertian result. As is visible in the cropped region marked by the rectangle in Fig. 10 (a), the beginning and end points of all three folds (marked by A, B and C) should be at about the same level. Instead, a drastic deviation shows up at the left side of the surface in the results obtained with the Lambertian reflectance model, as is also indicated by the blue area in the corresponding depth map.

However, this deviation is rectified by applying the complete Blinn-Phong model with PPS, as can be seen in the second column of Fig. 11, and it is entirely corrected using this model with the PPN approach, shown in the third column. Furthermore, the three folds of the surface are reconstructed very well in the Blinn-Phong outcomes. This desirable complete reconstruction of the folds cannot be seen in the Lambertian output.

Finally, as the color variation in the second row of Fig. 11 also shows, high-frequency details are recovered as well in the Blinn-Phong outputs, especially with the PPN approach.

These reconstruction aspects are again clearly observable in the depth reconstruction of another endoscopy image in Fig. 12, which shows the depth resulting from the inputs in Fig. 10 (b). Once more, a deviation from the desirable output shape appears in the Lambertian outcome, especially in the left corner. This part of the surface, marked by (C) in the input and in the resulting 3-D images, is in reality a cavity toward the upper side, which is reconstructed well by the Blinn-Phong outputs in contrast to the Lambertian result. The latter apparently reconstructs this region on completely the opposite side of the original surface. Let us also pay attention to the second row in Fig. 12. A curved line of the upper corrugated region (A) is obtained in the right corner of the Blinn-Phong outputs, whereas this region is just a straight line in the right corner of the Lambertian outcome. The heights of the corrugated regions are more faithfully reconstructed in the Blinn-Phong results than in the Lambertian one.

Last but not least, it is worth mentioning that the viewing angle of endoscopy cameras is very narrow. Using cropped parts of those images makes this experiment a highly challenging 3-D reconstruction task. The success of our approach in reconstructing such a small range of depth values without any knowledge of the photographic conditions shows the capability of our proposed method in challenging real-world applications.

In another test with real-world input images, we compared our method with the approach of [37], making use of the input images depicted in Fig. 2 (a), Fig. 2 (b) and Fig. 2 (c) of [37]. The surface is a plastic mannequin head, and the plastic material itself shows specularities. It is well known in computer graphics that plastic is a material that can be readily rendered using the Blinn-Phong model [29].
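For readers less familiar with the model mentioned above, the following minimal sketch evaluates the textbook Blinn-Phong intensity (a diffuse term plus a specular term based on the halfway vector) at a single surface point. It is only an illustration and does not reproduce the complete perspective formulation used in this paper: the function name, the coefficient names `k_d`, `k_s` and `shininess`, and their default values are assumptions made for this example.

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length (v must be non-zero)."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def blinn_phong_intensity(normal, light_dir, view_dir,
                          k_d=0.7, k_s=0.3, shininess=50.0):
    """Textbook Blinn-Phong intensity at one surface point.

    normal, light_dir, view_dir: 3-vectors pointing away from the surface.
    k_d, k_s, shininess: hypothetical diffuse/specular coefficients and
    specular exponent (illustrative values, not taken from the paper).
    """
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)

    n_dot_l = float(np.dot(n, l))
    if n_dot_l <= 0.0:
        # Light is behind the surface: no diffuse or specular contribution.
        return 0.0

    # Halfway vector between light and viewing direction.
    h = normalize(l + v)
    diffuse = k_d * n_dot_l
    specular = k_s * max(float(np.dot(n, h)), 0.0) ** shininess
    return diffuse + specular

# Example: light and camera both along the surface normal.
intensity = blinn_phong_intensity([0, 0, 1], [0, 0, 1], [0, 0, 1])
```

Setting `k_s = 0` reduces the sketch to the Lambertian model, which makes explicit what the Lambertian baseline ignores at highlights on materials such as plastic.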

The depth reconstructions obtained by our technique and by the method of Tankus and Kiryati [37] for those real-world images are presented in Fig. 13. Once again, the deviation from a natural shape in the Lambertian result can be clearly observed in the output in Fig. 13 (b), shown from the same viewpoint as our result in Fig. 13 (a). In addition, let us note that the output of the Blinn-Phong model is very clear and smooth, also at highlights. The inhomogeneous recovery of the shape with the Lambertian model is shown cropped for some regions such as the chin and the tip of the nose, cf. Fig. 13 (c), where we had to rotate the Lambertian result to show these regions. The curved line appearing in the chin and the sharp point at the nose in the Lambertian reconstruction are also visible in [37]. Moreover, as stated in [37], the authors could not process the eyes in the images due to their specularities, whereas we succeeded in recovering a faithful 3-D shape even with the eyes using the complete Blinn-Phong model, as presented in Fig. 14.

6 Summary and Conclusion

A new PS framework that considers the complete perspective Blinn-Phong reflectance, including strong specular highlights, is presented. The advantages of our method over state-of-the-art PS methods and over the Lambertian model are demonstrated in a variety of experiments. The model includes a perspective camera projection, and two different techniques for incorporating the perspective projection are evaluated. In addition, we have evaluated the modeling of the CCD camera. All results are obtained using the minimum necessary number of input images, which is of practical relevance in different applications and makes PS an interesting technique for close to real-time reconstruction, where a minimal set of images is required. We have also demonstrated experimentally the merits of our PS model for challenging real-world applications, where we recover the surface with a high degree of detail. Let us also note that our computational times are reasonable, i.e. on the order of a few seconds in all experiments.
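The remark on the minimum number of input images can be illustrated in the classical orthographic Lambertian setting of [47], which is simpler than the perspective Blinn-Phong model of this paper: with three known, non-coplanar light directions, the albedo-scaled normal at a pixel follows from a single 3×3 linear system. The sketch below assumes distant light sources and a pixel that is lit in all three images; the function name and the example light directions are hypothetical.

```python
import numpy as np

def lambertian_ps_pixel(intensities, light_dirs):
    """Classical three-image Lambertian photometric stereo at one pixel.

    intensities: the 3 observed gray values at the pixel.
    light_dirs:  3x3 array, one unit light direction per row
                 (rows must be linearly independent).
    Returns (albedo, unit_normal).
    """
    L = np.asarray(light_dirs, dtype=float)
    I = np.asarray(intensities, dtype=float)
    # Lambert's law: I = albedo * L @ n, so g = albedo * n solves L g = I.
    g = np.linalg.solve(L, I)
    albedo = np.linalg.norm(g)
    return albedo, g / albedo

# Synthetic example with assumed light directions (illustrative only).
lights = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
lights /= np.linalg.norm(lights, axis=1, keepdims=True)

true_normal = np.array([0.2, -0.1, 1.0])
true_normal /= np.linalg.norm(true_normal)
true_albedo = 0.8

observed = true_albedo * lights @ true_normal
albedo, normal = lambertian_ps_pixel(observed, lights)
```

With fewer than three images the system is underdetermined. For the Blinn-Phong model the relation between intensities and normal is nonlinear, so a closed-form solve of this kind is no longer available.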

Concerning possible limitations, as with all approaches that rely on a parametric representation of surface reflectance, the corresponding additional parameters in the reflectance function have to be fixed. This issue may give rise to challenging numerical aspects in the optimization. Also, while the Blinn-Phong model already gives reasonable results, as we demonstrated, other, more sophisticated reflectance models may be more adequate for handling highly complicated surfaces, which is a possible topic of future research.

This work is supported by the Deutsche Forschungsgemeinschaft under grant number BR2245/4–1.

Figure 11: Depth reconstruction from real-world endoscopy images: (first column) results of the Lambertian model, (second column) results of the first proposed method (complete Blinn-Phong model using PPS) and (third column) results of the second proposed approach (complete Blinn-Phong model using PPN). The deviation in the Lambertian results can be clearly seen, while the results of our approach provide a faithful 3-D reconstruction without any deviation and with a high amount of detail.
Figure 12: Depth reconstruction from real-world endoscopy images: (first column) results of the Lambertian model, (second column) results of the first proposed method (complete Blinn-Phong model using PPS) and (third column) results of the second proposed approach (complete Blinn-Phong model using PPN). Once more, the deviation in the Lambertian outcomes is clear, whereas our approach provides a trustworthy 3-D reconstruction without any deviation.
Figure 13: Depth reconstructions from real-world images: (a) results of our proposed method using the complete Blinn-Phong model, (b), (c) results of [37]. We have also cropped some parts of our results and show them together with the same cropped areas of the outcomes of [37] in (c). Our approach compares favorably with [37] in terms of smoothness (the output of [37] is rough), successful reconstruction at specularities, and absence of deviation.
Figure 14: Depth reconstructions from real-world images: (a) results of our proposed method using the complete Blinn-Phong model, (b) results of [37]. As mentioned in [37], the authors could not obtain a reconstruction in the presence of eyes (due to the specularities), unlike our approach, which provides faithful results even when the eyes are included.


  • [1] The Stanford 3D Scanning Repository. Accessed: 2016-01-21.
  • [2] Y. Adato, Y. Vasilyev, T. Zickler, and O. Ben-Shahar. Shape from specular flow. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(11):2054–2070, 2010.
  • [3] M. Bähr, M. Breuß, Y. Quéau, A. S. Boroujerdi, and J. D. Durou. Fast and accurate surface normal integration on non-rectangular domains. Computational Visual Media, 3(2):107–129, 2017.
  • [4] P. Beckmann and A. Spizzichino. The scattering of electromagnetic waves from rough surfaces. Proceedings of the IEEE, 52(11):1389–1390, 1964.
  • [5] J. F. Blinn. Models of light reflection for computer synthesized pictures. In ACM SIGGRAPH Computer Graphics, volume 11, pages 192–198, 1977.
  • [6] W. M. Brandenberg and J. T. Neu. Unidirectional reflectance of imperfectly diffuse surfaces. Journal of the Optical Society of America, 56(1):97–103, 1966.
  • [7] F. Camilli and S. Tozza. A unified approach to the well-posedness of some non-lambertian models in shape-from-shading. SIAM Journal on Imaging Sciences, 10(1):26–46, 2017.
  • [8] O. Faugeras. Three-dimensional Computer Vision. The MIT Press, 1993.
  • [9] M. Galo and C. L. Tozzi. Surface reconstruction using multiple light sources and perspective projection. In International Conference on Image Processing, volume 2, pages 309–312, 1996.
  • [10] C. Godard, P. Hedman, W. Li, and G. J. Brostow. Multi-view-reconstruction of highly specular surfaces in uncontrolled environments. In 3DV, 2015.
  • [11] D. B. Goldman, B. Curless, A. Hertzmann, and S. M. Seitz. Shape and spatially-varying brdfs from photometric stereo. In CVPR, volume 1, pages 341–348, 2005.
  • [12] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2004.
  • [13] A. Hertzmann and S. M. Seitz. Shape and materials by example: a photometric stereo approach. In CVPR, volume 1, pages 533–540, 2003.
  • [14] B. K. P. Horn. Robot Vision. The M.I.T. Press, 1986.
  • [15] B. K. P. Horn, R. J. Woodham, and W. M. Silver. Determining shape and reflectance using multiple images. M.I.T. Artificial Intelligence Laboratory, Memo 490, 1978.
  • [16] I. Ihrke, K. N. Kutulakos, H. P. A. Lensch, M. Magnor, and W. Heidrich. Transparent and specular object reconstruction. Computer Graphics Forum, 29(8):2400–2426, 2010.
  • [17] M. Khanian, A. Sharifi Boroujerdi, and M. Breuß. Perspective photometric stereo beyond lambert. In QCAV, volume 9534, pages 95341F–95341F–8, 2015.
  • [18] J. Lambert and D. L. DiLaura. Photometry, or, on the measure and gradations of light, colors, and shade: Translation from the latin of photometria, sive, de mensura et gradibus luminis, colorum et umbrae. Illuminating Engineering Society of North America, 2001.
  • [19] K. Levenberg. A method for the solution of certain problems in least squares. Quarterly of Applied Mathematics, 5:164–168, 1944.
  • [20] S. P. Mallick, T. E. Zickler, D. J. Kriegman, and P. N. Belhumeur. Beyond lambert: reconstructing specular surfaces using color. In CVPR, volume 2, pages 619–626, 2005.
  • [21] D. Marquardt. An algorithm for least-squares estimation of nonlinear parameters. Journal of the Society for Industrial and Applied Mathematics, 11(2):431–441, 1963.
  • [22] R. Mecca and Y. Quéau. Unifying diffuse and specular reflections for the photometric stereo problem. In WACV, pages 1–9, 2016.
  • [23] R. Mecca, E. Rodola, and D. Cremers. Realistic photometric stereo using partial differential irradiance equation ratios. Computers and Graphics, 51:8–16, 2015.
  • [24] R. Mecca, A. Tankus, and A. F. Bruckstein. Two-image perspective photometric stereo using shape-from-shading. In ACCV, volume 7727, pages 110–121, 2012.
  • [25] D. Miyazaki, K. Hara, and K. Ikeuchi. Median photometric stereo as applied to the segonko tumulus and museum objects. International Journal of Computer Vision, 86(2):229–242, 2010.
  • [26] Y. Mukaigawa, Y. Ishii, and T. Shakunaga. Analysis of photometric factors based on photometric linearization. Journal of the Optical Society of America, 24(10):3326–3334, 2007.
  • [27] G. Oxholm and K. Nishino. Multiview shape and reflectance from natural illumination. In CVPR, pages 2163–2170, 2014.
  • [28] T. Papadhimitri and P. Favaro. A new perspective on uncalibrated photometric stereo. In CVPR, pages 1474–1481, 2013.
  • [29] M. Pharr and G. Humphreys. Physically Based Rendering: From Theory to Implementation. Morgan Kaufmann Publishers Inc, 2010.
  • [30] B. T. Phong. Illumination for computer generated pictures. Communications of the ACM, 18(6):311–317, 1975.
  • [31] Y. Quéau and J. D. Durou. Edge-preserving integration of a normal field: weighted least squares, tv and l1 approaches. In SSVM, volume 9087, pages 576–588, 2015.
  • [32] A. C. Sankaranarayanan, A. Veeraraghavan, O. Tuzel, and A. Agrawal. Specular surface reconstruction from sparse reflection correspondences. In CVPR, pages 1245–1252, 2010.
  • [33] W. A. P. Smith and F. Fang. Height from photometric ratio with model-based light source selection. Computer Vision and Image Understanding, 145:128–138, 2016.
  • [34] R. W. Sumner and J. Popović. Deformation transfer for triangle meshes. In ACM SIGGRAPH, volume 4, pages 399–405, 2004.
  • [35] H. D. Tagare and R. J. P. DeFigueiredo. A framework for the construction of general reflectance maps for machine vision. CVGIP: Image Understanding, 57(3):265–282, 1993.
  • [36] K. L. Tang, C. K. Tang, and T. T. Wong. Dense photometric stereo using tensorial belief propagation. In CVPR, volume 1, pages 132–139, 2005.
  • [37] A. Tankus and N. Kiryati. Photometric stereo under perspective projection. In ICCV, volume 1, pages 611–616, 2005.
  • [38] A. Tankus, N. Sochen, and Y. Yeshurun. Reconstruction of medical images by perspective shape-from-shading. In ICPR, pages 778–781, 2004.
  • [39] A. Tankus, N. Sochen, and Y. Yeshurun. Shape-from-shading under perspective projection. International Journal of Computer Vision, 63(1):21–43, 2005.
  • [40] K. Tatematsu, Y. Iwahori, T. Nakamura, S. Fukui, R. J. Woodham, and K. Kasugai. Shape from endoscope image based on photometric and geometric constraints. Procedia Computer Science, 22:1285–1293, 2013.
  • [41] C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: a factorization method. International Journal of Computer Vision, 9(2):137–154, 1992.
  • [42] S. Tozza, R. Mecca, M. Duocastella, and A. Del Bue. Direct differential photometric stereo shape recovery of diffuse and specular surfaces. Journal of Mathematical Imaging and Vision, 56:57–76, 2016.
  • [43] E. Trucco and A. Verri. Introductory Techniques for 3-D Computer Vision. Prentice-Hall, 1998.
  • [44] O. Vogel, L. Valgaerts, M. Breuß, and J. Weickert. Making shape from shading work for real-world images. In DAGM Pattern Recognition, volume 5748, pages 191–200. Springer Berlin Heidelberg, 2009.
  • [45] C. Wöhler. 3D Computer Vision. Springer-Verlag, 2013.
  • [46] R. J. Woodham. Photometric stereo: a reflectance map technique for determining surface orientation from image intensity. In Image Understanding Systems and Industrial Applications, SPIE, volume 0155, pages 136–143, 1978.
  • [47] R. J. Woodham. Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1):134–144, 1980.
  • [48] L. Wu, A. Ganesh, B. Shi, Y. Matsushita, Y. Wang, and Y. Ma. Robust photometric stereo via low-rank matrix completion and recovery. In ACCV, volume 6494, pages 703–717, 2010.
  • [49] Y. Xiong and S. Shafer. Depth from focusing and defocusing. In CVPR, pages 68–73, 1993.
  • [50] C. Yu, Y. Seo, and S. Lee. Photometric stereo from maximum feasible lambertian reflections. In ECCV, volume 6314, pages 115–126, 2010.