1 Introduction
The problem of photometric stereo [33] inverts the radiometric image formation model to recover surface normals from the different appearances of an object under various lighting conditions with a fixed camera view. The classic method [33] assumes an ideal Lambertian image formation model without global illumination effects (such as interreflection and shadows), which deviates from realistic scenarios and prevents photometric stereo from handling real-world objects. To make photometric stereo practical, the major difficulties lie in dealing with objects of general reflectance and with global illumination effects. These can be addressed either by exploring analytical Bidirectional Reflectance Distribution Function (BRDF) representations (e.g., [12]) and general BRDF properties (e.g., [28]) to model non-Lambertian interactions between lighting and surface normals, or by suppressing global effects by treating them as outliers (e.g., [35]). Recently, deep learning based approaches have been introduced to address these difficulties by implicitly learning both the image formation process and global illumination effects from training data (e.g., [7, 16]).
According to a comprehensive benchmark evaluation [26] (including quantitative results for representative methods published before 2016) and the additional results reported in the most recent works [7, 16, 38], a moderately dense lighting distribution (e.g., around 100 directional lights randomly sampled from the visible hemisphere) is required to achieve reasonably good normal estimation for objects with general materials (e.g., angular error around  for a shiny plastic). This is because multi-illumination observations under a dense set of lights are required to fit the parameters of analytic BRDF models [12], to analyze general BRDF properties [28], to observe sufficient inliers and outliers [35], and to ensure the convergence of neural network training [7]. How to achieve high accuracy in normal estimation for objects with general BRDFs given a sparse set of lights (e.g., 10), which we call sparse photometric stereo in this paper, remains an open yet challenging problem [26].
In this paper, we propose to solve Sparse Photometric stereo through Lighting Interpolation and Normal Estimation Networks, namely SPLINE-Net. SPLINE-Net is composed of two sub-networks: the Lighting Interpolation Network (LINet), which generates dense observations given a sparse set of input lights, and the Normal Estimation Network (NENet), which estimates surface normals from the generated dense observations. LINet takes advantage of a learnable representation for dense lighting called the observation map [16], and we propose to treat sparse observation maps as damaged paintings and generate dense observations through inpainting (as shown in Figure 1).¹ NENet then follows LINet to infer surface normals guided by dense observation maps. To accurately guide the lighting interpolation and normal estimation specifically in the photometric stereo context, we propose a symmetric loss and an asymmetric loss that explicitly consider general BRDF properties and outlier rejection. More specifically, the symmetric loss is derived from the isotropy of general reflectance, which constrains pixel values on a generated observation map to be symmetrically distributed w.r.t. an axis determined by the corresponding surface normal. The asymmetric loss is derived from observation maps contaminated by global illumination effects, which constrains the difference between values of symmetrically distributed pixels to equal a non-zero amount.
¹ Note that holes in the ground truth of observation maps are produced by the discrete projection from a limited number of lights to a grid with limited resolution (e.g., from 1000 lighting directions to observation maps with size  in this figure).
SPLINE-Net is validated to achieve superior normal estimation accuracy given a small number of input images (e.g., 10) compared to state-of-the-art methods using a much larger number (e.g., 96), which greatly reduces the data capture and lighting calibration labor for photometric stereo with general BRDFs. The contributions of this paper are twofold:

We propose SPLINE-Net to address the problem of photometric stereo with general BRDFs using a small number of images through an integrated learning procedure of lighting interpolation and normal estimation.

We show how symmetric and asymmetric loss functions can be formulated to facilitate the learning of lighting interpolation and normal estimation, taking into account the isotropy constraint and outlier rejection of global illumination effects.
2 Related Works
In this section, we briefly review traditional methods and deep learning based methods for non-Lambertian photometric stereo with general materials and known lightings. For other generalizations of photometric stereo, we refer readers to the survey papers [13, 1, 26].
Traditional methods.
The classical method [33] for photometric stereo assumes Lambertian surface reflectance and recovers surface normals pixel-wise. Such an assumption is too strong to provide accurate recovery in the real world, where non-Lambertian reflectance is densely observed due to materials with diverse reflectance and global illumination effects. To address non-Lambertian reflectance from broad classes of materials, modern algorithms attempt to describe the BRDF with a mathematically tractable form. Analytic models exploit all available data to fit a non-linear analytic BRDF, such as the Blinn-Phong model [31], the Torrance-Sparrow model [11], the Ward model and its variations [10, 12, 2], the specular spike model [9, 36], and the microfacet BRDF model [8]. Empirical models consider general properties of a BRDF, such as isotropy and monotonicity. Some basic derivations for isotropic BRDFs are provided in [4, 29, 6]. Excellent performance has been achieved by methods based on empirical models, including combining isotropy and monotonicity with a visibility constraint [14], using the isotropy constraint to estimate the elevation angle [3, 27, 20], and approximating isotropic BRDFs by bivariate functions [23, 17, 28]. However, most methods based on analytic or empirical models operate pixel-wise, so they cannot explicitly consider global illumination effects such as interreflection and cast shadows. Outlier rejection based methods are developed to suppress global illumination effects by treating them as outliers. Earlier works select a subset of Lambertian images from the inputs for accurate recovery of surface normals [22, 32, 21, 37, 35]. Recent methods apply robust analysis by assuming non-Lambertian reflectance is sparse [34, 18]. However, these methods still rely on the existence of a dominant Lambertian reflectance component.
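As a concrete reference for the classical formulation above, the Lambertian least-squares recovery [33] can be sketched in a few lines. This is an illustrative sketch (function and variable names are ours): it ignores shadows and non-Lambertian effects, which is exactly why the robust methods discussed above are needed.

```python
import numpy as np

def lambertian_ps(I, L):
    """Classic photometric stereo [33]: per pixel, solve I = L @ (rho * n)
    by least squares. I: (m,) intensities under m lights; L: (m, 3) unit
    lighting directions. Ignores shadows and non-Lambertian reflectance."""
    g, *_ = np.linalg.lstsq(L, I, rcond=None)  # g = rho * n
    rho = np.linalg.norm(g)                    # albedo = magnitude of g
    return g / rho, rho                        # unit normal, albedo

# Render one synthetic Lambertian pixel and recover its normal.
L = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
L /= np.linalg.norm(L, axis=1, keepdims=True)
n_true = np.array([0.3, 0.2, 0.93])
n_true /= np.linalg.norm(n_true)
I = 0.8 * L @ n_true                           # albedo rho = 0.8, no shadows
n_est, rho_est = lambertian_ps(I, L)
```

With three or more non-coplanar lights and no shadows the recovery is exact up to numerical precision; on real data, the outlier handling reviewed above becomes essential.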
Deep learning based methods.
Recently, with the great success of neural networks in both high-level and low-level computer vision tasks, researchers have introduced deep learning based methods to solve the problem of photometric stereo. Instead of explicitly modeling the image formation process and global illumination effects as in traditional methods, deep learning based methods attempt to learn such information from data. DPSN [24] is the first attempt; it uses a deep fully-connected network to regress surface normals from observations captured under predefined lightings in a supervised manner. However, the predefinition of lightings limits its practicality for photometric stereo, where the number of inputs often varies. PS-FCN [7] is proposed to address this limitation and handle images under various lightings in an order-agnostic manner by aggregating input features with a max-pooling operation. CNN-PS [16] is another work that accepts order-agnostic inputs by introducing the observation map, a fixed-shape representation invariant to the inputs. Besides neural networks trained in a supervised manner, Taniai and Maehara [30] presented an unsupervised learning framework where surface normals and BRDFs are recovered by minimizing a reconstruction loss between the inputs and images synthesized based on a rendering equation.
Only a few earlier works in the literature address the problem of photometric stereo with general reflectance using a small number of images (e.g., the analytic model based method [12] and the shadow analysis based method [5]). Our paper revisits this problem because of its low cost in terms of data capture and lighting calibration labor.
3 The Proposed SPLINE-Net
In this section, we introduce our solution to the problem of photometric stereo with general reflectance using a small number of images. We first present the framework of our SPLINE-Net in Section 3.1. Then we detail the symmetric loss and the asymmetric loss in Section 3.2.
3.1 Framework
As illustrated in Figure 2, our SPLINE-Net, which consists of a Lighting Interpolation Network (LINet) and a Normal Estimation Network (NENet), is optimized in a supervised manner. LINet (represented as a regression function ) interpolates dense observation maps from sparse observation maps (i.e., sparse sets of lights),
(1) 
Such densely interpolated observation maps are then concatenated with the original inputs to help estimate surface normals in NENet (represented as a regression function ),
(2) 
LINet and NENet are trained in an alternating manner, where one network is fixed while the other is optimized. Specifically, we update LINet once after updating NENet five times. The loss function of each network is composed of a reconstruction loss, a symmetric loss, and an asymmetric loss.
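The alternating schedule described above can be sketched as follows; `update_linet` and `update_nenet` are hypothetical stand-ins for one optimization step of the respective network while the other stays fixed.

```python
def alternating_train(update_linet, update_nenet, num_iters):
    """Alternating optimization as described in the text: one network is
    fixed while the other is updated, with one LINet update after every
    five NENet updates. The two arguments are hypothetical closures that
    each perform a single optimization step."""
    log = []
    for it in range(num_iters):
        update_nenet()                 # NENet step (LINet fixed)
        log.append("NENet")
        if (it + 1) % 5 == 0:          # every fifth iteration...
            update_linet()             # ...one LINet step (NENet fixed)
            log.append("LINet")
    return log

schedule = alternating_train(lambda: None, lambda: None, 10)
```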
Lighting Interpolation Network.
The basic idea of LINet is to inpaint sparse observation maps to obtain dense ones, based on learnable properties of observation maps (e.g., spatial continuity). LINet is designed with an encoder-decoder structure due to its excellent image generation capacity [19, 39]. The loss function of LINet is formulated as,
(3) 
The reconstruction loss is defined as follows (we experimentally find that L1 and L2 distances provide similar results; here we compute the reconstruction loss using the L1 distance),
(4) 
where  and  are the ground truth of a surface normal and its corresponding dense observation map, respectively,  is a binary mask indicating the positions of non-zero values of , and  represents element-wise multiplication.  and  are our symmetric and asymmetric losses, to be introduced in Section 3.2.
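A minimal sketch of such a masked reconstruction loss, assuming an L1 distance averaged over valid positions (the names and the normalization are our assumptions, not the paper's exact definition):

```python
import numpy as np

def masked_l1_loss(pred_map, gt_map, mask):
    """Masked L1 reconstruction loss: compare the generated and ground-truth
    dense observation maps only where the binary mask marks non-zero
    ground-truth entries."""
    diff = np.abs(pred_map - gt_map) * mask    # element-wise multiplication
    return diff.sum() / max(mask.sum(), 1.0)   # mean over valid positions

gt = np.array([[0.0, 0.5], [0.2, 0.0]])
mask = (gt > 0).astype(float)                  # valid (observed) positions
pred = np.array([[0.9, 0.4], [0.3, 0.7]])      # errors at masked-out cells are ignored
loss = masked_l1_loss(pred, gt, mask)
```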
Normal Estimation Network.
We use the same architecture as that in [16] (a variation of DenseNet [15]) for NENet due to its excellent capacity to model the relation between observation maps and surface normals. The loss function of NENet is formulated as,
(5) 
where  and  are the symmetric and asymmetric losses, and the reconstruction loss is,
(6) 
3.2 Symmetric Loss and Asymmetric Loss
In this section, we first revisit the observation map introduced in [16]. Then, we further investigate its characteristics by considering isotropic BRDFs and global illumination effects. Finally, we introduce our symmetric and asymmetric loss functions.
Observation maps.
As introduced in [16], each point on a surface normal map corresponds to an observation map (as shown in Figure 3 (a)). The elements of such a map describe observed irradiance values under different lighting directions, and each lighting direction is mapped to an element position by orthogonal projection. As illustrated in Figure 3 (a), a dense observation map can be regarded as generated by projecting a hemisphere surface onto its base plane, where each point on the hemisphere surface represents a lighting direction and its projected value is the irradiance observed under that light. This projection relation motivates us to introduce isotropy to narrow the solution space of our SPLINE-Net.
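A minimal sketch of building an observation map in the spirit of [16], assuming unit lighting directions from the visible hemisphere and an orthogonal projection of (lx, ly) onto a square grid; the grid size and quantization here are illustrative assumptions:

```python
import numpy as np

def build_observation_map(lights, intensities, w=32):
    """Place each observed irradiance at the orthogonal projection of its
    unit lighting direction (lx, ly, lz), lz >= 0, onto a w x w grid.
    Colliding directions simply overwrite earlier values in this sketch."""
    omap = np.zeros((w, w))
    for (lx, ly, _lz), val in zip(lights, intensities):
        u = min(int(w * (lx + 1.0) / 2.0), w - 1)  # map [-1, 1] -> column
        v = min(int(w * (ly + 1.0) / 2.0), w - 1)  # map [-1, 1] -> row
        omap[v, u] = val
    return omap

lights = [(0.0, 0.0, 1.0), (0.5, 0.0, 0.866)]      # two directional lights
omap = build_observation_map(lights, [0.9, 0.4], w=8)
```

A sparse set of lights fills only a few grid cells, which is exactly the inpainting setting that LINet addresses.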
Isotropic BRDFs and global illumination effects.
Isotropic BRDFs for general materials have the property that reflectance values are numerically equal if the lighting directions are symmetric about the plane spanned by  and , as shown in Figure 3 (a). Considering the one-to-one mapping between lighting directions and the positions of observed irradiance values, these values are numerically equal if their positions are symmetrically distributed with respect to the axis projected from the surface normal on the observation map, as shown in Figure 3 (b). However, such symmetric patterns can be destroyed by global illumination effects, since observation maps are generated pixel-wise: unpredictable shapes produce cast shadows or interreflection under certain lighting directions, leading to sudden changes of irradiance values on observation maps. Figure 4 illustrates examples of isotropy, cast shadows, and interreflection.
Symmetric and asymmetric loss functions.
In order to further narrow the solution space of lighting interpolation for dense observation maps and thus facilitate accurate estimation of the surface normal, we propose symmetric and asymmetric loss functions that exploit the above observations for LINet and NENet. More specifically, given a dense observation map  and its corresponding surface normal , the symmetric loss is introduced to enforce the isotropic property of general BRDFs, which holds for a wide variety of real-world reflectances. That is, it constrains irradiance values that are symmetrically distributed w.r.t. an axis determined by the surface normal (red dotted lines in Figure 4) to be numerically equal,
(7) 
where the function  mirrors the observation map w.r.t. the axis determined by . Different from the symmetric loss, the asymmetric loss is introduced to model the asymmetric patterns brought by outliers such as global illumination effects. It constrains the difference between values of symmetrically distributed pixels to be equal to a non-zero amount ,
(8) 
where  is a weight parameter and the function  performs an average pooling operation with a stride of 2 to ensure the spatial continuity of observation maps. Empirically, we set  for all experiments. Both  and  aim to better fit observations with symmetric and asymmetric patterns (as illustrated in Figure 4) during training. We integrate the symmetric and asymmetric loss functions to optimize LINet by setting,
(9) 
and to optimize NENet by setting,
(10) 
We empirically set  based on the fact that global illumination effects are typically observed only in small regions of real-world objects.
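Under simplifying assumptions, the two losses can be sketched as follows. We assume here that each observation map has been rotated so that the symmetry axis determined by the normal is the vertical center line (so mirroring reduces to a left-right flip), and the constant `eta` is an illustrative value, not the paper's setting.

```python
import numpy as np

def symmetric_loss(omap):
    """Isotropy constraint: the map should equal its mirror image about
    the axis fixed by the surface normal (assumed vertical here)."""
    return np.abs(omap - np.fliplr(omap)).mean()

def avg_pool2(omap):
    """Average pooling with stride 2, for spatial continuity."""
    h, w = omap.shape
    return omap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def asymmetric_loss(omap, eta=0.02):
    """Outlier model: symmetric pixel pairs of the pooled map should
    differ by a non-zero amount eta (an illustrative constant)."""
    pooled = avg_pool2(omap)
    gap = np.abs(pooled - np.fliplr(pooled)).mean()
    return np.abs(gap - eta)

clean = np.array([[0.1, 0.2, 0.2, 0.1],
                  [0.3, 0.5, 0.5, 0.3],
                  [0.3, 0.5, 0.5, 0.3],
                  [0.1, 0.2, 0.2, 0.1]])   # isotropic: mirror-symmetric
shadowed = clean.copy()
shadowed[:, :2] *= 0.5                     # a cast shadow darkens one side
```

The symmetric loss vanishes on the clean map and is non-zero on the shadowed one, matching the symmetric/asymmetric patterns of Figure 4.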
4 Experiments
In this section, we report our experimental results on one synthetic dataset and one real dataset in Section 4.1 and Section 4.2, respectively. We further analyze the effectiveness of LINet, symmetric loss, and asymmetric loss by ablation studies in Section 4.3.
Settings and implementation details.
A recent survey [26] implies that photometric stereo with general BRDFs shows a significant performance drop if only around 10 images are provided. Therefore, we take 10 as the number of sparse lights and use 10 randomly sampled lights as inputs to SPLINE-Net for both training and testing. The resolution of our observation maps is set to  and the batch size is set to , the same as in [16], for easier comparison. All our experiments are performed on a machine with a single GeForce GTX 1080 Ti and 12GB RAM. The Adam optimizer with default parameters  and  is used to optimize our networks.
Datasets and evaluation.
We use the CyclesPS data provided by [16] as our training data. There are 45 training samples in total, covering 15 shapes with 3 categories of reflectance (diffuse, metallic, and specular). Our testing sets are built from public evaluation datasets: we construct 100 instances for each test object from these datasets. Each instance contains images illuminated under 10 randomly selected lights so as to cover as many lighting conditions as possible. Quantitative results are averaged over the 100 instances of each test object.
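The instance construction and the evaluation can be sketched as follows; the metric (mean angular error in degrees between estimated and ground-truth normals) is the standard benchmark convention and is assumed here rather than quoted from the paper.

```python
import numpy as np

def sample_instances(num_lights, num_instances=100, k=10, seed=0):
    """Build test instances as described: each instance selects k = 10
    light indices at random (without replacement) from those available."""
    rng = np.random.default_rng(seed)
    return [rng.choice(num_lights, size=k, replace=False)
            for _ in range(num_instances)]

def mean_angular_error(n_est, n_gt):
    """Mean angular error in degrees between two unit-normal maps of
    shape (h, w, 3); results are then averaged over the 100 instances."""
    cos = np.clip((n_est * n_gt).sum(axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

instances = sample_instances(num_lights=96)    # e.g., the 96 DiLiGenT lights
```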
Compared methods.
We compare our method with five methods: the linear least squares based Lambertian photometric stereo method LS [33], an outlier rejection based method IW12 [18], two state-of-the-art methods exploring general BRDF properties, ST14 [28] and IA14 [17], and a deep learning based method, CNN-PS [16].³ We retrain CNN-PS [16] to take ten observed irradiance values as inputs so that it addresses the problem of photometric stereo using a small number of images.⁴
³ We have conducted evaluations using the same training and testing data for PS-FCN [7]. However, the performance of PS-FCN [7] on CyclesPS-Test [16] (19.29 and 18.10) and DiLiGenT [26] (24.41) is not as expected. The major reason is that PS-FCN [7] requires training data with various shapes, while our training data CyclesPS [16] only contains three shapes with diverse reflectance. For a fair comparison, we do not compare with PS-FCN [7] in our experiments.
⁴ Considering the overall quantitative results of CNN-PS [16] with its default setting (taking 50 observed irradiance values as inputs) on CyclesPS-Test [16] (31.08 and 34.90) and DiLiGenT [26] (14.10), we report results retrained with our setting.
Table 1: Quantitative comparison on the two CyclesPS-Test subsets, L17 (top) and L305 (bottom). M: metallic, S: specular.

L17 subset (M / S per shape):
Method           paperbowl        sphere           turtle           Avg.
LS [33]          41.47 / 35.09    18.85 / 10.76    27.74 / 19.89    25.63
IW12 [18]        46.68 / 33.86    16.77 /  2.23    31.83 / 12.65    24.00
ST14 [28]        42.94 / 35.13    22.58 /  4.18    34.30 / 17.01    26.02
IA14 [17]        48.25 / 43.51    18.62 / 11.71    30.59 / 23.55    29.37
CNN-PS [16]      37.14 / 23.40    17.44 /  6.99    22.86 / 10.74    19.76
SPLINE-Net       29.87 / 18.65     6.59 /  3.82    15.07 /  7.85    13.64

L305 subset (M / S per shape):
Method           paperbowl        sphere           turtle           Avg.
LS [33]          43.09 / 37.36    20.19 / 12.79    28.51 / 21.76    27.28
IW12 [18]        48.01 / 37.10    21.93 /  3.19    34.91 / 16.32    26.91
ST14 [28]        44.44 / 37.35    25.41 /  4.89    36.01 / 19.06    27.86
IA14 [17]        49.01 / 45.37    21.52 / 13.63    32.82 / 26.27    31.44
CNN-PS [16]      38.45 / 26.90    18.25 /  9.04    23.91 / 14.36    21.82
SPLINE-Net       33.99 / 23.15     9.21 /  6.69    17.35 / 12.01    17.07
Table 2: Quantitative comparison on the DiLiGenT benchmark [26].

Methods        ball   bear   buddha  cat    cow    goblet  harvest  pot1   pot2   reading  Avg.
LS [33]        4.41   9.05   15.62   9.03   26.42  19.59   31.31    9.46   15.37  20.16    16.04
IW12 [18]      3.33   7.62   13.36   8.13   25.01  18.01   29.37    8.73   14.60  16.63    14.48
ST14 [28]      5.24   9.39   15.79   9.34   26.08  19.71   30.85    9.76   15.57  20.08    16.18
IA14 [17]      12.94  16.40  20.63   15.53  18.08  18.73   32.50    6.28   14.31  24.99    19.04
CNN-PS [16]    17.86  13.08  19.25   15.67  19.28  21.56   21.52    16.95  18.52  21.30    18.50
LINet+NENet    6.06   7.01   10.69   8.38   10.39  11.37   19.02    9.42   12.34  16.18    11.09
Nets+Sym.      5.04   5.89   10.11   7.79   9.38   10.84   19.03    8.91   11.47  15.87    10.43
SPLINE-Net     4.96   5.99   10.07   7.52   8.80   10.43   19.05    8.77   11.79  16.13    10.35
4.1 Synthetic Data
CyclesPS-Test is a testing dataset published by [16], which consists of two subsets generated under different numbers of lightings (17 or 305). Both subsets (denoted as L17 and L305 in this paper) contain three shapes: paperbowl, sphere, and turtle. Each shape is spatially divided into 100 parts, and each part is rendered with varied reflectance of either the metallic or the specular category, to approximate general reflectance with diverse materials. There are 6 test objects in each of the two subsets. For all these 12 objects, we construct 100 instances, each containing 10 randomly selected images, to build the testing set.
As can be observed from Table 1, metallic materials are more challenging than specular materials when using a small number of images, due to the more abrupt changes in their BRDFs. Even for a simple shape like sphere, which contains few global illumination effects, all methods fail to estimate accurate surface normals for metallic materials. The proposed SPLINE-Net shows a clear performance advantage: its overall performance (13.64 and 17.07) is much better than that of the second best method, CNN-PS [16] (19.76 and 21.82). Interestingly, two traditional methods, IW12 [18] and ST14 [28], outperform other methods on sphere with specular materials. However, they are not able to achieve accurate results on complex shapes like paperbowl or turtle, or on metallic materials, because they ignore global illumination effects, while our method consistently achieves the best performance.
Visual quality comparisons in Figure 5 further validate the effectiveness of our method. Even though the overall performance is worse than that of IW12 [18] on sphere, our method deals with specular reflectance in a more robust manner. The error maps on a more difficult shape, paperbowl, show that our method consistently produces the best estimation for most regions. Both the quantitative performance and the visualization results on synthetic data demonstrate the effectiveness of our method in addressing the problem of photometric stereo with general BRDFs using a small number of images.
4.2 Real Data
DiLiGenT [26] is a benchmark dataset consisting of 10 real objects with various reflectance. Each object provides 16-bit images from 96 known lighting directions distributed on a grid spanning . Similarly, for each of these 10 objects, we construct 100 instances, each containing 10 randomly selected images, to build the testing set.
The quantitative results are reported in Table 2. The proposed SPLINE-Net demonstrates obvious superiority on most objects except ball and pot1, where it obtains similar or worse results (4.96 and 8.77) compared to two traditional methods, LS [33] (4.41 and 9.46) and IW12 [18] (3.33 and 8.73). The reason is that these two objects are diffuse-dominant, so traditional methods with the Lambertian assumption can fit them well even with a small number of observed irradiance values. However, our data-driven approach considers general reflectance and global illumination effects evenly during model optimization and hence may underfit Lambertian surfaces with simple shapes. Unlike its excellent performance on synthetic data, CNN-PS [16] achieves lower accuracy on real data, i.e., its performance on real data is not even comparable to that of traditional methods. We attribute this mainly to overfitting during training, i.e., the synthetic testing data are constructed in a manner similar to the training data. Our method achieves the best accuracy on most of the objects, such as cow (metallic paint material), goblet, and harvest (where most regions contain interreflection or cast shadows).
Visual quality comparisons on cow and pot1 are shown in Figure 6. Our method provides much more accurate results on cow thanks to its excellent capacity for modeling metallic materials, and this performance advantage is consistent with that on synthetic data with metallic materials reported in Table 1. Even though our method achieves performance similar to other methods (LS [33], IW12 [18], ST14 [28]) on the center regions of pot1, it provides more accurate estimation at boundaries (e.g., the regions of the spout and the kettle holder). Both the quantitative and the visual quality results on real data further validate the effectiveness of our method.
4.3 Ablation Studies
In this section, we perform ablation studies to further investigate the contribution of the important components of SPLINE-Net. Specifically, the effectiveness of LINet, the symmetric loss, and the asymmetric loss is studied independently. Unless otherwise stated, all methods in this section use exactly the same settings as in Section 4.1 and Section 4.2. We use the same testing set as in Section 4.2, built from the real dataset DiLiGenT [26], for evaluation.
Effectiveness of LINet.
Considering that NENet has the same network structure as CNN-PS [16] and that our SPLINE-Net is composed of LINet and NENet, we compare our SPLINE-Net without the symmetric or asymmetric loss (denoted as ‘LINet+NENet’) against CNN-PS [16] to validate the effectiveness of LINet. Their quantitative and qualitative performance is shown in Table 2 and Figure 8, respectively. As can be observed, LINet+NENet significantly outperforms CNN-PS [16], which verifies the effectiveness of using LINet to generate dense observation maps for estimating surface normals.
Effectiveness of symmetric loss.
We then validate the effectiveness of the symmetric loss by comparing the performance of LINet+NENet with that of the method with an additional symmetric loss (denoted as ‘Nets+Sym.’). This experiment verifies the effectiveness of enforcing the isotropy property. Figure 7 shows that the symmetric loss helps generate more reliable dense observation maps. As displayed in Table 2, the quantitative performance is consistently improved for all objects by introducing the symmetric loss. Visual comparisons in Figure 8 intuitively show this advantage for general materials (orange rectangles).
Effectiveness of asymmetric loss.
We compare Nets+Sym. and SPLINE-Net in this experiment to verify the effectiveness of our consideration of global illumination effects. As can be observed in Figure 7, the asymmetric loss helps produce more reliable dense observation maps. Interestingly, our methods (Nets+Sym. and SPLINE-Net) successfully inpaint regions damaged by global illumination effects, as shown in the second row of Figure 7. The overall quantitative performance is improved by introducing the asymmetric loss, as shown in Table 2. Most of the improvements are observed for objects with heavy global illumination effects, e.g., goblet (interreflection) and harvest (cast shadows). Visual comparisons in Figure 8 intuitively show this advantage (red rectangles).
5 Conclusion
This paper proposes SPLINE-Net to address the problem of photometric stereo with general reflectance and global illumination effects using a small number of images. The basic idea of SPLINE-Net is to generate dense lighting observations from a sparse set of lights to guide the estimation of surface normals. The proposed SPLINE-Net is further constrained by the proposed symmetric and asymmetric loss functions to enforce the isotropy constraint and perform outlier rejection of global illumination effects.
Limitations.
Interestingly, even though deep learning based methods achieve superior performance for non-Lambertian reflectance, their performance drops for diffuse-dominant surfaces that can be well fitted by traditional methods with the Lambertian assumption. Figure 9 illustrates the results of four traditional methods and three deep learning based methods (including PS-FCN [7]) on two real objects with diffuse surfaces.⁵ These results, which are consistent with those of ball and pot1 in Table 2, indicate the limitation of deep learning methods on diffuse surfaces. Explicitly considering diffuse surfaces while maintaining the performance advantage on non-Lambertian surfaces for deep learning based methods can be a direction for future work.
⁵ Buddha is courtesy of Dan Goldman and Steven Seitz (found at http://www.cs.washington.edu/education/courses/csep576/05wi//projects/project3/project3.htm). Sheep is from [25].
References
 [1] J. Ackermann and M. Goesele. A survey of photometric stereo techniques. Foundations and Trends in Computer Graphics and Vision, 9(3-4):149–254, 2015.
 [2] J. Ackermann, F. Langguth, S. Fuhrmann, and M. Goesele. Photometric stereo for outdoor webcams. In Proc. of Computer Vision and Pattern Recognition, pages 262–269, 2012.
 [3] N. Alldrin, T. Zickler, and D. Kriegman. Photometric stereo with non-parametric and spatially-varying reflectance. In Proc. of Computer Vision and Pattern Recognition, pages 1–8, 2008.
 [4] N. G. Alldrin and D. J. Kriegman. Toward reconstructing surfaces with arbitrary isotropic reflectance: A stratified photometric stereo approach. In Proc. of International Conference on Computer Vision, pages 1–8, 2007.
 [5] M. Chandraker, S. Agarwal, and D. Kriegman. ShadowCuts: Photometric stereo with shadows. In Proc. of Computer Vision and Pattern Recognition, pages 1–8, 2007.
 [6] M. Chandraker, J. Bai, and R. Ramamoorthi. On differential photometric reconstruction for unknown, isotropic BRDFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12):2941–2955, 2013.
 [7] G. Chen, K. Han, and K.-Y. K. Wong. PS-FCN: A flexible learning framework for photometric stereo. In Proc. of European Conference on Computer Vision, pages 3–19, 2018.
 [8] L. Chen, Y. Zheng, B. Shi, A. Subpa-Asa, and I. Sato. A microfacet-based reflectance model for photometric stereo with highly specular surfaces. In Proc. of International Conference on Computer Vision, pages 3162–3170, 2017.
 [9] T. Chen, M. Goesele, and H.-P. Seidel. Mesostructure from specularity. In Proc. of Computer Vision and Pattern Recognition, volume 2, pages 1825–1832, 2006.
 [10] H.-S. Chung and J. Jia. Efficient photometric stereo on glossy surfaces with wide specular lobes. In Proc. of Computer Vision and Pattern Recognition, pages 1–8, 2008.
 [11] A. S. Georghiades. Incorporating the Torrance and Sparrow model of reflectance in uncalibrated photometric stereo. In Proc. of International Conference on Computer Vision, volume 3, page 816, 2003.
 [12] D. B. Goldman, B. Curless, A. Hertzmann, and S. M. Seitz. Shape and spatially-varying BRDFs from photometric stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6):1060–1071, 2010.
 [13] S. Herbort and C. Wöhler. An introduction to image-based 3D surface reconstruction and a survey of photometric stereo methods. 3D Research, 2(3):1–17, 2011.
 [14] T. Higo, Y. Matsushita, and K. Ikeuchi. Consensus photometric stereo. In Proc. of Computer Vision and Pattern Recognition, pages 1157–1164, 2010.
 [15] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In Proc. of Computer Vision and Pattern Recognition, pages 4700–4708, 2017.
 [16] S. Ikehata. CNN-PS: CNN-based photometric stereo for general non-convex surfaces. In Proc. of European Conference on Computer Vision, pages 3–18, 2018.
 [17] S. Ikehata and K. Aizawa. Photometric stereo using constrained bivariate regression for general isotropic surfaces. In Proc. of Computer Vision and Pattern Recognition, pages 2179–2186, 2014.
 [18] S. Ikehata, D. Wipf, Y. Matsushita, and K. Aizawa. Robust photometric stereo using sparse regression. In Proc. of Computer Vision and Pattern Recognition, pages 318–325, 2012.

 [19] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In Proc. of European Conference on Computer Vision, pages 694–711, 2016.
 [20] S. Li and B. Shi. Photometric stereo for general isotropic reflectances by spherical linear interpolation. Optical Engineering, 54(8):083104, 2015.
 [21] D. Miyazaki, K. Hara, and K. Ikeuchi. Median photometric stereo as applied to the Segonko tumulus and museum objects. International Journal of Computer Vision, 86(2-3):229, 2010.
 [22] Y. Mukaigawa, Y. Ishii, and T. Shakunaga. Analysis of photometric factors based on photometric linearization. JOSA A, 24(10):3326–3334, 2007.
 [23] F. Romeiro, Y. Vasilyev, and T. Zickler. Passive reflectometry. In Proc. of European Conference on Computer Vision, pages 859–872, 2008.
 [24] H. Santo, M. Samejima, Y. Sugano, B. Shi, and Y. Matsushita. Deep photometric stereo network. In Proc. of IEEE International Conference on Computer Vision Workshops, pages 501–509, 2017.
 [25] B. Shi, Y. Matsushita, Y. Wei, C. Xu, and P. Tan. Self-calibrating photometric stereo. In Proc. of Computer Vision and Pattern Recognition, pages 1118–1125, 2010.
 [26] B. Shi, Z. Mo, Z. Wu, D. Duan, S.-K. Yeung, and P. Tan. A benchmark dataset and evaluation for non-Lambertian and uncalibrated photometric stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2):271–284, 2019.
 [27] B. Shi, P. Tan, Y. Matsushita, and K. Ikeuchi. Elevation angle from reflectance monotonicity: Photometric stereo for general isotropic reflectances. In Proc. of European Conference on Computer Vision, pages 455–468, 2012.
 [28] B. Shi, P. Tan, Y. Matsushita, and K. Ikeuchi. Bi-polynomial modeling of low-frequency reflectances. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6):1078–1091, 2014.
 [29] P. Tan, L. Quan, and T. Zickler. The geometry of reflectance symmetries. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12):2506–2520, 2011.

 [30] T. Taniai and T. Maehara. Neural inverse rendering for general reflectance photometric stereo. In International Conference on Machine Learning, pages 4864–4873, 2018.
 [31] S. Tozza, R. Mecca, M. Duocastella, and A. Del Bue. Direct differential photometric stereo shape recovery of diffuse and specular surfaces. Journal of Mathematical Imaging and Vision, 56(1):57–76, 2016.
 [32] F. Verbiest and L. Van Gool. Photometric stereo with coherent outlier handling and confidence estimation. In Proc. of Computer Vision and Pattern Recognition, pages 1–8, 2008.
 [33] R. J. Woodham. Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1):191139–191139, 1980.
 [34] L. Wu, A. Ganesh, B. Shi, Y. Matsushita, Y. Wang, and Y. Ma. Robust photometric stereo via low-rank matrix completion and recovery. In Proc. of Asian Conference on Computer Vision, pages 703–717, 2010.

 [35] T.-P. Wu and C.-K. Tang. Photometric stereo via expectation maximization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3):546–560, 2010.
 [36] S.-K. Yeung, T.-P. Wu, C.-K. Tang, T. F. Chan, and S. J. Osher. Normal estimation of a transparent object using a video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(4):890–897, 2015.
 [37] C. Yu, Y. Seo, and S. W. Lee. Photometric stereo from maximum feasible Lambertian reflections. In Proc. of European Conference on Computer Vision, pages 115–126, 2010.
 [38] Q. Zheng, A. Kumar, B. Shi, and G. Pan. Numerical reflectance compensation for non-Lambertian photometric stereo. IEEE Transactions on Image Processing, 2019.

 [39] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proc. of International Conference on Computer Vision, pages 2223–2232, 2017.