Color Constancy with Derivative Colors

11/25/2016 ∙ by Huan Lei, et al. ∙ Xidian University and The Hong Kong University of Science and Technology

Information about the illuminant color is well contained in both achromatic regions and the specular components of highlight regions. In this paper, we propose a novel way to achieve color constancy by exploiting such clues. The key to our approach lies in the use of suitably extracted derivative colors, from which the illuminant color can be computed robustly via kernel density estimation. While extracting derivative colors that approximate the illuminant color well from achromatic regions is basically straightforward, the success of our extraction in highlight regions is attributed to the different rates of variation of the diffuse and specular magnitudes in the dichromatic reflection model. The proposed approach requires no training phase and is simple to implement. More significantly, it performs quite satisfactorily under inter-database parameter settings. Our experiments on three standard databases demonstrate its effectiveness and strong performance in comparison with state-of-the-art methods.


1 Introduction

The human visual system inherently possesses the ability of color constancy. For images captured by digital cameras, however, object colors are shifted by variations of the illuminant. In practice, many high-level computer vision applications demand color constancy as a preprocessing step to ensure that the same object color taken under different illuminants can be accurately matched to its canonical one. The most significant part of computational color constancy is estimating the illuminant color.

Many color constancy algorithms have been presented so far. According to whether the parameter setting of an algorithm is kept fixed or not, they can be classified into the static and the learning-based groups gijsenij2011computational . Alternatively, according to whether an algorithm depends on understanding the physical process of the reflected light, color constancy approaches can be categorized into the physics-based and the statistics-based groups finlayson2001solving , which is the categorization we adopt in this paper. Although the statistics-based methods are criticized as error-prone when the object colors are limited (e.g., forsyth1990novel ; finlayson2001color ; cardei2002estimating ) while the physics-based ones are not in theory, existing physics-based methods are almost uniformly inferior to the statistics-based ones, particularly due to their mediocre performance on the standard databases. Nonetheless, the physics-based methods have such advantages as fast execution, few parameters and no training phase gijsenij2011computational , which continually attract researchers in this field to exploit physics-based information for better color constancy algorithms.

Among the physics-based methods, the dichromatic reflection model shafer1985using has been widely explored to achieve color constancy, for two dominant reasons. Firstly, the dichromatic reflection model provides a finer description of the imaging process than the traditional Lambertian model. Secondly, under the neutral interface reflection (NIR) assumption lee1986method , the specular components of highlight pixels perfectly contain information about the illuminant color. The earlier methods solve for color constancy with the intersection of dichromatic lines formed by different object colors (e.g., lee1986method ; tominaga1996multichannel ; finlayson2001convex ). Yet, they are rarely functional outside the lab, since a challenging pre-segmentation of different object colors is demanded in the highlight regions. In addition, they are inapplicable to scenes with uniform object color because of the absence of intersections. The Planckian locus was later introduced as a constraint, which makes estimation from uniform object color possible finlayson2001solving . Subsequently, the inverse-intensity chromaticity (IIC) space was defined to estimate the illuminant color tan2004color , which eliminates the requirement for pre-segmentation. Although that approach is suitable for both uniform and highly textured surfaces, its performance is unacceptable on the standard databases. There are also methods estimating the illuminant color from the intersection of dichromatic planes formed by different object colors (e.g., toro2007multilinear ; shi2008dichromatic ; toro2008dichromatic ), which are therefore inapplicable to uniform object colors as well; in addition, the identification of different dichromatic planes is rather difficult. Recently, the geometric mean of highlight pixels was suggested as the illuminant color drew2014zeta , which again avoids pre-segmentation. More excitingly, this approach improves the estimation accuracy reported by the physics-based methods to a large degree.

Similar to the specular components of highlight pixels, pixels whose albedos are achromatic also contain a perfect reflection of the illuminant, whether they are diffuse or carry specular components. In this paper, we achieve color constancy for a single image based on both achromatic regions and specularity in highlight regions. Although estimating the illuminant color at the pixel level is effective in achromatic regions, it is difficult to extract information about the illuminant color from highlight regions this way. We therefore propose to estimate the illuminant color using derivative colors, which successfully combine the illuminant information from both achromatic and highlight regions. While extracting derivative colors that approximate the illuminant color well from achromatic regions is basically straightforward, the success of our extraction in highlight regions is attributed to the different rates of variation of the diffuse and specular magnitudes in the dichromatic reflection model. Thus, compared to the physics-based methods built purely on the Lambertian model (e.g., geusebroek2001color ; geusebroek2002physical ), the proposed approach exploits the significant specular information as well. On the other hand, unlike the dichromatic-line-based or dichromatic-plane-based methods, the proposed approach readily overcomes the challenge of pre-segmenting different object colors and is applicable to both uniform and highly textured scenes. Meanwhile, in contrast to IIC, which demands enough highlight pixels of the same object color with constant diffuse magnitude, the proposed approach eliminates this inappropriate constraint and the dependence on the diffuse component. More importantly, by extracting the derivative colors with suitable spatial operators, the proposed approach is able to estimate the illuminant color robustly with kernel density estimation. Similar to the previous physics-based methods, the proposed approach requires no training session. Yet, it overcomes their drawbacks of difficult implementation and mediocre performance.

Indeed, we are not the first to perform the estimation from the derivative structure of an image. Methods exploring derivative structures exist in the statistics-based group (e.g., van2007edge ; gijsenij2010generalized ; chakrabarti2012color ; finlayson2013corrected ). van de Weijer et al. proposed the gray-edge method, which assumes the average reflectance in the derivative structure of an image to be achromatic van2007edge . One drawback of that approach is that it uses information from the derivative structure indiscriminately; after all, not all pixels' derivatives are helpful for estimating the illuminant color. Another drawback is its inferior inter-database performance: when the optimal parameter setting from one database is applied to another, its performance becomes mediocre. Two interesting works based on the gray-edge method then suggested that the derivative information should be utilized selectively gijsenij2012improving ; joze2012role . In particular, Gijsenij et al. experimentally showed that the performance of the gray-edge method can be improved significantly if the contributions of specular edges are weighted higher gijsenij2012improving . One thing that should be noted explicitly is that their edge classification mechanism, following van2005edge , classifies the edges of achromatic pixels as both specular and shadow edges. Vaezi Joze et al. extended the white-patch assumption land1977retinex to bright pixels and alternatively showed that constraining the gray-edge method to the bright pixels can obtain better color constancy results joze2012role . Nevertheless, like the original gray-edge method, they still suffer from unsatisfactory performance under inter-database configurations. In contrast, to the best of our knowledge, we are the first to analyze the physical feasibility of estimating the illuminant color from the derivative structure of an image in the chromaticity space, which gives a more in-depth explanation for the improved performance achieved by gijsenij2012improving ; joze2012role . Superior to van2007edge ; gijsenij2012improving ; joze2012role , the proposed approach solves the problem in the chromaticity space with nonparametric kernel density estimation, eliminating the requirement for careful parameter tuning. As a matter of fact, it performs rather satisfactorily under inter-database parameter settings. As for the remaining statistics-based methods that solve color constancy from the derivative structure of an image, they all require training data to be available (i.e., gijsenij2010generalized ; chakrabarti2012color ; finlayson2013corrected ). The proposed approach, despite requiring no training session, performs competitively with or better than them. Cheng et al. attributed the effectiveness of these derivative-based methods to color differences, based purely on experimental evidence cheng2014illuminant . Yet, according to our analysis, such success is due to the extraction of derivative colors from the achromatic and highlight regions. We conduct experiments on three linear standard databases: the SFU laboratory database barnard2002comparison , the Gehler-Shi database shi2010data , and the SFU HDR database funt2010rehabilitation . Our experimental results demonstrate the effectiveness and strong performance of the proposed approach in comparison with state-of-the-art methods.

2 Related Work

We first review methods in the physics-based group, focusing on those based on the dichromatic reflection model, and then methods in the statistics-based group.

Klinker et al. suggested extracting a T-shaped color distribution from a uniform object color in the RGB space klinker1988measurement , which is rather difficult and unreliable when applied to real images. Lee proposed to estimate the illuminant color by detecting intersections of multiple dichromatic lines in the chromaticity space lee1986method . Many approaches were then introduced to improve its performance tominaga2002natural ; tominaga1989standard ; tominaga1996multichannel ; lehmann2001color ; finlayson2001convex . In particular, Finlayson and Schaefer imposed a constraint on the illuminant color based on statistics of natural illuminant colors finlayson2001convex . However, since these approaches rely on intersections of different dichromatic lines, they are all infeasible for images with uniform object color. Nevertheless, the chromaticities of common light sources all lie close to the Planckian locus of black-body radiators, as shown in finlayson2001solving ; mazin2012illuminant . Therefore, Finlayson and Schaefer further utilized the Planckian locus as a constraint to address the above-mentioned issue and solved the problem in theory finlayson2001solving . However, applying this method to images with multiple object colors demands a challenging pre-segmentation to acquire clusters of uniform object color. Tan et al. introduced an inverse-intensity chromaticity (IIC) space to estimate the illuminant color using the Hough transform tan2004color . Although the approach is suitable for both uniform and highly textured surfaces, its performance is unacceptable on the standard databases. Toro and Funt presented a multi-linear constraint on the illuminant color with several dichromatic planes toro2007multilinear , which requires the representative colors of any given material to be identifiable; the identification is fulfilled with generalized principal component analysis in toro2007multilinear and with mean shift in toro2008dichromatic . Shi et al. relaxed such requirements with a voting procedure that performs the Hough transform twice shi2008dichromatic . However, this voting procedure still estimates the color of the scene illuminant based on the intersection of different dichromatic planes, so it ceases to be effective for uniform surfaces. Meanwhile, using the Hough transform twice inevitably makes the estimation rather time-consuming. All of these approaches, despite reasonable theoretical derivations, either are difficult to implement or show mediocre performance on real images, which significantly limits their application in practice. Recently, Drew et al. proposed to estimate the illuminant color from a novel feature named the zeta-image drew2014zeta . They detected specular pixels with a planar constraint and improved the estimation accuracy to a large degree. In contrast, Prinet et al. proposed to extract specular information from the temporal derivative structure of video sequences and estimate the illuminant color with maximum a posteriori (MAP) estimation prinet2013illuminant . Nevertheless, that method is inapplicable to color constancy from a single image.

As for the statistics-based group, it can be further categorized into two dominant subgroups: the derivative-based methods and the pixel-based methods. Since the derivative-based methods are more closely related to our work, we analyze them first, and then the pixel-based methods.

van de Weijer et al. generalized the earlier gray assumptions to the derivative structures of an image and presented the gray-edge method van2007edge , whose performance depends highly on carefully tuned parameters. Generalized gamut mapping gijsenij2010generalized , which builds the gamut from the derivative structure of the image rather than the colors originally present in it, performs more stably than the original algorithm forsyth1990novel . Chakrabarti et al. proposed to solve color constancy by modeling the spatial-spectral statistics of an image, originally with a Gaussian function chakrabarti2008color . Later, they improved the performance by replacing the Gaussian with a heavy-tailed radial exponential function chakrabarti2012color , and reported satisfactory results on the Gehler-Shi database shi2010data . Yet, the performance of their approach depends highly on the training subsets selected from the database. Finlayson introduced the corrected-moment approach by adding a correction step to the gray-based methods finlayson2013corrected . Although the approach reports better experimental results on the standard databases, it requires the correct illuminant colors to be provided as candidates. While these methods generally use information from the derivative structure of an image indiscriminately, two interesting works based on the gray-edge method van2007edge show that the derivative information extracted from different kinds of pixels influences the estimation results differently gijsenij2012improving ; joze2012role . In particular, Gijsenij et al. experimentally showed that the performance of the original gray-edge method van2007edge could be improved significantly when the estimation is based on specular edges gijsenij2012improving , detecting the specular edges with an iterative weighting mechanism. However, the optimal parameter settings of this approach vary considerably across databases. Vaezi Joze et al. extended the white-patch assumption land1977retinex to bright pixels and alternatively showed that constraining the original gray-edge method to bright pixels can improve its performance significantly joze2012role . Yet, their approach still suffers from inferior inter-database performance, similar to the methods gijsenij2012improving ; van2007edge .

Besides, many algorithms based on pixel-level information exist in the color constancy literature. The low-level statistics-based methods estimate the illuminant color from simple statistical assumptions. For example, the gray-world hypothesis assumes the average reflectance in a scene under a neutral light source to be achromatic buchsbaum1980spatial . The white-patch hypothesis assumes that the maximum response in the RGB channels is caused by a perfect reflectance land1977retinex . Shades of gray estimates the illuminant color within a more general Minkowski framework finlayson2004shades . Despite the simplicity of these low-level statistics-based methods, their performance is far from satisfactory even under optimal parameter settings. The gamut mapping algorithm represents the object colors that can be observed under a canonical illuminant with the canonical gamut; by mapping the gamut of the input image to the canonical gamut, the illuminant color can therefore be estimated forsyth1990novel . Yet, its performance tends to be error-prone when the number of different object colors is limited, and the canonical gamut depends on the training subsets selected from the database as well. Color by correlation finlayson2001color is essentially a discrete implementation of gamut mapping, but it can formulate many other algorithms within its framework. Alternatively, the Bayesian approaches gehler2008bayesian ; brainard1997bayesian ; rosenberg2003bayesian model the variability of surface reflectance and the illuminant color as random variables, and then estimate the illuminant color from the posterior distribution conditioned on the image intensity data. In addition, with a representation based on binarized chromaticity histograms, the illuminant color can be recovered with either neural networks cardei2002estimating or support vector regression funt2004estimating . However, the performance of these approaches on the standard databases is not outstanding. Cheng et al. attributed the effectiveness of derivative-based methods to color differences, based purely on experimental evidence, and proposed to estimate the illuminant color through a principal component analysis on bright and dark pixels cheng2014illuminant . The performance of their approach is generally good on the standard databases. Chakrabarti recently proposed to estimate the illuminant color by training a luminance-to-chromaticity classifier chakrabarti2015color .

There are also methods in the statistics-based group exploring more complex information to achieve color constancy (e.g., gijsenij2011color ; bianco2008improving ; van2007using ). Most recently, Vaezi Joze and Drew presented an exemplar-based approach, which estimates the illuminant color by using color and texture feature cues to perform a nearest neighbour search among the training data vaezi2014exemplar . They provided state-of-the-art results on the standard databases; yet, when this approach is applied under inter-database settings, its performance degrades dramatically. Cheng et al. proposed to estimate the illuminant color quickly using an ensemble of regression trees trained on four kinds of simple color feature cues cheng2015effective . More generally, Li et al. proposed an approach that estimates the illuminant color by training a sparse representation over multiple feature cues obtained from different color constancy approaches, achieving fairly good results li2015multi . Yet, the approach is rather complicated and time-consuming to execute.

Methods beyond the physics-based and statistics-based groups are limited in number. Gao et al. proposed to fulfill color constancy using double-opponency from a biological standpoint gao2015color . This method performs comparably to the complex ones under optimal parameter settings on different databases. Meanwhile, it shows better inter-database performance than the exemplar-based approach vaezi2014exemplar .

3 Approach

Since achromatic regions can be utilized readily to estimate the illuminant color, our analysis is mainly focused on color constancy from the specular components in highlight regions. We introduce the well-known dichromatic reflection model in Section 3.1 and validate the theoretical feasibility of using derivative colors extracted from highlight regions to estimate the illuminant color in Section 3.2. Despite its simplicity, the case of derivative colors from achromatic regions is also concisely analyzed at the end of Section 3.2. Finally, in Section 3.3, we design an algorithm that is quite robust for real images.

3.1 The Dichromatic Reflection Model

For a scene illuminated by a single illuminant, the spectral distribution of the illumination is usually considered uniform. According to the dichromatic reflection model shafer1985using , highlights of inhomogeneous dielectric objects are linear combinations of diffuse and specular components. Based on the neutral interface reflection (NIR) assumption lee1990modeling , the interface reflection spectrum is the same as that of the illuminant; we formulate the dichromatic model under this assumption. A pixel in an image of an inhomogeneous dielectric object taken by a digital color camera can be expressed as

(1)  $\mathbf{I}(x) = w_d(x)\int_{\Omega} E(\lambda)\, S(x,\lambda)\, \mathbf{q}(\lambda)\, d\lambda \;+\; w_s(x)\int_{\Omega} E(\lambda)\, \mathbf{q}(\lambda)\, d\lambda$

in which $E(\lambda)$ is the relative illumination spectrum, $S(x,\lambda)$ is the spectral reflectance of the material, and $\mathbf{q}(\lambda)$ is a vector whose elements represent the sensitivity functions of the camera's corresponding channels. In addition, $\Omega$ refers to the visible spectrum. $w_d(x)$ and $w_s(x)$ are respectively the magnitudes of body reflection and interface reflection. More concisely, Eq. (1) can be simplified as

(2)  $\mathbf{I}(x) = m_d(x)\,\mathbf{D}(x) + m_s(x)\,\mathbf{S}$

in which $m_d(x)$ and $m_s(x)$ denote the magnitudes of the diffuse reflection and specular reflection respectively, and $\mathbf{D}(x)$ and $\mathbf{S}$ correspond respectively to the intrinsic object color and the illuminant color. Without loss of generality, we constrain

(3)  $\sum_{c\in\{r,g,b\}} D_c(x) = \sum_{c\in\{r,g,b\}} S_c = 1$
Figure 1: (a) The synthetic image rendered using the Torrance-Sparrow reflection model torrance1967theory . (b) The diffuse component of the synthetic image. (c) The specular component of the synthetic image. (d) The ratio $\Delta m_s/\Delta m_d$ in different locations after differentiating the image with operator $f_1$. (e) The ratio $\Delta m_s/\Delta m_d$ in different locations after differentiating the image with operator $f_2$.

3.2 Derivative Color

To obtain the illuminant color based on the specular components of the dichromatic reflection model, we naively consider a pair of highlight pixels $x_1$, $x_2$ with the same intrinsic object color $\mathbf{D}$. By differencing them, we obtain

(4)  $\mathbf{I}(x_1) - \mathbf{I}(x_2) = \bigl(m_d(x_1) - m_d(x_2)\bigr)\,\mathbf{D} + \bigl(m_s(x_1) - m_s(x_2)\bigr)\,\mathbf{S}$

If the ratio $\bigl|m_s(x_1) - m_s(x_2)\bigr| \,\big/\, \bigl|m_d(x_1) - m_d(x_2)\bigr|$ is large enough to make

(5)  $\mathbf{I}(x_1) - \mathbf{I}(x_2) \approx \bigl(m_s(x_1) - m_s(x_2)\bigr)\,\mathbf{S}$

the illuminant color will be successfully extracted. We explore such a possibility from the physical standpoint.
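As a quick sanity check of this algebra, the following sketch differences two highlight pixels of the same intrinsic object color; all numbers (D, S and the reflection magnitudes) are hypothetical values chosen for illustration, and NumPy is assumed.

```python
# Toy check of Eqs. (2), (4) and (5); all values are hypothetical.
import numpy as np

S = np.array([0.40, 0.35, 0.25])   # illuminant color, components sum to 1
D = np.array([0.60, 0.25, 0.15])   # intrinsic object color, components sum to 1

def pixel(m_d, m_s):
    """Dichromatic pixel of Eq. (2): I = m_d * D + m_s * S."""
    return m_d * D + m_s * S

# Two nearby highlight pixels: the diffuse magnitudes are almost equal,
# while the specular magnitudes differ strongly.
I1, I2 = pixel(0.80, 0.70), pixel(0.79, 0.10)

diff = I1 - I2                                 # Eq. (4)
ratio = abs(0.70 - 0.10) / abs(0.80 - 0.79)    # = 60, the ratio in Eq. (5)
print(ratio, diff / diff.sum(), S)             # normalized diff is close to S
```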

In fact, according to Lambert's law lambert1760law , the magnitude of diffuse reflection depends on the diffuse albedo $\rho_d$, the intensity of the incident illuminant $L$, together with the angle $\theta_i$ between the illumination direction and the surface normal. It can be expressed simply as

(6)  $m_d = \rho_d\, L \cos\theta_i$

Usually, variations in $L$ and $\theta_i$ are small for local points in the process of image formation. As a result, for a local region of uniform intrinsic object color, $m_d$ will also vary slightly, since the points share the same diffuse albedo $\rho_d$. In contrast, the magnitude of specular reflection is quite sensitive to the geometric configuration between the illuminant, the object, and the camera, which makes it vary more widely within local regions. Based on the specularity model that Torrance and Sparrow presented in torrance1967theory , $m_s$ has the following characteristic form:

(7)  $m_s \propto \dfrac{F\,G}{\cos\theta_v}\, \exp\!\left(-\dfrac{\alpha^2}{2\sigma^2}\right)$

in which $F$ is the Fresnel reflectance derived from the Fresnel equation, $G$ is the geometrical attenuation factor, and $\sigma$ is the surface roughness. In addition, $\theta_v$ is the angle between the surface normal and the viewing direction, while $\alpha$ is the angle between the surface normal and the bisector of the viewing and illumination directions. This dependence of $m_s$ on the geometric configuration has also been exploited in IIC tan2004color .
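The following sketch illustrates this contrast numerically. All surface and illumination parameters in it are assumed for illustration: over a small neighborhood of angles, the Lambertian magnitude of Eq. (6) changes by a few percent, while the Torrance-Sparrow magnitude of Eq. (7) changes by an order of magnitude or more.

```python
# Comparing local variation of m_d (Eq. (6)) and m_s (Eq. (7));
# all surface and illumination parameters below are assumed.
import numpy as np

theta_i = np.linspace(0.30, 0.35, 50)   # incidence angle, varies slightly locally
alpha   = np.linspace(0.00, 0.05, 50)   # angle to the bisector direction
theta_v = 0.40                          # viewing angle, roughly constant locally

rho_d, L = 0.8, 1.0                     # diffuse albedo, illuminant intensity
F, G, sigma = 0.04, 1.0, 0.02           # Fresnel term, attenuation, roughness

m_d = rho_d * L * np.cos(theta_i)                                   # Eq. (6)
m_s = F * G / np.cos(theta_v) * np.exp(-alpha**2 / (2 * sigma**2))  # Eq. (7)

print(np.ptp(m_d) / m_d.mean())   # relative spread of m_d: a few percent
print(np.ptp(m_s) / m_s.mean())   # relative spread of m_s: far larger
```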

Since $m_s$ varies more sensitively than $m_d$, one direct way to obtain an expression similar to Eq. (4) with a large ratio is to work on the derivative structure of the image. Let f be an arbitrary differential operator, and I an image satisfying the dichromatic reflection model. The derivative structure of the image is then calculated as $\mathbf{J} = f \ast \mathbf{I}$. If J(x) is obtained by convolving I(x) with f in a region of uniform intrinsic object color D, it can be simply expressed as

(8)  $\mathbf{J}(x) = \Delta m_d(x)\,\mathbf{D} + \Delta m_s(x)\,\mathbf{S}$

with $\Delta m_d = f \ast m_d$ and $\Delta m_s = f \ast m_s$. However, the differential operator f should be carefully determined such that the ratio $\Delta m_s(x)/\Delta m_d(x)$ obtained for J(x) in Eq. (8) is as large as possible. For example, we show a typical synthetic image I with uniform intrinsic object color D in Fig. 1(a); Fig. 1(b) and Fig. 1(c) are respectively its diffuse and specular components. By differentiating the diffuse and specular components of image I with the operator $f_1$ of

(9)

we readily obtain the ratio $\Delta m_s(x)/\Delta m_d(x)$ for each spatial point x; the values are altogether less than 10. Fig. 1(d) plots the result. In contrast, using the operator $f_2$ of

(10)

the ratios we achieve can be greater than 500. Fig. 1(e) plots the result. Evidently, the operator $f_2$ produces larger ratios than $f_1$. In addition, for both $f_1$ and $f_2$, locations containing stronger specular components generally come with larger ratios. Note that in this experiment, we perform the differentiations along the horizontal and vertical directions for simplicity.
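A sketch of this kind of experiment is given below. Since the exact operators of Eqs. (9) and (10) are not reproduced here, hypothetical first-order and second-order differences stand in for $f_1$ and $f_2$, and the diffuse and specular magnitude maps are synthetic as well; the qualitative outcome (the second-order operator suppressing the smooth diffuse variation far more strongly) is what the text describes.

```python
# Ratio experiment around Eqs. (8)-(10): filter synthetic diffuse and specular
# magnitude maps and compare |delta m_s| / |delta m_d| near the highlight.
# The operators f1, f2 are stand-ins, not the paper's exact choices.
import numpy as np
from scipy.ndimage import convolve

y, x = np.mgrid[0:64, 0:64].astype(float)
m_d = 0.5 + 0.2 * np.sin(x / 40.0)                        # smooth diffuse shading
m_s = 0.9 * np.exp(-((x - 32)**2 + (y - 32)**2) / 30.0)   # peaked specular lobe

f1 = np.array([[-1.0, 1.0]])        # assumed first-order horizontal difference
f2 = np.array([[1.0, -2.0, 1.0]])   # assumed second-order horizontal difference

for f in (f1, f2):
    dm_d, dm_s = convolve(m_d, f), convolve(m_s, f)
    sel = np.abs(dm_s) > 1e-2                   # locations near the specular lobe
    ratio = np.abs(dm_s[sel]) / (np.abs(dm_d[sel]) + 1e-12)
    print(np.median(ratio))   # the second-order operator gives far larger ratios
```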

Figure 2: (a) The synthetic image with distinctive intrinsic object colors. (b) The synthetic image with similar intrinsic object colors. (c) The ratio $\Delta m_s/\Delta m_d$ in different locations after differentiating the image in (a) with operator $f_2$. (d) The ratio $\Delta m_s/\Delta m_d$ in different locations after differentiating the image in (b) with operator $f_2$.

Beyond uniform regions, we also explore the ratio for J(x) obtained by convolving I(x) with f in a region of non-uniform intrinsic object colors. In fact, each J(x), whether obtained in a uniform region or a non-uniform one, can be represented as

(11)  $\mathbf{J}(x) = \Delta m_d(x)\,\mathbf{D} + \Delta m_s(x)\,\mathbf{S} + \epsilon(x)\,\mathbf{v}(x)$

where $\epsilon(x) \ge 0$, and $\mathbf{v}(x)$ is an arbitrary vector with $\sum_{c} v_c(x) = 1$. The expression shown in Eq. (8) is just a special case of Eq. (11), with $\epsilon(x) = 0$. Usually, for non-uniform regions, $\epsilon(x)$ is quite large, which consequently makes the ratio of the illuminant component to the remaining components small. We show two such examples in Fig. 2. Fig. 2(a) is a synthetic image with four distinctive intrinsic object colors, and Fig. 2(b) is a synthetic image with four similar intrinsic object colors. Figs. 2(c) and 2(d) show their respective ratios in different locations after differentiation by operator $f_2$. It can be seen that in a non-uniform region, even when the intrinsic object colors are close to each other, the achieved ratio is still quite small. To this end, we introduce the definition of the derivative color as follows: each J(x) in the derivative structure J of an image I is denoted as a derivative color if all of its elements are positive. Let

$j_c(x) = \dfrac{J_c(x)}{J_r(x) + J_g(x) + J_b(x)}, \quad c \in \{r, g, b\}$

We refer to $j_r(x)$, $j_g(x)$ and $j_b(x)$ as the red, green and blue chromaticities of the derivative color J(x) respectively. Since uniform regions generally produce larger ratios for J(x) than non-uniform regions, we focus on estimating the illuminant color using derivative colors obtained from uniform regions.

Undoubtedly, the larger the ratio is, the better the derivative color approximates $\mathbf{S}$. Taking each derivative color from the image shown in Fig. 1(a) as an estimate of the illuminant color, we experimentally track the variation of the estimation accuracy with the ratio. The angular error is adopted to measure the estimation accuracy, which has recently been referred to as the recovery error in finlayson2014reproduction . In particular, it is defined as hordley2006reevaluation

(12)  $\varepsilon = \cos^{-1}\!\left(\dfrac{\mathbf{S} \cdot \hat{\mathbf{S}}}{\|\mathbf{S}\|\,\|\hat{\mathbf{S}}\|}\right)$

where S is the ground-truth illuminant color and $\hat{\mathbf{S}}$ is the estimated one; specifically, here we have $\hat{\mathbf{S}} = \mathbf{J}(x)$. Since a larger ratio means a better estimate, it contributes to a smaller angular error. Fig. 3(a) shows the experimental result, which is well consistent with our speculation. Besides, it can be seen that when the ratio is larger than around 30, the angular error is close to 0 and varies quite slightly, which is caused by the small variations in the chromaticities of the derivative color J(x). We show the variations in the red, green and blue chromaticities of J(x) with the ratio in Fig. 3(b); they all stay nearly constant when the ratio is larger than 30.
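A direct implementation of Eq. (12) is straightforward; the sketch below assumes RGB illuminant vectors and reports the angle in degrees.

```python
# Angular (recovery) error of Eq. (12).
import numpy as np

def angular_error(S_true, S_est):
    """Angle in degrees between ground-truth and estimated illuminant colors."""
    cos = np.dot(S_true, S_est) / (np.linalg.norm(S_true) * np.linalg.norm(S_est))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding

print(angular_error(np.array([0.40, 0.35, 0.25]),
                    np.array([0.41, 0.34, 0.25])))  # a small error, below 2 degrees
```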

Figure 3: (a) The variation of the angular error with the ratio $\Delta m_s/\Delta m_d$. (b) The variations of the chromaticities of the estimated illuminant color J(x) with the ratio. We plot the red, green and blue chromaticities correspondingly with red, green and blue dots.

We have validated the theoretical feasibility of estimating the illuminant color with derivative colors extracted from uniform highlight regions. As for pixels in achromatic regions, whose intrinsic object color coincides with the illuminant color under the NIR assumption, they can be generally expressed as

(13)  $\mathbf{I}(x) = \bigl(m_d(x) + m_s(x)\bigr)\,\mathbf{S}$

so the ratio of each derivative color can actually be considered infinite. That is, derivative colors from achromatic regions are identical to the illuminant color. Specifically, such successful extraction in achromatic regions is due to either diffuse shadow or shading (variations in $m_d$), or highlight (variations in $m_s$). Therefore, using derivative colors, we can achieve color constancy from both uniform highlight regions and achromatic regions.

3.3 Illuminant Estimation

Figure 4: Examples of distributions of the selected derivative colors. The first row shows the original images. The second row shows their distributions in the rg chromaticity space correspondingly. We plot the points with color-coded dots such that the values of these colors explicitly represent the frequencies of the points. In addition, the ground truth is plotted with a red cross. It can be seen that the majority of the data are densely distributed around the ground truth. The third row shows the gray-scale versions of the original images, on which the pixels whose derivative colors have frequencies larger than 50 are labeled in red. It can be seen that they generally lie in uniform regions.

In reality, it is readily achievable to extract derivative colors identical to the illuminant color from achromatic regions. Yet, to achieve color constancy using the derivative colors from uniform highlight regions of real images, two issues should be addressed as effectively as possible. The first is what kind of differential operators can be used to extract derivative colors with large enough ratios $\Delta m_s/\Delta m_d$. Besides, since it is usually impossible to obtain the diffuse and specular components of an image separately, the ratio is unavailable at each location, so we cannot determine which derivative color is a better estimate of the illuminant color. Thus, the second issue is what kind of pixels generally produce larger ratios and can therefore be used to estimate the illuminant color better.

As a matter of fact, due to the universal presence of image noise, quantization error, textured surfaces, etc., the differential operator should be determined with care. However, based on the recent success of Gaussian derivative filters in a number of works (e.g., van2007edge ; gijsenij2010generalized ; chakrabarti2012color ; finlayson2013corrected ), we found that a combination of the second-order Gaussian derivative operators $G_{xx}$, $G_{yy}$ and $G_{xy}$ works well in extracting the derivative colors from highlight regions as well as achromatic regions. Particularly, the differential operators take the following expressions,

(14)  $G_{xx}(x, y) \propto \left(\dfrac{x^2}{\sigma^2} - 1\right) e^{-\frac{x^2 + y^2}{2\sigma^2}}, \quad G_{yy}(x, y) \propto \left(\dfrac{y^2}{\sigma^2} - 1\right) e^{-\frac{x^2 + y^2}{2\sigma^2}}, \quad G_{xy}(x, y) \propto \dfrac{xy}{\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$

where the constants in these expressions are omitted for convenience, since they have no influence on the illuminant estimation. The parameter $\sigma$ controls the scale of the convolution, and we expect $\sigma$ to be small so that more derivative colors are extracted from uniform regions. The derivative structures of an image obtained from these differential operators are represented as $\mathbf{J}_{xx}$, $\mathbf{J}_{yy}$ and $\mathbf{J}_{xy}$ respectively. We denote the sets of locations of the derivative colors in these derivative structures as $\Lambda_{xx}$, $\Lambda_{yy}$ and $\Lambda_{xy}$ correspondingly.
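Up to the omitted constants, these operators can be sampled directly from the second derivatives of a Gaussian. The sketch below is one such construction; the truncation radius of roughly three standard deviations is an assumption, not a prescription from the text.

```python
# Sampled second-order Gaussian derivative kernels for Eq. (14),
# constants dropped as in the text.
import numpy as np
from scipy.ndimage import convolve

def second_order_gaussians(sigma):
    r = int(3 * sigma) + 1                      # assumed truncation radius
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return ((x**2 / sigma**2 - 1) * g,          # G_xx
            (y**2 / sigma**2 - 1) * g,          # G_yy
            (x * y / sigma**2) * g)             # G_xy

# Filtering each channel of a linear RGB image gives J_xx, J_yy and J_xy.
img = np.random.rand(64, 64, 3)                 # stand-in for a real image
G_xx, G_yy, G_xy = second_order_gaussians(sigma=1.0)
J_xx = np.stack([convolve(img[..., c], G_xx) for c in range(3)], axis=-1)
```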

Sufficiently, for highlight pixels, if $\Delta m_s$ is large and $\Delta m_d$ is small, the ratio will be large. In order to obtain a large $\Delta m_s$, highlight regions with strong specular components are good choices. However, to the best of our knowledge, highlight detection is still a challenging and open problem in computer vision at the current stage, even though some solutions have been suggested (e.g., delpozo2007detecting ; angelopoulou2007specular ; yilmaz2014detection ). In addition, without guidance of the illuminant color, it is also impossible to identify achromatic regions. Due to this fact, we detect the expected regions in a compromised way, similar to that adopted in tan2004color ; joze2012role ; drew2014zeta . Firstly, sort the intensities of all pixels after the saturated pixels are clipped. Secondly, label the top $k\%$ of pixels with higher intensity to build a binary mask for the original image. Finally, erode the binary mask until no more than $n\%$ of the pixels are left. The reasons facilitating our detection mechanism are two-fold. On the one hand, highlight regions with strong specular components are usually much brighter; in addition, white patches, which are achromatic, generally have higher intensity as well. Considering the difficulty of strictly detecting achromatic and highlight regions, using bright regions as a replacement is a reasonable trade-off in reality. On the other hand, since the small regions in the binary mask have a high probability of being noisy and unreliable, we add the erosion operation to remove them, which indeed stabilizes the estimation accuracy, as shown in the experimental part later. The set of locations of all pixels in the finally selected regions is denoted as $\Lambda_s$.
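A minimal sketch of this selection step is shown below. The saturation threshold, the top-$k\%$ mask and the erode-until-$n\%$ stopping rule follow our reading of the procedure, and the concrete values are placeholders.

```python
# Bright-region selection: clip saturated pixels, keep the top k% intensities,
# then erode until at most n% of the image pixels remain.
import numpy as np
from scipy.ndimage import binary_erosion

def select_bright_regions(img, k=5.0, n=2.0, sat=0.98):
    intensity = img.sum(axis=-1)
    intensity[img.max(axis=-1) >= sat] = 0.0        # clip saturated pixels
    mask = intensity >= np.percentile(intensity, 100.0 - k)
    while mask.sum() > n / 100.0 * mask.size:       # erode away small segments
        mask = binary_erosion(mask)
    return mask                                     # locations Lambda_s

mask = select_bright_regions(np.random.rand(64, 64, 3))
```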

We then represent the selected derivative colors as

(15)  $\mathcal{D} = \bigl\{\, \mathbf{J}_{e}(x) \;\big|\; x \in \Lambda_{e} \cap \Lambda_s,\; e \in \{xx,\, yy,\, xy\} \,\bigr\}$

in which the locations of each derivative structure are restricted to the finally selected regions $\Lambda_s$.

For the selected derivative colors in achromatic regions of $\Lambda_s$, the perfect illuminant color is obtained. For those in highlight regions of $\Lambda_s$, a large $\Delta m_s$ is obtained. As for $\epsilon(x)$, it is impossible to ensure that it is small for all pixels in $\Lambda_s$: it is usually small in uniform regions and large in non-uniform ones, while real images generally suffer from the influence of varying intrinsic object colors. Therefore, it is impossible to obtain a large ratio for all pixels in $\Lambda_s$. However, derivative colors from the uniform highlight regions generally come with large ratios, which makes them close to the illuminant color. Projecting the selected derivative colors in $\mathcal{D}$ into the rg chromaticity space, we observe that, despite some outliers, the majority of the data distribute densely around the ground truth. Fig. 4 shows several examples: the first row shows the original images, and the second row shows their corresponding distributions in the rg chromaticity space, where the majority of the data are densely distributed around the ground truth. The third row shows the pixels whose derivative colors have frequencies larger than 50; they generally lie in uniform regions.

With some abuse of notation, the data set projected into the rg chromaticity space is represented as $\mathcal{Z} = \{z_i\}_{i=1}^{N}$, with $z_i = (r_i, g_i)$, where $r_i$ and $g_i$ are respectively the red and green chromaticities of the derivative color $\mathbf{J}(x_i)$, e.g., $r_i = J_r(x_i)/(J_r(x_i) + J_g(x_i) + J_b(x_i))$. We then propose to take the point with the maximum density in the rg chromaticity space as an estimate of the illuminant chromaticity. The density of each point z is computed using the following function:

(16)  $p(z) = \dfrac{1}{N} \sum_{i=1}^{N} K(z - z_i)$

where $K(\cdot)$ is the Parzen kernel estimator (which is non-negative and integrates to 1). Its typical form is a Gaussian duda1999pattern , $K(u) = \frac{1}{2\pi h^2} \exp\!\left(-\frac{\|u\|^2}{2h^2}\right)$, with $h$ controlling the smoothness of the kernel function on the data. Then, the estimated illuminant chromaticity is taken as

(17)  $\hat{z} = (\hat{r}, \hat{g}) = \arg\max_{z}\; p(z)$

An estimate of the illuminant color can be readily obtained from $\hat{z}$, which equals

(18)  $\hat{\mathbf{S}} = \bigl(\hat{r},\; \hat{g},\; 1 - \hat{r} - \hat{g}\bigr)$
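The estimation of Eqs. (16)-(18) reduces to a kernel density evaluation over the projected chromaticities. The sketch below evaluates the density only at the data points themselves and uses an assumed bandwidth; normalizing constants are dropped since only the argmax matters.

```python
# Kernel density estimate over rg chromaticities (Eqs. (16)-(18)).
import numpy as np

def estimate_illuminant(J, h=0.01):
    """J: (N, 3) array of selected derivative colors; h: assumed bandwidth."""
    rg = J[:, :2] / J.sum(axis=1, keepdims=True)          # rg chromaticities
    d2 = ((rg[:, None, :] - rg[None, :, :])**2).sum(-1)   # pairwise sq. distances
    density = np.exp(-d2 / (2 * h**2)).sum(axis=1)        # Eq. (16), up to scale
    r, g = rg[np.argmax(density)]                         # Eq. (17)
    return np.array([r, g, 1.0 - r - g])                  # Eq. (18)

# Toy usage: derivative colors scattered around an illuminant (0.40, 0.35, 0.25).
J = np.array([0.40, 0.35, 0.25]) * np.random.uniform(0.5, 2.0, (200, 1))
print(estimate_illuminant(J + 0.005 * np.random.randn(200, 3)))
```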

Since the proposed approach is based on a key exploitation of the derivative colors, we abbreviate our algorithm as the DCs algorithm and summarize it in Algorithm 1.

Objective: Estimate the illuminant color $\hat{\mathbf{S}}$ of a color image I;

Steps:

1:  Clip the saturated pixels and select the top $k\%$ brightest pixels to build a binary mask;
2:  Erode the binary mask to remove small segments until no more than $n\%$ of the pixels are left;
3:  Extract the derivative colors at the selected locations $\Lambda_s$, using the differential operators $G_{xx}$, $G_{yy}$ and $G_{xy}$;
4:  Compute the density of each data point in the rg chromaticity space with kernel density estimation;
5:  Select the data point with the maximum density, $\hat{z}$;
6:  Compute the illuminant color $\hat{\mathbf{S}}$ based on $\hat{z}$.
Algorithm 1: The DCs algorithm.

4 Experimental Evaluation

We evaluate the proposed approach on three standard databases: the SFU laboratory database barnard2002comparison , the Gehler-Shi database shi2010data , and the SFU HDR database funt2010rehabilitation . The error between the ground truth S and the estimated illuminant color $\hat{\mathbf{S}}$ is computed according to Eq. (12). In addition, besides the median and mean angular errors, we selectively report the trimean, the Best 25 percent and the Worst 25 percent on each database, in order to better compare the performance of different approaches. Among these measures, the median indicates the performance of a method on the majority of the images, while the trimean also provides an indication of the extreme values of the distribution. The Best 25 percent and Worst 25 percent errors are robust measures that refer to the means of the 25 percent lowest and highest error values respectively. Further, to summarize the performance of different algorithms with more insight, a sign test hordley2006scene is conducted between each pair of them over every database. This sign test determines whether one algorithm tends to have lower errors than another by using significance testing to reject the null hypothesis that the medians of the error distributions of the two algorithms are the same. Commonly, the confidence level for accepting the null hypothesis is chosen as 95% (e.g., gijsenij2011computational ; chakrabarti2012color ; gao2015color ).
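For concreteness, a sketch of such a pairwise sign test is given below; the per-image error arrays are hypothetical, and discarding ties before the binomial test is a common convention assumed here.

```python
# Pairwise sign test between two methods' per-image angular errors.
import numpy as np
from scipy.stats import binomtest

def sign_test(err_a, err_b, alpha=0.05):
    diff = np.asarray(err_a) - np.asarray(err_b)
    diff = diff[diff != 0]                        # ties carry no sign information
    wins = int((diff < 0).sum())                  # images where method A is better
    if binomtest(wins, n=diff.size, p=0.5).pvalue >= alpha:
        return 0                                  # no significant difference
    return 1 if wins > diff.size - wins else -1   # A significantly better / worse

print(sign_test(np.random.rand(100), np.random.rand(100) + 0.5))  # likely 1
```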

According to our categorization in the introduction, we consider the existing algorithms in our comparison mainly from two categories: (1) physics-based methods: natural illuminant colors as constraints (NICs) finlayson2001convex , inverse-intensity chromaticity space (IIC) tan2004color , and zeta-image drew2014zeta ; (2) statistics-based methods: gray-world (GW) buchsbaum1980spatial , white-patch (WP) land1977retinex , shades of gray (SoG) finlayson2004shades , general gray-world (GG), gamut mapping (GM(pixel)) forsyth1990novel , Bayesian gehler2008bayesian , neural network (NN) cardei2002estimating , support vector regression (SVR) funt2004estimating , gray-edge (GE1 and GE2) van2007edge , natural image statistics (NIS) gijsenij2011color , generalized gamut mapping (GM(jet)) gijsenij2010generalized , weighted gray-edge (WGE) gijsenij2012improving , spatio-spectral statistics (SS) chakrabarti2012color , PCA on dark and bright pixels (DBPCA) cheng2014illuminant , corrected-moment (CM) finlayson2013corrected , and multi-cue (MC) li2015multi . The biologically-based double-opponency method (DOCC(sum and max)) gao2015color is also included. The error distributions of most approaches are directly available from the websites cc_website ; li_website_all . The results of IIC, WGE, DBPCA and GM(jet) are obtained by executing the code published by the authors when direct results are unavailable on a database. The results of MC are from the website li_website_mc , and those of DOCC are from the website docc_website . Please note that our results of DOCC on the latest Gehler-Shi database were obtained from the authors via private communication.

Method       Median  Mean   Trimean  Best-25%  Worst-25%
DN           15.60   17.27  16.56    3.63      32.49
GW           –       –      –        0.91      23.45
WP           –       –      –        1.86      20.97
SoG          –       –      –        0.60      16.49
SS           3.45    5.63   4.33     1.24      12.90
GG           –       –      –        0.50      13.75
GE1          3.18    5.58   3.75     1.07      14.05
GE2          2.74    5.19   3.26     1.11      13.51
DBPCA        2.83    6.41   3.69     0.47      18.34
DOCC(sum)    4.93    6.25   5.50     1.76      12.48
DOCC(max)    2.40    5.46   3.33     0.41      15.34
GM(jet)      2.28    3.92   2.70     0.52      9.91
GM(pixel)    2.27    3.70   2.53     0.46      9.32
SVR          2.17    –      –        –         –
CM(3 edge)   3.60    4.10   –        –         –
CM(9 edge)   2.00    2.60   –        –         –
IIC          8.23    –      –        2.24      40.23
NICs         2.68    –      –        –         –
WGE          2.44    5.59   2.90     0.70      16.00
Zeta-image   1.90    4.30   –        –         –
DCs          1.71    4.21   2.45     0.41      12.12
Table 1: Performance on the SFU laboratory database (– denotes values not available).

The methods compared are: (1) DN, (2) IIC, (3) GW, (4) WP, (5) DOCC(sum), (6) SS, (7) SoG, (8) GE1, (9) GE2, (10) GG, (11) NUS, (12) WGE, (13) DOCC(max), (14) GM(jet), (15) GM(pixel), (16) DCs.

      (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15) (16)  Score
(1)    0  -1  -1  -1  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1      0
(2)    1   0  -1  -1  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1      1
(3)    1   1   0   0  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1      2
(4)    1   1   0   0  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1      2
(5)    1   1   1   1   0  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1      4
(6)    1   1   1   1   1   0   0   0  -1  -1   -1   -1   -1   -1   -1   -1      5
(7)    1   1   1   1   1   0   0   0   0   0    0   -1   -1   -1   -1   -1      5
(8)    1   1   1   1   1   0   0   0   0   0    0   -1   -1   -1   -1   -1      5
(9)    1   1   1   1   1   1   0   0   0   0    0   -1   -1   -1   -1   -1      6
(10)   1   1   1   1   1   1   0   0   0   0    0    0    0   -1   -1   -1      6
(11)   1   1   1   1   1   1   0   0   0   0    0    0    0   -1   -1   -1      6
(12)   1   1   1   1   1   1   1   1   1   0    0    0    0    0   -1   -1      9
(13)   1   1   1   1   1   1   1   1   1   0    0    0    0    0    0    0      9
(14)   1   1   1   1   1   1   1   1   1   1    1    0    0    0    0    0     11
(15)   1   1   1   1   1   1   1   1   1   1    1    1    0    0    0    0     12
(16)   1   1   1   1   1   1   1   1   1   1    1    1    0    0    0    0     12
Table 2: WST test on the SFU laboratory database.

Throughout the experimental evaluations, we extract the derivative colors with fixed scales $\sigma$ of the Gaussian operators, and the kernel bandwidth $h$ is kept fixed as well. In addition, to simplify the algorithm, we detect the original regions with the top 5% brightest pixels for all databases, i.e., setting $k$ to be 5. To this end, the only undetermined parameter in the proposed approach is $n$, which sets the final selection threshold. We demonstrate the stable performance of DCs under different settings of $n$ on each database with experiments. The average runtime per image is reported on each database to show the efficiency of the proposed approach. We will release our code with the paper.

4.1 SFU Laboratory Database

Figure 5: The relationship between the performance of the proposed approach on the different databases and the parameter $n$. It can be seen that the proposed approach performs quite stably while $n$ ranges from 1 to 4.

The SFU laboratory database contains a total of 321 images of 31 different scenes, recorded under 11 different illuminants in laboratory settings. Among these images, 223 have minimal specularities, while 98 exhibit non-negligible dielectric specularities, which makes this database quite suitable for evaluating the proposed approach.

We first track experimentally how the proposed approach performs under variations of the parameter $n$. Fig. 5 shows the results. It can be noticed that the performance of the proposed approach stays quite stable while $n$ ranges from 1 to 4. The results reported in Table 1 and Table 2 are obtained with a fixed $n$ in this range. To demonstrate that the erosion operation is necessary, we conduct a similar experiment by directly selecting $\Lambda_s$ as the locations of the top 5% brightest pixels. Compared with the results shown in Fig. 5, the median and mean angular errors degrade, the latter to 5.07. Thus, removing the small regions in the binary mask with an erosion operation improves the estimation accuracy.

Statistical metrics of both the proposed approach and the other ones (under their optimal parameter settings) are reported in Table 1. Table 2 shows the results of the sign test, in which we omit the analysis of SVR, NICs, CM and Zeta-image because their error distributions are unavailable. Particularly, a sign (0) at location (i, j) indicates that there is no significant difference between the median errors of method i and method j at the 95% confidence level. A sign (1) indicates that the median of method i is significantly lower than that of method j, and a sign (-1) indicates the opposite situation. In addition, the score in the last column counts the number of times a method performs significantly better than the others.

To the best of our knowledge, the proposed approach provides the lowest median angular error (1.71) on the SFU laboratory database. From Table 1, it can be seen that its trimean and Best 25 percent errors are also lower than those of the state-of-the-art methods. Table 2 shows that DCs significantly outperforms most methods; in addition, it shows no significant difference from DOCC(max), GM(jet) and GM(pixel). Besides its satisfactory performance, the proposed approach has the important advantages of no training phase and simple implementation. Moreover, with a simple Matlab implementation on a laptop with an Intel Core i5-4200U CPU, our average runtime on the SFU laboratory database is 0.82 seconds per image. This runtime is shorter than those of most methods with code provided. However, since the performance of an approach matters more than its runtime, we report the time here only for reference.

4.2 Gehler-Shi Database

We then apply the proposed approach to a database free of laboratory controls: the Gehler-Shi database shi2010data , which is the reprocessed version of the Color Checker database created by the authors of gehler2008bayesian . This database includes a total of 568 images, among which 246 are labeled as indoor scenes and 322 as outdoor scenes. More importantly, the images of this database are all 12-bit linear images, generated directly from their RAW formats and therefore free of any color correction. Each image contains a color checker with known coordinates in the image space, which is masked during the illuminant estimation. The black-level offset for the Canon 1D camera is zero, while that for the Canon 5D is 128; therefore, before estimating the illuminant color from images taken by the Canon 5D, the black offset of 128 should be subtracted. Similarly, it can be noticed from Fig. 5 that the performance of the proposed approach stays quite stable while $n$ ranges from 1 to 4. The results reported in Table 3 and Table 4 are obtained with a fixed $n$ in this range.

Table 3 reports the performance of various methods on the indoor subset, the outdoor subset and the entire Gehler-Shi database. The optimal results of DBPCA are reported by executing the code provided by Cheng et al., and the results of WGE are obtained with tuned parameter settings. In addition, to obtain the results of GM(jet), we adopt the training and test configurations recommended in li2015multi ; li2014evaluating , which are also used to generate the results of the other algorithms that demand a training session. The optimal parameter settings are likewise used for DOCC(max) and DOCC(sum). We notice that DCs provides the lowest median and mean errors compared to the state-of-the-art methods. Further, its median on the indoor subset is the lowest as well, slightly better than that of WGE. In addition, it can be seen that, like most approaches, DCs yields better results on the outdoor subset than on the indoor subset. The dominant reason for this outcome is interreflection in the indoor scenes, which violates the neutral illuminant assumption and the NIR assumption in the dichromatic reflection model; Chakrabarti et al. suggested a similar explanation for this phenomenon chakrabarti2012color . Table 4 shows the results of the sign test. It can be seen that DCs significantly outperforms all the other methods, with the only exception of MC. However, compared to the complexity of MC, DCs is much simpler and more efficient. Our average runtime for processing an image is 5.58 seconds on the Gehler-Shi database.

Method All Images (568) Indoor (246) Outdoor (322)
Median Mean Worst-25% Median Mean Worst-25% Median Mean Worst-25%
DN 4.80 9.26 24.03 17.68 17.23 28.30 2.75 3.17 6.00
WP 9.15 10.26 20.51 11.33 11.64 22.05 6.44 9.21 19.18
Bayesian 5.14 6.74 15.03 6.54 7.92 16.27 4.23 5.84 13.52
SoG 4.48 6.42 15.01 6.88 7.74 16.26 3.07 5.41 13.58
GM(pixel) 3.98 6.00 14.31 6.12 7.46 15.84 3.00 4.90 12.37
GG 3.90 6.35 15.83 5.94 8.08 17.51 2.69 5.02 13.34
NN 3.77 5.16 11.53 5.72 7.41 15.25 2.81 3.45 7.11
GM(jet) 3.68 5.52 13.38 5.09 6.76 14.69 2.72 4.57 11.68
GW 3.63 4.77 10.51 3.25 4.10 8.96 3.97 5.28 11.47
GE1 3.28 4.19 8.75 2.97 3.73 8.09 3.58 4.54 9.15
GE2 3.35 4.23 8.61 3.07 3.78 7.92 3.69 4.58 9.02
SS 3.24 3.99 7.62 3.49 4.23 8.12 2.97 3.81 7.20
SVR 3.23 4.14 9.05 4.46 5.66 11.91 2.47 2.97 5.99
NIS 3.12 4.32 9.88 3.71 4.93 10.90 2.76 3.86 8.79
HVI 3.06 4.36 9.89 3.27 4.14 8.85 2.96 4.54 10.63
DOCC(sum) 2.70 3.88 8.93 2.84 3.78 8.28 2.65 3.96 9.36
DOCC(max) 2.69 4.64 11.85 4.22 5.99 14.23 2.02 3.61 9.23
NUS 2.46 4.07 10.06 3.06 4.50 10.32 1.99 3.74 9.73
MC 2.13 3.25 8.09 3.37 4.42 9.97 1.39 2.35 6.02
IIC 4.53 8.07 20.88 7.33 11.91 28.68 3.49 5.13 11.85
WGE 2.51 3.50 8.00 2.10 3.17 7.73 2.91 3.76 8.01
DCs 1.86 3.14 7.93 2.06 3.53 8.99 1.68 2.84 7.04
Table 3: Performance on the Gehler-Shi database.

The methods compared are: (1) WP, (2) DN, (3) IIC, (4) Bayesian, (5) SoG, (6) GM(pixel), (7) GG, (8) NN, (9) GM(jet), (10) GW, (11) GE1, (12) GE2, (13) SS, (14) HVI, (15) SVR, (16) NIS, (17) DOCC(max), (18) DOCC(sum), (19) WGE, (20) NUS, (21) MC, (22) DCs.

      (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15) (16) (17) (18) (19) (20) (21) (22)  Score
(1)    0  -1  -1  -1  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      0
(2)    1   0   0  -1  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      1
(3)    1   0   0   0  -1  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      1
(4)    1   1   0   0   0  -1  -1  -1  -1  -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      2
(5)    1   1   1   0   0   0   0   0   0  -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      3
(6)    1   1   1   1   0   0   0   0   0  -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      4
(7)    1   1   1   1   0   0   0   0   0   0   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      4
(8)    1   1   1   1   0   0   0   0   0   0   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      4
(9)    1   1   1   1   0   0   0   0   0   0    0    0    0   -1   -1   -1   -1   -1   -1   -1   -1   -1      4
(10)   1   1   1   1   1   1   0   0   0   0    0    0    0    0   -1   -1   -1   -1   -1   -1   -1   -1      6
(11)   1   1   1   1   1   1   1   1   0   0    0    0    0    0    0    0   -1   -1   -1   -1   -1   -1      8
(12)   1   1   1   1   1   1   1   1   0   0    0    0    0    0    0    0   -1   -1   -1   -1   -1   -1      8
(13)   1   1   1   1   1   1   1   1   0   0    0    0    0    0    0    0   -1   -1   -1   -1   -1   -1      8
(14)   1   1   1   1   1   1   1   1   1   0    0    0    0    0    0    0   -1   -1   -1   -1   -1   -1      9
(15)   1   1   1   1   1   1   1   1   1   1    0    0    0    0    0    0    0   -1   -1   -1   -1   -1     10
(16)   1   1   1   1   1   1   1   1   1   1    0    0    0    0    0    0    0   -1   -1   -1   -1   -1     10
(17)   1   1   1   1   1   1   1   1   1   1    1    1    1    1    0    0    0    0    0    0   -1   -1     14
(18)   1   1   1   1   1   1   1   1   1   1    1    1    1    1    1    1    0    0    0    0   -1   -1     16
(19)   1   1   1   1   1   1   1   1   1   1    1    1    1    1    1    1    0    0    0    0   -1   -1     16
(20)   1   1   1   1   1   1   1   1   1   1    1    1    1    1    1    1    0    0    0    0   -1   -1     16
(21)   1   1   1   1   1   1   1   1   1   1    1    1    1    1    1    1    1    1    1    1    0    0     20
(22)   1   1   1   1   1   1   1   1   1   1    1    1    1    1    1    1    1    1    1    1    0    0     20
Table 4: WST test on the Gehler-Shi database.

4.3 SFU HDR Database

The SFU HDR database, recently collected by Funt and Shi, includes high-dynamic-range (HDR) linear images of 105 scenes captured with a Nikon D700 digital still camera. Three subsets are provided for color constancy research. The first includes 105 16-bit base images, with a color checker placed in the scene to record the illuminant color. The second contains images of the same scenes but with the color checker removed. The third contains 105 float-format images constructed from the base images. In our experiments, we use the images in the third subset to evaluate the proposed approach. For images with a color checker positioned in the scene, the color checker is masked during illuminant estimation so that the performance of the approach can be fairly evaluated. We track how the proposed approach performs under varying settings of $n$ and plot the results in Fig. 5. Again, we notice that its performance stays quite stable while $n$ ranges from 1 to 4. The results reported in Table 5 and Table 6 are obtained with a fixed $n$ in this range.

Table 5 shows the results for multiple approaches. In this table, the results of GW, WP, SoG, GG, GE1 and GE2 were obtained by running the Matlab codes from cc_website ; gijsenij2011computational with optimal parameter settings. The optimal results of DBPCA and WGE are likewise reported with tuned parameter settings, and the results of WP(post blurred) and CM are directly cited from finlayson2013corrected . It is noticed that the median error of DCs is lower than those of most approaches except GG and CM. Evidently, the performance of DCs on the SFU HDR database is poorer than that on the SFU laboratory and Gehler-Shi databases. Funt and Shi pointed out in their work funt2010rehabilitation that all scenes in the SFU HDR database contain some variation in the illumination color because of interreflections; interreflections are therefore undoubtedly responsible for the poorer performance of DCs on this database. Table 6 shows the results of the sign test, in which the analysis of CM is again omitted because its error distribution is unavailable. We notice that DCs achieves the highest score among the tested algorithms. In particular, it performs better than GW, WGE and DOCC(max), while showing indistinguishable performance from the other methods. The average runtime for each image in this database is 2.65 seconds, with the time for loading an HDR image neglected.

Method Median Mean Trimean Best-25% Worst-25%
DN 14.67 15.11 14.95 11.19 19.51
GW 7.45 8.08 7.78 1.94 15.17
WGE 5.76 6.67 5.72 1.78 13.29
WP 4.06 6.36 4.75 1.59 14.45
WP(blur) 3.90 6.30