Python implementation of two low-light image enhancement techniques via illumination map estimation
Exposure correction is one of the fundamental tasks in image processing and computational photography. While various methods have been proposed, they either fail to produce visually pleasing results or only work well for limited types of images (e.g., underexposed images). In this paper, we present a novel automatic exposure correction method that robustly produces high-quality results for images of various exposure conditions (e.g., underexposed, overexposed, and partially under- and over-exposed). At the core of our approach is the proposed dual illumination estimation, where we separately cast the under- and over-exposure correction as trivial illumination estimation of the input image and the inverted input image. By performing dual illumination estimation, we obtain two intermediate exposure correction results for the input image, one that fixes the underexposed regions and another that restores the overexposed regions. A multi-exposure image fusion technique is then employed to adaptively blend the visually best exposed parts in the two intermediate exposure correction images and the input image into a globally well-exposed image. Experiments on a number of challenging images demonstrate the effectiveness of the proposed approach and its superiority over state-of-the-art methods and popular automatic exposure correction tools.
With the prevalence of camera-embedded mobile devices and inexpensive digital cameras, people are increasingly interested in taking photos, and photo sharing on social networks has become a trendy lifestyle. However, although modern cameras are equipped with many sophisticated techniques and are generally easy to control and use, capturing well-exposed photos under complex lighting conditions (e.g., low light and back light) remains a challenge for non-professional photographers. Hence, poorly exposed photos are inevitably created; see the teaser figure (Dual Illumination Estimation for Robust Exposure Correction) for examples. Because of their unclear details, weak contrast and dull color, such photos usually look unpleasing and fail to capture user-desired effects, which increases the need for effective exposure correction techniques.
Because of its inherent nonlinearity and subjectivity, exposure correction is a challenging task. Existing image editing software (e.g., Photoshop, GIMP and Lightroom) offers various tools for users to interactively adjust the tone and exposure of photos, but these tools remain difficult for non-experts because they require a tedious process of balancing multiple controls (e.g., brightness, contrast and color). Although the “Auto-Tone” feature in Lightroom and the “Auto-Level” feature in Photoshop allow automatic exposure correction with a single click, they do not always apply the right adjustments to the input image and may therefore fail to produce satisfactory results. Figure 1 shows an example image processed by these tools.
Researchers have also developed various exposure correction methods. However, most of them are designed to correct only under-exposure [WZHL13, GLL17, ZYX18] or only over-exposure [GCZS10, LYKK14, ABK18], which limits their applicability. There also exist methods that are applicable to images of arbitrary exposure conditions. Early methods such as histogram equalization and its variants [Zui94, Kim97, Sta00] work by stretching the dynamic range of the intensity histogram, but tend to generate unrealistic results. Some subsequent methods rely on S-shaped tone mapping curves [RSSF02, YS12] or wavelets [HJS13], while more recent methods [GCB17, CWKC18, HHX18, WZF19] train tone adjustment models on datasets to perform exposure correction. However, they do not work well on overexposed images and may induce unnatural results; see Figure 10.
This paper presents a novel exposure correction approach, which is built upon the observation that under- and over-exposure correction can be jointly formulated as a trivial illumination estimation problem of the input image and the inverted input image. Although previous methods have demonstrated the effectiveness of illumination estimation in correcting underexposed photos, they barely explore its potential in handling overexposure. Unlike them, we found that overexposure correction can also be formulated as an illumination estimation problem by inverting the input image: the originally overexposed regions appear underexposed in the inverted image, so we can fix overexposed regions in the input image by correcting underexposed regions in the inverted input image. Hence, we introduce dual illumination estimation, where we separately predict a forward illumination for the input image and a reverse illumination for the inverted input image. Two intermediate exposure correction images of the input image are then recovered from the estimated forward and reverse illuminations, one that fixes the underexposed regions and another that restores the overexposed regions. Next, we apply an effective multi-exposure image fusion to the intermediate exposure correction images and the input image to seamlessly blend the locally best exposed parts in each of the three images into a globally well-exposed image.
The contribution of this paper is a simple yet effective exposure correction method built upon a novel dual illumination estimation. To show the effectiveness of our method, we evaluate it on a number of challenging images and compare it against both state-of-the-art methods and popular exposure correction tools via user studies. Experiments show that results generated by our method are preferred by human subjects, and that our method is effective in dealing with previously challenging images (e.g., images with both under- and over-exposed regions). Moreover, our method is fully automatic and runs at near-interactive rates.
Exposure correction is an important research problem with an immense literature. Perhaps the most fundamental method is histogram equalization (HE), which globally enhances image contrast by stretching the intensity histogram. Despite its simplicity and effectiveness in contrast enhancement, it tends to generate unrealistic results because it ignores the relationships between pixels.
Mertens et al. [MKVR09] proposed to blend well-exposed regions from an image sequence with bracketed exposure into a single high-quality image. Despite the success of this technique, it cannot be directly applied to a single image because it requires a multi-exposure image sequence as input. Later, Zhang et al. [ZNZX16] adapted this technique to underexposed video enhancement by first constructing a multi-exposure image sequence for each video frame using sampled tone mapping curves, and then obtaining the enhanced video by progressively fusing the image sequences in a spatio-temporally aware fashion. In contrast, our method not only works on a single image, but is also applicable to images of various exposure conditions, not just underexposed images.
Bennett and McMillan [BM05] decomposed an input image into base and detail layers, and applied different sigmoid mappings to the two layers to restore the underexposed regions while preserving the image details. However, this often produces oversaturated results. Yuan and Sun [YS12] described an automatic exposure correction method for consumer photographs. The key idea behind their method is to infer the optimal exposure for each subregion and map subregions to their desired exposure levels using a detail-preserving tone mapping curve. While this method demonstrates promising results, it may also fail because it relies on reliable region segmentation, which is itself a challenging task.
Reverse tone mapping can also be used to correct the exposure of an image by inferring a high dynamic range (HDR) image from a single low dynamic range (LDR) input [MG16, EKM17, EKD17]. Our work differs from these methods in two aspects. First, we do not change the bit depth of the input image. Second, our results can be displayed on any device without an additional tone mapping operation.
Interactive exposure correction methods were also developed. Lischinski et al. [LFUS06] presented an interactive tool for tone adjustment. Given an input image, they allow users to quickly select the regions of interest by drawing a few brush strokes and then locally adjust the brightness, contrast, and other appearance factors in the selected regions through a group of sliders. Dodgson et al. [DGV09] introduced contrast brushes, an interactive method for contrast enhancement. Such methods benefit from user interactions, while our approach is fully automatic.
Content-aware methods utilize high-level image content to achieve better exposure correction for areas of interest such as human faces, skin and sky. Joshi et al. [JMAK10] improved the quality of faces in personal photo collections by using high-quality photos of the same person as examples. Dale et al. [DJS09] presented an example-based image restoration method that leverages a large database of Internet images. Kaufman et al. [KLW12] described a photo enhancement framework that takes both local and global image semantics into account. A common limitation of these methods is that they are usually very sensitive to the reliability of the extracted image semantics.
Since the pioneering work of Bychkovsky et al. [BPCD11], who provided a dataset consisting of image pairs for tone adjustment, there has been an increasing number of learning-based exposure correction methods. Yan et al. [YZW16] described a semantic-aware photo enhancement network. Cai et al. [CGZ18] learned a contrast enhancer from multi-exposure images by constructing a training dataset of low-contrast and high-contrast image pairs for end-to-end CNN (convolutional neural network) learning, while Chen et al. [CWKC18] designed an unpaired learning model based on generative adversarial networks (GANs). Reinforcement learning was also employed to train photo adjustment models. For instance, Hu et al. [HHX18] achieved a white-box photo post-processing framework by modeling retouching operations as differentiable filters. Unlike [HHX18], Park et al. [PLYSK18] cast enhancement as a Markov Decision Process over several fundamental global color adjustment actions, and trained an agent on unpaired data to reveal the optimal sequence of actions. The limitation of learning-based methods is that they do not work well on images that are significantly different from the training images.
Figure 2 presents the system overview of our exposure correction algorithm. Given an input image, we first perform dual illumination estimation to obtain the forward and reverse illuminations, from which we recover the intermediate under- and over-exposure corrected images. Then, the two intermediate exposure correction images together with the input image are fused into the desired image that seamlessly blends the best exposed parts in each of the three images. In the following, we elaborate on the proposed approach. Specifically, we first describe the dual illumination estimation (Section 3.1). Next, we present the multi-exposure image fusion (Section 3.2). Finally, we give the implementation details and parameter settings (Section 3.3).
Background. Fundamental to our dual illumination estimation is the assumption in Retinex-based image enhancement [WZHL13, GLL17, ZYX18] that an image $I$ (normalized to [0,1]) can be characterized as a pixel-wise product of the desired enhanced image $I'$ and a single-channel illumination map $L$:
$$I = I' \times L, \tag{1}$$
where $\times$ denotes pixel-wise multiplication. With this assumption, image enhancement can be reduced to an illumination estimation problem, since we can recover the desired image $I'$ as long as the illumination map $L$ is known. However, Retinex-based methods do not work well on overexposed images. The reason is that attenuating the exposure of an image requires the illumination map $L$ in Eq. 1 to exceed the normal gamut (i.e., $L > 1$), since the resulting image is recovered by $I' = I \times L^{-1}$. Figure 3 shows an example, where the Retinex-based enhancement methods further increase the exposure of the overexposed input image, generating the visually unpleasing images in Figure 3(b) and (c).
Key observation. Unlike previous Retinex-based enhancement methods, we observed that overexposure correction can also be formulated as an illumination estimation problem by inverting the input image, since the originally overexposed regions appear underexposed in the inverted image, allowing us to fix overexposed regions in the input image by correcting the corresponding underexposed regions in the inverted input image. Specifically, to correct overexposed regions in an input image $I$, we first obtain its inverted image $I_{inv} = 1 - I$ and estimate the corresponding illumination map $L_{inv}$. We then compute the underexposure corrected image $I'_{inv}$ by $I'_{inv} = I_{inv} \times L_{inv}^{-1}$ and recover the desired overexposure corrected image as $I' = 1 - I'_{inv}$. Note that the inverted input image is usually an unrealistic image, but the recovered overexposure corrected image is realistic. Figure 4 validates our observation, where we successfully correct an overexposed image by performing illumination estimation on the inverted input image.
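The inversion idea above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: `soft_max_rgb` is a hypothetical stand-in for the illumination estimator (a per-pixel maximum over RGB softened by an illustrative exponent), and the names `correct_underexposure` and `correct_overexposure` are introduced here for clarity only.

```python
import numpy as np


def soft_max_rgb(img: np.ndarray, softness: float = 0.6) -> np.ndarray:
    """Hypothetical illumination estimator: per-pixel max over RGB, softened by an exponent."""
    return np.power(img.max(axis=2), softness)


def correct_underexposure(img: np.ndarray, illum: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    """Retinex-style recovery: divide every channel by the single-channel illumination."""
    safe = np.maximum(illum, eps)  # guard against division by zero in black regions
    return np.clip(img / safe[..., None], 0.0, 1.0)


def correct_overexposure(img: np.ndarray, estimate=soft_max_rgb, eps: float = 1e-3) -> np.ndarray:
    """Fix overexposure by brightening the inverted image and inverting back."""
    inverted = 1.0 - img                      # overexposed regions become dark
    inverted_fixed = correct_underexposure(inverted, estimate(inverted), eps)
    return 1.0 - inverted_fixed               # back to the original polarity


if __name__ == "__main__":
    bright = np.full((2, 2, 3), 0.9)          # uniformly overexposed gray patch
    print(correct_overexposure(bright)[0, 0])  # values lower than the 0.9 input
```

Because the division only ever brightens an image whose illumination is at most 1, darkening has to go through the inverted image, which is the point of the reverse pass.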
It is worth noting that inverted images have been utilized in previous enhancement methods [DWP11, LWWG15]. Our use of the inverted image differs from these methods in two ways. First, they focus on enhancing low-light images and videos, while we aim to correct overexposed photos. Second, their observation is that inverted low-light images look like hazy images, so a dehazing algorithm is employed to produce the final results. In contrast, we observed that overexposed images are underexposed when inverted and can be indirectly corrected by illumination estimation.
Based on this observation, we design the dual illumination estimation, where the first pass estimates forward illumination for the input image and aims to correct underexposed regions, while the other pass is performed on the inverted input image for obtaining reverse illumination and correcting overexposed regions. The reason behind this design is that the input image may be partially under- and over-exposed, thus requiring two-pass illumination estimation to correct regions of different exposure conditions. Note the forward and reverse illuminations are separately estimated in the same illumination estimation framework. Below we describe the illumination estimation framework.
Illumination estimation framework. To estimate the illumination of a given image $I$, we first obtain an initial illumination $L'$ by taking the maximum of the RGB color channels as the illumination value at each pixel [Lan77], which is expressed as
$$L'_p = \max_{c \in \{r,g,b\}} I^c_p, \tag{2}$$
where $I^c_p$ denotes the color channel $c$ at pixel $p$. The reason why we use the maximal color channel as the initial illumination is that a smaller illumination risks sending color channels of the recovered image out of the color gamut, according to $I' = I \times L^{-1}$. Although the initial illumination map roughly depicts the overall illumination distribution, it typically contains rich details and textures that are not caused by illumination discontinuities, making the result recovered from it unrealistic; see Figure 5 (b) and (c). Hence, we propose to estimate a refined illumination map $L$ from $L'$ by preserving the prominent structure while removing the redundant texture details. To this end, we define the following objective function for obtaining the desired illumination map $L$:
$$\arg\min_{L} \sum_p \Big( (L_p - L'_p)^2 + \lambda \big( w_{x,p} (\partial_x L)_p^2 + w_{y,p} (\partial_y L)_p^2 \big) \Big), \tag{3}$$
where $\partial_x$ and $\partial_y$ are spatial derivatives in the horizontal and vertical directions, respectively. $w_{x,p}$ and $w_{y,p}$ are spatially varying smoothness weights. The first term enforces $L$ to be similar to the initial illumination map $L'$, while the second term aims to remove the redundant texture details in $L'$ by minimizing the partial derivatives. $\lambda$ is a weight for balancing the two terms.
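To make the two terms of the objective concrete, the following sketch computes the max-RGB initial illumination and evaluates a fidelity-plus-weighted-smoothness energy for a candidate illumination. It assumes squared forward differences (in the spirit of WLS smoothing) and takes the smoothness weights as plain arrays supplied by the caller; the function names and the discretization are illustrative choices, not details taken from the paper.

```python
import numpy as np


def initial_illumination(img: np.ndarray) -> np.ndarray:
    """Per-pixel maximum over the RGB channels of an image in [0, 1]."""
    return img.max(axis=2)


def forward_diff(a: np.ndarray, axis: int) -> np.ndarray:
    """Forward difference along `axis`, with zeros in the last row/column."""
    d = np.zeros_like(a)
    if axis == 1:   # horizontal derivative
        d[:, :-1] = a[:, 1:] - a[:, :-1]
    else:           # vertical derivative
        d[:-1, :] = a[1:, :] - a[:-1, :]
    return d


def objective(L, L0, wx, wy, lam):
    """Fidelity to the initial map L0 plus lam times the weighted squared derivatives of L."""
    fidelity = np.sum((L - L0) ** 2)
    smoothness = np.sum(wx * forward_diff(L, 1) ** 2 + wy * forward_diff(L, 0) ** 2)
    return fidelity + lam * smoothness


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((4, 5, 3))
    L0 = initial_illumination(img)
    ones = np.ones_like(L0)
    flat = np.full_like(L0, L0.mean())
    # The textured initial map pays a smoothness cost; a flat map pays a fidelity cost.
    print(objective(L0, L0, ones, ones, lam=1.0), objective(flat, L0, ones, ones, lam=1.0))
```

A larger `lam` shifts the minimizer toward flatter maps, while large weights at a pixel penalize derivatives there more strongly than elsewhere.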
Intuitively, the objective function in Eq. 3 has a form similar to that of WLS smoothing [FFLS08]. However, our smoothness weights are defined differently. Specifically, the $x$-direction smoothness weight at each pixel $p$ is built on a term inspired by the relative total variation (RTV) [XYXJ12]. That term relies on a spatial weight with standard deviation $\sigma$, which is a function of the spatial Euclidean distance between pixels $p$ and $q$. Since the $y$-direction smoothness weight is defined similarly, we do not give its definition here.
The solution to the objective function in Eq. 3 can be efficiently obtained; see [LLW04, LFUS06, FFLS08] for available solvers. Note that, similar to [FZH16, GLL17], to recover results with better brightness, we alternatively perform a Gamma adjustment to the estimated illumination $L$, i.e., $L \leftarrow L^{\gamma}$, and recover the exposure correction result by $I' = I \times L^{-1}$. In our experiments, we empirically set $\gamma$ to 0.6. Figure 5 demonstrates the effectiveness of our illumination estimation in correcting an underexposed image. As can be seen, by optimizing the objective function in Eq. 3, we obtain a piece-wise smooth illumination with few texture details, from which we recover a visually pleasing underexposure correction result.
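The effect of the Gamma adjustment can be illustrated with a short sketch. It assumes the illumination is a single-channel array in [0, 1] and simply raises it to the power 0.6 before dividing; the function name and the `eps` guard are additions made here for illustration.

```python
import numpy as np


def recover_with_gamma(img: np.ndarray, illum: np.ndarray,
                       gamma: float = 0.6, eps: float = 1e-3) -> np.ndarray:
    """Divide the image by the Gamma-adjusted illumination (illum ** gamma)."""
    adjusted = np.power(np.clip(illum, eps, 1.0), gamma)  # gamma < 1 lifts the illumination
    return np.clip(img / adjusted[..., None], 0.0, 1.0)


if __name__ == "__main__":
    img = np.full((1, 1, 3), 0.25)
    illum = np.full((1, 1), 0.25)
    print(recover_with_gamma(img, illum, gamma=1.0)[0, 0])  # full division saturates to 1
    print(recover_with_gamma(img, illum, gamma=0.6)[0, 0])  # milder brightening
```

With an exponent below 1 the adjusted illumination is closer to 1 than the raw estimate, so dark regions are brightened less aggressively than with a plain division.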
Figure 6 compares our illumination estimation against previous edge-preserving image smoothing methods [FFLS08, XYXJ12]. For a fair comparison, we generated their illuminations from the same initial illumination, using implementations provided by the authors with well-tuned parameters. Moreover, the Gamma adjustment is applied to the illuminations produced by each method when recovering the exposure correction result. As shown, our illumination better removes the redundant texture details in the initial illumination while preserving the salient illumination structures, and it recovers a visually pleasing result with more distinct contrast and more vivid color. Note that, although Figure 6 shows the forward illumination estimation, the above conclusion also holds for the reverse illumination estimation, since the two are built upon the same illumination estimation algorithm.
As analyzed above, by performing the proposed dual illumination estimation, we obtain two intermediate exposure correction versions of an input image, one that corrects the underexposed regions and another that restores the overexposed regions. Intuitively, to generate a globally well-exposed image, the key is to seamlessly fuse the locally best exposed parts in the two intermediate exposure correction images. Considering that there may exist normally exposed regions in the input image, we additionally include the input image and perform a multi-exposure image fusion on the three images to obtain the final exposure correction result.
Let $I'_u$ and $I'_o$ denote the intermediate under- and over-exposure corrected images of an input image $I$. We then employ the exposure fusion technique [MKVR09] to fuse the image sequence $\{I, I'_u, I'_o\}$ into a globally well-exposed image $\tilde{I}$. Specifically, we first compute a visual quality map for each image in the sequence by:
$$V_k = (C_k)^{\omega_C} \times (S_k)^{\omega_S} \times (E_k)^{\omega_E},$$
where $k$ indicates the $k$-th image in the image sequence. $C_k$, $S_k$ and $E_k$ are quantitative measures for contrast, saturation, and well-exposedness; see [MKVR09] for details. $\omega_C$, $\omega_S$ and $\omega_E$ are parameters for controlling the influence of each measure, which are set to 1 by default. Note that pixels with higher visual quality values are more likely to be better exposed. The three visual quality maps are then normalized such that they sum to one at each pixel $p$.
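A rough NumPy sketch of the quality maps follows, using the three measures as they are usually described for exposure fusion [MKVR09]: the absolute Laplacian of the grayscale image for contrast, the per-pixel standard deviation across channels for saturation, and a Gaussian around 0.5 (with an assumed width of 0.2) multiplied over channels for well-exposedness. The grayscale conversion by channel mean, the edge padding and the small `eps` are simplifying assumptions made here.

```python
import numpy as np


def laplacian_abs(gray: np.ndarray) -> np.ndarray:
    """Absolute response of a 4-neighbor Laplacian with replicated borders."""
    p = np.pad(gray, 1, mode="edge")
    lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * gray
    return np.abs(lap)


def quality_map(img, w_c=1.0, w_s=1.0, w_e=1.0, sigma=0.2, eps=1e-12):
    """Product of contrast, saturation and well-exposedness, each raised to its exponent."""
    contrast = laplacian_abs(img.mean(axis=2))
    saturation = img.std(axis=2)
    well_exposed = np.prod(np.exp(-((img - 0.5) ** 2) / (2.0 * sigma ** 2)), axis=2)
    # eps keeps the later normalization well defined where every measure is zero
    return (contrast ** w_c) * (saturation ** w_s) * (well_exposed ** w_e) + eps


def normalized_quality_maps(images, **kwargs):
    """Stack the per-image maps and normalize them to sum to one at each pixel."""
    maps = np.stack([quality_map(im, **kwargs) for im in images])
    return maps / maps.sum(axis=0, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sequence = [rng.random((8, 8, 3)) for _ in range(3)]
    weights = normalized_quality_maps(sequence)
    print(weights.shape, float(weights.sum(axis=0).min()), float(weights.sum(axis=0).max()))
```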
Next, the multi-resolution image fusion technique originated by Burt and Adelson [BA83] is employed to seamlessly blend the images in the sequence under the guidance of the pre-computed visual quality maps. Figure 7 shows an example. As shown, the fused image in Figure 7(d) adaptively keeps the visually best parts of the multi-exposure image sequence (Figure 7(a)-(c)), and has better visual appeal than the input image due to the improved brightness, clear details, distinct contrast and vivid color. However, we notice a clear quality degradation in the fused image for the locally best exposed regions of the image sequence, such as the faces and the sky. We found that this is because the influence of these regions is weakened during the fusion by the same regions with lower visual quality in the other images of the sequence. Hence, instead of normalizing the visual quality maps, we propose to modify them by keeping only the largest value at each pixel along the image sequence and discarding the others.
Using the modified visual quality maps, we obtain an improved result with clearer face and cloud details as well as better contrast and more vivid color, as shown in Figure 7(e).
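The modification can be sketched as follows, under two assumptions: "keeping only the largest value" is read as zeroing the non-maximal entries at each pixel, and the multi-resolution blending of [BA83] is replaced by a plain single-scale weighted average so the example stays short. Both function names are hypothetical.

```python
import numpy as np


def keep_largest(maps: np.ndarray) -> np.ndarray:
    """maps has shape (K, H, W); zero every entry except the per-pixel maximum."""
    winner = maps.argmax(axis=0)  # ties go to the first image in the sequence
    mask = np.arange(maps.shape[0])[:, None, None] == winner[None]
    return maps * mask


def naive_blend(images, weights: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Single-scale weighted average; a simplified stand-in for pyramid blending."""
    w = weights / (weights.sum(axis=0, keepdims=True) + eps)
    return np.sum(w[..., None] * np.stack(images), axis=0)


if __name__ == "__main__":
    maps = np.array([[[0.2, 0.9]], [[0.7, 0.05]], [[0.1, 0.05]]])  # K=3, H=1, W=2
    images = [np.full((1, 2, 3), v) for v in (0.1, 0.5, 0.9)]
    fused = naive_blend(images, keep_largest(maps))
    print(fused[0, 0], fused[0, 1])  # each pixel comes from its best-rated image
```

At a single scale this reduces to a hard per-pixel selection, which would create visible seams on real photos; that is the reason a multi-resolution blend is used in practice.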
We implement our algorithm in Matlab on a 3.0 GHz Core i5-7400 CPU. Similar to [XLXJ11], we alternatively optimize the objective function in Eq. 3 in the Fourier domain for speedup, which takes 0.3 seconds to estimate the illumination of a 1-megapixel image. For the multi-exposure image fusion, we adopt the implementation provided by the authors of [MKVR09], which takes about 1.5 seconds to produce the result for a 1-megapixel image. Note that, as analyzed in [MKVR09], an optimized GPU implementation would greatly accelerate the fusion process and enable real-time performance. Our code will be made publicly available at http://zhangqing-home.net/ .
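To illustrate why the Fourier domain is attractive, here is a heavily simplified sketch. It assumes uniform smoothness weights (all equal to 1) and periodic image boundaries, in which case the quadratic objective has a closed-form minimizer obtained by a pointwise division of spectra. The spatially varying weights used in the paper do not admit this one-shot solution, so this is not the authors' solver; `smooth_illumination_fft` is a name introduced here.

```python
import numpy as np


def smooth_illumination_fft(L0: np.ndarray, lam: float) -> np.ndarray:
    """Minimize sum (L - L0)^2 + lam * |grad L|^2 with circular forward differences."""
    h, w = L0.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies in cycles per sample
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies in cycles per sample
    # Squared magnitude of the forward-difference filter: 4 * sin^2(pi * f)
    grad_power = 4.0 * np.sin(np.pi * fx) ** 2 + 4.0 * np.sin(np.pi * fy) ** 2
    spectrum = np.fft.fft2(L0) / (1.0 + lam * grad_power)
    return np.real(np.fft.ifft2(spectrum))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L0 = rng.random((16, 20))
    L = smooth_illumination_fft(L0, lam=0.5)
    print(float(L0.var()), float(L.var()))  # the smoothed map has lower variance
```

The cost is dominated by two FFTs, which is what makes this family of solvers fast on megapixel images.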
The key parameter in our algorithm is $\lambda$, which controls the smoothness level of the resulting illuminations. In general, a larger $\lambda$ yields a smoother illumination, which allows recovering an exposure corrected image with stronger local contrast. However, an excessively smoothed illumination would in turn decrease the brightness and contrast. We use a fixed $\lambda$ in all our experiments, which produces good results. Figure 8 shows an example illustrating how $\lambda$ affects the forward illumination and the recovered underexposure corrected image.
To evaluate the effectiveness of our method, we conducted two user studies. The first compares our method with existing automatic exposure correction tools, and the second compares our method with state-of-the-art methods.
In this study, we compare our method with two automatic exposure correction tools, Auto-Level in Photoshop and Auto-Tone in Lightroom, as well as with interactive exposure correction via Lightroom. The two automatic tools and our method are first applied to 100 randomly selected images from the MIT-Adobe FiveK dataset [BPCD11]. We then collected the results and recruited 100 participants via Amazon Mechanical Turk to rate their preferences for the results. Similar to [YS12, KLW12], each participant was asked to perform pairwise comparisons between our result and one of four other images: 1) the input image, 2) the result of Auto-Level, 3) the result of Auto-Tone, and 4) the corresponding expert-retouched result from Expert C in the employed dataset, according to the following five common requirements for the desired exposure correction result: (i) suitable brightness, (ii) clear details, (iii) distinct contrast, (iv) vivid color, and (v) well-preserved photorealism. For each pairwise comparison, the participants had three options: “I prefer the image on the left”, “I prefer the image on the right”, and “the two images look the same to me”. Note that, to avoid subjective bias, each image pair was presented anonymously and in a random order throughout the evaluation.
| Our method vs. Input | 6% | 9% | 85% |
| Our method vs. Auto-Level | 7% | 11% | 82% |
| Our method vs. Auto-Tone | 21% | 12% | 67% |
| Our method vs. Expert-retouched | 42% | 21% | 37% |
Table 1 summarizes the statistical results of the pairwise comparison. Our method performs better in the pairwise comparisons against the input image, Auto-Level, and Auto-Tone, which demonstrates that results generated by our method are preferred by human subjects. In addition, as shown in the last row of the table, the preference for our method is comparable to that for the expert retouching, indicating that our method is able to produce high-quality exposure correction results and can be a good candidate exposure correction tool for non-expert users.
Figure 9 shows several example images employed in the user study and the exposure correction results generated by the compared tools and our method. As can be seen, the input images are diverse, including (i) a fruit stand image with overexposed apples (1st row), (ii) a landscape image with an underexposed building and an overexposed sky (2nd row), (iii) an overexposed wedding image with abnormal skin color and unclear details (3rd row), and (iv) a globally underexposed image with few portrait details (4th row). As shown, Auto-Level is less effective in correcting underexposure and fails to restore severely overexposed regions. Compared with Auto-Level, Auto-Tone is relatively more effective, but may also fail to produce satisfactory results due to the inherent difficulty of automatically balancing multiple appearance factors. In contrast, our method generates visually pleasing results with normal brightness, clear details, distinct contrast and vivid color, which are comparable to the corresponding expert-retouched results. Please see the supplementary material for more image results in the user study.
Here we compare our method with three recent learning-based exposure correction methods: HDRNet [GCB17], DPE [CWKC18], and Exposure [HHX18]. Since these methods are trained on the MIT-Adobe FiveK dataset, unlike in the first user study, we cannot use randomly selected images from the MIT-Adobe FiveK dataset for evaluation. Hence, we first crawled 100 test images in which over 50% of the pixels have a normalized intensity lower than 0.3 (69 images) or higher than 0.7 (31 images) from Flickr by searching with the keywords “low light”, “underexposed”, “overexposed”, “back light”, and “portrait”; see Figure 10 for examples. Then, the three compared methods and our method were used for exposure correction of the 100 test images. For a fair comparison, we produced their results using the publicly available implementations provided by the authors with the recommended parameter settings. Next, similar to the first user study, a pairwise preference comparison on Amazon Mechanical Turk with 100 participants was conducted between our result and one of four images: 1) the input image, 2) the result of HDRNet [GCB17], 3) the result of DPE [CWKC18], and 4) the result of Exposure [HHX18].
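The selection criterion described above can be written as a small check. The sketch below assumes that "normalized intensity" means the per-pixel mean of the RGB channels of an image scaled to [0, 1]; that definition, the function name and the returned labels are assumptions made for illustration.

```python
import numpy as np


def exposure_category(img: np.ndarray, low: float = 0.3, high: float = 0.7,
                      fraction: float = 0.5):
    """Label an image by the share of its pixels that are very dark or very bright."""
    intensity = img.mean(axis=2) if img.ndim == 3 else img
    if np.mean(intensity < low) > fraction:
        return "underexposed"
    if np.mean(intensity > high) > fraction:
        return "overexposed"
    return None  # does not meet either criterion


if __name__ == "__main__":
    dark = np.full((4, 4, 3), 0.1)
    bright = np.full((4, 4, 3), 0.9)
    print(exposure_category(dark), exposure_category(bright))
```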
| Our method vs. Input | 6% | 11% | 83% |
| Our method vs. HDRNet [GCB17] | 16% | 12% | 72% |
| Our method vs. DPE [CWKC18] | 13% | 9% | 78% |
| Our method vs. Exposure [HHX18] | 18% | 15% | 67% |
The user study results are shown in Table 2, where we can see that our results were favored by more human subjects, indicating that our method outperforms the compared methods. Figure 10 shows some test images and their exposure correction results generated by different methods in the user study. As shown, HDRNet [GCB17] erroneously increases the exposure of the overexposed portrait images in the third and fourth rows, leading to visually unpleasing results. DPE [CWKC18] is effective in color and contrast enhancement, but may generate unrealistic results. Exposure [HHX18] produces competitive results for the underexposed images in the first and second rows, while it fails to generate satisfactory results for the last two overexposed images. In comparison, we produce more appealing results. Additional image results in the user study are provided in the supplementary material.
Relationship to intrinsic image decomposition. Although the assumption in Eq. 1 looks the same as that of intrinsic image decomposition (IID), our problem is essentially different from IID. Specifically, IID assumes that an image is the pixel-wise product of the material reflectance and the illumination, where the reflectance component corresponds to an unrealistic image independent of illumination, while the same part (i.e., $I'$) in our model is the desired natural-looking exposure corrected image. Figure 11 shows an example comparing our result with the reflectance component produced by a state-of-the-art IID method.
Limitations. While our method produces satisfactory results for most of our test images, it still has a few limitations. First, as shown in Figure 12, our method fails to produce visually compelling results for the two images, since some parts of the facial regions are almost pure black or white, without any trace of color or texture in the original images. Note that other state-of-the-art methods also fail to produce satisfactory results for the input images in Figure 12; see the supplementary material for their results. Another limitation is that our method may amplify noise together with the fine-scale details when the input image is noisy.
We have presented a novel exposure correction method. Unlike previous methods, we propose to estimate dual illuminations, which allows us to conveniently recover high-quality intermediate under- and over-exposure corrected images. A multi-exposure image fusion technique is then adopted to integrate the locally best exposed parts in the two intermediate exposure correction images and the input image into a globally well-exposed image. Overall, our method is simple yet effective and runs fully automatically at near-interactive rates. We have performed extensive experiments on a number of images and compared our method with popular automatic exposure correction tools and state-of-the-art methods to demonstrate its effectiveness.
Our future work is threefold. First, we will investigate how to suppress noise during exposure correction. Second, we are interested in adopting semantic information and texture synthesis techniques to recover the missing image content in extremely under- and over-exposed regions such as those in Figure 12. Third, inspired by [ZHF18], we will try to improve the results by guiding the illumination estimation with saliency.
Acknowledgments. The authors thank the reviewers for their valuable comments. This work was partially supported by the NSFC (No. U1811461, No. 61802453 and No. 61602183), and the Science and Technology Project of Guangzhou (201707010140).