Frequency Domain Loss Function for Deep Exposure Correction of Dark Images

04/22/2021 ∙ by Ojasvi Yadav, et al. ∙ Trinity College Dublin

We address the problem of exposure correction of dark, blurry and noisy images captured in low-light conditions in the wild. Classical image-denoising filters work well in the frequency space but are constrained by several factors such as the correct choice of thresholds, frequency estimates etc. On the other hand, traditional deep networks are trained end-to-end in the RGB space by formulating this task as an image-translation problem. However, this is done without any explicit constraints on the inherent noise of the dark images, and such networks thus produce noisy and blurry outputs. To this end, we propose a DCT/FFT-based multi-scale loss function which, when combined with traditional losses, trains a network to translate the important features for visually pleasing output. Our loss function is end-to-end differentiable, scale-agnostic, and generic; i.e., it can be applied to both RAW and JPEG images in most existing frameworks without additional overhead. Using this loss function, we report significant improvements over the state-of-the-art using quantitative metrics and subjective tests.




1 Introduction

Exposure correction, i.e., adjusting the light conditions of an image, is a classical Pizer et al. (1987); Huang et al. (2012) and active Chen et al. (2018); Loh and Chan (2019b); Chen Wei and Wenhan Yang (2018) problem in computer vision. Low-light conditions during capture may result in dark, noisy and blurry pictures. Digital single-lens reflex cameras (DSLRs) are equipped with advanced hardware capable of handling such scenarios, such as large aperture sizes, slow shutter speeds, sensitive sensors etc. Normally, a better, or rather a more expensive, digital camera has a wider range of these exposure settings and is thus capable of taking better quality pictures in harsh lighting conditions. Additionally, DSLRs store images in a minimally processed RAW format (typically 12-13 bits) that allows capturing a wide range of intensity values and rendering them later, as appropriate. In recent years, interest in mobile photography has increased significantly due to the rise of smartphones and the popularity of social networking sites such as Instagram, Flickr, Snapseed etc. Both casual hobbyists and professional photographers upload millions of images daily to these websites. Compared to DSLRs, smartphones are easier to operate and ideal for capturing quick, candid shots. Moreover, the quality of images shot using mobile phones has improved significantly, to be almost on par with DSLRs today. However, unlike DSLRs, smartphones are constrained by the growing consumer demand for slimmer hardware, which constrains the sensor size. A smaller sensor leads to poor low-light performance.

In order to compensate for the lack of complex hardware and the RAW image format, mobile phones usually rely on post-processing software. Regarding this, smartphones have an advantage over DSLRs because nowadays even an entry-level smartphone comes with a CPU and RAM capable of running complex software efficiently. Moreover, the abundance of easily available data has led to the research and development of data-driven smartphone software for improving the quality of the photos. In fact, such software is central to tackling the image quality - hardware trade off.

A typical software solution to this problem is multiple-exposure; i.e., taking multiple pictures using different exposure settings and fusing them to recreate a corrected image. However, the process is slow and also sensitive to camera movement. Professional post-processing software such as Adobe Photoshop, Lightroom, PhotoScape etc. may give creditable results but requires interactive tools and expertise on the user side. Moreover, such software is not free or open source. Therefore, the task addressed in this paper is to develop a solution that takes a single, underexposed and noisy JPEG image and corrects it with little or no user intervention, in real time.

Previous attempts have relied on off-the-shelf algorithms such as histogram equalization, wavelet transform etc. Pizer et al. (1987); Huang et al. (2012) or, more recently, on training an end-to-end deep neural network Chen et al. (2018). However, while the classical approaches rely on strong assumptions regarding the image content, Chen et al. (2018) train their network on RAW images captured using a DSLR under constrained environments. Therefore, none of them are suitable for photographs captured in the wild using mobile phones.

In this work, we present a simple yet effective approach for correcting the exposure of in the wild photographs under harsh lighting conditions. Our method is based on the observation that the high frequencies of an underexposed image follow a unique pattern due to the presence of noise. On the other hand, in a properly exposed image the low frequencies dominate or form most of the relevant content. Based on this, we propose a multi-scale, or rather scale-agnostic, loss function in the frequency space, combine it with the traditional L1 loss, and train a framework for exposure correction. Our loss function is differentiable and hence the framework is end-to-end trainable using backpropagation. Our method efficiently handles noisy and underexposed JPEG images and produces properly exposed, good quality outputs. Moreover, our loss function is generic: it also works for the RAW image format and can be plugged into other state-of-the-art (SoA) frameworks for exposure correction with no additional overhead. We demonstrate this by improving the SoA results for exposure correction of RAW images from the See In The Dark (SID) dataset Chen et al. (2018). We evaluate the proposed method thoroughly using quantitative metrics and subjective tests. Our loss function also improves SoA solutions to various image reconstruction problems. To demonstrate this breadth of applicability, we have added our loss function to the best-performing publicly available approaches for five additional applications and compared the results.

To summarize, we make the following contributions in this work.

  1. We propose a new loss function in the frequency domain which corrects in the wild underexposed images efficiently and can be plugged into any deep framework easily without additional cost.

  2. Using this loss function, we advanced the SoA in exposure correction from RAW images on the standard See In The Dark Chen et al. (2018) dataset (0.29 dB PSNR and 0.009 SSIM).

  3. We used this loss function also to train the standard Pix2Pix Isola et al. (2017) framework for exposure correction of JPEG images and observed a significant improvement (0.68 dB PSNR and 0.0193 SSIM).

  4. We verified the improvement by conducting subjective experiments and observed that the improvement in scores is consistent with human judgement.

  5. We have made the JPEG version of the SID dataset Chen et al. (2018) publicly available. Other researchers can use this JPEG dataset in their experiments without having to download the original RAW dataset and converting it to JPEG.

  6. We also investigated additional applications of our novel frequency loss function. We outperformed SoA methods for several popular image enhancement tasks and concluded that the frequency loss function has a wide range of applications.

The rest of the paper is organized as follows. In Sec. 2 we discuss the related research. In Sec. 3, we introduce the proposed loss function and describe the frameworks used for exposure correction. In Sec. 4 we report and analyze the RAW and JPEG image correction, and discuss the results of subjective experiments.

2 Related Work

Exposure correction during content creation in creative industries is usually performed using commercially available software such as Adobe Photoshop, LightRoom, PhotoScape etc. However, this software is expensive and requires a reasonable level of expertise to use. Therefore, automatic exposure correction for natural images remains an active research area with a large body of literature. We roughly divide the research into two categories: classical and deep-learning-based.

Classical approaches: For low-light enhancement, apart from simple and widely used approaches such as histogram equalization and gamma correction, more advanced methods Dong et al. (2011); Malm et al. (2007); Łoza et al. (2013) perform well. Similarly, there are plenty of classical techniques for image denoising Rudin et al. (1992); Portilla et al. (2003); Elad and Aharon (2006) and deblurring Zhuo et al. (2010); Shan et al. (2008); Donatelli et al. (2006). Typically, these approaches work in the frequency domain: they estimate the statistics of the noise in the image signal and subsequently rectify it to obtain the desired result.

Deep-Learning-based approaches: With the success of deep learning, several data-driven methods have been proposed. Retinex Chen Wei and Wenhan Yang (2018) used separate networks to (1) decompose the input into reflectance and illumination (Decom-net) and (2) merge them back (Enhance-net). The authors collected their own LOL dataset, which consists of low/normal-light pairs. There is no ground truth for the Decom-net; to tackle this, Decom-net only learns key constraints such as the reflectance shared by paired low/normal-light images and the smoothness of illumination. This achieves an accurate representation of image decomposition. After the decomposed images are merged by Enhance-net, visually pleasing images are produced.

Similarly, Yu et al. (2018) segment the image into sub-images of various dynamic range exposures. Each sub-image is corrected locally by means of policy network while making sure a global correction is also in place. A discriminator network is then used for aesthetic evaluation, forming a complete Reinforced Adversarial Network. LIME Guo et al. (2016) estimate illumination maps by finding maximum values in R, G, B channels. The illumination map is further refined by the use of an additional structure prior to obtain the final illumination map on which enhancement can be done accordingly. In Loh and Chan (2019a), a new dataset, ExDARK Loh and Chan (2019a), for low light images is introduced. The dataset is exhaustive which covers wide scenes like ambient, single object, screen, shadow etc. This paper is geared towards object detection, thus their exposure correction is more oriented to information retrieval than image enhancement. LLnet Lore et al. (2017) proposed a stacked-sparse auto-encoder which learns to denoise and lighten synthetically darkened grayscale images. The network was then applied to natural low-light images. Shen et al. (2017) used an end-to-end image translation framework that learns a mapping between dark and bright images. Kinoshita and Kiya (2018) used image-segmentation based on luminance distribution. In this method, multiple images of multiple exposures are produced using image segmentation. Each picture is individually corrected and merged back. Using this approach results with clear bright and dark regions were produced. In Huang et al. (2019); Zamir et al. (2019), the authors introduced loss functions based on MS-SSIM and PSNR metrics. These works show that alternative loss functions can outperform traditional L1 and L2 loss functions.

In Liba et al. (2019) the authors used an approach that combines pictures from multiple frames. They employ a technique known as “motion metering” to identify the number of frames and per-frame exposure times to minimize noise and motion blur in multiple frames. These frames are then combined using a learning-based auto white balancing algorithm. To mute the exposure correction effects, light shadows are darkened which increase the contrast. This lightweight approach is suitable for processing images on mobile phones. A zero-shot approach has been introduced in Zhang et al. (2019). This unsupervised approach doesn’t rely on any prior image examples or prior training. A small CNN called ExCNET is trained at test time. This network estimates the “S-curve” that accurately represents the input test image. ExCNET is flexible for various scenes and lighting conditions. Another unsupervised technique using CNNs was employed by Jiang et al. (2019). Instead of training their GAN on low-bright image pairs, authors constrained the unpaired training using information extracted from the input itself. A global-local discriminator structure, a self-regularized perceptual loss fusion and an attention mechanism were also introduced to achieve effective results for enhancing natural low-light images.

To make use of colour cues in the scene, Atoum et al. (2019) use an end-to-end mapping to aid the colour enhancement process. Based on these cues, the network then focuses on processing these local regions as well as the global image to produce colour accurate low light images. Lv and Lu (2019) also propose an end-to-end CNN approach which works by synthesizing the image locally and globally. The task is performed by two attention maps which work separately to (1) distinguish underexposed regions from well lit regions (2) distinguish noise from real textures. They also introduce a decomposition and fusion structured approach and another network to enhance the contrast. Malik and Soundararajan (2019) also decompose the input image via CNN but the decomposition is a Laplacian pyramid decomposition (SCNN). The subbands are enhanced at multiple scales and then combined to obtain the enhanced image (ReCNN). These networks in combination (LLRNet) train on the ‘See in the Dark’ Chen et al. (2018) dataset to perform contrast enhancement. A CNN in conjunction with Discrete Wavelet Transform (DWT) was used in Guo et al. (2019). Their network performs denoising and exposure correction. The network is evaluated on synthetic low light datasets and natural low light datasets.

Our work is motivated by and close to the work done by Chen et al. (2018). In that work, the authors created the SID dataset and proposed an architecture for RAW images. SID is considered a benchmark dataset for exposure correction. In contrast, we contribute a new framework-independent loss function which works for both RAW and JPEG images and performs significantly better than their solution.

3 Proposed Approach

In this section we discuss our main contributions. We implemented our loss in two variants, which we describe in detail in Sec 3.1. In Sec 3.2, we describe the pipelines used for exposure correction for RAW and JPEG images, respectively.

3.1 Frequency Loss

The problem of exposure correction can be modelled as an image translation problem where a mapping is learnt from the underexposed to the properly exposed domain. Typically, the SoA image translation systems such as Pix2Pix Isola et al. (2017) or CycleGAN Zhu et al. (2017) are trained using L1 and adversarial losses. The networks learn to map the current domain (for example, low light image, sketch etc.) to a new domain (corrected image, painting etc.) and also to reconstruct or fill in the missing information. In the case of exposure correction, we observed (Fig. 3, second column) that such mapping often results in the amplification of noise and other artefacts present in the original domain. Classical techniques to remove artefacts such as noise, blur, etc. often involve analysing the image in the frequency domain. Motivated by the above observations, we propose loss functions based on the discrete cosine transform (DCT) and fast Fourier transform (FFT). When combined with the traditional L1 and adversarial losses, our framework generates well-exposed and less noisy outputs. Our DCT-based loss function $\mathcal{L}^{DCT}_{M \times N}$ between images $I_1$ and $I_2$ of dimensions $M \times N$ is defined as follows:

$$\mathcal{L}^{DCT}_{M \times N}(I_1, I_2) = K_{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left| DCT(I_1)_{ij} - DCT(I_2)_{ij} \right|$$

where $K_{M \times N}$ is a scaling factor and $DCT(I)$ refers to the discrete cosine transform of image $I$. We compute this loss at different scales of the images and obtain:

$$\mathcal{L}^{DCT}(I_1, I_2) = \sum_{s=0}^{2} \mathcal{L}^{DCT}_{M_s \times N_s}\left(I_1^{(s)}, I_2^{(s)}\right)$$

where $I^{(0)} = I$ is the full-resolution image and $I^{(1)}$, $I^{(2)}$ are its two lower-resolution versions, each of dimensions $M_s \times N_s$. Similarly, we can also define $\mathcal{L}^{FFT}$. During training, the DCT or the FFT of the ground truth and of the predictions are computed, and the mean absolute difference between the two is calculated. This process is repeated at two lower resolutions. The code for the F-loss is given in the provided link. Essentially, the proposed loss function explicitly guides the network to learn the true frequency components of the correctly exposed image distribution and to ignore the noisy frequencies of the poorly exposed inputs. This has several advantages. First, it is differentiable and thus suitable for training a neural network in an end-to-end fashion using backpropagation. Secondly, it is generic and can be added to any existing SoA framework without additional overhead. Thirdly, by computing the DCT/FFT at different resolutions, the network sees artefacts at different scales and thus learns a scale-agnostic representation, a desirable feature for in the wild images. In our experiments, we notice that our loss function boosts the performance of the SoA frameworks for exposure correction by reducing noise, blur and other impurities such as color artefacts.
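The multi-scale computation described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' released code: the function names are hypothetical, the scaling factor is taken to be a plain mean over all coefficients, the two lower resolutions are assumed to be obtained by 2x2 average pooling, and the magnitude of the complex FFT difference is used as the absolute difference.

```python
import numpy as np


def downsample2x(img):
    """Halve height and width by 2x2 average pooling (odd edges are cropped).

    Assumption: the text only says "lower resolutions"; average pooling by a
    factor of two is one plausible choice.
    """
    h, w = img.shape[:2]
    h2, w2 = h - h % 2, w - w % 2
    img = img[:h2, :w2]
    return img.reshape(h2 // 2, 2, w2 // 2, 2, *img.shape[2:]).mean(axis=(1, 3))


def fft_l1(pred, target):
    """Mean absolute difference between the 2-D FFTs of two images."""
    f_pred = np.fft.fft2(pred, axes=(0, 1))
    f_target = np.fft.fft2(target, axes=(0, 1))
    # np.abs of a complex array returns the magnitude of each coefficient.
    return float(np.mean(np.abs(f_pred - f_target)))


def multiscale_fft_loss(pred, target, num_scales=3):
    """Sum the single-scale loss over the full resolution and two lower ones."""
    total = 0.0
    for s in range(num_scales):
        total += fft_l1(pred, target)
        if s < num_scales - 1:
            pred, target = downsample2x(pred), downsample2x(target)
    return total
```

In a training framework the same steps would be written with the framework's differentiable FFT so that gradients flow through the loss. A DCT variant could be obtained by swapping the transform, for example with `scipy.fft.dctn`.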

3.2 Framework

Figure 1: Framework: A standard encoder-decoder architecture (yellow) is coupled with a GAN component (green). The Pix2Pix framework used for JPEG images roughly follows this pipeline with additional skip connections. For RAW images, we use the framework of Chen et al. (2018), which does not use a GAN component; i.e., uses only the yellow section of the pipeline.

We used this loss function for two different tasks: (a) exposure correction of RAW images and (b) exposure correction of JPEG images. The proposed loss function was simply plugged into the following frameworks and the networks were trained again from scratch. The overall approach and the high-level architecture used in this work are described in Fig 1. Additional details are discussed in Sec 4.

(a) Exposure correction of RAW images: For this task, we chose the same framework as proposed by Chen et al. (2018), which is also the current SoA for exposure correction for RAW images. Their model follows a basic encoder-decoder framework adapted to RAW image processing. The input is packed into 4 channels and the spatial resolution is reduced by a factor of 2 in each dimension. The black level is subtracted and the data is scaled by an amplification ratio. The packed and amplified data is fed into a fully-convolutional network. The output is a 12 channel image with half the spatial resolution. This half sized output is processed by a sub-pixel layer to recover the original resolution.
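The input packing and the final sub-pixel step described above can be illustrated with a small NumPy sketch. The function names, the order in which the four Bayer sites are stacked, and the normalisation by a white level are assumptions made for illustration; the actual values of the black level, white level and amplification ratio depend on the camera and are passed in as parameters here.

```python
import numpy as np


def pack_bayer(raw, black_level, white_level, ratio):
    """Pack an (H, W) Bayer mosaic into an (H/2, W/2, 4) array.

    The black level is subtracted, the data is normalised to [0, 1] and then
    scaled by the amplification ratio. The channel order below is an assumption.
    """
    raw = raw.astype(np.float32)
    raw = np.maximum(raw - black_level, 0.0) / (white_level - black_level)
    packed = np.stack(
        [raw[0::2, 0::2], raw[0::2, 1::2], raw[1::2, 1::2], raw[1::2, 0::2]],
        axis=-1,
    )
    return packed * ratio


def depth_to_space(x, block=2):
    """Sub-pixel rearrangement: (h, w, C * block**2) -> (h * block, w * block, C)."""
    h, w, c = x.shape
    c_out = c // (block * block)
    x = x.reshape(h, w, block, block, c_out)
    x = x.transpose(0, 2, 1, 3, 4)  # interleave block rows/cols with pixels
    return x.reshape(h * block, w * block, c_out)
```

With `block=2`, a 12-channel network output at half resolution becomes a 3-channel image at the original resolution, which matches the shapes given in the text.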

(b) Exposure correction of JPEG images: This task was chosen to evaluate the capacity of this loss function for in the wild scenarios, for example low-light mobile photography as discussed in Sec 1. The model architecture used is the pix2pix model Isola et al. (2016), which is a popular framework for paired image-translation problems such as ours. Their model follows a U-Net architecture which is trained with L1 and adversarial losses.

4 Experiments and Results

To validate the performance of our loss, we retrain the two architectures mentioned in Sec 3.2 with and without our loss function. In this section, we go over the training procedures used to generate our results and explain the quantitative and qualitative comparisons that we carried out to show the performance of both the FFT and DCT variants of our loss function.

4.1 RAW Exposure Correction

For our experiments on RAW exposure correction, we use the SID dataset Chen et al. (2018) that was released alongside the current SoA network architecture. This dataset consists of indoor and outdoor images of various scenes. For every scene, one picture was taken with a short exposure time and another with a long exposure time. By nature of photography, the short-exposure image has low illumination and is also quite noisy. The same scene shot with a long exposure, however, is properly illuminated, which results in a much clearer and noiseless image. The SID dataset consists of 1865 RAW images that were used for training and 599 images that were used for testing. During the training process, we take random crops from the images for data augmentation. Additionally, these random crops are randomly flipped and rotated. For testing, the full-resolution images were processed.
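The augmentation described above (random crop, then random flip and rotation, applied identically to the input and its ground truth) might look like the following sketch. The function name and the `crop` parameter are hypothetical, and the input and target are assumed to share the same spatial resolution; in the RAW pipeline the packed input is half the size of the target, so the target crop would have to be scaled accordingly.

```python
import numpy as np


def random_crop_flip_rotate(inp, target, crop, rng):
    """Apply the same random crop, flips and 90-degree rotation to both images."""
    h, w = inp.shape[:2]
    top = rng.integers(0, h - crop + 1)   # upper bound is exclusive
    left = rng.integers(0, w - crop + 1)
    inp = inp[top:top + crop, left:left + crop]
    target = target[top:top + crop, left:left + crop]

    if rng.random() < 0.5:                # horizontal flip
        inp, target = inp[:, ::-1], target[:, ::-1]
    if rng.random() < 0.5:                # vertical flip
        inp, target = inp[::-1], target[::-1]

    k = int(rng.integers(0, 4))           # rotate by k * 90 degrees
    return np.rot90(inp, k), np.rot90(target, k)
```

A typical call would be `random_crop_flip_rotate(x, y, crop, np.random.default_rng())`, with `crop` set to whatever patch size the training setup uses.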

We trained the model three times: once using only the L1 loss as in the original implementation, once with L1 + FFT loss, and once with L1 + DCT loss. To isolate the effect of our loss function, the models were trained using the default settings; i.e., the network structure and the hyperparameters were unchanged from the original implementation Chen et al. (2018). The learning rate is $10^{-4}$ for epochs 0 to 2000, after which it becomes $10^{-5}$ for the next 2000 epochs. We trained for a total of 4000 epochs.

Loss | PSNR | SSIM
L1 (SoA) Chen et al. (2018) | 28.60 | 0.767
L1 + DCT | 28.61 | 0.769
L1 + FFT | 28.89 | 0.776
Table 1: Results for our RAW exposure correction experiments. For both PSNR and SSIM, higher scores are better.

Evaluation/Results: The results of our experiments can be seen in Table 1. With all the same parameters otherwise, we were able to improve on the performance of the network just by adding any variant of our loss to the total loss of the network. We further observe that the FFT variant performs significantly better than the original implementation. Qualitative results for this experiment can be seen in Figure 2. These results show that our loss function, in particular the FFT variant, reduces distortion artefacts and increases image sharpness. We further observe that our loss function reduces the noise in the corrected images and leads to smoother edges and accurate colours.
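For reference, PSNR follows the standard definition $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. A minimal sketch is given below; the function name is hypothetical and the images are assumed to be scaled to `[0, max_val]`. The exact evaluation script used for the tables is not specified in the text.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Because the scale is logarithmic, a gain of 0.29 dB corresponds to roughly a 6.5% reduction in mean squared error.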

Figure 2: RAW results (panels, left to right: Input, L1 (SoA), L1 + DCT (Ours), L1 + FFT (Ours), Ground truth). In the blue border crop, the pavement cross is sharper for the FFT-loss output. For the yellow border crop, the L1-loss (SoA) output has green artefacts at the bottom while FFT-loss does not. For the red border crop, the colours are more accurate for FFT-loss. For the green border crop, the window pane is sharper for FFT-loss.

4.2 JPEG Exposure Correction

For the JPEG exposure correction, we used the pix2pix architecture Isola et al. (2016). We chose this network for this experiment as it is a landmark framework which is stable, widely used and has stood the test of time for numerous image reconstruction tasks. We trained this network on the SID dataset Chen et al. (2018) by converting the RAW images to JPEG format. The JPEG version was obtained by selecting only the dark end of the RAW histogram. After conversion, we used the same images for training and testing as were used for training and testing on RAW images. This JPEG dataset is available at the provided link for future research and comparison. We use the default pix2pix parameters and default training procedure. To isolate the effect of adding our loss function, we altered neither the network structure nor the hyperparameters from the original code. The GAN loss is also unchanged from the original application Chen et al. (2018). We train one model with the L1 loss + adversarial loss as a baseline. We then train two more models, one with L1 loss + adversarial loss + FFT loss and one with L1 loss + adversarial loss + DCT loss. We train all models from scratch with the default learning rate held constant for 100 epochs and then decayed linearly over the next 100 epochs.

Figure 3: JPEG results (panels, left to right: Input, L1 + GAN (SoA), L1 + GAN + DCT (Ours), L1 + GAN + FFT (Ours), Ground truth). For the blue border crop, there is least noise in the FFT-loss output. For the yellow border crop, both DCT-loss and FFT-loss give sharper text written on the book. For the red border crop, the colours are the most accurate for DCT-loss. For the green border crop, FFT-loss has the least amount of artefacts.

Loss | PSNR | SSIM
L1 + GAN (SoA) Isola et al. (2016) | 23.9487 | 0.7623
L1 + GAN + DCT | 24.6305 | 0.7816
L1 + GAN + FFT | 24.4624 | 0.7727
Table 2: Results for our JPEG exposure correction experiments. For both PSNR and SSIM, higher scores are better.

Evaluation/Results: We show the results in Table 2. As with the previous experiment, adding either variant of our loss to the total loss of the network increases its performance in both the PSNR and SSIM. As opposed to the RAW case, however, for the JPEG images the DCT variant performs better. We further show qualitative results in Fig 2 and 3. We observe the same increases in image quality for our loss function as in the RAW case. The sharpness of the images is increased and noisy artefacts are reduced.

4.3 Subjective Study

To provide some qualitative analysis of our loss function, we conducted a subjective study with 20 participants. During the study, the participants were shown two images at a time and asked to choose the one they found more appealing. Possible characteristics of the images such as noise, blur and discoloration were pointed out to each participant during a short training session at the start.

At every choice, the participants were shown an image taken from the test dataset that had been processed by a network trained with either the L1-loss, DCT loss variant or FFT loss variant and the same image processed by a network trained with either one of the other loss functions. During the subjective test we mixed results from both the RAW and the JPEG exposure correction. However, the participants were only shown the same image, processed by the same network architecture at one time. Only the type of loss function used to train the network differed in the choices the participant was shown. Each participant saw 40 unique images. Due to the pairwise comparison, the participants were shown 120 images in total.

To analyse the results of the subjective test, we used pairwise comparison scaling Perez-Ortiz and Mantiuk (2017). We show our results in figure 4. Compared to the L1 loss, both the DCT and the FFT variants were chosen significantly more often by our test subjects, over all images. Additionally, the FFT variant was chosen significantly more often than the DCT version. For only the RAW images, the FFT variant was chosen significantly more often than the others, with only minor differences for L1 and DCT. For the JPEG images, the DCT variant was the one chosen significantly more often. These results match the results of our quantitative analysis, discussed earlier.

Figure 4: Just-Objectionable-Difference (JOD) Perez-Ortiz and Mantiuk (2017) between the L1 (SoA), FFT and DCT loss for all images and for only the RAW and JPEG images respectively.

4.4 Additional Applications

In this section we show that our frequency loss function also improves additional image enhancement tasks, such as super-resolution Dong et al. (2015), denoising Lehtinen et al. (2018), deblurring Kupyn et al. (2019), inpainting Xie et al. (2019) and video denoising Claus and van Gemert (2019). For that, we first retrained the given models from scratch on default settings. The datasets used in this training were the same as those used in the respective original studies, namely Set5 for super-resolution, Set14 for denoising, the GOPRO dataset Nah et al. (2016) for deblurring, and Paris Street-view Doersch et al. (2015) for inpainting. Then, we added our frequency loss function to each model and retrained them from scratch again, keeping the same default settings and using the same datasets. Based on the findings of the previous section, FFT-loss seemed to be superior to DCT-loss due to a larger variance of frequency coefficients (DCT is widely used in compression because of its ability to squeeze frequency coefficients into a space of small variance). Hence, we picked the FFT-loss to add to each model. We then compared the results with and without the additional frequency loss. The results are shown in Table 3. All training parameters, model structures, and datasets were kept the same and the only difference was the additional frequency loss function. Hence, the improvement in all these image enhancement tasks was due to our frequency loss function. This shows that the range of applications of our frequency loss function is diverse, exhibiting promise in various image enhancement tasks. We believe that there is ample scope for exploring these additional applications in more detail in the future.

Model | Default settings | Added F-loss
SRCNN Dong et al. (2015) | 28.86 / 0.92 | 29.10 / 0.94
Gaussian-clean Lehtinen et al. (2018) | 30.30 / 0.87 | 30.80 / 0.89
DeblurGAN-v2 Kupyn et al. (2019) | 29.18 / 0.89 | 29.39 / 0.90
LBAM Xie et al. (2019) | 26.11 / 0.86 | *26.39 / 0.87
ViDeNN-spatial Claus and van Gemert (2019) | 31.5 | 32.48

* Our frequency loss computation was modified to suit classic inpainting loss functions: instead of computing the loss between the ground truth (GT) and the output (O) directly, the loss is computed using a mask (M), between $M \odot GT$ and $M \odot O$ and between $(1 - M) \odot GT$ and $(1 - M) \odot O$.

Table 3: Results for additional applications. The models were trained with and without the frequency loss function, while keeping model parameters and datasets constant. The numbers signify PSNR/SSIM scores; higher scores are better.
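The mask-aware variant mentioned in the footnote for inpainting could be sketched as below. This is a guess at the structure rather than the authors' implementation: it assumes the two compared pairs are the masked ground truth and output ($M \odot GT$ vs. $M \odot O$) and their complements ($(1 - M) \odot GT$ vs. $(1 - M) \odot O$), that the two terms are simply summed, and that the single-scale FFT distance is a mean absolute difference. All function names are hypothetical.

```python
import numpy as np


def fft_l1(a, b):
    """Mean absolute difference between the 2-D FFTs of two images."""
    fa = np.fft.fft2(a, axes=(0, 1))
    fb = np.fft.fft2(b, axes=(0, 1))
    return float(np.mean(np.abs(fa - fb)))


def masked_fft_loss(output, gt, mask):
    """Frequency loss computed separately inside and outside a binary mask.

    Assumption: equal weighting of the two terms; inpainting losses often
    weight them differently.
    """
    inside = fft_l1(mask * output, mask * gt)
    outside = fft_l1((1 - mask) * output, (1 - mask) * gt)
    return inside + outside
```

Splitting the loss this way lets the two regions be weighted independently, which is how pixel-space inpainting losses are commonly structured.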

5 Conclusions and Limitations

In this paper we presented a novel loss function for use in low-light exposure correction. Our loss function transforms the image output into the frequency domain, which is better able to capture differences in high-frequency regions. This leads to an increase in sharpness and a reduction of noise and other artefacts, as shown in our quantitative and qualitative results. In particular, our subjective study shows that adding our loss leads to significantly better image quality. One practical limitation of our loss function is that its normalizing parameter needs to be manually tuned. We have further shown that our loss function can also be used in other image enhancement tasks, which should be investigated in more detail in the future.

