1 Introduction
Image denoising is an important research problem in low-level vision, aiming at recovering the latent clean image $x$ from its noisy observation $y$. Despite the significant advances in the past decades [8, 14, 57, 56], real image denoising still remains a challenging task, due to the complicated processing steps within the camera system, such as demosaicing, Gamma correction and compression [46].
From the Bayesian perspective, most of the traditional denoising methods can be interpreted within the Maximum A Posteriori (MAP) framework, i.e., $\max_x p(x|y) \propto p(y|x)p(x)$, which involves one likelihood term $p(y|x)$ and one prior term $p(x)$. Under this framework, two methodologies have been considered. The first attempts to model the likelihood term with proper distributions, e.g., Gaussian, Laplacian, MoG [33, 59, 55] and MoEP [10], which represent different understandings of the noise generation mechanism, while the second mainly focuses on exploiting better image priors, such as total variation [40], non-local similarity [8], low-rankness [15, 17, 47, 52] and sparsity [31, 58, 53]. Despite the better interpretability offered by the Bayesian framework, these MAP-based methods are still limited by the manual assumptions on the noise and image priors, which may largely deviate from real images.
In recent years, deep learning (DL)-based methods have achieved impressive success in the image denoising task [57, 4, 56]. However, as is well known, training a deep denoiser requires a large amount of clean-noisy image pairs, which are time-consuming and expensive to collect. To address this issue, several noise generation approaches were proposed to simulate more clean-noisy image pairs to facilitate the training of deep denoisers (throughout this paper, the phrase “noise generation” indicates the generation process of a noisy image from a clean image). The main idea behind them is to unfold the in-camera processing pipelines [19, 7], or to directly learn the distribution $p(y)$ as in [11, 25] using a generative adversarial network (GAN) [16]. However, the former methods involve many hyper-parameters that need to be carefully tuned for specific cameras, and the latter ones have difficulty simulating very realistic noisy images with high-dimensional signal-dependent noise distributions. Besides, to the best of our knowledge, there is still no metric to quantitatively assess the quality of the generated noisy images w.r.t. the real ones.

Against these issues, we propose a new framework to model the joint distribution $p(x,y)$ instead of only inferring the conditional posterior $p(x|y)$ as in the conventional MAP framework. Specifically, we first factorize the joint distribution from two opposite directions, i.e., $p(x|y)p(y)$ and $p(y|x)p(x)$, which can be well approximated by an image denoiser and a noise generator. Then we simultaneously train the denoiser and generator in a dual adversarial manner as illustrated in Fig. 1. After that, the learned denoiser can either be directly used for the real noise removal task, or further enhanced with new clean-noisy image pairs simulated by the learned generator. In summary, the contributions of this work are as follows:

Different from the traditional MAP framework, our method approximates the joint distribution $p(x,y)$ from two different factorized forms in a dual adversarial manner, which subtly avoids the manual design of image priors and noise distributions. Moreover, the joint distribution $p(x,y)$ theoretically contains more complete information underlying the data set than the conditional posterior $p(x|y)$.

Our proposed method can simultaneously deal with both the noise removal and noise generation tasks in one unified Bayesian framework, and achieves performance superior to the state of the art in both tasks. Moreover, the performance of our denoiser can be further improved after retraining on the augmented training data set with additional clean-noisy image pairs simulated by our learned generator.

In order to assess the quality of the noisy images simulated by a noise generation method, we design two metrics, which, to the best of our knowledge, are the first metrics for this purpose.
2 Related Work
2.1 Noise Removal
Image denoising is an active research topic in computer vision. Under the MAP framework, rational priors need to be pre-assumed to enforce some desired properties of the recovered image. Total variation [40] was first introduced to deal with the denoising task. Later, the non-local similarity prior, meaning that the small patches in a large non-local area may share some similar patterns, was considered in NLM [8] and followed by many other denoising methods [14, 15, 30, 28]. Low-rankness [15, 17, 54, 52] and sparsity [31, 50, 30, 58, 53] are another two well-known image priors, which are often used together within the dictionary learning methods. Besides, discriminative learning methods represent another research line, mainly including Markov random field (MRF) methods [6, 41, 44], cascade of shrinkage fields (CSF) methods [42, 43] and the trainable nonlinear reaction diffusion (TNRD) [12] method. Different from the above prior-based methods, noise modeling approaches focus on the other important component of MAP, i.e., the likelihood or fidelity term. For example, Meng and De La Torre [33] proposed to model the noise distribution as a mixture of Gaussians (MoG), while Zhu et al. [59] and Yue et al. [55] both introduced the non-parametric Dirichlet Process to MoG to expand its flexibility. Furthermore, Cao et al. [10] proposed the mixture of exponential power (MoEP) distributions to fit more complex noise.
In recent years, DL-based methods achieved significant advances in the image denoising task. Jain and Seung [23] first adopted a five-layer network to deal with the denoising task. Then Burger et al. [9] obtained performance comparable with BM3D using one plain multi-layer perceptron (MLP). Later, some auto-encoder based methods [49, 2] were also proposed. It is worth mentioning that Zhang et al. [57] proposed the convolutional denoising network DnCNN and achieved state-of-the-art performance on Gaussian denoising. Following DnCNN, many different network architectures were designed to deal with the denoising task, including RED [32], MemNet [45], NLRN [29], N3Net [37], RIDNet [4] and VDN [56].
2.2 Noise Generation
As is well known, the expensive cost of collecting pairs of training data is a critical limitation for deep learning based denoising methods. Therefore, several methods were proposed to explore the generation mechanism of image noise to facilitate an easy simulation of more training data pairs. One common idea was to generate image pairs by “unprocessing” and “processing” each step of the in-camera processing pipelines, e.g., [19, 7, 24]. However, these methods involve many hyper-parameters to be tuned for each specific camera. Another simpler way was to learn the real noise distribution directly using GAN [16] as demonstrated in [11] and [25]. Due to the complexity of real noise and the instability of GAN training, it is very difficult to train a good generator for simulating realistic noise.
3 Proposed Method
Like most of the supervised deep learning denoising methods, our approach is built on a given training data set containing pairs of real noisy image $y$ and clean image $x$, which are accessible thanks to the contributions of [3, 1, 51]. Instead of forcibly learning a mapping from $y$ to $x$, we attempt to approximate the underlying joint distribution $p(x,y)$ of the clean-noisy image pairs. In the following, we present our method from the Bayesian perspective.
3.1 Two Factorizations of Joint Distribution
In this part, we factorize the joint distribution from two different perspectives, and discuss their insights respectively related to the noise removal and noise generation tasks.
Noise removal perspective: The noise removal task can be considered as inferring the conditional distribution $p(x|y)$ under the Bayesian framework. The learned denoiser $R$ in this task represents an implicit distribution $p_R(x|y)$ to approximate the true distribution $p(x|y)$. The output of $R$ can be seen as an image sampled from this implicit distribution $p_R(x|y)$. Based on such understanding, we can obtain a pseudo clean image pair $(\hat{x}, y)$ as follows (we mildly assume that $y \sim p(y)$ is easily implemented by sampling $y$ from the empirical distribution of the training data set, and likewise for $x \sim p(x)$), i.e.,
$$y \sim p(y),\quad \hat{x} = R(y) \;\Longrightarrow\; (\hat{x}, y), \tag{1}$$
which can be seen as one example sampled from the following pseudo joint distribution:
$$p_R(x, y) = p_R(x|y)\,p(y). \tag{2}$$
Obviously, the better the denoiser $R$ is, the more accurately the pseudo joint distribution $p_R(x,y)$ approximates the true joint distribution $p(x,y)$.
Noise generation perspective: In a real camera system, image noise is derived from multiple hardware-related random noises (e.g., shot noise, thermal noise), and further affected by in-camera processing pipelines (e.g., demosaicing, compression). After introducing an additional latent variable $z$, representing the fundamental elements conducting the hardware-related random noises, the generation process from $x$ to $y$ can be depicted by the conditional distribution $p(y|x,z)$. The generator $G$ in this task expresses an implicit distribution $p_G(y|x,z)$ to approximate the true distribution $p(y|x,z)$. The output of $G$ can be seen as an example sampled from $p_G(y|x,z)$, i.e., $\hat{y} = G(x,z)$. Similar to Eq. (1), a pseudo noisy image pair $(x, \hat{y})$ is easily obtained:
$$x \sim p(x),\quad z \sim p(z),\quad \hat{y} = G(x, z) \;\Longrightarrow\; (x, \hat{y}), \tag{3}$$
where $p(z)$ denotes the distribution of the latent variable $z$, which can be easily set as an isotropic Gaussian distribution $\mathcal{N}(0, I)$.

Theoretically, we can marginalize the latent variable $z$ to obtain the following pseudo joint distribution $p_G(x,y)$ as an approximation to $p(x,y)$:
$$p_G(x, y) = \int_z p_G(y|x,z)\,p(x)\,p(z)\,dz \approx \frac{1}{L}\sum_{i=1}^{L} p_G(y|x,z_i)\,p(x), \tag{4}$$
where $z_i \sim p(z)$. As suggested in [26], the number of samples $L$ can be set to 1 as long as the minibatch size is large enough. Under such a setting, the pseudo noisy image pair $(x, \hat{y})$ obtained from the generation process in Eq. (3) can be roughly regarded as an example sampled from $p_G(x,y)$.
3.2 Dual Adversarial Model
In the previous subsection, we have derived two pseudo joint distributions from the perspectives of noise removal and noise generation, i.e., $p_R(x,y)$ and $p_G(x,y)$. Now the problem becomes how to effectively train the denoiser $R$ and the generator $G$, in order to well approximate the joint distribution $p(x,y)$. Fortunately, the tractability of the sampling processes defined in Eqs. (1) and (3) makes such training possible in an adversarial manner as in GAN [16], which gradually pushes $p_R(x,y)$ and $p_G(x,y)$ toward the true distribution $p(x,y)$. Specifically, we formulate this idea as the following dual adversarial problem inspired by Triple-GAN [13],
$$\min_{R,G}\max_{D}\ \mathcal{L}_{gan}(R,G,D) = \mathbb{E}_{(x,y)}\big[D(x,y)\big] - \alpha\,\mathbb{E}_{(\hat{x},y)}\big[D(\hat{x},y)\big] - (1-\alpha)\,\mathbb{E}_{(x,\hat{y})}\big[D(x,\hat{y})\big], \tag{5}$$
where $\hat{x} = R(y)$, $\hat{y} = G(x,z)$, and $D$ denotes the discriminator, which tries to distinguish the real clean-noisy image pair $(x,y)$ from the fake ones $(\hat{x},y)$ and $(x,\hat{y})$. The hyper-parameter $\alpha$ controls the relative importance between the denoiser $R$ and the generator $G$. As in [5], we use the Wasserstein-1 distance to measure the difference between two distributions in Eq. (5).
The working mechanism of our dual adversarial network is illustrated in Fig. 1. On one hand, the denoiser $R$, delivering the knowledge of $p_R(x|y)$, is expected to induce the joint distribution $p_R(x,y)$ of Eq. (2), while the noise generator $G$, conveying the information of $p_G(y|x)$, is expected to induce the joint distribution $p_G(x,y)$ of Eq. (4). Through the adversarial effect of the discriminator $D$, the denoiser $R$ and generator $G$ are both gradually optimized so as to pull $p_R(x,y)$ and $p_G(x,y)$ toward the true joint distribution $p(x,y)$ during training. On the other hand, the capabilities of $R$ and $G$ are mutually enhanced by their dual regularization of each other. Given any real image pair $(x,y)$ and one pseudo image pair $(x,\hat{y})$ from the generator $G$ or $(\hat{x},y)$ from the denoiser $R$, the discriminator $D$ is updated according to the adversarial loss. Then $D$ is fixed as a criterion to update both $R$ and $G$ simultaneously, as illustrated by the dotted lines in Fig. 1, which means that $R$ and $G$ keep interacting and are guided by each other in each iteration.
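The adversarial objective of Eq. (5) can be sketched numerically as follows. This is a minimal illustration, assuming the Wasserstein form $\mathbb{E}[D(x,y)] - \alpha\,\mathbb{E}[D(\hat{x},y)] - (1-\alpha)\,\mathbb{E}[D(x,\hat{y})]$ and assuming that the discriminator scores for a minibatch have already been computed; the function and argument names are hypothetical and not taken from the original implementation.

```python
from statistics import fmean


def dual_adversarial_objective(d_real, d_denoised, d_generated, alpha=0.5):
    """Monte Carlo estimate of the assumed objective of Eq. (5).

    d_real      -- critic scores D(x, y) on real clean-noisy pairs
    d_denoised  -- critic scores D(R(y), y) on pairs produced by the denoiser
    d_generated -- critic scores D(x, G(x, z)) on pairs produced by the generator
    alpha       -- weight between the denoiser term and the generator term
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (fmean(d_real)
            - alpha * fmean(d_denoised)
            - (1.0 - alpha) * fmean(d_generated))


def critic_loss(d_real, d_denoised, d_generated, alpha=0.5):
    """The discriminator maximizes the objective, i.e. minimizes its negative."""
    return -dual_adversarial_objective(d_real, d_denoised, d_generated, alpha)


def denoiser_generator_loss(d_denoised, d_generated, alpha=0.5):
    """R and G minimize the objective; the real-pair term is constant for them."""
    return -alpha * fmean(d_denoised) - (1.0 - alpha) * fmean(d_generated)
```

With `alpha = 0.5` the two fake-pair terms are weighted equally, so neither the denoiser nor the generator dominates the critic's training signal.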
Previous studies [22, 60] have shown that it is beneficial to mix the adversarial objective with traditional losses, which speeds up and stabilizes the training of GAN. For the noise removal task, we adopt the $L_1$ loss, i.e., $\|\hat{x} - x\|_1$, which enforces the output of the denoiser to be close to the ground truth. For the generator $G$, however, the direct $L_1$ loss would not be beneficial because of the randomness of noise. Therefore, we propose to apply the $L_1$ constraint on the statistical features of the noise distribution:
$$\big\|\mathrm{GF}(\hat{y} - x) - \mathrm{GF}(y - x)\big\|_1, \tag{6}$$
where $\mathrm{GF}(\cdot)$ represents the Gaussian filter used to extract the first-order statistical information of noise. Integrating these two regularizers into the adversarial loss of Eq. (5), we obtain the final objective:
$$\min_{R,G}\max_{D}\ \mathcal{L}_{gan}(R,G,D) + \tau_1\|\hat{x} - x\|_1 + \tau_2\big\|\mathrm{GF}(\hat{y} - x) - \mathrm{GF}(y - x)\big\|_1, \tag{7}$$
where $\tau_1$ and $\tau_2$ are hyper-parameters to balance the different losses. A sensitivity analysis on them is provided in Sec. 5.2.
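The noise-statistics regularizer of Eq. (6) can be sketched in plain Python as below. This is an illustrative version under stated assumptions: images are 2-D lists of floats, the Gaussian filter is a separable blur with edge replication, and the $L_1$ norm is reduced by the mean over pixels; the kernel radius and `sigma` are hypothetical choices, since the filter size is not given here.

```python
import math


def gaussian_kernel(radius=2, sigma=1.0):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_filter(img, radius=2, sigma=1.0):
    """Separable Gaussian blur of a 2-D list, replicating border pixels."""
    kernel = gaussian_kernel(radius, sigma)
    h, w = len(img), len(img[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[k + radius] * img[r][clamp(c + k, w)]
                 for k in range(-radius, radius + 1))
             for c in range(w)] for r in range(h)]
    return [[sum(kernel[k + radius] * rows[clamp(r + k, h)][c]
                 for k in range(-radius, radius + 1))
             for c in range(w)] for r in range(h)]


def noise_statistics_loss(clean, real_noisy, fake_noisy, radius=2, sigma=1.0):
    """Mean absolute difference between the filtered fake and real noise maps."""
    def subtract(a, b):
        return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

    gf_fake = gaussian_filter(subtract(fake_noisy, clean), radius, sigma)
    gf_real = gaussian_filter(subtract(real_noisy, clean), radius, sigma)
    diffs = [abs(p - q) for ra, rb in zip(gf_fake, gf_real)
             for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

Because the loss compares smoothed noise maps rather than individual pixels, two different random noise realizations with similar local statistics incur only a small penalty, which is the motivation given above for not applying a direct $L_1$ loss to the generator.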
3.3 Training Strategy
In the dual adversarial model of Eq. (7), we have three objects to be optimized, i.e., the denoiser $R$, generator $G$ and discriminator $D$. As in most of the GAN-related papers [16, 5, 13], we jointly train $R$, $G$ and $D$ but update them in an alternating manner as shown in Algorithm 1. In order to stabilize the training, we adopt the gradient penalty technique of WGAN-GP [18], enforcing the discriminator to satisfy the 1-Lipschitz constraint by an extra gradient penalty term.
After training, the generator $G$ is able to simulate more noisy images given any clean images, which are easily obtained from the original training data set or by downloading from the internet. Then we can retrain the denoiser $R$ by adding more synthetic clean-noisy image pairs generated by $G$ to the training data set. As shown in Sec. 5, this strategy can further improve the denoising performance.
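The alternating update scheme can be sketched as a small scheduling loop. This is only an assumed skeleton, not a reproduction of Algorithm 1: the two update callbacks are hypothetical placeholders for one optimizer step each, and `critic_steps` is an illustrative parameter for how many discriminator updates precede each joint update of the denoiser and generator.

```python
def train_alternating(batches, update_discriminator,
                      update_denoiser_generator, critic_steps=3):
    """Alternate critic updates with joint denoiser/generator updates.

    Every batch triggers one discriminator update. After every
    `critic_steps` discriminator updates, the denoiser and generator are
    updated once while the discriminator is held fixed.
    Returns the sequence of performed updates for inspection.
    """
    if critic_steps < 1:
        raise ValueError("critic_steps must be at least 1")
    schedule = []
    for step, batch in enumerate(batches, start=1):
        update_discriminator(batch)        # D step (with gradient penalty)
        schedule.append("D")
        if step % critic_steps == 0:
            update_denoiser_generator(batch)   # R and G step, D fixed
            schedule.append("RG")
    return schedule
```

Passing no-op callbacks, e.g. `train_alternating(range(6), lambda b: None, lambda b: None)`, is a quick way to inspect the resulting update order before wiring in real optimizer steps.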
3.4 Network Architecture
The denoiser $R$, generator $G$ and discriminator $D$ in our framework are all parameterized as deep neural networks due to their powerful fitting capability. As shown in Fig. 1, the denoiser $R$ takes the noisy image $y$ as input and outputs the denoised image $\hat{x}$, while the generator $G$ takes the concatenated clean image $x$ and latent variable $z$ as input and outputs the simulated noisy image $\hat{y}$. For both $R$ and $G$, we use the U-Net [39] architecture as the backbone. Besides, the residual learning strategy [57] is adopted in both of them. The discriminator $D$ contains five strided convolutional layers to reduce the image size and one fully connected layer to fuse all the information. More details about the network architectures are provided in the supplementary material due to the page limitation. It should be noted that our proposed method is a general framework that does not depend on a specific architecture; therefore, most of the commonly used network architectures [57, 32, 4] in low-level vision tasks can be substituted.
4 Evaluation Metrics
For the noise removal task, PSNR and SSIM [48] can be readily adopted to compare the denoising performance of different methods. However, to the best of our knowledge, no quantitative metric has yet been designed for the noise generation task. To address this issue, we propose two metrics to compare the similarity between the generated and the real noisy images as follows:

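PGap (PSNR Gap): The main idea of PGap is to compare the synthetic and real noisy images indirectly by the performance of the denoisers trained on them. Let $\mathcal{D}$ and $\mathcal{T}$ denote the available training and testing sets, whose noise distributions are the same or similar. Given any noisy image generator $G$, we can synthesize another training set:
$$\mathcal{D}_G = \big\{(\tilde{y}_i, x_i) \,\big|\, \tilde{y}_i = G(x_i, \tilde{z}_i),\ (y_i, x_i) \in \mathcal{D}\big\}. \tag{8}$$
After training two denoisers, $R_1$ on the original data set $\mathcal{D}$ and $R_2$ on the generated data set $\mathcal{D}_G$, under the same conditions, we can define PGap as
$$\mathrm{PGap} = \mathrm{PSNR}(R_1|\mathcal{T}) - \mathrm{PSNR}(R_2|\mathcal{T}), \tag{9}$$
where $\mathrm{PSNR}(R_i|\mathcal{T})$ represents the PSNR result of denoiser $R_i$ on the testing data set $\mathcal{T}$. It is obvious that, if the generated noisy images in $\mathcal{D}_G$ are close to the real noisy ones in $\mathcal{D}$, the performance of $R_2$ would be close to that of $R_1$, and thus the PGap would be small.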

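AKLD (Average KL Divergence): The noise generation task aims at synthesizing a fake noisy image $\tilde{y}$ from the real clean image $x$ to match the real noisy image $y$ in distribution. Therefore, the KL divergence between the conditional distribution $p_G(\tilde{y}|x)$ on the fake image pair $(x, \tilde{y})$ and $p(y|x)$ on the real image pair $(x, y)$ can serve as a metric. To make this conditional distribution tractable, we utilize the pixel-wise Gaussian assumption for real noise from the recent work VDN [56], i.e.,
$$p(y|x) = \mathcal{N}\big([y] \,\big|\, [x], \mathrm{diag}([V])\big), \tag{10}$$
where
$$V = \mathrm{GF}\big((y - x)^2\big), \tag{11}$$
$[\cdot]$ denotes the reshape operation from matrix to vector, $\mathrm{GF}(\cdot)$ denotes the Gaussian filter, and the square of $(y - x)$ is a pixel-wise operation. Based on such an explicit distribution assumption, the KL divergence between $p_G(\tilde{y}|x)$ and $p(y|x)$ can be regarded as an intuitive metric. To reduce the influence of randomness, we randomly generate $L$ synthetic fake noisy images:
$$\tilde{y}_j = G(x, \tilde{z}_j),\quad \tilde{z}_j \sim p(z),\quad j = 1, 2, \dots, L, \tag{12}$$
for any real clean image $x$, and define the following average KL divergence as our metric, i.e.,
$$\mathrm{AKLD} = \frac{1}{L}\sum_{j=1}^{L} \mathrm{KL}\big[p_G(\tilde{y}_j|x)\,\big\|\,p(y|x)\big]. \tag{13}$$
Evidently, the smaller AKLD is, the better the generator $G$ is. In the following experiments, $L$ is kept fixed.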
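Both metrics reduce to short computations once the denoisers have been trained and the variance maps estimated. The sketch below makes several assumptions: images and variance maps are flattened lists of floats, the per-pixel variances $[V]$ of the real and fake noise have already been obtained by Gaussian-filtering the squared noise, the KL divergence is summed over pixels, and a small `eps` guards against zero variance. All function names are illustrative.

```python
import math


def psnr(clean, restored, peak=1.0):
    """Peak signal-to-noise ratio (dB) between two flattened images."""
    mse = sum((a - b) ** 2 for a, b in zip(clean, restored)) / len(clean)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def pgap(psnr_trained_on_real, psnr_trained_on_synthetic):
    """PSNR gap between the denoisers trained on real and on synthetic pairs."""
    return psnr_trained_on_real - psnr_trained_on_synthetic


def gaussian_kl_same_mean(var_fake, var_real, eps=1e-10):
    """KL[N(m, var_fake) || N(m, var_real)] summed over pixels.

    Both Gaussians share the clean image as mean, so only the per-pixel
    variances enter the divergence.
    """
    total = 0.0
    for vf, vr in zip(var_fake, var_real):
        vf, vr = vf + eps, vr + eps
        total += 0.5 * (math.log(vr / vf) + vf / vr - 1.0)
    return total


def akld(fake_variance_maps, real_variance_map):
    """Average the KL divergence over several generated noisy images."""
    kls = [gaussian_kl_same_mean(vf, real_variance_map)
           for vf in fake_variance_maps]
    return sum(kls) / len(kls)
```

When a generated image has exactly the same variance map as the real one, its KL term is zero, so AKLD approaches zero as the generator matches the local noise variance of the real data.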
5 Experimental Results
In this section, we conduct a series of experiments on several real-world denoising benchmarks. Specifically, we consider two groups of experiments: the first group (Sec. 5.2) is designed to evaluate the effectiveness of our method on both the noise removal and noise generation tasks, and is implemented on one specific real benchmark containing training, validation and testing sets; the second group (Sec. 5.3) is conducted on two real benchmarks that only consist of some noisy images as the testing set, aiming at evaluating its performance on general real-world denoising tasks. Due to the page limitation, the running time comparisons are listed in the supplementary material.
In brief, we denote the jointly trained Dual Adversarial Network following Algorithm 1 as DANet. As discussed in Sec. 3.3, the learned generator $G$ in DANet is able to augment the original training set by generating more synthetic clean-noisy image pairs, and the denoiser $R$ retrained on this augmented training data set under the $L_1$ loss is denoted as DANet$_{+}$.
5.1 Experimental Settings
Parameter settings and network training: In the training stage of DANet, the weights of $R$ and $G$ were both initialized according to [20], and the weights of $D$ were initialized from a zero-centered Normal distribution with standard deviation 0.02 as in [38]. All three networks were trained by the Adam optimizer [27] with momentum terms for $R$ and for both $G$ and $D$. The learning rates were set as , and for $R$, $G$ and $D$, respectively, and halved every 10 epochs.
In each epoch, we randomly cropped patches with size from the images for training. During training, we updated $D$ three times for each update of $R$ and $G$. We kept $\tau_1$ and $\tau_2$ fixed throughout the experiments, and the sensitivity analysis about them can be found in Sec. 5.2. As for $\alpha$, we set it as $0.5$, meaning the denoiser $R$ and generator $G$ contribute equally in our model. The penalty coefficient in WGAN-GP [18] is set as 10 following its default settings. As for DANet$_{+}$, the denoiser $R$ was retrained with the same settings as those in DANet. All the models were trained using PyTorch [35].
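The learning-rate schedule described above can be written as a one-line step decay. This sketch assumes that "decayed in half every 10 epochs" means a step-wise halving with zero-based epoch indices; the function name and the `halve_every` parameter are illustrative, and the base learning rate is left as an argument rather than fixed.

```python
def learning_rate(base_lr, epoch, halve_every=10):
    """Step decay: multiply the base rate by 0.5 once per `halve_every` epochs."""
    if epoch < 0 or halve_every < 1:
        raise ValueError("epoch must be >= 0 and halve_every >= 1")
    return base_lr * 0.5 ** (epoch // halve_every)
```

Under this assumption, epochs 0 to 9 use the base rate, epochs 10 to 19 use half of it, and so on.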


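Table 1: PGap and AKLD results of different methods on the SIDD validation set.

| Metrics | CBDNet | ULRD | GRDN | DANet |
|---|---|---|---|---|
| PGap | 8.30 | 4.90 | 2.28 | 2.06 |
| AKLD | 0.728 | 0.545 | 0.443 | 0.212 |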
5.2 Results on SIDD Benchmark
In this part, the SIDD [1] benchmark is employed to evaluate the denoising performance and generation quality of our proposed method. The full SIDD data set provides clean-noisy image pairs as training data, and the remaining image pairs are held out as the benchmark for testing. For fast training and evaluation, one medium training set (320 image pairs) and a validation set (40 image pairs) are also provided, but the testing results can only be obtained by submission. We trained DANet and DANet$_{+}$ on the medium version of the training set, and evaluated them on the validation and testing sets.
Noise Generation: The generator $G$ in DANet is mainly used to synthesize the corresponding noisy image given any clean one. As introduced in Sec. 4, two metrics, PGap and AKLD, are designed to assess the generated noisy images. Based on these two metrics, we compared DANet with three recent methods, including CBDNet [19], ULRD [7] and GRDN [25]. CBDNet and ULRD both attempt to generate noisy images by simulating the in-camera processing pipelines, while GRDN directly learns the noise distribution using GAN [16]. It should be noted that ULRD [7] and GRDN [25] both make use of the metadata of the images.
Table 1 lists the PGap values of the compared methods on the SIDD validation set. For the calculation of PGap, the SIDD validation set is regarded as the testing set $\mathcal{T}$ in Eq. (9). Our proposed DANet achieves the smallest PGap (2.06, versus 2.28 for the second-best GRDN). Fig. 2 displays the PSNR curves of different denoisers trained on the real training set or only on the synthetic training sets generated by different methods, which gives an intuitive illustration of our defined PGap. It can be seen that all the methods tend to gradually overfit to their own synthetic training set, especially CBDNet. However, DANet performs not only more stably but also better than the other methods.


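Table 2: PSNR and SSIM results of different methods on the SIDD testing and validation sets.

| Datasets | Metrics | CBM3D | WNNM | DnCNN | CBDNet | RIDNet | VDN | DANet | DANet$_{+}$ |
|---|---|---|---|---|---|---|---|---|---|
| Testing | PSNR | 25.65 | 25.78 | 23.66 | 33.28 | - | 39.26 | 39.25 | 39.43 |
| Testing | SSIM | 0.685 | 0.809 | 0.583 | 0.868 | - | 0.955 | 0.955 | 0.956 |
| Validation | PSNR | 25.29 | 26.31 | 38.56 | 38.68 | 38.71 | 39.29 | 39.30 | 39.47 |
| Validation | SSIM | 0.412 | 0.524 | 0.910 | 0.909 | 0.913 | 0.911 | 0.916 | 0.918 |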

The average AKLD results calculated on all the images of the SIDD validation set are also listed in Table 1. The smallest AKLD of DANet indicates that it learns a better implicit distribution to approximate the true distribution $p(y|x)$. Fig. 3 illustrates one typical example of the real and synthetic noisy images generated by different methods, which provides an intuitive visualization for the AKLD metric. In summary, DANet outperforms the other methods both quantitatively and visually, even though some of them make use of additional metadata.


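Table 3: PSNR and SSIM results of DANet under different $\tau_1$ values.

| Metrics | | | | $+\infty$ |
|---|---|---|---|---|
| PSNR | 38.66 | 39.30 | 39.33 | 39.39 |
| SSIM | 0.901 | 0.916 | 0.916 | 0.917 |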



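Table 4: PGap and AKLD results of DANet under different $\tau_2$ values.

| Metrics | 0 | 5 | 10 | 50 | $+\infty$ |
|---|---|---|---|---|---|
| PGap | 5.33 | 3.10 | 2.06 | 4.17 | 15.14 |
| AKLD | 0.386 | 0.216 | 0.212 | 0.177 | 0.514 |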

Noise Removal: To verify the effectiveness of our proposed method on the real-world denoising task, we compared it with several state-of-the-art methods, including CBM3D [14], WNNM [17], DnCNN [57], CBDNet [19], RIDNet [4] and VDN [56]. Table 2 lists the PSNR and SSIM results of different methods on the SIDD validation and testing sets. It should be noted that the results on the testing set are cited from the official website (https://www.eecs.yorku.ca/~kamel/sidd/benchmark.php), but the results on the validation set were calculated by ourselves. For fair comparison, we retrained DnCNN and CBDNet on the SIDD training set. From Table 2, it is easily observed that: 1) deep learning methods perform better than the traditional methods CBM3D and WNNM due to the powerful fitting capability of DNNs; 2) DANet and DANet$_{+}$ both outperform the state-of-the-art real-world denoising methods, substantiating their effectiveness; 3) DANet$_{+}$ surpasses DANet by about 0.18dB PSNR on the testing set, which indicates that the synthetic data generated by $G$ facilitates the training of the denoiser $R$.
Fig. 4 illustrates the visual denoising results of different methods. It can be seen that CBM3D and WNNM both fail to remove the real-world noise. DnCNN tends to produce over-smooth edges and textures due to the $L_2$ loss. CBDNet, RIDNet and VDN alleviate this phenomenon to some extent since they adopt more robust loss functions. DANet recovers sharper edges and more details owing to the adversarial loss. After retraining with more generated image pairs, DANet$_{+}$ obtains denoising results closer to the ground truth.
Hyper-parameter Analysis: Our proposed DANet involves two hyper-parameters $\tau_1$ and $\tau_2$ in Eq. (7). The parameter $\tau_1$ mainly influences the performance of the denoiser $R$, while $\tau_2$ directly affects the generator $G$.


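Table 5: PSNR and SSIM results of BaseD and DANet on the noise removal task.

| Metrics | BaseD | DANet |
|---|---|---|
| PSNR | 39.19 | 39.30 |
| SSIM | 0.907 | 0.916 |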



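Table 6: PGap and AKLD results of BaseG and DANet on the noise generation task.

| Metrics | BaseG | DANet |
|---|---|---|
| PGap | 4.07 | 2.06 |
| AKLD | 0.223 | 0.212 |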

Table 3 lists the PSNR/SSIM results of DANet under different $\tau_1$ settings, where $\tau_1 = +\infty$ represents the results of the denoiser $R$ trained only with the $L_1$ loss. As expected, a small $\tau_1$ value, meaning that the adversarial loss plays a more important role, leads to a decrease of PSNR and SSIM performance to some extent. However, when the $\tau_1$ value is too large, the $L_1$ regularizer will dominate the performance of the denoiser $R$. Therefore, we set $\tau_1$ to a moderate value throughout all the experiments, which makes the denoising results more realistic as shown in Fig. 4, even at the cost of a little PSNR performance.
The PGap and average AKLD results of DANet under different $\tau_2$ values are shown in Table 4. Note that $\tau_2 = +\infty$ represents the results of the generator $G$ trained only with the regularizer of Eq. (6). Fig. 5 also shows the corresponding visual results of one typical example. As one can see, $G$ fails to simulate the real noise with $\tau_2 = 0$, which demonstrates that the regularizer of Eq. (6) is able to stabilize the training of GAN. However, it is also difficult to train $G$ only with the regularizer of Eq. (6), as shown in Fig. 5 (f). Taking both the quantitative and visual results into consideration, $\tau_2$ is constantly set as 10 in our experiments.
Ablation studies: To verify the marginal benefits brought by our dual adversarial loss, two groups of ablation experiments are designed in this part. In the first group, we train DANet without the generator and denote the trained model as BaseD. Conversely, in the second group we train DANet without the denoiser and denote the trained model as BaseG. The comparison results of these two baselines with DANet on the noise removal and noise generation tasks are listed in Table 5 and Table 6, respectively. It can be easily seen that DANet achieves better performance than both baselines in the noise removal and noise generation tasks, especially in the latter (PGap of 2.06 versus 4.07), which illustrates the mutual guidance and amelioration between the denoiser and the generator.
5.3 Results on DND and Nam Benchmarks
To evaluate the performance of our method in general real-world denoising tasks, we test on two real-world benchmarks, i.e., DND [36] and Nam [34]. These two benchmarks do not provide any training data, and are therefore suitable for testing the generalization capability of any denoiser. Following the experimental setting in RIDNet [4], we trained another model using image patches from SIDD [1], Poly [51] and RENOIR [3] for fair comparison. To be distinguished from the models of Sec. 5.2, the trained models under this setting are denoted as GDANet and GDANet$_{+}$, aiming at dealing with the general denoising task in real applications. For the training of GDANet$_{+}$, we employed the images of MIR Flickr [21] as clean images to synthesize more training pairs using $G$.


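Table 7: PSNR and SSIM results of different methods on the DND benchmark.

| Metrics | CBM3D | WNNM | DnCNN | CBDNet | RIDNet | VDN | GDANet | GDANet$_{+}$ |
|---|---|---|---|---|---|---|---|---|
| PSNR | 34.51 | 34.67 | 32.43 | 38.06 | 39.26 | 39.38 | 39.47 | 39.58 |
| SSIM | 0.8244 | 0.8646 | 0.7900 | 0.9421 | 0.9528 | 0.9518 | 0.9548 | 0.9545 |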



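Table 8: PSNR and SSIM results of different methods on the Nam benchmark.

| Metrics | CBM3D | WNNM | DnCNN | CBDNet | RIDNet | VDN | GDANet | GDANet$_{+}$ |
|---|---|---|---|---|---|---|---|---|
| PSNR | 35.36 | 35.33 | 35.68 | 39.20 | 39.33 | 38.66 | 39.91 | 39.79 |
| SSIM | 0.8708 | 0.8812 | 0.8811 | 0.9676 | 0.9623 | 0.9613 | 0.9693 | 0.9689 |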

DND Benchmark: This benchmark contains 50 pairs of real noisy and almost noise-free images. However, the almost noise-free images are not publicly released, thus the PSNR/SSIM results can only be obtained through the online submission system. Table 7 lists the PSNR/SSIM results released on the official DND benchmark website (https://noise.visinf.tu-darmstadt.de/benchmark/). From Table 7, we have the following observations: 1) GDANet$_{+}$ outperforms the state-of-the-art VDN by about 0.2dB PSNR, which is a large improvement in the field of real-world denoising; 2) GDANet obtains the highest SSIM value, which means that it preserves more structural information than the other methods, as can be visually observed in Fig. 6; 3) DnCNN cannot remove most of the real noise because it overfits to the Gaussian noise case; 4) the classical CBM3D and WNNM methods cannot handle the complex real noise.
Nam Benchmark: This benchmark contains 11 real static scenes and the corresponding noise-free images, which are obtained by averaging 500 noisy images of the same scenes. We cropped these images into patches, and randomly selected 100 of them for the purpose of evaluation. The quantitative PSNR and SSIM results are given in Table 8. It is easy to see that our proposed GDANet performs better than the other compared methods. Note that VDN does not achieve good performance since the noisy images in this benchmark are JPEG compressed, which is not considered in VDN. For easy comparison, we also display one typical example denoised by different methods in Fig. 7, where the better visual performance of our methods can be observed.
Discussion: Different from the results in Sec. 5.2, GDANet performs more stably than GDANet$_{+}$ as shown in Tables 7 and 8, especially on the SSIM metric. That is because the noise types simulated by the generator $G$, which are mainly determined by the training data set, do not match well with those contained in the testing set. Therefore, GDANet is suggested for such general real-world denoising tasks with uncertain noise types, while GDANet$_{+}$ is more suitable in scenarios that provide similar training and testing data sets.
6 Conclusion
We have proposed a new Bayesian framework, namely the dual adversarial network (DANet), for real-world image denoising. Different from the traditional MAP framework, which relies on subjective pre-assumptions on the noise and image priors, our proposed method focuses on learning the joint distribution directly from data. To estimate the joint distribution, we approximate it by its two different factorized forms in a dual adversarial manner, which corresponds to two tasks, i.e., noise removal and noise generation. For assessing the quality of synthetic noisy images, we have designed two applicable metrics, to the best of our knowledge, for the first time. The proposed DANet intrinsically provides a general methodology to facilitate the study of other low-level vision tasks, such as super-resolution and deblurring. Comprehensive experiments have demonstrated the superiority of DANet as compared with state-of-the-art methods specifically designed for the noise removal and noise generation tasks.
Acknowledgements: This research was supported by National Key R&D Program of China (2018YFB1004300) and the China NSFC project under contracts 11690011, 61721002, U1811461, and Hong Kong RGC RIF grant (R500118).
References

[1]
(201806)
A highquality denoising dataset for smartphone cameras.
In
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
, Cited by: §3, §5.2, §5.3.  [2] (2013) Adaptive multicolumn deep neural networks with application to robust image denoising. In Advances in Neural Information Processing Systems 26, pp. 1493–1501. Cited by: §2.1.
 [3] (2014) RENOIR  a benchmark dataset for real noise reduction evaluation.. arXiv preprint arXiv:1409.8230. External Links: Link Cited by: §3, §5.3.
 [4] (2019) Real image denoising with feature attention. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3155–3164. Cited by: §1, §2.1, §3.4, §5.2, §5.3.
 [5] (2017) Wasserstein gan. arXiv preprint arXiv:1701.07875. Cited by: §3.2, §3.3.
 [6] (2009) Training an active random field for realtime image denoising. IEEE Transactions on Image Processing 18 (11), pp. 2451–2462. Cited by: §2.1.
 [7] (2019) Unprocessing images for learned raw denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11036–11045. Cited by: §1, §2.2, §5.2.
 [8] (2005) A nonlocal algorithm for image denoising. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2, pp. 60–65. Cited by: §1, §1, §2.1.
 [9] (2012) Image denoising: can plain neural networks compete with bm3d?. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2392–2399. Cited by: §2.1.
 [10] (201512) Lowrank matrix factorization under general mixture noise distributions. In The IEEE International Conference on Computer Vision (ICCV), Cited by: §1, §2.1.
 [11] (2018) Image blind denoising with generative adversarial network based noise modeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3155–3164. Cited by: §1, §2.2.
 [12] (2017) Trainable nonlinear reaction diffusion: a flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (6), pp. 1256–1272. Cited by: §2.1.
 [13] (2017) Triple generative adversarial nets. In Advances in neural information processing systems, pp. 4088–4098. Cited by: §3.2, §3.3.
 [14] (2007) Image denoising by sparse 3-d transform-domain collaborative filtering. IEEE Transactions on image processing 16 (8), pp. 2080–2095. Cited by: §1, §2.1, §5.2.
 [15] (2012) Nonlocal image restoration with bilateral variance estimation: a low-rank approach. IEEE transactions on image processing 22 (2), pp. 700–711. Cited by: §1, §2.1.
 [16] (2014) Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680. Cited by: §1, §2.2, §3.2, §3.3, §5.2.
 [17] (2014) Weighted nuclear norm minimization with application to image denoising. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2862–2869. Cited by: §1, §2.1, §5.2.
 [18] (2017) Improved training of wasserstein gans. In Advances in neural information processing systems, pp. 5767–5777. Cited by: §3.3, §5.1.
 [19] (2019) Toward convolutional blind denoising of real photographs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1712–1722. Cited by: §1, §2.2, §5.2, §5.2.

 [20] (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pp. 1026–1034. Cited by: §5.1.
 [21] (2010) New trends and ideas in visual concept detection: the mir flickr retrieval evaluation initiative. In Proceedings of the international conference on Multimedia information retrieval, pp. 527–536. Cited by: §5.3.
 [22] (2017) Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1125–1134. Cited by: §3.2.
 [23] (2008) Natural image denoising with convolutional networks. In Advances in Neural Information Processing Systems 21, pp. 769–776. Cited by: §2.1.
 [24] (2019) Generating training data for denoising real rgb images via camera pipeline simulation. arXiv preprint arXiv:1904.08825. Cited by: §2.2.
 [25] (2019) Grdn: grouped residual dense network for real image denoising and gan-based real-world noise modeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 0–0. Cited by: §1, §2.2, §5.2.
 [26] (2014) Auto-encoding variational bayes. In ICLR 2014 : International Conference on Learning Representations (ICLR) 2014, Cited by: §3.1.
 [27] (2015) Adam: a method for stochastic optimization. In ICLR 2015 : International Conference on Learning Representations 2015, Cited by: §5.1.
 [28] (2013) A nonlocal bayesian image denoising algorithm. Siam Journal on Imaging Sciences 6 (3), pp. 1665–1688. Cited by: §2.1.
 [29] (2018) Non-local recurrent network for image restoration. In NIPS 2018: The 32nd Annual Conference on Neural Information Processing Systems, pp. 1673–1682. Cited by: §2.1.
 [30] (2009) Non-local sparse models for image restoration. In 2009 IEEE 12th International Conference on Computer Vision, pp. 2272–2279. Cited by: §2.1.
 [31] (2007) Sparse representation for color image restoration. IEEE Transactions on image processing 17 (1), pp. 53–69. Cited by: §1, §2.1.
 [32] (2016) Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In NIPS’16 Proceedings of the 30th International Conference on Neural Information Processing Systems, pp. 2810–2818. Cited by: §2.1, §3.4.
 [33] (2013-12) Robust matrix factorization with unknown noise. In The IEEE International Conference on Computer Vision (ICCV), Cited by: §1, §2.1.
 [34] (2016) A holistic approach to cross-channel image noise modeling and its application to image denoising. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1683–1691. Cited by: §5.3.
 [35] (2019) PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pp. 8024–8035. Cited by: §5.1.
 [36] (2017) Benchmarking denoising algorithms with real photographs. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1586–1595. Cited by: §5.3.
 [37] (2018) Neural nearest neighbors networks. In NIPS 2018: The 32nd Annual Conference on Neural Information Processing Systems, pp. 1087–1098. Cited by: §2.1.
 [38] (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR 2016 : International Conference on Learning Representations 2016, Cited by: §5.1.
 [39] (2015) U-net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Cited by: §3.4.
 [40] (1992) Nonlinear total variation based noise removal algorithms. Physica D: nonlinear phenomena 60 (1–4), pp. 259–268. Cited by: §1, §2.1.
 [41] (2009) Learning optimized map estimates in continuously-valued mrf models. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 477–484. Cited by: §2.1.
 [42] (2014) Shrinkage fields for effective image restoration. In CVPR ’14 Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2774–2781. Cited by: §2.1.
 [43] (2017) Half-quadratic inference and learning for natural images. Cited by: §2.1.
 [44] (2011) Learning non-local range markov random field for image restoration. In CVPR 2011, pp. 2745–2752. Cited by: §2.1.
 [45] (2017) MemNet: a persistent memory network for image restoration. In 2017 IEEE International Conference on Computer Vision (ICCV), pp. 4549–4557. Cited by: §2.1.
 [46] (2001) Statistical calibration of ccd imaging process. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vol. 1, pp. 480–487. Cited by: §1.
 [47] (2018) Weakly supervised lesion detection from fundus images. IEEE transactions on medical imaging 38 (6), pp. 1501–1512. Cited by: §1.
 [48] (2004) Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13 (4), pp. 600–612. Cited by: §4.
 [49] (2012) Image denoising and inpainting with deep neural networks. In Advances in Neural Information Processing Systems 25, pp. 341–349. Cited by: §2.1.

 [50] (2017) Kronecker-basis-representation based tensor sparsity and its applications to tensor recovery. IEEE transactions on pattern analysis and machine intelligence 40 (8), pp. 1888–1902. Cited by: §2.1.
 [51] (2018) Real-world noisy image denoising: a new benchmark. arXiv preprint arXiv:1804.02603. Cited by: §3, §5.3.
 [52] (2017) Multi-channel weighted nuclear norm minimization for real color image denoising. ICCV. Cited by: §1, §2.1.
 [53] (2018-09) A trilateral weighted sparse coding scheme for real-world image denoising. In The European Conference on Computer Vision (ECCV), Cited by: §1, §2.1.
 [54] (2017) Robust online matrix factorization for dynamic background subtraction. IEEE transactions on pattern analysis and machine intelligence 40 (7), pp. 1726–1740. Cited by: §2.1.
 [55] (2019) Robust multiview subspace learning with nonindependently and nonidentically distributed complex noise. IEEE transactions on neural networks and learning systems. Cited by: §1, §2.1.
 [56] (2019) Variational denoising network: toward blind noise modeling and removal. In Advances in Neural Information Processing Systems, pp. 1688–1699. Cited by: §1, §1, §2.1, 2nd item, §5.2.
 [57] (2017) Beyond a gaussian denoiser: residual learning of deep cnn for image denoising. IEEE Transactions on Image Processing 26 (7), pp. 3142–3155. Cited by: §1, §1, §2.1, §3.4, §5.2.
 [58] (2009) Nonparametric bayesian dictionary learning for sparse image representations. In Advances in neural information processing systems, pp. 2295–2303. Cited by: §1, §2.1.
 [59] (2016) Blind image denoising via dependent dirichlet process tree. IEEE transactions on pattern analysis and machine intelligence 39 (8), pp. 1518–1531. Cited by: §1, §2.1.
 [60] (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pp. 2223–2232. Cited by: §3.2.