DR-Net: Transmission Steered Single Image Dehazing Network with Weakly Supervised Refinement

12/02/2017 · Chongyi Li, et al. · Zhejiang University, Tianjin University, Australian National University

Despite recent progress in image dehazing, several problems remain largely unsolved, such as robustness to varying scenes, the visual quality of reconstructed images, and effectiveness and flexibility for applications. To tackle these problems, we propose a new deep network architecture for single image dehazing called DR-Net. Our model consists of three main subnetworks: a transmission prediction network that predicts a transmission map for the input image, a haze removal network that reconstructs the latent image steered by the transmission map, and a refinement network that enhances the details and color properties of the dehazed result via weakly supervised learning. Compared to previous methods, our method advances in three aspects: (i) a purely data-driven model; (ii) an end-to-end system; (iii) superior robustness, accuracy, and applicability. Extensive experiments demonstrate that our DR-Net outperforms state-of-the-art methods on both synthetic and real images in qualitative and quantitative evaluations. Additionally, the utility of DR-Net has been illustrated by its potential usage in several important computer vision tasks.


I Introduction

High-quality images are desired in computer vision applications and multimedia content sharing. However, images captured in outdoor environments often suffer from noticeable interference from haze, a natural atmospheric phenomenon caused by floating particles (e.g., dust, smoke, and liquid droplets). Haze has two main effects on captured images: attenuation of the light and contamination with an additive component [1]. Specifically, the scattering of floating particles distorts the direct transmission of light from the scene to the camera, deviating the light from a straight trajectory and dispersing it when the particles are comparable in size to the wavelength of the probing light. The attenuated transmission decreases the intensity of the scene, while the surrounding scattered light induces a blurred appearance [2].

To remove the haze artifacts, previous single image dehazing methods usually follow a similar pipeline: (1) modeling the medium transmission, (2) refining the coarse transmission model, (3) estimating the global atmospheric light, and (4) reconstructing the latent image according to the predicted model parameters. However, this pipeline imposes several limitations. Firstly, transmission is ordinarily estimated based on priors, but priors relying on statistics become inaccurate when hazy images are captured under uncontrolled light conditions, different haze concentrations, and varying scene depths. Secondly, conventional global atmospheric light estimation methods (e.g., dark channel, quad-tree subdivision, bright channel, etc.) often make mistakes when there are white objects, highlighted regions, or shadows. Thirdly, the errors in the separate estimation steps are accumulated and amplified when the separately estimated variables are combined, which leads to suboptimal dehazing performance [21].

To tackle these limitations, we propose a new dehazing network that benefits from purely data-driven learning, an end-to-end architecture, and weakly supervised refinement. As observed in Figure 1(a), the presence of haze greatly impairs the visual quality of the image. Compared to the result of the recent deep learning based method [21] shown in Figure 1(b), our latent image (i.e., the final refined result) has better contrast, details, and color.

(a) Hazy image (b) [21] (c) Our result
Fig. 1: Sample dehazing result of DR-Net.

II Related Work

Image dehazing aims to recover a clear image from an image captured in a hazy scene. The many approaches developed to address this ill-posed problem can be categorized into supplementary information based methods and single image based methods.

Supplementary information based methods usually require additional knowledge such as 3D geographical models [3], scene depth [4], multiple images of the scene under different weather conditions [5], polarization filters [6], and so forth. Nevertheless, these methods are mostly computationally intensive and not applicable to dynamic scenes. Much attention, therefore, has been devoted to single image dehazing methods [7, 8, 9, 12, 13, 15, 14, 16, 1, 19, 20]. In [9], He et al. observed an interesting phenomenon of haze-free outdoor images: at least one color channel has some pixels with very low intensities. Based upon this dark channel prior, the transmission and global atmospheric light were roughly estimated. After that, the dehazed image was obtained from the transmission refined by soft matting [10] or the guided filter [11], together with the estimated global atmospheric light. In [1], Berman et al. proposed a nonlocal image dehazing method, which relies on the assumption that the colors of a haze-free image are well approximated by a few hundred distinct colors that form tight clusters in RGB space.

With the emergence of deep learning solutions [17], deep learning based methods have achieved promising performance in image dehazing. In [19, 20], a convolutional neural network (CNN) was utilized to predict the transmission. Afterwards, guided filtering [11] was used as post-processing to suppress the halo effect in the predicted transmission caused by patch based prediction. With the transmission and the global atmospheric light estimated by conventional methods, the haze-free image was reconstructed. Different from Cai et al. [19] and Ren et al. [20], which separately estimate model parameters and rely on post-processing, DR-Net directly produces a clear image in its end-to-end system. In [21], Li et al. estimated the parameters of a haze image formation model in one unified CNN model. Such an all-in-one model makes it easy to embed into other deep models. Though deep learning based methods have advantages, there is still much room for improvement in robustness to varying scenes, fidelity and visual quality of reconstructed images, and flexibility for applications.

II-A Our Contributions

DR-Net is a pure data-driven, end-to-end, fully-convolutional network that consists of three main subnetworks designed for specific tasks. The contributions are summarized as follows:

  • To the best of our knowledge, this is the first attempt to investigate the combination of strongly and weakly supervised learning for single image dehazing. DR-Net reconstructs latent images based on strongly supervised learning, while the dehazed results are further refined by weakly supervised learning built on a Generative Adversarial Network (GAN). Furthermore, DR-Net includes a transmission prediction subnetwork that improves haze removal performance and training convergence.

  • Instead of following the traditional pipeline, DR-Net directly predicts clear images in a pure data-driven and end-to-end manner, which is more flexible and suitable for practical applications. In addition, DR-Net directly minimizes the reconstruction loss to avoid the accumulated errors from individual estimations of transmission and global atmospheric light. This produces more accurate reconstruction results.

  • DR-Net achieves the best performance on both synthetic and real hazy images. Additionally, DR-Net generalizes well to varying scenes and lighting conditions. Code and data will be available after publication.

III DR-Net

To automatically reveal the underlying correlations between hazy and haze-free images in an end-to-end manner, DR-Net employs a transmission prediction subnetwork, a haze removal subnetwork, and a refinement subnetwork. An overview of the DR-Net architecture is shown in Figure 2. In what follows, we first formulate the problem and then explain these three subnetworks in detail.

Fig. 2: An overview of the DR-Net architecture. From top to bottom: the transmission prediction subnetwork, the haze removal subnetwork, and the refinement subnetwork. Different color blocks represent different operations.

III-A Problem Formulation

For an image captured under haze, only a part of the reflected light from the scene reaches the imaging sensor due to absorption and scattering effects, which decrease the visibility and contrast of the scene. According to the atmospheric scattering model [22], hazy image formation can be described as

$$I(x) = J(x)\,t(x) + A\big(1 - t(x)\big), \qquad (1)$$

where $x$ denotes the pixel coordinates, $I(x)$ is the observed image, $J(x)$ is the haze-free latent image, $A$ is the global atmospheric light, and $t(x)$ is the transmission, which represents the percentage of the scene radiance reaching the camera. When the haze is homogeneous, $t(x)$ can be further expressed as an exponential decay term

$$t(x) = e^{-\beta d(x)}, \qquad (2)$$

where $\beta$ is the atmospheric attenuation coefficient and $d(x)$ is the distance from the scene to the camera. The purpose of single image dehazing is to reconstruct $J(x)$ from $I(x)$.
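As a concrete illustration, the following NumPy sketch renders a hazy image from a clean image and a depth map according to (1) and (2). This is a minimal sketch under our own naming conventions (the paper provides no code); the assumption that every channel shares the same $A$ and $t(x)$ follows Section III-E1.

```python
import numpy as np

def synthesize_haze(J, depth, A=0.9, beta=1.0):
    """Render a hazy image via the atmospheric scattering model.

    J     : clean image, float array in [0, 1], shape (H, W, 3)
    depth : scene depth map d(x), shape (H, W)
    A     : global atmospheric light, assumed equal for all channels
    beta  : atmospheric attenuation coefficient
    """
    t = np.exp(-beta * depth)            # Eq. (2): t(x) = exp(-beta * d(x))
    t3 = t[..., np.newaxis]              # broadcast over the RGB channels
    I = J * t3 + A * (1.0 - t3)          # Eq. (1): I = J * t + A * (1 - t)
    return I, t
```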

III-B Transmission Prediction Subnetwork

Inspired by [23], we employ a transmission prediction subnetwork in our DR-Net. This subnetwork also adopts the multi-scale fully convolutional network architecture proposed in [20], which consists of a coarse-scale network and a fine-scale network. The task of the coarse-scale network is to predict a holistic transmission of the scene and the task of the fine-scale network is to refine the textures of the transmission. In the fine-scale network, the outputs of the first layer are concatenated with the output from the coarse-scale network as the inputs of the second layer. Unlike [20], we remove the pooling and up-sampling operations that tend to cause blurring in the predicted transmission output.
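The following Keras sketch illustrates the two-scale design described above. The 16/16/16/1 filter counts follow the parameter settings in Section III-E3; the 5×5 kernels and the sigmoid output (to keep the transmission in [0, 1]) are our assumptions, since those details are not recoverable from the text.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_transmission_net(h=240, w=320):
    """Sketch of the two-scale transmission prediction subnetwork."""
    hazy = layers.Input(shape=(h, w, 3))

    # Coarse-scale network: predicts a holistic transmission map.
    x = layers.Conv2D(16, 5, padding='same', activation='relu')(hazy)
    x = layers.Conv2D(16, 5, padding='same', activation='relu')(x)
    x = layers.Conv2D(16, 5, padding='same', activation='relu')(x)
    t_coarse = layers.Conv2D(1, 5, padding='same', activation='sigmoid')(x)

    # Fine-scale network: its first-layer features are concatenated with
    # the coarse prediction as the inputs of the second layer. There is
    # no pooling or up-sampling, so spatial resolution is preserved.
    y = layers.Conv2D(16, 5, padding='same', activation='relu')(hazy)
    y = layers.Concatenate()([y, t_coarse])
    y = layers.Conv2D(16, 5, padding='same', activation='relu')(y)
    y = layers.Conv2D(16, 5, padding='same', activation='relu')(y)
    t_fine = layers.Conv2D(1, 5, padding='same', activation='sigmoid')(y)

    return tf.keras.Model(hazy, [t_coarse, t_fine])
```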

III-B1 Loss functions of the transmission prediction subnetwork

For the coarse-scale transmission prediction network, we impose a reconstruction objective, that is, we minimize the Mean Squared Error (MSE) loss function

$$\mathcal{L}^{c}_{\mathrm{MSE}} = \frac{1}{NM}\sum_{i=1}^{N}\left\| F_{c}(I_{i}) - T_{i} \right\|_{2}^{2}, \qquad (3)$$

where $N$ is the number of images in each batch, $M$ is the dimension of the transmission map, $F_{c}$ is the learned transmission prediction mapping function of the coarse-scale network, $I_{i}$ is the input hazy image, $F_{c}(I_{i})$ is the predicted coarse transmission map, and $T_{i}$ is the ground truth transmission map. Using the MSE loss, a predicted transmission map with coarse details and textures is achieved. Then, the coarse transmission map is concatenated with the outputs of the first layer of the fine-scale network as the second-layer inputs of the fine-scale network. For the fine-scale transmission prediction network, we also minimize the MSE loss function

$$\mathcal{L}^{f}_{\mathrm{MSE}} = \frac{1}{NM}\sum_{i=1}^{N}\left\| F_{f}(I_{i}) - T_{i} \right\|_{2}^{2}, \qquad (4)$$

where $F_{f}$ is the learned transmission prediction mapping function of the fine-scale network.

To further preserve the structure and texture of the predicted fine transmission, we add a structural similarity index (SSIM) loss [30] to the fine-scale network. Firstly, the SSIM value for every pixel between the predicted fine transmission map and the ground truth is calculated as

$$\mathrm{SSIM}(x) = \frac{(2\mu_{P}\mu_{G} + c_{1})(2\sigma_{PG} + c_{2})}{(\mu_{P}^{2} + \mu_{G}^{2} + c_{1})(\sigma_{P}^{2} + \sigma_{G}^{2} + c_{2})}, \qquad (5)$$

where $P$ and $G$ are the corresponding image patches, centered at pixel $x$, in the fine transmission map and the ground truth, respectively. Above, $\mu_{P}$ and $\sigma_{P}$ are the mean and standard deviation of $P$, $\mu_{G}$ and $\sigma_{G}$ are the mean and standard deviation of $G$, and $\sigma_{PG}$ is the covariance between $P$ and $G$. Following the default parameters of the SSIM loss, we set the values of $c_{1}$ and $c_{2}$ to 0.02 and 0.03. Using (5), the SSIM loss between the predicted fine transmission map and the ground truth transmission map is expressed as

$$\mathcal{L}_{\mathrm{SSIM}} = \frac{1}{M}\sum_{x=1}^{M}\big(1 - \mathrm{SSIM}(x)\big), \qquad (6)$$

where $M$ is the dimension of the transmission map.

The final loss function for the transmission prediction subnetwork is the linear combination of the above-introduced losses:

$$\mathcal{L}_{t} = \lambda_{1}\mathcal{L}^{c}_{\mathrm{MSE}} + \lambda_{2}\mathcal{L}^{f}_{\mathrm{MSE}} + \lambda_{3}\mathcal{L}_{\mathrm{SSIM}}. \qquad (7)$$

The blending weights $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are picked empirically based on preliminary experiments on the training data.
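A hedged TensorFlow sketch of the combined objective in (3)-(7) is shown below. Note that tf.image.ssim computes a windowed SSIM rather than the exact per-pixel patch statistics of (5), and the blending weights are placeholders, since the paper does not list its empirical values here.

```python
import tensorflow as tf

def transmission_loss(t_coarse, t_fine, t_gt, w=(1.0, 1.0, 1.0)):
    """Sketch of Eq. (7): weighted sum of coarse MSE, fine MSE, and
    fine-scale SSIM losses. The weights w are placeholder values."""
    mse_coarse = tf.reduce_mean(tf.square(t_coarse - t_gt))   # Eq. (3)
    mse_fine = tf.reduce_mean(tf.square(t_fine - t_gt))       # Eq. (4)
    # Eqs. (5)-(6): SSIM loss; k1/k2 follow the paper (0.02, 0.03).
    ssim = tf.image.ssim(t_fine, t_gt, max_val=1.0, k1=0.02, k2=0.03)
    ssim_loss = tf.reduce_mean(1.0 - ssim)
    return w[0] * mse_coarse + w[1] * mse_fine + w[2] * ssim_loss
```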

III-C Haze Removal Subnetwork

Transmission represents the percentage of the scene radiance reaching the camera, and the transmission map indicates the haze concentration in the input hazy image, which is a significant clue for haze removal. Different from previous methods that estimate the transmission map and then directly reconstruct the latent result under its guidance, we feed the predicted fine transmission map to the haze removal subnetwork as an additional feature map in order to automatically reveal the underlying correlations among the hazy image, the transmission map, and the haze-free image.

For the haze removal subnetwork, we learn the residual between the original hazy image and the corresponding haze-free image. Residual learning makes end-to-end training easier and more effective because, when the desired mapping is close to identity, it may simply drive the weights of multiple nonlinear layers toward zero [28]. Although the concept of predicting residuals has been used in previous methods [29, 26, 25, 24], it has not been studied in the context of learning based haze removal. Furthermore, we boost it by stacking early layers at the end of each block, which strengthens feature propagation and alleviates the vanishing-gradient problem [27]. The haze removal subnetwork architecture can be found in Figure 2.
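The sketch below illustrates this residual design in Keras: the predicted fine transmission map enters as a fourth input channel, each block re-stacks its early features at its end, and the network output is added back to the hazy input. The 3×3 kernels and the block count are our assumptions; the 32 feature maps per layer and the absence of batch normalization follow the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_haze_removal_net(h=240, w=320, n_blocks=3):
    """Sketch of the residual haze removal subnetwork."""
    hazy = layers.Input(shape=(h, w, 3))
    t_fine = layers.Input(shape=(h, w, 1))
    # The transmission map is fed in as an additional feature map.
    x = layers.Concatenate()([hazy, t_fine])

    for _ in range(n_blocks):
        early = layers.Conv2D(32, 3, padding='same', activation='relu')(x)
        mid = layers.Conv2D(32, 3, padding='same', activation='relu')(early)
        # Stack the early features at the end of the block ([27]-style).
        x = layers.Concatenate()([early, mid])

    residual = layers.Conv2D(3, 3, padding='same')(x)   # last layer: 3 maps
    dehazed = layers.Add()([hazy, residual])            # J = I + R(I)
    return tf.keras.Model([hazy, t_fine], dehazed)
```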

III-C1 Loss functions of the haze removal subnetwork

To learn the residual mapping, we add the estimated residual to the input hazy image and minimize the MSE loss

$$\mathcal{L}^{r}_{\mathrm{MSE}} = \frac{1}{NK}\sum_{i=1}^{N}\left\| \big(I_{i} + R(I_{i})\big) - J_{i} \right\|_{2}^{2}, \qquad (8)$$

where $R$ is the learned residual mapping function, $J_{i}$ is the haze-free ground truth of the hazy image $I_{i}$, and $K$ is the dimension of the input image. However, we note that directly optimizing the MSE loss tends to introduce artifacts and fake boundaries. Thus, similar to our formulation in (5) and (6), we also compute the SSIM loss ($\mathcal{L}^{r}_{\mathrm{SSIM}}$) between the dehazed result and the ground truth. The final loss function for the haze removal subnetwork is the linear combination of the aforementioned losses:

$$\mathcal{L}_{h} = \lambda_{4}\mathcal{L}^{r}_{\mathrm{MSE}} + \lambda_{5}\mathcal{L}^{r}_{\mathrm{SSIM}}. \qquad (9)$$

Our haze removal network is a fully convolutional network for computational efficiency, and it does not include batch normalization layers so as not to discard any useful image details.

III-D Refinement Subnetwork

The combination of the transmission prediction subnetwork and the haze removal subnetwork can yield a clear image; however, this dehazed output might have an imprecise color range and contrast, in particular for outdoor data. The reason is that the dehazing subnetworks are trained with synthetic hazy images generated from an indoor RGB-D dataset.

To remedy this problem, we design a refinement subnetwork to enhance the contrast and color of the dehazed output based on weakly supervised learning. The refinement network is inspired by weakly supervised learning models [31, 32, 33] which aim at capturing special characteristics of a given image collection and modeling how these characteristics could be translated onto another image collection.

Our goal is to learn a mapping function from a source domain (i.e., low quality images with monotonous color and blurred details) to a target domain (i.e., high quality images with vivid color and clear details). Following the GAN concept, the refinement subnetwork consists of a generator G and a discriminator D. The task of G is to refine the given low quality image so as to trick D into accepting its outputs (i.e., refined images) as high quality images, while D attempts to distinguish whether the refined image is authentic or not. Note that, unlike [34, 35, 36], which generate novel images, our network only enhances its input.

We use a fully convolutional network as the generator, in which we incorporate nested shortcut connections (i.e., skip connections) across the symmetric layers to address the vanishing gradient problem, since shortcut connections can effectively propagate gradients during back propagation. All convolutional layers in the generator network are followed by batch normalization and a ReLU activation function, except for the last one, where a scaled hyperbolic tangent is applied to the output. We adopt the CNN based architecture proposed in [35] as our discriminator due to its simplicity and effectiveness. The refinement subnetwork architecture can be found in Figure 2.
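A minimal Keras sketch of the generator and discriminator is given below. The symmetric skip connections, batch normalization + ReLU, and scaled tanh output follow the description above; the kernel sizes, layer depth, and the PatchGAN-style discriminator (standing in for the CNN of [35]) are our assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(h=None, w=None, depth=3):
    """Sketch of the refinement generator: fully convolutional, with
    skip connections across symmetric layers."""
    x_in = layers.Input(shape=(h, w, 3))
    skips, x = [], x_in
    for _ in range(depth):                       # first half
        x = layers.Conv2D(32, 3, padding='same')(x)
        x = layers.ReLU()(layers.BatchNormalization()(x))
        skips.append(x)
    for s in reversed(skips):                    # symmetric half with skips
        x = layers.Conv2D(32, 3, padding='same')(x)
        x = layers.ReLU()(layers.BatchNormalization()(x))
        x = layers.Add()([x, s])
    out = layers.Conv2D(3, 3, padding='same', activation='tanh')(x)
    out = layers.Lambda(lambda t: 0.5 * t + 0.5)(out)   # scale tanh to [0, 1]
    return tf.keras.Model(x_in, out)

def build_discriminator(h=256, w=256):
    """PatchGAN-style discriminator, a stand-in for the CNN of [35]."""
    x_in = layers.Input(shape=(h, w, 3))
    x = x_in
    for f in (64, 128, 256):
        x = layers.Conv2D(f, 4, strides=2, padding='same')(x)
        x = layers.LeakyReLU(0.2)(x)
    logit = layers.Conv2D(1, 4, padding='same')(x)      # real/fake logits
    return tf.keras.Model(x_in, logit)
```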

III-D1 Loss functions of the refinement subnetwork

For the generator mapping function $G: X \rightarrow Y$ and the discriminator $D$, the adversarial loss is expressed as

$$\mathcal{L}_{\mathrm{adv}} = \mathbb{E}_{y \sim p(Y)}\big[\log D(y)\big] + \mathbb{E}_{x \sim p(X)}\big[\log\big(1 - D(G(x))\big)\big], \qquad (10)$$

where the source domain $X$ contains the dehazed outputs $H(I)$ of the learned haze removal mapping function $H$, and the target domain $Y$ contains high quality images. $G$ tries to generate images $G(x)$ that look similar to images from domain $Y$, while $D$ aims to distinguish between the refined images $G(x)$ and real samples $y$. $G$ tries to minimize this loss against an adversarial $D$ that tries to maximize it. Besides, we expect our refined result to keep the content and structure of the dehazed result. Thus, we include MSE and SSIM losses in the refinement subnetwork optimization. The MSE loss is expressed as

$$\mathcal{L}^{g}_{\mathrm{MSE}} = \frac{1}{NK}\sum_{i=1}^{N}\left\| G\big(H(I_{i})\big) - H(I_{i}) \right\|_{2}^{2}, \qquad (11)$$

where $N$ is the number of images in each batch, $H$ is the learned haze removal mapping function, $G$ is the learned mapping function of the generator network, $I_{i}$ is the input hazy image, and $K$ is the dimension of the input image. The calculation of the SSIM loss ($\mathcal{L}^{g}_{\mathrm{SSIM}}$) is similar to (5) and (6). The total loss for the refinement subnetwork is expressed as

$$\mathcal{L}_{g} = \lambda_{6}\mathcal{L}_{\mathrm{adv}} + \lambda_{7}\mathcal{L}^{g}_{\mathrm{MSE}} + \lambda_{8}\mathcal{L}^{g}_{\mathrm{SSIM}}. \qquad (12)$$

The weights $\lambda_{6}$, $\lambda_{7}$, and $\lambda_{8}$ are fine-tuned on the training data.
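The sketch below shows one alternating refinement update implementing (10)-(12) with TensorFlow's GradientTape. The optimizer settings follow Section III-E2; the loss weights are placeholders, since the paper only states that they are fine-tuned on the training data.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
opt_g = tf.keras.optimizers.Adam(2e-4, beta_1=0.9)   # lr/momentum per paper
opt_d = tf.keras.optimizers.Adam(2e-4, beta_1=0.9)

@tf.function
def refine_step(dehazed, high_quality, G, D,
                w_adv=1.0, w_mse=1.0, w_ssim=1.0):
    """One alternating G/D update for Eqs. (10)-(12)."""
    with tf.GradientTape() as tg, tf.GradientTape() as td:
        refined = G(dehazed, training=True)
        d_real = D(high_quality, training=True)
        d_fake = D(refined, training=True)
        # Eq. (10): D separates real high-quality images from refined ones.
        d_loss = (bce(tf.ones_like(d_real), d_real)
                  + bce(tf.zeros_like(d_fake), d_fake))
        adv = bce(tf.ones_like(d_fake), d_fake)
        # Eq. (11) + SSIM: keep the content/structure of the dehazed input.
        mse = tf.reduce_mean(tf.square(refined - dehazed))
        ssim = tf.reduce_mean(1.0 - tf.image.ssim(refined, dehazed, 1.0))
        g_loss = w_adv * adv + w_mse * mse + w_ssim * ssim   # Eq. (12)
    opt_g.apply_gradients(zip(tg.gradient(g_loss, G.trainable_variables),
                              G.trainable_variables))
    opt_d.apply_gradients(zip(td.gradient(d_loss, D.trainable_variables),
                              D.trainable_variables))
    return g_loss, d_loss
```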

III-E Training and Implementation Details

III-E1 DR-Net Training

Instead of training the transmission prediction subnetwork first and then using its transmission output to train the haze removal subnetwork, we optimize these two subnetworks at the same time. To stabilize training, we adopt a stage-wise learning scheme that separately optimizes the dehazing subnetworks (i.e., the transmission prediction and haze removal subnetworks together) and the refinement subnetwork.

To train the dehazing subnetworks, we first synthesize a hazy image dataset according to (1) and (2) using RGB-D images from the NYU Depth dataset [18]. Here, we assume that every channel of an image has the same global atmospheric light value and transmission value. Then, we randomly select the global atmospheric light from [0.7, 1.0] and vary the atmospheric attenuation coefficient from 0.6 to 1.6. We divide the NYU Depth dataset into two parts: a training part with 1299 RGB-D images and a validation part with 150 RGB-D images. For each RGB-D image, we randomly select 20 global atmospheric light and atmospheric attenuation coefficient values to synthesize 20 hazy images and 20 corresponding transmission maps. Last, we obtain a training set of 25,980 samples (1299 × 20) and a validation set of 3,000 samples (150 × 20). These synthetic samples cover hazy images with different haze concentrations and light intensities, along with the corresponding transmission maps. We resize all samples to a fixed resolution.
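A minimal sketch of this synthesis pipeline is shown below, reusing the synthesize_haze function from Section III-A. The per-image loop and the sampling ranges follow the text; the assumption that depth is pre-normalized so that β in [0.6, 1.6] spans reasonable haze levels is ours.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed for reproducibility

def make_training_pairs(J, depth, n_per_image=20):
    """For one NYU Depth RGB-D pair, draw 20 random (A, beta)
    combinations and render hazy images plus matching transmission
    maps via Eqs. (1)-(2)."""
    samples = []
    for _ in range(n_per_image):
        A = rng.uniform(0.7, 1.0)        # global atmospheric light
        beta = rng.uniform(0.6, 1.6)     # attenuation coefficient
        I, t = synthesize_haze(J, depth, A=A, beta=beta)
        samples.append((I, t))
    return samples
```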

For the refinement subnetwork training, we first select 3000 high quality images from the CUHK Photo Quality Dataset [37]. Next, we download 3000 hazy images from the Internet and process them using our dehazing subnetworks. The dehazed results are used as the low quality images. With these 3000 image pairs, we update the generator G and the discriminator D in an alternating manner. For the generator, we first train it using the MSE and SSIM losses, which preserve the content and structure of the original images. After convergence, we add the adversarial loss.

III-E2 Implementation

DR-Net was implemented on a computer with an Nvidia GTX 1080Ti GPU, an Intel i7 4.0 GHz CPU, and 32 GB of RAM using the TensorFlow framework. We trained DR-Net using ADAM [38], setting the learning rate to 0.0002 and the momentum to 0.9 for all three subnetworks. The batch sizes for the dehazing subnetworks and the refinement subnetwork were 16 and 8, respectively. It took 10 hours to optimize the entire DR-Net (about 6 hours for the dehazing subnetworks and about 4 hours for the refinement subnetwork). Note that the optimization of DR-Net is fast for two reasons: 1) the transmission map used as an additional feature map accelerates training convergence; and 2) the architecture of DR-Net is simple and efficient. The processing times of the transmission prediction subnetwork, haze removal subnetwork, and refinement subnetwork for a test image are 0.04 s, 0.06 s, and 0.15 s, respectively (i.e., 0.25 s in total) on the above-mentioned machine. In contrast to previous methods that rely on guided filtering to suppress artifacts, our method directly produces clear results without any post-processing, which significantly reduces the processing time and makes DR-Net fast enough for practical use.

III-E3 Parameter settings

Transmission prediction subnetwork: For the coarse-scale network, the first to fourth layers consist of 16, 16, 16, and 1 filters, respectively. The fine-scale network uses the same configuration: its first to fourth layers also consist of 16, 16, 16, and 1 filters, respectively.

Haze removal subnetwork: A single filter size is used throughout. The number of feature maps is the same for each convolutional layer (i.e., 32), except for the last layer (i.e., 3).

Refinement subnetwork: For the generator network, all convolutional layers have 32 filters, except for the last one.

In our architecture, we pad zeros in all convolution layers to preserve the image size. Additionally, DR-Net is a fully convolutional network, which can be seamlessly embedded into other deep network architectures.

We also investigated the impact of kernel size, filter number, and network depth on DR-Net. Consistent with the conclusions for other low-level vision networks, larger kernels, more filters, and deeper networks can improve performance to some extent at the cost of running time. DR-Net generates sufficiently good results with the above-mentioned parameters, and its complexity and computation time can be reduced further as needed.

IV Experiments

To evaluate our DR-Net, we use both synthetic and real data and compare with several recent state-of-the-art single image dehazing methods: Meng et al. [12], Cai et al. [19], Ren et al. [20], and Li et al. [21]. Among these, Cai et al. [19], Ren et al. [20], and Li et al. [21] are deep learning based methods similar to ours. To further demonstrate the performance of DR-Net, we also present the results of our dehazing subnetworks. The outputs of the dehazing subnetworks and the refinement subnetwork are denoted as ID and IR, respectively. More results are provided in the supplementary material.

To validate the generalization and utility of our method, we conduct experiments on challenging data and illustrate the potential usage of DR-Net in keypoint matching and object localization as well. Finally, we analyze the effects of the transmission prediction subnetwork.

We do not compare processing times since the different methods are implemented on different platforms, with or without GPU acceleration.

IV-A Experiments on Synthetic Data

(a) Hazy images (b) [12] (c) [19] (d) [20] (e) [21] (f) ID (g) IR (h) GT
Fig. 3: Results on synthetic hazy images. The details are amplified in the red boxes.

Figure 3 presents sample results for several synthesized hazy images. It is visible that Meng et al. [12] tends to over-enhance the inputs and introduces color deviation (e.g., the color of the door), while Cai et al. [19], Ren et al. [20], and Li et al. [21] leave haze in their results. In comparison, the results of our DR-Net (ID and IR) are superior to all the others and much closer to the ground truth.

In Table I, we show the average values of different metrics on the testing set. The values in bold represent the best results. Since the ground truth images are known for the synthetic data, we use the MSE, PSNR (dB), and SSIM metrics for quantitative evaluation. A higher SSIM value indicates a result closer to the ground truth in terms of structural properties, while a higher PSNR (lower MSE) indicates similarity in terms of pixel-wise values.

As visible, our ID and IR achieve the best and second-best MSE, PSNR, and SSIM values. The reasons include: 1) ID directly minimizes the reconstruction loss, avoiding the accumulated errors of separate parameter estimation; and 2) both SSIM and MSE losses are used in the optimization process.

Metrics    [12]   [19]   [20]   [21]   ID    IR
MSE        2053   655    1381   1048   477   526
PSNR (dB)  15.0   19.8   16.7   17.9   21.3  20.9
SSIM       0.78   0.87   0.80   0.81   0.91  0.89
TABLE I: Quantitative evaluation on the synthetic testing set.

IV-B Experiments on Real Data

To demonstrate the effectiveness of DR-Net on real data, we collected a test dataset of 30 hazy images downloaded from the Internet, covering a variety of haze levels, image contents, and light conditions. Figure 4 shows visual comparisons on several real hazy images.

(a) Hazy images (b) [12] (c) [19] (d) [20] (e) [21] (f) ID (g) IR
Fig. 4: Results on real hazy images. From top to bottom are ‘Tiananmen’, ‘wheat’, and ‘Dubai’.

In Figure 4, the results of Meng et al. [12], ID, and IR retain fewer haze artifacts. Nevertheless, Meng et al. [12] introduces obvious color deviation (e.g., the backgrounds of ‘Tiananmen’ and ‘Dubai’). The results of Cai et al. [19], Ren et al. [20], and Li et al. [21] retain haze (e.g., the background of ‘wheat’). Compared to the other methods, our IR produces the most competitive visual results, with high contrast, clear structure, plausible details, and vivid color.

Since ground truth images are not available for the real data, we conducted a user study to provide realistic feedback for quantitative evaluation. We invited 10 subjects with image processing experience to rank the visual quality. Subjects were allowed to zoom in and out at will without time restriction. The scores range from 1 (worst) to 10 (best). We collected the scores and present the average score of each method in Table II.

Metric [12] [19] [20] [21] ID IR
Scores 4.7 6.1 7.7 6.5 8.2 9.1
TABLE II: User study on 30 real hazy images.

As shown, our ID and IR receive the best scores, which indicates that, from a visual perspective, our method produces much better results on real hazy images. Meng et al. [12] achieves the worst score, which indicates that over-enhancement and color deviation significantly affect visual quality. Besides, we believe there is a gap between real and synthetic hazy images, which explains why IR ranks second in the quantitative comparisons on the synthetic testing set.

IV-C Experiments on Challenging Data

We expect dehazing methods to also generalize well to challenging data. Thus, we carry out an experiment on challenging data, including an image with heavy haze, a haze-free image, and an image with white objects. The visual comparisons are shown in Figure 5.

(a) Test images (b) [12] (c) [19] (d) [20] (e) [21] (f) ID (g) IR
Fig. 5: Results on challenging images. From top to bottom: ‘buildings’ with low light and heavy haze, ‘pizza’ without haze, and ‘canyon’ with white clouds.

In Figure 5, for the image ‘buildings’, the results of Ren et al. [20], ID, and IR retain less haze. Moreover, the buildings in IR have better details and brightness. The other methods have little effect on ‘buildings’ and even introduce a bluish color deviation; in general, it is difficult to recover images with heavy haze. For the image ‘pizza’, Meng et al. [12], Ren et al. [20], and Li et al. [21] produce over-enhancement (e.g., the color of the pizza); in practice, dehazing methods are expected to have little effect on haze-free images. For the image ‘canyon’, all methods are insensitive to the white clouds, but the result of Meng et al. [12] suffers from significant color deviation on the canyon. Although DR-Net is trained on synthesized data, it generalizes well to outdoor hazy images, even those captured in challenging scenes. Furthermore, unlike Ren et al. [20], which requires different gamma values to amend the transmission according to haze concentration, DR-Net needs no manual parameter tuning, which is desirable for practical applications.

IV-D Applications

To illustrate the potential usage of our DR-Net, we employ it on several computer vision tasks such as keypoint matching and object detection and recognition.

Keypoint matching is the basis of many important computer vision problems. We apply the SURF operator [39] to the original hazy image pair and to our refined image pair.

(a) Hazy image pair (b) Refined result pair
Fig. 6: An example of keypoint matching.

In Figure 6, the original pair yields 226 matches while our refined pair yields 344. DR-Net significantly increases the number of SURF keypoint matches.

Object detection and recognition have attracted much attention recently. We use Fast R-CNN [40] trained on the VOC 2007 dataset to validate the performance of DR-Net. In Figure 7, processing the image with DR-Net improves the accuracy of object detection and recognition. Additionally, due to the low quality of the JPEG-compressed input image, our method produces some artifacts; this phenomenon can also be observed in the results of other image dehazing methods. In future work, we will investigate how to reduce the effects induced by low quality inputs.

In short, experiments on applications provide additional evidence for the potential usage of DR-Net.

(a) Hazy image (b) Refined result
Fig. 7: An example of object detection and recognition.

IV-E Effects of Transmission Prediction Subnetwork

To analyze the effects of the transmission prediction subnetwork, we fix the default parameter settings and remove the transmission prediction subnetwork. Training this variant took about 15 hours, noticeably longer than the original DR-Net, which demonstrates that the predicted transmission map helps training convergence. Besides, we also observed that the predicted transmission map has positive effects on the results.

(a) (b) (c) (d) (e) (f)
Fig. 8: An example to illustrate the effects of transmission prediction subnetwork. (a) Hazy image. (b) Dehazed result from dehazing subnetworks without transmission prediction subnetwork. (c) Refined result of (b). (d) The predicted transmission by transmission prediction subnetwork. (e) Dehazed result from dehazing subnetworks with transmission prediction subnetwork. (f) Refined result of (e). (Best viewed on high-resolution display with zoom-in.)

In Figure 8, the dehazed and refined results from DR-Net with the transmission prediction subnetwork have superior visual quality compared to those without it (e.g., the regions of grassland and roof), which confirms the positive effect of the predicted transmission map.

V Conclusion

We have presented a deep learning model for single image dehazing, namely DR-Net, which includes three subnetworks. For haze removal, we propose a transmission prediction subnetwork and a haze removal subnetwork, which reconstruct a clear image steered by the transmission map. To improve the contrast and color of the dehazed result, we propose a refinement subnetwork based on weakly supervised image-to-image translation. Experimental results on synthetic and real data have demonstrated that DR-Net achieves state-of-the-art performance.

References