Transmission Map and Atmospheric Light Guided Iterative Updater Network for Single Image Dehazing

08/04/2020 · by Aupendu Kar, et al. · IIT Kharagpur

Hazy images obscure content visibility and hinder several subsequent computer vision tasks. For dehazing in a wide variety of hazy conditions, an end-to-end deep network jointly estimating the dehazed image along with suitable transmission map and atmospheric light for guidance could prove effective. To this end, we propose an Iterative Prior Updated Dehazing Network (IPUDN) based on a novel iterative update framework. We present a novel convolutional architecture to estimate channel-wise atmospheric light, which along with an estimated transmission map are used as priors for the dehazing network. Use of channel-wise atmospheric light allows our network to handle color casts in hazy images. In our IPUDN, the transmission map and atmospheric light estimates are updated iteratively using corresponding novel updater networks. The iterative mechanism is leveraged to gradually modify the estimates toward those appropriately representing the hazy condition. These updates occur jointly with the iterative estimation of the dehazed image using a convolutional neural network with LSTM driven recurrence, which introduces inter-iteration dependencies. Our approach is qualitatively and quantitatively found effective for synthetic and real-world hazy images depicting varied hazy conditions, and it outperforms the state-of-the-art. Thorough analyses of IPUDN through additional experiments and detailed ablation studies are also presented.


1 Introduction

An image captured in a hazy environment suffers from obscured visibility, reduced contrast, color cast and many other degradations due to the scattering and absorption of light by fog, aerosols, sand and mist present in the atmosphere [20, 7, 53]. Such distorted images hinder the performance of several computer vision tasks related to computational photography, automatic driving systems, surveillance, and many more. Therefore, dehazing in such cases is essential for producing images of good perceptual quality and for improving the performance of subsequent computer vision tasks on them [32].

(a) Training Stage 1
(b) Training Stage 2
(c) Training Stage 3 (end-to-end network)
Fig. 1: A schematic diagram of our proposed IPUDN training framework for end-to-end dehazing.

Although initial works on dehazing considered multiple images of the same scene [42, 43, 41], lately, single image dehazing has gained popularity, which aims at producing a dehazed image from the single hazy image at hand. While many recent single image dehazing techniques like [7, 46, 27, 25, 1, 8, 38, 69, 11, 3, 59, 17, 20, 37, 18] are based on estimating transmission map and atmospheric light using various priors, quite a few attempts have been made at end-to-end single image dehazing [14, 67, 19, 34, 45, 10, 47, 15, 40, 52, 66, 51, 50, 31, 9].

Accurate estimation of the transmission map helps in proper dehazing by reversing the effects of absorption and scattering [67]. Several approaches estimate the transmission map using different priors such as the dark channel prior (DCP) [20], the color attenuation prior, haze-lines, etc. [7, 46, 27, 25, 1, 8, 38, 69, 11, 3, 59, 17, 20, 37, 18]. Further, accurate atmospheric light estimation is also crucial for recovering the appropriate illumination condition during dehazing. Many techniques estimate the atmospheric light from the bright pixels of the DCP [20]. However, there are a few dehazing approaches that use deep neural networks to estimate both the transmission map and the atmospheric light directly from the hazy image [9, 66].

End-to-end deep neural network based frameworks for single image dehazing have been proposed lately that do not perform explicit transmission map and atmospheric light estimations [14, 67, 19, 34, 45, 10, 47, 15, 40, 51, 50, 31]. They do so to avoid sub-optimal restoration due to the estimation of transmission map and atmospheric light disjoint from the dehazing system [31].

(a) Hazy Image
(b) EPix2Pix [47]
(c) MSBDN [14]
(d) Our IPUDN
Fig. 2: Subjective comparison of our result on a real hazy image [46] with the state-of-the-art EPix2Pix [47] and MSBDN [14] methods for dehazing. Cropped regions in boxes are for detail inspection.

However, guidance by appropriate transmission map can help in effective dehazing for a wide range of haze density, as a transmission map essentially provides the amount of haze at image pixels as a function of the scene depth. Guidance by accurate atmospheric light can also prove to be useful as an atmospheric light represents the illumination associated with haze. Further, guidance by proper channel-wise atmospheric light could help in handling color distortions due to haze. Hence, a synergistic use of transmission map and atmospheric light estimation within an end-to-end deep learning framework, where dehazing and the said estimations are carried out jointly, might produce high quality, visibility enhanced and visually pleasing dehazed images.

In this paper, we propose an end-to-end long short term memory (LSTM) based iterative dehazing framework that includes judiciously integrated transmission map and atmospheric light updater networks. The framework uses initial estimates of transmission map and atmospheric light from dedicated estimator networks as priors, which are iteratively updated by the updater networks during dehazing. A schematic representation of our approach, Iterative Prior Updated Dehazing Network (IPUDN), is shown in Fig. 1. Our system handles a wide variety of hazy conditions, ranging from low to high density, with or without color cast. As can be seen, the training of our dehazing system is carried out in three stages.

In the first stage, we train a densely connected encoder-decoder based network to obtain an initial estimate of the transmission map. A novel convolutional neural network with a global max-pooling block is also trained to get an initial estimate of the atmospheric light. Unlike existing approaches, we perform estimation of channel-wise atmospheric light without considering any specific prior. Apart from appropriate haze illumination estimation, use of channel-wise atmospheric light allows us to handle color cast when it is present.

In the second stage, we consider a novel iterative single image dehazing framework, where the iteration is aimed at handling a wide range of haze density in a graded manner reducing distortion. The initial estimates of the transmission map and atmospheric light are fed into our LSTM based iterative dehazing framework as priors, where separate updater networks are trained to update the transmission map and atmospheric light estimates. Intermediate dehazed output and updates of the said estimates are generated in each iteration, which are used in the next iteration along with the initial estimates of the transmission map and atmospheric light to achieve dehazing in a progressive manner.

As evident, in the above two stages, the transmission map and atmospheric light estimators, and the dehazing framework are trained separately with different objective functions. To achieve the best performance from our entire dehazing system, in the third and final stage, the two estimators and the dehazing framework are fine-tuned together (jointly) introducing dependencies between the three networks.

We provide details of our implementation along with comprehensive analysis including ablation study of the models and parameters in our system. The proposed approach is extensively evaluated on standard and recent datasets, compared qualitatively and quantitatively with the state-of-the-art, and shown to have utility in a few applications.

Our proposal of using a separate dehazing framework with transmission map and atmospheric light updaters successfully handles the insufficiency in the initial estimates of the transmission map and atmospheric light, improving the performance significantly. Further, our separate dehazing network, which is guided by the transmission map and atmospheric light, is found to produce high quality dehazed images even in the presence of a wide range of haze densities and color casts. The vital role of the transmission map and atmospheric light in our dehazing approach is experimentally validated against an image-to-image mapping network performing dehazing. Our approach not only gives properly enhanced and visually realistic reconstructions for images of indoor and outdoor scenes with synthetic haze, but also performs similarly on a broad spectrum of natural hazy images.

To summarize, the main contributions of our paper are as follows:

  1. We propose a recurrent convolutional neural network based novel iterative framework that progressively dehazes images, and is effective in different types of hazy conditions.

  2. We introduce novel updater networks that update the initial estimates of the transmission map and atmospheric light in each iterative step.

  3. We propose a novel convolutional neural network based architecture to obtain an initial estimate of channel-wise atmospheric light, which helps in handling color cast when present.

  4. We discuss and experimentally validate the importance of using transmission map and atmospheric light estimation, their updating, and the significance of progressive dehazing using a separate novel iterative dehazing framework.

  5. Our proposed mechanism achieves a substantial average PSNR improvement (in dB) over the state-of-the-art, and the performance boost becomes more prominent with increasing haze density.

The rest of the paper is organized as follows. Section 2 discusses the related work. Section 3 elaborately describes our proposed image dehazing approach, IPUDN. Section 4 presents the results of the extensive experiments, and the qualitative and quantitative comparisons of our approach with the state-of-the-art. Section 5 presents various analyses and ablation studies related to our proposed approach, along with additional experimental results. Finally, Section 6 concludes the paper discussing the future scope. Our project web-page is available at aupendu.github.io/iterative-dehaze.

2 Related Work

Earlier dehazing techniques involved depth map estimation from multiple images [43, 41, 42]. However, due to the constraints involved in capturing and restoring multiple images of the same scene, recent attention in the domain is on single image dehazing. We categorize single image dehazing techniques into hand-crafted prior based and learning-based data-driven solutions.

2.1 Handcrafted Prior based Image Dehazing

Single image dehazing is an ill-posed problem in computer vision. Different astutely considered priors or assumptions have been used to solve this problem. Tan et al. [59] performed dehazing by maximizing contrast. Fattal et al. [17] proposed a dehazing technique based on the relationship between surface shading and the transmission map. He et al. [20] proposed the dark channel prior (DCP), which is the most popular prior used in image dehazing. DCP is based on the observation that local regions in natural non-hazy scenes have very low intensity in at least one of their color channels. As haze increases, the pixel values in the dark channel increase. Thus, the value in the dark channel is a measure of haze, which is used for the estimation of the transmission map. Several techniques refined the DCP for halo-free transmission map estimation using different edge-preserving smoothing filters [36, 21, 37, 38]. Later, many priors have been proposed, such as color-lines [44, 18], the color attenuation prior [69], the color ellipsoidal prior [8], the gamma correction prior [25], and haze-lines [7], which performed well in image dehazing. Color-lines in the RGB space of non-hazy natural images pass through the origin [44], whereas Fattal [18] showed that these color-lines deviate from the origin for hazy images. This deviation of color-lines from the origin due to the presence of haze is exploited to propose dehazing based on the color-lines prior. The color attenuation prior [69] is based on the assumption that with the increase in haze, the brightness of the image increases but its saturation decreases; based on this assumption, a linear model is proposed for image dehazing. In [8], the authors fit the hazy pixel clusters of the RGB space in a color ellipsoid and then calculate a prior vector to estimate the transmission map using color ellipsoid geometry for dehazing. Ju et al. [25] propose a gamma correction prior based image dehazing model which generates a gamma corrected image from the input hazy image; both images are then used to compute the scene depth for performing dehazing. The haze-lines prior [7] is based on the assumption that the colors in a non-hazy image can be represented using a few hundred distinct vectors. In the presence of haze, these colors form clusters along lines in RGB space, named haze-lines, which pass through the coordinate value corresponding to the atmospheric light. For a hazy image, based on the predicted atmospheric light, the haze-lines are computed to estimate the transmission map for dehazing. Apart from transmission map estimation, atmospheric light estimation is another essential task for good quality dehazing [20, 1]. The most popular way of estimating atmospheric light is by averaging the top 0.1% brightest pixels in the DCP [20].
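To make this widely used estimate concrete, the following is a minimal NumPy/SciPy sketch of DCP-based atmospheric light estimation; it assumes an RGB image with values in [0, 1], and the patch size and selection fraction are common illustrative choices rather than values prescribed by any particular method.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel of an RGB image in [0, 1]: per-pixel minimum over the color
    channels followed by a local minimum filter (a simple sketch of the DCP [20])."""
    return minimum_filter(img.min(axis=2), size=patch)

def estimate_atmospheric_light(img, patch=15, frac=0.001):
    """Average the hazy-image pixels at the brightest 0.1% of dark-channel locations."""
    dc = dark_channel(img, patch)
    n = max(1, int(frac * dc.size))
    rows, cols = np.unravel_index(np.argsort(dc, axis=None)[-n:], dc.shape)
    return img[rows, cols].mean(axis=0)   # one atmospheric light value per color channel
```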

Hazy images are prone to color cast issues in the presence of different atmospheric conditions like sandstorms [23, 13]. Huang et al. propose a dehazing technique which handles color cast due to sandstorms. Ancuti et al. [3] introduced white balancing in image dehazing for handling color cast. Choi et al. [11] suggested a similar approach, where they introduced a haze density weight along with white balancing for haze-aware image dehazing. Peng et al. [46] proposed image dehazing through saturation correction that handles color cast. Recently, Kim et al. [27] proposed saturation-based transmission map estimation for dehazing, which performs color correction using a white balancing approach. The authors mention that such an approach is useful for hazy images with fine or yellow casts.

2.2 Learning based Image Dehazing

The unprecedented success of deep convolutional neural networks (CNNs) in different computer vision tasks motivated the community to apply them to image dehazing. Cai et al. [9] introduced DehazeNet to estimate the transmission map for image dehazing. Li et al. [31] introduced AODNet, an end-to-end CNN model on a re-formulated atmospheric model to generate a dehazed image. Zhang et al. [66] proposed DCPDN, a dense encoder-decoder model for estimation of the transmission map and atmospheric light, and then followed the atmospheric model expression shown in Equation (1) to produce the haze-free image. Qu et al. [47] proposed a generative adversarial network (GAN) based Pix2pix network that considers image-to-image translation over the atmospheric model expression for dehazing. Park et al. [45] propose a GAN based model comprising a CycleGAN and a conditional GAN for texture-aware image dehazing. Chen et al. [10] showed how patch size affects the quality of DCP based image dehazing and proposed an efficient patch map selection mechanism through a CNN for image dehazing. Li et al. [34] exploit the idea of DCP along with a gradient prior to propose a semi-supervised deep model to dehaze images. Golts et al. [19] used a DCP loss for training an unsupervised deep network for image dehazing. Liu et al. [40] introduced GridDNet, an end-to-end deep learning architecture with pre- and post-processing blocks for the same. Ren et al. [50] used multi-scale CNNs for image dehazing. Later, Ren et al. [51] proposed a multi-scale gated fusion network using an encoder-decoder architecture that produces haze-free images. Dudhane et al. [15] proposed RYF-Net, which includes RNet and YNet for transmission map estimation in RGB and YCbCr space, and FNet, which fuses the generated transmission maps for image dehazing. Wang et al. [60] showed that an image's Y channel in YCbCr space is influenced to a greater extent by atmospheric illumination in hazy weather than the chrominance channels; based on this, they proposed the multiscale CNN based AIPNet for image dehazing. Zhang et al. [67] proposed the fast multi-scale dehazing model FAMED-Net, which fuses the responses from a three-scale encoder to perform dehazing. Li et al. [35] proposed a method to learn different levels of haze and developed an integration strategy to obtain the final dehazed output. Recently, Dong et al. [14] proposed a multi-scale deep network which works on a strengthen-operate-subtract boosting strategy for image dehazing. Santra et al. [52] proposed a patch quality comparator for transmission map estimation for image dehazing; DCP based atmospheric light computation was used, and it was shown that their patch selection through the patch quality comparator could handle color cast.

3 Proposed Iterative Prior Updated Dehazing Network (IPUDN)

We divide our learning-based framework into three parts: the transmission map estimator, the atmospheric light estimator, and the dehazing architecture. We discuss the need for the three separate models in Section 3.1. Later, we describe the transmission map and atmospheric light estimator models in Sections 3.2 and 3.3, respectively. In Section 3.4, we present our recurrent convolutional neural network based dehazing architecture with iterative update strategies for the transmission map and atmospheric light.

3.1 Haze Formation Model and Motivation

In the dehazing literature, many image dehazing approaches follow the atmospheric scattering model of haze formation [29]:

I^c(x) = J^c(x) t(x) + A^c (1 - t(x)),    (1)

where I^c(x) is the value at pixel x in channel c of the hazy image, J^c(x) is the corresponding scene radiance, A^c is the atmospheric light, and t(x) is the transmission map. t(x) is depth-dependent and is defined as t(x) = e^{-β d(x)}, where β is the attenuation coefficient, which is related to haze density, and d(x) is the distance between the camera and the scene. Therefore, generating the dehazed image from the hazy image requires depth-dependent estimation of t(x) along with A^c.

Many papers in the literature estimate the transmission map and the atmospheric light separately, and perform the dehazing operation as follows:

J^c(x) = (I^c(x) - A^c) / t(x) + A^c.    (2)

Estimation of t(x) and A^c using separate objective functions rather than jointly with the dehazed output estimation may result in sub-optimal image reconstruction [31]. Even when t(x) and A^c are estimated using the reconstruction loss of the dehazed output through Equation (2), sub-optimal image reconstruction may take place, as the reconstruction loss will essentially represent the aggregate of the individual losses in the estimated t(x) and A^c.
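For concreteness, the following is a small NumPy sketch of haze synthesis with Equation (1) and of the direct inversion in Equation (2); the attenuation coefficient, atmospheric light values and the clamping constant are illustrative assumptions.

```python
import numpy as np

def add_haze(J, d, beta=1.0, A=(0.8, 0.8, 0.8)):
    """Synthesize a hazy image via Eq. (1) from scene radiance J (HxWx3, values in [0, 1])
    and a depth map d (HxW). beta and A are illustrative choices."""
    t = np.exp(-beta * d)[..., None]              # transmission map t(x) = exp(-beta * d(x))
    return J * t + np.asarray(A) * (1.0 - t)

def dehaze(I, t, A, t_min=0.1):
    """Invert the scattering model as in Eq. (2); t is clamped to avoid noise amplification."""
    t = np.clip(t, t_min, 1.0)[..., None]
    return np.clip((I - np.asarray(A)) / t + np.asarray(A), 0.0, 1.0)
```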

A few recently developed deep learning based methods consider end-to-end training for dehazing without estimating the transmission map and atmospheric light. A few such techniques [47, 51, 55] have been found to perform very well on images with a small amount of haze, but often do not work satisfactorily over a wide range of haze densities. The absence of guidance by an appropriate transmission map may be a pivotal reason for this observation, as a transmission map essentially provides the amount of haze at image pixels, and its use might make a model aware of the amount of dehazing required.

With the motivation to overcome the above issues, we use a separate deep dehazing network along with transmission map and atmospheric light estimation models. The dehazing network takes a hazy image along with its estimated transmission map and atmospheric light as inputs, and updates the two estimates iteratively. Guided by the updated transmission map and atmospheric light, the dehazing network reconstructs a high-quality dehazed image.

3.2 Transmission Map Estimation Network

The transmission map provides useful information about haze density, which helps in the proper dehazing of an image. We use the densely connected encoder-decoder network of [66] to estimate the transmission map. We train the transmission map estimation model using structural similarity (SSIM) [61] as the loss function instead of the mean-squared error (MSE) loss. It is shown in [66] that using the SSIM loss gives sharper edges retaining structural information, which leads to a reduction of halo artifacts, one of the main issues associated with image dehazing. Once the model is trained, we use it as a transmission map estimator while training the dehazing network and later during testing.
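As an illustration of this training choice, the following is a minimal sketch of an SSIM-based loss using uniform windows (the original SSIM formulation [61] uses Gaussian windows); the window size and stability constants are common defaults assumed here, not values taken from the actual training configuration.

```python
import torch
import torch.nn.functional as F

def ssim_loss(pred, target, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM with uniform windows; pred and target are B x 1 x H x W
    transmission maps with values in [0, 1]."""
    pad = window // 2
    mu_x = F.avg_pool2d(pred, window, stride=1, padding=pad)
    mu_y = F.avg_pool2d(target, window, stride=1, padding=pad)
    sigma_x = F.avg_pool2d(pred * pred, window, stride=1, padding=pad) - mu_x ** 2
    sigma_y = F.avg_pool2d(target * target, window, stride=1, padding=pad) - mu_y ** 2
    sigma_xy = F.avg_pool2d(pred * target, window, stride=1, padding=pad) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2))
    return 1.0 - ssim.mean()
```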

3.3 Atmospheric Light Estimation Network

Fig. 3: Atmospheric light estimation model to predict atmospheric light across color channels. The convolution blocks hierarchically extract regional contributions to atmospheric light, which is pooled globally to get the maximum contribution as the estimate.
(a) Dehazing Model using LSTM based recurrence
(b) Atmospheric Light Updater
(c) Transmission Map Updater
Fig. 4: Our proposed LSTM based iterative dehazing network with transmission map and atmospheric light updaters. Here, J_k, A_k and t_k are the dehazed output, updated atmospheric light and updated transmission map, respectively, at time step k, and h_k is the output from the LSTM at time step k. The inputs to the dehazing model are (I, A_0, t_0, J_{k-1}, A_{k-1}, t_{k-1}), those to the atmospheric light updater are (I, A_0, J_k, A_{k-1}), and those to the transmission map updater are (I, t_0, J_k, t_{k-1}), where I, A_0 and t_0 are the hazy input image, and the initial estimates of atmospheric light and transmission map, respectively. The dehazing model comprises a pivotal LSTM block, which introduces inter-time step dependencies, followed by consecutive residual blocks to enable intricate changes in image pixels. The updaters employ hierarchical feature extraction, with the atmospheric light updater using average pooling to aggregate all pixel-level contributions.

Atmospheric light is a critical factor for generating dehazed outputs with a proper lighting condition. Inaccurate estimation of atmospheric light may lead to under- or overexposed images with color distortions. For atmospheric light estimation, we propose a novel convolutional neural network architecture as shown in Fig. 3. As evident, we use sequentially stacked convolution layers, where each of them is followed by group normalization [62] and ReLU non-linearity. Max-pool layers help in reducing the spatial dimension in subsequent pairs of convolution blocks. Our model computes three atmospheric light values, one for each color channel. A larger max-pooling kernel reduces the effect of local factors like object color while estimating the atmospheric light, which is a single value for a color channel. We also use a global max-pooling at the end, which pools a single maximum intensity across the spatial dimension in a channel, as our target is to estimate the intensity of the ambient light in each image channel. The intuition behind the global max-pool is inspired by the idea of atmospheric light estimation using DCP, where emphasis is given to higher intensity pixels. To validate this choice, we performed experiments comparing global average-pooling to global max-pooling, as shown in Section 5.1.3, and found that global max-pooling gives far better estimates of atmospheric light. We train our atmospheric light estimator using the mean-squared error as the loss function. As the presence of color cast in a hazy image affects one or more color channels, estimating channel-wise atmospheric light facilitates color cast reduction.
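A minimal PyTorch sketch of such an estimator is shown below; the layer counts, channel widths, and pooling kernel sizes are illustrative assumptions and not the exact configuration of the network in Fig. 3.

```python
import torch
import torch.nn as nn

class AtmLightEstimator(nn.Module):
    """Channel-wise atmospheric light estimator sketch: stacked conv blocks with
    group normalization and ReLU, interleaved max-pooling, and a final global
    max-pool that yields one value per color channel."""
    def __init__(self, width=32):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, padding=1),
                nn.GroupNorm(8, cout),
                nn.ReLU(inplace=True))
        self.features = nn.Sequential(
            block(3, width), block(width, width),
            nn.MaxPool2d(kernel_size=4, stride=4),   # large pooling suppresses local object color
            block(width, width), block(width, width),
            nn.MaxPool2d(kernel_size=4, stride=4),
            nn.Conv2d(width, 3, 1))                  # one regional-contribution map per channel
    def forward(self, x):
        a = self.features(x)                          # B x 3 x h x w regional contributions
        return torch.amax(a, dim=(2, 3))              # global max-pool -> B x 3

# est = AtmLightEstimator(); A = est(torch.rand(1, 3, 256, 256))  # -> shape (1, 3)
```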

Each cell reports PSNR/SSIM/CIEDE2000; the first six columns are non-cast hazy images and the last column is color-cast hazy images.

Technique | Outdoor Low Haze (LSOT) | Outdoor Mid Haze (MSOT) | Outdoor High Haze (HSOT) | Indoor Low Haze (LSIT) | Indoor Mid Haze (MSIT) | Indoor High Haze (HSIT) | Color Cast Random Haze (SCHT)
PID [11] (TIP'15) | 14.29/0.7097/34.42 | 12.10/0.5255/43.22 | 10.80/0.4049/49.93 | 16.67/0.7763/34.76 | 12.16/0.5867/46.28 | 9.82/0.4109/56.75 | 8.96/0.1260/77.99
AOD-Net [31] (ICCV'17) | 15.00/0.7037/30.38 | 11.62/0.5097/42.56 | 10.05/0.3962/50.29 | 17.07/0.7942/27.55 | 11.09/0.5628/46.71 | 8.86/0.4170/57.32 | 9.62/0.1271/77.26
PQC [52] (TIP'18) | 18.54/0.8272/31.20 | 14.59/0.6793/40.90 | 11.44/0.5482/45.82 | 19.21/0.8513/26.36 | 13.85/0.7061/38.61 | 10.66/0.5746/49.07 | 12.68/0.3160/74.25
EPix2Pix [47] (CVPR'19) | 20.28/0.7729/42.01 | 17.10/0.5779/55.95 | 14.74/0.3921/63.14 | 21.28/0.8805/24.57 | 15.13/0.7531/37.60 | 12.27/0.6399/48.95 | 13.20/0.3206/77.90
GridDNet [40] (ICCV'19) | 18.56/0.8308/22.59 | 12.48/0.5976/37.96 | 9.07/0.4205/49.54 | 29.66/0.9826/8.58 | 21.61/0.8855/19.57 | 15.56/0.7258/33.98 | 9.47/0.1455/79.08
ALC [46] (TCSVT'20) | 19.45/0.8500/27.67 | 16.89/0.7438/36.33 | 14.56/0.6289/44.25 | 19.94/0.8725/22.59 | 16.27/0.7778/31.98 | 13.50/0.6763/42.38 | 12.26/0.3148/79.15
Haze-Lines [7] (TPAMI'20) | 16.59/0.7941/41.83 | 13.44/0.6350/58.06 | 11.16/0.4972/76.17 | 15.62/0.7303/45.88 | 11.75/0.6020/62.87 | 8.48/0.4493/79.11 | 11.05/0.2749/77.04
FSID [27] (TIP'20) | 17.83/0.8107/28.49 | 13.36/0.6720/39.12 | 10.69/0.5514/47.01 | 18.51/0.7900/29.63 | 13.90/0.7265/35.89 | 11.29/0.6317/43.20 | 10.80/0.3521/75.30
MSBDN [14] (CVPR'20) | 19.98/0.8585/20.33 | 13.63/0.6312/36.29 | 10.02/0.4545/49.66 | 31.13/0.9823/7.4944 | 23.05/0.9143/15.78 | 17.39/0.7837/27.56 | 9.76/0.1452/79.31
Ours IPUDN | 25.83/0.9430/22.44 | 24.43/0.9220/22.77 | 22.81/0.8822/26.76 | 30.02/0.9645/12.44 | 27.17/0.9401/16.04 | 23.94/0.8902/21.95 | 26.74/0.9157/28.92
TABLE I: Comparison of different state-of-the-art approaches with the proposed for single image dehazing on images from OTS Outdoor and NYU Indoor datasets with synthetic haze. (Best: Bold red highlight, Second best: Blue highlight)

3.4 Iterative Dehazing Network

In our dehazing network, we employ two main strategies. Firstly, we propose an iterative transmission map and atmospheric light updating strategy. Secondly, an LSTM based recurrent convolutional neural network is used to maintain inter-time step dependencies. Our whole updater based iterative dehazing network is shown in Fig. 4.

3.4.1 Recurrent Dehazing Formulation

We use a 6-layer residual network for dehazing as shown in Fig. 4(a). It consists of four main parts: (a) input feature extraction, f_in, (b) a recurrent layer, f_rec, (c) six consecutive residual blocks for higher-level feature extraction, f_res, and (d) an output layer for dehazed image reconstruction, f_out. Our dehazing network at time step k can be described mathematically as

J_k = f_out( f_res( f_rec( f_in(x_k), h_{k-1} ) ) ),    (3)

where x_k is the input to the dehazing network at time step k and h_{k-1} is the recurrent state from the previous time step.

We choose an LSTM as the recurrent block due to its empirical superiority over the gated recurrent unit (GRU) for our dehazing framework. The LSTM layer f_rec takes the features extracted by f_in in the present state and the previous recurrent state as inputs. The LSTM after the first convolution block helps to keep dependencies across consecutive time steps, enabling interaction between intermediate features from subsequent states. Unlike the conventional LSTM [22], motivated by [26, 58, 49], we use the entire model recursively in each time step, which reduces the required model size significantly. In our work, we use a convolutional LSTM [63] as shown in Equation (4). At time step k, the LSTM receives the features z_k = f_in(x_k) from the input feature extraction block and the recurrent state h_{k-1} from time step k-1. The LSTM computes an input gate i_k, a forget gate f_k, an output gate o_k, and a cell state c_k, and can be formulated as

i_k = σ(W_xi * z_k + W_hi * h_{k-1} + b_i)
f_k = σ(W_xf * z_k + W_hf * h_{k-1} + b_f)
o_k = σ(W_xo * z_k + W_ho * h_{k-1} + b_o)
g_k = tanh(W_xg * z_k + W_hg * h_{k-1} + b_g)
c_k = f_k ⊙ c_{k-1} + i_k ⊙ g_k
h_k = o_k ⊙ tanh(c_k)    (4)

where σ is the sigmoid function, tanh is the hyperbolic tangent function, ⊙ is element-wise multiplication, and * is the convolution operation.
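For illustration, a compact PyTorch sketch of such a convolutional LSTM cell is given below; the single fused gate convolution and the kernel size are implementation conveniences assumed here, not details taken from the actual implementation.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """A minimal convolutional LSTM cell following Eq. (4)."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        # one convolution producing all four gates at once
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)
    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g            # new cell state
        h = o * torch.tanh(c)        # new hidden state
        return h, c
```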

Result: J_K = dehazed image
Data: I = hazy image
K ← total number of iterations;
t_0 ← estimated transmission map of I;
A_0 ← estimated atmospheric light of I;
J_0 ← I; k ← 1;
while k ≤ K do
       x_k ← concatenation of the static inputs (I, t_0, A_0) and the dynamic inputs (J_{k-1}, t_{k-1}, A_{k-1});
       J_k ← dehazing model output for x_k;
       t_k ← t_{k-1} + transmission map updater output for (I, t_0, J_k, t_{k-1});
       A_k ← A_{k-1} + atmospheric light updater output for (I, A_0, J_k, A_{k-1});
       k ← k + 1;
end while
if mode is train then
       Calculate loss between J_K and the haze-free ground truth;
       Calculate gradients;
       Update parameters;
end if
Algorithm 1 Proposed dehazing algorithm

3.4.2 Iterative Updater Mechanism

We have already discussed the recurrent mechanism of our dehazing network as shown in Fig. 4. We now discuss our iterative updating mechanism. Algorithm 1 describes the workflow of training and testing of the iterative update based dehazing mechanism. During forward propagation, the already trained transmission map estimation model and atmospheric light estimation model take the hazy image I as input and give the estimated transmission map t_0 and atmospheric light A_0, respectively, as outputs. The dehazing network takes the estimated transmission map and atmospheric light along with the hazy image as inputs and iteratively dehazes the image, where the transmission map and atmospheric light are also updated. x_k is the input of our dehazing network at time step k. It contains two types of inputs: static and dynamic. The static inputs are the hazy image I, the initially estimated transmission map t_0 and the initially estimated atmospheric light A_0, which are time-independent. The time-dependent dynamic inputs contain the dehazed image J_{k-1}, the updated transmission map t_{k-1} and the updated atmospheric light A_{k-1}. At the first time step (k = 1), the dynamic inputs are initialized as J_0 = I, with the initial estimates t_0 and A_0 serving as the previous updated transmission map and atmospheric light. After each time step, the transmission map and atmospheric light are updated. Two separate updater networks estimate the required transmission map and atmospheric light updates. These estimated updates are added to t_{k-1} and A_{k-1} from the previous time step to respectively get t_k and A_k in the current time step. The input to the transmission map updater also contains both static and dynamic inputs as shown in Fig. 4(c): the input hazy image I, the initially estimated transmission map t_0, the dehazed image J_k at time step k, and the updated transmission map t_{k-1} at time step k-1. Similarly, the atmospheric light updater, as shown in Fig. 4(b), takes as input the hazy image I, the initial atmospheric light A_0, the dehazed image J_k at time step k, and the updated atmospheric light A_{k-1} at time step k-1. As said earlier, the updated transmission map at time step k is t_k and the updated atmospheric light at time step k is A_k. After finishing all the time steps (iterations), the dehazing model, and the transmission map and atmospheric light updaters are trained by back-propagating the loss between the dehazed image and the ideal haze-free image. In the case of testing, we obtain the required dehazed image after finishing all the iterations.
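A schematic sketch of this loop is given below; the module interfaces (names, argument order and channel layouts) are assumptions made for illustration only.

```python
import torch

def ipudn_forward(I, t_net, a_net, dehaze_net, t_updater, a_updater, steps=6):
    """Sketch of the iterative prior-updated dehazing loop (Algorithm 1)."""
    t0 = t_net(I)                                        # initial transmission map  (B x 1 x H x W)
    A0 = a_net(I)                                        # initial atmospheric light (B x 3)
    A0 = A0[:, :, None, None].expand(-1, -1, I.shape[2], I.shape[3])
    J, t_k, A_k, h = I, t0, A0, None                     # dynamic inputs start from the static estimates
    for _ in range(steps):
        x = torch.cat([I, t0, A0, J, t_k, A_k], dim=1)   # static + dynamic inputs
        J, h = dehaze_net(x, h)                          # LSTM state h carries inter-step context
        t_k = t_k + t_updater(torch.cat([I, t0, J, t_k], dim=1))
        A_k = A_k + a_updater(torch.cat([I, A0, J, A_k], dim=1))
    return J, t_k, A_k
```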

Each cell reports VI / RI and, where reference ground truths are available, PSNR / SSIM.

Technique | BeDDE (VI / RI) | O-Haze (VI / RI / PSNR / SSIM) | I-Haze (VI / RI / PSNR / SSIM)
PID [11] (TIP'15) | 0.8602 / 0.9679 | 0.8723 / 0.9543 / 15.53 / 0.37 | 0.9174 / 0.9710 / 15.67 / 0.57
AOD-Net [31] (ICCV'17) | 0.8716 / 0.9674 | 0.8382 / 0.9546 / 15.17 / 0.36 | 0.8855 / 0.9669 / 14.60 / 0.56
PQC [52] (TIP'18) | 0.8930 / 0.9696 | 0.8661 / 0.9618 / 16.69 / 0.49 | 0.9241 / 0.9745 / 15.25 / 0.57
EPix2Pix [47] (CVPR'19) | 0.8956 / 0.9640 | 0.9078 / 0.9715 / 17.38 / 0.61 | 0.9336 / 0.9727 / 15.80 / 0.61
GridDNet [40] (ICCV'19) | 0.8909 / 0.9682 | 0.7653 / 0.9150 / 13.54 / 0.37 | 0.8693 / 0.9197 / 12.24 / 0.47
ALC [46] (TCSVT'20) | 0.8620 / 0.9696 | 0.8666 / 0.9623 / 16.06 / 0.45 | 0.9064 / 0.9724 / 14.34 / 0.55
Haze-Lines [7] (TPAMI'20) | 0.8715 / 0.9589 | 0.8797 / 0.9623 / 15.81 / 0.52 | 0.9083 / 0.9685 / 15.48 / 0.60
FSID [27] (TIP'20) | 0.8991 / 0.9683 | 0.8946 / 0.9582 / 16.81 / 0.52 | 0.9309 / 0.9711 / 17.21 / 0.61
MSBDN [14] (CVPR'20) | 0.7688 / 0.9039 | 0.8151 / 0.9627 / 16.83 / 0.45 | 0.9125 / 0.9744 / 16.57 / 0.64
Our IPUDN | 0.9065 / 0.9711 | 0.9164 / 0.9737 / 19.39 / 0.64 | 0.9418 / 0.9782 / 16.21 / 0.62
TABLE II: Comparison of different state-of-the-art approaches with the proposed for single image dehazing on real-world hazy images.
(Best: Bold red highlight, Second best: Blue highlight)

3.4.3 Dehazing Network Architecture

In our network architecture, f_in is a single-layer convolution, f_res comprises six consecutive residual blocks, and f_out is also a single-layer convolution. All the convolution layers use small filters with appropriate padding, each followed by a non-linearity. The first convolution layer takes as input the concatenation of the RGB image, the 3-channel atmospheric light and the transmission map, all of them in both static and dynamic form. f_out takes the output of f_res as input and outputs a 3-channel RGB image. In our convolutional LSTM block f_rec, all the convolutions keep the same number of feature channels. In our experiments, we consider six iterations/time steps based on empirical evidence; we experimented with different numbers of time steps, and the detailed discussions are given in Section 5.2.3. We use six consecutive convolution blocks in both updater networks to estimate the changes required in the transmission map and atmospheric light at each time step, and these convolution blocks use parametric non-linearities. In the last layer of each updater, we use a non-linearity whose output can be both positive and negative, so that the estimates can be changed in either direction. In the case of the atmospheric light updater, we use global average pooling to get a single global update instead of pixel-wise updates, as the former has been empirically found superior (see Section 5.1.2).
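A minimal sketch of such a dehazing model is given below; it reuses the ConvLSTMCell sketch from Section 3.4.1, and the channel width and the 14 input channels (RGB image, 3-channel atmospheric light and 1-channel transmission map, each in static and dynamic form) are our reading of the description above rather than verified values.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)

class DehazeNet(nn.Module):
    """Sketch of the recurrent dehazing model: f_in -> ConvLSTM -> 6 residual blocks -> f_out."""
    def __init__(self, in_ch=14, width=64):
        super().__init__()
        self.f_in = nn.Conv2d(in_ch, width, 3, padding=1)
        self.f_rec = ConvLSTMCell(width, width)              # from the earlier sketch
        self.f_res = nn.Sequential(*[ResBlock(width) for _ in range(6)])
        self.f_out = nn.Conv2d(width, 3, 3, padding=1)
    def forward(self, x, state=None):
        feat = self.f_in(x)
        if state is None:                                    # zero recurrent state at the first step
            z = torch.zeros_like(feat)
            state = (z, z.clone())
        h, c = self.f_rec(feat, state)
        return self.f_out(self.f_res(h)), (h, c)
```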

Fig. 5: Subjective evaluation of the different methods on hazy images with synthetically generated haze. Zooming into image regions like the cropped ones in boxes will show the effectiveness of our method. The 1st-3rd rows: Results on hazy images with color cast. The 4th-9th rows: Results on hazy images without color cast.
Fig. 6: Subjective evaluation of the different methods on real-world hazy images. Zooming into image regions like the cropped ones in boxes will show the effectiveness of our method. The 1st-3rd rows: Results on hazy images with color cast. The 4th-9th rows: Results on hazy images without color cast.

3.4.4 Loss Function

A combination of different loss functions, including pixel-wise losses (such as ℓ1 and ℓ2), the adversarial loss [30] and the perceptual loss [24], has been used for training dehazing models. In a similar manner, we consider two different loss functions for training our dehazing network. One of them is the ℓ1 loss, which we empirically find to be superior to the ℓ2 (MSE) loss for training our network (see Section 5.2.3). The other loss used by us is the perceptual difference loss [24]. Therefore, the total reconstruction loss is defined as

L_total = L_1 + λ L_p,    (5)

where L_1 is the mean absolute difference loss and L_p is the perceptual difference loss. λ is a hyper-parameter whose value is fixed empirically in our experiments. While computing our reconstruction loss during training, we only impose the supervision on the final output J_K for a model with K time steps. Detailed discussions on the supervision on final outputs and the iterative supervisions are given in Section 5.2.3. Our perceptual loss function is defined as

L_p = || φ(J_K) - φ(J_gt) ||_1,    (6)

where J_gt is the ground truth haze-free image used for supervision and φ(·) denotes high-level features extracted by the VGG network. In our perceptual loss, we use the absolute error between the high-level features extracted from the actual haze-free image and the dehazed image at the final time step. We use the relu2_2 layer of the VGG-19 architecture [57] as the feature extractor φ.
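A minimal PyTorch sketch of this loss is given below; it assumes ImageNet-pretrained VGG-19 weights from torchvision (the exact weight-loading argument varies across torchvision versions), and the λ value shown is only a placeholder, not the value used in the experiments.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

class PerceptualLoss(nn.Module):
    """||phi(J_K) - phi(J_gt)||_1 with phi = VGG-19 features up to relu2_2 (Eq. (6)).
    ImageNet normalization of the inputs is omitted here for brevity."""
    def __init__(self):
        super().__init__()
        # features[:9] ends at the ReLU following conv2_2 (i.e., relu2_2)
        self.phi = vgg19(weights="IMAGENET1K_V1").features[:9].eval()
        for p in self.phi.parameters():
            p.requires_grad_(False)
    def forward(self, pred, target):
        return torch.mean(torch.abs(self.phi(pred) - self.phi(target)))

def total_loss(pred, target, perceptual, lam=0.04):   # lam is a placeholder value
    """L_total = L_1 + lam * L_p, as in Eq. (5)."""
    return torch.mean(torch.abs(pred - target)) + lam * perceptual(pred, target)
```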

3.5 Stage-wise Training and Fine Tuning

In our proposed approach, there are three trainable architectures: the transmission map estimation model, the atmospheric light estimation model, and the dehazing network with the updater mechanism. Instead of training the whole network together, we divide the training procedure into three stages. This is done as we experimentally found that training the whole system as one from the beginning makes the convergence slow, and the training gets stuck in poor local minima. The different objective functions involved possibly push the training in different directions, producing small gradient magnitudes. Therefore, we train each of the three networks separately with the relevant objective functions. In the first stage, the transmission map and atmospheric light estimators are trained separately. In the second stage, the dehazing network is trained separately using the reconstruction loss. In the third stage, all three trained networks are fine-tuned together with the multiple objective functions. This fine-tuning, which is carried out at a lower learning rate, is performed to introduce fine dependencies between the three networks. Fine-tuning them together allows us to achieve the best performance from our entire dehazing system, while remaining in the local vicinity of the solutions provided by the individual trainings. Similar stage-wise training has been successfully adopted earlier in different applications [5, 16, 66].

Each cell reports PSNR/SSIM/CIEDE2000; the first six columns are non-cast hazy images and the last column is color-cast hazy images.

Model variant | Outdoor Low Haze (LSOT) | Outdoor Mid Haze (MSOT) | Outdoor High Haze (HSOT) | Indoor Low Haze (LSIT) | Indoor Mid Haze (MSIT) | Indoor High Haze (HSIT) | Color Cast Random Haze (SCHT)
Baseline-1: RESNet16 | 17.43/0.7047/53.24 | 17.05/0.6867/53.66 | 16.00/0.6188/55.84 | 17.75/0.7014/46.70 | 15.99/0.6258/50.92 | 13.85/0.5100/58.11 | 18.73/0.7167/52.50
Baseline-2: RESNet6+A+TM | 22.99/0.8737/38.48 | 19.19/0.7704/47.90 | 15.69/0.6640/55.88 | 26.14/0.9244/24.85 | 20.11/0.8071/38.30 | 16.40/0.6853/49.94 | 23.35/0.8287/44.65
Baseline-3: RESNet6+A+TM+LSTM | 23.16/0.8832/35.21 | 19.76/0.7907/44.05 | 17.37/0.7003/52.86 | 27.44/0.9378/21.63 | 22.28/0.8394/33.34 | 18.25/0.7258/44.51 | 23.99/0.8401/42.38
Baseline-4: RESNet6+A+TM+IUN | 23.54/0.8934/34.38 | 22.17/0.8670/38.27 | 20.76/0.8214/41.50 | 27.37/0.9230/25.06 | 25.08/0.8953/30.44 | 21.80/0.8172/39.95 | 24.78/0.8670/39.67
Baseline-5 (Our IPUDN): RESNet6+A+TM+LSTM+IUN | 23.85/0.9096/33.26 | 22.57/0.8883/36.67 | 21.61/0.8431/39.61 | 28.24/0.9405/22.04 | 25.16/0.9070/27.38 | 21.90/0.8362/35.42 | 25.08/0.8818/38.19
TABLE III: Ablation study of our model architecture showing the performance improvements achieved by including its various components successively.
(Best: Bold red highlight, Second best: Blue highlight)

4 Experimental Results

4.1 Datasets

For training, validating and testing our dehazing network, we generate synthetic hazy image datasets from indoor and outdoor images with depth information. We use the NYU Depth Dataset V2 [56], which contains indoor images with depth maps. Outdoor images are taken from the Outdoor Training Set (OTS) dataset [32], where we adopt the state-of-the-art deep learning algorithm of [39] for depth estimation. We divide both the indoor and outdoor images into three non-overlapping training, validation, and testing sets. Details of training and validation are given in Section 4.2. The validation and testing sets each contain a fixed number of images, comprising both indoor and outdoor images. We prepare all the images in the testing set with three different levels of hazy condition, where haze is synthetically added using Equation (1). The three hazy indoor image test datasets are named Low-haze Synthetic Indoor Test-set (LSIT), Mid-haze Synthetic Indoor Test-set (MSIT) and High-haze Synthetic Indoor Test-set (HSIT). The three hazy outdoor test datasets are named Low-haze Synthetic Outdoor Test-set (LSOT), Mid-haze Synthetic Outdoor Test-set (MSOT) and High-haze Synthetic Outdoor Test-set (HSOT). The outdoor images from the testing set are also used to prepare a synthetic color-cast hazy image dataset using random amounts of haze, named the Synthetic Color-cast Haze Test-set (SCHT). Our training dataset contains both outdoor and indoor images. For quantitative evaluation on synthetic images, we consider all the images from the aforesaid datasets corresponding to the testing set. For the quantitative evaluation on real-world images, we use the O-Haze [2], I-Haze [4] and BeDDE [68] datasets.

4.2 Implementation Details

As discussed in Section 3.5, we train our dehazing framework in three stages, as shown in Fig. 1. In the first stage, we train the transmission map and atmospheric light estimation models separately using the training set, which contains both indoor and outdoor images. We synthetically add random amounts of haze during training and validation using Equation (1), where a random amount of haze is generated with a random selection of the attenuation coefficient β and the atmospheric light A. We extract fixed-size patches for each batch update, and we randomly augment those patches using random horizontal and vertical flipping, and rotation. We validate the model after each iteration to choose the best model for testing. We train both the transmission map and atmospheric light estimation models for a fixed number of iterations using the Adam optimizer [28] with default settings in the PyTorch environment. We set an initial learning rate and decay it by half periodically after a fixed number of iterations.

In the second stage, we use the already trained transmission map and atmospheric light estimation models to compute the initial transmission map and atmospheric light for each training image. We do not update the transmission map and atmospheric light estimation models during this stage of training. We use the same augmentation strategy, optimizer, and initial learning rate as in the previous stage. We train the dehazing network using the hazy images from the training set, with the corresponding initial transmission map and atmospheric light as inputs. Training is carried out for a fixed number of iterations, where each iteration performs batch updates with a fixed batch size, and the learning rate is decayed by half periodically.

In the final stage, we fine-tune the whole dehazing system together. We found that the fine-tuning results in performance improvement as dependencies between the three networks are invoked. During fine-tuning, we keep the learning rate of the transmission map and atmospheric light estimation models a fixed factor lower than that of the dehazing network and train the whole framework for further iterations with the same training setup. Although we extract patches during training, we feed the whole hazy image into the network while performing dehazing.
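The following is a hedged sketch of such a fine-tuning setup with per-network learning rates; the placeholder modules, the learning-rate ratio and the schedule constants are illustrative assumptions, not the actual values used.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the three trained networks: in practice these
# are the transmission map estimator, atmospheric light estimator and dehazing network.
t_net, a_net, dehaze_net = nn.Conv2d(3, 1, 3), nn.Conv2d(3, 3, 3), nn.Conv2d(14, 3, 3)

base_lr = 1e-4
optimizer = torch.optim.Adam([
    {"params": dehaze_net.parameters(), "lr": base_lr},      # dehazing network and updaters
    {"params": t_net.parameters(), "lr": base_lr / 10},      # estimators fine-tuned
    {"params": a_net.parameters(), "lr": base_lr / 10},      # at a reduced rate
])
# halve all learning rates after a fixed number of iterations
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20000, gamma=0.5)
```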

4.3 Quantitative Evaluation

4.3.1 Evaluation on Synthetic Hazy Images

For comparative evaluation on synthetic datasets, we consider the datasets having images with different amounts of haze and color cast described in Section 4.1. Table I presents the comparison with different dehazing approaches that represent the state-of-the-art. We perform the quantitative evaluation of the images produced by PID [11], AOD-Net [31], PQC [52], ALC [46], EPix2Pix [47], GridDNet [40], Haze-Lines [7], MSBDN [14], FSID [27], and our IPUDN. We use the structural similarity (SSIM) measure to quantify the structural difference between a dehazed output image and the corresponding non-hazy image, and the peak signal-to-noise ratio (PSNR) to measure their pixel-to-pixel difference. We also use the CIEDE2000 [54] measure to compute the pixel-wise color difference between the dehazed image and its non-hazy counterpart. It is evident from the table that, except for the 'Indoor Low Haze' case, our IPUDN outperforms all the other techniques by substantial margins in terms of PSNR, SSIM, and CIEDE2000 over a broad spectrum of hazy images. In the 'Indoor Low Haze' case, our approach performs close to the best performing approach, MSBDN [14], while outperforming most of the others.

4.3.2 Evaluation on Real-world Hazy Images

We further evaluate our proposed method's performance on three different real-world benchmark datasets, O-Haze [2], I-Haze [4], and BeDDE [68]. The O-Haze and I-Haze datasets contain outdoor and indoor images, respectively, where haze machines are used to create haze in the scenes. The recently published BeDDE dataset contains actual natural hazy images and their non-hazy references. Its authors also proposed two metrics, the Visibility Index (VI) and the Realness Index (RI), specifically designed to analyze the performance of dehazing algorithms on real-world datasets. The Visibility Index assesses the quality of the dehazed image using the similarity of visibility between the dehazed image and its non-hazy reference. The Realness Index measures the similarity between the dehazed image and its non-hazy reference in a pre-defined feature space to evaluate the realness of the image given the reference. Dehazed images with good visibility but having artifacts will perform poorly in RI, and dehazed images with low visibility will perform poorly in VI. For both VI and RI, a higher value signifies better performance. Table II shows the quantitative evaluation on real-world images using the VI and RI measures along with PSNR and SSIM. VI and RI alone are shown for the BeDDE dataset, as the typical ground truths required to calculate PSNR and SSIM are not available in this dataset. It is apparent from the table that our IPUDN performs best in terms of all the measures on all the datasets, except for the PSNR and SSIM based evaluation on the I-Haze dataset, where it performs close to the best in comparison to the others. This indicates effective dehazing performance in terms of visibility and artifact-free reconstruction.

Fig. 7: Ablation study of our model using our estimation of atmospheric light and transmission map. (a) Hazy input, (b) Ground truth, (c) Baseline-1: RESNet16, (d) Baseline-2: RESNet6+A+TM, (e) Baseline-3: RESNet6+A+TM+LSTM, (f) Baseline-4: RESNet6+A+TM+IUN, (g) Our IPUDN: RESNet6+A+TM+LSTM+IUN. (zoom for the best view)

4.4 Subjective Evaluation

4.4.1 Evaluation on Synthetic Hazy Images

In Fig. 5, we perform subjective evaluation on synthetically generated hazy images considering our IPUDN and the following recently published approaches that give state-of-the-art performance: PQC [52], ALC [46], EPix2Pix [47], GridDNet [40], Haze-Lines [7], MSBDN [14], FSID [27]. The images in Fig. 5 contain synthetic haze of varying amounts and a few of them also contain synthetic color cast. The first three images in Fig. 5 are hazy images with green, yellow, and blue casts, respectively. These color casts are shown, as similar casts naturally occur in images captured in haze (See first three images in Fig. 6). The remaining six images are non-cast hazy images, where the first three are of outdoor scenes, and the last three are of indoor scenes.

From the results generated on the color cast hazy images by the existing approaches, we see that most of them do not remove the color casts by a significant amount and in all image regions, while a few of them over-saturate the cast color. In a couple of dehazed images, color distortion is also evident. As can be seen from the dehazed results obtained using our IPUDN on the color cast hazy images, color casts are satisfactorily removed and visually realistic dehazed images close to the ground truths are generated. It is evident that our approach outperforms the others. Considering the dehazing performance of all the approaches on all the images in the figure, we can see that the amount of haze reduced by the existing techniques is limited as compared to that of our IPUDN. A few existing approaches perform quite well for indoor images; however, our approach does better. As evident in the fourth-row results produced by a couple of existing approaches, artifacts are visible in regions of high haze density.

For the synthetic indoor and outdoor hazy images in Fig. 5 having different amounts of haze and a variety of color casts, our approach removes haze and color cast substantially, enhances visibility, performs faithful color restoration, achieves visually realistic reconstruction, produces images close to the ground truths, and outperforms the other techniques.

Each cell reports PSNR/SSIM/CIEDE2000; the first six columns are non-cast hazy images and the last column is color-cast hazy images.

Atmospheric Light Update | Outdoor Low Haze (LSOT) | Outdoor Mid Haze (MSOT) | Outdoor High Haze (HSOT) | Indoor Low Haze (LSIT) | Indoor Mid Haze (MSIT) | Indoor High Haze (HSIT) | Color Cast Random Haze (SCHT)
Local | 23.18/0.8875/35.67 | 20.12/0.8037/44.41 | 17.77/0.7482/48.61 | 28.35/0.9475/20.23 | 22.98/0.8550/32.05 | 18.74/0.7423/43.84 | 24.22/0.8451/43.70
Global | 23.85/0.9096/33.26 | 22.57/0.8883/36.67 | 21.61/0.8431/39.61 | 28.24/0.9405/22.04 | 25.16/0.9070/27.38 | 21.90/0.8362/35.42 | 25.08/0.8818/38.19
TABLE IV: Performance comparison of the global update in our atmospheric light updater model with local update. (Best: Bold highlight)
Image Type | Haze Density | Max-Pool | Average Pool
Non-cast Hazy | Outdoor Low Haze (LSOT) | 9.7 | 26.7
Non-cast Hazy | Outdoor Mid Haze (MSOT) | 7.8 | 24.7
Non-cast Hazy | Outdoor High Haze (HSOT) | 6.8 | 11.4
Non-cast Hazy | Indoor Low Haze (LSIT) | 25.2 | 22.2
Non-cast Hazy | Indoor Mid Haze (MSIT) | 9.1 | 22.4
Non-cast Hazy | Indoor High Haze (HSIT) | 5.2 | 17.8
Color Cast Hazy | Random Haze (SCHT) | 0.97 | 1.69
TABLE V: MSE comparison of max-pooling with average pooling for the atmospheric light estimation network. All the numbers share a common scale factor.
(Best: Bold highlight)
Each cell reports PSNR/SSIM/CIEDE2000; the first six columns are non-cast hazy images and the last column is color-cast hazy images.

Technique | Outdoor Low Haze (LSOT) | Outdoor Mid Haze (MSOT) | Outdoor High Haze (HSOT) | Indoor Low Haze (LSIT) | Indoor Mid Haze (MSIT) | Indoor High Haze (HSIT) | Color Cast Random Haze (SCHT)
ASM based dehazing | 24.56/0.9086/29.96 | 21.06/0.8214/40.68 | 15.94/0.6739/50.22 | 27.84/0.9264/20.60 | 21.47/0.8230/34.26 | 17.39/0.7012/45.23 | 22.97/0.8021/48.17
IPUDN (Ours) | 25.83/0.9430/22.44 | 24.43/0.9220/22.77 | 22.81/0.8822/26.76 | 30.02/0.9645/12.44 | 27.17/0.9401/16.04 | 23.94/0.8902/21.95 | 26.74/0.9157/28.92
TABLE VI: Dehazing performance comparison of the atmospheric scattering model (ASM) with the dehazing model of our proposed IPUDN, where both take estimates of transmission map and atmospheric light as inputs along with the hazy image. (Best: Bold highlight)
Each cell reports PSNR/SSIM/CIEDE2000; the first six columns are non-cast hazy images and the last column is color-cast hazy images.

Loss Function | Outdoor Low Haze (LSOT) | Outdoor Mid Haze (MSOT) | Outdoor High Haze (HSOT) | Indoor Low Haze (LSIT) | Indoor Mid Haze (MSIT) | Indoor High Haze (HSIT) | Color Cast Random Haze (SCHT)
MSE Loss | 23.61/0.8939/36.32 | 22.98/0.8708/39.82 | 21.46/0.8043/46.06 | 27.54/0.9226/26.86 | 24.79/0.8853/32.38 | 22.10/0.8167/39.70 | 24.88/0.8708/41.22
Recursive L1 Loss | 24.03/0.9141/33.77 | 23.15/0.8937/37.80 | 21.09/0.8313/41.66 | 28.07/0.9408/22.46 | 25.08/0.9028/28.38 | 21.22/0.8257/37.64 | 25.09/0.8778/39.90
L1 Loss | 23.85/0.9096/33.26 | 22.57/0.8883/36.67 | 21.61/0.8431/39.61 | 28.24/0.9405/22.04 | 25.16/0.9070/27.38 | 21.90/0.8362/35.42 | 25.08/0.8818/38.19
TABLE VII: Performance of our dehazing network IPUDN for various relevant loss functions. (Best: Bold highlight)
Each cell reports PSNR/SSIM/CIEDE2000; the first six columns are non-cast hazy images and the last column is color-cast hazy images.

Recursive Time-steps | Outdoor Low Haze (LSOT) | Outdoor Mid Haze (MSOT) | Outdoor High Haze (HSOT) | Indoor Low Haze (LSIT) | Indoor Mid Haze (MSIT) | Indoor High Haze (HSIT) | Color Cast Random Haze (SCHT)
3 | 25.73/0.9305/23.73 | 23.60/0.9023/27.61 | 22.07/0.8453/32.78 | 30.04/0.9633/13.57 | 25.52/0.9318/17.64 | 21.69/0.8679/24.75 | 26.84/0.9131/30.40
6 | 25.83/0.9430/22.44 | 24.43/0.9220/22.77 | 22.81/0.8821/26.75 | 30.02/0.9645/12.44 | 27.16/0.9400/16.04 | 23.94/0.8901/21.95 | 26.73/0.9157/28.92
9 | 25.73/0.9421/24.02 | 24.21/0.9099/28.57 | 22.32/0.8617/33.20 | 29.88/0.9588/16.34 | 26.78/0.9306/21.65 | 23.44/0.8867/26.04 | 26.00/0.9044/31.77
TABLE VIII: Study on suitable number of time steps /iterations in our iterative dehazing framework IPUDN. (Best: Bold highlight)

4.4.2 Evaluation on Real-world Hazy Images

We compare the performance of our IPUDN with other state-of-the-art techniques on real-world hazy images. We show the subjective evaluation of the dehazing methods considered in Section 4.4.1 on real images in Fig. 6. The first row shows results on a real hazy image having a green color cast [46]. The results in the second row are for a real hazy image with a yellow-red color cast due to a sandstorm [46]. The third row shows results on a real hazy image with a bluish color cast [11]. The rest are non-cast real hazy images from [32, 11, 20]. As evident from the dehazed results of the existing approaches on real color cast hazy images, most of them do not remove the color casts substantially. Color distortion is also introduced by a few of them, and in a couple of dehazed results we see loss of object color. Our IPUDN removes the color casts substantially and maintains visually realistic object color without introducing visible color distortion or loss.

Considering the dehazing performance of all the approaches on all the images in the figure, we see that the amount of dehazing by the existing approaches, particularly in regions with thick haze, is limited compared to our IPUDN. In a few cases, color artifacts are evident in the results of the existing approaches, which include non-realistic reproduction of color (like bluish color in place of green), unlike the results of our method. Our approach produces better dehazing results for all the images, reducing haze substantially and producing visually realistic output with faithful color reconstruction. This is true for the critical hazy images in the fourth and seventh rows as well, where dense haze is present in distant areas in the former, and the atmospheric light estimation is difficult in the latter owing to the absence of a sky region.

From the above analysis on a variety of real-world hazy images, we can see that our IPUDN performs effective dehazing and color cast removal in a variety of hazy conditions outperforming the other techniques. The dehazed images generated by our method have faithful and realistic color appearance without any visible distortion.

5 Analysis and Discussions

5.1 Ablation study

5.1.1 The Contribution of our Dehazing Mechanism

We present an ablation study of our dehazing mechanism in Table III. We use five different model variations: (I) Baseline-1 (RESNet16): sixteen consecutive residual blocks forming an image-to-image mapping network, which takes the hazy image as the input; (II) Baseline-2 (RESNet6+A+TM): six consecutive residual blocks, which take the estimated transmission map and atmospheric light along with the hazy image as inputs; (III) Baseline-3 (RESNet6+A+TM+LSTM): Baseline-2 along with the LSTM module; (IV) Baseline-4 (RESNet6+A+TM+IUN): Baseline-2 along with our proposed iterative updater network (IUN); (V) Baseline-5 (RESNet6+A+TM+LSTM+IUN): our proposed IPUDN, which is Baseline-2 along with the LSTM and the IUN. For a fair comparison, we apply only the mean absolute error (ℓ1) as the loss function to train the models, and we perform all the experiments in the second stage of our training framework. We can see from Table III that Baseline-2, which uses the transmission map and atmospheric light with fewer residual blocks compared to Baseline-1, produces significantly boosted results. These experimental findings support our argument for using the estimated transmission map and atmospheric light while performing end-to-end training: they guide the network to determine the haze density, and the network performs extraordinarily well in all types of hazy conditions. Baseline-3, with the additional LSTM module based recursive mechanism, provides a minor improvement in the results, but in most cases, the improvements are much more prominent in Baseline-4 with the additional IUN mechanism. Our IUN module successfully limits the large performance drop with the increase in haze density, which experimentally demonstrates the effectiveness of our proposed iterative mechanism. Baseline-5, which is our proposed IPUDN, adds the LSTM module to Baseline-4. The LSTM helps to introduce inter-time step dependencies in the feature layers and further improves the performance.

We further show the qualitative evaluation of the different baselines of our model on a synthetic hazy image and a real hazy image in Fig. 7. We observe that, except for IPUDN, all the variations suffer from unpleasant artifacts. Baseline-4 is relatively better at handling the artifacts but suffers from color distortion, visible in the cropped regions of the first-row results. In contrast, our proposed IPUDN produces a dehazed output close to the ground truth for the synthetic hazy image and a visually realistic, better dehazed output for the real hazy image.

Fig. 8: Intermediate outputs of our IPUDN. The initially estimated transmission map and atmospheric light along with the hazy image are shown at T=0. Updated maps and the dehazed image are shown at each iterative time step. The first row shows the atmospheric light and its updates. The second row shows the transmission map and its updated maps. The third row shows the dehazed output at each iteration. (zoom for the best view)
Fig. 9: Low and high-level features extracted from a hazy image using trained atmospheric light estimation model.
Fig. 10: Object detection and recognition on real hazy images and dehazed outputs by the different approaches. (zoom for the best view)
Fig. 11: Dehazed output by our IPUDN on images with different imaging conditions. Top row images are inputs and bottom row images are corresponding dehazed output by our IPUDN.

5.1.2 Global vs. Local Atmospheric Light Updater Model

As discussed in Section 3.4.3, we update the atmospheric light globally in our dehazing framework. We adopt a global update instead of a local update based on our experimental findings. In our dehazing framework, we use average pooling after the atmospheric light updater model to get a single overall update in each color channel. Average pooling is chosen to ensure that all the pixel values contribute to the global update. Table IV compares the results obtained without this pooling, so that the update happens locally, against those obtained when the pooling is used. It is evident that in most cases there is a performance degradation when the local update is considered. This may indicate that the atmospheric light in our approach represents the haze illumination as a global quantity in an image.
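
The following minimal sketch contrasts the two alternatives under our own naming assumptions: a global update obtained by spatial average pooling of the updater output, so that every pixel contributes to a single per-channel correction, versus the local variant in which the update remains per pixel.

```python
# Sketch (not the authors' exact code) of the global vs. local update of the
# channel-wise atmospheric light from the updater network's output map.
import torch

def global_update(atm_light, update_map):
    """atm_light: (B, 3), update_map: (B, 3, H, W) from the updater network."""
    delta = update_map.mean(dim=(2, 3))        # spatial average pooling -> (B, 3)
    return atm_light + delta                   # one update per color channel

def local_update(atm_light, update_map):
    """Ablation variant: per-pixel update, giving a spatially varying light."""
    B, C, H, W = update_map.shape
    return atm_light.view(B, C, 1, 1) + update_map   # (B, 3, H, W)

A0 = torch.zeros(2, 3)
upd = torch.rand(2, 3, 64, 64)
print(global_update(A0, upd).shape)   # torch.Size([2, 3])
print(local_update(A0, upd).shape)    # torch.Size([2, 3, 64, 64])
```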

5.1.3 Pooling in Atmospheric Light Estimation Model

In Section 3.3, we discussed the intuition behind the max-pooling layers, including the global max-pooling, in our atmospheric light estimation model. To validate this choice, we experiment with two different pooling mechanisms, max-pooling and average-pooling, in the atmospheric light estimation model. In Table V, we present the quantitative evaluation on the synthetic hazy images mentioned in Section 4.3.1, reporting the mean squared error between the actual and estimated atmospheric lights for the two networks trained with max-pooling and average-pooling, respectively. We experimentally find that max-pooling performs better by a substantial margin. Moreover, as described in Section 3.3, max-pooling fits well with the popular idea of atmospheric light estimation using DCP.
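
A hedged sketch of the final pooling step is given below, assuming a last feature map with one channel per color channel; the global max-pooling used in our model and the average-pooling variant of the ablation differ only in this last operation.

```python
# Sketch of the final global pooling in the atmospheric light estimator:
# a global max-pool yields one value per color channel, echoing the DCP idea
# of picking the strongest evidence; average pooling is the ablation variant.
# The feature tensor name and shape are illustrative assumptions.
import torch
import torch.nn.functional as F

features = torch.rand(4, 3, 64, 64)  # (B, 3, H, W): one map per color channel

a_max = F.adaptive_max_pool2d(features, 1).flatten(1)   # (B, 3), used in our model
a_avg = F.adaptive_avg_pool2d(features, 1).flatten(1)   # (B, 3), ablation variant
print(a_max.shape, a_avg.shape)
```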

5.2 Additional Experiments and Discussion

5.2.1 Atmospheric Scattering Model vs. Separate Dehazing Network

Table VI presents a study comparing the separate dehazing model of our proposed IPUDN to dehazing based directly on the atmospheric scattering model. In the atmospheric-scattering-model-based dehazing, we estimate the transmission map and atmospheric light using our densely connected encoder-decoder network and the proposed convolutional neural network based model, respectively. We then reconstruct the dehazed image using Equation (2) and optimize both networks using the reconstruction loss along with the transmission map and atmospheric light estimation losses, similar to the process employed by a few existing approaches [66, 12]. In the table, for a wide range of hazy conditions, we observe the superior dehazing performance of our approach of employing a separate dehazing model that takes the initial estimates of the transmission map and atmospheric light as inputs along with the hazy image. This signifies that our separate dehazing framework containing the updater networks successfully handles the insufficiency of the initial transmission map and atmospheric light estimates.
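
For reference, the sketch below illustrates the atmospheric-scattering-model baseline under the usual form of the haze model, where the dehazed image is obtained by directly inverting I = J*t + A*(1 - t); the clamping of the transmission map is an illustrative safeguard and not taken from the paper.

```python
# Sketch of the scattering-model baseline: the dehazed image J is recovered
# analytically from the estimated transmission map t and atmospheric light A,
# rather than by a separate dehazing network.
import torch

def scattering_model_dehaze(hazy, t_map, atm_light, eps=1e-2):
    """hazy: (B, 3, H, W), t_map: (B, 1, H, W) in (0, 1], atm_light: (B, 3)."""
    B = hazy.shape[0]
    A = atm_light.view(B, 3, 1, 1)
    t = t_map.clamp(min=eps)               # avoid division by near-zero transmission
    return (hazy - A * (1.0 - t)) / t      # J = (I - A*(1 - t)) / t

J = scattering_model_dehaze(torch.rand(1, 3, 128, 128),
                            torch.rand(1, 1, 128, 128),
                            torch.rand(1, 3))
print(J.shape)  # torch.Size([1, 3, 128, 128])
```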

5.2.2 A Study on Our Atmospheric Light Estimation Model

Fig. 9 shows the features extracted by our trained atmospheric light estimation model from a hazy image in the initial and final layers before the global max-pooling. In the initial layer, instead of learning standard low-level kernels such as those computing edges and orientations, the low-level kernels operate on the input image and produce outputs with different intensity-level shifts. Progressing to the final layer, the high-level feature contents become very smooth, possibly so as to provide a single value per color channel as the estimated atmospheric light after the global max-pooling.
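
Such intermediate feature maps can be inspected, for instance, with PyTorch forward hooks; the sketch below uses a small stand-in network, since the actual layer names of our estimator are not reproduced here.

```python
# Sketch of extracting low- and high-level feature maps with forward hooks.
# The nn.Sequential model is a hypothetical stand-in for the atmospheric light
# estimator; in practice the hooks would be registered on its actual layers.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 3, 3, padding=1))

feats = {}
def save_to(name):
    def hook(module, inputs, output):
        feats[name] = output.detach()
    return hook

model[0].register_forward_hook(save_to("low_level"))
model[-1].register_forward_hook(save_to("high_level"))
_ = model(torch.rand(1, 3, 128, 128))
print(feats["low_level"].shape, feats["high_level"].shape)
```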

5.2.3 A Study on Our Iterative Dehazing Model

Loss Function: Table VII compares the MSE and MAE loss functions, each imposed on the final dehazed output. As can be seen, the MAE loss is experimentally found to be superior to MSE. We also perform experiments with recursive supervision, and notice that the MAE loss on the final output and the recursive MAE loss give similar overall performance. However, we choose the former, as it is computationally more economical and gives consistently better performance on the CIEDE2000 metric. This possibly means that supervision only at the final output preserves the color information better than recursive supervision.
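
A sketch of the two supervision schemes, under our own assumptions about variable names and the weighting of the recursive loss, is given below.

```python
# Sketch: MAE supervision on the final dehazed output only, versus recursive
# supervision applied to the output of every iterative time step (here simply
# averaged over time steps; the exact weighting is our assumption).
import torch
import torch.nn.functional as F

def final_output_loss(outputs, target):
    """outputs: list of (B, 3, H, W) dehazed estimates, one per time step."""
    return F.l1_loss(outputs[-1], target)

def recursive_loss(outputs, target):
    return sum(F.l1_loss(o, target) for o in outputs) / len(outputs)

outs = [torch.rand(1, 3, 64, 64) for _ in range(3)]
gt = torch.rand(1, 3, 64, 64)
print(final_output_loss(outs, gt).item(), recursive_loss(outs, gt).item())
```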

Recursive Time-steps: We perform a detailed study on the number of time steps in our iterative update process. We train the model with three different settings of the number of time steps, using the same experimental setup as discussed in Section 4.2. Table VIII presents the performance evaluation of the models trained with these different numbers of time steps. The experimental results show which of the three settings is best suited for our network, and we use that setting throughout this paper.

Intermediate Outputs of Our Dehazing Model: Fig. 8 shows the iterative updates of the atmospheric light, the transmission map, and the generated dehazed output at each iterative time step. In our dehazing network, after each iteration, the atmospheric light updater module updates the initially estimated atmospheric light, and we observe that the updated color value becomes a darker version of the initially estimated color. On the other hand, the transmission map estimator network initially generates a smoothed map of the image structure. Thereafter, in our dehazing network, the transmission map updater refines that map, upon which image structure details appear in the map, as evident from Fig. 8. These structure details are then diminished slightly with increasing time steps, possibly striking a fine balance between structure preservation and noise reduction.
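
A schematic sketch of this iterative interplay between the updaters and the recurrent dehazing network is given below; the module interfaces and state handling are our assumptions and do not reproduce the exact implementation.

```python
# Schematic sketch of the iterative update loop behind Fig. 8: at every time
# step the updater networks refine the atmospheric light and the transmission
# map, and the recurrent dehazing network produces a new dehazed estimate.
def iterative_dehaze(hazy, t_map, atm_light, dehazer, t_updater, a_updater, steps):
    dehazed, state = hazy, None   # LSTM state carries inter-time-step dependencies
    for _ in range(steps):
        atm_light = a_updater(dehazed, hazy, atm_light)   # global per-channel update
        t_map = t_updater(dehazed, hazy, t_map)           # spatially varying update
        dehazed, state = dehazer(hazy, t_map, atm_light, state)
    return dehazed
```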

5.3 Applications

5.3.1 A High-level Vision Task

A hazy environment hinders high-level vision tasks such as object detection and recognition for applications like surveillance and autonomous driving, and dehazing can alleviate the issue. Here, we consider the state-of-the-art YOLOv3 [48] object detection and recognition model, and apply it to real hazy images and to their dehazed outputs produced by state-of-the-art dehazing approaches and our proposed IPUDN. Fig. 10 shows the visual comparison of object detection and recognition. We see that the degree of dehazing by our IPUDN is better, which helps to detect the maximum number of objects in the scene. As can be seen in the first-row images, three distant objects camouflaged by haze are successfully detected after dehazing using our approach, whereas none of these three objects is detected when the other approaches are applied. A similar observation can be made in the second-row images, where a couple of objects hidden by haze on the left side of the image are detected after dehazing using our approach. Thus, our IPUDN is seen to produce dehazed outputs suitable for object detection and recognition in hazy environments.
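
As a hedged illustration of this experiment, the snippet below runs a hub-published pretrained detector on a hazy image and on its dehazed counterpart and prints the detections; we use an Ultralytics YOLOv5 model as a readily loadable stand-in for the YOLOv3 setup used in the paper, and the image paths are placeholders.

```python
# Sketch only: compare detections on a hazy image and its dehazed output using
# a hub-loadable pretrained detector (YOLOv5 stand-in; the paper used YOLOv3).
import torch

detector = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

for path in ['hazy.png', 'dehazed_ipudn.png']:   # placeholder image paths
    results = detector(path)
    results.print()   # class labels, confidences, and box counts per image
```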

5.3.2 Low-level Vision Tasks

Haze-like effects can arise from different environmental conditions such as rain and halation, and from different image capturing conditions such as glare reflection and underwater photography [53, 64, 65, 33, 6]. There are domain-specific models to overcome these image degradation effects. However, to observe the extensibility of our model, we test it in the mentioned conditions without any re-training and find that our dehazing framework produces outputs with substantially better clarity than the input images. We present examples of such images and their dehazed versions obtained using our IPUDN in Fig. 11. In the dehazed rainy image, although the rain streaks remain, the haziness they induce is removed. In the glare-reflected image, we can see that IPUDN removes the haziness due to the glare and enhances the image. Our proposed framework also reduces the haze due to halation, producing a better quality image. Although most underwater images look like hazy color-cast images, their formation follows a different light scattering model due to the presence of water as the light transmission medium [33]. Nevertheless, applying our IPUDN without re-training results in satisfactory removal of haze and color cast in the underwater image shown. Further, we also apply our IPUDN to a non-hazy image. As can be seen, our dehazing approach makes very little difference in this case, apart from a slight, visually pleasing enhancement.

6 Conclusion

In this paper, we propose a single image dehazing framework that involves transmission map and atmospheric light estimation and iterative updating. The updater networks work jointly with the estimation of the dehazed image using an LSTM-based convolutional architecture. Our novel model for atmospheric light estimation produces channel-wise estimates, which allows the handling of color casts in hazy images. Our proposed iterative dehazing model with the novel updater networks is designed to work in a wide variety of hazy conditions with different amounts of haze. Our dehazing approach is experimentally found to perform effectively for both indoor and outdoor images, and for both real-world and synthetic hazy images. Hazy conditions with different amounts of haze and with color casts are well handled by our approach, which outperforms the state-of-the-art. The dehazed results produced by our approach achieve enhanced visibility while ensuring visually realistic, faithful restoration. In a detailed ablation study, one of the findings is that our iterative updating strategy is more effective than LSTM-based recursion, and that their combined use improves the effectiveness further. Employing a separate dehazing network that takes the transmission map and atmospheric light as priors is also found to be more effective than using an image-to-image mapping network. Finally, our updater strategy is generic in nature and can be used in other dehazing networks with simple modifications.

References

  • [1] C. Ancuti, C. O. Ancuti, C. De Vleeschouwer, and A. C. Bovik (2020) Day and night-time dehazing by local airlight estimation. IEEE Transactions on Image Processing 29 (), pp. 6264–6275. Cited by: §1, §1, §2.1.
  • [2] C. O. Ancuti, C. Ancuti, R. Timofte, and C. De Vleeschouwer (2018) O-haze: a dehazing benchmark with real hazy and haze-free outdoor images. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 754–762. Cited by: §4.1, §4.3.2.
  • [3] C. O. Ancuti and C. Ancuti (2013) Single image dehazing by multi-scale fusion. IEEE Transactions on Image Processing 22 (8), pp. 3271–3282. Cited by: §1, §1, §2.1.
  • [4] C. Ancuti, C. O. Ancuti, R. Timofte, and C. De Vleeschouwer (2018) I-haze: a dehazing benchmark with real hazy and haze-free indoor images. In International Conference on Advanced Concepts for Intelligent Vision Systems, pp. 620–631. Cited by: §4.1, §4.3.2.
  • [5] E. Barshan and P. Fieguth (2015) Stage-wise training: an improved feature learning strategy for deep models. In Feature Extraction: Modern Questions and Challenges, pp. 49–59. Cited by: §3.5.
  • [6] D. Berman, D. Levy, S. Avidan, and T. Treibitz (2020) Underwater single image color restoration using haze-lines and a new quantitative dataset. IEEE Transactions on Pattern Analysis and Machine Intelligence (), pp. 1–1. Cited by: §5.3.2.
  • [7] D. Berman, T. Treibitz, and S. Avidan (2020) Single image dehazing using haze-lines. IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (3), pp. 720–734. Cited by: §1, §1, §1, §2.1, TABLE I, TABLE II, §4.3.1, §4.4.1.
  • [8] T. M. Bui and W. Kim (2017) Single image dehazing using color ellipsoid prior. IEEE Transactions on Image Processing 27 (2), pp. 999–1009. Cited by: §1, §1, §2.1.
  • [9] B. Cai, X. Xu, K. Jia, C. Qing, and D. Tao (2016) Dehazenet: an end-to-end system for single image haze removal. IEEE Transactions on Image Processing 25 (11), pp. 5187–5198. Cited by: §1, §1, §2.2.
  • [10] W. Chen, J. Ding, and S. Kuo (2019) PMS-net: robust haze removal based on patch map for single images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11681–11689. Cited by: §1, §1, §2.2.
  • [11] L. K. Choi, J. You, and A. C. Bovik (2015) Referenceless prediction of perceptual fog density and perceptual image defogging. IEEE Transactions on Image Processing 24 (11), pp. 3888–3901. Cited by: §1, §1, §2.1, TABLE I, TABLE II, §4.3.1, §4.4.2.
  • [12] Z. Deng, L. Zhu, X. Hu, C. Fu, X. Xu, Q. Zhang, J. Qin, and P. Heng (2019-10) Deep multi-model fusion for single-image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Cited by: §5.2.1.
  • [13] S. K. Dhara, M. Roy, D. Sen, and P. K. Biswas (2020) Color cast dependent image dehazing via adaptive airlight refinement and non-linear color balancing. IEEE Transactions on Circuits and Systems for Video Technology (), pp. 1–1. Cited by: §2.1.
  • [14] H. Dong, J. Pan, L. Xiang, Z. Hu, X. Zhang, F. Wang, and M. Yang (2020-06) Multi-scale boosted dehazing network with dense feature fusion. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: Fig. 2, 2(c), §1, §1, §2.2, TABLE I, TABLE II, §4.3.1, §4.4.1.
  • [15] A. Dudhane and S. Murala (2019) RYF-net: deep fusion network for single image haze removal. IEEE Transactions on Image Processing 29, pp. 628–640. Cited by: §1, §1, §2.2.
  • [16] A. Eitel, J. T. Springenberg, L. Spinello, M. Riedmiller, and W. Burgard (2015) Multimodal deep learning for robust rgb-d object recognition. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 681–687. Cited by: §3.5.
  • [17] R. Fattal (2008) Single image dehazing. ACM Transactions on Graphics (TOG) 27 (3), pp. 72. Cited by: §1, §1, §2.1.
  • [18] R. Fattal (2014) Dehazing using color-lines. ACM Transactions on Graphics (TOG) 34 (1), pp. 13. Cited by: §1, §1, §2.1.
  • [19] A. Golts, D. Freedman, and M. Elad (2020) Unsupervised single image dehazing using dark channel prior loss. IEEE Transactions on Image Processing 29 (), pp. 2692–2701. Cited by: §1, §1, §2.2.
  • [20] K. He, J. Sun, and X. Tang (2011) Single image haze removal using dark channel prior. IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (12), pp. 2341–2353. Cited by: §1, §1, §1, §2.1, §4.4.2.
  • [21] K. He, J. Sun, and X. Tang (2012) Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (6), pp. 1397–1409. Cited by: §2.1.
  • [22] S. Hochreiter and J. Schmidhuber (1997) Long short-term memory. Neural computation 9 (8), pp. 1735–1780. Cited by: §3.4.1.
  • [23] S. Huang, B. Chen, and W. Wang (2014) Visibility restoration of single hazy images captured in real-world weather conditions. IEEE Transactions on Circuits and Systems for Video Technology 24 (10), pp. 1814–1824. Cited by: §2.1.
  • [24] J. Johnson, A. Alahi, and L. Fei-Fei (2016) Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pp. 694–711. Cited by: §3.4.4.
  • [25] M. Ju, C. Ding, Y. J. Guo, and D. Zhang (2020) IDGCP: image dehazing based on gamma correction prior. IEEE Transactions on Image Processing 29 (), pp. 3104–3118. Cited by: §1, §1, §2.1.
  • [26] J. Kim, J. Kwon Lee, and K. Mu Lee (2016) Deeply-recursive convolutional network for image super-resolution. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1637–1645. Cited by: §3.4.1.
  • [27] S. E. Kim, T. H. Park, and I. K. Eom (2020) Fast single image dehazing using saturation based transmission map estimation. IEEE Transactions on Image Processing 29 (), pp. 1985–1998. Cited by: §1, §1, §2.1, TABLE I, TABLE II, §4.3.1, §4.4.1.
  • [28] D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §4.2.
  • [29] H. Koschmieder (1924) Theorie der horizontalen sichtweite. Beitrage zur Physik der freien Atmosphare, pp. 33–53. Cited by: §3.1.
  • [30] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. (2017) Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4681–4690. Cited by: §3.4.4.
  • [31] B. Li, X. Peng, Z. Wang, J. Xu, and D. Feng (2017) Aod-net: all-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision, pp. 4770–4778. Cited by: §1, §1, §2.2, §3.1, TABLE I, TABLE II, §4.3.1.
  • [32] B. Li, W. Ren, D. Fu, D. Tao, D. Feng, W. Zeng, and Z. Wang (2019) Benchmarking single-image dehazing and beyond. IEEE Transactions on Image Processing 28 (1), pp. 492–505. Cited by: §1, §4.1, §4.4.2.
  • [33] C. Li, C. Guo, W. Ren, R. Cong, J. Hou, S. Kwong, and D. Tao (2019) An underwater image enhancement benchmark dataset and beyond. IEEE Transactions on Image Processing 29, pp. 4376–4389. Cited by: §5.3.2.
  • [34] L. Li, Y. Dong, W. Ren, J. Pan, C. Gao, N. Sang, and M. Yang (2020) Semi-supervised image dehazing. IEEE Transactions on Image Processing 29 (), pp. 2766–2779. Cited by: §1, §1, §2.2.
  • [35] Y. Li, Q. Miao, W. Ouyang, Z. Ma, H. Fang, C. Dong, and Y. Quan (2019-10) LAP-net: level-aware progressive network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Cited by: §2.2.
  • [36] Z. Li, J. Zheng, Z. Zhu, W. Yao, and S. Wu (2014) Weighted guided image filtering. IEEE Transactions on Image Processing 24 (1), pp. 120–129. Cited by: §2.1.
  • [37] Z. Li and J. Zheng (2015) Edge-preserving decomposition-based single image haze removal. IEEE Transactions on Image Processing 24 (12), pp. 5432–5441. Cited by: §1, §1, §2.1.
  • [38] Z. Li and J. Zheng (2017) Single image de-hazing using globally guided image filtering. IEEE Transactions on Image Processing 27 (1), pp. 442–450. Cited by: §1, §1, §2.1.
  • [39] Z. Li and N. Snavely (2018) Megadepth: learning single-view depth prediction from internet photos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2041–2050. Cited by: §4.1.
  • [40] X. Liu, Y. Ma, Z. Shi, and J. Chen (2019) GridDehazeNet: attention-based multi-scale network for image dehazing. In Proceedings of the IEEE International Conference on Computer Vision, pp. 7314–7323. Cited by: §1, §1, §2.2, TABLE I, TABLE II, §4.3.1, §4.4.1.
  • [41] S. G. Narasimhan and S. K. Nayar (2001) Removing weather effects from monochrome images. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, pp. II–186. Cited by: §1, §2.
  • [42] S. G. Narasimhan and S. K. Nayar (2003) Contrast restoration of weather degraded images. IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (6), pp. 713–724. Cited by: §1, §2.
  • [43] S. K. Nayar and S. G. Narasimhan (1999) Vision in bad weather. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Vol. 2, pp. 820–827. Cited by: §1, §2.
  • [44] I. Omer and M. Werman (2004) Color lines: image specific color representation. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., Vol. 2, pp. II–II. Cited by: §2.1.
  • [45] J. Park, D. K. Han, and H. Ko (2020) Fusion of heterogeneous adversarial networks for single image dehazing. IEEE Transactions on Image Processing 29 (), pp. 4721–4732. Cited by: §1, §1, §2.2.
  • [46] Y. Peng, Z. Lu, F. Cheng, Y. Zheng, and S. Huang (2020) Image haze removal using airlight white correction, local light filter, and aerial perspective prior. IEEE Transactions on Circuits and Systems for Video Technology 30 (5), pp. 1385–1395. Cited by: Fig. 2, §1, §1, §2.1, TABLE I, TABLE II, §4.3.1, §4.4.1, §4.4.2.
  • [47] Y. Qu, Y. Chen, J. Huang, and Y. Xie (2019) Enhanced pix2pix dehazing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8160–8168. Cited by: Fig. 2, 2(b), §1, §1, §2.2, §3.1, TABLE I, TABLE II, §4.3.1, §4.4.1.
  • [48] J. Redmon and A. Farhadi (2018) Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767. Cited by: §5.3.1.
  • [49] D. Ren, W. Zuo, Q. Hu, P. Zhu, and D. Meng (2019) Progressive image deraining networks: a better and simpler baseline. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3937–3946. Cited by: §3.4.1.
  • [50] W. Ren, S. Liu, H. Zhang, J. Pan, X. Cao, and M. Yang (2016) Single image dehazing via multi-scale convolutional neural networks. In European conference on computer vision, pp. 154–169. Cited by: §1, §1, §2.2.
  • [51] W. Ren, L. Ma, J. Zhang, J. Pan, X. Cao, W. Liu, and M. Yang (2018) Gated fusion network for single image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3253–3261. Cited by: §1, §1, §2.2, §3.1.
  • [52] S. Santra, R. Mondal, and B. Chanda (2018) Learning a patch quality comparator for single image dehazing. IEEE Transactions on Image Processing 27 (9), pp. 4598–4607. Cited by: §1, §2.2, TABLE I, TABLE II, §4.3.1, §4.4.1.
  • [53] Y. Y. Schechner and Y. Averbuch (2007) Regularized image recovery in scattering media. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (9), pp. 1655–1660. Cited by: §1, §5.3.2.
  • [54] G. Sharma, W. Wu, and E. N. Dalal (2005) The ciede2000 color-difference formula: implementation notes, supplementary test data, and mathematical observations. Color Research & Application: Endorsed by Inter-Society Color Council, The Colour Group (Great Britain), Canadian Society for Color, Color Science Association of Japan, Dutch Society for the Study of Color, The Swedish Colour Centre Foundation, Colour Society of Australia, Centre Français de la Couleur 30 (1), pp. 21–30. Cited by: §4.3.1.
  • [55] P. Sharma, P. Jain, and A. Sur (2020) Scale-aware conditional generative adversarial network for image dehazing. In The IEEE Winter Conference on Applications of Computer Vision, pp. 2355–2365. Cited by: §3.1.
  • [56] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus (2012) Indoor segmentation and support inference from rgbd images. In European Conference on Computer Vision, pp. 746–760. Cited by: §4.1.
  • [57] K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. Cited by: §3.4.4.
  • [58] Y. Tai, J. Yang, and X. Liu (2017) Image super-resolution via deep recursive residual network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3147–3155. Cited by: §3.4.1.
  • [59] R. T. Tan (2008) Visibility in bad weather from a single image. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. Cited by: §1, §1, §2.1.
  • [60] A. Wang, W. Wang, J. Liu, and N. Gu (2019) AIPNet: image-to-image single image dehazing with atmospheric illumination prior. IEEE Transactions on Image Processing 28 (1), pp. 381–393. Cited by: §2.2.
  • [61] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, et al. (2004) Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13 (4), pp. 600–612. Cited by: §3.2.
  • [62] Y. Wu and K. He (2018) Group normalization. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19. Cited by: §3.3.
  • [63] S. Xingjian, Z. Chen, H. Wang, D. Yeung, W. Wong, and W. Woo (2015) Convolutional lstm network: a machine learning approach for precipitation nowcasting. In Advances in neural information processing systems, pp. 802–810. Cited by: §3.4.1.
  • [64] W. Yang, R. T. Tan, J. Feng, Z. Guo, S. Yan, and J. Liu (2020) Joint rain detection and removal from a single image with contextualized deep networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (6), pp. 1377–1393. Cited by: §5.3.2.
  • [65] W. Yang, R. T. Tan, S. Wang, Y. Fang, and J. Liu (2020) Single image deraining: from model-based to data-driven and beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence (), pp. 1–1. Cited by: §5.3.2.
  • [66] H. Zhang and V. M. Patel (2018) Densely connected pyramid dehazing network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3194–3203. Cited by: §1, §1, §2.2, §3.2, §3.5, §5.2.1.
  • [67] J. Zhang and D. Tao (2020) FAMED-net: a fast and accurate multi-scale end-to-end dehazing network. IEEE Transactions on Image Processing 29 (), pp. 72–84. Cited by: §1, §1, §1, §2.2.
  • [68] S. Zhao, L. Zhang, S. Huang, Y. Shen, and S. Zhao (2020) Dehazing evaluation: real-world benchmark datasets, criteria and baselines. IEEE Transactions on Image Processing (), pp. 1–1. Cited by: §4.1, §4.3.2.
  • [69] Q. Zhu, J. Mai, and L. Shao (2015) A fast single image haze removal algorithm using color attenuation prior. IEEE Transactions on Image Processing 24 (11), pp. 3522–3533. Cited by: §1, §1, §2.1.