Real-world Underwater Enhancement: Challenging, Benchmark and Efficient Solutions

by Risheng Liu, et al.
Dalian University of Technology

Underwater image enhancement is an important low-level vision task with many applications, and numerous algorithms have been proposed in recent years. Despite the demonstrated success, these results are often generated based on different assumptions using different datasets and metrics. In this paper, we propose a large-scale Realistic Underwater Image Enhancement (RUIE) dataset, in which all degraded images are divided into multiple sub-datasets according to a natural underwater image quality evaluation metric and the degree of color deviation. Compared with existing testing or training sets of realistic underwater scenes, the RUIE dataset contains three sub-datasets, which are specifically selected and classified for experiments on non-reference image quality evaluation, color deviation and task-driven detection. Based on RUIE, we conduct extensive and systematic experiments to evaluate the effectiveness and limitations of various algorithms on images with a hierarchical classification of degradation. Our evaluation and analysis demonstrate the performance and limitations of state-of-the-art algorithms. The findings from these experiments not only confirm what is commonly believed, but also suggest new research directions. More importantly, we recognize that underwater image enhancement in practice usually serves as the preprocessing step for mid-level and high-level vision tasks. We thus propose to exploit the object detection performance on the enhanced images as a brand-new "task-specific" evaluation criterion for underwater image enhancement algorithms.



I Introduction

The development and utilization of ocean resources is of great significance to human beings. However, underwater images with low visibility cause failures in multimedia sharing and in automatic computer vision systems for object detection, tracking, and recognition. It is therefore crucial to develop underwater image enhancement technology that benefits downstream underwater computer vision tasks.

In reality, two main factors degrade underwater images. On the one hand, wavelength-dependent attenuation of light causes different degrees of color cast, depending on optical wavelength, dissolved organic compounds, water salinity, concentration of phytoplankton, etc. As red light, which has a longer wavelength, is absorbed more than green and blue light, the water usually looks bluish or greenish. On the other hand, the reflected light is absorbed and scattered by suspended particles in the medium before it reaches the camera, which leads to low contrast and a haze effect.

Fig. 1: Schematic diagram of underwater optical imaging.

For the above degradations, the goal of underwater image enhancement (UIE) algorithms is to remove the effect of light scattering (referred to as dehazing) and correct the color cast in underwater images. Numerous UIE algorithms have been recently proposed with different assumptions and evaluation criteria. Generally speaking, UIE algorithms can be categorized into three types: non-model-based image enhancement methods, prior-driven model-based methods and learning-based convolutional neural networks (CNNs). Traditional non-model-based works [1, 2, 3, 4] aim to develop image enhancing techniques (white balance, histogram equalization, fusion-based methods, etc.), whereas the underwater imaging model provides a classical description of how the degraded image is generated. To solve the imaging model and obtain the clear scene, prior-driven methods exploit various kinds of domain knowledge [5, 6, 7, 8, 9, 10, 11, 12] to estimate the depth-dependent transmission map, while recent data-driven end-to-end CNNs [13, 14] directly learn this essential parameter from the degraded input.

Generic UIE algorithms in the literature are usually evaluated on different images and with different image quality indexes (e.g., contrast, saturation and luminance). In view of the constant progress of underwater image enhancement algorithms, it is necessary to continuously update and enrich the underwater databases used for evaluating algorithms and for generating synthetic images to train data-driven CNNs. Although some works have already proposed underwater databases containing numerous images for investigation [15], those databases still have three major limitations: 1) The scenes and tones of the images are relatively monotonous, which makes them unfit for evaluating the performance of algorithms under different illumination conditions and color casts. 2) The scene depth is shallow and the scattering effect is not obvious, so they are unsuitable for evaluating model-based algorithms, especially the estimation of transmission. 3) There are few marine organisms in the images, which limits the application of these databases to actual detection tasks.

Our work is directly motivated by the need to overcome the above three limitations, and our contributions are summarized as follows:

  • We propose a large-scale underwater benchmark called the Realistic Underwater Image Enhancement (RUIE) dataset. An overview of RUIE can be found in Table I, and image examples are displayed in Figure 2. Compared with existing realistic image sets of underwater scenes, the RUIE dataset includes a large diversity of images, which are divided into multiple sub-datasets according to a natural underwater image quality evaluation metric and the degree of color deviation.

  • Based on RUIE, we conduct substantial and systematic experiments to evaluate the performance of various algorithms in processing images with multiple degrees of degradation or different types of color cast. Both quantitative and qualitative analyses demonstrate the advantages and limitations of the algorithms used. The findings from these experiments not only confirm what is commonly believed, but also offer rich insights into new research directions in underwater image enhancement.

  • We recognize that underwater image enhancement in practice usually serves as the preprocessing step for mid-level and high-level vision tasks. We thus propose to exploit the object detection performance on the enhanced images as a brand-new “task-specific” evaluation criterion for various algorithms. Moreover, based on the experimental results, we analyze the current underwater image evaluation measures and show that there is not much correlation between these image quality metrics and the image-classification accuracy when the images are preprocessed by existing enhancement methods. We also suggest new research directions for combining enhancement with high-level vision tasks.

II Related Work

Generic UIE algorithms aim to generate high-quality images from a single degraded input, either by image enhancement techniques or by underwater optical image processing. Depending on whether and how they rely on the imaging model, most existing UIE algorithms can be categorized into the following three types of approaches.

II-1 Non-model-based Methods

Taking no account of the optical imaging model, these algorithms improve image quality by adjusting the pixel values of the image. Traditional image enhancement methods can be divided into two categories. The first is spatial domain methods, which directly process the pixels of the image based on gray mapping transformations, such as histogram transformation [16], the Gray World algorithm [17], contrast limited adaptive histogram equalization (CLAHE) [2], multi-scale retinex with color restoration [1], automatic white balance [18] and color constancy [19, 20]. The second is transform domain methods, which map inputs from the spatial domain to a transform domain and then exploit specific properties of that domain to perform image processing [21].

However, although spatial domain methods can improve the visual quality to some extent, they tend to amplify noise, introduce artifacts, and cause color distortion. Transform domain methods perform well in removing noise, but they cannot achieve satisfactory results for issues such as low contrast, color deviation and haze. Due to the complexity of underwater environments and illumination conditions, underwater image degradation cannot be completely solved by image enhancement techniques alone.
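As an illustration of a spatial domain method, the Gray World idea mentioned above can be sketched in a few lines. This is a minimal sketch, not the implementation used in [17]: it assumes an image given as a flat list of (R, G, B) tuples in [0, 255], and the function name `gray_world` and the sample pixel values are hypothetical.

```python
def gray_world(pixels):
    """Apply Gray World white balance to a list of (R, G, B) tuples in [0, 255]."""
    n = len(pixels)
    # Per-channel means of the input image.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    # Gray World assumes the scene average is achromatic, so all three
    # channel means are moved onto their common average.
    gray = sum(means) / 3.0
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [
        tuple(min(255.0, p[c] * gains[c]) for c in range(3))
        for p in pixels
    ]


# A tiny greenish "image" (hypothetical values): red is strongly attenuated.
image = [(20, 120, 100), (40, 160, 120), (30, 140, 110)]
balanced = gray_world(image)
channel_means = [sum(p[c] for p in balanced) / len(balanced) for c in range(3)]
```

After balancing, the three channel means coincide as long as no pixel is clipped at 255. The fixed, global gains also show why such methods cannot adapt to spatially varying haze.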

II-2 Model-based Methods

(a) UIQS. From left to right, the five sub-datasets A to E are ranked according to the non-reference image quality metrics, and the corresponding image quality decreases successively.
(b) UCCS, which is divided into three subsets, “Green”, “Green-blue” and “Blue”, according to the degree of color cast.
(c) UTTS. There are five sub-datasets A to E in sequence, and the corresponding image quality decreases in turn.
Fig. 2: Example images from the three sets in RUIE.

Based on the Jaffe-McGlamery imaging model [22, 23], physical model-based methods estimate the model parameters under various domain assumptions; the clear underwater scene can then be restored by inverting the degradation process. The underwater optical imaging model can be formulated as:

$$I_\lambda(x) = J_\lambda(x)\, t_\lambda(x) + A_\lambda \left(1 - t_\lambda(x)\right), \quad \lambda \in \{R, G, B\}, \tag{1}$$

where $I$ is the captured degraded image, $J$ is the clear scene radiance to be recovered, and $\lambda$ indexes the color channel. There are two critical parameters: the global atmospheric light $A$, and the transmission matrix $t$, which denotes the portion of the light that reaches the camera without being scattered or absorbed. It can be defined as:

$$t_\lambda(x) = e^{-\beta_\lambda d(x)}, \tag{2}$$

where $\beta$ is the scattering coefficient of the medium, which is related to water quality, water depth and salinity, and $d(x)$ represents the scene depth.

In order to solve the physical model, most state-of-the-art underwater enhancement methods estimate the key parameters $A$ and $t$ in either physically grounded or data-driven ways, and correct the color cast using traditional color balance or histogram equalization techniques.
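The following sketch illustrates how such a model is inverted once the two key parameters are known. It assumes the common per-channel form I = J·t + A·(1 − t) with t = exp(−β·d). The lower bound `t_min` on the transmission is a widely used safeguard against division by very small values and is an assumption here, as are all numeric values.

```python
import math


def transmission(beta, depth):
    """Transmission t = exp(-beta * d) for one color channel."""
    return math.exp(-beta * depth)


def degrade(J, A, t):
    """Forward model: observed intensity from scene radiance J and background light A."""
    return J * t + A * (1.0 - t)


def restore(I, A, t, t_min=0.1):
    """Invert the forward model; t is clamped from below to keep the division stable."""
    return (I - A) / max(t, t_min) + A


# Hypothetical per-channel parameters: red is attenuated far more than green and blue.
beta = {"R": 0.6, "G": 0.1, "B": 0.08}
A = {"R": 0.1, "G": 0.7, "B": 0.8}
J = {"R": 0.8, "G": 0.5, "B": 0.4}
depth = 3.0

I = {c: degrade(J[c], A[c], transmission(beta[c], depth)) for c in J}
J_rec = {c: restore(I[c], A[c], transmission(beta[c], depth)) for c in J}
```

With exact parameters the scene radiance is recovered exactly. In practice the quality of the result depends on how well the background light and the transmission are estimated, which is where the priors discussed next come in.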

Because the imaging models of underwater images and hazy images are similar to some extent, many UIE algorithms are extensions of prior-based dehazing algorithms to underwater scenes. However, due to the serious attenuation of red light in water, the original dehazing priors are not directly applicable to underwater images, so several studies have adapted those priors to make them suitable for UIE. For instance, several prior-based UIE methods are derived from the well-known dark channel prior (DCP) [24], which can effectively estimate the depth map of a hazy image. Chiang et al. [6] improved the DCP by compensating for the attenuation to restore the color balance. Drews-Jr et al. [9] applied their modified underwater DCP only to the blue and green channels. Galdran et al. [10] proposed the Red Channel Prior based on the DCP by exploiting the serious attenuation of the red channel. Besides, various other physical priors have been proposed for underwater image restoration. Carlevaris-Bianco et al. [5] took advantage of the channel difference to estimate the transmission. Wang et al. [25] derived a maximum attenuation identification (MAI) method, which uses only the red channel information to derive the depth map and background light. Peng et al. [12] presented a depth estimation method using image blurriness and light absorption. Wang and Liu et al. [26] proposed an adaptive attenuation-curve prior for both UIE and hazy image restoration.

However, the common shortcoming of the above prior-based UIE algorithms is that they sometimes fail in complex environments with severe color cast, where the scene falls outside the scope of these priors.
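To make the idea of an underwater dark channel concrete, the sketch below computes a dark channel over the green and blue channels only and derives a DCP-style transmission estimate from it. It is a simplified illustration rather than the method of [9] or [24]: the image is a small nested list of (R, G, B) values in [0, 1], a single scalar `background` stands in for the per-channel background light, and `omega` and `radius` are assumed parameters.

```python
def underwater_dark_channel(image, radius=1):
    """Minimum over the G and B channels within a square patch around each pixel."""
    h, w = len(image), len(image[0])
    # Per-pixel minimum over green and blue only; the red channel is ignored
    # because it is heavily attenuated under water.
    gb_min = [[min(px[1], px[2]) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            dark[y][x] = min(gb_min[j][i] for j in ys for i in xs)
    return dark


def estimate_transmission(dark, background=1.0, omega=0.95):
    """DCP-style estimate t = 1 - omega * dark / background (scalar background assumed)."""
    return [[1.0 - omega * d / background for d in row] for row in dark]


# Hypothetical 3x3 bluish-green image with one pixel that is dark in the green channel.
image = [[(0.1, 0.6, 0.7) for _ in range(3)] for _ in range(3)]
image[1][1] = (0.9, 0.2, 0.5)

dark_local = underwater_dark_channel(image, radius=0)
dark_patch = underwater_dark_channel(image, radius=1)
t_map = estimate_transmission(dark_patch)
```

A larger patch radius spreads the single dark pixel over its whole neighborhood, which is why patch-based priors usually need a refinement step to avoid block artifacts.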

II-3 Data-driven Enhancement CNNs

Very recently, heuristically constructed deep networks trained on large-scale datasets have delivered prominent performance in many vision and recognition tasks [27, 28]. Some existing methods suggest that CNNs can also benefit image enhancement in uncontrolled outdoor environments and can handle complex scenes effectively, either by accurate depth estimation or by direct end-to-end image enhancement. Generally, dehazing CNNs can be divided into two types: end-to-end CNNs that learn the clear scene directly from the degraded input [29, 30, 31], and CNNs that restore scenes from learned transmissions according to the physical imaging model [32].

The underwater environment is similar to hazy scenes on the ground to some extent, but it is more complex due to factors such as water flow, color deviation and insufficient illumination, which cannot be ignored. Accordingly, an effective UIE CNN may require a more complex network structure or a well-designed loss function. In fact, there are not many investigations of specialized UIE CNNs compared with the rapid development of CNNs for haze removal. Li et al. [13] built WaterGAN to synthesize a training dataset, and designed an end-to-end network consisting of a depth estimation module followed by a color correction module. Chen et al. [14] proposed a filtering-based restoration scheme composed of parameter search, filtering, and enhancement, and designed a GAN-based restoration scheme that adopts a multi-stage loss strategy for training. Different from the above two end-to-end UIE CNNs, Liu et al. [33] proposed a lightweight learning framework that aggregates both domain knowledge and learning strategies to estimate transmissions in challenging vision tasks, including UIE.

In this work, based on different settings of realistic images, the performance of some classical and state-of-the-art UIE algorithms is evaluated with both non-reference metrics and a task-specific criterion.

III The Proposed Dataset

Most existing underwater image and video datasets have one or both of the following limitations: monotonous deep-water scenes and insufficient data size. As the images captured during underwater robotic grasping are quite different from those obtained in general marine scenes, such datasets can hardly provide strong support for investigating high-level vision tasks such as robotic detection and recognition. Therefore, establishing a large-scale, diversified, task-driven database would greatly benefit the training and evaluation of CNN models. In addition, an appropriate database is also an important data basis for the feedback training of underwater robotic grasping systems and for relevant theoretical research.

The newly proposed dataset consists of 3630 real-world underwater images, which come from ten videos captured at the first Underwater Robot Picking Contest, Zhangzidao, Dalian, China. As for the environment of video sampling, the water depth is approximately 3 to 15 m, and all targets were randomly placed in the detection areas. All videos were captured between 8 AM and 11 AM and between 1 PM and 4 PM on September 21st to 22nd.

As these videos were captured in different time periods of the day, and the water quality differed considerably between shooting locations, the illumination, depth of field, degree of blur and color cast of the pictures are quite different. In order to measure the performance of UIE algorithms from multiple aspects, we created three sub-datasets, whose specific settings and functions are as follows:

Underwater Image Quality Set (UIQS): This sub-dataset is used to test how well UIE algorithms improve the comprehensive image quality. Specifically, we assessed the quality of the images with the underwater colour image quality evaluation (UCIQE) [34] metric, which is a linear combination of the chroma, saturation, and contrast of underwater images, and ranked them by their UCIQE scores. We then equally divided them into five subsets, denoted as [A, B, C, D, E] in order of quality from high to low, to facilitate testing the performance of different algorithms in various underwater conditions. Image examples are displayed in Figure 2.
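The ranking-and-splitting procedure used for UIQS can be sketched as follows. This is only an illustration of the idea: the function `split_by_quality`, the image identifiers and the scores are hypothetical, and the scores stand in for UCIQE values computed elsewhere.

```python
def split_by_quality(scores, n_subsets=5, labels="ABCDE"):
    """Rank images by score (best first) and cut the ranking into equal subsets."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = len(ranked) // n_subsets
    subsets = {}
    for k in range(n_subsets):
        start = k * size
        # The last subset absorbs any remainder when the split is not exact.
        end = (k + 1) * size if k < n_subsets - 1 else len(ranked)
        subsets[labels[k]] = ranked[start:end]
    return subsets


# Hypothetical UCIQE-like scores for ten images.
scores = {f"img_{i:02d}": 0.30 + 0.02 * i for i in range(10)}
subsets = split_by_quality(scores)
```

With 3630 images and five subsets, the same procedure yields 726 images per subset.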

Underwater Color Cast Set (UCCS): This sub-dataset is used to test the ability of UIE algorithms to correct color casts. According to the average value of the b channel (blue-yellow bias) in the CIELab color space, we collected 300 images from UIQS to form the UCCS dataset for evaluating color shift correction capability. It contains three 100-image sub-datasets of bluish, greenish and blue-green tones. The corresponding example images are shown in the second row of Figure 2.

Underwater Task-driven Testset (UTTS): In order to support high-level visual tasks such as identification and detection, UTTS contains 300 images, each of which contains several easily recognizable marine animals. We currently focus on three categories: abalones, sea cucumbers and sea urchins, as they are commonly used as target organisms in robotic grabbing tasks. Furthermore, similar to UIQS, UTTS is sorted into five sub-datasets according to the UCIQE scores, in order to explore the impact of image quality on detection accuracy.

Since enhanced images are often subsequently fed to high-level vision tasks, we argue that the optimization target of enhancement in these tasks is neither pixel-level nor perceptual-level quality, but the utility of the enhanced images in the given semantic analysis task. We thus propose a task-driven evaluation for UIE algorithms, and study the problem of object detection in the presence of haze as an example. Specifically, based on the network structure of YOLO-v3, we trained an underwater target detection CNN on 1800 labeled pictures captured in shallow waters with a depth of less than three meters. The trained detection CNN is then employed to test the enhanced results obtained by various UIE algorithms, and those algorithms are ranked by the mean Average Precision (mAP) achieved.
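For readers unfamiliar with the ranking criterion, the sketch below shows one common way to compute average precision (AP) for a class and to average it over classes. It assumes that each detection has already been matched against the ground truth and is given as a (confidence, is_true_positive) pair; the all-point interpolation used here is one standard convention and not necessarily the exact protocol used in the experiments. All sample values are hypothetical.

```python
def average_precision(detections, n_ground_truth):
    """Area under the precision-recall curve (all-point interpolation).

    detections: list of (confidence, is_true_positive) pairs for one class.
    n_ground_truth: number of annotated objects of that class (must be > 0).
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps a class name to (detections, n_ground_truth)."""
    aps = [average_precision(d, n) for d, n in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical detections on enhanced images for two classes.
per_class = {
    "sea_urchin": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "sea_cucumber": ([(0.9, True), (0.8, True)], 2),
}
ap_urchin = average_precision(*per_class["sea_urchin"])
map_score = mean_average_precision(per_class)
```

Running the same fixed detector on the outputs of each enhancement algorithm and comparing the resulting mAP values gives the task-driven ranking described above.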

| Subset | Image number |
|---|---|
| Underwater image quality set (UIQS) | 3630 (726 × 5) |
| Underwater color cast set (UCCS) | 300 (100 × 3) |
| Underwater task-driven testset (UTTS) | 300 (60 × 5) |
TABLE I: Subsets of RUIE for training and testing
Non-model-based methods
| Method | Enhancement technique | Test data | Criterion | Code |
|---|---|---|---|---|
| UCM [37] | Unsupervised color balance and histogram stretching | R | Histogram distribution | |
| MSRCR [1] | Multiscale retinex with color restoration | R | | |
| CLAHE [2] | Contrast-limited adaptive histogram equalization | R | | |
| Fusion [3] | White balance, bilateral filtering, image fusion | R, CC | Local feature points matching | |
| Ghani [39] | Minimizes under-enhanced and over-enhanced areas | R | Entropy, MSE, PSNR | |

Prior-based methods
| Method | Physical prior | Post process | Test data | Criterion | Code |
|---|---|---|---|---|---|
| BP [5] | Radiance attenuation | | R, CC | | |
| P. Drews-Jr [9] | Underwater DCP on g, b | | R, CC | RMSE | |
| UHP [7] | Color distribution | | R | RGB median angle | |
| ENOM [8] | Underwater DCP | | R | | |
| Li [40] | Underwater DCP | | R | | |
| LDP [41] | Histogram distribution prior | | R, S, CC | MSE, PSNR, Entropy, PCQI, UCIQE | |
| Peng [12] | Blurriness & light absorption | | R, S | PSNR, SSIM, BRISQUE, UIQM | |
| WCID [6] | Residual energy ratios | | R, CC | | |
| Galdran [10] | Red channel prior | | R | Edge number, Gradient ratio | |
| Lu [42] | UDCP with median filter | | R | PSNR, CNR, SSIM | |
| Li [11] | UDCP with median filter | | R | CNR | |
| Yang [43] | UDCP with median filter | | R | | |
| DPATN [33] | Learning-based UDCP | | R | | |
TABLE II: Brief description of some UIE algorithms. As for the test data, “R”, “S” and “CC” represent realistic images, synthetic underwater images and ColorChecker images. The related criteria are Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM), Contrast-to-Noise Ratio (CNR), Patch-based Contrast Quality Index (PCQI) [35], UCIQE, UIQM, and Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [36].

IV Benchmark Evaluation Results

In this section, based on the rich resources provided by RUIE, we quantitatively and qualitatively evaluate 10 representative underwater image enhancement algorithms: the non-model-based MSRCR (multi-scale retinex with color restoration [1]), CLAHE (contrast-limited adaptive histogram equalization [2]) and Fusion [3]; the underwater-prior-based BP (Bianco prior [5]), UHP (underwater haze-line prior [7]) and ENOM (enhancement with a new optical model [8]); and the extended applications of dehazing priors DCP_CB (dark channel prior [24]), BCCR_CB (boundary constrained context regularization [44]), CAP_CB (color attenuation prior [45]) and HLP_CB (haze-line prior [46]), where the subscript CB indicates the color balance post-processing. Further characteristics of these methods are summarized in Table II. To save time, we cropped and compressed all of the original images to 300 × 400 for the following experiments.

IV-A Evaluation Criteria

For the quality assessment of real-world underwater images, there is no ground truth for reference, so two non-reference underwater image quality evaluation indicators are used as the standard for objective quality assessment. The first is the underwater image quality measure (UIQM) [47], which includes three underwater image attribute measures: the underwater image colourfulness measure (UICM), the underwater image sharpness measure (UISM), and the underwater image contrast measure (UIConM). UIQM is expressed as a linear combination of these three components:

$$\mathrm{UIQM} = c_1 \times \mathrm{UICM} + c_2 \times \mathrm{UISM} + c_3 \times \mathrm{UIConM}, \tag{3}$$

where $c_1$, $c_2$ and $c_3$ are the scale factors of the three components. According to the original paper, we set $c_1 = 0.0282$, $c_2 = 0.2953$ and $c_3 = 3.5753$.

The second is the underwater colour image quality evaluation (UCIQE) [34], which uses a linear combination of the chroma, saturation, and contrast of underwater images in the CIELab color space. The UCIQE score is obtained as:

$$\mathrm{UCIQE} = c_1 \times \sigma_c + c_2 \times con_l + c_3 \times \mu_s, \tag{4}$$

where $\sigma_c$ is the standard deviation of chroma, $con_l$ is the contrast of brightness, $\mu_s$ is the average of saturation, and $c_1$, $c_2$ and $c_3$ are the scale factors. According to the original paper, we set $c_1 = 0.4680$, $c_2 = 0.2745$ and $c_3 = 0.2576$ in the assessment.
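Both metrics reduce to a weighted sum once their components are available, which the following sketch makes explicit. The weights are the values commonly quoted from the original UIQM [47] and UCIQE [34] papers and should be treated as assumptions here; computing the components themselves (UICM, UISM, UIConM, chroma deviation, luminance contrast, mean saturation) is outside the scope of this sketch. The sample component values are taken from one entry of subset A in Table III.

```python
# Weights as commonly quoted from the original UIQM and UCIQE papers (assumed).
UIQM_WEIGHTS = (0.0282, 0.2953, 3.5753)
UCIQE_WEIGHTS = (0.4680, 0.2745, 0.2576)


def uiqm(uicm, uism, uiconm, weights=UIQM_WEIGHTS):
    """Linear combination of colourfulness, sharpness and contrast measures."""
    c1, c2, c3 = weights
    return c1 * uicm + c2 * uism + c3 * uiconm


def uciqe(sigma_c, con_l, mu_s, weights=UCIQE_WEIGHTS):
    """Linear combination of chroma deviation, luminance contrast and mean saturation."""
    c1, c2, c3 = weights
    return c1 * sigma_c + c2 * con_l + c3 * mu_s


# Component values from one Table III entry (subset A): UICM, UISM, UIConM.
score = uiqm(-2.07, 4.925, 0.712)
```

The result agrees with the UIQM value of 3.942 listed for the same entry up to rounding. The large weight on UIConM and the small weight on UICM also show how a strongly shifted colourfulness value can still move the total score noticeably when its magnitude is large.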

IV-B Comprehensive Image Quality Comparison on UIQS

| A | UICM | -74.29 | -2.07 | -20.80 | -21.75 | -69.04 | -22.82 | 42.45 | -16.18 | -10.11 | 0.057 | -14.12 | -31.96 |
| A | UISM | 1.657 | 4.925 | 5.532 | 4.784 | 0.162 | 3.251 | 2.317 | 4.947 | 3.042 | 3.805 | 3.098 | 3.650 |
| A | UIConM | 0.459 | 0.712 | 0.807 | 0.818 | 0.563 | 0.743 | 0.684 | 0.811 | 0.778 | 0.833 | 0.759 | 0.782 |
| A | UIQM | 0.035 | 3.942 | 3.933 | 3.725 | 0.114 | 2.974 | 4.327 | 3.902 | 3.396 | 4.103 | 3.232 | 2.971 |
| A | UCIQE | 0.240 | 0.493 | 0.451 | 0.469 | 0.264 | 0.500 | 0.486 | 0.479 | 0.486 | 0.501 | 0.475 | 0.462 |
| B | UICM | -77.51 | -0.718 | -16.32 | -16.72 | -69.59 | -19.40 | 35.48 | -15.24 | -2.001 | 5.208 | -5.765 | -28.98 |
| B | UISM | 1.825 | 4.852 | 5.598 | 4.845 | 0.257 | 3.369 | 2.298 | 4.857 | 3.136 | 3.805 | 3.193 | 3.798 |
| B | UIConM | 0.512 | 0.717 | 0.805 | 0.825 | 0.619 | 0.745 | 0.665 | 0.806 | 0.781 | 0.822 | 0.759 | 0.784 |
| B | UIQM | 0.184 | 3.975 | 4.071 | 3.909 | 0.328 | 3.112 | 4.058 | 3.886 | 3.661 | 4.210 | 3.493 | 3.108 |
| B | UCIQE | 0.266 | 0.483 | 0.452 | 0.470 | 0.293 | 0.500 | 0.477 | 0.475 | 0.491 | 0.505 | 0.476 | 0.472 |
| C | UICM | -82.01 | -1.497 | -19.50 | -19.19 | -70.57 | -18.59 | 37.84 | -14.83 | -1.85 | 6.256 | -7.92 | -31.69 |
| C | UISM | 1.836 | 4.663 | 5.525 | 4.719 | 0.262 | 3.376 | 2.251 | 4.793 | 3.094 | 3.793 | 3.180 | 3.731 |
| C | UIConM | 0.520 | 0.703 | 0.791 | 0.811 | 0.635 | 0.742 | 0.652 | 0.802 | 0.766 | 0.809 | 0.740 | 0.758 |
| C | UIQM | 0.089 | 3.848 | 3.909 | 3.753 | 0.358 | 3.124 | 4.061 | 3.863 | 3.600 | 4.190 | 3.363 | 2.918 |
| C | UCIQE | 0.283 | 0.475 | 0.453 | 0.470 | 0.311 | 0.501 | 0.478 | 0.478 | 0.493 | 0.502 | 0.472 | 0.478 |
| D | UICM | -84.82 | -2.079 | -16.27 | -17.00 | -71.70 | -18.49 | 29.53 | -16.97 | 1.695 | 8.351 | -10.65 | -28.55 |
| D | UISM | 1.889 | 4.431 | 5.422 | 4.548 | 0.351 | 3.235 | 2.289 | 4.814 | 3.076 | 3.671 | 3.131 | 3.608 |
| D | UIConM | 0.539 | 0.688 | 0.778 | 0.799 | 0.637 | 0.727 | 0.639 | 0.7990 | 0.754 | 0.787 | 0.723 | 0.725 |
| D | UIQM | 0.092 | 3.709 | 3.925 | 3.719 | 0.361 | 3.034 | 3.794 | 3.651 | 4.134 | 3.800 | 3.210 | 2.852 |
| D | UCIQE | 0.301 | 0.463 | 0.455 | 0.471 | 0.338 | 0.508 | 0.466 | 0.477 | 0.496 | 0.500 | 0.472 | 0.486 |
| E | UICM | -77.19 | 0.971 | -23.68 | -26.65 | -64.64 | -22.23 | 11.58 | -17.79 | -16.30 | -7.198 | -32.91 | -36.94 |
| E | UISM | 3.291 | 3.650 | 6.085 | 5.057 | 1.131 | 4.153 | 3.513 | 3.870 | 4.486 | 4.805 | 4.252 | 4.729 |
| E | UIConM | 0.696 | 0.630 | 0.785 | 0.841 | 0.715 | 0.755 | 0.721 | 0.790 | 0.804 | 0.815 | 0.787 | 0.738 |
| E | UIQM | 1.285 | 3.358 | 3.937 | 3.749 | 1.068 | 3.300 | 3.940 | 3.467 | 3.738 | 4.129 | 3.142 | 2.991 |
| E | UCIQE | 0.362 | 0.375 | 0.430 | 0.449 | 0.413 | 0.504 | 0.457 | 0.489 | 0.484 | 0.486 | 0.462 | 0.485 |
TABLE III: Non-reference underwater image quality evaluation of algorithms on UIQS. The first column gives the UIQS subset (A to E) and the second the metric.

First, based on the subset UIQS, we compare the image quality improvement capabilities of the above methods. As shown in the qualitative comparison in Figure 5, when the underwater scattering effect is not serious (pictures with quality level D), most methods achieve good enhancement. MSRCR corrects color cast well, but it cannot guarantee sufficient image saturation and contrast. CLAHE and Fusion notably improve image brightness, saturation and contrast, but the lack of an imaging model leads to a residual haze effect. Moreover, the fixed parameter settings prevent these enhancement methods from working adaptively. BP is effective for haze removal, but it cannot deal with color cast well, especially when the water is greenish. UHP may generate over-saturation and excessive contrast, which causes a loss of image details. As for DPATN and the extended applications of dehazing priors, they remove the haze effect effectively and produce natural hues.

For underwater images taken in turbid scenes (e.g., pictures with quality grades D and E), MSRCR shows an obvious weakness: it even aggravates the effect of scattering, although this may be caused by the generic algorithm parameters. CLAHE and Fusion improve the contrast of such images; however, due to the lack of an imaging model, they introduce considerable artifacts and leave an obvious residual haze effect in areas where the underwater haze is severe. BP has little effect on those severely degraded inputs because of their low contrast and color cast. Another prior-based method, UHP, produces relatively clear results, especially for scenes far from the camera. ENOM tends to give the result a severely reddish hue and thus fails. It is worth noting, however, that because a reddish hue results in higher UICM values, those visually failed results still obtain higher UIQM scores. In comparison, DPATN and the extended applications of dehazing priors remove haze well in challenging scenes. Among them, one improves the image brightness to recover more image details, and another performs best at improving visibility and contrast.

Furthermore, as shown in Table III, we make quantitative comparisons on UIQS with the non-reference image quality measures UIQM [47] and UCIQE, where the two highest scores are marked in bold. In particular, for the extended applications of dehazing priors, we compare the performance of two strategies: with and without color balance (CB) post-processing. As shown in Figure 5, without the post-processing they somewhat improve the image quality, but their performance drops markedly as the image quality decreases. Post-processing significantly improves the performance and keeps it more stable, guaranteeing the enhancement capacity in challenging scenes. We also conclude that combining the imaging model with image enhancement techniques is essential for improving image quality.

Figure 3 compares the performance of a single underwater dehazing strategy (dashed line) with that of the “dehazing + white balance post-processing” strategy (solid line). On the five sub-datasets of UIQS, and especially in the challenging scenes, pure white balance processing improves the image quality much more than the four underwater dehazing algorithms alone. The combination of underwater dehazing and white balance achieves better and more stable performance.

Generally, for the physics-based methods, one difficult challenge is severe color cast, because it usually means that the red channel is so dim that many image details are lost. Another challenging case is the low-light scene, because such scenes often contain saturated pixels that interfere with the estimation of the global background light and the medium transmission map. In such cases, the estimated medium transmission map is approximately equal to 1 because the pixel values of the input image are about 0 (i.e., the input image is almost dark). According to Eq. (1), if the medium transmission map is close to 1, the prior-based methods have little effect on the input image. Therefore, compared with pixel-based image enhancement techniques like MSRCR, CLAHE and Fusion, prior-based methods are limited when processing underwater images captured under low-light conditions. Comparing image enhancement methods using UCIQE, UIQM or other no-reference metrics is difficult because the metrics weight contrast and colorfulness differently. For example, the UIQM algorithm removes a fraction of the pixels with the brightest and darkest values before computing the image colorfulness, whereas the UCIQE algorithm uses all pixels. Depending on factors like this and on the weight given to the different components, a white balancing step or a histogram equalization step can have a significant effect on the quantitative output of the metrics.

IV-C Color Correction Comparison on UCCS

Next, based on UCCS, we evaluate the color correction capabilities of the different algorithms. As shown in Figure 7, MSRCR corrects both greenish and bluish tones well. CLAHE and Fusion handle greenish tones better than blue tones. BP enhances the contrast in bluish scenes, but it tends to fail on greenish pictures. UHP may produce partially dark regions and thus lose image information. ENOM always over-corrects and produces reddish results. Among the four underwater applications of dehazing priors, two correct blue tones well and lead to more natural results, and another performs best when dealing with low illumination and greenish tones.

The corresponding quantitative results are shown in Table IV, in which we adopt the average values of channels a and b in the CIELab space as evaluation criteria to measure the degree of color cast. The a value reflects the red-green bias, and a more negative value means a greener tone. The b value reflects the blue-yellow bias, and a more negative value denotes a bluer tone. Near-zero values of a and b indicate a lower color cast. The two best scores are marked in bold.
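The color cast measure can be sketched as follows. This is a minimal illustration that assumes 8-bit sRGB input and the standard D65 white point; the helper names and the sample pixels are hypothetical, and the evaluation in the paper may rely on a different conversion routine. In the standard CIELab convention, negative a values lean green and negative b values lean blue.

```python
def _srgb_to_linear(c):
    """Undo the sRGB gamma for one 8-bit channel value."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Nonlinearity used in the CIE XYZ to CIELab conversion."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0


def rgb_to_lab(r, g, b):
    """Convert one sRGB pixel (0..255 per channel) to CIELab under D65."""
    rl, gl, bl = (_srgb_to_linear(v) for v in (r, g, b))
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def mean_ab(pixels):
    """Average a and b values over a list of (R, G, B) pixels."""
    labs = [rgb_to_lab(*p) for p in pixels]
    n = len(labs)
    return sum(l[1] for l in labs) / n, sum(l[2] for l in labs) / n


# Hypothetical greenish patch: the mean a value comes out negative.
greenish_patch = [(30, 140, 110), (40, 160, 120)]
a_mean, b_mean = mean_ab(greenish_patch)
```

An enhancement result whose mean a and b values are both close to zero has, on average, little residual color cast, which is how the entries of Table IV are read.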

Method Blue Green-Blue Green
Input -25.84 / -6.56 -24.36 / 4.24 -30.97 / 12.10
MSRCR 1.17 / 0.47 2.42 / 1.05 2.58 / 0.31
CLAHE -10.95 / -2.71 -6.73 / 1.67 -1.68 / 1.46
Fusion -10.27 / -2.67 -6.07 / 1.66 -1.21 / 1.42
BP -16.14 / -6.16 -7.68 / 0.94 -8.30 / 5.17
UHP -24.23 / -5.90 -23.15 / 4.85 -29.77 / 12.35
ENOM -11.70 / -1.08 -7.87 / 6.59 -9.84 / 6.42
DPATN -10.15/ -3.13 -4.16/ 2.03 -1.15/ 1.48
0.88 / 6.90 35.25 / 9.05 36.01 / 22.35
-12.21 / -2.37 1.77 / 0.84 0.76 / 1.58
-8.31 / 1.00 1.59 / 4.76 3.76 / 2.43
-15.83 / -2.15 -4.69 / 5.46 1.70 / 2.79
TABLE IV: Average / scores on UCCS, specially, the best results are shown in bold fonts.
Fig. 3: Comparison of underwater dehazing strategies (without post-processing) and “dehazing + white balance” strategies on sub-datasets of UIQS.

IV-D Task-driven Comparison

First, based on the synthetic dataset, we adopt the commonly used YOLO-v3 [48] and use the same fixed model to detect objects in the enhanced results. Combined with the visual comparison in Figure 9, it is clear that MSRCR, CLAHE and Fusion achieve a better white balance effect. This also indicates that color correction is essential for application tasks.

The mAP results and the number of detected objects (NDO) on different subsets are shown in Figure 4. It is worth noting that the two methods adopting the “underwater dehazing + color balance post-processing” strategy perform best in effectiveness and stability on all sub-datasets, significantly improving the number of detected objects while increasing the mAP. By contrast, the specially designed prior-based algorithm ENOM brings little improvement, or even a reduction, in mAP and NDO (e.g., on subsets B and C). In addition, among the three enhancement techniques CLAHE, Fusion and BP, BP yields the largest improvement in mAP and detection number. In most cases, CLAHE and Fusion have a positive effect on NDO to some extent; CLAHE maintains or slightly increases the mAP, while Fusion seems to have a negative effect on mAP.
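One plausible way to obtain an NDO figure is to count the boxes that a fixed detector returns above a confidence threshold. The sketch below assumes this reading. The function name `count_detections`, the `(class_name, confidence)` input format and the threshold of 0.5 are illustrative assumptions and are not taken from the paper, and the detections listed are made-up toy values.

```python
from collections import Counter


def count_detections(detections, score_threshold=0.5):
    """Count detections per class at or above a confidence threshold.

    detections: iterable of (class_name, confidence) pairs, e.g. the
    output of a fixed detector run on one enhanced image set.
    Returns (per_class_counts, total).
    """
    counts = Counter(
        name for name, score in detections if score >= score_threshold
    )
    return counts, sum(counts.values())


# Toy detector output (made up for illustration only).
toy_detections = [
    ("sea urchin", 0.91),
    ("scallop", 0.32),      # below the threshold, not counted
    ("sea urchin", 0.66),
    ("holothurian", 0.58),
]

per_class, total = count_detections(toy_detections)
print(dict(per_class), total)
```

Running the same detector with the same threshold on each method's output keeps the counts comparable across enhancement methods.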

On the other hand, as shown in Figure 6, we compare the mAP results with the average no-reference quality scores on every sub-dataset of UTTS, and find only a weak correlation. For instance, on grade B, CLAHE and Fusion greatly improve the UCIQE and UIQM of the image, but the mAP results are sometimes lower than those of the degraded inputs. On grades A, D and E, UHP achieves a more satisfactory improvement in mAP, but its effect on the image quality scores is the worst of all methods. Therefore, we think it is necessary to evaluate the enhanced results in a task-driven way.
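The strength of the relationship between a quality score and mAP can be quantified with a correlation coefficient. The following sketch uses the Pearson coefficient as one possible choice; the paper does not state which measure, if any, was computed, and the function name `pearson` and the inputs are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Sanity checks on synthetic sequences (not measurements from the paper):
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly correlated
print(pearson([1, 2, 3], [3, 2, 1]))        # perfectly anti-correlated
```

In practice, `xs` would hold one quality score per method on a given sub-dataset and `ys` the corresponding mAP values; a coefficient near zero would support the weak correlation described above.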

Discussion: Does underwater image enhancement help CNN-based image classification?

In the field of dehazing research, [29, 49, 50] apply task-driven evaluation to restored results to compare the performance of different algorithms. [50] is the first to explore whether haze removal helps in the important task of image classification. From the experimental results on synthetic and real hazy image datasets, [50] found that existing image-dehazing methods cannot improve image-classification performance much and sometimes even reduce it.

As shown above, our experimental results reveal a phenomenon similar to that reported in [50]: existing underwater image enhancement algorithms do not bring much benefit to CNN-based image classification accuracy, which may be because image enhancement does not introduce new information to aid image classification. However, it is clear that the existing algorithms can increase the number of detected objects while maintaining or slightly increasing the mAP, which is undoubtedly of great significance in practical applications such as subsea fishing or marine biological detection. To further improve underwater classification or detection performance on real photos or videos, there are at least two notable options, which we see as directions for future efforts:

  • The first viable strategy is to integrate the image enhancement module into the underwater classification network and train the two jointly. Specifically, we need to simulate real underwater scenes to obtain labeled clear images and the corresponding scene depth information. On this basis, we can develop specialized GANs to generate labeled synthetic underwater pictures as training data for underwater image enhancement CNNs, which can then be integrated into the underwater classification network and trained with it to obtain optimal image enhancement parameters for particular tasks. If we view the synthetic underwater images as the source domain (with abundant labels) and the realistic ones as the target domain (with scarce labels), then unsupervised domain adaptation can be performed to reduce the domain gap in low-level features by exploiting unannotated realistic underwater images. For example, [51] pre-trained robust low-level CNN filters using unannotated data from both the source and target domains, leading to much improved robustness when tested on target domain data.

  • The second is to label marine organisms on real underwater images and directly train the classification networks on those annotated pictures. In this way, image enhancement can be left out. However, the performance of this scheme is likely to be limited by the training data; that is, it may achieve satisfactory classification accuracy only on some types of degraded images and fail on others. In practical applications, if the water quality does not change much, this strategy is worth considering, as it can take full advantage of existing high-performance classification networks (such as YOLO [48], Faster R-CNN [52], etc.) without designing an image enhancement module.

Both of the above schemes require a large number of real underwater images for training. For the first, if a GAN is used to synthesize underwater images, thousands of real images are needed as reference targets; the second needs images that contain enough underwater creatures and varied scenes. Therefore, as RUIE consists of high-definition real images with different degrees of degradation and different color shifts, it can support both strategies.

The above discussion can also be applied straightforwardly to other high-level vision tasks in uncontrolled outdoor environments (e.g., bad weather and poor illumination), such as tracking, recognition and semantic segmentation.

V Conclusions and future work

In this paper, we propose the RUIE benchmark and evaluate state-of-the-art methods for underwater single image enhancement. From the results presented, there seems to be no single best enhancement model for all criteria: UHP and are favored by UCIQE and UIQM; DCP, CLAHE and Fusion are more competitive in terms of color correction; shows to have the most appreciated subjective quality; and lead to superior detection performance on real underwater images; and finally is the most efficient among all. We see the highly complicated nature of the underwater image enhancement problem, in both real-world generalization and evaluation criteria. For future research, we advocate evaluating and optimizing enhancement algorithms towards more dedicated criteria (e.g., subjective visual quality, or high-level target task performance), rather than solely UCIQE/UIQM, which are found to be poorly aligned with the other metrics we used. In particular, correlating enhancement with high-level computer vision problems will likely lead to innovative and robust computer vision pipelines that will find many immediate applications. Another gap to fill is the development of no-reference metrics that are better correlated with human perception for evaluating enhanced results. That progress will accelerate the needed shift from the current full-reference evaluation on only synthetic images to more realistic evaluation schemes with no ground truth.

Besides, we analyzed the current underwater image quality evaluation measures based on pixel-wise errors and local structural similarities and showed that there is not much correlation between these enhancement metrics and image-classification accuracy when the images are preprocessed by existing UIE methods. While we believe this is because image enhancement does not introduce new information to help image classification, we do not exclude the possibility that existing UIE methods are not sufficiently good at recovering the original clear image, and that better UIE methods developed in the future may help improve image classification. We hope this study can draw more interest from the community to work on the important problem of underwater degraded image classification, which plays a critical role in applications such as underwater detection and robotics.

Fig. 4: Object detection number and mean average precision (mAP) on UTTS. Methods 1 to 12 are respectively MSRCR, CLAHE, Fusion, BP, UHP, ENOM, DPATN, , , .
Fig. 5: Comparison on UIQS sub-dataset. In sequence, the quality levels of the five inputs are A, B, C, D and E.
Fig. 6: The mean average precision (mAP) and UCIQE/UIQM scores on UTTS. Methods 1 to 12 are respectively MSRCR, CLAHE, Fusion, BP, UHP, ENOM, DPATN, , , .
Fig. 7: Comparison on synthetic underwater images.
People 173 286 240 240 240 212 208 219 153 227 226 229 227
Vehicle 428 664 575 556 556 479 453 462 519 523 540 525 530
Others 49 88 78 86 92 76 72 53 69 73 71 65 78
Sum 650 1038 893 882 888 767 733 668 804 823 837 819 835
TABLE V: Number of detected objects on UIQS.
Category Input MSRCR CLAHE Fusion BP UHP ENOM
Sea urchin 3127 3261 3693 4160 3731 3431 3623 4487 4100 4630 4881
Holothurian 227 644 263 400 527 376 334 398 414 611 548
Scallop 49 61 29 41 64 53 167 96 85 66 46
Sum 3403 3966 3985 4601 4322 3860 4124 4981 4599 5307 5475
TABLE VI: Number of detected objects on the synthetic dataset.
0.0119 0.0126 0.0232 4.5847 0.7985 0.6012 0.8758 0.8565 0.6568 0.5112 0.4256
TABLE VII: Average running time (second per image).
Fig. 8: Comparison on UTTS.

Fig. 9: Visualization of object detection results (on the synthetic dataset) after applying different dehazing algorithms.


  • [1] D. J. Jobson, Z. Rahman, and G. A. Woodell, “A multiscale retinex for bridging the gap between color images and the human observation of scenes,” IEEE Transactions on Image Processing, vol. 6, no. 7, pp. 965–976, 2002.
  • [2] S. M. Pizer, R. E. Johnston, J. P. Ericksen, B. C. Yankaskas, and K. E. Muller, “Contrast-limited adaptive histogram equalization: speed and effectiveness,” in Visualization in Biomedical Computing, 1990.
  • [3] C. Ancuti, C. O. Ancuti, T. Haber, and P. Bekaert, “Enhancing underwater images and videos by fusion,” in Computer Vision and Pattern Recognition, 2012.
  • [4] K. B. Gibson, “Preliminary results in using a joint contrast enhancement and turbulence mitigation method for underwater optical imaging,” in OCEANS’15 MTS/IEEE Washington.   IEEE, 2015, pp. 1–5.
  • [5] N. Carlevaris-Bianco, A. Mohan, and R. M. Eustice, “Initial results in underwater single image dehazing,” in Oceans, 2010.
  • [6] J. Y. Chiang and Y. Chen, “Underwater image enhancement by wavelength compensation and dehazing,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1756–1769, 2012.
  • [7] D. Berman, T. Treibitz, and S. Avidan, “Diving into haze-lines: Color restoration of underwater images,” in British Machine Vision Conference, 2017.
  • [8] H. Wen, Y. Tian, T. Huang, and W. Gao, “Single underwater image enhancement with a new optical model,” in International Symposium on Circuits and Systems, 2013.
  • [9] P. D. Jr, E. D. Nascimento, F. Moraes, S. Botelho, and M. Campos, “Transmission estimation in underwater single images,” in IEEE International Conference on Computer Vision Workshops, 2013, pp. 825–830.
  • [10] A. Galdran, A. Alvarez-Gila, and A. Alvarez-Gila, “Automatic red-channel underwater image restoration,” Journal of Visual Communication and Image Representation, vol. 26, no. C, pp. 132–145, 2015.
  • [11] C. Y. Li, J. C. Guo, R. M. Cong, Y. W. Pang, and B. Wang, “Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior,” IEEE Transactions on Image Processing, vol. 25, no. 12, pp. 5664–5677, 2016.
  • [12] Y. T. Peng and P. C. Cosman, “Underwater image restoration based on image blurriness and light absorption,” IEEE Transactions on Image Processing, vol. 26, no. 4, pp. 1579–1594, 2017.
  • [13] J. Li, K. A. Skinner, R. M. Eustice, and M. Johnson-Roberson, “Watergan: Unsupervised generative network to enable real-time color correction of monocular underwater images,” IEEE Robotics Automation Letters, 2017.
  • [14] X. Chen, J. Yu, S. Kong, Z. Wu, X. Fang, and L. Wen, “Towards quality advancement of underwater machine vision with generative adversarial networks,” arXiv preprint arXiv:1712.00736, 2017.
  • [15] J. Li, K. A. Skinner, R. M. Eustice, and M. Johnson-Roberson, “Watergan: Unsupervised generative network to enable real-time color correction of monocular underwater images,” IEEE Robotics and Automation Letters, vol. 3, no. 1, pp. 387–394, 2018.
  • [16] R. Hummel, “Image enhancement by histogram transformation,” Computer Graphics and Image Processing, vol. 6, no. 2, pp. 184–195, 1977.
  • [17] G. Buchsbaum, “A spatial processor model for object colour perception,” Journal of The Franklin Institute-engineering and Applied Mathematics, vol. 310, no. 1, pp. 1–26, 1980.
  • [18] Y. C. Liu, W. H. Chan, and Y. Q. Chen, “Automatic white balance for digital still camera,” IEEE Transactions on Consumer Electronics, vol. 41, no. 3, pp. 460–466, 2004.
  • [19] J. van de Weijer, T. Gevers, and A. Gijsenij, “Edge-based color constancy,” IEEE Transactions on Image Processing, vol. 16, no. 9, pp. 2207–2214, 2010.
  • [20] D. H. Foster, “Color constancy,” Vision Research, vol. 51, no. 7, pp. 674–700, 2011.
  • [21] G. Singh, N. Jaggi, S. Vasamsetti, H. Sardana, S. Kumar, and N. Mittal, “Underwater image/video enhancement using wavelet based color correction (wbcc) method,” in Underwater Technology (UT), 2015 IEEE.   IEEE, 2015, pp. 1–5.
  • [22] J. S. Jaffe, “Computer modeling and the design of optimal underwater imaging systems,” IEEE Journal of Oceanic Engineering, vol. 15, no. 2, pp. 101–111, 1990.
  • [23] B. L. Mcglamery, “A computer model for underwater camera systems,” Proc Spie, vol. 208, no. 208, pp. 221–231, 1980.
  • [24] K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” in Computer Vision and Pattern Recognition, 2009.
  • [25] N. Wang, H. Zheng, and B. Zheng, “Underwater image restoration via maximum attenuation identification,” IEEE Access, vol. PP, no. 99, pp. 1–1, 2017.
  • [26] Y. Wang, H. Liu, and L. P. Chau, “Single underwater image restoration using adaptive attenuation-curve prior,” IEEE Transactions on Circuits and Systems I Regular Papers, vol. PP, no. 99, pp. 1–11, 2017.
  • [27] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015.
  • [28] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
  • [29] B. Li, X. Peng, Z. Wang, J. Xu, and D. Feng, “Aod-net: All-in-one dehazing network,” in Proceedings of the IEEE International Conference on Computer Vision, vol. 1, no. 4, 2017, p. 7.
  • [30] W. Ren, S. Liu, H. Zhang, J. Pan, X. Cao, and M.-H. Yang, “Single image dehazing via multi-scale convolutional neural networks,” in European Conference on Computer Vision, 2016.
  • [31] B. Cai, X. Xu, K. Jia, C. Qing, and D. Tao, “Dehazenet: An end-to-end system for single image haze removal,” IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5187–5198, 2016.
  • [32] H. Zhang and V. M. Patel, “Densely connected pyramid dehazing network,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  • [33] R. Liu, X. Fan, M. Hou, Z. Jiang, Z. Luo, and L. Zhang, “Learning aggregated transmission propagation networks for haze removal and beyond,” IEEE transactions on neural networks and learning systems, no. 99, pp. 1–14, 2018.
  • [34] M. Yang and A. Sowmya, “An underwater color image quality evaluation metric,” IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 6062–6071, 2015.
  • [35] S. Wang, K. Ma, H. Yeganeh, Z. Wang, and W. Lin, “A patch-structure representation method for quality assessment of contrast changed images,” IEEE Signal Processing Letters, vol. 22, no. 12, pp. 2387–2390, 2015.
  • [36] A. Mittal, A. K. Moorthy, and A. C. Bovik, “No-reference image quality assessment in the spatial domain,” IEEE Transactions on Image Processing, vol. 21, no. 12, p. 4695, 2012.
  • [37] K. Iqbal, M. O. Odetayo, A. E. James, R. A. Salam, and A. Z. Talib, “Enhancing the low quality images using unsupervised colour correction method,” in IEEE International Conference on Systems Man and Cybernetics, 2010.
  • [38] M. S. Hitam, E. A. Awalludin, N. J. H. W. Y. Wan, and Z. Bachok, “Mixture contrast limited adaptive histogram equalization for underwater image enhancement,” in International Conference on Computer Applications Technology, 2013.
  • [39] A. S. A. Ghani and N. A. M. Isa, “Underwater image quality enhancement through integrated color model with rayleigh distribution,” Applied Soft Computing, vol. 27, pp. 219–230, 2015.
  • [40] Y. Li, F. Guo, R. T. Tan, and M. S. Brown, “A contrast enhancement framework with jpeg artifacts suppression,” in European Conference on Computer Vision, 2014, pp. 174–188.
  • [41] C.-Y. Li, J.-C. Guo, R.-M. Cong, Y.-W. Pang, and B. Wang, “Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior,” IEEE Transactions on Image Processing, vol. 25, no. 12, pp. 5664–5677, 2016.
  • [42] H. Lu, Y. Li, L. Zhang, and S. Serikawa, “Contrast enhancement for images in turbid water.” Journal of the Optical Society of America A Optics Image Science and Vision, vol. 32, no. 5, p. 886, 2015.
  • [43] H. Y. Yang, P. Y. Chen, C. C. Huang, Y. Z. Zhuang, and Y. H. Shiau, “Low complexity underwater image enhancement based on dark channel prior,” in International Conference on Innovations in Bio-Inspired Computing and Applications, 2012, pp. 17–20.
  • [44] G. Meng, Y. Wang, J. Duan, S. Xiang, and C. Pan, “Efficient image dehazing with boundary constraint and contextual regularization,” pp. 617–624, 2013.
  • [45] Q. Zhu, J. Mai, and L. Shao, “A fast single image haze removal algorithm using color attenuation prior,” IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 3522–3533, 2015.
  • [46] D. Berman and S. Avidan, “Non-local image dehazing,” in Computer Vision and Pattern Recognition, 2016.
  • [47] K. Panetta, C. Gao, and S. Agaian, “Human-visual-system-inspired underwater image quality measures,” IEEE Journal of Oceanic Engineering, vol. 41, no. 3, pp. 541–551, 2016.
  • [48] J. Redmon and A. Farhadi, “Yolov3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
  • [49] B. Li, W. Ren, D. Fu, D. Tao, D. Feng, W. Zeng, and Z. Wang, “Reside: A benchmark for single image dehazing,” arXiv preprint arXiv:1712.04143, 2017.
  • [50] Y. Pei, Y. Huang, Q. Zou, Y. Lu, and S. Wang, “Does haze removal help cnn-based image classification?” arXiv preprint arXiv:1810.05716, 2018.
  • [51] Z. Wang, J. Yang, H. Jin, E. Shechtman, A. Agarwala, J. Brandt, and T. S. Huang, “Deepfont: Identify your font from an image,” in Proceedings of the 23rd ACM international conference on Multimedia.   ACM, 2015, pp. 451–459.
  • [52] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in International Conference on Neural Information Processing Systems, 2015, pp. 91–99.