Around 71% of Earth’s surface is covered by the oceans, and more than 90% of that is below 200 meters, where almost no natural light penetrates. Due to physical obstacles, most of the deep sea is still unexplored today. Deep sea exploration is nevertheless receiving increasing attention, as the deep sea is the largest living space on Earth, contains interesting resources and is the last uncharted area of our planet. Since humans cannot easily access this hostile environment, Unmanned Underwater Vehicles (UUVs) have been used for deep sea exploration for decades. With the rapid development of underwater robotic techniques, UUVs can now operate and take measurements at water depths of several kilometers, providing platforms that carry various sensors to explore, measure and map the oceans.
Optical sensors, e.g. cameras, can capture the seafloor in high-resolution images, which are well suited to human interpretation. Consequently, and due to the significant improvement of imaging capabilities during the last decades, many UUV platforms are nowadays equipped with camera systems for visual mapping of the seafloor. However, underwater computer vision remains less investigated than computer vision on land because underwater images suffer from several effects, e.g. attenuation and scattering, which significantly decrease visibility and image quality. In addition, since no natural light penetrates the deep ocean, artificial light sources are needed. The resulting non-homogeneous illumination from limited-size platforms can further degrade image quality. Besides radiometric effects, geometric distortion caused by multi-layer refraction (see e.g. [1, 2]) is also non-negligible: As water pressure increases by about 1 atmosphere for every 10 meters of depth, deep sea camera systems are protected against extremely high water pressure by a waterproof housing with a thick glass window. These effects often make standard computer vision solutions struggle or fail in (deep) ocean applications.
The recent trend to employ machine learning methods for various vision tasks further widens the performance gap between underwater vision and approaches on land, since learning methods usually require a large amount of training data to achieve good performance. The lack of appropriate underwater images with ground truth data is therefore a bottleneck for developing learning-based approaches in this field. The lack of training and evaluation data is even more serious in the deep sea scenario due to the difficulty and high cost of data acquisition. Simulation of deep sea images, in particular of illumination, attenuation and scattering effects, could be one way to obtain development or training material for UUV perception.
Existing underwater imaging simulators are either not physically accurate or, in the case of the simulator presented by Sedlazeck et al. [3], unfortunately too far from real-time to be integrated into a robotic simulation platform. This paper therefore proposes a physical model-based deep sea underwater image simulator which uses in-air images and corresponding depth maps as input and computes synthetic underwater images with both radiometric and geometric effects. To account for the special conditions in the deep sea, the simulator models point light sources with a main direction, an angular fall-off and arbitrary poses. We present several optimization strategies to improve the computational performance of the simulator, which enables us to integrate the deep sea camera simulation into a UUV simulator based on the robotics simulator Gazebo [4].
The remainder of this paper is organized as follows: Section II briefly describes the state-of-the-art literature and the main contributions of this paper. Section III presents the deep sea image formation model considering directional light sources. Section IV provides detailed information about the components of the image formation model and discusses the optimization strategies for the deep sea robotic imaging simulator. The given experimental results illustrate the implementation of our deep sea imaging system in the UUV simulator. Section V provides insights on the simulated images compared to other published methods, all of which use post-processing of RGB-D images, before the conclusion in Section VI.
II Related Work and Main Contributions
Light rays are attenuated and scattered while traversing underwater volumes, which can be formulated by corresponding radiometric physical models [5]. [6] and [7] decompose underwater image formation into three components: direct signal, forward scattering and backscatter. This is known as the Jaffe-McGlamery model. [8] describes underwater image formation for shallow water cases. Underwater image formation has also been studied intensively in underwater image restoration, which can be considered the inverse problem of underwater image formation. The most widely applied model was presented by [9] and was initially used to recover depth cues from atmospheric scattering images (e.g. in fog or haze):
$$I = J \cdot e^{-\eta d} + B \cdot \left(1 - e^{-\eta d}\right) \tag{1}$$
In this simplified underwater image formation model, the image $I$ is described as a weighted linear combination of object color $J$ and background color $B$. Here, $d$ is the distance between the camera and the scene point, while $\eta$ represents the attenuation coefficient.
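As a minimal sketch of the simplified model (Eq. 1), the following Python function blends object and background color per channel. The function name, the per-channel attenuation coefficients and the sample colors are illustrative assumptions, not values taken from the cited works.

```python
import math


def simplified_underwater_pixel(object_color, background_color, distance, attenuation):
    """Blend object and background color per channel.

    The weight exp(-attenuation * distance) decays with the camera-to-scene
    distance, so distant points converge to the background color.
    """
    result = []
    for obj, bg, eta in zip(object_color, background_color, attenuation):
        transmission = math.exp(-eta * distance)
        result.append(obj * transmission + bg * (1.0 - transmission))
    return result


# Hypothetical RGB values in [0, 1] and attenuation coefficients in 1/m.
sand = [0.8, 0.6, 0.4]
water = [0.0, 0.2, 0.3]
eta_rgb = [0.4, 0.1, 0.05]

near = simplified_underwater_pixel(sand, water, 0.0, eta_rgb)
far = simplified_underwater_pixel(sand, water, 100.0, eta_rgb)
```

At zero distance the pixel keeps the object color, while at large distance it approaches the background color. Note that no light source appears anywhere in this computation.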
Based on the simplified model, [10] additionally applies a color transmission map and presents a method to generate synthesized underwater images from in-air RGB-D images taken on the ground. [11] proposes WaterGAN, a generative adversarial network (GAN) that has been trained on real underwater images. It also takes in-air RGB-D images as input to generate synthetic underwater images. The target function of the GAN discriminator is also based on the simplified model.
However, the simplified model is only valid in shallow water cases, where the scene has globally homogeneous illumination from sunlight. [12] addresses many weaknesses of this model, which introduces significant errors in both the direct signal and the backscatter component.
Clearly, the model does not apply to deep sea scenarios, where artificial light sources are required to illuminate the scene and the light distribution is extremely uneven. The light originates from the artificial sources on the robot and interacts with the water body in front of the camera, leading to very different visual effects in the images, especially in the backscatter component (see Fig. 1). Hence, the underwater image formation model in the deep sea requires additional knowledge about the light sources, such as their poses and properties. [13] uses the recursive rendering equation adapted to underwater imagery and considers point light sources in the model. [3] proposes a deep sea underwater renderer based on physical models, which extends the Jaffe-McGlamery model to incorporate color images, shadows and multiple light sources. It also implements the refraction effect caused by a deep sea camera housing with a thick flat glass port, based on computationally very demanding backtracing of viewing rays. For image restoration rather than simulation, [14] considers a directional light source in the image formation model and applies it to restore the true color of underwater scenes.
A key intended use case for underwater image simulation is its integration into a UUV simulation platform, which enables developing, testing and coordinating performance in underwater robotics before risking expensive hardware in real applications. For instance, [15] developed a software tool called UWSim for the visualization and simulation of underwater robotic missions. This simulator includes a camera system that renders the images as seen by underwater vehicles, but without adding any water effects. Such simulators require interactive performance rather than offline rendering as done e.g. by the ray tracers used for CG movies, which can spend hours generating a single frame. Although ray-tracing approaches are becoming faster and can be hardware-accelerated by recent GPUs, in this contribution we build our solution upon the slightly less realistic, classical rendering by rasterization, which generates "on-land" or "in-air" images of a scene jointly with a depth map (e.g. from the GPU’s z-buffer), and we suggest a post-processing module that adds deep sea effects to such data. [16] already extended the open-source robotics simulator Gazebo to underwater scenarios with the UUV Simulator. This simulator uses so-called RGB-D sensor plugins to generate depth and color images and then converts them to an underwater scene using the simplified shallow water model (Eq. 1). [17] applies trained convolutional neural networks to style-transfer the image output of [16] and additionally adds forward scattering and haze effects. However, these improvements still rely on the simplified model, and the haze addition lacks a physical interpretation.
The main contributions of this paper are: (1) A deep sea underwater image solution based on the Jaffe-McGlamery model that considers multiple point light sources (including their angular characteristics) with corresponding poses and properties. (2) Analysis and evaluation of the components in the deep sea image formation model and several optimization algorithms to improve the simulator’s performance. (3) Integration of the deep sea imaging simulator into the Gazebo-based UUV simulator, which can be used for underwater robotic development and rapid prototyping.
III Deep Sea Image Formation Model
In the deep sea scenario, there is no sunlight to illuminate the scene. Only artificial light sources, which are attached to the underwater vehicles, provide the illumination. This moving light source configuration makes the appearance of deep sea images strongly dependent on the geometric relationships between the camera, the light source and the object (see Fig. 2).
III-A Radiation of Light Source
This paper considers "directed" point light sources, which are commonly used on underwater vehicle platforms. This type of light source usually has the highest light emanation along its central axis and an intensity drop-off with increasing angle to the central axis. This angular characteristic can be formulated as a radiation intensity distribution (RID) curve. In previous work, the RID has been modeled using a Gaussian function. By default we adopt this Gaussian model to describe the light source radiation distribution, as it fits our experimental measurements reasonably well (see Fig. 3), but it would also be possible to directly use the interpolated measurements via a lookup table. In the Gaussian model the radiance along each light ray can be calculated as:
$$I_\theta(\lambda) = I_0(\lambda) \cdot e^{-\frac{\theta^2}{2\sigma^2}} \tag{2}$$
where $I_\theta(\lambda)$ and $I_0(\lambda)$ are the relative light irradiance at angle $\theta$ and the maximum light irradiance along the central axis, respectively, and $\sigma$ controls the width of the Gaussian. The dependency on the wavelength $\lambda$ can be obtained from the color spectrum curve of the LED, which is often provided by the manufacturer or can be measured with a spectrophotometer.
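A short sketch of the two RID options mentioned above: the Gaussian model and a lookup table with linear interpolation. The function names, the width parameter `sigma` and the sample measurements are hypothetical. Real values would come from the LED data sheet or from measurements such as those in Fig. 3.

```python
import math
from bisect import bisect_right


def gaussian_rid(theta, sigma, peak=1.0):
    """Relative irradiance at angle theta (radians) off the central axis."""
    return peak * math.exp(-(theta ** 2) / (2.0 * sigma ** 2))


def lookup_rid(theta, angles, values):
    """Linearly interpolate measured RID samples.

    angles must be ascending (radians, starting at the central axis) and
    values holds the relative irradiance measured at each angle.
    """
    theta = abs(theta)  # the pattern is assumed rotationally symmetric
    if theta <= angles[0]:
        return values[0]
    if theta >= angles[-1]:
        return values[-1]
    k = bisect_right(angles, theta)  # angles[k - 1] <= theta < angles[k]
    w = (theta - angles[k - 1]) / (angles[k] - angles[k - 1])
    return (1.0 - w) * values[k - 1] + w * values[k]


# Hypothetical measurements: angle in radians, relative irradiance.
measured_angles = [0.0, 0.5, 1.0]
measured_values = [1.0, 0.6, 0.1]

on_axis = gaussian_rid(0.0, sigma=0.6)
off_axis = gaussian_rid(0.6, sigma=0.6)
interpolated = lookup_rid(0.25, measured_angles, measured_values)
```

The lookup variant trades the compact one-parameter description for a closer fit to asymmetric or non-Gaussian measurements.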
III-B Attenuation and Reflection
Light is attenuated when it travels through water, where the loss of irradiance depends on the distance traveled and on the water properties. Different wavelengths of light are absorbed with different strengths, which causes the radiometric changes in underwater images. In addition, different types of water have different attenuation coefficients, resulting in varying color shifts in images (e.g. coastal water images often appear more greenish while deep water images appear more blueish, see Fig. 4). [18] measured and classified Earth’s waters into five typical oceanic spectra and nine typical coastal spectra. [19] shows how the corresponding attenuation curves vary between the different types, which can serve as a first approximation for typical coefficients (and their expected variations). Because the light comes from a point source, the inverse square law must be applied in order to simulate the quadratic decay of the light irradiance with the distance from the source. When we combine the attenuation effect with the object reflection model, which assumes that light is reflected equally in all directions on the object surface (Lambertian surface), the entire attenuation and reflection model can be formulated as:
$$E(\lambda) = J(\lambda) \cdot I_\theta(\lambda) \cdot \frac{e^{-\eta(\lambda)\,(d_1 + d_2)} \cos\alpha}{d_1^2} \tag{3}$$
Here, $E$ is the irradiance which arrives at the pixel of the image, $J$ is the object color and $I_\theta$ is the irradiance emitted by the light source toward the object (Eq. 2). The attenuation parameter $\eta$ indicates the strength of irradiance attenuation through the specific type of water at wavelength $\lambda$. $d_1$ and $d_2$ refer to the distance from light to object and from object to camera, respectively. $\alpha$ indicates the incident angle between the light ray from the light source and the surface normal. In the case of multiple light sources, the contributions of all light sources are summed for each camera viewing ray. Note that the denominator only contains $d_1$ because with increasing $d_2$ each pixel will simply integrate the light from a larger surface area.
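To make the attenuation and reflection model concrete, the sketch below evaluates the direct signal for one pixel and for one or several light sources. It assumes a Lambertian surface, exponential attenuation over both path segments and inverse-square decay over the light-to-object distance only, as described above. All names and numbers are illustrative assumptions.

```python
import math


def direct_signal(object_color, light_irradiance, attenuation,
                  d_light, d_camera, cos_incidence):
    """Per-channel irradiance reaching the pixel from a single light source."""
    cos_incidence = max(0.0, cos_incidence)  # surfaces facing away receive no light
    signal = []
    for color, irradiance, eta in zip(object_color, light_irradiance, attenuation):
        transmission = math.exp(-eta * (d_light + d_camera))
        signal.append(color * irradiance * transmission * cos_incidence / d_light ** 2)
    return signal


def direct_signal_all_lights(object_color, lights, attenuation, d_camera):
    """Sum the contributions of several lights for one camera viewing ray.

    lights is a list of (light_irradiance, d_light, cos_incidence) tuples.
    """
    total = [0.0] * len(object_color)
    for irradiance, d_light, cos_incidence in lights:
        part = direct_signal(object_color, irradiance, attenuation,
                             d_light, d_camera, cos_incidence)
        total = [t + p for t, p in zip(total, part)]
    return total


# Hypothetical gray surface, two identical lamps 2 m from the surface point.
gray = [0.5, 0.5, 0.5]
lamp = ([8.0, 8.0, 8.0], 2.0, 1.0)
clear = direct_signal_all_lights(gray, [lamp, lamp], [0.0, 0.0, 0.0], d_camera=3.0)
murky = direct_signal_all_lights(gray, [lamp, lamp], [0.4, 0.1, 0.05], d_camera=3.0)
```

With non-zero attenuation the red channel drops fastest in this example, which reproduces the color shift discussed in the text.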
Rendering of scattering in this paper is based on the Jaffe-McGlamery model and is the most complex part of the physical models in this paper due to its accumulative character. In the Jaffe-McGlamery model, scattering is partitioned into two parts: forward scattering and backscatter. Forward scattering usually describes light which is scattered by a very small angle, which results in blurring of the scene in the images. This paper approximates the forward scattering effect with a Gaussian filter, where the size of the filter mask depends on the average scene depth. We neglect the forward scattering from the light to the scene because the RID curve of the light is usually very smooth (e.g. modeled as a Gaussian function), so a small extra smoothing can be neglected without introducing a significant error. Backscatter refers to light rays which interact with ocean water and are scattered backwards to the camera, which leads to a "veiling light" effect in the medium. This effect occurs along the whole light path. Following the Jaffe-McGlamery model, the 3D field in front of the camera can be discretized by slicing it into several slabs with certain thicknesses. The irradiance on each slab is then accumulated in order to form the backscatter component:
Eq. 4 gives the computation of the backscatter component from each light source. Here indicates the slab index and denotes the direct irradiance reaching slab . and represent the distances from voxel on slab to light source and camera respectively. denotes the forward scattering component of the slab which convolves by the Gaussian filter and indicates the convolution operator. refers to the Volume Scattering Function (VSF), where is the angle between the light ray that hits the voxel and the light ray scattered from the voxel to the camera (see Fig. 2). The VSF model in this paper applies the measurements from . is the thickness of slab and is the angle between the camera viewing ray and the central axis.
A light ray can change its direction when it passes from one medium to another. This geometric relationship is described by Snell’s law. In deep sea camera systems, due to the extremely high water pressure, the thickness of the housing cannot be neglected. Therefore, the viewing ray from the camera is refracted twice, at the inner and outer interfaces of the glass housing, before it reaches the object.
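Let $\mathbf{i}$ and $\mathbf{t}$ denote the unit direction vectors of the incident ray and the refracted ray, let $n_1$ and $n_2$ be the refractive indices of the two media, and let the normal vector $\mathbf{n}$ point toward the side where the ray is coming from. Then the refracted ray is given by [21] as:
$$\mathbf{t} = \frac{n_1}{n_2}\,\mathbf{i} + \left(\frac{n_1}{n_2}\cos\theta_i - \sqrt{1 - \sin^2\theta_t}\right)\mathbf{n} \tag{5}$$
where $\cos\theta_i = -\mathbf{n}\cdot\mathbf{i}$ and $\sin^2\theta_t = \left(\frac{n_1}{n_2}\right)^2\left(1 - \cos^2\theta_i\right)$, and $\theta_i$ refers to the incident angle.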
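The vector form of Snell’s law can be sketched in a few lines of Python. The code below uses the standard ray-tracing formulation with the normal pointing toward the side the ray comes from, and chains two refractions for a flat port with parallel interfaces. The function names and the refractive indices (1.0 for air, 1.5 for glass, 1.33 for water) are illustrative assumptions.

```python
import math


def refract(incident, normal, n1, n2):
    """Refract a unit direction at an interface between media n1 and n2.

    normal is a unit vector pointing toward the side the ray comes from.
    Returns None in case of total internal reflection.
    """
    ratio = n1 / n2
    cos_i = -sum(n * i for n, i in zip(normal, incident))
    sin2_t = ratio ** 2 * (1.0 - cos_i ** 2)
    if sin2_t > 1.0:
        return None
    factor = ratio * cos_i - math.sqrt(1.0 - sin2_t)
    return tuple(ratio * i + factor * n for i, n in zip(incident, normal))


def through_flat_port(view_dir, normal, n_air=1.0, n_glass=1.5, n_water=1.33):
    """Refract a camera viewing ray at the inner and outer glass interface."""
    in_glass = refract(view_dir, normal, n_air, n_glass)
    if in_glass is None:
        return None
    return refract(in_glass, normal, n_glass, n_water)


# Camera looks along +z, so the interface normal facing the camera is -z.
port_normal = (0.0, 0.0, -1.0)
angle = math.radians(30.0)
view = (math.sin(angle), 0.0, math.cos(angle))
in_water = through_flat_port(view, port_normal)
```

For parallel interfaces the final direction only depends on the first and last medium. The glass thickness shifts the ray laterally, which this direction-only sketch does not model.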
Other works also consider the optics and electronics of the camera (e.g. vignetting, lens transmittance and sensor response) in their models. These effects are needed to simulate the image of a particular camera and could also be added to our simulator. This is, however, out of scope for this contribution, where we only apply white balancing and otherwise focus on efficient rendering of the backscatter to produce a realistic image.
This section shows the implementation of our deep sea robotic imaging simulator. The complete workflow is illustrated in Fig. 5.
Establish the 3D backscatter lookup table, in which each unit cell accumulates the backscatter elements along the viewing ray from the camera, calculated by Eq. 4.
Generate the direct signal component, considering attenuation and object surface reflection according to Eq. 3.
Compute the forward scattering component by smoothing the direct signal with a Gaussian filter.
Interpolate the backscatter component from the backscatter lookup table with respect to the depth value from the depth map.
Form the underwater color image by combining the direct signal, forward scattering and backscatter components.
Optionally, add the refraction effect to the image: generate a kd-tree for the refracted projections and search for the nearest neighbor to compute the 2D coordinate for interpolating the pixel from the original image.
Several optimization procedures are employed in order to improve the performance of the deep sea imaging simulator, as described in the following.
IV-A Backscatter Rendering Acceleration
In the deep sea image simulation, one of the most computationally costly parts is the simulation of the backscatter component. Backscatter occurs throughout the water body between the camera and the 3D scene and is an accumulative phenomenon in the image. However, when the relative geometry between camera and light source is fixed, and given the same water, backscatter remains constant in the 3D volume in front of the camera. For example, if there are no objects but only water in front of the camera, the image will be relatively constant and will only contain the backscatter component. Once an object appears in the scene, the backscatter volume is cut off at the depth of the object, and the remaining part is accumulated to form the backscatter component of the image.
Therefore, we construct a pyramid frustum for the camera’s field of view and slice it into several volumetric slabs with certain thicknesses, parallel to the image plane (see Fig. 6). Each slab is rasterized into unit cells according to the image size. We pre-compute the accumulated backscatter elements for each unit cell and store them in a 3D lookup table. Since the backscatter component of each pixel is an integration over all the illuminated slabs, each multiplied by the corresponding slab thickness along the viewing ray, the calculation of the backscatter for a pixel with a given depth reduces to interpolating the value between the two closest unit cells along the viewing ray.
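The lookup idea can be illustrated for a single viewing ray, i.e. one pixel column of the 3D table. The sketch below assumes that a backscatter value per unit thickness has already been computed for each slab (for example with Eq. 4). The function names and sample numbers are hypothetical.

```python
from bisect import bisect_left


def build_backscatter_table(slab_thickness, slab_backscatter):
    """Accumulate backscatter along one viewing ray.

    slab_backscatter[i] is the precomputed backscatter per unit thickness of
    slab i. Returns the slab boundary depths and the backscatter accumulated
    from the camera up to each boundary.
    """
    depths, cumulative = [0.0], [0.0]
    for thickness, value in zip(slab_thickness, slab_backscatter):
        depths.append(depths[-1] + thickness)
        cumulative.append(cumulative[-1] + value * thickness)
    return depths, cumulative


def lookup_backscatter(depth, depths, cumulative):
    """Interpolate the accumulated backscatter at the pixel's scene depth."""
    if depth <= depths[0]:
        return cumulative[0]
    if depth >= depths[-1]:
        return cumulative[-1]
    k = bisect_left(depths, depth)  # depths[k - 1] < depth <= depths[k]
    w = (depth - depths[k - 1]) / (depths[k] - depths[k - 1])
    return (1.0 - w) * cumulative[k - 1] + w * cumulative[k]


# Hypothetical slabs: thinner near the camera, where backscatter is strongest.
depths, cumulative = build_backscatter_table([0.5, 1.0, 2.0], [4.0, 1.0, 0.25])
near_object = lookup_backscatter(1.0, depths, cumulative)
open_water = lookup_backscatter(10.0, depths, cumulative)
```

The table is built once per camera-light configuration and water type. Per frame, only the interpolation is evaluated for every pixel.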
During the rendering of slabs, we noticed that under our UUV’s camera-light configuration, the light irradiance on the slabs within the first few meters dominates the appearance of the backscatter component, while scattering becomes smoother and eventually disappears in the far field. This depends strongly on the relative pose of the light source(s) and differs for each individual camera system, but it is a fundamental difference to the shallow water illumination model, where a lot of light is available even far away from the camera. Sample "scatter irradiance" patterns on slabs can be seen in Fig. 9. In order to generate a more accurate backscatter component for low numbers of slabs, we propose an adaptive slab thickness sampling function based on the Taylor series expansion of the exponential function:
where indicates the slab thickness of slab index , refers to the maximum depth of the scene field which is divided into number of slabs . The empirical value 2.2 ensures the slab thickness is monotonically increasing and . This equation leads to denser slab samplings closer to the camera. As it is shown in Fig. 7, under the light setup described in the caption, the brightest spot should be at the bottom right corner of the image. The sampling of slab thickness by Eq. 6 gives a more accurate backscatter rendering result than by an equal distance sampling approach.
Here is also an important factor which affects the backscatter rendering quality and performance. We created Fig. 8 to demonstrate the backscattered irradiance of the voxels along the optical center axis in deep ocean water. This figure can be a good reference for finding to simulate the underwater images under different conditions or settings.
IV-B Refraction Rendering
A previously published solution renders the refraction effect from the color image and its depth map. It first converts the depth map into a 3D triangle mesh, then calculates the refracted viewing ray and computes the intersection of the ray with the triangle mesh using the Möller-Trumbore algorithm. However, computing the intersection of a ray with a 3D triangle mesh is rather time-consuming.
Instead, we first back-project the original image points to 3D space given the depth map, then project the 3D points into the image using the refractive projection model. After that, a kd-tree is constructed on the refracted projections. For each pixel in the refracted image, a nearest neighbor search determines which point in the original image is projected to this pixel, so that the pixel intensity can be interpolated.
We consider the refraction rendering step optional, as it is not required for cameras behind domes that do not show refraction effects.
IV-C Rendering Results
As shown in Fig. 10, (a) and (b) are the inputs from the RGB-D sensor plugin. The direct signal (c) and backscatter (d) components are computed separately, then the simulated underwater color image (e) is composed from the direct signal, the smoothed direct signal (forward scattering) and the backscatter. Finally, the refraction effect is added to the underwater color image to generate the final result (f).
IV-D Integration in Gazebo UUV Platform
Gazebo is an open-source robotics simulator. It utilizes one of four different physics engines to simulate the mechanisms and dynamics of robots. Additionally, it provides a platform for hosting sensor plugins. [16] proposes the UUV Simulator, which is based on Gazebo and extends it to underwater scenarios. The UUV Simulator additionally takes into account the hydrodynamic and hydrostatic forces and moments for simulating vehicle dynamics in underwater environments. Several sensor plugins which are commonly deployed on UUVs are also available, including an inertial measurement unit (IMU), magnetometer, sonar, multi-beam echo sounders and camera modules. We integrate our deep sea camera simulator into the UUV Simulator camera plugin, which provides in-air and depth images as input. It reaches an update frequency of 2 Hz using OpenMP without any GPU acceleration. The workspace interface is shown in Fig. 11.
We evaluate our deep sea image simulator by comparing it with three state-of-the-art methods which also use in-air and depth images as input to synthesize underwater images: UUV Simulator [16], WaterGAN [11] and UW_IMG_SIM [17]. Due to the image size limitation of WaterGAN, all evaluated images are simulated at a size of 640×480, although our method does not have this limitation. Simulation comparisons are given in Fig. 12. We create an in-air virtual scene with sand texture and simulate the corresponding underwater images using the different methods. We simulate a UUV with two artificial light sources which are 1 m away from the camera on the left and right sides (similar in spirit to the setting in Fig. 1). Both lights are tilted 45 degrees towards the camera.
The UUV Simulator can only render the attenuation effect with the simplified model, without any relation to the light source, so the light cones are completely missing in its image. Its attenuation effect only considers the path from the scene points to the camera, which makes the rendered color unrealistic. The same problem also occurs in the WaterGAN results. Due to the lack of deep sea images with depth maps and ground truth in-air images, we could only train the GAN using the training parameters given in the official repository (https://github.com/kskin/WaterGAN) on the Port Royal, Jamaica underwater dataset (https://github.com/kskin/data). Therefore, the color and the backscatter pattern of the light source are highly correlated with the training data, which does not match the setup in this evaluation case. UW_IMG_SIM does present the backscatter pattern of the light source. However, this effect merely adds bright spots to the image without any physical interpretation, and its direct signal component also has no dependence on the light source, which is not realistic either. Overall, our proposed approach provides more realistic rendering results than the other methods.
This paper presents a deep sea image simulation framework which relies on (previously rendered) RGB-D in-air images as input to synthesize underwater images. The simulator considers the effects caused by directional artificial light sources, which provides realistic rendering results in deep sea scenarios. Current underwater imaging simulation solutions are either not physically accurate or too far from real-time to be integrated into a robotic simulation platform. By analyzing the deep sea image formation components in detail, based on the Jaffe-McGlamery model, we propose several optimization strategies which enable us to achieve interactive performance and allow our deep sea imaging simulator to be integrated into the Gazebo-based UUV simulator for UUV prototyping or task planning. The rendering quality is evaluated by comparison with three recent "water-effect" methods, which shows that our solution produces more realistic deep sea images. The simulator can also be applied to generate datasets with ground truth for training learning-based approaches and evaluating underwater computer vision algorithms.
- [1] A. Jordt, K. Köser, and R. Koch, “Refractive 3d reconstruction on underwater images,” Methods in Oceanography, vol. 15-16, pp. 90–113, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2211122015300086
- [2] M. She, Y. Song, J. Mohrmann, and K. Köser, “Adjustment and calibration of dome port camera systems for underwater vision,” in German Conference on Pattern Recognition. Springer, 2019, pp. 79–92.
- [3] A. Sedlazeck and R. Koch, “Simulating deep sea underwater images using physical models for light attenuation, scattering, and refraction,” in VMV, 2011.
- [4] N. Koenig and A. Howard, “Design and use paradigms for gazebo, an open-source multi-robot simulator,” in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)(IEEE Cat. No. 04CH37566), vol. 3. IEEE, 2004, pp. 2149–2154.
- [5] C. D. Mobley, Light and water: radiative transfer in natural waters. Academic press, 1994.
- [6] J. S. Jaffe, “Computer modeling and the design of optimal underwater imaging systems,” IEEE Journal of Oceanic Engineering, vol. 15, no. 2, pp. 101–111, 1990.
- [7] B. McGlamery, “A computer model for underwater camera systems,” in Ocean Optics VI, vol. 208. International Society for Optics and Photonics, 1980, pp. 221–231.
- [8] Y. Y. Schechner and N. Karpel, “Clear underwater vision,” in Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., vol. 1. IEEE, 2004, pp. I–I.
- [9] F. Cozman and E. Krotkov, “Depth from scattering,” in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 1997, pp. 801–806.
- [10] T. Ueda, K. Yamada, and Y. Tanaka, “Underwater image synthesis from rgb-d images and its application to deep underwater image restoration,” in 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019, pp. 2115–2119.
- [11] J. Li, K. A. Skinner, R. M. Eustice, and M. Johnson-Roberson, “Watergan: Unsupervised generative network to enable real-time color correction of monocular underwater images,” IEEE Robotics and Automation letters, vol. 3, no. 1, pp. 387–394, 2017.
- [12] D. Akkaynak and T. Treibitz, “A revised underwater image formation model,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6723–6732.
- [13] T. Stephan and J. Beyerer, “Computergraphical model for underwater image simulation and restoration,” in 2014 ICPR Workshop on Computer Vision for Analysis of Underwater Imagery. IEEE, 2014, pp. 73–79.
- [14] M. Bryson, M. Johnson-Roberson, O. Pizarro, and S. B. Williams, “True color correction of autonomous underwater vehicle imagery,” Journal of Field Robotics, vol. 33, no. 6, pp. 853–874, 2016.
- [15] M. Prats, J. Perez, J. J. Fernández, and P. J. Sanz, “An open source tool for simulation and supervision of underwater intervention missions,” in 2012 IEEE/RSJ international conference on Intelligent Robots and Systems. IEEE, 2012, pp. 2577–2582.
- [16] M. M. M. Manhães, S. A. Scherer, M. Voss, L. R. Douat, and T. Rauschenbach, “UUV simulator: A gazebo-based package for underwater intervention and multi-robot simulation,” in OCEANS 2016 MTS/IEEE Monterey. IEEE, sep 2016. [Online]. Available: https://doi.org/10.1109%2Foceans.2016.7761080
- [17] O. Álvarez-Tuñón, A. Jardón, and C. Balaguer, “Generation and processing of simulated underwater images for infrastructure visual inspection with uuvs,” Sensors, vol. 19, no. 24, p. 5497, 2019.
- [18] N. Jerlov, “Irradiance optical classification,” Optical Oceanography, pp. 118–120, 1968.
- [19] D. Akkaynak, T. Treibitz, T. Shlesinger, Y. Loya, R. Tamir, and D. Iluz, “What is the space of attenuation coefficients in underwater computer vision?” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017, pp. 568–577.
- [20] T. J. Petzold, “Volume scattering functions for selected ocean waters,” Scripps Institution of Oceanography La Jolla Ca Visibility Lab, Tech. Rep., 1972.
- [21] A. S. Glassner, An introduction to ray tracing. Elsevier, 1989.
- [22] Y. Song, K. Köser, T. Kwasnitschka, and R. Koch, “Iterative refinement for underwater 3d reconstruction: Application to disposed underwater munitions in the baltic sea,” ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XLII-2/W10, pp. 181–187, 2019. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLII-2-W10/181/2019/