The problem of Cooperative Localization (CL) has received a fair amount of attention in the robotics community over the years [2, 3, 4, 5, 6, 7]. It is described as the ability of a team of robots to utilize inter-robot measurements in order to estimate the relative pose between vehicles and consequently constrain the accumulation of pose uncertainty during operation. This is particularly important in applications where there is no access to a global positioning system and the environment does not offer enough information to enable localization. More formally, CL is concerned with the pose estimates of a team of two or more mobile robots that use sensory data to achieve better localization accuracy than individual localization without cooperation. At the core of CL is the use of a sensor that provides information about the coordinate transformation matrix between two robots. CL has been used extensively for ground, aerial, surface, and even underwater robots. In this paper we focus on the underwater domain utilizing vision.
The main motivation of this work derives from work on underwater cave mapping automation. Cave mapping is traditionally performed by human divers who survey relative distances and orientations along segments of cave line that traverse the explored parts of a cave. The cave line represents a 1D “roadmap” inside the cave. However, the line is not located on the Voronoi diagram of the cave (also called the skeleton or the medial axis), but wherever it was convenient for the cave explorers to attach the line to a fixed point. Central to this process is the estimation of the length and orientation of each segment between attachment points. The developed system can also assist in the underwater archeology domain for surveying submerged sites. The proposed approach consists of two devices used by two divers located at two points to record the distance and orientation between them, similar to the way two robots infer their relative pose. Two underwater cooperative localization sensors have been constructed that can robustly produce relative pose measurements between them. These two devices can be mounted on underwater vehicles or deployed by divers. In this paper we describe the development of the two CL nodes and the relative pose estimation algorithm as it pertains to the underwater domain. The proposed method is an extension of the 3D bearing only cooperative localization solution proposed by Dugas et al.
More specifically, the proposed approach employs two cameras, each equipped with two landmarks, taking images of each other in a synchronized manner. The image from camera A contains the landmarks associated with camera B, and the image from camera B contains the landmarks associated with camera A. The two detected landmarks are registered as bearing measurements from each camera to the other system, and an analytical geometry-based solution provides the full 6DoF relative pose between the two cameras. The landmarks used in this work are dive LED lights that provide adequate illumination to be detected in a variety of conditions. In Figure 1, camera A is placed on the ground and camera B is moved away; the experiment was conducted inside a cavern with no ambient illumination and the photo was taken by an outside observer. As can be seen in Figure 1, there are many challenges related to image processing underwater. In particular, in this image the two landmark lights associated with camera B generate a light beam and there are reflections on the floor and on the diver. Additional data are used to assist in the outlier rejection process.
In order to ensure the feasibility of the proposed approach to different applications, such as marine surveying and underwater cave mapping, the CL system was rigorously tested in a variety of environments ensuring great diversity of the lighting conditions. Experiments include a cavern zone, high turbidity fresh water, clear waters both during day and night time; for a detailed description please refer to Section IV-A.
The rest of this paper continues with a discussion of related work. Section III provides a detailed overview of the proposed approach. Experimental results from different deployments underwater and in laboratory controlled conditions are presented in Section IV. The paper concludes with a discussion of lessons learned and directions of future work.
II Related Work
The concept of CL was first introduced by Kurazume and Hirose, and the term was first used by Rekleitis et al. At a recent count, more than 100 papers have been published since then covering many aspects. The CL approach has been used both in 2D and 3D, for localization and also for SLAM, using images [15, 16], LIDAR [17, 18], and sonar. The problem of CL with “anonymous robots” was presented by Franchi et al. More recently, Martinelli and Renzaglia provided the fundamental equations for fusing CL estimates with inertial data.
Uncertainty propagation during CL was first given an analytical description by Roumeliotis and Rekleitis. The initial formulation was based on a previously described algorithm, differing primarily in that robots had access to absolute orientation measurements instead of measuring their relative orientations; further studies of performance were presented by Mourikis and Roumeliotis. Dieudonné et al. proved that for arbitrary relative measurements (range, bearing, and/or orientation) among robots, deciding if CL is possible is NP-hard. From a control perspective, the observability and consistency of the problem were studied. More recently, Nerurkar et al. and Leung et al. proposed schemes for distributing the computations among a team of robots to improve the computational efficiency of the algorithm. There is also a standard dataset available online with different combinations of sensor measurements together with ground truth data.
Of particular interest is the analysis of different sensing modalities used in CL. In particular, the effect of range versus bearing measurements has been extensively analyzed [30, 31]. In this work we focus on bearing only measurements, as cameras are great protractors: they provide better angle measurements than distance measurements. Initially the problem was analytically solved in 2D, and then the analytical solution was extended to 3D. At the same time, Dhiman et al. produced a numerical solution, a more computationally expensive and less accurate formulation. The analytical solution was further used to assist the formation flying of quadrotors.
This work utilizes the analytical solution for 3D cooperative localization in the underwater domain, extending the beacon detection method to better account for distortions underwater and incorporating additional sensor data for outlier filtering. Cooperative localization has been verified extensively above water. This paper provides experimental verification of the bearing only cooperative localization scheme in different underwater conditions.
III System Overview
The proposed system consists of two sensors, termed nodes; Node A and Node B. Each node consists of a camera and two light landmarks; see Figure 2 where one node in use is annotated. The two nodes have synchronized clocks and take images facing each other – see Figure 1 – at the same time. Concurrently, each node collects inertial, magnetic, and depth data from an IMU, magnetometer, and depth sensor; these data are used to further constrain the pose and attitude of each node. Figure 3 shows the pipeline of the proposed approach, starting from the input images, to the estimated relative pose. In the following, we describe each component in detail.
III-A Bearing only Cooperative Localization
For a complete treatment of bearing only CL in 2D and 3D, please refer to the work of Giguere et al. and Dugas et al., respectively. For completeness, an outline of the approach is presented next. The 2D case is based on the idea that the bearing measurement of the two landmarks from camera A constrains its position on a circle of fixed radius, while the bearing measurement from camera B constrains the position on a line; see Figure 4 for an illustration. For the 3D case, the main observation is that the collinearity of each camera and its landmarks produces a line; that line and a point, defined by the other camera, define a plane. Therefore, two cameras and two landmarks define a plane, and the 3D pose estimation can be performed utilizing the 2D constraints. More specifically, the relative pose between the two cameras can be analytically calculated by using two images, one taken by each camera, recorded at the same time; see Figure 4 for the relationship between the coordinate systems of the two nodes. The devices are synchronized at the beginning of the experiment (before submerging) using the Network Time Protocol (NTP) over an ad hoc Wi-Fi network, which makes it possible to extract images taken at the same time. From these two images, the following data are obtained. First we extract two angles:
from the image taken by camera A: the angle between the two markers of camera B, as seen from camera A;
from the image taken by camera B: the angle between the line passing through the origins of the two cameras and the optical axis of camera B, where the locations of the two markers of camera A are used to approximate the position of camera A.
With these two angles and the known distance between the markers on a robot, a closed-form solution, Eq. (1), yields the distance between the cameras.
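The expression referred to as Eq. (1) does not appear in the text above. As a sketch only, the 2D circle-and-line construction described earlier leads to the closed form below. The notation is assumed here and is not taken from the original: $\alpha$ is the angle subtended by the two markers, $\beta$ is the bearing to the other camera measured from the optical axis, $d$ is the marker spacing, and $l$ is the distance between the cameras. The sketch further assumes that the observed camera sits at the mid-point of its two markers and that its optical axis is perpendicular to the marker baseline.

```latex
% Assumed notation (not from the original): \alpha, \beta, d, l.
% Inscribed-angle theorem: the observing camera lies on a circle through both
% markers, with radius r and center at height h above the marker mid-point.
r = \frac{d}{2\sin\alpha}, \qquad h = \frac{d}{2\tan\alpha}

% Intersecting that circle with the bearing line at angle \beta from the optical axis:
l^{2} - 2\,l\,h\cos\beta - \frac{d^{2}}{4} = 0

% Taking the positive root:
l = \frac{d}{2\sin\alpha}\left(\cos\alpha\cos\beta
    + \sqrt{1 - \cos^{2}\alpha\,\sin^{2}\beta}\right)
```

As a sanity check, for $\beta = 0$ this reduces to $l = d / (2\tan(\alpha/2))$, which is the expected distance when the observing camera lies on the perpendicular bisector of the two markers.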
An important fact pertaining to such an approach is that the majority of the uncertainty in the system lies in this distance. This noisy distance estimate can be improved by performing the computation of Eq. (1) a second time with the roles of the two images exchanged, and averaging the two computed distances. The relative position between the cameras is then derived by extending the vector going from one camera to the location of the other camera in the image frame to a length of exactly this distance. Sufficient information is contained in the two images to recover uniquely the relative orientation between the two vehicles. It corresponds to a rotation matrix that:
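aligns the perceived plane containing camera A and the two markers of camera B with the perceived plane containing camera B, its right marker, and the other camera; and
aligns the perceived vector from camera A to camera B in one image and the perceived vector from camera B to camera A in the other image in opposite directions.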
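The distance averaging and the scaling of the bearing vector described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the closed form in `camera_distance` is the standard 2D circle-and-line result with the observed camera assumed to sit at the mid-point of its markers, and all function and parameter names are hypothetical.

```python
import math


def camera_distance(alpha, beta, d):
    """Distance between cameras from the subtended angle alpha (rad), the
    bearing beta (rad) and the marker spacing d (assumed 2D closed form)."""
    c = math.cos(alpha)
    root = math.sqrt(1.0 - (c * math.sin(beta)) ** 2)
    return d / (2.0 * math.sin(alpha)) * (c * math.cos(beta) + root)


def relative_position(bearing, alpha_a, beta_b, alpha_b, beta_a, d):
    """Scale a bearing vector to the average of the two distance estimates.

    alpha_a, beta_a are measured in camera A's image; alpha_b, beta_b in
    camera B's image. Each estimate pairs alpha from one image with beta
    from the other, as the text describes.
    """
    l_first = camera_distance(alpha_a, beta_b, d)
    l_second = camera_distance(alpha_b, beta_a, d)
    l_avg = 0.5 * (l_first + l_second)
    norm = math.sqrt(sum(v * v for v in bearing))
    return [l_avg * v / norm for v in bearing]
```

Averaging the two estimates only reduces noise in the range; the direction of the returned vector is entirely determined by the measured bearing.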
III-B Underwater Vision for Accurate Blob Detection
Vision processing underwater is much more challenging than in air due to light scattering from suspended plankton and other matter, which causes blurring and “snow” effects; loss of contrast; and loss of color information with depth. Moreover, the visibility conditions change with the time of day and the currents. The proposed approach was tested in different conditions, as can be seen in Section IV, Figure 5(a-d). The influence of underwater conditions such as color loss, blurring, and illumination changes has been studied by Oliver et al.
We propose a detection method which accounts for the distortions of light underwater by estimating the positions of markers from the visible cone of light they produce. Each image is converted to the HSV color space and then thresholded. The threshold values are tuned to each environment, as can be seen in Figure 5(a-d,i-l), where the lighting conditions are clearly different. The next step in each binary image is to identify the different blobs of light. First, morphological closing is applied and distinct regions are extracted from the binary image using contour detection. Then, at the two ends of the bounding rectangle of each detected region, the centroids of the brightest pixels are selected and compared to each other. The brightest side is assumed to be the one closest to the illuminating landmark. This is apparent not only from the images presented in Figure 5(a-d,i-l) but also from the external observer view in Figure 1. When a marker is not distorted significantly, both centroids are approximately equal to the center of the region. The above procedure results in a small number of landmark candidates. In particular, during operation inside a cave environment or at night, where there is no ambient light, the divers carry additional lights, which results in additional blobs being detected; see Figure 5(a), where there are three lights, and the corresponding Figure 5(e), where three blobs are detected. Next, the outlier rejection techniques are outlined, which output the most plausible pair of landmarks for each image. The two pairs are then processed as described in the previous section and the relative pose is computed.
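The step that picks the brighter end of each detected region can be sketched as below. This is a simplified illustration, not the authors' code: it assumes the region is already given as a list of `(x, y, brightness)` pixels, it uses an axis-aligned bounding box rather than a rotated rectangle, and the parameters `end_fraction` and `top_k` are hypothetical tuning values.

```python
def marker_from_blob(pixels, end_fraction=0.25, top_k=5):
    """Estimate the marker position of one light blob.

    pixels: list of (x, y, brightness) tuples belonging to one region.
    Returns the centroid of the brightest pixels at the brighter end.
    """
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    # Treat the longer side of the bounding box as the beam direction.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    lo = min(p[axis] for p in pixels)
    hi = max(p[axis] for p in pixels)
    band = end_fraction * (hi - lo)

    def brightest_centroid(end_pixels):
        # Centroid and mean brightness of the top_k brightest pixels.
        best = sorted(end_pixels, key=lambda p: p[2], reverse=True)[:top_k]
        n = len(best)
        cx = sum(p[0] for p in best) / n
        cy = sum(p[1] for p in best) / n
        mean_v = sum(p[2] for p in best) / n
        return (cx, cy), mean_v

    end_a = [p for p in pixels if p[axis] <= lo + band]
    end_b = [p for p in pixels if p[axis] >= hi - band]
    (ca, va), (cb, vb) = brightest_centroid(end_a), brightest_centroid(end_b)
    # The brighter end is assumed to be the one nearest the light source.
    return ca if va >= vb else cb
```

For a compact, undistorted blob the two end bands overlap heavily, so both centroids fall near the center of the region, which matches the behavior described in the text.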
III-C Outlier Rejection
A verification test, unique to our approach, is applied to all candidate pairs of markers in the two images. As mentioned in Section III-A, there are two ways to calculate the distance between the cameras: either by using the angle subtended by two candidate landmarks in one image together with the bearing to the average (mid-point) of two marker candidates in the other image, or by doing the converse. The validity of a set of candidate markers is determined by the difference between the two distance estimates. Since Eq. (1) is a closed-form solution, its computation time is low (less than on a standard computer).
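A minimal sketch of this consistency test is given below. It assumes that each candidate marker pair has already been reduced to two angles, the angle subtended by the pair and the bearing of its mid-point, and that both nodes use the same marker spacing `d`. The closed form in `camera_distance` is the assumed 2D circle-and-line result, and all names are hypothetical.

```python
import math
from itertools import product


def camera_distance(alpha, beta, d):
    """Assumed 2D closed form for the distance between the cameras."""
    c = math.cos(alpha)
    root = math.sqrt(1.0 - (c * math.sin(beta)) ** 2)
    return d / (2.0 * math.sin(alpha)) * (c * math.cos(beta) + root)


def best_candidate_pairs(cands_a, cands_b, d):
    """Pick the candidate pairs whose two distance estimates agree best.

    cands_a, cands_b: lists of (alpha, beta) tuples, one per candidate
    marker pair in the image of camera A and camera B respectively.
    Returns (gap, index_a, index_b, averaged_distance).
    """
    best = None
    indexed = product(enumerate(cands_a), enumerate(cands_b))
    for (i, (al_a, be_a)), (j, (al_b, be_b)) in indexed:
        l_first = camera_distance(al_a, be_b, d)   # alpha from A, beta from B
        l_second = camera_distance(al_b, be_a, d)  # the converse
        gap = abs(l_first - l_second)
        if best is None or gap < best[0]:
            best = (gap, i, j, 0.5 * (l_first + l_second))
    return best
```

Because every combination is scored with two closed-form evaluations, the cost grows only with the product of the number of candidate pairs in the two images.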
However, if many outliers are present in the images, additional data from the magnetometer, IMU, and depth sensors are used to eliminate the remaining outliers. Contrary to most robotic applications, where the presence of motors makes the magnetometer’s measurements unreliable, in the proposed CL technique the magnetic field is used to identify the azimuth of each node and to estimate the relative yaw between the nodes. In addition, the IMU is utilized to infer the roll and pitch of each device using measurements from the accelerometers. Lastly, depth sensor data provide an estimate of the relative depth. The collected measurements are then used to eliminate erroneous pairs of landmarks that appear as false positives in the previous processing. See for example Figure 5(e), where three candidate markers were identified by the blob detection process, but the correct two markers were chosen by the outlier rejection system.
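One way such a sensor-based check could look is sketched below. It is an assumption-laden illustration rather than the method used in the paper: the roll and pitch formulas assume a static node with the z axis aligned with gravity when level, the sign conventions for yaw and depth are arbitrary, and the tolerances `yaw_tol` and `depth_tol` are hypothetical values.

```python
import math


def roll_pitch_from_accel(ax, ay, az):
    """Roll and pitch (rad) from a static accelerometer reading."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return roll, pitch


def angle_diff(a, b):
    """Smallest signed difference a - b, wrapped to (-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def consistent_with_sensors(cand_yaw, cand_dz, yaw_a, yaw_b,
                            depth_a, depth_b,
                            yaw_tol=math.radians(20.0), depth_tol=0.3):
    """Check a vision-based candidate against magnetometer and depth data.

    cand_yaw: relative yaw implied by the candidate marker pairs (rad).
    cand_dz: relative depth implied by the candidate (same unit as depth_*).
    yaw_a, yaw_b: magnetometer azimuths of the two nodes (rad).
    """
    sensor_yaw = angle_diff(yaw_b, yaw_a)
    yaw_ok = abs(angle_diff(cand_yaw, sensor_yaw)) <= yaw_tol
    depth_ok = abs(cand_dz - (depth_b - depth_a)) <= depth_tol
    return yaw_ok and depth_ok
```

A candidate that fails either comparison would be discarded before the pose is reported; the tolerances would need to be chosen from the actual noise of the sensors in use.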
IV Experimental Results
Extensive experiments were conducted in different locations to ensure the robustness of the system. In the following, first we present the hardware used and then describe the locations where the experiments were performed. Experimental results from different locations are described next and finally, we present a quantitative study conducted in our lab, using identical hardware while measuring the ground truth.
IV-A Experimental Setup
Two underwater nodes were constructed using a custom waterproof case capable of reaching more than depth. The processing is based on a Raspberry Pi 3 computer connected to a Raspberry Pi Camera Module v2, a Pololu MinIMU-9 v3 IMU, and a Bar30 High-Resolution depth sensor. The cost was intentionally kept low to encourage the adoption of the system by the underwater cave exploration and marine archeology communities. Two aluminum bars are rigidly attached to the case, and two dive lights are mounted on them; see Figure 6 for the general appearance of the system. During experiments, the two landmark lights were kept at distance; however, they could be mounted in different positions, varying the spacing in between and .
In order to verify the versatility of the developed approach, the system was tested in a wide range of environments. In line with the primary application, the Ballroom cavern at Ginnie Springs, Florida, was used to emulate a cave environment. No ambient light and clear water characterize this testbed; see Figure 1. The nodes were tested at a depth of to . Similar conditions were encountered during tests on a night dive over the coral reefs of Barbados. The effect of ambient light was tested in three other scenarios. The first was the high turbidity waters of Lake Murray in South Carolina, where the landmark lights produced long cones of illumination that needed to be addressed. The clear waters of Barbados’ coral reefs and the spring fed waters of the basin outside the cavern at Ginnie Springs, FL, during the day produced a different set of challenges due to several false positives produced by the caustic patterns.
IV-B Underwater Tests
Different trajectories were tested in different environments. In all experiments, one node was kept stationary and the other was moved. The magnetic and inertial data are used to set the attitude of the stationary node. Note that, even though it was held to the ground, water movement had an effect, especially on the node’s attitude. Figure 7(a) presents a small segment of a trajectory inside the Ballroom cavern, where the moving node was moved back and then moved in a circle to test different depths and orientations. The main challenge we observed with this dataset was the existence of additional light sources and, at times, reflections on the cave walls. Figure 7(b) displays a trajectory collected during a night dive over a coral reef in Barbados. Due to the clear waters, the system was able to detect the landmarks and produce the 3D relative pose while the moving node was moved in different patterns. Figure 7(c) shows a short trajectory collected in Lake Murray, SC. Although this was during a bright day, the visibility was very low due to particulates in the water, so the landmarks disappeared after a short distance, even to the human eye. Finally, Figure 7(d) displays a longer trajectory (approximately ) collected at the basin fed from the clear waters of Ginnie Springs (just outside the cavern). The challenge here came from the caustic patterns at the bottom. However, as discussed earlier, the outlier rejection ensures the correct pair of landmarks is selected.
To verify the effectiveness of using sensors other than the camera for additional outlier rejection, the number of correct marker selections was counted both for camera-only rejection and for rejection using the additional sensor data. During underwater tests, the camera-only rejection made correct detections while the full system was correct.
Because it is difficult to obtain an accurate ground truth pose estimate underwater, the CL system was compared to AR tag based cooperative localization for quantitative validation underwater. Two AR tags were attached to each node and used for relative pose calculation. The results of CL are compared to AR tag detection in Figure 8. For tags of a size comparable to the sensors, the AR tags were less robust than the CL measurements, with many missed estimates. Several outliers appear in the results of bearing only CL. This usually occurs when one beacon exits the camera’s FoV, in which case the outlier rejection does not have enough information to make the correct decision.
IV-C Ground Truth Above Water
The identical hardware setup, without the lights and depth sensor, was recreated for testing in the lab while establishing ground truth; see Figure 9. The different components were tested separately, including the IMU parameters and the performance of the magnetometer. The two nodes were placed apart in fixed positions and the distance between them was measured using a measuring tape. AR tags were also used to calculate the relative pose as validation. Figure 10 presents a plot of the error between the calculated distance and the measured distance as a function of the measured distance between the nodes, for both CL and AR tag based estimation. The error is bounded within when the two nodes were moved from to .
An analytical solution for 3D bearing only cooperative localization was augmented to operate underwater with the addition of IMU, magnetometer, and depth sensors. Challenging underwater conditions highlighted the effect of particulates in the water. As can be seen in most underwater images, the lights produced a beam with the brightest part at the source but with significant brightness all around. In addition, reflections and the presence of other light sources produced several initial false positive blob detections; however, the outlier rejection introduced in this paper was robust in our experiments and selected the correct landmark pairs in most cases.
For improved incorporation of IMU, magnetometer, and depth sensors in the future, the sensor data will be used in a multi-sensor fusion system rather than simply for filtering of candidate markers. This will allow a more fluid relative pose estimate and improve outlier rejection.
We are currently discussing a collaboration with divers from the Woodville Karst Plain Project (WKPP, http://www.wkpp.org/) for deploying the proposed system in the Turner Sink cave system in Florida. Future work will consider human factors for the deployment of this technology. The spacing between the landmark lights is crucial for achieving better accuracy over further distances. However, the turbidity of the water introduces additional constraints. We plan to analyze the relationship between landmark displacement and range to achieve the optimal arrangement for different visibility environments. Furthermore, we are considering deploying this system at a marine archeology dig in Greece to study its effectiveness.
Deploying the proposed setup on different AUVs or on one AUV and a fixed point will enable the creation of a motion capture system underwater. Such an experimental setup will introduce much needed ground truth estimates for underwater applications.
The authors would like to thank the National Science Foundation for its support (NSF 1513203, 1637876). We would like to thank fellow diver Lisa JongSoon Goodlin for assisting in the Ginnie Springs dives and collecting some of the videos.
-  I. M. Rekleitis, G. Dudek, and E. E. Milios, “On multiagent exploration,” in Proc. Vision Interface, 1998, pp. 455–461.
-  R. Kurazume, S. Nagata, and S. Hirose, “Cooperative positioning with multiple robots,” in Proc. ICRA, vol. 2, 1994, pp. 1250–1257.
-  W. Burgard, M. Moors, D. Fox, R. Simmons, and S. Thrun, “Collaborative multi-robot exploration,” in Proc. ICRA, vol. 1, 2000, pp. 476–481.
-  I. Rekleitis, G. Dudek, and E. Milios, “Multi-robot collaboration for robust exploration,” Ann. Math. Artif. Intell., vol. 31, no. 1-4, pp. 7–40, 2001.
-  S. I. Roumeliotis and G. A. Bekey, “Distributed multirobot localization,” IEEE Trans. Robot. Autom., vol. 18, no. 5, pp. 781–795, 2002.
-  A. Mourikis and S. Roumeliotis, “Performance analysis of multirobot cooperative localization,” IEEE Trans. Robot., vol. 22, no. 4, pp. 666–681, 2006.
-  K. Y. K. Leung, T. D. Barfoot, and H. H. T. Liu, “Decentralized localization of sparsely-communicating robot networks: A centralized-equivalent approach,” IEEE Trans. Robot., vol. 26, pp. 62–77, 2010.
-  I. Rekleitis, P. Babin, A. DePriest, S. Das, O. Falardeau, O. Dugas, and P. Giguere, “Experiments in quadrotor formation flying using on-board relative localization,” in Vision-based Control and Navigation of Small, Lightweight UAVs Workshop, IROS, 2015.
-  G. Papadopoulos, M. F. Fallon, J. J. Leonard, and N. M. Patrikalakis, “Cooperative localization of marine vehicles using nonlinear state estimation,” in Proc. IROS, 2010, pp. 4874–4879.
-  A. Bahr, J. J. Leonard, and M. F. Fallon, “Cooperative localization for autonomous underwater vehicles,” Int. J. Robot. Res., vol. 28, no. 6, pp. 714–728, 2009.
-  N. Weidner, S. Rahman, A. Q. Li, and I. Rekleitis, “Underwater cave mapping using stereo vision,” in Proc. ICRA, 2017, pp. 5709–5715.
-  K. Siddiqi, S. Bouix, A. Tannenbaum, and S. W. Zucker, “Hamilton-Jacobi skeletons,” Int. J. Comp. Vision, vol. 48, no. 3, pp. 215–231, 2002.
-  O. Dugas, P. Giguère, and I. Rekleitis, “6DoF Camera Localization for Mutually Observing Robots,” in Proc. ISRR, 2013.
-  R. Kurazume, S. Hirose, S. Nagata, and N. Sashida, “Study on cooperative positioning system (basic principle and measurement experiment),” in Proc. ICRA, vol. 2, 1996, pp. 1421–1426.
-  G. Dudek, M. Jenkin, E. Milios, and D. Wilkes, “A taxonomy for multi-agent robotics,” Auton. Robot., vol. 3, pp. 375–397, 1996.
-  D. Fox, W. Burgard, H. Kruppa, and S. Thrun, “Collaborative multi-robot localization,” in Proc. 23rd Ann. German Conf. Artificial Intell. (KI), 1999, pp. 255–266.
-  A. Howard, M. Mataric, and G. Sukhatme, “Localization for mobile robot teams using maximum likelihood estimation,” in Proc. IROS, 2002, pp. 434–59.
-  I. Rekleitis, G. Dudek, and E. Milios, “Probabilistic cooperative localization and mapping in practice,” in Proc. ICRA, 2003, pp. 1907–1912.
-  R. Grabowski, L. E. Navarro-Serment, C. J. J. Paredis, and P. Khosla, “Heterogeneous teams of modular robots for mapping and exploration,” Auton. Robot., vol. 8, no. 3, pp. 293–308, 2000.
-  A. Franchi, P. Stegagno, and G. Oriolo, “Probabilistic mutual localization in multi-agent systems from anonymous position measures,” in Proc. CDC, 2010, pp. 6534–6540.
-  A. Martinelli and A. Renzaglia, “Cooperative visual-inertial sensor fusion: Fundamental equations,” in Proc. Int. Symp. Multi-Robot and Multi-Agent Systems (MRS), 2017.
-  S. I. Roumeliotis and I. M. Rekleitis, “Propagation of uncertainty in cooperative multirobot localization: Analysis and experimental results,” Auton. Robot., vol. 17, no. 1, pp. 41–54, 2004.
-  Y. Dieudonné, O. Labbani-Igbida, and F. Petit, “Deterministic robot-network localization is hard,” IEEE Trans. Robot., vol. 26, no. 2, pp. 331–339, 2010.
-  A. Cristofaro and A. Martinelli, “3D cooperative localization and mapping: Observability analysis,” in Proc. Amer. Control Conf., 2011, pp. 1630–1635.
-  G. P. Huang, N. Trawny, A. I. Mourikis, and S. I. Roumeliotis, “On the consistency of multi-robot cooperative localization,” in Proc. RSS, 2009, pp. 65–72.
-  E. D. Nerurkar, S. I. Roumeliotis, and A. Martinelli, “Distributed maximum a posteriori estimation for multi-robot cooperative localization,” in Proc. ICRA, 2009, pp. 1402–1409.
-  K. Y. K. Leung, Y. Halpern, T. D. Barfoot, and H. H. T. Liu, “The UTIAS multi-robot cooperative localization and mapping dataset,” Int. J. Robot. Res., vol. 30, no. 8, pp. 969–974, 2011.
-  I. Rekleitis, G. Dudek, and E. Milios, “Multi-Robot Cooperative Localization: A Study of Trade-offs Between Efficiency and Accuracy,” in Proc. IROS, 2002, pp. 2690–2695.
-  N. Trawny, X. S. Zhou, and S. I. Roumeliotis, “3D relative pose estimation from six distances,” in Proc. RSS, 2009, pp. 233–240.
-  X. Zhou and S. Roumeliotis, “Determining the robot-to-robot 3D relative pose using combinations of range and bearing measurements: 14 minimal problems and analytical solutions to 3 of them,” in Proc. IROS, 2010, pp. 2983–2990.
-  ——, “Determining 3D relative transformations for any combination of range and bearing measurements,” IEEE Trans. Robot., vol. 29, no. 2, pp. 458–474, 2012.
-  P. Giguere, I. Rekleitis, and M. Latulippe, “I see you, you see me: Cooperative Localization through Bearing-Only Mutually Observing Robots,” in Proc. IROS, 2012, pp. 863–869.
-  V. Dhiman, J. Ryde, and J. J. Corso, “Mutual localization: Two camera relative 6-dof pose estimation from reciprocal fiducial observation,” in Proc. IROS, 2013, pp. 1347–1354.
-  S. Skaff, J. Clark, and I. Rekleitis, “Estimating surface reflectance spectra for underwater color vision,” in Proc. Brit. Mach. Vision Conf. (BMVC), 2008, pp. 1015–1024.
-  K. Oliver, W. Hou, and S. Wang, “Image feature detection and matching in underwater conditions,” in Proc. SPIE 7678, Ocean Sensing and Monitoring II, 2010.
-  S. Niekum. (2016) ROS wrapper for Alvar, an open source AR tag tracking library. [Online]. Available: wiki.ros.org/ar_track_alvar
-  X. Wu, R. E. Stuck, I. Rekleitis, and J. M. Beer, “Towards a framework for human factors in underwater robotics,” in Proc. Human Factors and Ergon. Soc. Int. Ann. Meeting, vol. 59, 2015, pp. 1115–1119.