SL Sensor: An Open-Source, ROS-Based, Real-Time Structured Light Sensor for High Accuracy Construction Robotic Applications

by Teng Foong Lam, et al.

High accuracy 3D surface information is required for many construction robotics tasks such as automated cement polishing or robotic plaster spraying. However, consumer-grade depth cameras currently on the market are not accurate enough for these tasks, where millimeter (mm)-level accuracy is required. We present SL Sensor, a structured light sensing solution capable of producing high fidelity point clouds at 5Hz by leveraging phase shifting profilometry (PSP) codification techniques. We compared SL Sensor to two commercial depth cameras - the Azure Kinect and RealSense L515. Experiments showed that the SL Sensor surpasses the two devices in both precision and accuracy. Furthermore, to demonstrate SL Sensor's ability to be a structured light sensing research platform for robotic applications, we developed a motion compensation strategy that allows the SL Sensor to operate during linear motion, whereas traditional PSP methods only work when the sensor is static. Field experiments show that the SL Sensor is able to produce highly detailed reconstructions of spray plastered surfaces. The software and a sample hardware build of the SL Sensor are made open-source with the objective of making structured light sensing more accessible to the construction robotics community. All documentation and code is available at .






1 Introduction

Many automated construction tasks require high accuracy millimetre (mm)-level 3D sensing to ensure successful task execution. One example is automated cement polishing where the robotic system needs to detect uneven bumps on the wall surface to determine the areas that require additional grinding. Another example is robotic plaster spraying Ercan Jenny et al. (2020), where high accuracy sensing is required to perform process monitoring and determine both the location and timing of additional surface treatment Frangez et al. (2020).

Known sensor technologies for robotic 3D sensing can be roughly sorted into stereoscopic vision, structured light and time-of-flight (ToF) Yang et al. (2019). From these, structured light is the most accurate for surface interaction Liu et al. (2020). Similar to stereoscopic vision, structured light sensors triangulate 3D geometry between camera and projector, but pixels can be matched based on additional information transported in the projection. Therefore, using multiple projections for a single scan yields sub-pixel accuracy and reduced measurement uncertainty Zhang (2018b); Rathjen (1995), but requires a strategy to handle motion between these projections. Commonly available sensors in robotic applications are therefore limited to ToF or single-projection structured light.

In this paper, we present SL Sensor, an open-source structured light sensor capable of producing detailed 3D scans of sub-mm accuracy in real time. Our sensor integrates with the robot’s software and therefore allows the projected pattern, as well as the number of projections per scan, to be changed adaptively. While any pattern can be used, we mainly utilise the phase shifting profilometry (PSP) codification technique, which involves shining multiple sinusoidal patterns at the measured surface, to produce a high fidelity reconstruction of the scanned area. PSP techniques usually require the sensor to be static during the entire scanning process and any movement causes artifacts in the resulting scan. A naïve strategy based on our sensor’s adaptation capability would already improve over existing sensors: use high-accuracy PSP projections when the robot is still, and fall back to lower accuracy single-projection techniques Huang and Tang (2014); Takeda and Mutoh (1983) when the robot is moving. However, to enable high-accuracy sensing for more interactive robotic construction applications, we further propose a motion compensation strategy that allows PSP-based sensing during linear motion. Our motion compensation is based on a specific characteristic of robotic sensing: because the robot controls its own motion, it can anticipate and compensate for this motion by adapting the projected pattern of the SL Sensor accordingly.

While there are existing open-source implementations of structured light scanning Wilm et al. (2014a); Herakleous and Poullis (2014), to the best of our knowledge SL Sensor is the first open-source structured light scanner project that provides not only the software, but also a documented sample hardware build that can be made using easily available components and open-source electronics. Moreover, SL Sensor is fully compatible with the well-established ROS middleware, enabling easy integration into existing robotic systems.

2 Method

2.1 Background

In the following, we give a short introduction to the working principles of structured light scanning. The more informed reader may skip this subsection.

Structured light involves a projector shining one or more codified light patterns on the object surface. The geometry of the object deforms the pattern and the resulting scene is captured by one or more cameras. The projector can be seen as an inverse camera and the patterns help to establish distinct and accurate camera-projector correspondences regardless of whether the surface is textured or not.

Over the years, numerous light patterns have been developed Salvi et al. (2004). Out of these, the PSP codification technique is distinctive because of its ability to 1) establish sub-pixel accurate camera-projector correspondences, 2) produce scans of high spatial resolution, which is necessary to detect minute details on construction surfaces, 3) achieve high scanning speeds using hardware triggering and 4) tolerate some lens defocusing and hence work over a larger scanning range Zuo et al. (2018). Moreover, experiments from Hyun et al. (2019) have demonstrated that PSP structured light scanning works well on materials commonly found on construction sites. One of the most straightforward PSP patterns with low computational complexity is the 3+3 pattern Zuo et al. (2018). The 3+3 pattern first projects 3 high frequency sinusoidal patterns, each shifted by a phase of $2\pi/3$ rad. The resulting images $I_i$, $i \in \{1, 2, 3\}$, can be modelled as

$$I_i(u_c, v_c) = A(u_c, v_c) + B(u_c, v_c) \cos\left(\phi_h(u_c, v_c) - \frac{2\pi (i-1)}{3}\right) \tag{1}$$

where $(u_c, v_c)$ is the camera pixel coordinate, $A$ is the average intensity, $B$ is the modulation intensity while $\phi_h$ is the wrapped phase. The wrapped phase can be solved using the equation

$$\phi_h = \operatorname{mod}\left(\operatorname{atan2}\left(\sqrt{3}\,(I_2 - I_3),\; 2I_1 - I_2 - I_3\right),\; 2\pi\right) \tag{2}$$

$\operatorname{atan2}$ is the four-quadrant inverse tangent function and the modulo operation over $2\pi$ is performed so that the wrapped phase values range over $[0, 2\pi)$, which is a convention that has been used in existing PSP-related works Zhang (2018a); Larkin (2001). The phase information from the high frequency pattern contains several wraps and it is difficult to differentiate between the multiple fringes. The process of resolving these phase ambiguities is termed phase unwrapping. The most straightforward way of solving this issue is to use the standard temporal phase unwrapping (TPU) technique. TPU recovers the absolute phase $\Phi$ by projecting another 3 phase-shifted sinusoidal patterns of unit frequency and using the resulting images to compute the unit-frequency phase $\phi_u$. The absolute phase can then be computed using the equations

$$k = \operatorname{round}\left(\frac{N_f\,\phi_u - \phi_h}{2\pi}\right) \tag{3}$$

$$\Phi = \phi_h + 2\pi k \tag{4}$$

where $N_f$ is the number of fringes in the high frequency patterns and $\operatorname{round}(\cdot)$ rounds the value to the closest integer. As a final step, we convert the absolute phase to its corresponding horizontal ($u_p$) or vertical ($v_p$) projector pixel coordinate, based on whether the sinusoidal pattern propagates along the projector’s x or y axis respectively

$$u_p \ \text{or} \ v_p = \frac{\Phi\,\lambda}{2\pi} \tag{5}$$

where $\lambda$ is the fringe wavelength in projector pixels.
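The decoding chain described above (wrapped phase, temporal phase unwrapping, conversion to a projector coordinate) can be illustrated with a minimal NumPy sketch. It assumes the standard three-step formulation with phase shifts of 0, 2π/3 and 4π/3; the function names, the synthetic test signal and the choice of 16 fringes over a 912-pixel-wide pattern are illustrative assumptions, not values taken from the SL Sensor code base.

```python
import numpy as np

TWO_PI = 2.0 * np.pi


def wrapped_phase(i1, i2, i3):
    """Wrapped phase in [0, 2*pi) from three images shifted by 2*pi/3 each."""
    phase = np.arctan2(np.sqrt(3.0) * (i2 - i3), 2.0 * i1 - i2 - i3)
    return np.mod(phase, TWO_PI)


def unwrap_temporal(phi_high, phi_unit, n_fringes):
    """Absolute phase of the high-frequency pattern, using the unit-frequency
    phase to pick the fringe order k for every pixel."""
    k = np.round((n_fringes * phi_unit - phi_high) / TWO_PI)
    return phi_high + TWO_PI * k


def phase_to_projector_coord(abs_phase, wavelength_px):
    """Convert absolute phase to a (sub-pixel) projector coordinate."""
    return abs_phase * wavelength_px / TWO_PI


def synth_images(coord, wavelength_px, a=0.5, b=0.4):
    """Ideal three-step images for a given projector coordinate (test helper)."""
    phase = TWO_PI * coord / wavelength_px
    return [a + b * np.cos(phase - TWO_PI * i / 3.0) for i in range(3)]


if __name__ == "__main__":
    width, n_fringes = 912, 16                # assumed values for the demo
    wavelength = width / n_fringes            # fringe wavelength in pixels
    true_coord = np.linspace(5.0, width - 5.0, 500)

    phi_h = wrapped_phase(*synth_images(true_coord, wavelength))
    phi_u = wrapped_phase(*synth_images(true_coord, width))
    decoded = phase_to_projector_coord(
        unwrap_temporal(phi_h, phi_u, n_fringes), wavelength)
    print("max decoding error [px]:", np.max(np.abs(decoded - true_coord)))
```

On noise-free synthetic data the decoded coordinate matches the input to numerical precision; with real images, noise in the unit-frequency phase is what limits how many fringes can be unwrapped reliably.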

2.2 Point Triangulation

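The pinhole camera model is applied to both the camera and projector. The pinhole camera model maps 3D points $(x_w, y_w, z_w)$ in the world coordinate frame to 2D pixel coordinates $(u, v)$. This gives the equations

$$s_c \begin{bmatrix} u_c \\ v_c \\ 1 \end{bmatrix} = K_c \left[ R_c \mid t_c \right] \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} = P_c \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} \tag{6}$$

$$s_p \begin{bmatrix} u_p \\ v_p \\ 1 \end{bmatrix} = K_p \left[ R_p \mid t_p \right] \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} = P_p \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} \tag{7}$$

where the subscripts $c$ and $p$ denote the camera and projector, $s$ is the scaling factor and $K$ is the intrinsic matrix. $R$ is the rotation matrix and $t$ is the translation vector, which describe the orientation and position of the world frame respectively, with respect to the device frame. $[R \mid t]$ is known as the extrinsic matrix and $P = K [R \mid t]$ is the projection matrix. We define the world frame to coincide with the camera frame. Therefore, $R_c$ is an identity matrix and $t_c$ is a zero column vector. Given camera pixel coordinate $(u_c, v_c)$, its corresponding $u_p$ or $v_p$ computed in the previous subsection and the projection matrices ($P_c$ and $P_p$), we can establish 5 equations with 5 unknowns ($x_w$, $y_w$, $z_w$, $s_c$, $s_p$). Solving this system of equations algebraically Liu et al. (2010) provides us with the 3D triangulated point ($x_w$, $y_w$, $z_w$).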
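As a rough illustration of this triangulation step, the sketch below eliminates the two scaling factors so that the five equations reduce to a 3×3 linear system in the world coordinates, which is then solved numerically rather than with the closed-form algebraic solution cited above. All function names and the calibration values in the demo are hypothetical.

```python
import numpy as np


def projection_matrix(K, R, t):
    """3x4 projection matrix P = K [R | t]."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])


def triangulate(u_c, v_c, proj_coord, P_c, P_p, proj_row=0):
    """Triangulate one 3D point from a camera pixel and ONE decoded projector
    coordinate (proj_row=0: horizontal coordinate, proj_row=1: vertical)."""
    rows = np.array([
        P_c[0] - u_c * P_c[2],                   # camera u constraint
        P_c[1] - v_c * P_c[2],                   # camera v constraint
        P_p[proj_row] - proj_coord * P_p[2],     # projector constraint
    ])
    # rows @ [x, y, z, 1]^T = 0  ->  rows[:, :3] @ [x, y, z]^T = -rows[:, 3]
    return np.linalg.solve(rows[:, :3], -rows[:, 3])


def project(P, point):
    """Pinhole projection of a 3D point to pixel coordinates."""
    h = P @ np.append(point, 1.0)
    return h[:2] / h[2]


if __name__ == "__main__":
    # Hypothetical calibration: camera frame == world frame.
    K_c = np.array([[1500.0, 0.0, 720.0], [0.0, 1500.0, 540.0], [0.0, 0.0, 1.0]])
    K_p = np.array([[1200.0, 0.0, 456.0], [0.0, 1200.0, 570.0], [0.0, 0.0, 1.0]])
    a = np.deg2rad(10.0)
    R_p = np.array([[np.cos(a), 0.0, np.sin(a)],
                    [0.0, 1.0, 0.0],
                    [-np.sin(a), 0.0, np.cos(a)]])
    P_c = projection_matrix(K_c, np.eye(3), np.zeros(3))
    P_p = projection_matrix(K_p, R_p, [-0.15, 0.0, 0.05])

    point = np.array([0.05, -0.02, 0.6])
    u_c, v_c = project(P_c, point)
    u_p, _ = project(P_p, point)
    print(triangulate(u_c, v_c, u_p, P_c, P_p))
```

Which projector coordinate is usable depends on the geometry: the coordinate measured along the camera-projector baseline gives a well-conditioned system, while the perpendicular one is close to degenerate for that camera.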

2.3 Motion Compensation under Linear Motion

2.3.1 Problem Formulation

Many PSP patterns including the 3+3 pattern sequence require multiple projections, making them sensitive to motion. Moving either the measured surface or the sensor would result in motion ripples in the final point cloud. There are two main approaches to enable multi-shot PSP methods to work during motion: 1) project the patterns at high speeds such that the effects of motion are negligible Zhang et al. (2010) 2) take into account motion using mathematical modelling and then take measures to compensate, minimise or mitigate its effects. For our method, we perform the latter.

Figure 1: Diagram illustrating the two problems that must be solved to perform motion compensation for a single 3D point in space (i.e. the red dot). Green dots show the location of captured images. A 3-step PSP vertical pattern is projected, with one image of the sequence serving as the reference image.

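Fig. 1 illustrates the two key problems that have to be solved to successfully perform motion compensation. The first is the pixel matching problem, where we need to know how a 3D point’s location in the image shifts over time (estimating the pixel shifts $(\Delta u_i, \Delta v_i)$). The second is the phase offset problem, where the movement introduces a phase error $\epsilon_i$ because the projector coordinate of a single 3D point, in the propagation direction of the sinusoidal pattern, varies over time. For a phase-shifting PSP pattern with $N$ steps, the $i$-th image $I_i$ ($i \in \{1, \dots, N\}$) in the sequence is modelled as

$$I_i(u_c + \Delta u_i, v_c + \Delta v_i) = A(u_c, v_c) + B(u_c, v_c) \cos\left(\phi(u_c, v_c) - \frac{2\pi (i-1)}{N} + \epsilon_i\right) \tag{8}$$

where $A$, $B$ and $\phi$ are the average intensity, modulation intensity and phase observed at camera pixel $(u_c, v_c)$ of the reference image. Solving the full motion compensation problem is no trivial task because, as seen from eqn. 8, it involves estimating $\Delta u_i$, $\Delta v_i$ and $\epsilon_i$ for all camera pixels and across all images.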

2.3.2 Proposed Motion Compensation Strategy

To simplify the motion compensation problem, we first limit the motion to be linear along a single axis with no change in orientation, in the direction perpendicular to the propagation direction of the sinusoidal pattern. By doing so, the encoded projector pixel coordinate of a 3D point is independent of the linear movement and hence no phase offset is induced Li et al. (2012).

For the pixel matching problem, most existing works model the changes in the images over time as a 2D rigid transformation, which allows the majority of the pixels to be matched by simply translating and/or rotating the images. Various methods to find the required 2D transformations have been explored, including the tracking of external markers Chen et al. (2015), tracking of visual keypoints Lu et al. (2017), object tracking Flores et al. (2018) and phase correlation image registration Wilm et al. (2015). In our case, we perform phase correlation image registration to find the single direction translations to align the images. Note that Wilm et al. (2015) perform one image registration between every frame set and then evenly interpolate the image shifts between the individual images. For our proposed method, we perform image registration between the chosen reference image and every other image in the frame set, enabling our method to perform well even in cases of non-uniform motion.

Phase correlation image registration is a computationally efficient, frequency-domain based technique for aligning two images, first introduced by Reddy and Chatterji (1996). It is robust to occlusions and hence works well for our case where the projected sinusoidal patterns cause regions in the scene to be unevenly illuminated over the captured image sequence.

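We define two images $g_1$ and $g_2$ that are of size $W \times H$ to have a displacement of $(\Delta x, \Delta y)$ between them

$$g_2(x, y) = g_1(x - \Delta x, y - \Delta y) \tag{9}$$

Let $G_1$ and $G_2$ be the discrete Fourier transform (DFT) of $g_1$ and $g_2$ respectively. The normalised cross-power spectrum between $g_1$ and $g_2$ is given by

$$R = \frac{G_1^{*} \circ G_2}{\left| G_1^{*} \circ G_2 \right|} \tag{10}$$

where $^{*}$ denotes the complex conjugate and $\circ$ is the Hadamard (elementwise) product. The required translation ($\Delta x$, $\Delta y$) can be found by finding the peak of $R$’s inverse Fourier transform (IFT)

$$(\Delta x, \Delta y) = \underset{(x, y)}{\arg\max}\; \mathcal{F}^{-1}\{R\}(x, y) \tag{11}$$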
After solving the pixel matching problem using phase correlation image registration and since no phase error is induced due to the constrained motion, the aligned images can then be processed using standard PSP algorithms to obtain a point cloud in which the effects of motion are minimised.
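A minimal NumPy sketch of phase correlation image registration is given below, assuming purely translational, circular shifts and integer-pixel peak detection (no sub-pixel refinement, windowing or restriction to a single axis). The function names are illustrative and do not correspond to the SL Sensor nodelet API.

```python
import numpy as np


def phase_correlation_shift(ref, img):
    """Estimate the integer (dy, dx) such that img ~= ref shifted by (dy, dx)."""
    f_ref = np.fft.fft2(ref)
    f_img = np.fft.fft2(img)
    cross = np.conj(f_ref) * f_img
    cross /= np.maximum(np.abs(cross), 1e-12)   # normalised cross-power spectrum
    corr = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the half-way point correspond to negative shifts.
    dy, dx = (int(p) if p <= n // 2 else int(p) - n
              for p, n in zip(peak, corr.shape))
    return dy, dx


def align_to_reference(ref, images):
    """Shift every image back so that it overlays the reference image."""
    aligned = []
    for img in images:
        dy, dx = phase_correlation_shift(ref, img)
        aligned.append(np.roll(img, shift=(-dy, -dx), axis=(0, 1)))
    return aligned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((64, 96))
    moved = np.roll(ref, shift=(3, -7), axis=(0, 1))   # simulated motion
    print(phase_correlation_shift(ref, moved))
```

Registering every image against the reference, as in `align_to_reference`, mirrors the per-image registration described above and does not assume that the shift grows uniformly between consecutive images.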

3 Implementation

3.1 Hardware

Figure 2: Labelled diagram of SL Sensor

The built SL Sensor is shown in Fig. 2. It contains two industrial complementary metal oxide semiconductor (CMOS) colour cameras (1440×1080 resolution) and a Digital Light Processing (DLP) projector (912×1140 resolution). The cameras and projector are hardware triggered by a Versavis board Tschopp et al. (2019) that facilitates the synchronisation of pattern projection and image acquisition. The SL Sensor also has an IMU that was not used for the proposed algorithms but can be used in the future for SL motion compensation strategies where a motion estimate is required Liu et al. (2018), e.g. visual-inertial odometry. The cameras and projector are positioned such that their fields of view overlap when scanning an object placed 0.3-1.0m away from the sensor.

3.2 Hardware Triggering

Figure 3: Triggering schedule that allows the SL sensor to produce scans at 5Hz

In order to achieve high speed scanning, the projector and cameras are hardware triggered by the open-source Versavis board and the triggering schedule for a single scan is shown in Fig. 3. The cameras are triggered at 30Hz while the projector is triggered at 5Hz and set to project an entire pattern sequence after each trigger. Each pattern is exposed twice to ensure that each image will capture the full projection of a single pattern despite any inherent image capture latencies. Note that a full projection needs to be captured because DLP projectors display grayscale patterns using binary pulse-width modulation Hornbeck (1997).

On the computer’s end, the projector trigger timings published over ROS by the Versavis node are used to identify the received images. Given the projector trigger time of a pattern and the timestamp of an image, we identify the image as the match for that pattern if its timestamp falls within a time window (eqn. 12) defined by the trigger time, the time interval between camera triggers and a tolerance parameter tuned for our system.

3.3 Ensuring a Linear Intensity Response

The PSP decoding process assumes that the intensity of the projected light and the intensity measured by the camera have a linear relationship. To achieve this, all camera image processing options (e.g. gamma correction, automatic gain control, etc.) need to be turned off. Furthermore, the projector is set to pattern mode where it displays patterns stored in its flash memory without any additional image processing. This is a better alternative to displaying images over an HDMI feed, where the projector automatically applies gamma correction to the images that needs to be compensated for using additional steps Guo et al. (2004); Zhang (2015).

3.4 Sensor Calibration

Sensor calibration is required to obtain the intrinsics (intrinsic matrix and lens distortion coefficients) as well as extrinsics (transformation between camera and projector). We adopt the well known lens distortion model proposed by Bouguet (2015), which consists of 5 coefficients - 3 for radial distortion and 2 for tangential distortion.
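For reference, a five-coefficient radial-tangential distortion model of this kind can be sketched as below. The coefficient ordering (k1, k2, p1, p2, k3) follows the convention used by OpenCV and is an assumption about how the coefficients are stored; the function name is illustrative.

```python
import numpy as np


def distort_normalised(x, y, k1, k2, p1, p2, k3):
    """Apply radial (k1, k2, k3) and tangential (p1, p2) distortion to
    normalised image coordinates (x, y) = (X/Z, Y/Z)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d


if __name__ == "__main__":
    # Pure radial distortion pushes an off-centre point outwards for k1 > 0.
    print(distort_normalised(0.5, 0.0, k1=0.1, k2=0.0, p1=0.0, p2=0.0, k3=0.0))
```

The distorted normalised coordinates are then mapped to pixels with the intrinsic matrix; undistortion is the inverse of this mapping and is usually solved iteratively.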

Calibration of the SL Sensor is done between each camera-projector pair separately. This is sufficient for our current use case where the depth estimation process only utilises one camera-projector pair at any given time, but it can be extended to a joint calibration sequence if future applications require it. This pairwise calibration is done with the procedure from Wilm et al. (2014b) whereby PSP patterns are projected onto a grey checkerboard calibration target. The computed shading image is used to extract the checkerboard patterns and a local homography method Moreno and Taubin (2012) is used to extract the corresponding projector coordinates from the decoded projector coordinate maps. With the checkerboard coordinates from both the camera and projector, the intrinsics and extrinsics of the two devices are estimated using OpenCV’s Camera Calibration and 3D Reconstruction library Bradski and Kaehler (2008).

The complete calibration steps for SL Sensor are as follows:

  1. Take images of the calibration board using the primary camera

  2. Perform pairwise calibration to get the primary camera intrinsics, projector intrinsics and primary camera-projector extrinsics

  3. Take images of the calibration board using the secondary camera

  4. Perform pairwise calibration to get the secondary camera intrinsics and secondary camera-projector extrinsics while fixing the projector intrinsics to those estimated in step 2

In our case, the primary camera refers to the top camera while the secondary camera is the left camera.

3.5 Software

The SL Sensor software is written under the Robot Operating System (ROS) framework. This ensures that it can be easily used with or integrated into existing robotic packages and solutions contributed by the ROS user community.

The main reconstruction pipeline is broken down into 4 main nodelets (Fig. 4). The usage of nodelets enables efficient zero copy pointer passing of images between subprocesses while still ensuring the modularity of the pipeline. Parts of the Decoder and Triangulator nodelets adapt code from the SLStudio software package Wilm et al. (2014a).

  1. The Image Synchroniser nodelet takes in timestamped images as well as projector trigger timings from the Versavis ROS nodelets and based on eqn. 12, groups images that belong to the same pattern sequence into a single image array for downstream processing. To enable the image synchroniser, the user must first send a service call to indicate the pattern sequence to be projected and the number of scans required. It will in turn send a service call to the Projector node to initialise the flashing of patterns by the Lightcrafter 4500 before the image grouping process is started.

  2. The Linear Motion Compensation nodelet performs phase correlation image alignment on the image array as detailed in section 2.3.2. If not required, the reconstruction pipeline can be initialised without it.

  3. The Decoder nodelet receives the captured images and converts them into horizontal and/or vertical projector coordinate maps as specified in section 2.1.

  4. The Triangulator nodelet receives the decoded projector coordinate maps and uses them to generate the final point cloud. To ensure real-time performance, the computations stated in section 2.2 are sped up using a pre-computed determinant tensor Valkenburg and McIvor (1997).

Figure 4: Block Diagram of SL Sensor’s Software System

4 Sensor Evaluation

4.1 Static Tests

We compared the SL Sensor with two commercial-grade depth cameras - the Azure Kinect and the RealSense L515. These depth cameras are commonly used in mobile robotic applications for depth sensing.

To evaluate sensor accuracy, a custom evaluation board (Fig. 5) was scanned. The board consists of 6 metallic cones mounted on a glass sheet, coated with a fine layer of non-reflective white paint. Cone fitting was performed from the 3D point clouds (10 scans per sensor) and the distances between Cone 1 and all other cones were computed. The ground truth distances were obtained using a metrology-grade GOM ATOS Core 300 structured light scanner that has an accuracy of approximately 10 μm. The results of the cone fitting test are shown in Table 1. For the SL Sensor, the distances computed deviated from the ground truth by less than 1mm, with the exception of the 1-4 distance (Cone 1 to Cone 4) measured from the left camera. In general, the left camera performed slightly worse than the top camera. This is mainly because the orientation of the camera resulted in some of the cones being only partially scanned (Fig. 6). Nevertheless, the SL Sensor outperformed the other two sensors in terms of both accuracy and measurement uncertainty.

Figure 5: The custom evaluation board used for the cone fitting accuracy test
Figure 6: Point clouds obtained from the evaluated sensors of the custom evaluation board
| Distance (cone pair) | SL Sensor (Top Cam) RMSE/mm | SL Sensor (Top Cam) Std Dev/mm | SL Sensor (Left Cam) RMSE/mm | SL Sensor (Left Cam) Std Dev/mm | RealSense L515 RMSE/mm | RealSense L515 Std Dev/mm | Azure Kinect RMSE/mm | Azure Kinect Std Dev/mm |
|---|---|---|---|---|---|---|---|---|
| 1-2 | 0.125 | 0.041 | 0.701 | 0.111 | 5.546 | 2.694 | 2.168 | 0.258 |
| 1-3 | 0.381 | 0.062 | 0.328 | 0.091 | 2.763 | 1.487 | 1.452 | 0.324 |
| 1-4 | 0.713 | 0.042 | 1.144 | 0.399 | 2.875 | 1.625 | 2.674 | 0.246 |
| 1-5 | 0.077 | 0.045 | 0.336 | 0.102 | 4.292 | 2.129 | 2.041 | 0.253 |
| 1-6 | 0.246 | 0.054 | 0.501 | 0.187 | 2.165 | 1.311 | 3.197 | 0.244 |

Table 1: Statistics of the measured distance between the fitted cones from the scans for the various sensors across 10 scans

To evaluate sensor precision, we took 10 consecutive static scans of a flat surface for each sensor and analysed how the measured depths varied for each pixel over the scans. This is similar to the tests done in Giancola et al. (2017); Tölgyessy et al. (2021); Zennaro et al. (2015). The results from the precision tests are plotted in Fig. 7. The Azure Kinect’s plot has a wavy pattern where half of the pixels have close to zero measurement uncertainty while the other half experienced some variations in readings (about 0.3mm in depth standard deviation) over consecutive scans. The RealSense L515 performed worse, with several spots in the image registering depth standard deviation values of over 0.7mm. The SL Sensor performed the best, as clearly seen from its predominantly dark blue plot. Notice that there is a dark border surrounding the SL Sensor’s plot. These are the regions of the image where no projected light was observed and hence they were not evaluated for this test. For all three plots, we computed the mean and maximum value of the depth standard deviation over all pixels that had valid depth information (Table 2). The SL Sensor reported the smallest mean and maximum values, almost four times lower than the values obtained from the Azure Kinect and RealSense L515. Hence, we can conclude that the SL Sensor is the most precise among the sensors tested.

Figure 7: Plots of the depth standard deviation of each pixel in each device
| Device Name | Mean Depth Std. Dev. / mm | Max Depth Std. Dev. / mm |
|---|---|---|
| Azure Kinect | 0.256 | 1.059 |
| RealSense L515 | 0.470 | 0.996 |
| SL Sensor | 0.070 | 0.251 |

Table 2: Table showing the mean and maximum values of the depth standard deviation over all valid pixels of the three devices
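The per-pixel precision statistic reported in Table 2 can be computed along the following lines. This is a sketch under assumptions: depth images are stacked as an array of shape (scans, height, width), invalid pixels are marked with NaN, and the sample standard deviation (`ddof=1`) is used; the paper does not state which estimator was applied.

```python
import numpy as np


def depth_std_stats(depth_stack):
    """Per-pixel depth standard deviation over repeated static scans.

    depth_stack: array of shape (n_scans, H, W); NaN marks invalid depth.
    Returns the std map (NaN where any scan is invalid) plus its mean and
    maximum over the valid pixels.
    """
    depth_stack = np.asarray(depth_stack, dtype=float)
    valid = np.all(np.isfinite(depth_stack), axis=0)   # valid in every scan
    std_map = np.full(depth_stack.shape[1:], np.nan)
    std_map[valid] = np.std(depth_stack[:, valid], axis=0, ddof=1)
    return std_map, float(np.mean(std_map[valid])), float(np.max(std_map[valid]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Ten synthetic scans of a flat surface at 500 mm with 0.07 mm noise.
    scans = 500.0 + 0.07 * rng.standard_normal((10, 48, 64))
    scans[:, :4, :] = np.nan                 # border without projected light
    _, mean_std, max_std = depth_std_stats(scans)
    print(f"mean std: {mean_std:.3f} mm, max std: {max_std:.3f} mm")
```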

4.2 Scanning of Surface Quality Samples

Surface roughness is a key metric in determining the surface finish quality of fabricated structures in a construction site. To determine SL Sensor’s potential to classify and assess surface roughness, we used it to scan four 17 × 17 cm specimens of gypsum plaster (Fig. 8), which is a commonly used surface finishing material for building interiors. 30cm patches of the scans were then used to compute the empirical standard deviations (ESD) of orthogonal distances from the best fit planes of each extracted patch. We compare these values with the ESD values reported in Frangez et al. (2020) that measured the same samples, but with a GOM ATOS Core 300 structured light scanner (ground truth) and a Lucid Helios time-of-flight (ToF) depth camera. As seen from the results in Table 3, while there is no discernible difference in ESD values across all samples for the Helios, the SL Sensor was able to pick up on the trend of increasing roughness across the samples as confirmed by the ATOS Core scanner.

Figure 8: Pictures of the surface quality samples that were measured
| Sensor | S1 ESD / mm | S2 ESD / mm | S3 ESD / mm | S4 ESD / mm |
|---|---|---|---|---|
| GOM ATOS Core 300 | 0.0068 | 0.0255 | 0.0532 | 0.1459 |
| SL Sensor | 0.0764 | 0.0800 | 0.0990 | 0.1546 |
| Lucid Helios | 0.91 | 0.88 | 0.87 | 0.94 |

Table 3: Table showing the ESD values obtained for the four gypsum plaster samples. The ESD values for the GOM ATOS Core 300 and Lucid Helios are those reported in Frangez et al. (2020)
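The ESD metric can be sketched as a total least squares plane fit followed by the standard deviation of the signed orthogonal distances. The SVD-based fit and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration; the exact fitting procedure behind the reported values is not specified here.

```python
import numpy as np


def plane_fit_esd(points):
    """Empirical standard deviation of orthogonal distances to the best-fit plane.

    points: array of shape (n, 3). Returns (esd, unit normal of the plane).
    """
    points = np.asarray(points, dtype=float)
    centred = points - points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # normal of the total least squares plane through the centroid.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    normal = vt[-1]
    distances = centred @ normal          # signed orthogonal distances
    return float(np.std(distances, ddof=1)), normal


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xy = rng.uniform(0.0, 30.0, size=(5000, 2))               # patch in mm
    z = 0.02 * xy[:, 0] + 0.08 * rng.standard_normal(5000)    # tilt + roughness
    esd, _ = plane_fit_esd(np.column_stack([xy, z]))
    print(f"ESD: {esd:.4f} mm")
```

Because the plane is fitted per patch, a global tilt of the specimen relative to the sensor does not inflate the roughness estimate.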

4.3 Scanning along a Linear Rail

Our motion compensation strategy was first tested on a single pattern sequence where we took images of a white mask. The top camera was used for the scanning and the vertical 3+3 pattern was projected. The sensor was shifted linearly to the right by 2mm after every image. Fig. 9 shows the scans produced with and without motion compensation. It is clear that our motion compensation strategy is able to remove the distortions caused by the linear motion. We also took a cross section of the resulting depth images at the location indicated by Fig. 10. The cross section plot further affirms the effectiveness of our motion compensation strategy as it is able to remove the 1-2mm deep motion ripples found on the mask’s forehead.

Figure 9: Images of the 3D point clouds with (right) and without (left) motion compensation of the scanned mask
Figure 10: Left: Image of the scanned mask and the yellow line shows the location of the cross section Right: Cross section plot at the forehead region of the mask scans

4.4 Scanning of a Spray Plastered Wall with Adaptive Projections

To demonstrate SL Sensor’s capabilities in an actual construction robotics task, we used it to scan segments of a spray plastered wall fabricated using the methodology described in Ercan Jenny et al. (2020). We attached the SL Sensor to a MABI Speedy 12 robotic arm which moved the sensor along a straight line trajectory at a speed of around 0.0125m/s. The sensor scanned two regions of the fabricated workpiece shown in Figure 11. We utilised the adaptive nature of our sensor and combined a horizontal and vertical trajectory over each region, switching the projected pattern and the camera used for triangulation in between these two motions. For each region, we performed a 50cm horizontal and 30cm vertical pass with motion compensated 3+3 pattern 3D scanning. The individual scans were then merged together using the pairwise point-to-plane iterative closest point (ICP) algorithm from the libpointmatcher library Pomerleau et al. (2013). For comparison, we obtained scans of the two regions using a Leica Nova MS50 laser scanner (referred to as TLS in this paper) that has an accuracy of 2mm. Point clouds from both the TLS and SL Sensor were then converted into meshes using Poisson surface reconstruction Kazhdan et al. (2006). While the precision of the TLS is lower than that of individual SL Sensor scans, it allows us to quantify the accumulated error due to point cloud registration.

Figure 11: Left: Image of the experimental setup. Right: Images of the two regions of the scanned spray plastered wall. The yellow and cyan boxes demarcate the areas scanned for the 50cm horizontal pass and 30cm vertical pass respectively. The arrows indicate the direction of scanning.

To compare the meshes from the TLS and SL Sensor, we first aligned them using point-to-point ICP. Next, we computed the cloud-to-mesh (C2M) distance between the vertex points of the SL Sensor mesh and the TLS mesh using the 3D point cloud processing software CloudCompare.

The resulting meshes and C2M distance histograms are shown in Fig. 12 and Fig. 13. Note that for the heat map of C2M distances, points that deviate by less than 1mm from the TLS mesh are coloured white. Visually, the meshes from the SL Sensor match well with those produced by the TLS. More quantitative results are presented in Table 4. The table reveals that 99% of SL Sensor mesh vertices deviate by less than 2mm and 3mm from the TLS mesh for the vertical and horizontal passes respectively. A slightly larger amount of deviation is to be expected for the horizontal passes since there would be greater error accumulation from the larger number of pairwise point cloud registrations performed. This is visually shown by the deviation heat map of the R1 horizontal pass where the left portion of the SL Sensor mesh matches well with the TLS mesh but we observe larger deviations on the right where the point clouds were registered last.

Another reason for large deviations between the SL Sensor and TLS meshes is the fact that the SL Sensor was able to capture minor details that the TLS could not. An example of this is shown in the A and B regions (Fig. 13) of the R2 horizontal pass. As shown in Fig. 14, the SL Sensor mesh successfully reproduced the minor edges and ridges that appear in the photos of the actual workpiece as compared to the TLS mesh where the entire area is smooth. The suspected cause of this inaccuracy in the TLS mesh is the fact that these surfaces were orientated away from the laser emitter of the TLS due to the undulating nature of the workpiece, and hence only a sparse point cloud of these regions could be obtained, resulting in the reduction of details captured.

Figure 12: Comparison of the meshes generated by the TLS and SL Sensor from the vertical pass experiments
Figure 13: Comparison of the meshes generated by the TLS and SL Sensor from the horizontal pass experiments
Figure 14: Close up view of the A and B regions labelled in Fig. 13
Scan Name       Mean/mm   Std. Dev/mm   0.5 Percentile C2M Distance/mm   99.5 Percentile C2M Distance/mm
R1 Vertical     0.01      0.53          -1.63                            1.36
R2 Vertical     0.01      0.65          -1.94                            1.65
R1 Horizontal   0.07      0.71          -2.43                            1.83
R2 Horizontal   0.1       1.06          -2.99                            2.87
Table 4: Statistics of the C2M distances computed between the TLS and SL Sensor meshes for the four scans of the spray plastered wall
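The columns of Table 4 can be reproduced from a list of signed C2M distances with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the percentile uses linear interpolation between order statistics, and the population standard deviation is assumed, neither of which is confirmed to match the settings used in CloudCompare.

```python
import math
import statistics

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of a sorted list, q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("percentile of an empty list is undefined")
    rank = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    frac = rank - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def c2m_summary(distances_mm):
    """Summary statistics of signed C2M distances given in millimetres."""
    vals = sorted(distances_mm)
    return {
        "mean_mm": statistics.fmean(vals),
        "std_mm": statistics.pstdev(vals),      # population std (assumption)
        "p0_5_mm": percentile(vals, 0.5),       # 0.5th percentile
        "p99_5_mm": percentile(vals, 99.5),     # 99.5th percentile
        # Share of vertices that would be coloured white in the heat maps.
        "frac_within_1mm": sum(abs(d) < 1.0 for d in vals) / len(vals),
    }
```

By construction, 99% of the vertices fall between the 0.5th and 99.5th percentiles, which is how the "99% within 2mm and 3mm" statement follows from the last two columns of the table.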

5 Conclusion

In this work, we introduced a new open-source structured light scanning solution. In contrast to existing sensor solutions, our SL Sensor integrates with existing robotic software over the ROS middleware framework to enable its adaptation to customised 3D scanning procedures. We described our software architecture, hardware setup and calibration procedure, and verified that the sensor achieves mm-level accuracy. We compared it to sensors commonly used in robotic applications as well as to commercial high-precision scanners, concluding that our sensor reaches sufficient accuracy for detailed construction applications. We further validated the effectiveness of our motion compensation strategy, which enables high-precision PSP scanning under linear motion, and showcased our sensor’s ability to switch between multiple patterns depending on the intended robot motion in a real-world construction setting.

Future work could extend the SL Sensor’s capability of scanning during linear motion to arbitrary 6 DoF movements. Possible solutions include a more robust motion compensation strategy, or an adaptive pattern projection approach that uses PSP patterns when the sensor is static and switches to a more motion-tolerant pattern when motion is detected. In addition, multi-way point cloud registration strategies can be explored to reduce error accumulation over scans and ultimately lead to a more accurate merged point cloud.

6 Acknowledgements

The authors would like to thank Johannes Pankert, Selen Ercan and Valens Frangez for the support they provided to make the experiments in this paper possible. Appreciation should also be given to Michael Riner-Kuhn who provided technical advice during the building of the SL Sensor.

7 Funding

This research was proposed and partially funded by an ETH Career Seed grant [grant no SEED-14 20-2]. It furthermore received funding from the HILTI group for research in accurate mobile construction robotics for cement polishing, and the European Union H2020 program under project PILOTING [grant no H2020-ICT-2019-2 871542].


References

  • J. Bouguet (2015) Camera calibration toolbox for matlab.
  • G. Bradski and A. Kaehler (2008) Learning opencv: computer vision with the opencv library. O’Reilly Media, Inc.
  • C. Chen, Y. Cao, L. Zhong, and K. Peng (2015) An on-line phase measuring profilometry for objects moving with straight-line motion. Optics Communications 336, pp. 301–305. ISSN 0030-4018.
  • [4] (2021) CloudCompare. [GPL software] (version 2.11.1).
  • S. Ercan Jenny, E. Lloret-Fritschi, F. Gramazio, and M. Kohler (2020) Crafting plaster through continuous mobile robotic fabrication on-site. Construction Robotics 4 (3), pp. 261–271.
  • S. Ercan Jenny, E. Lloret-Fritschi, D. Jenny, E. Sounigo, P. Tsai, F. Gramazio, and M. Kohler (n.d.) Robotic plaster spraying: crafting surfaces with adaptive thin-layer printing. 3D Printing and Additive Manufacturing.
  • J. L. Flores, G. A. Ayubi, J. M. Di Martino, O. E. Castillo, and J. A. Ferrari (2018) 3D-shape of objects with straight line-motion by simultaneous projection of color coded patterns. Optics Communications 414, pp. 185–190. ISSN 0030-4018.
  • V. Frangez, D. Salido-Monzú, and A. Wieser (2020) Depth-camera-based in-line evaluation of surface geometry and material classification for robotic spraying. In Proceedings of the 37th International Symposium on Automation and Robotics in Construction (ISARC), pp. 693–702. ISBN 978-952-94-3634-7.
  • S. Giancola, M. Valenti, and R. Sala (2017) A survey on 3d cameras: metrological comparison of time-of-flight, structured-light and active stereoscopy technologies. pp. 5–28.
  • H. Guo, H. He, and M. Chen (2004) Gamma correction for digital fringe projection profilometry. Appl. Opt. 43 (14), pp. 2906–2914.
  • K. Herakleous and C. Poullis (2014) 3DUNDERWORLD-sls: an open-source structured-light scanning system for rapid geometry acquisition. CoRR abs/1406.6595.
  • L. J. Hornbeck (1997) Digital Light Processing for high-brightness high-resolution applications. In Projection Displays III, M. H. Wu (Ed.), Vol. 3013, pp. 27–40.
  • B. Huang and Y. Tang (2014) Fast 3d reconstruction using one-shot spatial structured light. In 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 531–536.
  • J. Hyun, M. G. Carmichael, A. Tran, S. Zhang, and D. Liu (2019) Evaluation of fast, high-detail projected light 3d sensing for robots in construction. In 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA), pp. 1262–1267.
  • M. Kazhdan, M. Bolitho, and H. Hoppe (2006) Poisson surface reconstruction. In Proceedings of the Fourth Eurographics Symposium on Geometry Processing, SGP ’06, Goslar, DEU, pp. 61–70. ISBN 3905673363.
  • K. G. Larkin (2001) A self-calibrating phase-shifting algorithm based on the natural demodulation of two-dimensional fringe patterns. Opt. Express 9 (5), pp. 236–253.
  • Y. Li, Y. Cao, Z. Huang, D. Chen, and S. Shi (2012) A three dimensional on-line measurement method based on five unequal steps phase shifting. Optics Communications 285 (21), pp. 4285–4289. ISSN 0030-4018.
  • K. Liu, Y. Wang, D. L. Lau, Q. Hao, and L. G. Hassebrook (2010) Dual-frequency pattern scheme for high-speed 3-d shape measurement. Opt. Express 18 (5), pp. 5229–5244.
  • Y. Liu, N. Pears, P. L. Rosin, and P. Huber (2020) 3D imaging, analysis and applications. pp. 109–166.
  • Z. Liu, P. C. Zibley, and S. Zhang (2018) Motion-induced error compensation for phase shifting profilometry. Opt. Express 26 (10), pp. 12632–12637.
  • L. Lu, Y. Ding, Y. Luan, Y. Yin, Q. Liu, and J. Xi (2017) Automated approach for the surface profile measurement of moving objects based on psp. Opt. Express 25 (25), pp. 32120–32131.
  • D. Moreno and G. Taubin (2012) Simple, accurate, and robust projector-camera calibration. In 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization Transmission, pp. 464–471.
  • F. Pomerleau, F. Colas, R. Siegwart, and S. Magnenat (2013) Comparing ICP Variants on Real-World Data Sets. Autonomous Robots 34 (3), pp. 133–148.
  • C. Rathjen (1995) Statistical properties of phase-shift algorithms. J. Opt. Soc. Am. A 12 (9), pp. 1997–2008.
  • B. S. Reddy and B. N. Chatterji (1996) An FFT-based technique for translation, rotation, and scale-invariant image registration. IEEE Transactions on Image Processing 5 (8), pp. 1266–1271.
  • J. Salvi, J. Pagès, and J. Batlle (2004) Pattern codification strategies in structured light systems. Pattern Recognition 37 (4), pp. 827–849. Note: Agent Based Computer Vision. ISSN 0031-3203.
  • M. W. Takeda and K. Mutoh (1983) Fourier transform profilometry for the automatic measurement of 3-d object shapes. Applied Optics 22 (24), pp. 3977.
  • [28] (2017-07) TI dlp lightcrafter™ 4500 evaluation module user’s guide. Texas Instruments.
  • M. Tölgyessy, M. Dekan, L. Chovanec, and P. Hubinský (2021) Evaluation of the azure kinect and its comparison to kinect v1 and kinect v2. Sensors 21 (2). ISSN 1424-8220.
  • F. Tschopp, M. Riner, M. Fehr, L. Bernreiter, F. Furrer, T. Novkovic, A. Pfrunder, C. Cadena, R. Siegwart, and J. I. Nieto (2019) VersaVIS: an open versatile multi-camera visual-inertial sensor suite. CoRR abs/1912.02469.
  • R. J. Valkenburg and A. M. McIvor (1997) Accurate 3D measurement using a structured light system. In Three-Dimensional Imaging and Laser-Based Systems for Metrology and Inspection II, K. G. Harding and D. J. Svetkoff (Eds.), Vol. 2909, pp. 68–80.
  • J. Wilm, O. V. Olesen, and R. Larsen (2014a) SLStudio: open-source framework for real-time structured light. In 2014 4th International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 1–4.
  • J. Wilm, O. V. Olesen, and R. Larsen (2014b) Accurate and simple calibration of DLP projector systems. In Emerging Digital Micromirror Device Based Systems and Applications VI, M. R. Douglass, P. S. King, and B. L. Lee (Eds.), Vol. 8979, pp. 46–54.
  • J. Wilm, O. V. Olesen, R. R. Paulsen, and R. Larsen (2015) Correction of motion artifacts for real-time structured light. In Image Analysis, R. R. Paulsen and K. S. Pedersen (Eds.), Cham, pp. 142–151. ISBN 978-3-319-19665-7.
  • S. Yang, Y. Seo, J. Kim, H. Kim, and K. Jeong (2019) Optical MEMS devices for compact 3D surface imaging cameras. Micro and Nano Systems Letters 7 (07). ISSN 2213-9621.
  • S. Zennaro, M. Munaro, S. Milani, P. Zanuttigh, A. Bernardi, S. Ghidoni, and E. Menegatti (2015) Performance evaluation of the 1st and 2nd generation kinect for multimedia applications. In 2015 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6.
  • S. Zhang, D. V. D. Weide, and J. Oliver (2010) Superfast phase-shifting method for 3-d shape measurement. Opt. Express 18 (9), pp. 9684–9689.
  • S. Zhang (2015) Comparative study on passive and active projector nonlinear gamma calibration. Appl. Opt. 54 (13), pp. 3834–3841.
  • S. Zhang (2018a) Absolute phase retrieval methods for digital fringe projection profilometry: a review. Optics and Lasers in Engineering 107, pp. 28–37. ISSN 0143-8166.
  • S. Zhang (2018b) High-speed 3d shape measurement with structured light methods: a review. Optics and Lasers in Engineering 106, pp. 119–131. ISSN 0143-8166.
  • C. Zuo, S. Feng, L. Huang, T. Tao, W. Yin, and Q. Chen (2018) Phase shifting algorithms for fringe projection profilometry: a review. Optics and Lasers in Engineering 109, pp. 23–59. ISSN 0143-8166.