Recent advances in computational imaging have made it possible to image around corners, a capability known as non-line-of-sight (NLOS) imaging. The ability to see around corners would be beneficial in various applications. For example, detecting objects around a corner enables autonomous vehicles to avoid collisions. Detecting and localizing people without entering dangerous environments makes rescue operations safer and more efficient. NLOS imaging can also be used for medical applications such as endoscopy, where the region of interest is hard for the sensors to access directly (Fig. 1).
Fig. 2 shows typical scene setups for NLOS imaging techniques. While many techniques use other parts of the electromagnetic (EM) spectrum or acoustic waves to image around corners and through walls [Adib2013Wifi, Adib:2015, Lindell:2019:Acoustic], we focus on works that use light (electromagnetic waves in the visible and infrared spectrum). Fig. 3 shows an overview of the current state of NLOS imaging.
NLOS imaging was first proposed by Raskar and Davis [Raskar5DT], and demonstrated by Kirmani et al. [Kirmani09] for the recovery of hidden planar patches. Velten et al. [Velten12] showed the first full reconstruction of a 3D object that was hidden around a corner. These works showed that time-of-flight (ToF) measurements of photons returning from the hidden scene after three bounces (Fig. 2 (a)) contain sufficient information to recover the occluded scene. Following these works, different sensing modalities were used for NLOS imaging. For example, Katz et al. [Katz14] first demonstrated the use of a speckle pattern to reconstruct 2D images. Tancik et al. [Tancik2018cosi, Tancik2018FlashPF] and Bouman et al. [Bouman17] demonstrated reconstruction and tracking of the hidden object with a traditional RGB camera.
The main challenge in NLOS imaging is that the photons reflected from the hidden object are scattered at the line-of-sight surfaces (e.g., wall and floor). Computational imaging techniques model light transport to recover the information of the hidden scene. However, their applications in practice are still limited. This paper reviews the recent advances in NLOS imaging and discusses the challenges towards real-world applications. We note that other types of imaging tasks, such as seeing through scattering media, are often also referred to as NLOS imaging, but in this paper, we use the term “NLOS imaging” to refer solely to seeing around corners.
|[Velten12, OToole:2018:ConfocalNLOS, Heide:2019:OcclusionNLOS, Gupta12, Buttafava15, Arellano17, Laurenzis14, Manna2018ErrorBA, Laurenzis:15, Jin:18, Pediredla17, Iseringhausen2018NonLineofSightRU, Adithya2019:SNLOS, Tsai17, Xin:19, tsai2019beyond, Heide14DiffuseMirror, Kadambi16, liu2019phasor_nlos, Lindell:2019:Wave]||None||Planar/Specular Object [chen_2019_nlos]|
|Included in 3D||Diffraction Limited|
|Resolution [Katz14, Viswanath:18]|
|cm Resolution [Batarseh18, Beckus2018MultimodalNP]||Coarse Resolution [chen_2019_nlos], Thermal [ICCP19_Maeda]|
|Occlusion Dependent [Tancik2018cosi, Tancik2018FlashPF, Sasaki18, Baradad2018InferringLF, Thrampoulidis2018ExploitingOI, Saunders2019Periscopy, Yedidia_2019_CVPR]|
|cm Precision|
|[Pandharkar11, Gariepy:16, Chan17FastTracking, Chan2017NonlineofsightTO, Caramazza2018NeuralNI, Boger-Lambard18]||1D Distance Recovery [Batarseh18]|
|3D Tracking [Smith_2018_CVPR]||Occlusion Dependent [Bouman17, Tancik2018FlashPF, Seidel2019]|
|6D Tracking [Klein-2016-Tracking], 3D localization [ICCP19_Maeda, Chandran2019]|
|Human Pose Classification [Lei_2019_CVPR]||Object Type Classification [Tancik2018FlashPF, Chandran2019],|
|Human Pose Detection [ICCP19_Maeda]|
1.1 What is NLOS Imaging?
Fig. 2 illustrates different strategies to see around corners. Mathematically, NLOS imaging can be formulated as the following forward model:
$$y = f(x) + n, \tag{1}$$
where $x$ represents the hidden scene parameters, such as the albedo, motion, or class of the hidden object; $y$ is the measurement; $f$ is the mapping from the hidden scene to the measurement, which depends on the choice of illumination, sensor, and the geometry of the corner; and $n$ denotes the measurement noise.
The goal of NLOS imaging is to design sensing schemes such that $f$ can be inverted, and to build algorithms such that $x$ is robustly and efficiently recovered given $y$. Recently, data-driven approaches showed that the inversion of Eq. 1 can be learned even when $f$ is not directly available [Tancik2018FlashPF].
Full Scene Reconstruction: The objective of NLOS imaging is to see around a corner as if there were a direct line of sight into the hidden scene. Reconstruction tasks include the recovery of the 3D, 2D, or 1D shape of the hidden object or light fields. In reconstruction, $x$ is a direct representation of the hidden scene, such as a voxelized probability map, surface, or light transport in the hidden scene.
Direct Scene Inference: Reconstruction often requires long acquisition times and time-consuming computation. However, full scene reconstruction might not even be necessary in many scenarios; for example, detecting the presence of a moving person is sufficient for rescue operations. We define “inference” as the estimation of latent parameters (e.g., location, class, or motion) without direct reconstruction of the hidden scene. In this case, the unknown parameter $x$ can be the location, type, or movement of the hidden object. Because inference does not require full reconstruction, it can potentially be performed with less computation and fewer measurements than full scene reconstruction.
1.2 Overview of Sensing Schemes for NLOS Imaging
We classify the existing techniques into three categories based on their sensing modalities: time-of-flight, coherence, and intensity. Table 1 provides an overview of the current state of each method regarding NLOS imaging tasks. Sections 2–4 review NLOS imaging techniques that exploit different sensing modalities, and Section 5 presents the challenges of NLOS imaging for each category.
Time-of-Flight-Based Approach (Section 2): As the photons travel from the source, the first bounce occurs on an adjoining wall or directly on the floor near the corner. The second bounce is on the hidden object around the corner. The third bounce is again on a wall or a floor in the line-of-sight region and back towards the measurement system, as shown in Fig. 2 (a). Time-resolved imaging techniques utilize the photons associated with the third bounce, which constrains possible locations of the hidden object. Section 2 reviews the reconstruction and inference algorithms for time-resolved sensing based methods.
Coherence-Based Approach (Section 3): While the information on the hidden geometry is largely lost by the diffuse reflection of photons on the relay wall, some coherence properties of light are preserved. Coherence-based methods exploit the fact that coherence preserves information about the occluded scene. Section 3 reviews NLOS imaging techniques that utilize speckle and spatial coherence of light.
Intensity-Based Approach (Section 4): Occlusions from the corner or occlusions within the hidden scene can provide rich information on the hidden scene. Using occlusions, it is possible to recover the hidden scene with a typical intensity (RGB) camera, such as a smartphone camera (Fig. 2 (c)). While many intensity-based approaches rely on occlusions, some works demonstrate NLOS imaging by exploiting surface reflectance of the relay wall or the object. Section 4 reviews the intensity-based approach.
1.3 Cameras to Image Around Corners
Different hardware has been used to see around corners. We briefly introduce such hardware to describe what sensors and illumination sources are used for NLOS imaging.
Streak Camera + Pulsed Laser: A streak camera acquires temporal information by mapping the arrival time of photons onto a 1D spatial axis with sweep electrodes, resulting in a 2D measurement of space and time. Among ToF cameras, streak cameras offer the best time resolution, down to sub-picosecond or even 100 femtoseconds (sub-mm, down to 0.03 mm distance resolution). The disadvantages of streak cameras are the high cost (typically more than $100k), low photon efficiency, high noise level, and the need for line scanning or additional optics [Liang14]. This results in a potentially longer acquisition time to reach a sufficient signal-to-noise ratio (SNR). The first works on NLOS imaging used a Hamamatsu C5680 for the experiments with a Kerr-lens mode-locked Ti:sapphire laser (50 fs pulse width).
SPAD + Pulsed Laser: A single-photon avalanche diode (SPAD) can detect the arrival of a single photon with a time jitter of 20–100 ps (6–30 mm distance resolution). Unlike streak cameras, SPADs can be arranged in 2D arrays that capture 3D measurements without the need for line scanning. While the temporal resolution of a SPAD is poorer than that of a streak camera, its photon efficiency and SNR are better. Commonly used single-pixel SPAD detectors are from Micro Photon Devices, and a commonly used time-correlated single-photon counting device is the Hydraharp from PicoQuant. These cost around $5k and $20-35k, respectively.
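The distance resolutions quoted for streak cameras and SPADs follow from multiplying the time resolution by the speed of light. The short Python sketch below reproduces that arithmetic; the function name is an illustrative choice, and the path-length convention (no factor of two for a round trip) is an assumption made because it matches the figures quoted in the text.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum (m/s)


def time_to_distance_mm(time_resolution_s: float) -> float:
    """Path length in millimetres that light travels in the given time."""
    return C_M_PER_S * time_resolution_s * 1e3


# Time resolutions quoted in the text for each sensor type
for label, dt in [("streak camera, 100 fs", 100e-15),
                  ("SPAD, 20 ps", 20e-12),
                  ("SPAD, 100 ps", 100e-12)]:
    print(f"{label}: {time_to_distance_mm(dt):.3f} mm")
```

The results are about 0.03 mm, 6 mm, and 30 mm, in line with the ranges given for the two sensor types.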
AMCW ToF Camera + Modulated Light Source: An amplitude-modulated continuous-wave (AMCW) ToF camera measures the phase shift between emitted and received modulated light through three or more correlation measurements [Bttgen05, Buttgen08]. The distance resolution of AMCW ToF cameras is limited by the modulation frequency and the SNR. While AMCW ToF cameras can achieve few-mm resolution [Yang15], they are susceptible to strong ambient light because they need a longer exposure time than pulse-based ToF cameras. AMCW ToF cameras have a much lower cost (ranging from $400 to $2000) than SPADs and streak cameras and are used in commercial products, including the Microsoft Kinect and Photonic Mixer Device (PMD).
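To make the phase-based depth measurement concrete, the following Python sketch simulates four correlation samples and recovers the depth from them. It assumes the common four-phase scheme with reference offsets of 0°, 90°, 180°, and 270° and an idealized sinusoidal correlation; the function names, the 30 MHz modulation frequency, and the signal levels are hypothetical values chosen for illustration, not parameters of any specific camera.

```python
import numpy as np

C = 299_792_458.0  # speed of light (m/s)


def simulate_correlations(depth_m, f_mod_hz, amplitude=1.0, offset=2.0):
    """Ideal correlation samples at reference phase offsets 0, 90, 180, 270 deg."""
    phase = 4.0 * np.pi * f_mod_hz * depth_m / C  # round-trip phase shift
    ref_offsets = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
    return offset + amplitude * np.cos(phase - ref_offsets)


def depth_from_correlations(samples, f_mod_hz):
    """Recover depth from four correlation samples (four-phase method)."""
    c0, c1, c2, c3 = samples
    # c1 - c3 = 2A sin(phase), c0 - c2 = 2A cos(phase); the offset cancels
    phase = np.arctan2(c1 - c3, c0 - c2) % (2.0 * np.pi)
    return C * phase / (4.0 * np.pi * f_mod_hz)


f_mod = 30e6        # hypothetical modulation frequency (Hz)
true_depth = 1.25   # metres, inside the unambiguous range c / (2 f_mod)
samples = simulate_correlations(true_depth, f_mod)
estimated_depth = depth_from_correlations(samples, f_mod)
print(f"estimated depth: {estimated_depth:.4f} m")
```

Because the phase wraps every 2π, depths beyond c / (2 f_mod) alias back into the unambiguous range. In this idealized model, noise on the samples perturbs the recovered phase, and a higher modulation frequency shrinks the depth error caused by a given phase error, which is consistent with the resolution depending on both quantities.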
Traditional Camera: In this paper, we use “traditional camera” to refer to the sensors such as charge-coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) arrays, which measure irradiance images without ToF information. Traditional cameras can be used for both intensity and speckle measurement modalities. The combination of a diffuser and traditional camera enables the measurement of cross-correlation of temporal coherence to provide passive ToF measurement [Boger-Lambard18]. Traditional cameras are the most ubiquitous and affordable sensor discussed in this paper but do not acquire direct measurements of time-of-flight information.
Interferometer: Interference between multiple light waves provides depth information at the μm scale, which can be used for ToF-based NLOS imaging of microscopic scenes. For example, an optical coherence tomography system with a temporally and spatially incoherent LED showed 10 μm resolution and demonstrated NLOS reconstruction of an object as small as a coin [Xin:19]. Heterodyne interferometry, which utilizes the interference of light with different wavelengths, demonstrated NLOS imaging at 70 μm precision [Willomitzer:18]. A dual-phase Sagnac interferometer captures complex-valued spatial coherence functions as a change of measured intensity and is used for spatial coherence-based methods. Such an interferometer is not commercially available, and we refer readers to [RezvaniNaraghi:17] for the design and details of a dual-phase Sagnac interferometer. An interferometer is a sensitive instrument and often requires careful alignment and isolation from mechanical vibrations.
2 Time-of-Flight-Based NLOS Imaging
Among the many works in NLOS imaging, ToF-based techniques are the most popular due to their ability to resolve the path length of the three-bounce photons that carry the information of the hidden scene. The ToF measurement of three-bounce photons can also be used for estimating the bidirectional reflectance distribution function (BRDF) of materials [Naik:2011]. While this review paper only considers imaging around corners, ToF measurements are also useful for seeing through scattering media [Satat:15, Kumar07, Raviv:14, Satat:16, Satat:18], analyzing light transport [Velten2013FemtophotographyCA, Kadambi:2013, Bhandari:14, Kadambi:14, Wu2014, Gariepy2015SinglephotonSL, Kadambi:16Interferometry, OToole:2017:SPAD], and novel imaging systems [Satat:16Compressive, Heshmat:16, Heshmat:18].
The Benefit of ToF Measurement: As discussed in the supplementary material of Velten et al. [Velten12], the change of intensity due to the displacement of a patch, illustrated in Fig. 4 (a), depends on the size, displacement, and depth of the patch, and the intensity change of a time-resolved measurement, illustrated in Fig. 4 (b), scales far more favorably with the same quantities. Drawing upon an example from [Velten12], for a patch size of 5 mm, a displacement of 5 mm, and a depth of z = 20 cm, the fractional change of the intensity is 0.00003 for the traditional camera and 0.01 for the ToF camera. This example shows that the intensity change is below the sensitivity of the traditional camera. Furthermore, the SNR is low because the number of three-bounce photons is small. For this reason, existing demonstrations of intensity-based NLOS imaging rely on occlusions or a more specular BRDF of the wall. ToF measurements can resolve this change and thereby recover the 3D geometry of the hidden object.
2.1 Reconstruction Algorithms
ToF measurements provide ellipsoidal constraints on the possible object locations in the hidden scene, as illustrated in Fig. 5. Let $w_1$ and $w_2$ denote the points on the wall where a photon undergoes the first and third bounces, and let $p$ denote the point on the object where the photon undergoes the second bounce. Moreover, let $s$ and $d$ denote the locations of the light source and the camera. Then the hidden object must lie on an ellipsoid that satisfies
$$\|w_1 - s\| + \|p - w_1\| + \|w_2 - p\| + \|d - w_2\| = ct, \tag{2}$$
where $c$ and $t$ are the speed of light and the time of travel, respectively.
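The ellipsoidal constraint can be checked numerically. The Python sketch below computes the three-bounce travel time for a candidate object point and tests whether it matches a measured time. The names `s`, `w1`, `p`, `w2`, and `d` are illustrative labels for the source, first wall point, object point, second wall point, and camera, and the coordinates and tolerance are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light (m/s)


def three_bounce_time(s, w1, p, w2, d):
    """Travel time for the path source -> wall -> object -> wall -> camera."""
    s, w1, p, w2, d = (np.asarray(v, dtype=float) for v in (s, w1, p, w2, d))
    length = (np.linalg.norm(w1 - s) + np.linalg.norm(p - w1)
              + np.linalg.norm(w2 - p) + np.linalg.norm(d - w2))
    return length / C


def on_ellipsoid(p, s, w1, w2, d, t, tol_s=1e-12):
    """True if candidate point p is consistent with the measured time t."""
    return abs(three_bounce_time(s, w1, p, w2, d) - t) < tol_s


# Hypothetical geometry: relay wall in the plane z = 0, hidden scene at z > 0
source = camera = (0.0, -1.0, -1.0)
w1, w2 = (-0.2, 0.0, 0.0), (0.2, 0.0, 0.0)
p_true = (0.0, 0.3, 0.5)
t_measured = three_bounce_time(source, w1, p_true, w2, camera)

p_mirror = (0.0, -0.3, 0.5)  # same distances to w1 and w2, so same ellipsoid
p_far = (0.0, 0.0, 1.0)      # different path length, so off the ellipsoid
print(on_ellipsoid(p_mirror, source, w1, w2, camera, t_measured))
print(on_ellipsoid(p_far, source, w1, w2, camera, t_measured))
```

The mirrored point illustrates why a single time measurement is ambiguous: every point on the ellipsoid with foci at the two wall points yields the same travel time, so measurements from several wall points are needed to narrow down the object location.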
Most of the existing techniques consider discretized voxels to recover the hidden geometry, using a physics-based forward model that maps the hidden scene to the measurement. Many works express the forward model with the ellipsoidal constraint as a linear inverse problem, which can be solved by back-projection or optimization:
$$y = Ax + n, \tag{3}$$
where $x$, $y$, and $n$ denote the vectorized target voxels, ToF measurements, and noise, respectively. $A$ represents the transient light transport, including the ellipsoidal constraints described by Eq. 2, the intensity fall-off due to the surface reflectance, and the distance between the object and the wall. The theory of light transport of multi-bounce photons is studied by Seitz et al. [Seitz:2005] for non-time-resolved imaging, and by Raskar et al. [Raskar5DT] for time-resolved imaging.
In NLOS imaging, Eq. 3 is often ill-posed, so filtering or regularization of the reconstruction is often necessary. Moreover, $A$ becomes large, because $y$ and $x$ are in general 5D measurements (3D measurement × 2D scanning) and 3D voxels, respectively. For example, if each 3D measurement has $N_y$ samples and the illumination is scanned over $N_s$ points, the $i$-th column of $A$ has $N_y N_s$ elements and maps the $i$-th voxel to the measurement. For a reconstruction with $N_x$ voxels, $A$ is an $N_y N_s$ by $N_x$ matrix. ToF-based methods mainly focus on efficient and robust algorithms to recover $x$.
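To get a feel for how large the light transport matrix of Eq. 3 can become, the Python sketch below computes its shape and the memory a dense copy would need. All sizes are hypothetical values picked for illustration (they are not the figures from the original example), and the helper names are assumptions.

```python
def transport_matrix_shape(n_time_bins, n_detector_points,
                           n_illumination_points, n_voxels):
    """Rows index the 5D measurement, columns index the 3D voxels."""
    rows = n_time_bins * n_detector_points * n_illumination_points
    return rows, n_voxels


def dense_size_gib(shape, bytes_per_entry=4):
    """Memory of a dense matrix with the given shape, in GiB."""
    rows, cols = shape
    return rows * cols * bytes_per_entry / 2**30


# Hypothetical sizes: 256 time bins, 32 x 32 detector points,
# 32 x 32 illumination points, 64 x 64 x 64 voxels
shape = transport_matrix_shape(n_time_bins=256,
                               n_detector_points=32 * 32,
                               n_illumination_points=32 * 32,
                               n_voxels=64**3)
print(shape, f"{dense_size_gib(shape):.0f} GiB as dense float32")
```

Even these moderate sizes give a matrix far too large to store densely, which is one reason practical methods rely on sparsity, on-the-fly evaluation of the forward operator, or setups that turn the operator into a convolution.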
Recently, a forward model beyond a linear model has been studied. For example, the shape of the surface can be recovered using fewer photons by modeling photons that travel specific paths called Fermat paths [Xin:19]. Phasor-field virtual wave optics enables the modeling of full light transport, including photons that bounce more than three times, in the hidden scene [liu2019phasor_nlos].
Back-Projection: A naive way to solve Eq. 3 is to consider each voxel (an element of the voxel vector $x$) and compute a heat map of how likely an object is to occupy that voxel, given the measurement $y$ and the light transport model $A$. This back-projection method was used in the first demonstration of the 3D reconstruction of hidden objects [Velten12] and in other NLOS reconstruction works [Gupta12, Buttafava15, Laurenzis14, Manna2018ErrorBA, Laurenzis:15, Jin:18]. Back-projection usually produces a blurry reconstruction, as shown in Fig. 6 (b), because of the ill-posed nature of inverting $A$. Sharpening filters and thresholding are used to improve the reconstruction quality. Instead of considering each voxel, a probability map of $x$ can be recovered by considering the intersections of the ellipsoidal constraints, which makes back-projection up to three orders of magnitude faster than the naive approach [Arellano17]. Back-projection can also be implemented optically, by illuminating and scanning along ellipsoids on the relay wall, which focuses the measurement on a single voxel in the hidden scene [Adithya2019:SNLOS].
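A toy version of naive back-projection can be sketched in Python as follows. It is a simplified illustration rather than the implementation of any cited work: the scene contains a single point target, intensity fall-off and noise are ignored, the source-to-wall and wall-to-camera legs are assumed to be already subtracted, and the geometry, bin width, and function names are hypothetical.

```python
import numpy as np


def path_length(voxels, w1, w2):
    """Wall -> voxel -> wall path length for every voxel."""
    return (np.linalg.norm(voxels - w1, axis=-1)
            + np.linalg.norm(voxels - w2, axis=-1))


def back_project(transients, voxels, w1, detect_pts, bin_width):
    """Each voxel accumulates the histogram values on its own ellipsoid."""
    heat = np.zeros(len(voxels))
    n_bins = transients.shape[1]
    for hist, w2 in zip(transients, detect_pts):
        bins = np.rint(path_length(voxels, w1, w2) / bin_width).astype(int)
        valid = bins < n_bins
        heat[valid] += hist[bins[valid]]
    return heat


# Hidden volume: 9 x 9 x 9 voxels in front of a relay wall at z = 0
axis_xy = np.linspace(-0.4, 0.4, 9)
axis_z = np.linspace(0.3, 1.1, 9)
voxels = np.stack(np.meshgrid(axis_xy, axis_xy, axis_z, indexing="ij"),
                  axis=-1).reshape(-1, 3)

wall_axis = np.linspace(-0.5, 0.5, 5)
detect_pts = np.array([[x, y, 0.0] for x in wall_axis for y in wall_axis])
w1 = np.zeros(3)          # single illumination point on the wall
bin_width = 0.005         # path-length bin (m)
n_bins = 800              # covers path lengths up to 4 m

# Simulate one photon-count peak per detection point for a point target
target_index = int(np.ravel_multi_index((6, 3, 4), (9, 9, 9)))
transients = np.zeros((len(detect_pts), n_bins))
for j, w2 in enumerate(detect_pts):
    b = int(np.rint(path_length(voxels, w1, w2)[target_index] / bin_width))
    transients[j, b] = 1.0

heat = back_project(transients, voxels, w1, detect_pts, bin_width)
estimate_index = int(np.argmax(heat))
print(estimate_index == target_index)
```

The heat map peaks at the target voxel because that is where the ellipsoids of all detection points intersect. Neighbouring voxels still collect votes from some ellipsoids, which is the source of the blur that sharpening filters and thresholding try to remove.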
Optimization: Priors on the hidden object can be incorporated into the inversion of Eq. 3 by formulating the inverse problem as the minimization of a least-squares error with a regularizer [Gupta12, Heide14DiffuseMirror, Heide:2019:OcclusionNLOS, Kadambi16, Pediredla17]:
$$\hat{x} = \arg\min_{x} \|y - Ax\|_2^2 + \Gamma(x), \tag{4}$$
where $\Gamma(x)$ encourages $x$ to follow priors on the hidden scene. Iterative algorithms such as CoSaMP [Needell10], FISTA [Beck09:FISTA], and ADMM [Boyd11] can be used to solve this optimization problem, depending on the priors. Enforcing priors generally results in better reconstruction quality than back-projection.
While the computation and memory complexity can be enormous, the optimization formulation gives flexibility for more accurate reconstruction. For example, $A$ can be factorized to account for partial occlusion and the surface normals of the hidden object, so that visibility and surface normals are recovered along with the albedo of the hidden object [Heide:2019:OcclusionNLOS].
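As a small illustration of regularized inversion, the Python sketch below solves a least-squares problem with an l1 sparsity prior using plain ISTA, the non-accelerated proximal-gradient method that FISTA builds on. It is a sketch under stated assumptions: the random matrix stands in for a real light transport matrix, the data term is scaled by 1/2 (a common convention), and the problem sizes, regularization weight, and iteration count are hypothetical.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def ista(A, y, lam, n_iter=2000):
    """Minimise 0.5 * ||y - A x||_2^2 + lam * ||x||_1 by proximal gradient."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x


# Stand-in problem: random A, sparse hidden scene, noise-free measurement
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100)) / np.sqrt(60)
support = [5, 37, 80]
x_true = np.zeros(100)
x_true[support] = [1.0, -1.5, 2.0]
y = A @ x_true

x_hat = ista(A, y, lam=0.01)
recovered_support = sorted(int(i) for i in np.argsort(np.abs(x_hat))[-3:])
print(recovered_support)
```

The sparsity prior lets the solver recover 100 unknowns from only 60 measurements, which would be impossible with the least-squares term alone. Other priors, such as small total variation, change only the proximal step.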
Confocal Imaging: In confocal imaging, the relay wall is raster-scanned while the detector collects photons from the same point as the illuminated spot [OToole:2018:ConfocalNLOS]. This makes the first- and third-bounce wall points coincide ($w_1 = w_2$), so the ellipsoidal constraints become spherical constraints and the forward operation becomes a 3D convolution (Fig. 7). The confocal setup removes the need to solve back-projection or optimization problems; instead, a simple deconvolution solves the reconstruction problem. 3D deconvolution makes confocal imaging both memory- and computationally efficient. Confocal imaging makes the inverse problem simple but suffers from the first-bounce photons, as the detection and illumination points are the same. This issue can be mitigated by introducing a slight misalignment between the detector and the illumination, or by time gating the SPAD. However, the SNR is still limited because the single-bounce light is much stronger than the three-bounce light.
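The benefit of a convolutional forward model is that inversion reduces to a per-frequency division. The Python sketch below shows a generic 3D Wiener-filter deconvolution on synthetic data. It is not the confocal reconstruction pipeline itself: the resampling that turns the confocal measurement into a convolution is omitted, and the Gaussian kernel, volume size, and `snr` parameter are hypothetical stand-ins.

```python
import numpy as np


def wiener_deconvolve(measurement, kernel, snr=1e3):
    """Frequency-domain Wiener filter for a circular 3D convolution."""
    H = np.fft.fftn(kernel)
    Y = np.fft.fftn(measurement)
    X = np.conj(H) * Y / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifftn(X))


shape = (16, 16, 16)

# Hypothetical smooth blur kernel centred at the origin (with wrap-around)
coords = np.stack(np.meshgrid(*[np.fft.fftfreq(n) * n for n in shape],
                              indexing="ij"))
kernel = np.exp(-np.sum(coords ** 2, axis=0) / (2 * 1.5 ** 2))
kernel /= kernel.sum()

# Synthetic hidden scene with a single point scatterer
scene = np.zeros(shape)
scene[4, 9, 11] = 1.0

# Forward model: circular 3D convolution, evaluated with FFTs
measurement = np.real(np.fft.ifftn(np.fft.fftn(scene) * np.fft.fftn(kernel)))

estimate = wiener_deconvolve(measurement, kernel)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(estimate), shape))
print(peak)
```

The whole inversion costs a few FFTs of the measurement volume, so no explicit light transport matrix is ever formed. This is the source of the memory and computation savings described above.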
The above three methods formulate NLOS imaging as a linear inverse problem. Other problem formulations can be constructed to solve NLOS imaging from different perspectives.
Wave-Based Reconstruction: Reza et al. [reza2018physical] showed that the intensity waves from modulated light in the NLOS setting can be modeled as the propagation of a wave (phasor field). The hidden scene’s impulse response can be recorded with a pulsed laser and an ultrafast detector such as a SPAD. The modulated light pulse can then be virtually synthesized over time using the recorded impulse response. Constructive interference of the synthesized wave appears at the hidden object. Hence, virtual propagation of a pulsed modulation on the impulse response of the hidden scene results in a reconstruction [liu2019phasor_nlos]. The virtual wave optics approach to NLOS imaging models the full light transport in the hidden scene, including photons that bounce more than three times. Lindell et al. [Lindell:2019:Wave] modeled the light transport of the confocal imaging system as wave propagation, where the measurement is one boundary condition. Acquiring the other boundary conditions of the wave propagation with the frequency-wavenumber (f-k) migration algorithm results in an efficient and robust reconstruction of hidden scenes containing objects with various surface reflectance.
Inverse Rendering: A renderer can be used to model the physics-based forward model instead of the analytical forward operation written in Eq. 3. This “analysis-by-synthesis” approach, also known as inverse rendering, changes the scene parameters until the rendered and experimental measurements match. Inverse rendering provides more flexible reconstructions than the voxel-based reconstruction discussed above. For example, a more detailed reconstruction is possible by representing the hidden object as a mesh, and non-Lambertian surface reflection can be incorporated [Iseringhausen2018NonLineofSightRU]. However, accurate rendering can be time-consuming, especially because each iteration of the optimization requires computationally expensive rendering. The long run time can be mitigated by a differentiable renderer that efficiently computes gradients with respect to the hidden scene parameters [tsai2019beyond].
Shape Recovery: The methods discussed above use the full ToF measurement for reconstruction. However, the surface can be recovered without using all the multi-bounce photons. Tsai et al. showed that the first-returning photons provide the length of the shortest path to the hidden object, which can be used to reconstruct the boundary and surface normals of the hidden object [Tsai17]. Later, discontinuities in the ToF measurement were shown to correspond to specific paths (Fermat paths) that give rich information about the boundary of the hidden geometry [Xin:19]. Because this approach does not rely on intensity information, it is robust to BRDF variations of the object around the corner.
2.2 Inference Algorithms for Localization, Tracking, and Classification
We have reviewed algorithms to reconstruct objects around corners. Often, reconstruction of the hidden object is not the end goal. Instead, inferring properties of the hidden object, such as its location or class, is sufficient. While reconstruction suffices for such tasks, direct inference without reconstruction can be performed more efficiently and with fewer measurements.
Back-Projection: The back-projection algorithm discussed in the reconstruction section can be used for point localization. Instead of considering small voxels to recover the details of the object, the object can be treated as a single voxel to recover its location [Gariepy:16, Chan17FastTracking]. Because the goal is not to recover the 3D shape, localization can be performed with far fewer measurements and less computation than reconstruction, and at a larger scale [Chan17FastTracking]. While reconstruction of a cluttered scene is challenging, tracking and size estimation of a moving object was demonstrated by Pandharkar et al. [Pandharkar11].
Data-driven algorithms have become a powerful tool for pattern recognition and have found promising applications in computer vision. Though data-driven approaches for full reconstruction require further theoretical development [Gregor2010LearningFA, Chen18LISTA], deep learning is useful for the inference of unknown parameters such as object class and location. The objective of the deep learning approach is to learn the inverse of the mapping from the hidden scene to the measurement ($f$ in Eq. 1), where $f$ could be hard to model explicitly. For example, neural networks with a SPAD array demonstrated point localization and identification of a person around a corner [Caramazza2018NeuralNI]. While the imaging setup is different from a corner, Satat et al. [Satat:17] demonstrated calibration-free NLOS classification of objects behind a diffuser, where the transmissive scattering at the diffuser is similar to the reflective scattering at the relay wall.
3 Coherence-based NLOS Imaging
The challenges of NLOS imaging come from the scattering of photons on the diffuse relay wall. However, some coherent properties of light are preserved after scattering. Coherence-based methods exploit speckle patterns or spatial coherence to see around corners.
3.1 Speckle-Based NLOS Imaging
The speckle pattern is an intensity fluctuation generated by the interference of the coherent light waves. Although a speckle pattern may seem random, the observed pattern encodes information about the hidden scene.
Reconstruction: The angular correlation of the object intensity pattern is preserved in the observed speckle pattern after scattering on the relay wall [Isaac90]. This is known as the memory effect [Feng88, Freund88]:
$$I(\theta) * I(-\theta) \approx O(\theta) * O(-\theta), \tag{5}$$
where $I$ and $O$ are the observed speckle pattern and the object intensity pattern, respectively, $\theta$ is the angular coordinate, and $*$ denotes convolution, so that each side is an autocorrelation. Katz et al. [Katz14] found that the memory effect can be applied to a spatially incoherent light source such as a fluorescent bulb, and demonstrated single-shot imaging through scattering media and around corners. Reconstruction of the hidden object can be performed with phase-retrieval algorithms [Shechtman2014PhaseRW, Jaganathan2015PhaseRA]. A speckle pattern can also be generated with active coherent illumination when it cannot be observed in passive sensing [Viswanath:18]. While this approach achieves diffraction-limited resolution, its field of view is limited by the memory effect (on the order of a few degrees of angular field of view [Katz14]). Because scattering at the diffuse wall is similar in nature to scattering in a diffuser, speckle-based reconstruction can be used to see through scattering media as well [Bertolotti2012NoninvasiveIT].
Inference for Tracking: When coherent light illuminates objects around a corner, the light scattered from the objects creates a speckle pattern on the relay wall. When an object moves, the speckle pattern moves as well. The motion of the object can be tracked by computing the cross-correlation of two images taken at different times [Smith:2017]. Smith et al. [Smith_2018_CVPR] showed that this principle applies to NLOS imaging and demonstrated motion tracking of multiple hidden reflectors using active coherent illumination. Speckle-based tracking has a precision finer than 10 μm, but is currently limited to microscopic motions because of the small field of view of the memory effect [Smith_2018_CVPR]. A data-driven approach demonstrated classification of MNIST digits and human poses from speckle patterns [Lei_2019_CVPR].
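The cross-correlation step used for tracking can be sketched in Python as follows. This is an idealized illustration: a random image stands in for a real speckle pattern, the motion is modeled as a pure circular translation by whole pixels, and the function name and frame size are hypothetical.

```python
import numpy as np


def estimate_shift(frame_a, frame_b):
    """Integer (row, col) shift of frame_b relative to frame_a,
    found at the peak of their circular cross-correlation."""
    Fa = np.fft.fft2(frame_a - frame_a.mean())
    Fb = np.fft.fft2(frame_b - frame_b.mean())
    xcorr = np.real(np.fft.ifft2(np.conj(Fa) * Fb))
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Indices past the midpoint correspond to negative shifts
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, xcorr.shape))


rng = np.random.default_rng(1)
frame_a = rng.random((64, 64))                           # stand-in speckle image
frame_b = np.roll(frame_a, shift=(3, -5), axis=(0, 1))   # pattern after motion
shift = estimate_shift(frame_a, frame_b)
print(shift)
```

With real data the pattern decorrelates once the motion exceeds the memory-effect range, so the correlation peak fades. This is consistent with the restriction to small motions noted above.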
3.2 Spatial-Coherence-Based NLOS Imaging
Spatial coherence refers to the correlation of the phase of a light wave at different observation points (while temporal coherence refers to the correlation at different observation times). Spatial coherence can be used to reconstruct the scene in the camera’s field of view [Beckus:18]. Because the spatial coherence of light is preserved through the scattering at the diffuse wall, such reconstruction techniques can be applied to NLOS imaging [Batarseh18, Beckus2018MultimodalNP]. Measurement of spatial coherence requires a specialized imaging system such as the Dual-Phase Sagnac Interferometer (DuPSaI) [RezvaniNaraghi:17].
The propagation of coherence in free space can be approximated in closed form. This lets the spatial coherence measurement be expressed as a system of linear equations of the form written in Eq. 3, where the matrix $A$ now represents the propagation of the spatial coherence function through space and scattering. This inverse problem can be solved by minimizing a least-squares error with regularizers that incorporate priors such as sparsity and small total variation. Beckus et al. [Beckus2018MultimodalNP] showed that coherence and intensity measurements can be incorporated into a single optimization problem for reconstruction.
Inference for Localization: Using spatial coherence, the distance of the incoherent light source from the detector can be estimated from the phase of the spatial coherence function. While only 1D localization was demonstrated [Batarseh18], multiple measurements of the spatial coherence functions can be used to localize the light source by triangulation.
4 Intensity-Based NLOS Imaging
Velten et al. [Velten12] first studied the use of intensity measurement for NLOS imaging and showed that a traditional camera requires high sensitivity for diffuse surfaces. We illustrate this in Fig. 4, and refer readers to the supplemental material of [Velten12] for further details. To overcome this challenge, most of the existing intensity-based NLOS imaging works exploit occlusions in the scene. Occlusions had long been used in computational imaging before the development of NLOS imaging. For example, light transport from occlusions has been used to synthesize images from a different point of view [Sen:2005:DP], to capture incident light using spatially varying BRDF [Alldrin:2006:PLP], to recover a 4D light field for refocusing [Veeraraghavan:2007], and to create an anti-pinhole camera that sees outside the field of view of the camera [Torralba12:accidentalpinhole]. Tancik et al. [Tancik2018cosi, Tancik2018FlashPF] and Bouman et al. [Bouman17] first addressed the NLOS imaging problem directly. Bouman et al. showed 1D tracking, and Tancik et al. introduced the idea of using intensity-based reconstruction for NLOS imaging. We also review another class of techniques that exploits non-Lambertian surface reflection.
4.1 Exploiting Occlusions
Spatially varying occlusions make it feasible to use intensity measurements, since the variability of intensity due to occlusions increases the rank of the underlying ray transport matrix. Passive intensity-based NLOS imaging with inexpensive cameras has enabled real-time demonstrations of seeing around corners. However, the ambient light must be separated from the signal light coming from the object, which often requires moving hidden objects or background subtraction.
|Ambient Light||Need of Priors|
|SPAD||High||Robust||Not Required|
|Camera||Medium||Sensitive to|
|Strong Light||Not Required|
|Camera||Low||Sensitive||Scene Geometry|
|Coherence||None||Dual Phase Sagnac|
|Interferometer||Medium||Sensitive||Scene Geometry|
|Camera||Low||Sensitive||Scene Geometry|
|Camera||Low||Sensitive||Scene Geometry|
|Camera||Low||Sensitive||Scene Geometry|
Localization: Bouman et al. [Bouman17] demonstrated practical passive tracking of hidden objects using the occlusion from a wall, as illustrated in Fig. 9 (a), and Tancik et al. showed that neural networks learn to exploit occlusions [Tancik2018FlashPF], as shown in Fig. 10 (b). The occlusion in the scene can be considered a specific type of aperture. As shown in Fig. 9 (c), the problem can be formulated as a linear inverse problem similar to Eq. 3, where the matrix $A$ represents the light transport with occlusions. With priors on the floor albedo, it is possible to estimate the 1D angular location of the hidden object from a single measurement by estimating the ambient illumination [Seidel2019].
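A deliberately simplified Python sketch shows how a wall edge acts as an aperture in such a linear model. It assumes that a floor pixel at angular position i around the corner receives light from hidden angular bins 0 through i, so the light transport matrix is lower triangular, and that the ambient contribution is known from a background frame. The bin count, object position, and intensities are hypothetical, and the model omits the albedo and fall-off terms a real system would need.

```python
import numpy as np

n_angles = 12

# Assumed occlusion model: floor pixel i sees hidden angular bins 0..i
A = np.tril(np.ones((n_angles, n_angles)))

x_true = np.zeros(n_angles)
x_true[7] = 2.0                        # hypothetical hidden object in bin 7
background = np.full(n_angles, 5.0)    # ambient light, assumed known
y = A @ x_true + background            # observed floor intensities

# Background subtraction followed by inversion of the linear model
x_hat = np.linalg.solve(A, y - background)
estimated_bin = int(np.argmax(x_hat))
print(estimated_bin)
```

For this triangular model the inversion amounts to differencing neighbouring floor pixels, which hints at why the faint intensity gradient around the corner carries the 1D angular information and why ambient light has to be separated first.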
Reconstruction: More complex occlusion geometries enable reconstruction of the hidden scene, as the inverse problem becomes well-posed [Thrampoulidis2018ExploitingOI, Saunders2019Periscopy]. A 4D light field can be recovered from an occlusion-based inverse problem framework [Baradad2018InferringLF]. While the above reconstruction techniques assume known occlusions, unknown occlusions can be estimated by exploiting the sparse motion of the hidden object [Yedidia_2019_CVPR]. Deep learning-based approaches have shown their ability to perform reconstruction, tracking, and object classification for a specific scene setup, while their generalizability is yet to be explored [Tancik2018FlashPF, Tancik2018cosi, Chandran2019].
4.2 Exploiting Surface Reflectance
Another class of intensity-based NLOS imaging techniques exploits the bidirectional reflectance distribution function (BRDF) of the relay wall to reconstruct the hidden scene. A specular BRDF makes the inverse problem less ill-posed. When the reflectance function of the wall is known, the light field from the hidden scene can be reconstructed [Sasaki18]. However, current demonstrations are limited to scenes without ambient light and to walls with some specular surface reflection [Sasaki18]. Chen et al. [chen_2019_nlos] demonstrated the reconstruction of specular planar objects with active illumination. Their data-driven approach showed that it is also possible to reconstruct diffuse objects with active illumination [chen_2019_nlos]. Tracking of an object using active illumination and intensity measurements can be performed by matching simulations with the experimental measurements (inverse rendering) [Klein-2016-Tracking]. Thermal imaging benefits from the specular BRDF of long-wave IR, which was exploited for passive localization and near real-time pose detection around corners [ICCP19_Maeda].
5 Challenges and Future Directions
We reviewed the major techniques for NLOS imaging in the previous sections. Here, we discuss common challenges that stand in the way of real-time, robust applications, in the hope of inspiring further research.
5.1 Properties of Different Sensing Modalities
Table 1 shows the current landscape of NLOS techniques, and Table 2 summarizes the properties of different sensing modalities. In this section, we discuss what challenges need to be solved to complete Table 1.
ToF: Since ToF-based NLOS imaging was first demonstrated, reconstruction algorithms have improved significantly and are now orders of magnitude faster than the originally proposed algorithm. The challenges towards practical ToF-based NLOS imaging are:
Low signal-to-background of three-bounce photons.
Estimation of line-of-sight scene parameters.
First, only a small fraction of the emitted photons is captured in the measurement, which lengthens the acquisition time needed for NLOS imaging. Many recent works still require minutes to an hour of acquisition time for diffuse objects, which precludes real-time applications. Second, the majority of existing works treat the line-of-sight scene as known geometry, given that the line-of-sight scene is much easier to recover than the hidden scene. However, a fully automated procedure to recover both line-of-sight and non-line-of-sight geometry is necessary for practical applications of NLOS imaging, where the imaging platform could be moving.
Coherence: Coherence-based approaches exploit the observation that certain coherence properties are preserved after scattering at the diffuse wall. The challenges towards practical speckle-based methods are:
Small field-of-view due to memory effect.
Lack of depth information.
Correlography techniques exploit the memory effect, which has a limited angular field of view [Katz14]. This limits their applicability to macroscopic scenes such as those encountered by autonomous vehicles. Speckle-based reconstruction methods recover a 2D projection of the hidden scene but do not recover depth.
Spatial coherence-based methods do not suffer from the small field of view of speckle-based methods, but they require sensitive interferometric detectors that are mechanically translated [Batarseh18].
Intensity: The ease of image acquisition is the main advantage of intensity-based approaches. Detection of hidden moving objects around corners using shadows was demonstrated on an autonomous wheelchair [Naser2018]. The challenges towards practical intensity-based NLOS imaging are:
Unknown occluding geometries in the hidden scene.
Separation of the background and signal photons.
Low signal-to-noise ratio.
First, the quality of passive, occlusion-based NLOS imaging depends heavily on the occluding geometries. For example, if there are no occluding geometries in the hidden scene, the reconstruction is extremely challenging [Thrampoulidis2018ExploitingOI, Saunders2019Periscopy]. However, such information about occlusions might not be available a priori. Second, ambient illumination may be much stronger than the signal light, which makes it hard to isolate the signal from the measurement. Extracting the signal therefore requires active illumination, background subtraction, or motion in the hidden scene. Lastly, the algorithm has to be sensitive enough to capture the small signal from the hidden scene while remaining robust enough to reject intensity variations due to noise.
5.2 Limitations on Reconstruction
ToF-based NLOS imaging has a derived resolution limit. Reza et al. [reza2018physical] showed that the virtual optics formulation of NLOS imaging gives the following Rayleigh lateral resolution limit:
$$\Delta x = \frac{1.22\, c\, \tau\, L}{d},$$
where $c$, $\tau$, $L$, and $d$ are the speed of light, the full width at half maximum of the temporal response of the imaging system, the distance between the wall and the hidden object, and the diameter of the virtual aperture (scanning area), respectively. However, it is hard to evaluate resolution for some NLOS imaging techniques. The reconstruction capability of occlusion-based techniques relies heavily on scene geometry. It is essential to evaluate different methods with common datasets such as those proposed in [Klein2018AQP].
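As a rough numerical illustration, the sketch below evaluates a Rayleigh-type limit of the form Δx = 1.22 c τ L / d, with the four quantities defined as in the text. This form and the example values (a 70 ps temporal response, a hidden object 1 m from the wall, and a 2 m scanning aperture) are assumptions chosen for illustration and are not taken from [reza2018physical].

```python
C = 299_792_458.0  # speed of light in m/s

def lateral_resolution_m(fwhm_s, distance_m, aperture_m):
    """Rayleigh-type lateral resolution, assuming the form 1.22 * c * tau * L / d."""
    return 1.22 * C * fwhm_s * distance_m / aperture_m

# Illustrative values only: 70 ps response, 1 m stand-off, 2 m virtual aperture.
dx = lateral_resolution_m(fwhm_s=70e-12, distance_m=1.0, aperture_m=2.0)
print(f"lateral resolution: {dx * 100:.2f} cm")
```

Under these assumptions the limit is on the order of a centimeter. It degrades linearly with the temporal response and the stand-off distance, and improves linearly with the size of the scanning area.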
Liu et al. [Liu_2019_CVPR] showed that, depending on the direction of its surface normal, a hidden surface can become impossible to reconstruct. It has also been shown that the reconstruction of multiple objects may fail with a simple linear light transport model when the signal from one object is stronger than the signals from the others [Heide:2019:OcclusionNLOS]. Further theoretical investigation of such limitations is necessary for the practical use of NLOS imaging.
5.3 Acquisition Time
A low signal-to-noise ratio (SNR) or signal-to-background ratio is a common issue for most NLOS imaging techniques, as discussed in the previous section. This is a fundamental problem, as the number of photons from the hidden object is typically small. It sets a lower bound on the acquisition time necessary for satisfactory reconstruction and inference. Fig. 11 summarizes reported demonstrations of NLOS imaging techniques across different scales of acquisition time, illumination power, and scene size.
Active Sensing: Most of the emitted photons are not captured by the camera. Moreover, the three-bounce photons that carry information about the hidden scene constitute only a small fraction of the captured photons. Current demonstrations of real-time NLOS imaging at eye-safe power levels are limited to retroreflective objects. Recent work on ToF-based NLOS imaging uses a laser of up to 1 W, yet still requires 10 minutes of acquisition time [Lindell:2019:Wave]. More than 10 W would be necessary to perform a reconstruction of the same quality in under a minute, but increasing the laser power does not scale because of safety and cost. One could use light at a different wavelength, such as near-infrared radiation, for which the eye-safe power limit is higher. Recently, a new scanning method was proposed that focuses on a single voxel of the hidden scene, which can improve the SNR for a specific region of interest [Adithya2019:SNLOS].
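The trade-off between laser power and acquisition time quoted above follows from a simple photon budget. The sketch below assumes that reconstruction quality depends only on the total number of detected signal photons and that the detected photon rate is proportional to the laser power; both are simplifying assumptions, and the function name is hypothetical.

```python
def required_power_w(ref_power_w, ref_time_s, target_time_s):
    """Laser power needed to collect the same photon count within target_time_s.

    Assumes detected signal photons scale as power * time (illustrative model).
    """
    return ref_power_w * ref_time_s / target_time_s

# Reference point from the text: 1 W for 10 minutes; target: 1 minute.
power_w = required_power_w(ref_power_w=1.0, ref_time_s=10 * 60, target_time_s=60)
print(f"power needed for a 1-minute capture: {power_w:.1f} W")
```

Under this model, shortening the capture below one minute pushes the required power above 10 W, which is the regime where eye safety and cost become limiting.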
Passive Sensing: In many practical scenarios, most of the passively captured photons come from ambient light that does not interact with the hidden objects. This results in a low SNR and makes it necessary to separate ambient photons from photons that come from the hidden objects. The signal photons can be isolated by exploiting movements in the hidden scene [Bouman17], or by background subtraction when a measurement without the hidden object is available. Priors on the ambient illumination can also be exploited to remove the ambient term from the measurement [Seidel2019]. Because uncontrolled ambient light is hard to model, adding active illumination to a passive technique may improve the signal-to-noise ratio [Tancik2018FlashPF, Chandran2019].
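The idea of isolating signal photons through motion can be sketched as follows. The example assumes a static ambient term that is two orders of magnitude stronger than a weak moving signal, and removes it by subtracting the per-pixel temporal mean. The synthetic scene, the noise level, and the function name are illustrative assumptions rather than the procedure of any cited work.

```python
import numpy as np

def remove_static_background(frames):
    """Subtract the per-pixel temporal mean so that only time-varying light remains."""
    return frames - frames.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
n_frames, n_pixels = 50, 64
ambient = 1.0 + 0.5 * np.sin(np.linspace(0.0, 3.0, n_pixels))   # strong and static
t = np.arange(n_frames)[:, None]
p = np.arange(n_pixels)[None, :]
signal = 1e-2 * np.exp(-0.5 * ((p - (10 + t)) / 3.0) ** 2)      # weak moving blob
noise = 1e-4 * rng.standard_normal((n_frames, n_pixels))
frames = ambient[None, :] + signal + noise

residual = remove_static_background(frames)
# The temporal mean also removes the static part of the signal itself.
expected = signal - signal.mean(axis=0, keepdims=True)
max_error = np.abs(residual - expected).max()
print("largest deviation from the zero-mean signal:", max_error)
```

The residual matches the time-varying part of the signal up to noise. A stationary hidden object would be removed together with the ambient term, which is why a reference measurement or a prior on the ambient illumination is needed in that case.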
5.4 Integration of Multi-Modal Techniques
The measurements from different sensing modalities can be jointly exploited for NLOS imaging. Beckus et al. [Beckus2018MultimodalNP] showed that fusing intensity and coherence measurements in a single optimization framework produces better reconstruction results.
Insights from different approaches can be shared. For example, occlusions are not exploited in the ToF-based approaches. Occlusion-based techniques might be able to narrow down the region of interest in the hidden scene to make the ToF-based reconstruction faster.
5.5 Data-driven Approach
We use the term “data-driven approach” to refer to the use of data to model the mapping between the measurement and the hidden scene, and to produce priors on the solution to the inverse problem.
Handcrafted priors such as total variation are often used for NLOS reconstruction. More flexible priors or representations of the hidden objects can be learned from relevant data. Recent results in deep learning show the use of a generative or discriminative model to enforce learned priors in linear inverse problems [bora17a, Lunz2018]. Theoretical connections between convolutional neural networks and dictionary learning [aberdam2019mlcsc] suggest the suitability of emerging deep learning methodology for incorporating learned representations as priors to solve inverse problems.
In recent years, deep learning has provided a powerful set of techniques for pattern recognition and has been studied for computational imaging applications such as CT and MRI [Jin17]. If there are correlations present in a dataset, then deep learning can offer powerful and flexible ways to approximate the function that maps novel inputs to the desired outputs. With a large dataset, a deep learning model can learn the mapping between a measurement and the desired inference (e.g., reconstruction and localization). The same network may also be adapted to learn the forward model by swapping the inputs and outputs and making changes to the model architecture. Potentially, a data-driven approach could offer calibration-free, efficient, and flexible solutions to NLOS imaging problems [Tancik2018FlashPF, Tancik2018cosi, chen_2019_nlos]. Fig. 10 shows the use of deep learning for localization, identification, and reconstruction around corners.
Many NLOS imaging methods attempt to find efficient and robust algorithms to solve the optimization problem shown in Eq. 3. In contrast, the data-driven approach attempts to solve the following minimization problem over training pairs $(y_i, x_i)$ to find the function $f_\theta$, which approximates the mapping between the measurement $y$ and the target $x$ (3D shape, location, or class of the hidden object):
$$\hat{\theta} = \arg\min_{\theta} \sum_{i} \mathcal{L}\big(f_\theta(y_i),\, x_i\big) + R(\theta), \tag{7}$$
where $f_\theta$ is parameterized by $\theta$ and the regularizer $R(\theta)$ discourages over-fitting. $\mathcal{L}$ denotes the error between the reconstruction and the ground truth.
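The minimization above can be illustrated with the simplest possible model class. In the sketch below, a linear map stands in for the neural network, the error term is the squared error, and the regularizer is a squared-norm penalty, so the minimizer is available in closed form. The forward matrix `A` used to synthesize training pairs, the noise level, and all names are hypothetical choices made for illustration.

```python
import numpy as np

def fit_linear_map(Y, X, lam=1e-3):
    """Closed-form minimizer of sum_i ||W y_i - x_i||^2 + lam * ||W||_F^2."""
    gram = Y @ Y.T + lam * np.eye(Y.shape[0])
    return np.linalg.solve(gram, Y @ X.T).T

rng = np.random.default_rng(0)
n = 8
# Hypothetical, well-conditioned forward model used only to synthesize data.
A = np.eye(n) + 0.3 * np.tril(np.ones((n, n)), k=-1)

def simulate(num_samples):
    X = rng.standard_normal((n, num_samples))                  # targets, one per column
    Y = A @ X + 0.01 * rng.standard_normal((n, num_samples))   # noisy measurements
    return Y, X

Y_train, X_train = simulate(500)
Y_test, X_test = simulate(100)

W = fit_linear_map(Y_train, X_train)    # learned measurement-to-target mapping
rel_err = np.linalg.norm(W @ Y_test - X_test) / np.linalg.norm(X_test)
print(f"relative error on held-out pairs: {rel_err:.3f}")
```

The learned map never sees `A` explicitly; it is inferred from example pairs alone. The held-out error is only meaningful for measurements drawn from the same distribution as the training data.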
The main problems in the data-driven approach are generalizability and explainability.
Generalizability: The potential advantage of data-driven approaches is that they can learn models that are more robust to scene variation than brittle model-based approaches. For example, a data-driven approach may handle variations in wall reflectance, while a model-based approach is often limited to simple, diffuse reflectance. However, the key concern is that data-driven approaches may fail in unpredictable ways when they encounter scenes that are not represented in the training data.
One approach towards generalization is to produce a sufficiently large dataset that contains rich variations, such that any practical scene is represented in the training dataset. However, this is ultimately limited by the ability to simulate or collect enough experimental data to sufficiently sample the parameter space. Another approach is to design flexible neural networks for specific scene types. This makes it easier for the networks to learn an algorithm that works robustly, provided that the scene type is identified correctly.
Explainability: Another challenge for data-driven approaches is that Eq. 7 produces a mapping that is difficult to interpret. Rather than incorporating a known forward model, the learned mapping is based on the statistics present in the training set and is susceptible to “hallucinating” output in order to produce a “reasonable output” with respect to the training data.
Instead of learning Eq. 7 directly, a combination of physics-based forward models and data-driven approaches could make these methods more explainable [Che2018ioannisinversenetworks]. To solve the optimization problem in Eq. 3, iterative algorithms such as ADMM and ISTA demonstrate improved performance when priors or forward models are learned from a specific distribution [Gregor2010LearningFA, Yan16:ADMM-net, Chang17]. While traditional model-based approaches typically rely on a handcrafted sparsity prior, a data-driven approach could offer a prior that more accurately models the scene distribution [Lunz2018, aberdam2019mlcsc]. Furthermore, these algorithms can be embedded directly within the architecture of the deep learning model [Diamond2017deepprior]. Incorporating data-driven techniques into traditional model-based optimization schemes shows promise for NLOS imaging without sacrificing explainability.
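For reference, the sketch below shows plain ISTA applied to a sparse linear inverse problem, which is assumed here to have the same general form as Eq. 3 with an l1 sparsity prior. Learned variants such as [Gregor2010LearningFA] unroll a fixed number of these iterations and learn the matrices and thresholds from data; the version below keeps them fixed and hand-set. The random forward matrix, the problem sizes, and the value of `lam` are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, thresh):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def objective(A, y, x, lam):
    """0.5 * ||y - A x||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, n_iters=300):
    """Iterative shrinkage-thresholding with a fixed step size."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)             # gradient of the data-fidelity term
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
m, n = 40, 100
A = rng.standard_normal((m, n)) / np.sqrt(m)   # hypothetical forward model
x_true = np.zeros(n)
x_true[[5, 37, 80]] = [1.0, -1.0, 1.0]         # sparse hidden scene
y = A @ x_true
lam = 0.05

x_hat = ista(A, y, lam)
print("objective at zero:    ", objective(A, y, np.zeros(n), lam))
print("objective at estimate:", objective(A, y, x_hat, lam))
```

Each iteration is a linear operation followed by a pointwise nonlinearity, which is the structural similarity to a network layer that unrolled methods exploit while keeping the physics-based forward model explicit.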
6 Conclusions and Future Directions
We reviewed existing NLOS imaging techniques that rely on different principles and discussed the challenges towards practical, real-time NLOS imaging. We hope that this paper will help and inspire further research toward practical NLOS imaging.
The authors thank Adithya Pediredla, Ravi Athale, and Sebastian Bauer for valuable feedback. This work is supported by the DARPA REVEAL program (N0014-18-1-2894) and MIT Lab consortium funding.