1 Introduction
Recent advances in computational imaging techniques have made it possible to image around corners, a capability known as non-line-of-sight (NLOS) imaging. The ability to see around corners would be beneficial in various applications. For example, detecting objects around a corner enables autonomous vehicles to avoid collisions. Detecting and localizing people without entering dangerous environments makes rescue operations safer and more efficient. NLOS imaging can also be used for medical applications such as endoscopy, where the region of interest is hard for the sensors to access directly (Fig. 1).
Fig. 2 shows typical scene setups for NLOS imaging techniques. While there are many techniques using other parts of the electromagnetic (EM) spectrum or acoustic waves to image around corners and through walls [Adib2013Wifi, Adib:2015, Lindell:2019:Acoustic], we focus on works that use light (electromagnetic waves in the visible and infrared spectrum). Fig. 3 shows an overview of the current state of NLOS imaging.
NLOS imaging was first proposed by Raskar and Davis [Raskar5DT] and demonstrated by Kirmani et al. [Kirmani09] for the recovery of hidden planar patches. Velten et al. [Velten12] showed the first full reconstruction of a 3D object hidden around a corner. These works showed that time-of-flight (ToF) measurements of photons returning from the hidden scene after three bounces (Fig. 2 (a)) contain sufficient information to recover the occluded scene. Following these works, different sensing modalities were applied to NLOS imaging. For example, Katz et al. [Katz14] first demonstrated the use of a speckle pattern to reconstruct 2D images, and Tancik et al. [Tancik2018cosi, Tancik2018FlashPF] and Bouman et al. [Bouman17] demonstrated reconstruction and tracking of hidden objects with a traditional RGB camera.
The main challenge in NLOS imaging is that the photons reflected from the hidden object are scattered at the line-of-sight surfaces (e.g., the wall and floor). Computational imaging techniques model this light transport to recover information about the hidden scene. However, their applications in practice are still limited. This paper reviews the recent advances in NLOS imaging and discusses the challenges towards real-world applications. We note that other types of imaging tasks, such as seeing through scattering media, are often also referred to as NLOS imaging, but in this paper, we use the term "NLOS imaging" to refer solely to seeing around corners.
Table 1: Current state of NLOS imaging tasks for each sensing modality.

ToF:
- 3D reconstruction: cm resolution [Velten12, OToole:2018:ConfocalNLOS, Heide:2019:OcclusionNLOS, Gupta12, Buttafava15, Arellano17, Laurenzis14, Manna2018ErrorBA, Laurenzis:15, Jin:18, Pediredla17, Iseringhausen2018NonLineofSightRU, Adithya2019:SNLOS, Tsai17, Xin:19, tsai2019beyond, Heide14DiffuseMirror, Kadambi16, liu2019phasor_nlos, Lindell:2019:Wave]
- 2D reconstruction: included in 3D
- Tracking/localization: cm precision [Pandharkar11, Gariepy:16, Chan17FastTracking, Chan2017NonlineofsightTO, Caramazza2018NeuralNI, BogerLambard18]
- Classification: human identification [Caramazza2018NeuralNI]

Coherence:
- 3D reconstruction: none
- 2D reconstruction: diffraction-limited resolution (speckle) [Katz14, Viswanath:18]; cm resolution (spatial coherence) [Batarseh18, Beckus2018MultimodalNP]
- Tracking/localization: 1D distance recovery [Batarseh18]; 3D tracking [Smith_2018_CVPR]
- Classification: MNIST and human pose classification [Lei_2019_CVPR]

Intensity:
- 3D reconstruction: planar/specular objects [chen_2019_nlos]
- 2D reconstruction: coarse resolution [chen_2019_nlos]; thermal [ICCP19_Maeda]; occlusion dependent [Tancik2018cosi, Tancik2018FlashPF, Sasaki18, Baradad2018InferringLF, Thrampoulidis2018ExploitingOI, Saunders2019Periscopy, Yedidia_2019_CVPR]
- Tracking/localization: occlusion dependent [Bouman17, Tancik2018FlashPF, Seidel2019]; 6D tracking [Klein2016Tracking]; 3D localization [ICCP19_Maeda, Chandran2019]
- Classification: object type classification [Tancik2018FlashPF, Chandran2019]; human pose detection [ICCP19_Maeda]
1.1 What is NLOS Imaging?
Fig. 2 illustrates different strategies to see around corners. Mathematically, NLOS imaging can be formulated as the following forward model:
(1)  y = f(x) + n

where x represents the hidden scene parameters, such as the albedo, motion, or class of the hidden object; y is the measurement; f is the mapping of the hidden scene to the measurement, which depends on the choice of illumination, sensor, and the geometry of the corner; and n denotes the measurement noise.
The goal of NLOS imaging is to design sensing schemes such that f can be inverted, and to build algorithms such that x is robustly and efficiently recovered given y. Recently, data-driven approaches showed that the inversion of Eq. 1 can be learned even when f is not directly available [Tancik2018FlashPF].
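As a concrete toy instance of Eq. 1, the sketch below (Python/NumPy) simulates a measurement from a linear forward map and inverts it by least squares. The scene size, the random transport matrix, and the noise level are illustrative assumptions, not a model from any cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete instance of Eq. 1, y = f(x) + n, with a linear map f.
# Sizes and the random transport matrix are illustrative assumptions.
n_voxels, n_samples = 50, 200
x_true = np.zeros(n_voxels)
x_true[[10, 30]] = 1.0                          # two occupied voxels
A = rng.standard_normal((n_samples, n_voxels))  # stand-in for the mapping f
y = A @ x_true + 0.01 * rng.standard_normal(n_samples)  # add noise n

# Naive inversion by least squares (real NLOS problems are far more
# ill-posed and need the priors and algorithms reviewed in Sections 2-4).
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print("strongest voxels:", sorted(np.argsort(x_hat)[-2:].tolist()))
```

Because this toy problem is overdetermined and well-conditioned, least squares recovers the occupied voxels; the reviewed methods exist precisely because real NLOS operators are not this benign.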
Full Scene Reconstruction: The objective of NLOS imaging is to see around a corner as if there were a direct line of sight into the hidden scene. Reconstruction tasks include the recovery of the 3D, 2D, or 1D shape of the hidden object, or of light fields. In reconstruction, x is a direct representation of the hidden scene, such as a voxelized probability map, a surface, or the light transport in the hidden scene.
Direct Scene Inference: The reconstruction procedure often requires long acquisition times and time-consuming computation. However, full scene reconstruction might not even be necessary in many scenarios; for example, detecting the presence of a moving person is sufficient for rescue operations. We define "inference" as an estimation of such latent parameters (e.g., location, class, or motion) without direct reconstruction of the hidden scene. In this case, the unknown parameter x can be the location, type, or movement of the hidden object. Because inference does not require full reconstruction, inference around corners can potentially be performed with less computation and fewer measurements than full scene reconstruction.

1.2 Overview of Sensing Schemes for NLOS Imaging
We classify the existing techniques into three categories based on their sensing modalities: time-of-flight, coherence, and intensity. Table 1 provides an overview of the current state of each method with regard to NLOS imaging tasks. Sections 2–4 review the NLOS imaging techniques that exploit the different sensing modalities, and Section 5 presents the challenges of NLOS imaging for each category.
Time-of-Flight-Based Approach (Section 2): As the photons travel from the source, the first bounce occurs on an adjoining wall or directly on the floor near the corner. The second bounce is on the hidden object around the corner. The third bounce is again on a wall or floor in the line-of-sight region, directing light back towards the measurement system, as shown in Fig. 2 (a). Time-resolved imaging techniques utilize the photons associated with the third bounce, whose travel time constrains the possible locations of the hidden object. Section 2 reviews the reconstruction and inference algorithms for methods based on time-resolved sensing.
Coherence-Based Approach (Section 3): While the information on the hidden geometry is largely lost through the diffuse reflection of photons at the relay wall, some coherence properties of light are preserved. Coherence-based methods exploit the fact that this preserved coherence carries information about the occluded scene. Section 3 reviews NLOS imaging techniques that utilize speckle and the spatial coherence of light.
Intensity-Based Approach (Section 4): Occlusions from the corner or occlusions within the hidden scene can provide rich information about the hidden scene. Using occlusions, it is possible to recover the hidden scene with a typical intensity (RGB) camera, such as a smartphone camera (Fig. 2 (c)). While many intensity-based approaches rely on occlusions, some works demonstrate NLOS imaging by exploiting the surface reflectance of the relay wall or the object. Section 4 reviews the intensity-based approach.
1.3 Cameras to Image Around Corners
Different hardware has been used to see around corners. We briefly introduce such hardware to describe what sensors and illumination sources are used for NLOS imaging.
Streak Camera + Pulsed Laser: A streak camera acquires temporal information by mapping the arrival time of photons to a 1D spatial axis with sweep electrodes, resulting in a 2D measurement of space and time. Streak cameras offer the best time resolution among ToF cameras, down to sub-picosecond or even 100 femtoseconds (sub-mm to 0.03 mm distance resolution). The disadvantages of streak cameras are the high cost (typically more than $100k), low photon efficiency, high noise level, and the need for line scanning or additional optics [Liang14]. This results in a potentially longer acquisition time to reach a sufficient signal-to-noise ratio (SNR). The first works on NLOS imaging used a Hamamatsu C5680 streak camera with a Kerr-lens mode-locked Ti:sapphire laser (50 fs pulse width).
SPAD + Pulsed Laser: A single-photon avalanche diode (SPAD) can detect the arrival of a single photon with a time jitter of 20–100 ps (6–30 mm distance resolution). Unlike streak cameras, 2D SPAD arrays can capture 3D measurements without the need for line scanning. While the temporal resolution of a SPAD is poorer than that of a streak camera, the photon efficiency and SNR are better. Commonly used single-pixel SPAD detectors are available from Micro Photon Devices, and time-correlated single-photon counting devices include the HydraHarp from PicoQuant; these cost around $5k and $20–35k, respectively.
AMCW ToF Camera + Modulated Light Source: An amplitude-modulated continuous-wave (AMCW) ToF camera compares the phase shift between emitted and received modulated light through three or more correlation measurements [Bttgen05, Buttgen08]. The distance resolution of AMCW ToF cameras is limited by the modulation frequency and the SNR. While AMCW ToF cameras can achieve few-mm resolution [Yang15], they are susceptible to strong ambient light due to the need for a longer exposure time than pulse-based ToF cameras. AMCW ToF cameras have a much lower cost (ranging from $400 to $2000) than SPADs and streak cameras and are used in commercial products, including the Microsoft Kinect and the Photonic Mixer Device (PMD).
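The phase-to-depth decode behind AMCW cameras can be sketched as follows (Python/NumPy). The modulation frequency, the ideal noiseless buckets, and the four-phase sampling scheme are illustrative assumptions; real cameras calibrate amplitude and offset per pixel.

```python
import numpy as np

C = 3e8                       # speed of light [m/s]
F_MOD = 30e6                  # assumed modulation frequency: 30 MHz
TRUE_DEPTH = 1.8              # metres (direct line-of-sight, for clarity)

# Ideal correlation samples ("buckets") at 0/90/180/270 degree offsets
# between the emitted and received modulation.
phi = 4 * np.pi * F_MOD * TRUE_DEPTH / C          # round-trip phase shift
offsets = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
buckets = 0.5 + 0.25 * np.cos(phi + offsets)

# Standard four-phase decode: phase from bucket differences, then depth.
phase = np.arctan2(buckets[3] - buckets[1], buckets[0] - buckets[2])
depth = (phase % (2 * np.pi)) * C / (4 * np.pi * F_MOD)
print(f"estimated depth: {depth:.3f} m")
```

At 30 MHz the unambiguous range is c/(2f) = 5 m; depths beyond that wrap around, which is the phase-ambiguity limitation of AMCW sensing.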
Traditional Camera: In this paper, we use "traditional camera" to refer to sensors such as charge-coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) arrays, which measure irradiance images without ToF information. Traditional cameras can be used for both intensity and speckle measurement modalities. The combination of a diffuser and a traditional camera enables the measurement of the cross-correlation of temporal coherence to provide passive ToF measurement [BogerLambard18]. Traditional cameras are the most ubiquitous and affordable sensors discussed in this paper but do not acquire direct measurements of time-of-flight information.
Interferometer: Interference between multiple light waves provides depth information on the μm scale, which can be used for ToF-based NLOS imaging of microscopic scenes. For example, an optical coherence tomography system with a temporally and spatially incoherent LED achieved 10 μm resolution to demonstrate NLOS reconstruction of an object as small as a coin [Xin:19]. Heterodyne interferometry, which utilizes the interference of light with different wavelengths, demonstrated NLOS imaging at 70 μm precision [Willomitzer:18]. A dual-phase Sagnac interferometer captures complex-valued spatial coherence functions as a change of measured intensity and is used for spatial-coherence-based methods. Such an interferometer is not commercially available, and we refer readers to [RezvaniNaraghi:17] for the design and details of a dual-phase Sagnac interferometer. An interferometer is a sensitive instrument and often requires careful alignment and isolation from mechanical vibrations.
2 Time-of-Flight-Based NLOS Imaging
Among the many works on NLOS imaging, ToF-based techniques are the most popular due to their ability to resolve the path length of the three-bounce photons that carry the information of the hidden scene. The ToF measurement of three-bounce photons can also be used for estimating the bidirectional reflectance distribution function (BRDF) of materials [Naik:2011]. While this review only considers imaging around corners, ToF measurements are also useful for seeing through scattering media [Satat:15, Kumar07, Raviv:14, Satat:16, Satat:18], analyzing light transport [Velten2013FemtophotographyCA, Kadambi:2013, Bhandari:14, Kadambi:14, Wu2014, Gariepy2015SinglephotonSL, Kadambi:16Interferometry, OToole:2017:SPAD], and novel imaging systems [Satat:16Compressive, Heshmat:16, Heshmat:18].
The Benefit of ToF Measurement: As discussed in the supplementary material of Velten et al. [Velten12], the change of intensity due to the displacement of a patch, illustrated in Fig. 4 (a), depends on the size a, displacement d, and depth z of the patch, and falls off steeply with depth for a traditional camera. In contrast, the intensity change of a time-resolved measurement, as illustrated in Fig. 4 (b), falls off much more slowly with depth. Drawing upon an example from [Velten12], the fractional change of the intensity for a = 5 mm, d = 5 mm, z = 20 cm is 0.00003 for the traditional camera and 0.01 for the ToF camera; the former is below the sensitivity of the traditional camera. Furthermore, the SNR is small because the number of three-bounce photons is small. For this reason, existing demonstrations of intensity-based NLOS imaging rely on occlusions or a more specular BRDF of the wall. ToF measurement can resolve the change of the measurement to recover the 3D geometry of the hidden object.
2.1 Reconstruction Algorithms
ToF measurements provide ellipsoidal constraints on the possible object locations in the hidden scene, as illustrated in Fig. 5. Let w1 and w2 denote the points on the wall where a photon undergoes the first and third bounces, and let p denote the point on the object where the photon undergoes the second bounce. Moreover, let l and s denote the locations of the light source and the camera. Then the hidden object must lie on an ellipsoid that satisfies

(2)  ct = ‖l − w1‖ + ‖w1 − p‖ + ‖p − w2‖ + ‖w2 − s‖

where c and t are the speed of light and the time of travel, respectively.
Most of the existing techniques consider discretized voxels to recover hidden geometry. This physics-based forward model maps the hidden scene to the measurement. Many works express the forward model with the ellipsoidal constraint as a linear inverse problem, which can be solved by backprojection or optimization:

(3)  y = Ax + n

where x, y, and n denote the vectorized target voxels, ToF measurements, and noise, respectively. A represents the transient light transport, including the ellipsoidal constraints described by Eq. 2, the intensity falloff due to the surface reflectance, and the distance between the object and the wall. The theory of light transport of multi-bounce photons is studied by Seitz et al. [Seitz:2005] for non-time-resolved imaging, and by Raskar et al. [Raskar5DT] for time-resolved imaging.

In NLOS imaging, Eq. 3 is often ill-posed, so filtering or regularization of the reconstruction is often necessary. Moreover, A becomes large: in general, y is a 5D measurement (a 3D measurement for each point of a 2D scan) and x is a set of 3D voxels. If the measurement has m samples in total and the reconstruction has n voxels, A is an m-by-n matrix whose j-th column, with m elements, maps the j-th voxel to the measurement. ToF-based methods mainly focus on efficient and robust algorithms to recover x.
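The construction of A from the ellipsoidal constraints can be sketched numerically. In the toy example below (Python/NumPy), each column of A marks the time bin of the wall-to-object-to-wall path through one voxel. The geometry, grid, time binning, and the omission of the falloff and BRDF terms are all simplifying assumptions.

```python
import numpy as np

C, DT, N_BINS = 3e8, 10e-12, 512      # speed of light, 10 ps time bins

# Assumed geometry: one laser spot w1 and one detection spot w2 on the
# relay wall (z = 0); hidden voxels on a small planar grid facing the wall.
w1 = np.array([0.0, 0.0, 0.0])
w2 = np.array([0.2, 0.0, 0.0])
voxels = np.array([[x, 0.0, z]
                   for x in np.linspace(-0.3, 0.3, 16)
                   for z in np.linspace(0.3, 0.7, 16)])

# Column j of A encodes the ellipsoidal constraint of Eq. 2 for voxel j
# (intensity falloff and BRDF omitted; the fixed source/sensor legs are
# dropped, since they only shift all arrival times by a constant).
A = np.zeros((N_BINS, len(voxels)))
for j, p in enumerate(voxels):
    path = np.linalg.norm(w1 - p) + np.linalg.norm(p - w2)
    t_bin = int(round(path / (C * DT)))
    if t_bin < N_BINS:
        A[t_bin, j] = 1.0

x = np.zeros(len(voxels))
x[40] = 1.0                            # a single hidden scatterer
y = A @ x                              # noiseless transient measurement (Eq. 3)
print("arrival bin of the scatterer:", int(np.argmax(y)))
```

Even this tiny example shows why A grows quickly: every additional wall spot adds another N_BINS rows, and every voxel another column.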
Recently, forward models beyond the linear model have been studied. For example, the shape of the surface can be recovered using fewer photons by modeling photons that travel along specific paths called Fermat paths [Xin:19]. Phasor-field virtual wave optics enables the modeling of full light transport in the hidden scene, including photons that bounce more than three times [liu2019phasor_nlos].
Backprojection: A naive way to solve Eq. 3 is to consider each voxel (an element of x) and compute a heat map of the likelihood that an object occupies the voxel, given the measurement y and the light transport model A. This backprojection method was exploited in the first demonstration of the 3D reconstruction of hidden objects [Velten12] and in other NLOS reconstruction works [Gupta12, Buttafava15, Laurenzis14, Manna2018ErrorBA, Laurenzis:15, Jin:18]. Backprojection usually produces a blurry reconstruction, as shown in Fig. 6 (b), because of the ill-posed nature of the problem. Sharpening filters and thresholding are used to improve the reconstruction quality. Instead of considering each voxel, a probability map of the hidden scene can be recovered by considering the intersections of ellipsoidal constraints, yielding backprojection up to three orders of magnitude faster than the naive version [Arellano17]. Backprojection can also be implemented optically by illuminating and scanning along ellipsoids on the relay wall, which focuses the measurement on a single voxel in the hidden scene [Adithya2019:SNLOS].
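A minimal backprojection sketch follows (Python/NumPy). A 2D slice of the hidden volume, a scanned illumination spot with a fixed detection spot, and exact time binning are all simplifying assumptions: each measured arrival time votes for every voxel on the corresponding ellipse, and the votes accumulate at the true location.

```python
import numpy as np

C, DT = 3e8, 10e-12                   # speed of light, 10 ps time bins

# Assumed 2D slice of the hidden volume; wall spots lie on the z = 0 line.
grid = np.array([[x, z] for x in np.linspace(-0.3, 0.3, 21)
                        for z in np.linspace(0.2, 0.8, 21)])
det = np.array([0.3, 0.0])            # fixed detection spot on the wall
spots = [np.array([s, 0.0]) for s in np.linspace(-0.25, 0.25, 11)]

def time_bin(p, w):
    """Time bin of the wall(w) -> object(p) -> wall(det) path."""
    d = np.linalg.norm(p - w) + np.linalg.norm(p - det)
    return int(round(d / (C * DT)))

# Simulate measurements for one hidden point, then back-project: each
# ellipse votes for every voxel consistent with its arrival time.
true_idx = 150
measured = [time_bin(grid[true_idx], w) for w in spots]
heat = np.zeros(len(grid))
for m, w in zip(measured, spots):
    for j, p in enumerate(grid):
        if time_bin(p, w) == m:
            heat[j] += 1.0            # intersecting constraints accumulate

print("recovered voxel index:", int(np.argmax(heat)))
```

The blur mentioned above appears here as nonzero votes along every ellipse; only the intersection reaches the full count, which is what the sharpening filters then emphasize.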
Optimization: Priors on the hidden object can be incorporated into the inversion of Eq. 3 by formulating the inverse problem as the minimization of a least-squares error with a regularizer [Gupta12, Heide14DiffuseMirror, Heide:2019:OcclusionNLOS, Kadambi16, Pediredla17]:

(4)  x̂ = argmin_x ‖y − Ax‖² + Γ(x)

where Γ(x) encourages x to follow priors of the hidden scene. Iterative algorithms such as CoSaMP [Needell10], FISTA [Beck09:FISTA], and ADMM [Boyd11] can be used to solve this optimization problem, depending on the priors. Enforcing priors generally results in better reconstruction quality than backprojection.
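A sketch of Eq. 4 with a sparsity prior Γ(x) = λ‖x‖₁ follows (Python/NumPy). For simplicity it uses ISTA, the non-accelerated form of FISTA [Beck09:FISTA]; the problem sizes, the random A, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Underdetermined instance of Eq. 3: fewer measurements than voxels.
n_meas, n_vox = 80, 200
A = rng.standard_normal((n_meas, n_vox)) / np.sqrt(n_meas)
x_true = np.zeros(n_vox)
x_true[[20, 75, 140]] = [1.0, -0.5, 0.8]        # sparse hidden scene
y = A @ x_true + 0.001 * rng.standard_normal(n_meas)

# Eq. 4 with Gamma(x) = lam * ||x||_1, solved by ISTA: a gradient step on
# the data term followed by soft thresholding (the proximal step).
lam = 0.02
step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1/L for the data term
x = np.zeros(n_vox)
for _ in range(500):
    z = x - step * (A.T @ (A @ x - y))          # gradient step
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox step

support = np.flatnonzero(np.abs(x) > 0.1)
print("recovered support:", support.tolist())
```

Unlike backprojection, the sparsity prior drives the unoccupied voxels exactly to zero, which is the practical advantage of the regularized formulation.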
While the computation and memory complexity can be enormous, the optimization formulation gives flexibility for more accurate reconstruction. For example, A can be factorized to account for partial occlusion and the surface normals of the hidden object, recovering the visibility and surface normals as well as the albedo of the hidden object [Heide:2019:OcclusionNLOS].
Confocal Imaging: In confocal imaging, the relay wall is raster-scanned as the detector collects photons from the same point as the illuminated spot [OToole:2018:ConfocalNLOS]. The illumination and detection points then coincide, and the ellipsoidal constraints become spherical constraints, which makes the forward operation a 3D convolution (Fig. 7). The confocal setup removes the need to solve backprojection or optimization problems; instead, a simple deconvolution solves the reconstruction problem. 3D deconvolution makes confocal imaging both memory- and computation-efficient. Confocal imaging makes the inverse problem simple but suffers from the first-bounce photons, as the detection and illumination points are the same. This issue can be mitigated by introducing a slight misalignment between the detector and illumination or by time gating of the SPAD. However, the SNR is still limited because the single-bounce light is much stronger than the three-bounce light.
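A 1D analog of the confocal deconvolution idea is sketched below (Python/NumPy). The Gaussian blur kernel, the two-scatterer scene, and the use of a plain Wiener filter are illustrative assumptions; the actual method also involves a resampling of the time axis and a 3D kernel [OToole:2018:ConfocalNLOS].

```python
import numpy as np

n = 256
x = np.zeros(n)
x[60], x[130] = 1.0, 0.6              # hidden scene: two scatterers

# Assumed shift-invariant blur standing in for the confocal forward model:
# measurement y = h * x (circular convolution via FFT).
h = np.exp(-0.5 * ((np.arange(n) - n // 2) / 3.0) ** 2)
h /= h.sum()
H = np.fft.fft(np.fft.ifftshift(h))   # center the kernel at index 0
y = np.real(np.fft.ifft(np.fft.fft(x) * H))

# Wiener-style deconvolution: one frequency-domain division replaces the
# backprojection/optimization machinery of the general, non-confocal case.
eps = 1e-4                            # damping against noise amplification
x_hat = np.real(np.fft.ifft(np.fft.fft(y) * np.conj(H) / (np.abs(H) ** 2 + eps)))
print("strongest recovered scatterer:", int(np.argmax(x_hat)))
```

The single FFT-domain division is what makes confocal reconstruction so much cheaper than inverting a general transport matrix A.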
The above three methods formulate NLOS imaging as a linear inverse problem. Other problem formulations can be constructed to solve NLOS imaging from different perspectives.
Wave-Based Reconstruction: Reza et al. [reza2018physical] showed that the intensity waves from modulated light in the NLOS setting can be modeled as the propagation of a wave (phasor field). The hidden scene's impulse response can be recorded with a pulsed laser and an ultrafast detector such as a SPAD. A modulated light pulse can then be virtually synthesized over time using the recorded impulse response, and constructive interference of the synthesized wave appears at the hidden object. Hence, virtual propagation of a pulsed modulation over the impulse response of the hidden scene results in reconstruction [liu2019phasor_nlos]. The virtual wave optics approach to NLOS imaging models the full light transport in the hidden scene, including photons that bounce more than three times. Lindell et al. [Lindell:2019:Wave] modeled the light transport of the confocal imaging system as wave propagation, where the measurement is one boundary condition. Recovering the other boundary conditions of the wave propagation with the frequency-wavenumber (f-k) migration algorithm results in efficient and robust reconstruction of hidden scenes containing objects with various surface reflectances.
Inverse Rendering: A renderer can be used as the physics-based forward model instead of the analytical forward operator written in Eq. 3. This analysis-by-synthesis approach, also known as inverse rendering, adjusts the scene parameters so that the rendered and experimental measurements match. Inverse rendering provides more flexible reconstructions than the voxel-based reconstruction discussed above. For example, more detailed reconstruction is possible by representing the hidden object as a mesh, and non-Lambertian surface reflection can be incorporated [Iseringhausen2018NonLineofSightRU]. However, accurate rendering can be time-consuming, especially because each iteration of the optimization requires computationally expensive rendering. The long runtime can be addressed by a differentiable renderer that efficiently computes gradients with respect to the hidden scene parameters [tsai2019beyond].
Shape Recovery: The methods discussed above use the full ToF measurement for reconstruction. However, the surface can be recovered without using all the multi-bounce photons. Tsai et al. showed that the first-returning photons provide the length of the shortest path to the hidden object, which can be used to reconstruct the boundary and surface normals of the hidden object [Tsai17]. Later, discontinuities in the ToF measurement were shown to follow specific paths (Fermat paths) that give rich information about the boundary of the hidden geometry [Xin:19]. Because this approach does not rely on intensity information, it is robust to BRDF variations of the object around the corner.
2.2 Inference Algorithms for Localization, Tracking, and Classification
We have reviewed algorithms to reconstruct objects around corners. Often, however, reconstruction of the hidden object is not the end goal; inference of the properties of the hidden object, such as its location or class, is sufficient. While reconstruction suffices for such tasks, direct inference without reconstruction can be performed more efficiently with fewer measurements.
Backprojection: The backprojection algorithm discussed in the reconstruction section can be used for point localization. Instead of considering small voxels to recover the details of the object, the object can be treated as a single voxel to recover its location [Gariepy:16, Chan17FastTracking]. Because the goal is not to recover the 3D shape, localization can be performed with far fewer measurements and less computation than reconstruction, and at a larger scale [Chan17FastTracking]. While reconstruction of a cluttered scene is challenging, tracking and size estimation of a moving object were demonstrated by Pandharkar et al. [Pandharkar11].
Deep Learning: Data-driven algorithms have become a powerful tool for pattern recognition and have found promising applications in computer vision. Though data-driven approaches for full reconstruction require further theoretical development [Gregor2010LearningFA, Chen18LISTA], deep learning is useful for the inference of unknown parameters such as object class and location. The objective of the deep-learning approach is to learn the inverse mapping from the measurement y to the parameters x, where the forward mapping f could be hard to model explicitly. For example, neural networks with a SPAD array demonstrated point localization and identification of a person around the corner [Caramazza2018NeuralNI]. While the imaging setup is different from a corner, Satat et al. [Satat:17] demonstrated calibration-free NLOS classification of objects behind a diffuser, where the transmissive scattering at the diffuser is similar to the reflective scattering at the relay wall.

3 Coherence-Based NLOS Imaging
The challenges of NLOS imaging come from the scattering of photons at the diffuse relay wall. However, some coherent properties of light are preserved after scattering. Coherence-based methods exploit speckle patterns or spatial coherence to see around corners.
3.1 Speckle-Based NLOS Imaging
A speckle pattern is an intensity fluctuation generated by the interference of coherent light waves. Although a speckle pattern may seem random, the observed pattern encodes information about the hidden scene.
Reconstruction: The angular correlation of the object intensity pattern is preserved in the observed speckle pattern after the scattering on the relay wall [Isaac90]. This is known as the memory effect [Feng88, Freund88]:

(5)  I = O ∗ S

where I and O are the observed speckle pattern and the object intensity pattern, respectively, S is the point spread function of the scattering, and ∗ denotes convolution. Katz et al. [Katz14] found that the memory effect can be applied to a spatially incoherent light source such as a fluorescent bulb, and demonstrated single-shot imaging through scattering media and around corners. Reconstruction of the hidden object can be performed with phase-retrieval algorithms [Shechtman2014PhaseRW, Jaganathan2015PhaseRA]. A speckle pattern can also be generated with active coherent illumination when no speckle pattern can be observed in passive sensing [Viswanath:18]. While this approach achieves diffraction-limited resolution, its field of view is limited by the memory effect (on the order of a few degrees of angular field of view [Katz14]). Because the scattering of the diffuse wall has a similar nature to the scattering of a diffuser, speckle-based reconstruction can be used to see through scattering media as well [Bertolotti2012NoninvasiveIT].
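The key consequence of Eq. 5 can be sketched in 1D (Python/NumPy): the autocorrelation of the observed pattern retains the autocorrelation of the hidden object, which a phase-retrieval algorithm would then invert. The exponential speckle statistics, sizes, and the two-point object are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4096

# Hidden object intensity pattern O: two incoherent sources 60 bins apart.
O = np.zeros(n)
O[100], O[160] = 1.0, 1.0
S = rng.exponential(1.0, n)            # stand-in delta-correlated speckle PSF

# Eq. 5: the observed pattern is the object convolved with the speckle PSF.
I = np.real(np.fft.ifft(np.fft.fft(O) * np.fft.fft(S)))

def autocorr(a):
    a = a - a.mean()
    return np.real(np.fft.ifft(np.abs(np.fft.fft(a)) ** 2))

# autocorr(I) ~ autocorr(O) convolved with autocorr(S) ~ autocorr(O),
# since the speckle autocorrelation is sharply peaked: the 60-bin source
# separation survives the scattering.
ac = autocorr(I)
ac[0] = 0.0                            # suppress the trivial zero-lag peak
print("recovered separation:", int(np.argmax(ac[: n // 2])))
```

Recovering O itself from its autocorrelation is exactly the phase-retrieval step cited above, since the autocorrelation discards the Fourier phase.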
Inference for Tracking: When coherent light illuminates objects around a corner, the light scattered from the objects creates a speckle pattern on the relay wall. When the object moves, the speckle pattern moves as well. The motion of the object can be tracked by computing the cross-correlation of two images taken at different times [Smith:2017]. Smith et al. [Smith_2018_CVPR] showed that this principle applies to NLOS imaging and demonstrated motion tracking of multiple hidden reflectors using active coherent illumination. Speckle-based tracking has a precision better than 10 μm but is currently limited to microscopic motions because of the small field of view of the memory effect [Smith_2018_CVPR]. A data-driven approach demonstrated MNIST and human-pose classification from speckle patterns [Lei_2019_CVPR].
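The cross-correlation tracking step can be sketched in 1D (Python/NumPy). The pure-translation model and the synthetic speckle are assumptions that hold only within the memory-effect range.

```python
import numpy as np

rng = np.random.default_rng(3)
n, shift = 2048, 17                    # assumed lateral speckle shift (pixels)

# Two frames of the same speckle pattern before/after the object moves;
# within the memory-effect range the pattern simply translates.
frame0 = rng.standard_normal(n)
frame1 = np.roll(frame0, shift)

# Circular cross-correlation via FFT; its peak gives the shift.
xc = np.real(np.fft.ifft(np.fft.fft(frame1) * np.conj(np.fft.fft(frame0))))
est = int(np.argmax(xc))
if est > n // 2:
    est -= n                           # map large lags to negative shifts
print("estimated shift:", est)
```

Because speckle is delta-correlated, the correlation peak is extremely sharp, which is what gives this method its μm-scale precision.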
3.2 Spatial-Coherence-Based NLOS Imaging
Spatial coherence refers to the correlation of the phase of a light wave at different observation points (while temporal coherence refers to the correlation at different observation times). Spatial coherence can be used to reconstruct the scene in the camera's field of view [Beckus:18]. Because the spatial coherence of light is preserved through the scattering at the diffuse wall, such reconstruction techniques can be applied to NLOS imaging [Batarseh18, Beckus2018MultimodalNP]. Measurement of spatial coherence requires a unique imaging system such as the Dual-Phase Sagnac Interferometer (DuPSaI) [RezvaniNaraghi:17].
Reconstruction: The propagation of coherence in free space can be approximated in closed form. This lets the spatial coherence measurement be expressed as a system of linear equations, as in Eq. 3, where A describes the propagation of the spatial coherence function through space and scattering. This inverse problem can be solved by minimizing a least-squares error with regularizers that incorporate priors such as sparsity and small total variation. Beckus et al. [Beckus2018MultimodalNP] showed that coherence and intensity measurements can be incorporated into a single optimization problem for reconstruction.
Inference for Localization: Using spatial coherence, the distance of an incoherent light source from the detector can be estimated from the phase of the spatial coherence function. While only 1D localization has been demonstrated [Batarseh18], multiple measurements of the spatial coherence function could be used to localize the light source by triangulation.
4 Intensity-Based NLOS Imaging
Velten et al. [Velten12] first studied the use of intensity measurements for NLOS imaging and showed that a traditional camera requires very high sensitivity for diffuse surfaces. We illustrate this in Fig. 4 and refer readers to the supplementary material of [Velten12] for further details. To overcome this challenge, most existing intensity-based NLOS imaging works exploit occlusions in the scene. Occlusions had been used in computational imaging long before the development of NLOS imaging. For example, light transport from occlusions is used to synthesize images from a different point of view [Sen:2005:DP], to capture spatially varying BRDFs and incident light [Alldrin:2006:PLP], to recover 4D light fields for refocusing [Veeraraghavan:2007], and to create an anti-pinhole camera to see outside the field of view of the camera [Torralba12:accidentalpinhole]. Tancik et al. [Tancik2018cosi, Tancik2018FlashPF] and Bouman et al. [Bouman17] first addressed the NLOS imaging problem directly: Bouman et al. showed 1D tracking, and Tancik et al. introduced the idea of using intensity-based reconstruction for NLOS imaging. We also review another class of techniques that exploits non-Lambertian surface reflection.
4.1 Exploiting Occlusions
Spatially varying occlusions make it feasible to use intensity measurements, since the variability of intensity due to occlusions increases the rank of the underlying ray transport matrix. Passive intensity-based NLOS imaging with inexpensive cameras has been demonstrated in real time. However, the signal light from the object must be separated from the ambient light, which often requires moving hidden objects or background subtraction.
Table 2: Properties of different sensing modalities for NLOS imaging (illumination; sensor; cost; sensitivity to ambient light; need of priors on geometry).

- Pulsed ToF: pulsed laser; streak camera or SPAD; high cost; robust to ambient light; priors not required.
- AMCW ToF: modulated laser or LED; correlation camera; medium cost; sensitive to strong ambient light; priors not required.
- Passive coherence: no illumination; traditional camera; low cost; sensitive to ambient light; requires scene geometry.
- Passive spatial coherence: no illumination; dual-phase Sagnac interferometer; medium cost; sensitive to ambient light; requires scene geometry.
- Active coherence: coherent source; traditional camera; low cost; sensitive to ambient light; requires scene geometry.
- Passive intensity: no illumination; traditional camera; low cost; sensitive to ambient light; requires scene and occlusion geometry.
- Active intensity: flashlight or laser; traditional camera; low cost; sensitive to ambient light; requires scene geometry.
Localization: Bouman et al. [Bouman17] demonstrated practical passive tracking of hidden objects using the occlusion from a wall, as illustrated in Fig. 9 (a), and Tancik et al. showed that neural networks learn to exploit occlusions [Tancik2018FlashPF], as shown in Fig. 10 (b). The occlusion in the scene can be considered a specific type of aperture. As shown in Fig. 9 (c), the problem can be formulated as a linear inverse problem similar to Eq. 3, where A represents the light transport with occlusions. With priors on the floor albedo, it is possible to estimate the 1D angular location of the hidden object from a single measurement by estimating the ambient illumination [Seidel2019].
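The corner-occlusion idea can be sketched in 1D (Python/NumPy). The ideal ramp visibility, the uniform floor albedo, and the noise level are illustrative assumptions rather than the exact model of [Bouman17]: a floor pixel at angle i sees angular bins 0..i of the hidden scene, so the transport is lower-triangular and inversion amounts to differentiating along the penumbra.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 90                                  # one angular bin per degree

# Ramp visibility from the occluding wall edge: floor pixel i integrates
# light from hidden-scene bins 0..i, so the transport A is lower-triangular.
A = np.tril(np.ones((n, n)))
scene = np.zeros(n)
scene[25] = 1.0                         # hidden source at ~25 degrees
y = A @ scene + 0.01 * rng.standard_normal(n)

# Inverting the ramp is (numerically) differentiation along the penumbra.
est = np.diff(y, prepend=0.0)
print("estimated angle (degrees):", int(np.argmax(est)))
```

The ramp-shaped visibility is what raises the rank of the transport matrix, as discussed above; with no occlusion, every floor pixel would integrate the whole scene and A would be rank one.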
Reconstruction: More complex occlusion geometries enable reconstruction of the hidden scene, as the inverse problem becomes better posed [Thrampoulidis2018ExploitingOI, Saunders2019Periscopy]. A 4D light field can be recovered within an occlusion-based inverse problem framework [Baradad2018InferringLF]. While the above reconstruction techniques assume known occlusions, unknown occlusions can be estimated by exploiting the sparse motion of the hidden object [Yedidia_2019_CVPR]. Deep-learning-based approaches have shown the ability to reconstruct, track, and classify objects for a specific scene setup, while their generalizability is yet to be explored [Tancik2018FlashPF, Tancik2018cosi, Chandran2019].
4.2 Exploiting Surface Reflectance
Another class of intensity-based NLOS imaging techniques exploits the bidirectional reflectance distribution function (BRDF) of the relay wall to reconstruct the hidden scene. A specular BRDF makes the inverse problem less ill-posed. When the reflectance function of the wall is known, the light field of the hidden scene can be reconstructed [Sasaki18]. However, current demonstrations are limited to scenes without ambient light and to walls with some specular reflection [Sasaki18]. Chen et al. [chen_2019_nlos] demonstrated the reconstruction of specular planar objects with active illumination, and showed with a data-driven approach that diffuse objects can also be reconstructed [chen_2019_nlos]. Tracking with active illumination and intensity measurements can be performed by matching simulations to the experimental measurements (inverse rendering) [Klein2016Tracking]. Thermal imaging benefits from the specular BRDF of surfaces at long-wave IR wavelengths, which has been exploited for passive localization and near real-time pose detection around corners [ICCP19_Maeda].
5 Challenges and Future Directions
We reviewed the major techniques for NLOS imaging in the previous sections. Here, we discuss common challenges towards real-time and robust applications to inspire further research.
5.1 Properties of Different Sensing Modalities
Table 1 shows the current landscape of NLOS techniques, and Table 2 summarizes the properties of different sensing modalities. In this section, we discuss what challenges need to be solved to complete Table 1.
ToF: Since ToF-based NLOS imaging was first demonstrated, reconstruction algorithms have improved significantly, becoming orders of magnitude faster than the originally proposed algorithm. The challenges towards practical ToF-based NLOS imaging are:

Low signal-to-background ratio of three-bounce photons.

Estimation of line-of-sight scene parameters.
First, only a small fraction of the emitted photons is captured in the measurement. This constrains the acquisition time for NLOS imaging: many recent works still require minutes to an hour of acquisition for diffuse objects, which precludes real-time applications. Second, the majority of existing works treat the line-of-sight scene as known geometry, given that it is much easier to recover than the hidden scene. However, a fully automated procedure that recovers both the line-of-sight and non-line-of-sight geometry is necessary for practical applications of NLOS imaging, where the imaging platform could be moving.
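As a minimal illustration of the three-bounce ToF principle these methods build on, the sketch below (a toy, not any specific published algorithm; the 1D voxel grid, unit speed of light, and single hidden point are assumptions) simulates a confocal transient and backprojects it onto candidate hidden-scene positions:

```python
import numpy as np

# Toy 2D confocal backprojection: for each candidate voxel, accumulate the
# transient samples whose time-of-flight matches the round-trip distance
# from each scanned wall point to that voxel (speed of light = 1 unit).
n_scan, n_bins, n_vox = 16, 200, 32
wall_x = np.linspace(-1.0, 1.0, n_scan)   # scan positions on the wall (y = 0)
vox_x = np.linspace(-1.0, 1.0, n_vox)     # candidate voxel x positions at depth y = 1
dt = 0.05                                 # temporal bin width

# Simulate a transient for a single hidden point at (0.3, 1.0).
hx, hy = 0.3, 1.0
transients = np.zeros((n_scan, n_bins))
for i, wx in enumerate(wall_x):
    tof = 2.0 * np.hypot(wx - hx, hy)     # confocal round trip
    transients[i, int(tof / dt)] += 1.0

# Backprojection: each voxel collects the transient energy consistent with it.
heat = np.zeros(n_vox)
for i, wx in enumerate(wall_x):
    for j, vx in enumerate(vox_x):
        b = int(2.0 * np.hypot(wx - vx, 1.0) / dt)
        if b < n_bins:
            heat[j] += transients[i, b]
print("recovered x:", vox_x[int(np.argmax(heat))])
```

The peak of the backprojected "heat" lands near the true hidden position; real systems must do this in 3D from photon-starved histograms, which is why acquisition time and signal-to-background ratio dominate the challenge list above.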
Coherence: Coherence-based approaches exploit the observation that certain coherence properties are preserved after scattering on the diffuse wall. The challenges towards practical speckle-based methods are:

Small field of view due to the memory effect.

Lack of depth information.
Correlography techniques exploit the memory effect, which has a limited angular field of view [Katz14]. This limits their application to macroscopic scenes such as those in autonomous driving. Speckle-based reconstruction methods recover a 2D projection of the hidden scene but do not recover depth.
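The memory-effect principle behind correlography can be sketched numerically: within the memory-effect range, the speckle image is approximately the hidden object convolved with a random speckle pattern, so the image autocorrelation approximates the object autocorrelation. The following toy (the random PSF and bar-shaped object are illustrative assumptions, and real methods then recover the object from its autocorrelation via phase retrieval, which is omitted here) demonstrates that relationship:

```python
import numpy as np

rng = np.random.default_rng(0)

def autocorr2(img):
    """2D autocorrelation via the Wiener-Khinchin theorem (FFT-based)."""
    F = np.fft.fft2(img - img.mean())
    return np.fft.fftshift(np.fft.ifft2(np.abs(F) ** 2).real)

N = 64
obj = np.zeros((N, N))
obj[28:36, 20:24] = 1.0                  # hidden object (a small bar)
psf = rng.random((N, N))                 # random speckle-like PSF (toy stand-in)
# Speckle image = object convolved with the random PSF (circular, via FFT).
speckle = np.real(np.fft.ifft2(np.fft.fft2(obj) * np.fft.fft2(psf)))

ac_obj = autocorr2(obj)
ac_speckle = autocorr2(speckle)
# The two autocorrelations should be strongly correlated despite the random PSF.
corr = np.corrcoef(ac_obj.ravel(), ac_speckle.ravel())[0, 1]
print(f"autocorrelation similarity: {corr:.2f}")
```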
Spatial-coherence-based methods do not suffer from the small field of view of speckle-based methods, but they require sensitive interferometric detectors that are mechanically translated [Batarseh18].
Intensity: The ease of image acquisition is the main advantage of intensity-based approaches. Detection of hidden moving objects around corners using shadows was demonstrated on an autonomous wheelchair [Naser2018]. The challenges towards practical intensity-based NLOS imaging are:

Unknown occluding geometries in the hidden scene.

Separation of the background and signal photons.

Low signal-to-noise ratio.
First, the quality of passive, occlusion-based NLOS imaging heavily depends on the occluding geometries. For example, if there are no occluding geometries in the hidden scene, reconstruction is extremely challenging [Thrampoulidis2018ExploitingOI, Saunders2019Periscopy], yet such information about occlusions might not be available a priori. Second, ambient illumination may be much stronger than the signal light, which makes it hard to isolate the signal from the measurement. This requires active illumination, background subtraction, or motion of the hidden scene to extract the signal. Lastly, the algorithm has to be sensitive enough to capture the small signal from the hidden scene while robust enough to reject intensity variations due to noise.
5.2 Limitations on Reconstruction
ToF-based NLOS imaging has a derived resolution limit. Reza et al. [reza2018physical] showed that a virtual-optics formulation of NLOS imaging gives the following Rayleigh lateral resolution limit:
(6) Δx ≈ 1.22 c τ z / D,
where c, τ, z, and D are the speed of light, the full-width-half-max of the temporal response of the imaging system, the distance between the wall and the hidden object, and the diameter of the virtual aperture (scanning area), respectively. However, it is hard to evaluate resolution for some NLOS imaging techniques. The reconstruction capability of occlusion-based techniques heavily relies on the scene geometry. It is essential to evaluate different methods with common datasets such as those proposed in [Klein2018AQP].
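For a sense of scale, Eq. 6 can be evaluated directly; the function name and the example parameter values below (a 70 ps system response, 1 m standoff, 1 m scan aperture) are illustrative assumptions, not values from any specific system:

```python
def nlos_lateral_resolution(fwhm_s, distance_m, aperture_m, c=3.0e8):
    """Rayleigh-type lateral resolution limit of Eq. 6: dx = 1.22 * c * tau * z / D."""
    return 1.22 * c * fwhm_s * distance_m / aperture_m

# Example: 70 ps temporal response, hidden object 1 m from the wall, 1 m scan area.
dx = nlos_lateral_resolution(fwhm_s=70e-12, distance_m=1.0, aperture_m=1.0)
print(f"lateral resolution = {dx * 100:.1f} cm")  # prints 2.6 cm
```

This shows why centimeter-scale resolution demands tens-of-picosecond timing and why resolution degrades linearly with standoff distance.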
Liu et al. [Liu_2019_CVPR] showed that a hidden object can become impossible to reconstruct depending on the direction of its surface normal. It has also been shown that the reconstruction of multiple objects may fail with a simple linear light transport model when the signal from one object is stronger than the others [Heide:2019:OcclusionNLOS]. Further theoretical investigation of such limitations is necessary for the practical use of NLOS imaging.
5.3 Acquisition Time
The low signal-to-noise ratio (SNR) or signal-to-background ratio is a common issue for most NLOS imaging techniques, as discussed in the previous section. This is a fundamental problem, as the number of photons returning from the hidden object is typically small, and it dictates the acquisition time necessary for satisfactory reconstruction and inference. Fig. 11 summarizes reported demonstrations of NLOS imaging techniques across different scales of acquisition time, illumination power, and scene.
Active Sensing: Most of the emitted photons are never captured by the camera, and the three-bounce photons that carry information about the hidden scene constitute only a small fraction of the captured photons. Current demonstrations of real-time NLOS imaging at eye-safe power levels are limited to retroreflective objects. Recent work on ToF-based NLOS imaging uses a laser of up to 1 W power, yet still requires 10 minutes of acquisition time [Lindell:2019:Wave]. More than 10 W would be necessary to perform reconstruction of the same quality under a minute, but increasing laser power is not scalable due to safety and cost. One could instead use light at a different wavelength, such as near-infrared, which permits higher eye-safe power. Recently, a new scanning method was proposed that focuses on a single voxel of the hidden scene, improving the SNR for a specific region of interest [Adithya2019:SNLOS].
Passive Sensing: In many practical scenarios, most of the passively captured photons are ambient light that never interacted with the hidden objects. This results in low SNR as well as the necessity to separate ambient photons from photons arriving from the hidden objects. The signal photons can be isolated by exploiting movements in the hidden scene [Bouman17], or by background subtraction when a measurement without the hidden object is available. Priors on the ambient illumination can also be exploited to remove the ambient term from the measurement [Seidel2019]. Because uncontrolled ambient light is hard to model, adding active illumination to a passive technique may improve the signal-to-noise ratio [Tancik2018FlashPF, Chandran2019].
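The background-subtraction idea above can be sketched in a few lines; this toy (the frame sizes, noise level, and the temporal-median background estimate are illustrative assumptions, not a published pipeline) isolates a weak moving signal buried under a strong static ambient term:

```python
import numpy as np

rng = np.random.default_rng(1)

T, H, W = 20, 16, 16
ambient = 100.0 * np.ones((H, W))                        # strong static ambient term
frames = ambient + 0.5 * rng.standard_normal((T, H, W))  # toy sensor noise

# A weak signal from a hidden mover: a dim patch that shifts over time.
for t in range(T):
    frames[t, 4:8, (t % 8) + 4] += 2.0

# Background subtraction: the temporal median approximates the static ambient
# term, so the residual isolates the (moving) signal photons.
background = np.median(frames, axis=0)
residual = frames - background
mask = residual > 1.0                                    # crude detection threshold
print("detections per frame:", mask.reshape(T, -1).sum(axis=1))
```

The same structure underlies motion-based isolation: anything static, however bright, cancels out, while the moving signal survives in the residual.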
5.4 Integration of Multi-Modal Techniques
Measurements from different sensing modalities can be jointly exploited for NLOS imaging. Beckus et al. [Beckus2018MultimodalNP] showed that fusing intensity and coherence measurements in a single optimization framework produces better reconstruction results.
Insights from different approaches can also be shared. For example, occlusions are not exploited in ToF-based approaches; occlusion-based techniques might be able to narrow down the region of interest in the hidden scene to make ToF-based reconstruction faster.
5.5 Data-driven Approach
We use the term “data-driven” approach to refer to the use of data to model the mapping between the measurement and the hidden scene, and to produce priors on the solution to the inverse problem.
Learned Priors: Hand-crafted priors such as total variation are often used for NLOS reconstruction. More flexible priors or representations of the hidden objects can be learned from relevant data. Recent results on deep learning show the use of generative or discriminative models to enforce learned priors in linear inverse problems [bora17a, Lunz2018]. Theoretical connections between convolutional neural networks and dictionary learning [aberdam2019mlcsc] suggest the suitability of emerging deep learning methodology for incorporating learned representations as priors to solve inverse problems.
End-to-End Learning:
In recent years, deep learning has provided a powerful set of techniques for pattern recognition and has been studied for computational imaging applications such as CT and MRI [Jin17]. If correlations are present in a dataset, deep learning offers powerful and flexible ways to approximate the function that maps novel inputs to the desired outputs. With a large dataset, a deep learning model can learn the mapping between a measurement and the desired inference (e.g., reconstruction or localization). The same network may be adapted to learn the forward model as well by swapping the inputs and outputs and adjusting the model architecture. Potentially, a data-driven approach could offer calibration-free, efficient, and flexible solutions to NLOS imaging problems [Tancik2018FlashPF, Tancik2018cosi, chen_2019_nlos]. Fig. 10 shows the use of deep learning for localization, identification, and reconstruction around corners.
Many NLOS imaging methods attempt to find efficient and robust algorithms to solve the optimization problem shown in Eq. 3. In contrast, the data-driven approach attempts to solve the following minimization problem to find the function f_θ, which approximates the mapping between the measurement y and the target x (3D shape, location, or class of the hidden object):
(7) θ* = argmin_θ Σ_i L(f_θ(y_i), x_i) + R(θ),
where f_θ is parameterized by θ, the regularizer R(θ) discourages overfitting, and L denotes the error between the reconstruction and the ground truth.
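A minimal instance of this training objective, assuming a purely illustrative linear model f_θ(y) = W y, a squared-error loss, and an L2 regularizer (none of which correspond to a specific published NLOS network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy version of Eq. 7: learn W mapping measurements y back to targets x,
# from paired training data, with R(theta) = lam * ||W||^2.
n_x, n_y, n_samples = 8, 16, 200
A = rng.standard_normal((n_y, n_x))                         # unknown forward model
X = rng.standard_normal((n_samples, n_x))                   # hidden-scene targets
Y = X @ A.T + 0.05 * rng.standard_normal((n_samples, n_y))  # measurements

W = np.zeros((n_x, n_y))  # theta
lam, lr = 1e-3, 1e-2
for _ in range(1000):
    # Gradient of mean squared error plus the L2 regularizer.
    err = Y @ W.T - X
    grad = 2.0 * err.T @ Y / n_samples + 2.0 * lam * W
    W -= lr * grad

print("training loss:", float(np.mean((Y @ W.T - X) ** 2)))
```

Real data-driven NLOS methods replace the linear map with a deep network and the analytic gradient with backpropagation, but the objective has the same shape as Eq. 7.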
The main challenges for data-driven approaches are generalizability and explainability.
Generalizability: The potential advantage of data-driven approaches is that they can learn models that are more robust to scene variation than brittle model-based approaches. For example, a data-driven approach may handle variations in wall reflectance, while a model-based approach is often limited to simple, diffuse reflectance. The key concern, however, is that data-driven approaches fail in unpredictable ways when encountering scenes that are not represented in the training data.
One approach towards generalization is to produce a sufficiently large dataset with rich variations, such that any practically relevant scene is represented in the training data. However, this is ultimately limited by the ability to simulate or collect enough experimental data to sufficiently sample the parameter space. Another approach is to design flexible neural networks for specific scene types, which makes it easier for the networks to learn an algorithm that works robustly when the scene type is identified correctly.
Explainability: Another challenge for data-driven approaches is that Eq. 7 produces a mapping that is difficult to interpret. Rather than incorporating a known forward model, the learned mapping is based on the statistics of the training set and is susceptible to “hallucinating” output in order to produce a “reasonable” result with respect to the training data.
Instead of learning Eq. 7 directly, a combination of physics-based forward models and data-driven approaches could make these methods more explainable [Che2018ioannisinversenetworks]. To solve the optimization problem in Eq. 3, iterative algorithms such as ADMM and ISTA demonstrate improved performance when priors or forward models are learned from a specific distribution [Gregor2010LearningFA, Yan16:ADMMnet, Chang17]. While traditional model-based approaches typically rely on a hand-crafted sparsity prior, a data-driven approach can offer a prior that more accurately models the scene distribution [Lunz2018, aberdam2019mlcsc]. Furthermore, these algorithms can be embedded directly within the architecture of a deep learning model [Diamond2017deepprior]. Incorporating data-driven techniques into traditional model-based optimization schemes shows promise for NLOS imaging without sacrificing explainability.
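As a concrete example of the kind of iterative scheme that unrolled networks learn to accelerate, here is classical ISTA with a hand-crafted l1 sparsity prior on a toy linear inverse problem (the problem sizes and regularization weight are illustrative assumptions; LISTA-style methods replace the fixed step size and threshold with learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(v, t):
    """Proximal operator of the l1 norm (the sparsity prior)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# ISTA for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.
m, n = 64, 128
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = rng.standard_normal(8)  # sparse scene
y = A @ x_true + 0.01 * rng.standard_normal(m)

lam = 0.05
L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the data-term gradient
x = np.zeros(n)
for _ in range(300):
    # Gradient step on the data term, then the proximal (shrinkage) step.
    x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
print("nonzeros in estimate:", int(np.count_nonzero(x)))
```

Unrolling fixes the number of iterations as network depth and learns the matrices and thresholds per layer, which is how the learned-prior methods cited above retain the interpretable iterative structure.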
6 Conclusions and Future Directions
We reviewed existing NLOS imaging techniques that rely on different principles and discussed the challenges towards practical, real-time NLOS imaging. We hope that this paper will help and inspire further research toward practical NLOS imaging.
7 Acknowledgement
The authors thank Adithya Pediredla, Ravi Athale, and Sebastian Bauer for valuable feedback. This work is supported by the DARPA REVEAL program (N00141812894) and MIT Lab consortium funding.