1 Introduction and Background
The concept of light detection and ranging (LIDAR), which can be thought of as the optical analogue of radar, is by no means new. Over the many years it has been in use, it has found a very large variety of applications across a wide spectrum of areas and disciplines, including agriculture, archaeology, biology, astronomy, geology and soil sciences, forestry, meteorology, and military applications, to name just a few. Most notably, several modern technologies involve LIDAR, such as autonomous vehicles, space flight devices, robots of many kinds, systems with GRID-based processing and, in the future, face recognition in biometric systems (e.g., at airports).
This background sets the stage and motivates renewed interest in optical signal detection and estimation. The customary model of a direct–detection optical receiver (or detector) consists of a photo-diode (PIN diode or avalanche photo-diode), that converts the intensity of the received optical (laser) signal, modeled as the rate function of a variable–rate Poisson process, into a train of current impulses generated by the photo–electrons at random time instants, pertaining to the Poisson arrivals. This current is then fed into some electronic circuitry, whose first stage is normally a trans-impedance amplifier (TIA), that amplifies the current signal and converts it into a relatively strong voltage signal, but this amplification comes at the cost of some distortion as well as thermal noise associated with the electronic circuitry.
The challenge in the development of a solid theory of detection and estimation, for such a signal–plus–noise system, is that it has a rather complicated model due to the various types of noise involved. The combination of shot noise, due to the photo-diode, and the thermal electronic noise is already not trivial. In the case of an avalanche photo–diode (APD), which is the more relevant case, there is an additional, third type of noise, namely, the excess noise induced by the APD. This is actually a multiplicative noise process pertaining to fluctuations in the random gain associated with the avalanche mechanism, as each primary electron–hole pair (generated by a photon absorbed in the photo-diode) may generate secondary electron–hole pairs, which in turn can generate additional electron–hole pairs, and so on. For more details, and additional aspects of the problem area, the interested reader is referred to some earlier work, e.g., , , , , , , , , , , , , , , and , which is by no means an exhaustive list of relevant articles (and a book).
To provide just a rough, preliminary view on the problem and to fix ideas, we now give an informal presentation of the model and explain the difficulties more concretely. The received signal is modeled as
where is the number of photo–electrons generated during the time interval , is the current pulse contributed by a single electron (which is nearly equal to the charge of the electron multiplied by the Dirac delta function), are the random Poisson arrival times induced by the optical signal, , sensed by the APD, are the random gains induced by the APD photo-multiplication, and is thermal noise, modeled here, and in earlier works, to be white Gaussian noise with spectral density . (Footnote 1: The flat spectrum assumption is adopted here mainly for the sake of simplicity of the exposition. The extension to colored noise is not difficult.)
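In symbols, and under notation assumed here purely for illustration (the labels y(t), K, g_k, h, t_k, n(t) and T are choices made for this sketch and need not coincide with the original notation), the model described in words above takes the form:

```latex
% Assumed symbols: y(t) received signal, K number of photo-electrons in [0,T),
% g_k avalanche gains, h(t) single-electron current pulse, t_k Poisson arrival times,
% n(t) white Gaussian thermal noise.
y(t) = \sum_{k=1}^{K} g_k\, h(t - t_k) + n(t), \qquad 0 \le t < T .
```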
The signal detection problem, in its basic form, is about binary hypothesis testing. The null hypothesis is that, whereas the alternative is as in (1). Had , and been known to the receiver, the likelihood ratio (LR) would have been readily given by (see, e.g., ):
where . Since these random parameters are unknown, the actual LR must be obtained by taking the expectation of with respect to (w.r.t.) their randomness. Deriving this expectation appears to be notoriously difficult, mainly due to the second term at the exponent, i.e., the double sum over and .
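As a hedged illustration of where the double sum comes from, the following is the standard LR for a known signal in additive white Gaussian noise, specialized to a pulse train. The notation is assumed for this sketch: the known signal is s(t) = \sum_k g_k h(t - t_k), the noise has two-sided spectral density N_0/2, and edge effects at the ends of [0,T] are ignored.

```latex
% Known-signal LR in white Gaussian noise of spectral density N_0/2 (assumed notation).
L = \exp\Big\{ \frac{2}{N_0}\int_0^T y(t)\,s(t)\,dt - \frac{1}{N_0}\int_0^T s^2(t)\,dt \Big\}
  = \exp\Big\{ \frac{2}{N_0}\sum_{k=1}^{K} g_k \int_0^T y(t)\,h(t-t_k)\,dt
      - \frac{1}{N_0}\sum_{k=1}^{K}\sum_{l=1}^{K} g_k g_l\, R(t_k - t_l) \Big\},
\qquad R(\tau) = \int h(t)\,h(t+\tau)\,dt .
```

The first sum is linear in the unknown gains and arrival times, whereas the second couples every pair (k, l), which is what makes the expectation over these random parameters hard to carry out.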
It is this difficulty that triggered many researchers in the field to harness their wisdom in the quest for satisfactory solutions, and accordingly, there is a rich literature on the subject, dating back many years. As far as general guidelines go, a possible approach to alleviate this difficulty is the estimator–correlator approach , which asserts that the expected LR of detection of a random signal in additive white Gaussian noise is given by the same expression as if the desired signal, , were known (i.e., the same as if , and were known), except that it is replaced by its causal, minimum mean–square error (MMSE) estimator given . The caveat, however, is clear: deriving this MMSE causal estimator is an extremely difficult problem on its own.
To the best of the author’s knowledge, the first article that is directly relevant to this kind of study, for the above described specific signal model, is the article by Foschini, Gilbert and Salz . Their approach was to view the factor associated with the double sum in the second line of (2), namely, the term, (for given , and ), where is an auxiliary zero–mean, stationary Gaussian process with auto-correlation function . At the next step, the expectation over the randomness of was commuted with the expectations over , and , which are easier to carry out for a given realization of . The result is a more compact expression of the LR, but even after this simplification, it is not explicit enough to be implementable in practice, or to allow its performance to be analyzed in full generality. At this point, the approach taken in was to carry out a series of approximations, yielding explicit asymptotic forms of the optimal detector and its performance at least in the limits of very low and very high signal–to–noise ratio (SNR). The resulting approximate likelihood ratio test (LRT) for high SNR, however, was still rather complicated to implement. Also, the behavior for moderate SNR was left open.
A year later, Mazo and Salz studied the performance of integrate–and–dump filters and also obtained exact formulas for the random gain of the APD on the basis of the earlier study by Personick , . See also . Kadota has also derived an approximate LR test for a model like (1). His approximation approach was different from that of . It was based on neglecting the effect of overlaps between localized noise elements, which basically amounts to ignoring the cross terms of the double summation in the exponent of (2) on the ground that is a very narrow function (see also , who used the same approximation for the purpose of estimation). More recently, Helstrom and Ho and Ho have applied saddle–point integration and thereby studied the behavior of certain pulse shapes at the optical receiver in terms of the performance of the decoder. Other studies are guided by the approach of approximating the distribution of the shot noise of the photo-diode by the Gaussian distribution, owing to considerations in the spirit of the central limit theorem (CLT), see, e.g., [3, Subsections 5.6.3, 5.8.4], . In this context, the well–known optical matched filter (see, e.g., , ) is the main building block of the optimal detector that simply maximizes the SNR at the sampling time, . This Gaussian approximation approach, however, raises some concerns, since the CLT is not valid for assessing the tails of the distribution and, in particular, error exponents, which are the relevant quantities when probabilities of large deviations events, like the (asymptotically rare) false–alarm (FA) and missed–detection (MD) error events, are studied.
In this paper, we take a different approach. Motivated by considerations of the desired simplicity of optical detectors for LIDAR systems (especially when they need to be implemented on mobile devices), we consider the class of optical signal detectors that are based on correlating the noisy received signal with a given deterministic waveform, and we characterize the waveform with the best trade–off between the false–alarm (FA) probability and the missed–detection (MD) probability. More precisely, our derivation addresses the trade-off between the asymptotic error exponents of the FA and MD probabilities using Chernoff bounds, without resorting to Gaussian approximations. We also provide numerical results that compare the performance of the best correlator to that of the optical matched filter (or, more precisely, the matched correlator), which is consistent with the above–mentioned Gaussian approximation approach. It is demonstrated that the proposed optimal correlator outperforms the optical matched correlator in terms of the trade–off between the FA and the MD error exponents. It should be pointed out that in addition to the random fluctuations of the APD photo-multiplier, our model also incorporates the effect of dark current, which exists even under the null hypothesis.
In the same spirit and with a similar motivation, we also study, for the same type of signal model, the problem of estimating the delay of a received signal on the basis of maximizing the correlation between the received signal and a time–shifted waveform, as a function of this shift. We characterize the optimal correlator waveform that minimizes the mean square error (MSE) in the regime of high SNR, as an extension of the analysis provided by Bar-David , who analyzed the high–SNR MSE of the maximum likelihood (ML) estimator for the pure Poissonian regime (i.e., without thermal noise). Once again, the emphasis is on simplicity and therefore, the performance of this estimator cannot be compared to the much more complicated, approximate MAP estimator due to Hero , which is based on approximating the likelihood function, using the same approach as Kadota .
The optimal correlator waveforms for detection and for estimation turn out to be different, but their limiting behavior is the same in both problems: when the thermal Gaussian noise is dominant, the optimal correlator waveform becomes proportional to the clean signal (like the classical matched filter for additive white Gaussian noise), but when the thermal noise is negligible compared to the other noises, it becomes a logarithmic function of the clean signal, as expected in view of .
The outline of the remaining part of the paper is as follows. In Section 2, we present the model under discussion in full detail. In Section 3, we address the signal detection problem, first and foremost, for the case of zero dark current. The case of positive dark current, which follows the same general ideas (but is more complicated), is also outlined, but relatively briefly. Finally, in Section 4, we address the problem of time delay estimation.
2 The Signal Model
In this section, we provide a formal presentation of the signal model that was briefly described in the Introduction. As mentioned before, we consider the model,
whose various ingredients are described as follows. The variable is a Poissonian random variable, distributed according to
where is the rate function that depends upon the intensity of the received optical signal. In particular,
where is the quantum efficiency of the APD, is the instantaneous power of the optical signal, is Planck’s constant, is the angular frequency of the light wave, and is the dark current. The variables are independent and identically distributed (i.i.d.) positive integer random variables that designate the avalanche gains. According to Personick , , the distribution of these random variables depends on the physics of the APD, and its characteristic function obeys a certain implicit equation, which is solvable in closed form when only the electrons (and not holes) cause ionizing collisions. In this case, the distribution of each is geometric:
For the sake of concreteness, we will henceforth adopt the assumption of this geometric distribution. It also includes the case of a deterministic gain (with probability one), which corresponds to the case of the PIN diode, by taking the limit . The function is the current pulse contributed by the passage of a single photo–electron and hence its integral must be equal to the electric charge of the electron, . Naturally, this is a very narrow pulse, which for most practical purposes can be approximated by , where is the Dirac delta function. However, can also be understood to include the convolution with some front–end filter, which is part of the electronic circuitry (e.g., the TIA). The times are the random Poissonian photon arrival times, taking on values in and being induced by the optical waveform, . Finally, is Gaussian white noise with spectral density , which is assumed to be independent of , and . Also, given , are statistically independent of . Recall that for the underlying Poissonian process defined, conditioned on the event , the unordered random arrival times, , are i.i.d. and their common density function is given by , for , and zero elsewhere.
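The point-process part of the model can be simulated along the following lines. This is a minimal sketch under stated assumptions, not the authors' procedure: arrival times are drawn by thinning a homogeneous Poisson process, the geometric law is parameterized by a hypothetical success probability `p` (so that `p = 1` gives the deterministic gain of a PIN diode), and the two-level rate function is an illustrative choice. The function names are likewise hypothetical.

```python
import math
import random

def sample_arrivals(rate, rate_max, T, rng):
    """Arrival times on [0, T) of a Poisson process with rate(t) <= rate_max (thinning)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)            # next candidate of a homogeneous process
        if t >= T:
            return times
        if rng.random() * rate_max <= rate(t):    # keep with probability rate(t) / rate_max
            times.append(t)

def sample_gain(p, rng):
    """Geometric gain on {1, 2, ...}: P(g = n) = p * (1 - p)**(n - 1) (assumed parameterization)."""
    if p >= 1.0:
        return 1                                  # deterministic gain (PIN-diode limit)
    u = 1.0 - rng.random()                        # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log(1.0 - p))

def simulate(rate, rate_max, T, p, rng):
    """Return (arrival times, avalanche gains) for one observation window."""
    times = sample_arrivals(rate, rate_max, T, rng)
    gains = [sample_gain(p, rng) for _ in times]
    return times, gains

# Hypothetical two-level rate function: 50 arrivals/s for half the window, 10 for the rest.
rate = lambda t: 50.0 if t < 0.5 else 10.0
times, gains = simulate(rate, 50.0, 1.0, 0.25, random.Random(0))
```

Thinning is used here because it needs only an upper bound on the rate function; for a given count, the accepted times are i.i.d. with density proportional to the rate function, in line with the property recalled above.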
Observe that if we present as , where and then designates the fluctuation, then the random input signal can be represented as
The first term, , is the desired clean signal (plus dark current), the second term, defined as
is shot noise (amplified by ), and the last term is multiplicative noise. Thus, together with the Gaussian noise, , of (3), there are three types of noise in this model, as already mentioned in the Introduction.
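Written out under assumed notation (g_k = \bar{g} + \Delta g_k with \bar{g} the mean gain, \lambda(t) the rate function and h the single-electron pulse; these labels are chosen for this sketch), the three-way split described above is the identity:

```latex
% Decomposition of the photo-current into clean signal, shot noise and multiplicative noise.
\sum_{k=1}^{K} g_k\, h(t - t_k)
  = \underbrace{\bar{g}\int_0^T \lambda(\tau)\, h(t-\tau)\, d\tau}_{\text{clean signal (plus dark current)}}
  + \underbrace{\bar{g}\Big[\sum_{k=1}^{K} h(t - t_k) - \int_0^T \lambda(\tau)\, h(t-\tau)\, d\tau\Big]}_{\text{shot noise, amplified by } \bar{g}}
  + \underbrace{\sum_{k=1}^{K} \Delta g_k\, h(t - t_k)}_{\text{multiplicative noise}} .
```

The bracketed term has zero mean because the expectation of \sum_k h(t - t_k) over the Poisson arrivals equals the convolution of \lambda with h. When h is approximated by the electron charge times a Dirac delta, the first term reduces to the mean gain times the electron charge times \lambda(t).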
3 Signal Detection
In this section, we study the signal detection problem. We begin with the case (no dark current), which is considerably simpler, and then outline the extension to the more general case, . Before we move into the technical details, a comment is in order, and it applies even to the case of no dark current. Consider then the signal detection problem of deciding between the two hypotheses:
using a detector, that is based on a correlator, that is, calculating the quantity
and comparing it to a threshold, . Here, is a deterministic waveform to be optimized, and is a threshold parameter that controls the trade-off between the FA and MD probabilities. Clearly, if the noise were purely Gaussian white noise under both hypotheses, the optimal choice of would have been matched to the desired signal, i.e., . Here, however, as explained in Section 2, there are two additional types of non–Gaussian noise under . Since the classical matched correlator, , is no longer necessarily optimal under non–Gaussian noise, and since we would still be interested in a detector that is relatively easy to implement, the natural question is whether there is a waveform better than . If so, then what is the optimal waveform for this detection problem? In this context, we should also mention again the notion of the optical matched filter (see, e.g., , ), the optical analogue of the classical matched filter, which maximizes the SNR at the sampling time, , taking into account that the intensity of the shot noise is proportional to the desired signal, (unlike the case of pure thermal noise). But the SNR is the only parameter that counts for detection performance only under the Gaussian regime, so the optical matched filter is applied either under the Gaussian approximation, or when only second moments are important. The Gaussian approximation, under this model, is largely justified by CLT considerations. Here, however, we wish to avoid CLT considerations, as we are interested in the FA and MD error exponents, which are, in fact, given by large deviations rate functions. As is well known, the CLT is not valid for the large-deviations regime and for tails of distributions. Indeed, the optimal waveform that we derive below will be different from the optical matched filter.
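The correlator detector just described can be exercised numerically as follows. This is a minimal sketch under simplifying assumptions that are not taken from the text: the single-electron pulse is an impulse of area `q`, the gain is deterministic and equal to one, there is no dark current, and the thermal noise has spectral density `N0 / 2`. Under these assumptions the correlator output is `q` times the sum of the waveform sampled at the arrival times, plus a zero-mean Gaussian variable whose variance is `N0 / 2` times the energy of the waveform. All names and parameter values are hypothetical.

```python
import math
import random

def poisson_times(rate, rate_max, T, rng):
    """Arrival times on [0, T) of a Poisson process with rate(t) <= rate_max (thinning)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)
        if t >= T:
            return times
        if rng.random() * rate_max <= rate(t):
            times.append(t)

def error_rates(w, rate, rate_max, T, q, N0, threshold, trials, rng, n_grid=1000):
    """Empirical (FA, MD) rates of the rule 'decide signal present iff statistic >= threshold'."""
    dt = T / n_grid
    energy = sum(w((i + 0.5) * dt) ** 2 for i in range(n_grid)) * dt  # integral of w^2
    sigma = math.sqrt(0.5 * N0 * energy)      # std of the thermal-noise part of the statistic
    fa = md = 0
    for _ in range(trials):
        # Null hypothesis (no dark current): the statistic is the thermal-noise term alone.
        if rng.gauss(0.0, sigma) >= threshold:
            fa += 1
        # Alternative: photo-electron term plus the same kind of thermal-noise term.
        photon_part = q * sum(w(t) for t in poisson_times(rate, rate_max, T, rng))
        if photon_part + rng.gauss(0.0, sigma) < threshold:
            md += 1
    return fa / trials, md / trials

rate = lambda t: 50.0 if t < 0.5 else 10.0    # hypothetical two-level rate function
w = lambda t: rate(t) / 50.0                  # correlator proportional to the clean signal
fa_rate, md_rate = error_rates(w, rate, 50.0, 1.0, 1.0, 2.0, 8.0, 2000, random.Random(1))
```

Monte Carlo estimates of this kind are informative only for moderately small error probabilities; the error exponents studied in this section describe the regime where such events are too rare to simulate directly.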
3.1 The Case of No Dark Current
Consider the hypothesis testing problem defined at the beginning of the introductory part of this section, which is for the case of no dark current. Let and . Then, the FA probability is given by
where the notation designates asymptotic equivalence in the exponential scale, i.e., for two positive functions of , and , the assertion , stands for the assertion that . Since the FA error exponent, , depends on the waveform only via its power, , it is obvious that the maximization of the MD error exponent for given FA error exponent is equivalent to its maximization subject to a power constraint imposed on . Denoting , we next assess the MD error exponent using the Chernoff bound, assuming that .
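For orientation, the two quantities discussed here can be written out under an assumed normalization, which may differ from the original one: threshold \theta T, waveform power P_w = (1/T)\int_0^T w^2(t)\,dt, thermal noise of two-sided spectral density N_0/2, no dark current under the null hypothesis, and Q(\cdot) denoting the Gaussian tail function.

```latex
% FA probability under purely Gaussian noise, and the Chernoff bound on the MD probability.
P_{\mathrm{FA}} = \Pr\Big\{\int_0^T w(t)\,n(t)\,dt \ge \theta T\Big\}
  = Q\!\left(\frac{\theta T}{\sqrt{N_0 P_w T/2}}\right)
  \doteq \exp\Big\{-\frac{\theta^2 T}{N_0 P_w}\Big\},
\qquad
P_{\mathrm{MD}} = \Pr\Big\{\int_0^T w(t)\,y(t)\,dt < \theta T\Big\}
  \le e^{s\theta T}\; \mathbf{E}\exp\Big\{-s\int_0^T w(t)\,y(t)\,dt\Big\}, \quad s \ge 0 .
```

In this normalization the FA exponent is \theta^2/(N_0 P_w), which depends on the waveform only through its power, in line with the remark above.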
To calculate the last expectation, we proceed in two steps. First, we average each factor over , assuming that it is geometrically distributed as in (6). This gives
As a second step, we average over and , and get
where we have used the fact [1, eqs. (8)–(13)] that for an arbitrary positive function ,
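The fact invoked here is the standard identity for Poisson processes: the expectation of the product of f over the arrival times equals the exponential of the integral of the rate function times (f - 1). A minimal Monte Carlo sketch of this identity follows; the rate function, the function f and all names are hypothetical choices made for illustration.

```python
import math
import random

def poisson_times(rate, rate_max, T, rng):
    """Arrival times on [0, T) of a Poisson process with rate(t) <= rate_max (thinning)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)
        if t >= T:
            return times
        if rng.random() * rate_max <= rate(t):
            times.append(t)

def product_mean(f, rate, rate_max, T, trials, rng):
    """Monte Carlo estimate of E[prod_k f(t_k)]; an empty product counts as 1."""
    total = 0.0
    for _ in range(trials):
        prod = 1.0
        for t in poisson_times(rate, rate_max, T, rng):
            prod *= f(t)
        total += prod
    return total / trials

def closed_form(f, rate, T, n_grid=1000):
    """exp of the integral over [0, T] of rate(t) * (f(t) - 1), by the midpoint rule."""
    dt = T / n_grid
    integral = sum(rate((i + 0.5) * dt) * (f((i + 0.5) * dt) - 1.0) for i in range(n_grid)) * dt
    return math.exp(integral)

rate = lambda t: 4.0 if t < 0.5 else 1.0      # hypothetical two-level rate function
f = lambda t: 0.5 if t < 0.5 else 0.8         # hypothetical positive function
estimate = product_mean(f, rate, 4.0, 1.0, 20000, random.Random(2))
exact = closed_form(f, rate, 1.0)
```

For these choices the integral equals 0.5 * 4 * (-0.5) + 0.5 * 1 * (-0.2) = -1.1, so the closed form is exp(-1.1), and the empirical average should agree with it up to Monte Carlo error.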
Consider first the special case where with probability one, which is obtained in the limit . In this case, the above simplifies to
Remark. Note that had the channel been purely Gaussian, that is, without the shot noise, and the desired signal was , we would have obtained
This means that the difference
designates the loss due to the additional shot noise. (Footnote 2: The integrand is, of course, non–negative due to the inequality .)
We therefore need to maximize the exponent over both and . For a given , the optimal minimizes subject to the power constraint , which is equivalent to minimizing
where is a Lagrange multiplier. (Footnote 3: This form of the Lagrange multiplier is adopted as it allows a convenient representation of the solution. It is legitimate since there is complete freedom to control its value by the choice of the constant .) Finding the optimal function is a standard problem in calculus of variations, whose solution is characterized as follows. Let the function denote the inverse of the monotonically increasing function , , i.e., is the solution to the equation , . The optimal is given by
where is chosen such that . The MD exponent is then given by
where it should be kept in mind that depends on .
Note that if the optimal is very small (which is the case when and/or are large), then the solution to the equation is found near the origin, where , which means that is nearly proportional to , namely, the classical matched correlator. If, on the other hand, the optimal is very large (which is the case when and are both small), then the solution is found away from the origin, where . In this case, , in agreement with the optimal photo–counting detector (see, e.g., ), which is obtained in the absence of Gaussian noise.
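The two regimes can be illustrated numerically. The sketch below rests on assumptions that should be read as such: the monotone function being inverted is taken to be x * exp(x) (whose inverse is the Lambert W function), which is consistent with the near-linear and near-exponential behaviors described above; the gain is deterministic and equal to one; the pulse is an impulse of area `q`; and `s` is the Chernoff parameter. Under these assumptions, minimizing the integral of rate(t) * exp(-s * q * w(t)) subject to a power constraint gives w(t) = W(c * rate(t)) / (s * q), with the constant c set by the constraint. All names are hypothetical.

```python
import math

def lambert_w(x, tol=1e-12):
    """Principal branch for x >= 0: solves w * exp(w) = x by Newton's method."""
    w = math.log1p(x)                     # starts at or above the root, so Newton descends to it
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def optimal_waveform(rate_samples, dt, s, q, energy_budget):
    """Samples of w = W(c * rate) / (s * q), with c found by bisection so that
    sum(w_i^2) * dt equals energy_budget (the power constraint times the window length)."""
    def energy(c):
        return sum((lambert_w(c * lam) / (s * q)) ** 2 for lam in rate_samples) * dt
    lo, hi = 0.0, 1.0
    while energy(hi) < energy_budget:     # expand the bracket; energy grows without bound in c
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if energy(mid) < energy_budget:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return [lambert_w(c * lam) / (s * q) for lam in rate_samples]

# Hypothetical two-level signal: rate 50 on one half of a unit window, 10 on the other half.
w_small = optimal_waveform([50.0, 10.0], 0.5, 1.0, 1.0, 1e-8)   # tight power budget
w_large = optimal_waveform([50.0, 10.0], 0.5, 1.0, 1.0, 100.0)  # generous power budget
ratio_small = w_small[0] / w_small[1]
ratio_large = w_large[0] / w_large[1]
```

With the tight budget the ratio of the two levels is close to 50/10 = 5, i.e., the waveform is nearly proportional to the signal; with the generous budget the ratio drops toward one, reflecting the logarithmic compression of the signal levels.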
Example 1. Consider the frequently–encountered case where is a two–level signal, where half of the time and in the other half, . Then, must also be a two–level signal. Owing to the power constraint, we may denote these two levels by and , respectively, and it remains to maximize the exponent over and alone. Specifically, we have
for the following values of the parameters of the problem: , , and . The numerical value of the spectral density of the thermal noise was deliberately chosen extremely small, in order to demonstrate a situation where the noise is far from being Gaussian, and thereby examine sharply the validity of the Gaussian approximation. In this case, is nearly equal to unity for all , which means a pure, unweighted integrator [7, p. 1291, Remark 2]. As can be seen, the optimal correlator improves upon the optical matched correlator fairly significantly, especially for large values of the threshold parameter, . This concludes Example 1.
Returning to the case of a general, finite , and carrying out a similar optimization, we find that the optimal is now given by
where is the inverse of the function
and where, once again, is chosen such that
Here too, if and/or are large, then must be small, and then due to the power constraint, must be small too, which means that the functions and operate near the origin, where they are roughly linear, as
At the other extreme, and operate away from the origin, where
which is again, nearly exponential, and so, is approximately logarithmic, as before. The MD exponent is therefore given by
Example 2. Consider again the setting of Example 1 above, except that here is finite. Specifically, Fig. 2 displays a comparison analogous to that of Fig. 1 for the case of a random gain with parameter , where for the red solid curve, and are chosen to maximize
and for the blue, dashed curve, is chosen to be the optical matched correlator as before. As can be seen, here too, the optimal improves upon the optical matched correlator and the gap is rather considerable, especially as grows.
3.2 The Case of Positive Dark Current
The case of is analyzed on the basis of similar ideas, and we therefore cover it relatively briefly, highlighting mostly the points where there is a substantial difference relative to the zero dark–current case.
Extending the analysis to the case of positive dark current, a similar derivation yields the following FA exponent for a given correlator waveform, :
where . Here, the trade–off between the FA and the MD error exponents is somewhat more involved than in the zero dark–current case, since the FA error exponent depends on in a more complicated manner than just via its power . In particular, we would now like to find the pair that maximizes over all pairs such that , for a given that designates the target FA exponent. (Footnote 4: Previously, was given by , so for a given , was proportional to .) Equivalently, we would like to solve the problem
where is a Lagrange multiplier chosen to meet the constraint, . More specifically, using Chernoff bounds for both error exponents, this amounts to solving the problem,
We will not continue any further to the full, detailed solution of this problem, beyond the following comment which applies to the case of a deterministic gain. In the limit of , the above trade-off yields
in other words, in the presence of dark-current, the relation is always logarithmic.
4 Time Delay Estimation
In this section, we consider the problem of time delay estimation. The underlying model is the same as before, except that the optical signal is time–shifted, i.e., , where is the delay. It is assumed that the support of is included in the interval , for every in the range of uncertainty. As mentioned in the Introduction, this is a relevant problem in LIDAR systems, where distances to certain objects have to be estimated, similarly as in classical radar systems. Here too, our basic building block is a correlator. Consider an estimator of the form,
and where is a twice–differentiable waveform to be optimized, with the property that the temporal cross–correlation function,
achieves its maximum at . This implies
where dotted functions designate derivatives. Similarly as the assumption concerning , we also assume that the support of the waveform is fully included in for the entire range of search of the estimated delay.
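A correlator-based delay estimator of this kind can be sketched as a grid search over candidate delays. The example below is an illustration under assumptions not taken from the text: the pulse is an impulse of area `q`, the gain is deterministic and equal to one, the white noise of spectral density `N0 / 2` is discretized on a time grid, and the correlator waveform is simply the pulse shape of the signal (a raised-cosine bump), not the MSE-optimal waveform. All names and parameter values are hypothetical.

```python
import math
import random

def pulse(u, center=0.2, half_width=0.1):
    """Smooth raised-cosine bump supported on [center - half_width, center + half_width]."""
    x = u - center
    if abs(x) >= half_width:
        return 0.0
    return math.cos(0.5 * math.pi * x / half_width) ** 2

def poisson_times(rate, rate_max, T, rng):
    """Arrival times on [0, T) of a Poisson process with rate(t) <= rate_max (thinning)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)
        if t >= T:
            return times
        if rng.random() * rate_max <= rate(t):
            times.append(t)

def estimate_delay(times, noise, dt, w, q, delays):
    """Candidate delay that maximizes the correlation of the received signal with w(t - delay)."""
    best_delay, best_value = None, -math.inf
    for d in delays:
        photon_part = q * sum(w(t - d) for t in times)
        thermal_part = sum(w((i + 0.5) * dt - d) * n for i, n in enumerate(noise)) * dt
        value = photon_part + thermal_part
        if value > best_value:
            best_delay, best_value = d, value
    return best_delay

rng = random.Random(3)
T, n_bins, N0, q, amplitude, true_delay = 1.0, 500, 0.01, 1.0, 5000.0, 0.35
dt = T / n_bins
# Shifted optical signal: rate function amplitude * pulse(t - true_delay).
times = poisson_times(lambda t: amplitude * pulse(t - true_delay), amplitude, T, rng)
# Discretized white noise: each bin average has variance (N0 / 2) / dt.
noise = [rng.gauss(0.0, math.sqrt(0.5 * N0 / dt)) for _ in range(n_bins)]
delays = [j * 0.005 for j in range(141)]          # candidate delays 0, 0.005, ..., 0.7
delay_hat = estimate_delay(times, noise, dt, pulse, q, delays)
```

At this high signal level (about 500 expected photo-electrons and negligible thermal noise), the estimate lands within a few grid steps of the true delay; the grid spacing limits the achievable resolution.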
Our analysis begins similarly as in , which assumes high SNR and small estimation errors. Accordingly, consider the Taylor series expansion of the first derivative, , around the true parameter value, :
where is the second derivative of . This yields
where is the total noise, composed of the shot noise, the multiplicative noise and the thermal noise, i.e.,
where we have made a further approximation by neglecting the contribution of the random variable relative to the deterministic constant (in the denominator of the last line of (40)), since is proportional to , whereas the standard deviation of is proportional to (see also [1, eqs. (38)–(40)] for a more detailed justification of a similar approximation). It is easy to see that all three noise components are uncorrelated with each other. The auto-correlation function of the thermal noise is . The auto-correlation function of the (amplified) shot noise is