In automatic radar target detection, statistical modeling and inference play an important role in designing constant false alarm rate (CFAR) detection procedures. CFAR detection is essentially an adaptive thresholding process designed to detect targets immersed in varying background clutter [barton_radar_2004]. CFAR is achieved when the probability of false alarm (P_fa) does not depend on any of the clutter parameters. Detection performance (i.e., the probability of detection, P_d), on the other hand, is mainly governed by the backscattering from the target, often termed the radar cross-section (RCS) of the target [wilson_probability_1972]. Although for simple targets the RCS is a function of aspect angle, frequency, and polarization, there has been a paradigm shift toward statistical modeling of the RCS because of the extremely complex formulation of a real-world target's compounded scattering effects and their strong sensitivity to aspect angle and frequency [richards_principles_2010]. So the RCS within a single resolution cell is usually treated as a random variable from a specified distribution, termed a target-fluctuating-model [richards_fundamentals_2005].
Accurate target models are essential for good detection performance. Nonetheless, with newer clutter models, more attention has been given to the CFAR property at the cost of detection performance [richards_principles_2010]. Because of this, we see poor performance in detecting real-world targets like aircraft or ballistic targets [johnston_target_1997]. In [johnston_target_1997-1], the authors reiterated the importance of accurate target-fluctuating models, departing from the conventional Marcum and four Swerling models for improved detection performance.
In the context of aircraft detection, it has recently been shown in [persson_radar_2017] that the generalized Pareto distribution best fits a SAAB aircraft's RCS, except when the aircraft approaches the radar head-on; the experimental study was carried out in the C and Ku bands. Specifications and characteristics of the aircraft navigation system and the experimental setup, along with the in-flight configurations, are given in that reference. But as stated in [persson_radar_2017], RCS fluctuations are more sensitive to small changes in target geometry at higher frequencies, and the parameters of the RCS model are frequency dependent. Statistical models are widely accepted to capture these sensitive RCS fluctuations [richards_fundamentals_2005], and for a particular class of distribution model, the parameters of the distribution vary with aspect angle and frequency [wilson_probability_1972]. Now, as the work of [persson_radar_2017] validates the Pareto model both for the lower frequency range (C band) and the higher frequency range (Ku band), we adopt the same Pareto model for the intermediate frequency band (i.e., the X band, which lies between the C and Ku bands). We call any such Pareto distributed target-fluctuating-model a Pareto-Target (PT).
On the other hand, modeling the spiky (heavy-tailed) behavior of sea clutter with the Pareto distribution has gained much attention after it was validated for X-band high-resolution clutter intensity [weinberg_noncoherent_2017]. Furthermore, Pareto has been the forerunner for high-resolution sea clutter returns at both low and high grazing angles, outperforming the Log-normal, Weibull, and K distributions, while closely matching the five-parameter KK distribution [farshchian_pareto_2010, weinberg_assessing_2011]. Thereby, many detection schemes [weinberg_coherent_2011, weinberg_constant_2013, weinberg_assessing_2013] and constant false alarm rate (CFAR) detectors [weinberg_constant_2013] were designed for this clutter model. However, to the best of our knowledge, CFAR detectors for the case of a PT in Pareto distributed clutter do not exist.
Further, in maritime surveillance and reconnaissance, airborne early warning and control (AEW&C) or AWACS (Airborne Warning and Control System) platforms have popularized the radar picket in aerial warfare [force_2019, defenceWorld]. In such aerial engagements, scenarios of detecting enemy aircraft over sea clutter are quite common when viewed from a radar picket at higher altitude. For example, patrol planes with onboard radars (AWACS) are often engaged to detect enemy aircraft hovering over the sea. We address this problem, detecting a Pareto modeled aircraft target immersed in Pareto distributed clutter, within a two-sample hypothesis testing framework.
So, in this paper, we consider the detection of a PT aircraft in Pareto distributed clutter. We further assume a more realistic scenario by considering unknown shape (α) and scale (β) parameters. With the scale parameter unknown, the two-parameter Pareto distribution no longer belongs to the exponential family, for which standard statistical procedures are readily available [casella2002statistical]. To the best of our knowledge, there is no literature on CFAR detection of a PT in Pareto clutter when both parameters are unknown.
Lately, the work in [weinberg_examination_2014] mainly focused on fitting or adapting the solution obtained for the Gaussian-intensity case (i.e., exponential vs. exponential hypothesis testing [gandhi_optimality_1994], where one considers both an exponentially distributed target and clutter) to the detection problem in Pareto clutter. In [weinberg_general_2014], assuming that the scale parameter is known and exploiting the transformation of the Pareto to the exponential distribution, a CFAR detector was derived. In other words, it preserves the relation between the threshold and the probability of false alarm (P_fa) obtained for the Gaussian-intensity case, i.e., the cell-averaging CFAR (CA-CFAR). In the subsequent work [weinberg_constant_2014], it was observed that the CFAR process depends on the scale parameter for preserving the Gaussian "threshold vs. P_fa" relationship. Later, in [weinberg_construction_2017, weinberg_invariant_2017], the scale-parameter dependence of CFAR detection was rectified by employing the complete sufficient statistic. All of this existing literature focuses on scenarios where a PT is not considered, wherein an inherent loss in detection performance is present. In contrast, our procedure provides an elegant solution, a GLRT based CFAR detector, which we derive from first principles by posing the problem as a binary composite hypothesis test.
Complementing the Pareto model for clutter returns, it was shown that thermal noise plus clutter at medium grazing angles is also Pareto distributed [rosenberg_application_2015]. This allows us to formulate a composite binary hypothesis testing problem (Pareto vs. Pareto), as both the presence of a PT and its absence (clutter only) yield a two-parameter Pareto distribution. It is natural to take the tail index (shape parameter) as the discerning parameter, since large intensity values are more probable from a nearby PT aircraft than from distant sea clutter; this is captured in the very definition of the tail index of the Pareto distribution. So, instead of trying to design a CFAR detector directly, we start from first principles by posing the problem as a composite binary hypothesis test for the heavier tail, just as is done for exponential vs. exponential (i.e., detecting a Swerling-I target in exponential clutter) [gandhi_optimality_1994].
We pose the aircraft detection problem as a two-sample hypothesis test for comparing the tail indices of two-parameter Pareto distributions.
We solve it systematically by the GLRT approach and derive expressions for P_fa and P_d.
We verify the CFAR property of the derived detector by theory and simulation.
We match analytical receiver operating characteristic (ROC) curves against simulated ROC curves.
We next provide rudiments of the two-parameter Pareto distribution in Section II, followed by the system model and the formal statement of the composite binary hypothesis testing problem. Next, we provide the detection procedures in order of increasing complexity, relaxing assumptions on the knowledge of the parameters, in Sections III and IV, respectively. We then give extensive simulation results validating our proposed detector in Section V. Section VI concludes the paper.
II. Statistical Preliminaries
The Pareto distribution is one among the power-law family with a negative exponent [balakrishnan_primer_2004]. We use the notation X ~ Par(α, β) for a random variable drawn from a Pareto distribution with shape α and scale β, whose density is

f(x; α, β) = α β^α x^(−(α+1)) · 1{x ≥ β},

where the indicator function 1{x ≥ β} is one when x ≥ β, and zero otherwise. In other words, the support of X is the parameterized interval [β, ∞).
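As a numerical sanity check on the density above, the following sketch (plain Python; the helper name `rpareto` and all parameter values are our own illustrative choices) draws Par(α, β) samples by inverse-CDF sampling and confirms the support [β, ∞) and the complementary cdf P(X > t) = (β/t)^α:

```python
import random

def rpareto(alpha, beta, rng):
    """Inverse-CDF draw from a two-parameter Pareto: beta * u^(-1/alpha), u in (0, 1]."""
    return beta * (1.0 - rng.random()) ** (-1.0 / alpha)

rng = random.Random(1)
alpha, beta = 3.0, 2.0
xs = [rpareto(alpha, beta, rng) for _ in range(200_000)]

# Support check: every draw lies in [beta, inf).
support_ok = min(xs) >= beta

# Complementary cdf check at t = 4: P(X > t) = (beta / t)^alpha = 0.125.
t = 4.0
emp_ccdf = sum(x > t for x in xs) / len(xs)
theory_ccdf = (beta / t) ** alpha
```

The empirical tail probability should agree with (β/t)^α up to Monte Carlo noise.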
For the squared amplitude or intensity observations in the range cells, we assume this two-parameter Pareto model [weinberg_constant_2013], where the shape parameter α dictates the fatness of the distribution tail and the scale β defines the support set. The fatter the tail, the smaller the shape parameter, and the more likely it is that a large observation comes from a nearby target rather than from distant sea clutter. So, it is natural to consider the tail index as the discerning parameter in the formulation of the hypothesis testing problem described in the next section.
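A quick arithmetic illustration of why the smaller shape parameter signals the target: with a common scale β, the survival probability P(X > t) = (β/t)^α is far larger under the smaller shape. The shape values below are our own illustrative choices, not from the text.

```python
beta, t = 1.0, 10.0
alpha_target, alpha_clutter = 1.0, 4.0   # smaller shape = heavier tail

surv_target = (beta / t) ** alpha_target    # P(X > 10) under the target model
surv_clutter = (beta / t) ** alpha_clutter  # P(X > 10) under the clutter model

# A value exceeding t = 10 is 1000x more probable under the heavy-tailed model.
ratio = surv_target / surv_clutter
```

Here the exceedance probability is 0.1 for the target-like tail versus 1e-4 for the clutter-like tail, which is exactly the asymmetry the tail-index test exploits.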
III. System Model
Consider a radar target detection problem in homogeneous background clutter, wherein the squared amplitude or intensity observations in the range cells are modeled as Pareto distributed with unknown shape and scale parameters. The range cells comprise reference window cells and a cell under test (CUT), as shown in Fig. 1. The reference window cells read the background clutter observations x_1, …, x_n, while the CUT reads a single observation x_0, the backscattering either from the same background clutter or from a target. Usually, the CUT is isolated from the reference window cells by several guard cells, as shown in Fig. 1. So the CUT observation x_0 is statistically independent of each of the x_i's. Even the reference window cells are sufficiently far apart that the x_i's are mutually independent. So, for the discerning parameter, the tail index, we choose a different notation, α₀, for the CUT to distinguish it from the tail index α of the reference window cells.
Now, we pose the problem as a two-sample test for comparing tail indices, with one sample lot consisting of n iid observations, each Par(α, β), while the other sample lot has the one observation x_0 ~ Par(α₀, β) on which the test is conducted. As lower values of the shape parameter imply a heavier tail, we say the target is present when α₀ < α, i.e., α₀ is restricted to (0, α), while α is unrestricted, allowing the natural parametric space α > 0. Also, in both sample lots, the scale parameter β takes the same value and merely acts as a nuisance parameter. So, our two-sample hypothesis testing problem can be compactly stated as follows:
Problem Statement: Let (x_0, x_1, …, x_n) be the realization of two independent random sample lots X_0 and (X_1, …, X_n), drawn from Par(α₀, β) and Par(α, β) respectively. Our problem is to find the GLRT for the hypotheses

H_0: α₀ = α versus H_1: α₀ < α,

for the cases
(a) when α is unknown and β is known;
(b) when both α and β are unknown.
Before addressing the above cases, (a) and (b), we shall study a simple scenario which, though unrealistic, helps us obtain upper bounds on the detection performance. We call this idealized scenario 'simple vs. composite,' based on the specification of parameters in the probabilistic model, as described below.
III-A. 'Simple vs. Composite'
We assume perfect knowledge of the clutter statistics, i.e., of α and β. So the reference window observations become irrelevant here. Then clearly, the null hypothesis is 'simple,' as the distribution of X_0 is completely specified when no target is present, and the alternate hypothesis is 'composite,' as α₀ ∈ (0, α). The usual strategy for this scenario is to pretend that we know the exact value of α₀ (thereby presuming H_1 simple) and derive the Neyman-Pearson (NP) test. Then, if we can find the test statistic and the threshold without utilizing the knowledge of the unknown parameter α₀, the test so derived is optimal and can be used as an upper bound (clairvoyant detector) on detection performance [kay_fundamentals_1998]. Therefore, by the NP lemma, the likelihood ratio test (LRT) compares the ratio f(x_0; α₀, β) / f(x_0; α, β) against a threshold,
and after some simplifications (the likelihood ratio is monotonically increasing in x_0 when α₀ < α) we get the equivalent test: decide H_1 if x_0 > γ.
Even though the threshold γ depends on the unknown parameter α₀, the test statistic x_0 itself is independent of α₀, so we can choose the threshold for any predetermined significance level or probability of false alarm P_fa. Hence,

P_fa = P(X_0 > γ; H_0) = (β/γ)^α,

since it is complementary to the cdf (1), so that

γ = β · P_fa^(−1/α).
As the threshold and the P_fa relation (7) do not involve the unknown parameter α₀, we have the optimal test in the NP sense. Similarly, the probability of detection is given by

P_d = P(X_0 > γ; H_1) = (β/γ)^(α₀),

and is dependent on the unknown α₀. Further, after substituting γ from equation (7) and rearranging, we get

P_d = P_fa^(α₀/α).
Therefore, the ROC curves follow directly from (9) for varied clutter tail index, as shown in Fig. 2. These curves can be used as an upper bound when we relax the assumptions on the knowledge of the clutter parameters.
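The clairvoyant operating point can be reproduced numerically. A minimal sketch, assuming the simplifications above yield the threshold γ = β · P_fa^(−1/α) and the ROC relation P_d = P_fa^(α₀/α) (our reconstruction; the symbols α, α₀ denote the clutter and target tail indices):

```python
import random

def rpareto(alpha, beta, rng):
    # Inverse-CDF Pareto draw: beta * u^(-1/alpha), u in (0, 1].
    return beta * (1.0 - rng.random()) ** (-1.0 / alpha)

rng = random.Random(2)
alpha, alpha0, beta = 4.0, 1.0, 1.0   # clutter vs. (heavier-tailed) target
pfa = 0.01

gamma = beta * pfa ** (-1.0 / alpha)   # NP threshold for the test x0 > gamma
pd_analytic = pfa ** (alpha0 / alpha)  # clairvoyant ROC point

# Monte Carlo: draw the CUT under H1 and count exceedances of gamma.
trials = 200_000
pd_mc = sum(rpareto(alpha0, beta, rng) > gamma for _ in range(trials)) / trials
```

With these values the analytic P_d is 0.01^0.25 ≈ 0.316, and the Monte Carlo estimate should agree to within sampling noise.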
III-B. 'Composite vs. Composite'
When we relax the assumptions on the clutter parameters, the null hypothesis clearly becomes composite, and choosing a threshold for a particular P_fa requires the knowledge of the parameters under the null hypothesis. So, by the GLRT approach, we circumvent the problem by using maximum likelihood (ML) estimates of the unknown parameters. Usually, for any two-sample hypothesis test, H_0: θ ∈ Θ_0 vs. H_1: θ ∈ Θ_1, by GLRT the critical region (i.e., rejecting H_0, or deciding H_1) is given as Λ(x) ≤ η, where

Λ(x) = sup_{θ ∈ Θ_0} L(θ | x) / sup_{θ ∈ Θ} L(θ | x)

is the generalized likelihood ratio (LR), which ranges between zero and one. So, for a particular value of P_fa, we choose η such that the size of the test equals P_fa. Thus, for our two-sample hypothesis test (3), after replacing the unknown parameters with their respective restricted (Θ_0) and whole (Θ) parametric-space ML estimates, the GLRT statistic is given by the ratio of the two maximized likelihoods.
We address problem (3) by the GLRT given above, for each case separately, in the following section.
IV. Solution by GLRT for 'Composite vs. Composite'
Iv-a 1: When is unknown, and is known.
In this case, the null, alternate, and whole parameter spaces are:

Θ_0 = {(α₀, α) : α₀ = α > 0},  Θ_1 = {(α₀, α) : 0 < α₀ < α},  Θ = Θ_0 ∪ Θ_1.
As the lots (X_1, …, X_n), with each X_i ~ Par(α, β) iid, and X_0 ~ Par(α₀, β) are independent, the likelihood L(θ) for the whole parameter space is given as the product of densities, i.e.,

L(θ) = f(x_0; α₀, β) · ∏_{i=1}^{n} f(x_i; α, β),

where θ = (α₀, α) is a column vector of unknown parameters. Here, we absorbed the indicator function for simplicity, as β is known, and the support of each random variable is understood as [β, ∞). As the logarithm is a monotonically increasing function, we consider the logarithm of the likelihood in finding the ML estimates. Therefore, by taking the logarithm on both sides of (14), the log-likelihood function is

ln L(θ) = ln α₀ + α₀ ln β − (α₀ + 1) ln x_0 + n ln α + n α ln β − (α + 1) Σ_{i=1}^{n} ln x_i.
Under Θ_0 (α₀ = α), setting the derivative of the log-likelihood with respect to the common shape to zero, we get a stationary point at

α̂ = (n + 1) / Σ_{i=0}^{n} ln(x_i / β).

Now, by the second derivative test, we see that the concavity is negative. Since we have a single stationary point, the absolute maximum of the log-likelihood is attained at α̂, which we call the ML estimate of α under Θ_0.
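The stationarity argument is easy to exercise numerically. The sketch below assumes the null MLE takes the pooled-sample form α̂ = (n+1)/Σ ln(x_i/β) as reconstructed above, and checks by a local grid comparison that it maximizes the pooled log-likelihood (all data values and helper names are our own):

```python
import math
import random

def rpareto(alpha, beta, rng):
    return beta * (1.0 - rng.random()) ** (-1.0 / alpha)

def loglik(a, xs, beta):
    """Pooled Pareto log-likelihood in the common shape a, with known scale beta."""
    return sum(math.log(a) + a * math.log(beta) - (a + 1.0) * math.log(x) for x in xs)

rng = random.Random(3)
alpha_true, beta, n = 2.5, 1.5, 32
xs = [rpareto(alpha_true, beta, rng) for _ in range(n + 1)]  # CUT + window, pooled under H0

a_hat = (n + 1) / sum(math.log(x / beta) for x in xs)

# The stationary point should beat nearby shape values on the same data
# (the log-likelihood is strictly concave in a, so this max is global).
best_is_mle = all(loglik(a_hat, xs, beta) >= loglik(a, xs, beta)
                  for a in (0.5 * a_hat, 0.9 * a_hat, 1.1 * a_hat, 2.0 * a_hat))
```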
Here, the log-likelihood is a function of two variables, as given in (13). The parameter space Θ_1 is constrained by the relation 0 < α₀ < α. We seek to maximize the log-likelihood under this constraint. It is evident that for α₀ = α, we obtain the same solution as that under Θ_0. Thus, we are left with the case α₀ < α. By setting the first-order partial derivatives to zero,

we get an unrestricted stationary point (α̃₀, α̃), given as

α̃₀ = 1 / ln(x_0 / β),  α̃ = n / Σ_{i=1}^{n} ln(x_i / β).
Next, due to the constrained domain of the parameter space, we analyze the behavior more closely. Firstly, we observe that the obtained value α̃₀ satisfies the stationarity condition and does not depend on α. For α₀ < α̃₀ the likelihood increases, while for α₀ > α̃₀ it decreases. This implies that α̃₀ maximizes the likelihood for any given α. A similar argument holds for α̃, and hence the pair (α̃₀, α̃) is optimal in the unrestricted case. The constraint set then imposes the condition α̃₀ < α̃. If this condition is violated, the likelihood increases for all admissible values of α₀, while the constraint allows α₀ to reach only up to α. Thus, the solution will always be at the boundary, i.e., at α₀ = α. This boundary solution is the same as that found under Θ_0 in (17). Therefore, the MLE is given as:
When α̃₀ ≥ α̃, the LR becomes one, and we always accept H_0. So the critical region is dictated by the case α̃₀ < α̃. Therefore, the LR becomes,
After substituting the MLEs in (21), arriving at (22) is a non-trivial step, and we give the simplification in the appendix. From (22), we can see that the LR depends on the data only through the ratio t = ln(x_0/β) / Σ_{i=1}^{n} ln(x_i/β). So, letting t denote this ratio, we have the LR as a decreasing function of t, since from the critical-region condition α̃₀ < α̃ we have t > 1/n. Therefore, the critical region is equivalent to t > τ, for some threshold τ. So, for a given value of P_fa, we can choose τ from the size condition
Here, ln(X_0/β) is exponentially distributed with rate α, and Σ_{i=1}^{n} ln(X_i/β) is gamma distributed with shape parameter n and scale 1/α. This is because of the transformation of random variables: the logarithm of a scaled Pareto random variable is exponential, and the sum of iid exponential random variables is gamma distributed. Whereby, after simplifying (IV-A2) (given in the appendix), the relation between P_fa and the threshold τ is

P_fa = (1 + τ)^(−n),
which is independent of the unknown α, the shape parameter of the background clutter. So our test statistic has the CFAR property, and the GLRT is: decide H_1 if ln(x_0/β) > τ Σ_{i=1}^{n} ln(x_i/β).
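The CFAR claim can be probed by simulation. A minimal sketch, assuming the test compares y_0 = ln(x_0/β) against τ·Σ y_i with τ = P_fa^(−1/n) − 1 (equivalently P_fa = (1+τ)^(−n), our reconstruction of the threshold relation), using the exponential transformation stated in the text: the realized false-alarm rate should sit at the design value for every clutter shape α.

```python
import random

def pfa_of_test(alpha, n, tau, trials, rng):
    """Empirical false-alarm rate of the test y0 > tau * sum(y_i), where all
    y's are Exp(alpha), i.e., logs of scaled Pareto clutter under H0."""
    hits = 0
    for _ in range(trials):
        y0 = rng.expovariate(alpha)
        s = sum(rng.expovariate(alpha) for _ in range(n))
        hits += y0 > tau * s
    return hits / trials

rng = random.Random(4)
n, pfa_design = 16, 0.01
tau = pfa_design ** (-1.0 / n) - 1.0   # inverts P_fa = (1 + tau)^(-n)

# Same threshold, very different clutter shapes: the rate should not move.
rates = [pfa_of_test(alpha, n, tau, 100_000, rng) for alpha in (0.5, 2.0, 8.0)]
```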
Similarly, for the probability of detection P_d: under H_1, the distribution of ln(X_0/β) becomes exponential with rate α₀, while that of Σ ln(X_i/β) remains gamma. So, following the simplification given in the appendix, we arrive at

P_d = (1 + (α₀/α) τ)^(−n).
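Under the same assumptions, a corresponding P_d sketch: with the CUT log-observation Exp(α₀), α₀ < α, the closed form (1 + (α₀/α)τ)^(−n) implied by the gamma moment-generating-function step (our reconstruction) can be checked by Monte Carlo.

```python
import random

rng = random.Random(5)
n, pfa = 16, 0.01
alpha, alpha0 = 4.0, 1.0              # clutter and (heavier-tailed) target tail indices
tau = pfa ** (-1.0 / n) - 1.0

pd_analytic = (1.0 + (alpha0 / alpha) * tau) ** (-n)

trials, hits = 200_000, 0
for _ in range(trials):
    y0 = rng.expovariate(alpha0)                       # CUT under H1
    s = sum(rng.expovariate(alpha) for _ in range(n))  # clutter window
    hits += y0 > tau * s
pd_mc = hits / trials
```

As expected, P_d exceeds the design P_fa and depends on the tail-index ratio α₀/α.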
IV-B. Case 2: When both α and β are unknown
Here, with the introduction of β as an unknown parameter, the Pareto distribution no longer belongs to the regular exponential family, as the density is given in terms of an indicator function of the parameterized interval [β, ∞). We can see this in the following expressions for the densities of the lots, with each X_i iid, under H_0 and H_1, given as

f_0(x_0, …, x_n) = α^(n+1) β^((n+1)α) (∏_{i=0}^{n} x_i)^(−(α+1)) · 1{β ≤ x_(1)} and

f_1(x_0, …, x_n) = α₀ β^(α₀) x_0^(−(α₀+1)) · α^n β^(nα) (∏_{i=1}^{n} x_i)^(−(α+1)) · 1{β ≤ x_(1)},

respectively. Here, x_(1) = min{x_0, x_1, …, x_n}, and the indicator function is one when β ≤ x_(1), and zero otherwise. Therefore, the likelihood function of the sample lots can be expressed as
Here θ = (α₀, α, β) is a column vector, and the null, alternate, and whole parameter spaces are, respectively:

Θ_0 = {(α₀, α, β) : α₀ = α > 0, β > 0},  Θ_1 = {(α₀, α, β) : 0 < α₀ < α, β > 0},  Θ = Θ_0 ∪ Θ_1.
Here, β is unrestricted in the above parameter spaces (30), and it acts as a nuisance parameter. Clearly, fixing the other variables in (29), we see that the likelihood is a monotonically increasing function of β, and we can deduce from the indicator function that β takes its value from (0, x_(1)]. So the supremum is attained on the boundary, at β̂ = x_(1) (say). Here we do not use a subscript on β̂ to denote the parameter space it belongs to, as it is common to all parameter spaces. Now we look for the MLEs of the remaining unknown parameters, α₀ and α, separately.
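The boundary argument is visible numerically: the pooled likelihood grows in the scale until the indicator cuts it off at the pooled minimum. A small sketch (helper names and values are ours):

```python
import math
import random

def rpareto(alpha, beta, rng):
    return beta * (1.0 - rng.random()) ** (-1.0 / alpha)

def loglik_beta(b, a, xs):
    """Pooled Pareto log-likelihood as a function of the scale b (shape a held fixed)."""
    if b > min(xs):                 # indicator 1{b <= x_(1)} fails: likelihood is zero
        return float("-inf")
    return sum(math.log(a) + a * math.log(b) - (a + 1.0) * math.log(x) for x in xs)

rng = random.Random(6)
a, beta_true = 2.0, 3.0
xs = [rpareto(a, beta_true, rng) for _ in range(21)]  # pooled CUT + window sample

beta_hat = min(xs)                                    # boundary supremum x_(1)
increasing = (loglik_beta(0.5 * beta_hat, a, xs)
              < loglik_beta(0.9 * beta_hat, a, xs)
              < loglik_beta(beta_hat, a, xs))
beyond_is_zero = loglik_beta(1.01 * beta_hat, a, xs) == float("-inf")
```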
Here, we first replace β with β̂ = x_(1) in the log-likelihood (15), just as in the previous case. After this replacement, by the first derivative test, the supremum of the log-likelihood w.r.t. the common shape is attained at

α̂ = (n + 1) / Σ_{i=0}^{n} ln(x_i / x_(1)),

where, by abusing the notation to write x_(1) in lieu of β, we now work with the modified log-ratios ln(x_i / x_(1)). Note that we modified the notation first introduced in Case 1 in order to obtain a similar expression for the test statistic and for ease of simplification.
Incidentally, after replacing β with β̂, with the log-ratios modified as in (32), and following similar steps as in Case 1 under each parameter space, the MLEs of α₀ and α have exactly the same expressions. Further, following the same analysis and the same simplification steps, we arrive at the same expression for the test statistic as in (26). Therefore, with the new modified log-ratios, the critical region is

ln(x_0 / x_(1)) > τ Σ_{i=1}^{n} ln(x_i / x_(1)),

for some threshold τ. Further, considering the numerator ln(x_0 / x_(1)): whenever x_0 = x_(1), the numerator is zero and we accept H_0. So, the critical region need only be considered when x_0 > x_(1), such that the GLRT is now simplified to
Therefore, for a given significance level P_fa, the size condition is
From Theorem 3 of Malik's work [malik_estimation_1970], we have the distribution of the pooled minimum x_(1). Further, in [malik_estimation_1970], he also proved that x_(1) is independent of the sum Σ_i ln(x_i / x_(1)). Also, it is easy to see that this sum is gamma distributed [balakrishnan_primer_2004]. Therefore, the size condition (36) becomes
where the two terms involved are exponentially distributed. Let us denote their difference by D, whose density function can be derived as
Therefore, on further simplifying the size-condition expression in the appendix, we obtain the relation between P_fa and the threshold τ, which is independent of both the unknown parameters, α and β, of the background clutter. So our test statistic in (35) has the CFAR property.
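This scale-free CFAR behavior can be probed by simulation without the closed form. The sketch below assumes the simplified adaptive statistic compares ln(x_0/x_(1)) against τ·Σ ln(x_i/x_(1)) (our reconstruction): the threshold is calibrated empirically at one clutter parameter pair and then tested at a very different pair.

```python
import math
import random

def rpareto(alpha, beta, rng):
    return beta * (1.0 - rng.random()) ** (-1.0 / alpha)

def statistic(x0, xs):
    """Assumed adaptive GLRT statistic: ln(x0/m) / sum ln(xi/m), m = pooled minimum."""
    m = min([x0] + xs)
    denom = sum(math.log(x / m) for x in xs)
    return math.log(x0 / m) / denom if denom > 0 else 0.0

def h0_stats(alpha, beta, n, trials, rng):
    # All cells (CUT included) drawn from the same clutter law under H0.
    return [statistic(rpareto(alpha, beta, rng),
                      [rpareto(alpha, beta, rng) for _ in range(n)])
            for _ in range(trials)]

rng = random.Random(7)
n, pfa, trials = 16, 0.05, 100_000

# Calibrate the threshold at one clutter parameter pair ...
s = sorted(h0_stats(1.0, 1.0, n, trials, rng))
tau = s[int((1.0 - pfa) * trials)]

# ... then measure the realized false-alarm rate at a very different pair.
realized = sum(t > tau for t in h0_stats(5.0, 10.0, n, trials, rng)) / trials
```

Because both the scale and the rate of the transformed exponentials cancel in the ratio, the realized rate should match the design P_fa regardless of (α, β).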
For evaluating P_d: under H_1, the pdf of X_0 gets modified to Par(α₀, β), while that of the reference observations remains Par(α, β). So, by following similar steps as in Case 1, we arrive at the P_d expression. Thus, for both Cases 1 and 2, the P_fa expression of the derived test statistic contains no unknown parameters, but, as expected, the P_d expression contains both the clutter and target model parameters. In other words, detection performance depends on the relative measure of the target tail index to the clutter tail index. In the next section, we validate our expressions with Monte Carlo simulations.
V. Simulation Results
In this section, we validate the theoretical results derived in the previous sections with extensive Monte Carlo simulations for each case separately. We also comment on the performance of the proposed detectors with respect to the number of reference window cell observations.
For Case 1, we validate the CFAR property of the test (26) by plotting P_fa against the unknown clutter parameter α varied over a range. In Fig. 2(a), we consider three design levels of P_fa, and clearly, the simulated P_fa remains constant across the range of α, thus validating CFAR. With more Monte Carlo runs, one gets an even flatter, more constant P_fa. Further, we also validate the expressions (25) and (27) against Monte Carlo simulations by plotting ROC curves in Fig. 2(b). We varied the model parameters over a range of values, and clearly, the ROC curves depict that the theoretical and simulation results are in good accord.
For Case 2, we validate the CFAR property of the test (35) by plotting P_fa while varying both of the unknown clutter parameters, as shown in Fig. 3(a). Here we consider two design levels of P_fa, and clearly, the simulated P_fa remains constant across the range of (α, β), thus validating CFAR. With more Monte Carlo runs, one gets an even more precise, constant P_fa. Further, we also validate the expressions (41) and (42) with Monte Carlo simulations by plotting ROC curves in Fig. 3(b). We varied the model parameters over a range of values, and clearly, the ROC curves depict that the theoretical and simulation results match closely.
In reality, we mainly face composite vs. composite hypothesis testing, as the assumption of accurate knowledge of the clutter parameters is impractical. The popular strategy is to estimate the unknown parameters and then apply the NP lemma, or simply to substitute the MLEs of the unknown parameters into the LR, as in the GLRT. As the estimates come from limited observations, the GLRT is sub-optimal, since one cannot attain the true values of the unknown parameters, as observed in Fig. 4(a). On the other hand, when there are more observations from the reference window cells, the estimates are better, and the detection performance approaches the upper bound, as depicted in Fig. 4(b). However, we cannot increase the reference window size indefinitely, as the very objective of adaptive thresholding, tracking changes in the homogeneous clutter, is then not met [barton_radar_2004]. In other words, such a CFAR detector would not capture the varying trends in the homogeneous clutter. In such scenarios, the CFAR detector's performance loss can be analyzed using the clairvoyant upper bounds given in Fig. 2 for different parameters.
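A numerical illustration of this window-size trade-off, using the Case 1 closed forms assumed earlier (τ = P_fa^(−1/n) − 1 and P_d = (1 + (α₀/α)τ)^(−n), our reconstructions): as n grows, the GLRT's P_d climbs monotonically toward, but never exceeds, the clairvoyant bound P_fa^(α₀/α).

```python
pfa, ratio = 0.01, 0.25          # ratio = alpha0 / alpha (target vs. clutter tail index)

def pd_glrt(n):
    """Assumed Case 1 detection probability for an n-cell reference window."""
    tau = pfa ** (-1.0 / n) - 1.0
    return (1.0 + ratio * tau) ** (-n)

pd_bound = pfa ** ratio                      # clairvoyant (known-parameter) bound
pds = [pd_glrt(n) for n in (4, 16, 64, 256)]
```

For these values P_d rises from roughly 0.18 at n = 4 to within a fraction of a percent of the bound (about 0.316) by n = 256, mirroring the Fig. 4 behavior described above.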
VI. Conclusion

In this paper, we formulated the aircraft detection problem as a two-sample Pareto vs. Pareto composite hypothesis test for comparing tail indices, with the scale as a nuisance parameter. We then solved it systematically by the GLRT approach for the following cases:
(a) when α is unknown and β is known;
(b) when both α and β are unknown.
In both cases, we derived the test statistic and arrived at the P_fa and P_d expressions. We then validated our results with extensive Monte Carlo simulations. We further showed that the test so obtained has the CFAR property, which we also validated via simulations. In the future, we plan to study the robustness of the proposed detectors under different realistic scenarios, such as in the presence of clutter edges and interfering targets.

Appendix

Simplification of equation (21) follows from (43),
where the equalities are justified as follows: the observations are independent random variables, so their joint density factors into a product; we then take the expectation with respect to the remaining random variable; apply the complementary cdf formula; and finally apply the moment generating function formula for the gamma random variable, followed by simplification.
For equation (27), following similar lines of justification but under H_1, the simplification steps are:

where the equalities are justified as follows: the observations are independent random variables; we substitute the density function under H_1 for the CUT in (39); take the expectation with respect to the remaining random variable; and apply the moment generating function formula for the gamma random variable, followed by simplification.