Adaptive Detection of Structured Signals in Low-Rank Interference

08/16/2018 · Evan Byrne et al. · The Ohio State University

In this paper, we consider the problem of detecting the presence (or absence) of an unknown but structured signal from the space-time outputs of an array under strong, non-white interference. Our motivation is the detection of a communication signal in jamming, where often the training portion is known but the data portion is not. We assume that the measurements are corrupted by additive white Gaussian noise of unknown variance and a few strong interferers, whose number, powers, and array responses are unknown. We also assume the desired signal's array response is unknown. To address the detection problem, we propose several GLRT-based detection schemes that employ a probabilistic signal model and use the EM algorithm for likelihood maximization. Numerical experiments are presented to assess the performance of the proposed schemes.


I Introduction

I-A Problem Statement

Consider the problem of detecting the presence or absence of a signal from the measured output of a multi-element antenna array. We are interested in the case where the signal is unknown but structured. A motivating example arises with communications signals, where typically a few “training” symbols are known and the remaining “data” symbols are unknown, apart from their alphabet. We will assume that the signal’s array response is completely unknown but constant over the measurement epoch and signal bandwidth. The complete lack of knowledge about the array response is appropriate when the array manifold is unknown or uncalibrated (e.g., see the discussion in [1]), or when the signal is observed in a dense multipath environment (e.g., [2]). Also, we will assume that the measurements are corrupted by white noise of unknown variance and possibly strong interferers. The interference statistics are assumed to be unknown, as is the number of interferers.

The signal-detection problem can be formulated as a binary hypothesis test [3] between the signal-present and signal-absent hypotheses, i.e.,

(1a)
(1b)

In (1), the additive terms represent the noise and the interference. We model the noise as white Gaussian noise (WGN), i.e., with i.i.d. zero-mean circularly symmetric complex Gaussian entries, of unknown variance. If the array responses of the interferers are constant over the measurement epoch and bandwidth, then the rank of the interference will be at most the number of interferers. As will be discussed in the sequel, we will sometimes (but not always) model the temporal interference component as white and Gaussian.

Communications signals often take a form like

(2)

where the first portion is a known training sequence and the second is an unknown data sequence drawn from a finite alphabet. Suppose that the measurements are partitioned into training and data portions, conformal with (2). For the purpose of signal detection or synchronization, the data measurements are often ignored (see, e.g., [2]). But these data measurements can be very useful, especially when the training symbols (and thus the training measurements) are few. Our goal is to develop detection schemes that use all measurements while handling the incomplete knowledge of the data symbols in a principled manner.

We propose to model the signal structure probabilistically. That is, we treat the signal as a random vector with a prior pdf, statistically independent of the array response, the interference, and the noise. Although the general methodology we propose supports an arbitrary prior, we sometimes focus (for simplicity) on the case of statistically independent components, i.e.,

(3)

For example, with uncoded communication signals partitioned as in (2), we would use (3) with

(4)

where δ denotes the Dirac delta, the training symbols are known, and each data symbol is drawn from a finite-cardinality alphabet. For coded communications signals, the independent prior (3) would still be appropriate if a “turbo equalization” [4] approach were used, where symbol estimation is iterated with soft-input soft-output decoding. A variation of (2) that avoids the need to know the alphabet follows from modeling the data symbols as i.i.d. Gaussian. In practical communications scenarios, imperfect time and frequency synchronization leads to mismatch in the assumed model (3)-(4). In Sec. V, we discuss synchronization mismatch and investigate its effect in numerical experiments.
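To make the training/data prior of (3)-(4) concrete, the following is a minimal sketch (Python/NumPy) of per-symbol prior mass functions for a hypothetical QPSK-style signal: point masses on the known training symbols and uniform masses over the alphabet for the unknown data symbols. All names and sizes are illustrative, not the paper's notation.

import numpy as np

# Hypothetical example: unit-energy QPSK alphabet, a few known training
# symbols, and unknown data symbols with uniform prior over the alphabet.
alphabet = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
training = np.array([alphabet[0], alphabet[2], alphabet[1]])   # known symbols
num_data = 5                                                   # unknown symbols

# Per-symbol prior: rows index symbols, columns index alphabet points.
# Training symbols get a point mass; data symbols get a uniform prior.
prior = np.zeros((len(training) + num_data, len(alphabet)))
for t, s in enumerate(training):
    prior[t, np.argmin(np.abs(alphabet - s))] = 1.0
prior[len(training):, :] = 1.0 / len(alphabet)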

The proposed probabilistic framework is quite general. For example, in addition to training/data structures of the form in (2), the independent model (3) covers superimposed training [5], bit-level training [6], constant-envelope waveforms [1], and pulsed signals [1]. To exploit sinusoidal signal models, or signals with known spectral characteristics (see, e.g., [1]), the independent model (3) would be discarded in favor of a more appropriate prior. There is an excellent description of most of these topics in [1], and we refer readers to that source for more details.

I-B Prior Work

For the case where the entire signal is known, the detection problem (1) has been studied in detail. For example, in the classical work of Kelly [7, 8], the interference-plus-noise was modeled as temporally white and Gaussian (i.e., the columns are i.i.d. circularly symmetric complex Gaussian random vectors with zero mean and a generic covariance matrix) with unknown (and unstructured) spatial covariance, and the generalized likelihood ratio test (GLRT) [3] was derived. Detector performance can be improved when the interference is known to have low rank. For example, Gerlach and Steiner [9] assumed temporally white Gaussian interference with known noise variance and unknown interference rank and derived the GLRT. More recently, Kang, Monga, and Rangaswamy [10] assumed temporally white Gaussian interference with unknown noise variance and known interference rank and derived the GLRT. Other covariance structures were considered by Aubry et al. in [11]. In a departure from the above methods, McWhorter [12] proposed to treat the interference components, as well as the noise variance, as deterministic unknowns, and derived the corresponding GLRT. Note that McWhorter’s approach implicitly assumes knowledge of the interference rank. Bandiera et al. [13] proposed yet a different approach, based on a Bayesian perspective.

For adaptive detection of unknown but structured signals , we are aware of relatively little prior work. Forsythe [1, p.110] describes an iterative scheme for signals with deterministic (e.g., finite-alphabet, constant envelope) structure that builds on Kelly’s GLRT. Each iteration involves maximum-likelihood (ML) signal estimation and least-squares beamforming, based on the intuition that correct decisions will lead to better beamformers and thus better interference suppression. Error propagation remains a serious issue, however, as we will demonstrate in the sequel.

I-C Contributions

We propose three GLRT-based schemes for adaptive detection of unknown structured signals with unknown array response, additive WGN of unknown variance, and interference of possibly low rank. All of our schemes use a probabilistic signal model, under which the direct evaluation of the GLRT numerator becomes intractable. To circumvent this intractability, we use expectation maximization (EM) [14]. In particular, we derive computationally efficient EM procedures for the independent prior (3), paying special attention to finite-alphabet and Gaussian cases.

Our first approach treats the interference as temporally white and Gaussian, and it makes no attempt to leverage low interference rank, similar to Kelly’s approach [7]. A full-rank interference model would be appropriate if, say, the interferers’ array responses varied significantly over the measurement epoch. We show that our first approach is a variation on Forsythe’s iterative scheme [1, p.110] that uses “soft” symbol estimation and “soft” signal subtraction, making it much more robust to error propagation.

Our second approach is an extension of our first that aims to exploit the possibly low-rank nature of the interference. As in [9, 10, 11], the interference is modeled as temporally white Gaussian but, different from [9, 10, 11], both the interference rank and the noise variance are unknown. More significantly, unlike [9, 10, 11], the signal is assumed to be unknown.

Our third approach also aims to exploit low-rank interference, but it does so while modeling the interference as deterministic, as in McWhorter [12]. Unlike [12], however, the interference rank and the signal are assumed to be unknown. Numerical experiments are presented to demonstrate the efficacy of our three approaches.

II Background

We first provide some background that will be used in developing the proposed methods. In our discussions below, we will write the orthogonal projection onto the column space of a given matrix as

(5)

and the corresponding orthogonal complement as the identity minus this projection. Recall that both projections are Hermitian and idempotent.
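As a quick numerical illustration of (5) and the properties just mentioned, here is a sketch with an arbitrary test matrix; the variable names are ours.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))

# Orthogonal projection onto the column space of A, and its complement.
P = A @ np.linalg.solve(A.conj().T @ A, A.conj().T)
P_perp = np.eye(A.shape[0]) - P

# Both are Hermitian and idempotent (up to numerical precision),
# and the complement annihilates the columns of A.
assert np.allclose(P, P.conj().T) and np.allclose(P @ P, P)
assert np.allclose(P_perp @ A, 0)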

II-A Full-rank Gaussian Interference

The classical work of Kelly [7, 8] tackled the binary hypothesis test (1) by treating the interference-plus-noise as temporally white and Gaussian with unknown spatial covariance matrix . This reduces (1) to

(6a)
(6b)

where vec(·) denotes the vector formed by concatenating all columns of its matrix argument, the distribution is the circularly symmetric multivariate complex Gaussian with the indicated mean vector and covariance matrix, and ⊗ denotes the Kronecker product. We note that the covariance structure in (6) corresponds to temporal whiteness across time samples and spatial correlation with a common spatial covariance matrix. With the signal known, the GLRT [3] takes the form

(7)

for some threshold . Using results from [15], it was shown in [7] that (7) reduces to

(8)

for the decreasingly ordered eigenvalues

(9a)
(9b)

Kelly’s approach was applied to the detection/synchronization of communications signals by Bliss and Parker in [2] after discarding the measurements corresponding to the unknown data symbols .

When the number of snapshots is small relative to the array dimension, some eigenvalues will be zero-valued and so the test (8) is not directly applicable. One can imagine many strategies to circumvent this problem (e.g., restricting the test to the positive eigenvalues, or computing eigenvalues from a regularized sample covariance with a small amount of diagonal loading) that can be considered as departures from Kelly’s approach. In the sequel, we describe approaches that use a low-rank-plus-identity covariance, as would be appropriate when the interferers are few.
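As an aside, the regularization strategy just mentioned can be sketched as follows; the function name, the diagonal-loading form, and the weight eps are our illustrative choices, not necessarily the exact form alluded to in the text.

import numpy as np

def regularized_eigvals(Y, eps=1e-3):
    """Decreasing eigenvalues of a diagonally loaded sample covariance.

    Y   : p x n complex measurement matrix (p antennas, n snapshots)
    eps : small loading weight; keeps eigenvalues positive when n < p
    """
    p, n = Y.shape
    R = (Y @ Y.conj().T) / n                           # sample spatial covariance
    R = R + eps * (np.trace(R).real / p) * np.eye(p)   # diagonal loading
    return np.sort(np.linalg.eigvalsh(R))[::-1]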

II-B Low-rank Gaussian Interference

The low-rank property of the interference can be exploited to improve detector performance. Some of the first work in this direction was published by Gerlach and Steiner in [9]. They assumed known noise variance and temporally white Gaussian interference, so that the spatial covariance equals the known noise variance times the identity plus an unknown low-rank interference component. The GLRT was then posed under a constraint on the interference rank:

(10)

They showed that the GLRT (10) reduces to one of the form (8), but with thresholded eigenvalues .
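The exact thresholding rule is not reproduced above, but one reading consistent with the covariance model (known noise variance times the identity plus a positive semidefinite low-rank term, under which no covariance eigenvalue can fall below the noise floor) is a simple clipping of the sample eigenvalues. The following sketch, with argument names of our choosing, illustrates that reading; see [9] for the exact rule.

import numpy as np

def threshold_eigvals(sample_eigvals, noise_var):
    """Clip sample-covariance eigenvalues at the known noise floor.

    Under R = noise_var * I + (PSD low-rank term), no eigenvalue of R can
    fall below noise_var, so estimates below it are raised to it.
    """
    return np.maximum(np.asarray(sample_eigvals, dtype=float), noise_var)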

More recently, Kang, Monga, and Rangaswamy [10] proposed a variation on Gerlach and Steiner’s approach [9] where the noise variance is unknown but the interference rank is known. In particular, they proposed the GLRT

(11)

where

(12)

Using a classical result from [16], it can be shown that the GLRT (11) simplifies to

(13)

with a smoothed version of the eigenvalues from (9):

(14)
(15)
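Anticipating the description in Sec. III-A (the leading eigenvalues are retained and the trailing ones are replaced by their average, which also serves as the noise-variance estimate), a minimal sketch of this kind of eigenvalue smoothing is given below; the argument names are ours.

import numpy as np

def smooth_eigvals(eigvals, r):
    """Smooth decreasing-ordered eigenvalues for a rank-r-plus-identity model.

    The r largest eigenvalues are retained; the remaining ones are replaced
    by their average, which also serves as the noise-variance estimate.
    Assumes r is strictly less than the number of eigenvalues.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    noise_var = lam[r:].mean()            # average of the trailing eigenvalues
    lam_smoothed = lam.copy()
    lam_smoothed[r:] = noise_var
    return lam_smoothed, noise_var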

II-C Low-rank Deterministic Interference

The approaches discussed above all model the interference as temporally white Gaussian. McWhorter [12] instead proposed to treat the interference components as deterministic unknowns, yielding the GLRT

(16)

where the interference rank is implicitly known. It was shown in [12] that the GLRT (16) simplifies to

(17)

using the eigenvalues defined in (9). Comparing (17) to (13), we see that both GLRTs involve noise variance estimates computed by averaging the smallest eigenvalues. However, (17) discards the largest eigenvalues, whereas (13) uses them in the test.

III GLRTs via White Gaussian Interference

We now consider adaptive detection via the binary hypothesis test (1) with an unknown structured signal. As described earlier, our approach is to model the signal as a random vector with a prior density.

Our first approach treats the interference in (1) as temporally white and Gaussian, as in [7, 9, 10, 11]. In this case, the interference-plus-noise matrix

(18)

is temporally white Gaussian with an unknown spatial covariance matrix. For now, we will model this covariance using a fixed and known interference rank. The full-rank case is reminiscent of Kelly [7], and the low-rank case is reminiscent of Kang, Monga, and Rangaswamy [10]. The estimation of the interference rank will be discussed in Sec. III-G.

For a fixed rank , the hypothesis test (1) reduces to

(19a)
(19b)

where the unknown parameters include the structured covariance defined in (12). In the full-rank case, this structured covariance reduces to an unstructured one. The corresponding GLRT is

(20)

Because the signal is unknown here, the numerator likelihood in (20) differs from that in (11), as detailed in the sequel.

III-A GLRT Denominator

For the denominator of (20), equations (19b) and (12) imply

(21)
(22)

We first find the ML estimate of the covariance under the signal-absent hypothesis. In the low-rank case, the results in [16] (see also [10]) imply that

(23)

where the eigenvalues follow the definition in (14). That is, the ML estimate uses a smoothed version of the eigenvalues of the sample covariance matrix in decreasing order, where the smoothing averages the smallest eigenvalues to form the noise variance estimate, as in (15). In the full-rank case, the results in [15] (see also [7]) imply that the ML estimate is simply the sample covariance. In either case, the eigenvectors of the estimate are the corresponding eigenvectors of the sample covariance matrix. Plugging (23) into (22), taking the log, and rearranging gives

(24)
(25)
(26)

Since , we have

(27)
(28)
(29)

In the low-rank case, note that this quantity can be computed using only the principal eigenvalues of the sample covariance matrix, since

(30)

III-B GLRT Numerator

For the numerator of (20), the signal prior and (19a) imply

(32)

Exact maximization of this likelihood over the unknown parameters appears to be intractable. We thus propose to approximate the maximization by applying EM [14] with the signal as hidden data. This implies that we iterate the following over the EM iteration index:

The EM algorithm is guaranteed to converge to a local maximum or saddle point of the likelihood in (32) [17]. Furthermore, at each iteration, the EM-approximated log-likelihood increases and lower bounds the true log-likelihood [18].

Because the signal is statistically independent of the other unknowns, its prior does not depend on them, which allows us to rewrite the EM update as

(35)

We first perform the minimization in (35) over . Since

(36)

the gradient of the cost in (35) w.r.t.  equals

(37)

and this gradient is set to zero by

(38)

which uses the notation

(39)
(40)

Plugging this solution into (35), we obtain the cost that must be minimized over the remaining unknowns:

(41)
(42)

where

(43)
(44)
(45)
(46)

Note that this matrix is a regularized version of the projection matrix and equals it when the signal is completely known. In general, however, it is not a projection matrix. Minimizing (42) over the remaining unknowns is equivalent to maximizing

(47)

As with (22), in the low-rank case, the results in [16] imply

(48)
(49)
(50)
(51)

where the eigenvalues are those of the matrix above in decreasing order, and the corresponding eigenvectors form the columns of the estimate. In the full-rank case, no eigenvalue smoothing is applied.

We have thus derived the EM procedure that iteratively lower bounds [18] the numerator of (20) under a generic signal prior .

III-C EM Update under an Independent Prior

The EM updates in (39)-(40) compute the conditional mean (or, equivalently, the MMSE estimate [3]) of the symbols and their second moments, respectively, given the measurements in (19) under the current parameter estimates. For any independent prior, as in (3), we can MMSE-estimate the symbols one at a time from the measurement equation

(52)

From this equation, we obtain a sufficient statistic [3] for the estimation of each symbol by spatially whitening the measurements via

(53)

and then matched filtering via

(54)

where

(55)

We find it more convenient to work with the normalized and conjugated statistic

(56)

which is a Gaussian-noise-corrupted version of the true symbol, with a corresponding noise precision.
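Under the per-snapshot model implied by (52)-(56) (a scalar symbol scaling an array response, plus colored Gaussian noise), the whiten, matched-filter, and normalize chain can be sketched as follows; the variable names and the conjugation convention are ours, not the paper's.

import numpy as np

def symbol_statistic(y, a, C):
    """Scalar statistic for one snapshot y = a*s + noise with noise cov C:
    whiten, matched-filter, and normalize so that r = s + Gaussian noise
    whose precision is nu (illustrative sketch)."""
    Cinv_y = np.linalg.solve(C, y)
    Cinv_a = np.linalg.solve(C, a)
    nu = np.real(a.conj() @ Cinv_a)       # noise precision of the statistic
    r = (a.conj() @ Cinv_y) / nu          # normalized matched-filter output
    return r, nu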

The computation of the MMSE estimate from this statistic depends on the prior. For a Gaussian prior, we have the posterior mean and variance [3]

(57)
(58)

which from (40) implies

(59)
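For instance, with a circular Gaussian prior, the posterior moments of a symbol observed through the statistic above take the familiar conjugate-Gaussian form. A sketch in our notation, assuming the prior mean and variance are given:

def mmse_gaussian(r, nu, prior_mean=0.0, prior_var=1.0):
    """Posterior mean/variance of a circular complex Gaussian symbol
    observed as r = s + noise, where the noise has precision nu."""
    post_var = 1.0 / (nu + 1.0 / prior_var)
    post_mean = post_var * (nu * r + prior_mean / prior_var)
    second_moment = abs(post_mean) ** 2 + post_var   # used by the EM update
    return post_mean, post_var, second_moment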

For a discrete prior with a finite alphabet and prior symbol probabilities (summing to one), it is straightforward to show that the posterior density is

(60)
(61)

and thus the posterior mean and second moment are

(62)
(63)

which from (40) implies

(64)
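The finite-alphabet case amounts to a softmax over alphabet points, weighting each point's prior probability by the Gaussian likelihood of the statistic. A sketch in our notation:

import numpy as np

def mmse_discrete(r, nu, alphabet, probs=None):
    """Posterior mean and second moment of a finite-alphabet symbol observed
    as r = s + noise (complex Gaussian, precision nu)."""
    alphabet = np.asarray(alphabet)
    if probs is None:
        probs = np.full(len(alphabet), 1.0 / len(alphabet))
    # Unnormalized log-posterior over alphabet points (likelihood times prior).
    logw = -nu * np.abs(r - alphabet) ** 2 + np.log(probs)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    post_mean = np.sum(w * alphabet)
    post_second_moment = np.sum(w * np.abs(alphabet) ** 2)
    return post_mean, post_second_moment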

This EM update procedure is summarized in Alg. 1.

0:   Data , signal prior .
  Initialize and (see Sec. III-H)
  repeat
      
      
      Estimate interference rank (see Sec. III-G).
      if  then
          
          
      else if  then
          
      else
          
          
          
      end if
      
      
      
      
  until Terminated
Algorithm 1 EM update under white Gaussian interference
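Since the assignments inside Alg. 1 did not survive extraction above, we do not attempt to reproduce them. As a purely structural illustration of the ideas in Secs. III-A to III-C (soft symbol estimation via whitening and matched filtering, alternating with re-estimation of the array response and spatial covariance, i.e., "soft subtraction"), the following simplified sketch handles only the full-rank-covariance case, with our own notation and initialization; it is not the paper's Alg. 1.

import numpy as np

def soft_symbols(r, nu, alphabet, probs):
    """Per-symbol posterior means and second moments for finite-alphabet
    symbols observed as r[i] = s[i] + noise, noise precision nu[i]."""
    logw = (-nu[:, None] * np.abs(r[:, None] - alphabet[None, :]) ** 2
            + np.log(probs)[None, :])
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ alphabet, w @ np.abs(alphabet) ** 2

def em_full_rank(Y, alphabet, probs, num_iters=30):
    """Structural EM sketch for Y = a s^T + noise with an unstructured
    (full-rank) spatial noise covariance C; NOT the exact Alg. 1."""
    p, n = Y.shape
    C = (Y @ Y.conj().T) / n + 1e-6 * np.eye(p)   # initial spatial covariance
    a = np.linalg.eigh(C)[1][:, -1]               # crude init: top eigenvector
    for _ in range(num_iters):
        # E-step: whiten + matched filter, then per-symbol posterior
        # means m and second moments e under the symbol prior.
        Cinv_Y = np.linalg.solve(C, Y)
        Cinv_a = np.linalg.solve(C, a)
        nu = np.real(a.conj() @ Cinv_a) * np.ones(n)
        r = (a.conj() @ Cinv_Y) / nu
        m, e = soft_symbols(r, nu, np.asarray(alphabet), np.asarray(probs))
        # M-step: re-estimate a and C from the soft symbol estimates
        # ("soft subtraction" of the signal before covariance estimation).
        Ym = Y @ m.conj()
        a = Ym / e.sum()
        C = (Y @ Y.conj().T - np.outer(Ym, a.conj())
             - np.outer(a, Ym.conj()) + e.sum() * np.outer(a, a.conj())) / n
        C = 0.5 * (C + C.conj().T)                # enforce Hermitian symmetry
        C += 1e-9 * np.trace(C).real / p * np.eye(p)  # numerical safety only
    return a, C, m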

III-D Fast Implementation of Algorithm 1

The implementation complexity of Alg. 1 is dominated by the eigenvalue decomposition in line 12, which is performed at every EM iteration. We now describe how the complexity of this step can be reduced. Recall that

(65)

as described after (23). Thus in line 4 takes the form

(66)
(67)

using the definition

(68)

The key idea is that the eigen-decomposition of