Asymptotically Optimal Sampling Policy for Quickest Change Detection with Observation-Switching Cost

12/25/2019, by Tze Siong Lau et al., Nanyang Technological University

We consider the problem of quickest change detection (QCD) in a signal whose observations are obtained using a set of actions, where switching from one action to another incurs a cost. The objective is to design a stopping rule consisting of a sampling policy, which determines the sequence of actions used to observe the signal, and a stopping time that quickly detects the change, subject to a constraint on the average observation-switching cost. We propose a sampling policy of finite window size and a generalized likelihood ratio (GLR) Cumulative Sum (CuSum) stopping time for the QCD problem. We show that the GLR CuSum stopping time is asymptotically optimal with a properly designed sampling policy, and we formulate the design of this sampling policy as a quadratic programming problem. We prove that it is sufficient to consider policies of window size not more than one when designing policies of finite window size, and we propose several algorithms that solve this optimization problem with theoretical guarantees. Finally, we apply our approach to the problem of QCD of a partially observed graph signal.


I Introduction

Quickest change detection (QCD) is the problem of detecting an abrupt change in a system while keeping the detection delay to a minimum. In the usual setting, a sequence of i.i.d. observations following a known pre-change distribution is obtained up to an unknown change point, after which the observations are i.i.d. with a different post-change distribution. The objective is to detect the change as quickly as possible while maintaining false alarm constraints [tartakovsky2014sequential, poor2009quickest, veeravalli2013quickest]. QCD has applications across diverse fields, including quality control [woodall2004using, lai1995sequential, lau17, LauTay:J19], fraud detection [bolton02], cognitive radio [lai2008quickest, ZhaTayLi:J16], network surveillance [akoglu10, sequeira02, tartakovsky2006novel, tartakovsky2014rapid], structural health monitoring [sohn00], spam detection [xie12, JiTayVar:J17, TanJiTay:J18], bioinformatics [muggeo10], power system line outage detection [banerjee2014power], and sensor networks [coppin1996digital, Hong2004, YanZhoTay:J18, lau19].

In many applications, the signal of interest may be high dimensional. For example, it may consist of observations from many correlated sensors. Due to the large number of sensors in the network, bandwidth and power constraints prevent us from observing the entire network at any time instant, and we may only obtain sensor readings from a small subset of sensors at each time instant [LiuTayLiu:J20, JiTay:J19]. While it may seem optimal to observe the maximum number of sensors allowed by the network, such a sampling policy may not be feasible due to power and communication bandwidth considerations. Furthermore, switching from one subset of sensors to another also incurs power and communication costs. In this paper, we consider both of these costs collectively as the observation-switching cost, and we study the QCD problem under a constraint on the average observation-switching cost (AOSC). To be more general, we consider the case where the signal can only be observed using an action selected from a set of permissible actions, with observation-switching costs associated with the sequence of actions chosen. We assume that the pre- and post-change distributions, as well as their conditional distributions given the actions, are known to the observer. The objective is to design a sampling policy together with a stopping time that satisfies both the QCD false alarm and AOSC requirements. To solve this problem, we propose a sampling policy that determines which action to perform based on the previous actions, and a generalized likelihood ratio (GLR) Cumulative Sum (CuSum) stopping time for the QCD problem. We show that the GLR CuSum stopping time is asymptotically optimal with a properly designed sampling policy and formulate the design of the sampling policy as a quadratic programming problem.

I-a Related Work

Existing works in QCD where the signal is not entirely available to the decision maker or the fusion center fall into three main categories. In the first category, the papers [veeravalli1993decentralized, veeravalli2001decentralized, jiang2016distributed, hadjiliadis2009one] consider the problem of distributed or decentralized QCD, where each node observes and processes its signal locally, with some memory of its previous messages, before sending a message to the fusion center. The authors of [veeravalli2001decentralized] consider the problem where each sensor only has access to the local information at that node, processes the signal, and sends a quantized message to the fusion center for further processing.

The second category of papers [banerjee15, banerjee2012data, banerjee2013data, banerjee2013decentralized] considers a QCD problem where there is an additional cost for sensing at each node, and a control policy determines whether a given observation is taken. In [banerjee15], the authors developed a data-efficient scheme that allows for optional on-off sampling of the observations in the case where either the post-change family of distributions is finite, or both the pre- and post-change distributions belong to a one-parameter exponential family.

In the third category, the papers [xie2015sketching, atia2015change, heydari2017] consider QCD where the observer only has access to compressed or incomplete measurements. The authors of [xie2015sketching] study the problem of sequential change point detection where a randomly generated linear projection is used to reduce the dimension of a high-dimensional signal for the purpose of QCD. In [heydari2017], the authors consider QCD with closed-loop control of the actions, where the nodes to observe at the current time are determined by the maximum likelihood estimate of the post-change hypothesis. In [lau2017optimal], we discussed the QCD problem where the observer is only able to obtain a partial observation of the signal through an action, with open-loop control of the actions.

Unlike the papers mentioned above, in this paper we provide a general framework by considering random decision rules to select the current actions. We also do not assign a fixed cost to each observation. Instead, we adopt a more general approach in which a set of permissible actions models the practical sampling constraints, and a cost associated with switching actions models the observation-switching costs. We consider the case where the decision maker is given a finite set of pre-defined actions, and the observed signal is a function of the action and the full signal. We also do not make any assumptions about the pre- and post-change distributions.

I-B Our Contributions

To the best of our knowledge, there is no existing work that considers the QCD problem while taking observation-switching costs into account. In this paper, we consider the QCD problem while maintaining an AOSC constraint. The objective is to design a sampling policy together with a stopping time that satisfies both the QCD false alarm average run length (ARL) and AOSC requirements. Our main contributions are as follows:

  1. We formulate the QCD problem with an AOSC constraint and a sampling policy of a given window size, where the current action is allowed to depend only on the previous actions within that window. We show that the GLR CuSum stopping time together with a properly designed sampling policy is asymptotically optimal. We formulate the design of the sampling policy of a given window size as a quadratic programming problem with an additional combinatorial constraint.

  2. We derive expressions for the AOSC and the asymptotic worst-case average detection delay (WADD) for the GLR CuSum stopping time together with a sampling policy of a given window size. Using these expressions, we prove that when designing policies of finite window size, it suffices to consider sampling policies of window size not more than one.

  3. For the case when the window size of the sampling policy is zero, we provide derivations to transform the policy design problem into a linear programming problem for the special case where all observation-switching costs are equal, and an iterative rank minimization (IRM) algorithm to obtain a locally optimal solution for the general case.

  4. For the case when the window size of the sampling policy is one, we provide relaxations and derivations to transform the policy design problem into a convex programming problem.

The rest of this paper is organized as follows. In Section II, we present our signal model and problem formulation. In Section III, we present properties of the AOSC and the asymptotic ARL-WADD trade-off. In Section IV, we present the GLR CuSum stopping time and formulate the design of the sampling policy as a quadratic programming problem. We present algorithms to solve the policy design problems in Section V. Numerical results are presented in Section VI. We conclude in Section VII.

Notations: The operator denotes mathematical expectation with respect to the probability density function (pdf) , and means that the random variable has distribution with pdf . If the change point is at and the post-change distribution is , we let and be the probability measure and mathematical expectation, respectively. The Gaussian distribution with mean and covariance is denoted as . Almost-sure convergence under the probability measure is denoted as . We use as the indicator function of the set , and to denote the set of positive integers. We use to denote the set of real numbers. We also use the notation to denote the sequence . For , we use the notation to denote its -th entry. For a probability transition matrix , we use the notation to denote the probability of moving to a state given that it is currently at state . For a probability mass function , denotes the support of .

Ii Problem formulation: Quickest Change Detection with a Cost for Switching Actions

Let and be distinct distributions on , and let be a sequence of vector-valued random variables satisfying the following:

(1)

where and are unknown but deterministic constants. The QCD problem is to detect the change in distribution as quickly as possible by observing , while keeping the false alarm rate low.

In this paper, we assume that the observer is only able to obtain a partial observation of , where is a function of the random variable under the action at each time . Let be the collection of permissible actions. We assume that the set is finite. We also assume that at each time , under the distribution , the observation is conditionally independent of and given the action . Some examples of that arise in practical applications include:

  1. Network Sampling. The set of rank transformations with , where is a column vector with all zeros except a 1 at the -th position, and .

  2. -bit Quantization. The set of functions on for a fixed with , where is the indicator function of the set . (A code sketch of both action families is given after this list.)
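As a concrete illustration of these two action families, the following sketch builds them explicitly; the signal dimension, the number of observed nodes, and the quantizer thresholds are arbitrary placeholder choices, not values from the paper.

```python
import itertools
import numpy as np

def network_sampling_actions(n, m):
    """All actions that observe a subset of exactly m out of n nodes.

    Each action is a 0/1 selection matrix A, so the partial observation of a
    full signal x is y = A @ x (each row of A is a standard basis vector).
    """
    actions = []
    for subset in itertools.combinations(range(n), m):
        A = np.zeros((m, n))
        for row, node in enumerate(subset):
            A[row, node] = 1.0
        actions.append(A)
    return actions

def k_bit_quantizer(thresholds):
    """A scalar quantizer: maps x to the index of the cell it falls into,
    given a sorted list of thresholds (2^k - 1 thresholds for k bits)."""
    thresholds = np.asarray(thresholds)
    return lambda x: int(np.searchsorted(thresholds, x, side="right"))

# Example: observe 2 of 4 nodes; 1-bit quantizer with a single threshold at 0.
actions = network_sampling_actions(n=4, m=2)
x = np.random.randn(4)
y = actions[0] @ x                       # partial observation under one action
q = k_bit_quantizer([0.0])(x[0])         # quantized first coordinate
```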

In our sequential change detection problem, we obtain observations sequentially and aim to detect the change in distribution from to for some fixed as quickly as possible. This is determined by a stopping time. At each time instant, we also seek to find the best action to take based on actions from a past window of finite length.

Definition 1.

A policy with window size is a Markov chain of order on with initial distribution and probability transition matrix .

For a stopping time and a policy , we quantify its detection delay using the worst case average detection delay (WADD) as proposed by Lorden[lorden71]:

(2)

where , and we define its ARL to false alarm as .

In order to take the observation-switching costs of a policy into consideration, we let be a matrix whose -th entry denotes the cost of switching from action to action . Inspired by a similar cost first proposed in [banerjee2013data], we define the AOSC of the policy as

(3)
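Since (3) defines the AOSC as a long-run average of switching costs, the sketch below estimates it empirically for a window-size-one policy by simulating the action sequence; the transition matrix, cost matrix, and horizon are placeholder choices for illustration only.

```python
import numpy as np

def empirical_aosc(P, c, rho, T=50_000, rng=None):
    """Monte Carlo estimate of the average observation-switching cost.

    P   : (|A| x |A|) probability transition matrix over actions
    c   : (|A| x |A|) switching-cost matrix, c[i, j] = cost of switching i -> j
    rho : initial distribution over actions
    """
    rng = np.random.default_rng(rng)
    n = P.shape[0]
    a = rng.choice(n, p=rho)
    total = 0.0
    for _ in range(T):
        a_next = rng.choice(n, p=P[a])
        total += c[a, a_next]
        a = a_next
    return total / T

# Toy example with two actions: staying is free, switching costs 1.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
c = np.array([[0.0, 1.0], [1.0, 0.0]])
print(empirical_aosc(P, c, rho=np.array([1.0, 0.0])))
```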

Formally, our QCD problem with an AOSC constraint can be formulated as a minimax problem: find a sampling policy and a stopping time to

(4)

for some given thresholds and .

Iii Properties of the AOSC

In this section, we present results that relate the and of a sampling policy. When the window size of the sampling policy is zero, the actions used to observe the signal are generated i.i.d. with respect to the distribution . In this case, the observations are also generated i.i.d. However, when the window size of the sampling policy is positive, unlike the former case, it is possible that the actions and observations are not generated i.i.d. We denote the joint probability distribution function of under as , where is the conditional probability mass function of given induced by the probability transition matrix , and is the conditional probability density function of an observation given the action under the distribution .

A sampling policy with window size can also be written as a Markov chain of order , where satisfies whenever for some and . For the rest of this paper, we switch between these two representations of a sampling policy to simplify the computations in the proofs. We denote the observation-switching cost associated with the latest two actions, from to , as .

For the rest of this section, we present results that relate the and of sampling policies with different initial distributions but equal probability transition matrices. First, we recall a relation between the average number of visits and the stationary distributions of a Markov chain. Let denote the number of times, up to time , that the state is visited given that the initial state is . Since is finite, the Markov chain defined by the transition matrix has at least one recurrence class. Let be the number of recurrence classes and be the number of transient states. By the Ergodic Theorem for finite-state Markov chains [bertsekas2002introduction], for a finite-state Markov chain with recurrence classes , there exist stationary distributions where if the state , and for recurrent states , we have for any state . For transient states , , where is the first-passage probability of initializing at state and entering the recurrence class before any other recurrence class.

For any state , denoting as the vector of expected proportion of visits to each of the states initializing at state such that , we can see that is a stationary distribution of the probability transition matrix as it is a convex linear combination of stationary distributions. Thus, for any initial distribution , the expected proportion of visits to each of the states, denoted as , is a stationary distribution of . In the next lemmas, we see that the AOSC and asymptotic log likelihood ratios depend only on and the expected proportion of visits to each of the states.

Lemma 1.

Let be a policy of finite window size and , then we have

Proof:

See Appendix A. ∎
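To make Lemma 1 concrete, here is a minimal sketch under the assumption (consistent with the discussion above) that the AOSC of a policy with transition matrix P and expected proportion of visits pi reduces to the switching cost averaged over the stationary pair distribution pi(a)P(a, a').

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an irreducible transition matrix P,
    taken as the left eigenvector of P associated with eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def aosc(P, c):
    """Closed-form AOSC: sum over (a, a') of pi(a) * P(a, a') * c(a, a')."""
    pi = stationary_distribution(P)
    return float(pi @ (P * c) @ np.ones(P.shape[0]))

# Same toy policy as above: the closed form matches the Monte Carlo estimate.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
c = np.array([[0.0, 1.0], [1.0, 0.0]])
print(aosc(P, c))
```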

Next, we let . We show that for any policy where has support in only one recurrence class, the following results hold.

Lemma 2.

For any policy of finite window size where has support in only one recurrence class , and any and change-point , we have

(5)

with

(6)
Proof:

See Appendix B. ∎

Lemma 3.

Let be a policy of finite window size . Suppose has support in only one recurrence class. For any and , we have

(7)

for , and

Proof:

See Appendix C. ∎

Iv Structure of Asymptotically Optimal Stopping time

In this section, we present the GLR CuSum test for the QCD problem with AOSC constraints and study its asymptotic properties as .

Iv-a Asymptotic Properties of GLR CuSum

For a fixed policy , the GLR CuSum stopping time with respect to the observed sequence is defined as:

(8)
(9)

where is a preselected threshold. The GLR CuSum stopping time can be re-written as , where . We note that for , is the CuSum statistic corresponding to the post-change distribution and policy . Thus, the GLR CuSum statistic is the maximum of the CuSum statistics for each of the post-change distributions. Next, we present some asymptotic properties of and for a policy . In this paper, we use to denote the notion of asymptotic equivalence [greene2007mathematics].
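A minimal sketch of the GLR CuSum stopping time in (8)-(9), implemented as the maximum of per-hypothesis CuSum recursions; the log-likelihood ratio function llr and the toy Gaussian example are assumptions supplied for illustration, not part of the paper's model.

```python
import numpy as np

def glr_cusum(observations, actions, llr, num_hypotheses, threshold):
    """Run the GLR CuSum test: keep one CuSum statistic per candidate
    post-change distribution and declare a change when the largest one
    crosses the threshold.

    llr(k, y, a) should return the log-likelihood ratio of observation y
    under action a for post-change hypothesis k versus the pre-change law.
    Returns the (1-indexed) stopping time, or None if no alarm is raised.
    """
    S = np.zeros(num_hypotheses)            # one CuSum statistic per hypothesis
    for t, (y, a) in enumerate(zip(observations, actions), start=1):
        for k in range(num_hypotheses):
            S[k] = max(S[k] + llr(k, y, a), 0.0)   # standard CuSum recursion
        if S.max() >= threshold:
            return t
    return None

# Toy usage (an assumption, not the paper's model): one post-change hypothesis,
# a single action, and a Gaussian mean shift from 0 to 1 at time 51.
rng = np.random.default_rng(0)
obs = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(1.0, 1.0, 200)])
acts = np.zeros(len(obs), dtype=int)
llr = lambda k, y, a: y - 0.5              # log N(y; 1, 1) - log N(y; 0, 1)
print(glr_cusum(obs, acts, llr, num_hypotheses=1, threshold=10.0))
```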

For a fixed policy such that the expected proportion of visits to each of the states, , has support in one recurrence class, we apply Lemma 3 together with [tartakovsky2014sequential, Theorem 8.2.3] to obtain the following proposition.

Proposition 1.

For a fixed policy such that has support in one recurrence class, we have the following asymptotic - trade-off for any :

as .

Thus, when the signal is sampled using the policy such that has support in one recurrence class, using similar techniques from [poor2009quickest, Theorem 6.16] together with Proposition 1, we know that the GLR CuSum stopping time gives us a stopping time satisfying and is asymptotically optimal for the following problem as :

(10)

In the next proposition, we discuss the case where has support in multiple recurrence classes.

Proposition 2.

Let the policy be such that has support in multiple recurrence classes. Then, there exists a policy where has support in only one recurrence class such that for any stopping time ,

Proof:

See Appendix D. ∎

Using this proposition, we obtain a result regarding asymptotically optimal solutions of Problem (4).

Theorem 1.

There exists a policy with having support in one recurrence class such that is asymptotically optimal for Problem (4) as , where is the GLR CuSum stopping time.

Proof:

See Appendix E. ∎

Using Theorem 1, the minimization in Problem (4), over the sampling policy and stopping time , can be decoupled. When the signal is sampled using a sampling policy with having support in one recurrence class, satisfying , the GLR CuSum is asymptotically optimal with the asymptotic - trade-off given as as .

Definition 2.

We call the asymptotic - trade-off rate.

Let be an optimal solution to the following problem:

(11)

By the argument above, is asymptotically optimal for Problem (4). We call Problem (11) the policy design problem.

V Optimal Sampling Policy

In this section, we investigate the policy design problem (11) in the cases where the switching costs from one action to another are all equal and where they are not.

V-a Equal Switching Costs

In this subsection, we propose a method to solve the policy design problem in which the switching costs are constant, i.e., , for a fixed and any . First, we note that Problem (11) is feasible if and only if . Furthermore, if then Problem (4) reduces to

(12)

where the constraint is automatically satisfied. Next, we show that for the case when all action-switching costs are equal, there exists a memoryless policy (i.e., ) for which the GLR CuSum achieves asymptotic optimality.

Proposition 3.

Suppose Problem (11) is feasible and is an asymptotically optimal solution for Problem (4). Then there exists a policy with window size such that

Proof:

See Appendix F. ∎

From Proposition 3, to solve the policy design problem (11) for a general window size, it suffices to solve Problem (11) for the case where the window size is zero.

When the window size is zero, Problem (11) becomes

(13)

This is equivalent to solving the linear optimization problem:

(14)

Let be the solution for Problem (14) and be the probability transition matrix with rows equal to . Using similar techniques from [poor2009quickest, Theorem 6.16] together with Proposition 1, we know that the GLR CuSum algorithm with as the sampling policy gives us a stopping time satisfying with asymptotically optimal - trade-off as tends to infinity.
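A sketch of the linear program (14), under the assumption that it maximizes the worst-case expected Kullback-Leibler rate over the probability simplex; it uses CVXPY as a stand-in for the CVX solver cited in the paper, and the matrix D of per-action KL divergences is a placeholder.

```python
import cvxpy as cp
import numpy as np

# D[k, a]: assumed per-action KL divergence between post-change hypothesis k
# and the pre-change conditional law of the observation (toy placeholder numbers).
D = np.array([[0.8, 0.2, 0.1],
              [0.1, 0.7, 0.3]])

p = cp.Variable(D.shape[1], nonneg=True)   # memoryless distribution over actions
t = cp.Variable()                          # worst-case trade-off rate

prob = cp.Problem(cp.Maximize(t), [cp.sum(p) == 1, D @ p >= t])
prob.solve()
print("sampling distribution:", p.value, "worst-case rate:", t.value)
```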

V-B Unequal Switching Costs

In this subsection, we propose a method to solve the policy design problem in which the switching costs are not all equal. First, we present a proposition regarding the structure of asymptotically optimal solutions of Problem (4).

Proposition 4.

Suppose is an asymptotically optimal solution for Problem (4) with window size at least 1. There exists a policy with window size such that and

Proof:

See Appendix G. ∎

From Proposition 4, the policy design problem for any window size greater than one can be reduced to a problem of window size one. Thus, we only need to study the cases where the window size is zero or one. In the following, we present algorithms to solve Problem (11) for each of these cases.

V-B1 Window size zero

Using a similar argument from Section V-A, we can see that for any optimal sampling policy , we have and has only one recurrence class. Thus, we have , and Problem (11) becomes

(15)

Using the same argument from (13), we can introduce a new variable to obtain a linear cost function

(16)

This is a quadratically constrained quadratic program (QCQP), and we may assume that is symmetric without loss of generality. However, without additional assumptions, the problem is NP-hard.

First, we discuss some special cases where the global optimum can be obtained easily. When is positive semi-definite, Problem (16) is a convex programming problem. A convex program solver [cvx, gb08] can be used to obtain globally optimal solutions. For the case where there is only a cost of observation rather than a cost of switching (i.e., for some function ), the quadratic constraint in Problem (16) reduces to

(17)

In this case, Problem (16) becomes a linear programming problem, which can be solved by a linear program solver [cvx].

For the general case, we use the IRM algorithm [sun2017rank] to obtain a locally optimal solution. In order to apply the IRM algorithm, we first rewrite the quadratic constraint , and Problem (16) is equivalent to

(18)

where is a vector of ones. We note that Problem (18) becomes a convex programming problem when the constraint is ignored. We are now ready to present the IRM algorithm [sun2017rank]. Fix .

First, we solve the convex problem

(19)

to obtain a solution and let be the eigen-decomposition of . Let be the eigenvectors corresponding to the smallest eigenvalues of .

At each step , we solve the following convex problem:

(20)

to obtain a solution and let be the eigenvectors corresponding to the smallest eigenvalues of . We iterate until , where is a small threshold chosen as a stopping criterion. Following similar methods from [sun2017rank], it can be shown that at a linear rate and that converges to a locally optimal solution for Problem (18) if Problem (20) is feasible for all .
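A schematic sketch of the IRM iteration described above; the convex subproblems (19) and (20) are abstracted as placeholder callables since their exact constraints are not reproduced here, and the stopping rule checks the residual energy of the iterate on the subspace spanned by its smallest eigenvectors.

```python
import numpy as np

def irm(solve_relaxation, solve_penalized, dim, eps=1e-6, max_iter=100):
    """Iterative rank minimization (IRM) skeleton in the spirit of [sun2017rank].

    solve_relaxation()       -> X : solution of the convex relaxation (19)
    solve_penalized(V, step) -> X : solution of the convex problem (20), which
                                    penalizes the energy of X on the subspace V
                                    spanned by the eigenvectors of the previous
                                    iterate's smallest eigenvalues, pushing X
                                    toward a rank-one matrix.
    """
    X = solve_relaxation()
    for step in range(1, max_iter + 1):
        eigvals, eigvecs = np.linalg.eigh(X)
        V = eigvecs[:, : dim - 1]              # eigenvectors of the smallest eigenvalues
        if np.trace(V.T @ X @ V) < eps:        # X is numerically rank one: stop
            break
        X = solve_penalized(V, step)
    return X

# Dummy usage with placeholder solvers (illustration only): the penalized step
# immediately returns a rank-one matrix, so the loop terminates after one step.
dim = 3
x = np.ones(dim) / np.sqrt(dim)
print(irm(lambda: np.eye(dim) / dim, lambda V, step: np.outer(x, x), dim))
```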

V-B2 Window size one

Unlike the case when the window size is zero, not every distribution is a stationary distribution of . Furthermore, when the window size is one, it is possible that more than one recurrence class exists. In this case, Problem (11) becomes

(21)

Using the same argument from (13), Problem (21) is equivalent to

(22)
for all , .

Problem (22) has two quadratic constraints and . Thus, it is a QCQP with an additional combinatorial constraint that is contained in a single recurrence class of . Even without the combinatorial constraint, finding the global optimum for Problem (22) would be difficult without additional assumptions on .

By considering a construction similar to the one used in the proof of Proposition 4, we show that any policy with window size can be expressed as a policy of window size satisfying the following:

(23)
(24)

Furthermore, any policy with window size that satisfies (23) and (24) can be expressed as a policy of window size . When we consider a policy with window size that satisfies (23) and (24), the constraint becomes automatically satisfied, and the constraint is linearized to . Hence, Problem (22) is equivalent to the following problem

(25)

which is a linear programming problem with an additional combinatorial constraint. In order to handle the combinatorial constraint, we can solve Problem (25) without the combinatorial constraint, and if the solution satisfies it, we have found the globally optimal solution. Alternatively, we can select a sufficiently small , and require that for all . The new constraint ensures that there is only one recurrence class for any feasible policy and thus the recurrence-class constraint is satisfied. The relaxed problem becomes

(26)

which can be solved by a linear program solver[cvx].
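One plausible reading of the relaxed problem (26), sketched with CVXPY: optimize over a stationary pair distribution Q(a, a') with a linear AOSC constraint, a worst-case KL-rate objective, and an elementwise floor eta that keeps every action reachable. The data (KL rates, costs, budget) and the exact constraint set are assumptions for illustration, not a reproduction of (26).

```python
import cvxpy as cp
import numpy as np

# Placeholder data: n actions, K post-change hypotheses.
n, K = 3, 2
D = np.array([[0.8, 0.2, 0.1],        # D[k, a]: assumed KL rate under action a
              [0.1, 0.7, 0.3]])
c = 1.0 - np.eye(n)                    # switching costs: free to stay, 1 to switch
beta = 0.3                             # AOSC budget
eta = 1e-3                             # floor keeping every action reachable

Q = cp.Variable((n, n), nonneg=True)   # stationary pair distribution Q[a, a']
t = cp.Variable()
pi = cp.sum(Q, axis=0)                 # marginal distribution of the current action

constraints = [
    cp.sum(Q) == 1,
    cp.sum(Q, axis=1) == pi,                 # stationarity: marginals agree
    cp.sum(cp.multiply(Q, c)) <= beta,       # AOSC constraint, linear in Q
    D @ pi >= t,                             # worst-case trade-off rate
    Q >= eta,                                # relaxation: single recurrence class
]
cp.Problem(cp.Maximize(t), constraints).solve()
P = Q.value / Q.value.sum(axis=1, keepdims=True)   # recover the transition matrix
print(P, t.value)
```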

Vi Numerical Results

In this section, we consider the QCD problem with an AOSC constraint, based on partially observed graph signals, under the various conditions discussed in this paper. We consider the problem of quickest detection of a rogue node in a graph signal. We assume that our graph is a connected graph with nodes. We model the graph signal [dong2016learning] in the pre-change regime with a zero-mean Gaussian distribution with covariance , where is the pseudo-inverse of the graph Laplacian matrix , is the identity matrix, and is the noise power in the graph signal. Thus, in the pre-change regime, we have

For the post-change regime, we assume that the signal obtained at the rogue node follows the same distribution as the pre-change distribution. However, this signal becomes independent of signals obtained from the rest of the graph. Thus, in the post-change regime, we have

for , where the covariance matrix is given as

(27)
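A sketch of the pre- and post-change covariance construction described above; the graph, noise power, and rogue node are arbitrary placeholders, and the post-change covariance follows the verbal description (the rogue node keeps its marginal variance but is decorrelated from the rest of the graph), which may differ in detail from (27).

```python
import numpy as np

def change_covariances(adjacency, sigma2, rogue):
    """Pre-change covariance: pseudo-inverse of the graph Laplacian plus white
    noise.  Post-change: the rogue node keeps its marginal variance but becomes
    independent of every other node."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A                 # graph Laplacian
    n = A.shape[0]
    sigma0 = np.linalg.pinv(L) + sigma2 * np.eye(n)
    sigma1 = sigma0.copy()
    sigma1[rogue, :] = 0.0
    sigma1[:, rogue] = 0.0
    sigma1[rogue, rogue] = sigma0[rogue, rogue]
    return sigma0, sigma1

# Toy example: a 4-node cycle graph, noise power 0.1, rogue node 2.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
sigma0, sigma1 = change_covariances(A, sigma2=0.1, rogue=2)
```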

For the set of actions, we assume that the infrastructure constrains us to observe at most nodes at any instance. The set of actions is the set of all partial observations where