# A Bayesian Theory of Change Detection in Statistically Periodic Random Processes

A new class of stochastic processes called independent and periodically identically distributed (i.p.i.d.) processes is defined to capture periodically varying statistical behavior. A novel Bayesian theory is developed for detecting a change in the distribution of an i.p.i.d. process. It is shown that the Bayesian change point problem can be expressed as a problem of optimal control of a Markov decision process (MDP) with periodic transition and cost structures. Optimal control theory is developed for periodic MDPs for discounted and undiscounted total cost criteria. A fixed-point equation is obtained that is satisfied by the optimal cost function. It is shown that the optimal policy for the MDP is nonstationary but periodic in nature. A value iteration algorithm is obtained to compute the optimal cost function. The results from the MDP theory are then applied to detect changes in i.p.i.d. processes. It is shown that while the optimal change point algorithm is a stopping rule based on a periodic sequence of thresholds, a single-threshold policy is asymptotically optimal, as the probability of false alarm goes to zero. Numerical results are provided to demonstrate that the asymptotically optimal policy is not strictly optimal.


## I Introduction

In the problem of quickest change detection, the objective is to detect a change in the distribution of a sequence of random variables with the minimum possible delay, subject to a constraint on the rate of false alarms [1, 2, 3]. Optimal or asymptotically optimal algorithms for quickest change detection are available in the literature. The results can be divided broadly into two categories: results for independent and identically distributed (i.i.d.) processes, with algorithms that can be computed recursively using finite memory and that enjoy strong optimality properties [4, 5], and results for non-i.i.d. data, with algorithms that cannot necessarily be computed recursively or using finite memory but are asymptotically optimal [6, 7, 8, 9].

In this paper, we develop theory and algorithms for detecting changes in stochastic processes that have periodically varying statistical characteristics. In this non-i.i.d. setting, we will show that the optimal algorithms can be computed recursively and using finite memory. The motivation for this problem comes from the following anomaly detection problems in cyber-physical systems and biology where such periodic behavior is observed.

1. Traffic monitoring: In [11] and [12], we reported results on multimodal traffic data we collected from NYC around a 5K run: during, before, and after the run. We collected CCTV images, Twitter, and Instagram data. We extracted counts of persons and vehicles appearing in CCTV images over time using a deep neural network-based object detector. We observed that in the absence of the event (in the normal regime), the counts have a periodic statistical behavior (over a day or a week) with increased intensity every day during the morning and evening rush hours.

2. Social networks: We also observed in [11] and [12] that the aggregate social network behavior shows periodic characteristics under the normal regime. Also, the total number of Instagram messages posted in a rectangular area around the CCTV cameras showed periodic behavior.

3. Power grid monitoring: The power usage by end users has a periodic pattern, with low usage at nighttime and high usage at daytime [10].

4. Neural spike patterns: In brain-computer interface studies where single neural spike data is collected, the spike firing pattern can exhibit statistically periodic behavior in the absence of any external stimuli; see, for example, [13].

5. ECG: Several biological signals, including the ECG, have a periodic behavior [14].

The problem of anomaly detection in the above-mentioned applications can be seen as a problem of detecting deviations from periodic statistical behavior.

In this paper, we develop a Bayesian theory for anomaly detection in problems where the statistical characteristics are periodic. We introduce a class of stochastic processes called independent and periodically identically distributed (i.p.i.d.) processes that can be used to model such periodic statistical behavior. We then develop algorithms for quickest detection of changes in i.p.i.d. processes and prove their optimality with respect to the Bayesian criterion of Shiryaev [15]. In the Shiryaev formulation, the objective is to detect a change in the distribution of a stochastic process so as to minimize the average detection delay, subject to a constraint on the probability of false alarm. In the Shiryaev problem, each time we take an observation, we pay a penalty for delay if the change has already occurred. If an alarm is raised and the change has not yet occurred, we pay a penalty for a false alarm. In this paper, we also study a more general, modified Shiryaev formulation in which the penalties on the delay and the false alarm depend on time. The latter problem is relevant for detecting changes in non-i.i.d. processes. The definition of i.p.i.d. processes and precise problem formulations are given in Section II.

When the observations are i.i.d. before and after the change, optimal algorithms are obtained in the Bayesian setting of Shiryaev using optimal stopping theory or dynamic programming [5], [17], [22]. However, since the processes under investigation here are not i.i.d. but i.p.i.d., the traditional optimal control theory cannot be applied. We show that the change detection problems for i.p.i.d. processes studied in this paper can be mapped to a problem of optimal control of a Markov decision process (MDP) with a nonstationary but periodic transition and cost structure. As a result, in Section III, we first develop an optimal control theory for periodic MDPs. With a view towards optimal stopping, we develop the optimal control theory for the total cost problem with finite control spaces [17]. For stationary problems, the optimal policy can be obtained using the framework of dynamic programming and can be shown to be Markovian and stationary [17]. A general recipe for solving nonstationary (including periodic) problems can also be found in [17]; e.g., see p. 256. In fact, it is suggested in [17] that the optimal policy for periodic problems is periodic in nature; see also [18]. In this paper, we explicitly derive optimal policies for periodic problems using a more direct approach and prove that the optimal policies are indeed periodic. We obtain a fixed-point equation satisfied by the optimal cost function. We also obtain a value iteration algorithm for computing the optimal cost, from which the periodic optimal policy can be computed. The optimal control theory developed here for periodic MDPs should be of independent interest for other control applications as well.

In Section IV, we apply the optimal control theory developed in Section III to the change detection problems. We show that the optimal change detection algorithm is periodic. The change detection statistic, as in the classical i.i.d. setting, is the a posteriori probability that the change has already occurred. But, unlike the i.i.d. setting where a single-threshold policy is strictly optimal, the stopping threshold for the i.p.i.d. problem varies with time. In fact, we show that the sequence of thresholds is periodic. We provide examples where a periodic and nonstationary policy is strictly better than a single threshold stationary policy.

In Section V, however, we show that if the constraint on the probability of false alarm is small, then we can, in fact, use a fixed, time-invariant threshold. Specifically, we show that a single-threshold algorithm is asymptotically optimal for the classical Shiryaev formulation by showing that the proposed algorithm achieves a universal lower bound on the delay of any change detection procedure [7]. We will show that while the exact optimality result and the periodic MDP theory are valid only for geometric priors, the asymptotic optimality result is valid for a large class of distributions on the change point.

## II Mathematical Model

We begin by first introducing a class of stochastic processes that can be used to model data showing periodic statistical behavior.

###### Definition 1

A stochastic process $\{Y_n\}$ is called independent and periodically identically distributed (i.p.i.d.) if

1. The random variables $\{Y_n\}$ are independent.

2. If $Y_n$ has density $f_n$, for $n \ge 1$, then there is a positive integer $T$ such that the sequence of densities $\{f_n\}$ is periodic with period $T$:

$$f_{n+T} = f_n, \quad \forall n \ge 1.$$

We say that the process is i.p.i.d. with the law $(f_1, \dots, f_T)$. Note that the law of an i.p.i.d. process is completely characterized by the finite-dimensional product distribution involving $(f_1, \dots, f_T)$. We assume that in the normal regime, the data can be modeled as an i.p.i.d. process. At some point in time, due to an event, the distribution of the i.p.i.d. process deviates from $(f_1, \dots, f_T)$. Our objective in this paper is to develop algorithms to observe $\{Y_n\}$ in real time and detect changes in its distribution as quickly as possible, subject to a constraint on the rate of false alarms. If the period $T = 1$, an i.p.i.d. process reduces to an i.i.d. process. Optimal algorithms for change detection in i.i.d. processes have been extensively developed in the literature [1, 2, 3, 4, 5, 6, 7, 8, 9].
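Such a process is easy to simulate. As a quick illustration (ours, not from the paper), the following sketch draws a Gaussian i.p.i.d. sequence by cycling through $T$ illustrative mean parameters:

```python
import numpy as np

def sample_ipid(means, n, rng=None):
    """Draw Y_1, ..., Y_n with Y_k ~ N(means[(k-1) % T], 1), T = len(means).

    The Gaussian form and the mean values are illustrative assumptions;
    any periodic family of densities f_1, ..., f_T works the same way.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(means)
    mu = np.array([means[(k - 1) % T] for k in range(1, n + 1)])
    return mu + rng.standard_normal(n)

# A period-2 process: slots 1, 3, 5, ... have mean 0 and slots 2, 4, 6, ... have mean 2.
y = sample_ipid([0.0, 2.0], n=10, rng=np.random.default_rng(0))
```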

We now define a change point model. Consider another periodic sequence of densities $(g_1, \dots, g_T)$ such that

$$g_{n+T} = g_n, \quad \forall n \ge 1.$$

We assume that at some point in time $\nu$, called the change point in the following, the law of the i.p.i.d. process is governed not by the densities $(f_1, \dots, f_T)$, but by the new set of densities $(g_1, \dots, g_T)$:

$$Y_n \sim \begin{cases} f_n, & \forall n < \nu, \\ g_n, & \forall n \ge \nu. \end{cases} \quad (1)$$

The densities $(g_1, \dots, g_T)$ need not all be different from the densities $(f_1, \dots, f_T)$, but we assume that there is at least one $i$ for which they differ:

$$g_i \ne f_i, \quad \text{for some } i = 1, 2, \dots, T. \quad (2)$$

### II-A Classical Shiryaev Formulation

Let $\tau$ be a stopping time for the process $\{Y_n\}$, i.e., a positive integer-valued random variable such that the event $\{\tau \le n\}$ belongs to the $\sigma$-algebra generated by $Y_1, \dots, Y_n$. In other words, whether or not $\tau \le n$ is completely determined by the first $n$ observations. We declare that a change has occurred at the stopping time $\tau$. To find the best stopping rule to detect the change in distribution, we need a performance criterion. Towards this end, we model the change point $\nu$ as a random variable with a prior $\pi = \{\pi_n\}$:

$$\pi_n = P(\nu = n), \quad \text{for } n = 1, 2, \dots$$

For each $n$, we use $P_n$ to denote the law of the observation process when the change occurs at $\nu = n$, and use $E_n$ to denote the corresponding expectation. Using this notation, we define the average probability measure

$$P^\pi = \sum_{n=1}^{\infty} \pi_n P_n.$$

To capture a penalty on false alarms, we use the probability of false alarm, defined as

$$P^\pi(\tau < \nu).$$

To penalize the detection delay, we use the average detection delay, given by

$$E^\pi[(\tau - \nu)^+],$$

or its conditional version

$$E^\pi[\tau - \nu \mid \tau \ge \nu],$$

where $x^+ = \max\{x, 0\}$.

The optimization problem we are interested in solving is

$$\min_{\tau \in C_\alpha} E^\pi[(\tau - \nu)^+], \quad (3)$$

where

$$C_\alpha = \{\tau : P^\pi(\tau < \nu) \le \alpha\},$$

and $\alpha$ is a given constraint on the probability of false alarm. In the above problem, we can also use the conditional version of the delay, $E^\pi[\tau - \nu \mid \tau \ge \nu]$.

When the change point $\nu$ is a geometric random variable, the classical approach to solving problem (3) is to solve a relaxed version using dynamic programming. Specifically, let

$$P(\nu = n) = (1 - \rho)^{n-1}\rho, \quad \text{for } n = 1, 2, \dots$$

Then

$$P^\pi = \sum_{n=1}^{\infty} (1 - \rho)^{n-1} \rho\, P_n.$$

The relaxed Bayesian optimization problem is

$$\min_{\tau} E^\pi\left[(\tau - \nu)^+\right] + \lambda_f P^\pi(\tau < \nu), \quad (4)$$

where $\lambda_f$ is a penalty on the cost of false alarms. The above optimization problem can be stated as a problem in partially observable MDPs (POMDPs); see Section II-B, and also [1, 22], and [11]. Specifically, define $p_0 = 0$ and

$$p_n = P^\pi(\nu \le n \mid Y_1, \dots, Y_n), \quad \text{for } n \ge 1. \quad (5)$$

Then, it can be shown that problem (4) is equivalent to solving

$$\min_{\tau} E^\pi\left[\sum_{n=0}^{\tau-1} p_n + \lambda_f (1 - p_\tau)\right]. \quad (6)$$

If the period $T = 1$ and the processes are i.i.d., then the problem in (6) can be solved using the theory of classical belief-state MDPs [1, 22, 11, 17]. However, in our case, the belief updates are not stationary.

###### Lemma II.1

The belief $p_n$ in (5) can be computed recursively using the following equations: $p_0 = 0$ and, for $n \ge 1$,

$$p_n = \frac{\tilde{p}_{n-1}\, g_n(Y_n)}{\tilde{p}_{n-1}\, g_n(Y_n) + (1 - \tilde{p}_{n-1}) f_n(Y_n)}, \quad (7)$$

where

$$\tilde{p}_{n-1} = p_{n-1} + (1 - p_{n-1})\rho.$$
###### Proof:

The proof is provided in the appendix.

Note that the likelihood ratios $g_n(Y_n)/f_n(Y_n)$ are a function of the time index $n$. Thus, the belief updates are nonstationary. However, because the processes are i.p.i.d. in nature and there are only finitely many densities $(f_1, \dots, f_T)$ and $(g_1, \dots, g_T)$, the belief updates have a periodic structure that repeats every $T$ time slots.
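The recursion in Lemma II.1 is only a few lines of code. The sketch below (ours) uses unit-variance Gaussian densities and a period $T = 2$ as illustrative assumptions; only the choice of which pair $(f_n, g_n)$ is used at slot $n$ changes with time.

```python
import math

def gauss_pdf(y, mu):
    """Density of N(mu, 1) at y."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2.0 * math.pi)

def belief_update(p_prev, y, f_pdf, g_pdf, rho):
    """One step of (7): map p_{n-1} to p_n given observation y at slot n.

    f_pdf and g_pdf are the pre- and post-change densities f_n and g_n
    for the current slot; in the i.p.i.d. case they repeat with period T.
    """
    p_tilde = p_prev + (1.0 - p_prev) * rho          # \tilde p_{n-1}
    num = p_tilde * g_pdf(y)
    return num / (num + (1.0 - p_tilde) * f_pdf(y))

# Example with T = 2: f_1 = f_2 = N(0,1), g_1 = N(2,1), g_2 = N(1,1).
f = [lambda y: gauss_pdf(y, 0.0), lambda y: gauss_pdf(y, 0.0)]
g = [lambda y: gauss_pdf(y, 2.0), lambda y: gauss_pdf(y, 1.0)]
p = 0.0                                              # p_0 = 0
for n, y in enumerate([0.1, 1.9, 2.2, 0.8], start=1):
    p = belief_update(p, y, f[(n - 1) % 2], g[(n - 1) % 2], rho=0.01)
```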

The optimal stopping problem (6) cannot be solved using classical optimal stopping theory or dynamic programming [5], [17], [22], because these theories assume that the Markov process to be controlled is homogeneous. The Markov process to be controlled in (6) is not homogeneous; however, as shown in Lemma V.1, its transition structure is periodic. Motivated by this observation, in Section III, we develop the optimal stopping theory, or optimal control theory, for periodic MDPs. In Section IV, we apply the results obtained for periodic MDPs to the periodic optimal stopping problem (6).

### II-B Modified Shiryaev Formulation

In this section, we formulate a more general optimal stopping problem than the one stated in (6). In the problem in (6), the delay penalty at all times is one unit and the false alarm penalty is $\lambda_f$ units. Since the processes under study here are not i.i.d., we formulate a POMDP in which the delay and false alarm penalties are a function of time. Since we are investigating i.p.i.d. processes, we assume that the delay and false alarm penalties are periodic as well. The precise problem formulation is given below.

• States: Let $\{\Theta_k\}$ be the state process, a finite-state Markov chain taking values

$$\Theta_k \in \{A, 0, 1\}, \quad \forall k. \quad (8)$$

The state $A$ is a special absorbing state introduced for mathematical convenience in a stopping-time POMDP [22].

• Control: The control sequence for the POMDP is the process $\{U_k\}$, which is binary valued:

$$U_k \in \{1\ (\text{stop}),\ 2\ (\text{continue})\}. \quad (9)$$

The control $U_k = 2$ is used to continue the observation process and $U_k = 1$ is used to stop it. At the time of stopping, an alarm is raised indicating that a change in the distribution of the observations has occurred.

• Observations: The distribution of the observation $Y_k$ depends on the state $\Theta_k$ when the control is to continue: for $k \ge 1$,

$$(Y_k \mid \Theta_k = 0, U_{k-1} = 2) \sim f_k, \qquad (Y_k \mid \Theta_k = 1, U_{k-1} = 2) \sim g_k, \quad (10)$$

with the understanding that the observation process is i.p.i.d. with law $(f_1, \dots, f_T)$ before the change, and i.p.i.d. with law $(g_1, \dots, g_T)$ after the change. No observation is collected at time $k = 0$, and the distribution of the observations when the state equals $A$ is irrelevant to the problem.

• Transition Structure: The Markov chain evolves according to a transition structure that depends on the control process $\{U_k\}$. Let $P(u_k)$ be the transition matrix of the Markov chain, given the control $u_k$. Then, we have

$$P(u_k) = \begin{cases} P_1, & \text{if } u_k = 1, \\ P_2, & \text{if } u_k = 2, \end{cases} \quad (11)$$

where

$$P_1 = \begin{bmatrix} p_{AA} & p_{A0} & p_{A1} \\ p_{0A} & p_{00} & p_{01} \\ p_{1A} & p_{10} & p_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \quad (12)$$

and

$$P_2 = \begin{bmatrix} p_{AA} & p_{A0} & p_{A1} \\ p_{0A} & p_{00} & p_{01} \\ p_{1A} & p_{10} & p_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-\rho & \rho \\ 0 & 0 & 1 \end{bmatrix}. \quad (13)$$

The initial distribution for the Markov chain is

$$\tilde{\pi}_0 = (\tilde{\pi}_0(A),\ \tilde{\pi}_0(0),\ \tilde{\pi}_0(1))^T = (0, 1, 0)^T. \quad (14)$$

Thus, the Markov chain starts in state $0$. As long as the control $U_k = 2$, which means to continue, the states evolve according to the transition probability matrix $P_2$. The values selected for the elements of matrix $P_2$ ensure that absorption into the state $1$ is inevitable, and the absorption time is a geometrically distributed random variable with parameter $\rho$. Before absorption, the distributions of the observations are i.p.i.d. with law $(f_1, \dots, f_T)$. After the state is absorbed into $1$, the distribution of the observations changes to that of an i.p.i.d. process with law $(g_1, \dots, g_T)$.

• Cost: The cost $C_k(\theta, u)$ associated with state $\theta$ and control $u$ is defined, for $k \ge 1$, as

$$C_k(0, 1) = \lambda_k, \qquad C_k(1, 2) = d_k, \qquad C_k(\theta, u) = 0, \ \text{otherwise}. \quad (15)$$

Thus, $\lambda_k$ is the penalty on a false alarm at time $k$ and $d_k$ is the penalty on the delay at time $k$. We assume that the penalty sequences are periodic with period $T$: for any $k \ge 1$,

$$\lambda_{k+T} = \lambda_k, \qquad d_{k+T} = d_k. \quad (16)$$
• Policy: Let

$$I_k = (Y_1, \dots, Y_k, U_1, \dots, U_{k-1})$$

be the information available at time $k$. Also define a policy

$$\tilde{\Phi} = (\tilde{\phi}_1, \tilde{\phi}_2, \dots)$$

to be a sequence of mappings such that $U_k = \tilde{\phi}_k(I_k)$.

We want to find a control policy to optimize the long-term cost

$$V(\tilde{\pi}_0) = \min_{\tilde{\Phi}} E\left[\sum_{k=0}^{\infty} C_k(\Theta_k, U_k)\right]. \quad (17)$$

Using arguments similar to those used to obtain (6), it can be shown that the probability sequence

$$p_n = P(\Theta_n = 1 \mid Y_1, \dots, Y_n) = P(\nu \le n \mid Y_1, \dots, Y_n)$$

is a sufficient statistic also for the problem in (17). Consequently, solving (17) is equivalent to solving the following MDP problem:

$$\min_{\tau} E\left[\sum_{n=0}^{\tau - 1} d_n p_n + \lambda_\tau (1 - p_\tau)\right]. \quad (18)$$

If $d_n = 1$ and $\lambda_n = \lambda_f$ for all $n$, then the problem in (18) reduces to the problem in (6). If the period $T = 1$ and the processes are i.i.d., then the problem in (18) can also be solved using classical MDP theory [1, 22, 11, 17]. However, for $T > 1$, the observation process is i.p.i.d. and the classical theory cannot be applied. The optimal control theory developed in Section III below for periodic MDPs can and will be used to solve (18), and hence its special case (6).

## III Optimal Control of Periodic MDPs

In this section, we develop an optimal control theory for MDPs with periodic cost and transition structures. We have stochastic processes $\{X_k\}$, $\{U_k\}$, and $\{W_k\}$ taking values in spaces as follows:

$$X_k \in \mathbb{R}, \ \forall k, \qquad U_k \in \mathcal{U} = \{u^1, u^2, \dots, u^m\}, \ \forall k, \qquad W_k \in \mathbb{R}, \ \forall k. \quad (19)$$

The process $\{X_k\}$ is an MDP generated according to the transition structure

$$X_k = \phi_{k-1}(X_{k-1}, U_{k-1}, W_{k-1}), \quad k \ge 1. \quad (20)$$

Here $\{U_k\}$ is the control process and $\{W_k\}$ is the disturbance process. We assume that given $X_k$ and $U_k$, the distribution of the disturbance $W_k$ is independent of the past disturbances. We use $t_k(\cdot \mid x_k, u_k)$ to denote this conditional distribution. Thus, the state and disturbance spaces are real-valued and the control space is finite. The results in the paper are, in fact, valid for more general spaces; to accommodate them, the proof techniques may have to be slightly modified [17, 20].

The main assumption in our model is that the transition functions and the conditional distributions are periodic: there is a positive integer $T$ such that for all $k$,

$$\phi_{k+T}(X_{k+T}, U_{k+T}, W_{k+T}) = \phi_k(X_{k+T}, U_{k+T}, W_{k+T}), \qquad t_{k+T}(w_{k+T} \mid x_{k+T}, u_{k+T}) = t_k(w_{k+T} \mid x_{k+T}, u_{k+T}). \quad (21)$$

The objective is to choose the control process $\{U_k\}$ so as to minimize the cost

$$E\left[\sum_{k=0}^{\infty} \alpha^k c_k(X_k, U_k, W_k)\right], \quad (22)$$

where $\alpha \in (0, 1]$ is the discount factor, which is allowed to equal $1$ with a view towards problems in optimal stopping. The cost functions $\{c_k\}$ are assumed to be non-negative and periodic with the same period $T$: for all $k$,

$$c_k(x, u, w) \ge 0, \ \forall x, u, w, \qquad c_{k+T}(X_{k+T}, U_{k+T}, W_{k+T}) = c_k(X_{k+T}, U_{k+T}, W_{k+T}). \quad (23)$$

The assumption of non-negativity of the cost functions (which is the same as the assumption in [17]) ensures that all infinite summations are well-defined by the monotone convergence theorem [21].

In order to minimize the long-term additive cost (22), we search over Markov control policies of the type

$$\Pi = [\mu_0, \mu_1, \dots]$$

such that

$$U_k = \mu_k(X_k), \quad k = 0, 1, \dots$$

As done in [17], it can be argued that restricting the search to Markov policies is sufficient. For a policy $\Pi$, we define the cost-to-go function starting with the state $X_0 = x_0$ as

$$V_\Pi(x_0) = E\left[\sum_{k=0}^{\infty} \alpha^k c_k(X_k, \mu_k(X_k), W_k) \,\Big|\, X_0 = x_0\right], \quad (24)$$

where the expectation is with respect to the disturbances. We are interested in solving the following problem: for $x_0 \in \mathbb{R}$,

$$V^*(x_0) = \min_{\Pi} V_\Pi(x_0) = \min_{\Pi} E\left[\sum_{k=0}^{\infty} \alpha^k c_k(X_k, \mu_k(X_k), W_k) \,\Big|\, X_0 = x_0\right]. \quad (25)$$

In this section, we show that the optimal policy for the problem in (25) is periodic with period $T$, i.e., it is of the type

$$\Pi^* = [\mu_0^*, \dots, \mu_{T-1}^*, \mu_0^*, \dots, \mu_{T-1}^*, \dots].$$

We also provide an explicit way to compute this optimal periodic policy.

For $\ell \in \{0, 1, \dots, T-1\}$ and a function $J$, define the operator

$$\Psi^{(\ell)}(J)(x) = \min_{u \in \mathcal{U}} E^{(\ell)}\left[c_\ell(x, u, W) + \alpha J(\phi_\ell(x, u, W))\right], \quad (26)$$

where the expectation is defined with respect to the conditional distribution $t_\ell(\cdot \mid x, u)$. We also define the corresponding operator for a Markov map $\mu$ and $\ell \in \{0, 1, \dots, T-1\}$:

$$\Psi^{(\ell)}_{\mu}(J)(x) = E^{(\ell)}\left[c_\ell(x, \mu(x), W) + \alpha J(\phi_\ell(x, \mu(x), W))\right]. \quad (27)$$

Finally, define the fold operator

$$\Psi = \Psi^{(0)} \Psi^{(1)} \cdots \Psi^{(T-1)}, \quad (28)$$

which is the successive application of the operators defined in (26). Our first result is the following.

###### Theorem III.1

The optimal cost function $V^*$ in (25) satisfies the following fixed-point equation: for any $x$,

$$V^*(x) = \Psi(V^*)(x) = \Psi^{(0)} \Psi^{(1)} \cdots \Psi^{(T-1)}(V^*)(x). \quad (29)$$
###### Proof:

The proof is provided in the appendix.

Next, we show that if the optimal cost function is known, then the optimal policy can be obtained and shown to be periodic.

###### Theorem III.2

The optimal policy is periodic. Specifically, let $V^*$ be the optimal cost function and let $\mu_\ell^*$ be such that, for $\ell \in \{0, 1, \dots, T-1\}$ and all $x$,

$$\Psi^{(\ell)}_{\mu_\ell^*}\left(\Psi^{(\ell+1)} \cdots \Psi^{(T-1)}(V^*)\right)(x) = \Psi^{(\ell)} \cdots \Psi^{(T-1)}(V^*)(x). \quad (30)$$

Then, the optimal policy is given by

$$\Pi^* = [\mu_0^*, \dots, \mu_{T-1}^*, \mu_0^*, \dots, \mu_{T-1}^*, \dots]. \quad (31)$$
###### Proof:

The proof is provided in the appendix.

An optimal policy always exists in our case because we assume the control spaces to be finite.

The previous result is useful only when we have an algorithm to compute the optimal cost function $V^*$. This is facilitated by the theorem below. Define

$$V_k = [\Psi]^k(V_0), \quad (32)$$

where $[\Psi]^k$ denotes the operator $\Psi$ in (28) applied $k$ times to the all-zero function $V_0 \equiv 0$.

###### Theorem III.3

The limit of the sequence $\{V_k\}$ in (32) exists:

$$V_\infty = \lim_{k \to \infty} V_k = \lim_{k \to \infty} [\Psi]^k(V_0). \quad (33)$$

Furthermore,

$$V^* = V_\infty.$$
###### Proof:

The proof is provided in the appendix.

Thus, the recipe for finding the periodic optimal policy is to start with the all-zero function $V_0$ and apply the value iteration (32) to obtain the optimal cost function $V^* = V_\infty$. Finally, solve (30) to obtain the Markov maps $\mu_0^*, \dots, \mu_{T-1}^*$ that constitute the optimal policy in (31).
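The recipe can be sketched concretely for a finite-state periodic MDP (a simplification of the real-valued state model above, chosen so that the expectation in (26) becomes a matrix-vector product; all transition matrices and costs in the sketch are illustrative assumptions, not from the paper):

```python
import numpy as np

def periodic_value_iteration(P, c, alpha=0.9, sweeps=200):
    """Approximate V* via (32): repeatedly apply the fold operator (28).

    P[l][u] : transition matrix at stage l under control u (n x n).
    c[l][u] : expected cost vector at stage l under control u (length n).
    Returns the approximate optimal cost function on the n states.
    """
    T, n = len(P), len(c[0][0])
    V = np.zeros(n)                      # V_0: the all-zero function
    for _ in range(sweeps):
        J = V
        for l in reversed(range(T)):     # Psi = Psi^(0) Psi^(1) ... Psi^(T-1)
            J = np.min(
                [c[l][u] + alpha * P[l][u] @ J for u in range(len(P[l]))],
                axis=0,                  # minimize over controls, per state
            )
        V = J
    return V
```

With a discount factor $\alpha < 1$, each application of the fold operator is a contraction, so the sweeps converge to the fixed point of (29); running one extra sweep then leaves the output unchanged up to numerical precision.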

## IV Detecting Changes in I.P.I.D. Processes

In this section, we solve the problem in (18), and hence the problem in (6), using Theorems III.1, III.2, and III.3. Specifically, we can state the following result. Let

$$J^*(p) = \min_{\tau} E\left[\sum_{n=0}^{\tau-1} d_n p_n + \lambda_\tau (1 - p_\tau)\right], \quad p \in [0, 1], \quad (34)$$

where $p_0 = p$.

###### Theorem IV.1

The optimal cost function $J^*$ in (34) satisfies

$$J^*(p) = \Psi^{(0)} \Psi^{(1)} \cdots \Psi^{(T-1)}(J^*)(p), \quad p \in [0, 1], \quad (35)$$

where, for $\ell \in \{0, 1, \dots, T-1\}$,

$$\Psi^{(\ell)}(J)(p) = \min\left\{\lambda_\ell (1 - p),\ p\, d_\ell + A^{(\ell)} J(p)\right\}. \quad (36)$$

In the above equation,

$$A^{(\ell)} J(p) = \int_x J\left(\bar{\phi}_\ell(p, x)\right)\left(\tilde{p}\, g_{\ell+1}(x) + (1 - \tilde{p}) f_{\ell+1}(x)\right) dx, \quad (37)$$

where $\tilde{p} = p + (1 - p)\rho$, and

$$\bar{\phi}_\ell(p, x) = \frac{\tilde{p}\, g_{\ell+1}(x)}{\tilde{p}\, g_{\ell+1}(x) + (1 - \tilde{p}) f_{\ell+1}(x)}. \quad (38)$$

In the above theorem, the densities are assumed to be with respect to the Lebesgue measure. The expressions can be modified to allow for counting measures (summations) or more general measures.
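To make the operators concrete, the following sketch (ours) applies one operator $\Psi^{(\ell)}$ from (36)-(38) to a cost function sampled on a discretized belief grid; the Gaussian densities, grid sizes, and integration range are assumptions for illustration.

```python
import numpy as np

def apply_psi(J, grid, f_mu, g_mu, lam, d, rho):
    """Apply Psi^(l) of (36) to a cost function J sampled on `grid` in [0, 1].

    f_mu, g_mu: means of the (assumed Gaussian, unit-variance) densities
    f_{l+1} and g_{l+1}; lam, d: the penalties lambda_l and d_l.
    """
    xs = np.linspace(-8.0, 10.0, 4001)               # integration grid for x
    dx = xs[1] - xs[0]
    f = np.exp(-0.5 * (xs - f_mu) ** 2) / np.sqrt(2 * np.pi)
    g = np.exp(-0.5 * (xs - g_mu) ** 2) / np.sqrt(2 * np.pi)
    out = np.empty_like(grid)
    for i, p in enumerate(grid):
        pt = p + (1.0 - p) * rho                     # \tilde p
        mix = pt * g + (1.0 - pt) * f                # predictive density of Y
        p_next = pt * g / np.maximum(mix, 1e-300)    # \bar\phi_l(p, x) of (38)
        AJ = np.sum(np.interp(p_next, grid, J) * mix) * dx   # integral (37)
        out[i] = min(lam * (1.0 - p), p * d + AJ)    # the minimum in (36)
    return out
```

Composing such operators for $\ell = 0, \dots, T-1$ and iterating until the grid values stop changing implements the value iteration of Theorem III.3 for problem (34).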

If $T = 1$ and the penalties are constant ($\lambda_n \equiv \lambda_f$, $d_n \equiv 1$), then we are reduced to the classical formulation of Shiryaev. It is well known that the optimal policy for the classical case is a single-threshold policy, in which the change is declared the first time the statistic or probability $p_n$ is above a pre-defined threshold. The threshold depends on the choice of the false alarm penalty $\lambda_f$.

For $T > 1$, it is interesting to ask whether the Shiryaev single-threshold stopping rule is still the optimal policy. In the subsections below, we will show that the optimal policy, in fact, utilizes multiple thresholds, where the number of distinct thresholds can be up to $T$. For reference below, we define the Shiryaev stopping rule here: for a threshold $A$,

$$\tau_{ps} = \inf\{n \ge 0 : p_n > A\}. \quad (39)$$

In the rest of the paper, we call this stopping rule or policy the periodic-Shiryaev stopping rule or algorithm, to emphasize that the recursion for $p_n$ is periodic.

### IV-A Example: Change Detection in I.P.I.D. Processes For Different Values of T

In this section, we show two examples in the i.p.i.d. setting where the optimal policy is not stationary. In fact, it is periodic with periodic thresholds.

In Fig. 1, we report results for $T = 2$. Specifically, consider a change detection problem where the period of the i.p.i.d. processes is $T = 2$ and the pre- and post-change i.p.i.d. densities are Gaussian:

$$f_1 = f_2 = \mathcal{N}(0, 1), \qquad g_1 = \mathcal{N}(2, 1), \quad g_2 = \mathcal{N}(1, 1). \quad (40)$$

The parameters for the change point, false alarm, and delay are as follows:

$$\lambda_0 = 20, \quad \lambda_1 = 5, \quad d_0 = 10, \quad d_1 = 1, \quad \rho = 0.01. \quad (41)$$

The optimal cost function was obtained using the value iteration (33) with the operators as defined in (36). The cost functions at each iteration,

$$J_k = [\Psi]^k(J_0), \quad \text{for } k \ge 1,$$

with $J_0$ being the all-zero function on $[0, 1]$, are plotted in Fig. 1(a). We used a fine discretization of the interval $[0, 1]$ in the value iteration. In Fig. 1(b), we have plotted the norm distance between successive iterates as a function of the iteration index $k$. In Fig. 1(c), we have plotted the stopping cost and the continue cost appearing in (36) for each of the $T = 2$ stages, where we use stages to refer to the distinct time slots within a period. It can be inferred from Fig. 1(c) that the optimal policy has two alternating thresholds: for stopping, $p_n$ is compared with one threshold during the odd time slots and with the other threshold during the even time slots. The optimal cost achieved by this alternating-threshold policy can be read from Fig. 1(a). In Fig. 1(d), we have plotted the total cost achieved by the periodic-Shiryaev algorithm (39) for different values of the constant threshold $A$. These costs were obtained through Monte-Carlo simulations. The best achievable cost of the single-threshold periodic-Shiryaev algorithm is strictly larger than the optimal cost, establishing that the optimal cost cannot be achieved using a single-threshold policy.

Next, we consider a change detection problem where the period of the i.p.i.d. processes is $T = 4$ and the pre- and post-change i.p.i.d. densities are given by

$$f_1 = f_2 = f_3 = f_4 = \mathcal{N}(0, 1), \qquad g_1 = \mathcal{N}(2, 1), \quad g_2 = \mathcal{N}(1.5, 1), \quad g_3 = \mathcal{N}(1, 1), \quad g_4 = \mathcal{N}(0.5, 1). \quad (42)$$

The parameters for the change point, false alarm, and delay are as follows:

$$\lambda_0 = 20, \quad \lambda_1 = 15, \quad \lambda_2 = 10, \quad \lambda_3 = 5, \qquad d_0 = 10, \quad d_1 = 10, \quad d_2 = 6, \quad d_3 = 1, \qquad \rho = 0.01. \quad (43)$$

Results for this case are reported in Fig. 2. Similar to the results for $T = 2$ in Fig. 1, we see here also that the value iteration converges to the optimal cost (Fig. 2(a)). Fig. 2(c) also shows that there are four thresholds in this case, one for each of the four stages in a cycle or period of length $T = 4$. Again, the best cost achievable by the single-threshold periodic-Shiryaev algorithm is strictly larger than the cost of the optimal policy.

### IV-B Change Detection in I.I.D. Data

In this section, we show examples where the periodic-Shiryaev algorithm is not strictly optimal even for i.i.d. data, as long as $T > 1$ and we use the modified Shiryaev formulation discussed in Section II-B. Specifically, we consider the change point problem with parameters

$$f_1 = f_2 = \mathcal{N}(0, 1), \qquad g_1 = g_2 = \mathcal{N}(\theta, 1), \quad (44)$$

and

$$\lambda_0 = 20, \quad \lambda_1 = 5, \quad d_0 = 10, \quad d_1 = 1, \quad \rho = 0.01. \quad (45)$$

In Table I, we have reported the comparison between the optimal policy and the periodic-Shiryaev algorithm for the above parameters for different choices of the post-change parameter $\theta$. All points are obtained using Monte-Carlo simulations.

### IV-C Performance Comparison For Different Mean Choices

In this section, we report a comparison between the performance of the optimal policy and the periodic-Shiryaev algorithm for different choices of the mean parameters for $T = 2$:

$$f_1 = f_2 = \mathcal{N}(0, 1), \qquad g_1 = \mathcal{N}(\theta_1, 1), \quad g_2 = \mathcal{N}(\theta_2, 1). \quad (46)$$

The parameters for the change point, false alarm, and delay are as follows:

$$\lambda_0 = 20, \quad \lambda_1 = 5, \quad d_0 = 10, \quad d_1 = 1, \quad \rho = 0.01. \quad (47)$$

The results are collected in Table II. The values in the table suggest that the superiority of the optimal policy over the periodic-Shiryaev algorithm is maintained for different values of the mean parameters $\theta_1$ and $\theta_2$.

### IV-D Performance Comparison For Different Choices of Delay and False Alarm Penalties

In the previous sections, we have shown examples where the periodic-Shiryaev algorithm is strictly sub-optimal. In this section, we show that the performance gap depends on the choice of the delay penalties $\{d_k\}$ and the false alarm penalties $\{\lambda_k\}$. In Table III, we show the performance comparison for

$$f_1 = f_2 = \mathcal{N}(0, 1), \qquad g_1 = \mathcal{N}(2, 1), \quad g_2 = \mathcal{N}(1, 1). \quad (48)$$

The performance gap reduces if the false alarm penalties are kept different but the delay penalties are set to the same value. The performance gap, in fact, vanishes when the false alarm and delay penalties are each held constant and the problem reduces to the classical Shiryaev case. In the next section, we provide a theoretical basis for this observation by showing that the periodic-Shiryaev algorithm is, in fact, asymptotically optimal for the classical Shiryaev formulation, as the false alarm rate goes to zero. While we could not find an example where the two algorithms have different performance for the classical Shiryaev formulation, we conjecture that such an example exists and would necessarily involve high values of the probability of false alarm.

## V Asymptotic Optimality of Single-Threshold Policies

In this section, we show that the periodic-Shiryaev algorithm is asymptotically optimal for the classical Shiryaev formulation (3) as the probability of false alarm goes to zero.

For easy reference, we recall the definition of the periodic-Shiryaev algorithm here. Define

$$p_n = P^\pi(\nu \le n \mid Y_1, \dots, Y_n) \quad (49)$$

and stop the first time this probability is above a threshold $A$, i.e., use the stopping rule

$$\tau_{ps} = \min\{n : p_n > A\}. \quad (50)$$

While the statistic $p_n$ is always well-defined in a Bayesian setting, recall that in general, for a non-i.i.d. model, the Shiryaev statistic cannot be computed recursively using a finite amount of memory [7, 3]. Another convenient way to compute the statistic is through its transformation $R_n$, defined as

$$R_n = \frac{p_n}{1 - p_n}. \quad (51)$$

The statistic $R_n$ can also be computed recursively.

###### Lemma V.1

In the i.p.i.d. setting, the statistic $R_n$ in (51) can be computed recursively as

$$R_n = \left(R_{n-1} \frac{P(\nu \ge n)}{P(\nu > n)} + \frac{\pi_n}{P(\nu > n)}\right) \frac{g_n(Y_n)}{f_n(Y_n)}, \quad (52)$$

with $R_0 = 0$. Further, if the prior is geometric, $\pi_n = (1 - \rho)^{n-1}\rho$, then the above recursion simplifies to

$$R_n = \left(\frac{R_{n-1} + \rho}{1 - \rho}\right) \frac{g_n(Y_n)}{f_n(Y_n)}. \quad (53)$$
###### Proof:

The proof is provided in the appendix.
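As a sketch (ours), one step of the geometric-prior recursion (53), with unit-variance Gaussian densities as an illustrative assumption:

```python
import math

def gauss_pdf(y, mu):
    """Density of N(mu, 1) at y."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2.0 * math.pi)

def update_R(R_prev, y, f_mu, g_mu, rho):
    """One step of (53): R_n = ((R_{n-1} + rho) / (1 - rho)) * g_n(y)/f_n(y).

    f_mu, g_mu are the means of the slot's pre- and post-change densities;
    in the i.p.i.d. case they cycle with period T. Start from R_0 = 0.
    """
    return ((R_prev + rho) / (1.0 - rho)) * gauss_pdf(y, g_mu) / gauss_pdf(y, f_mu)
```

Since $R_n = p_n / (1 - p_n)$ is increasing in $p_n$, the stopping rule (50) is equivalent to stopping when $R_n > A / (1 - A)$.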

### V-A Universal Performance Bounds For Change Detection in I.P.I.D. Processes

In this section, we obtain a universal lower bound on the performance of any stopping rule for detecting changes in an i.p.i.d. process.

Suppose there exists $d \ge 0$ such that

$$\lim_{n \to \infty} \frac{\log P(\nu > n)}{n} = -d. \quad (54)$$

If $\pi_n = (1 - \rho)^{n-1}\rho$, then

$$\frac{\log P(\nu > n)}{n} = \frac{\log (1 - \rho)^n}{n} = \frac{n \log(1 - \rho)}{n} = \log(1 - \rho).$$

Thus, $d = |\log(1 - \rho)|$.

Further, let

$$I = \frac{1}{T} \sum_{i=1}^{T} D(g_i \,\|\, f_i), \quad (55)$$

where $D(g_i \,\|\, f_i)$ is the Kullback-Leibler divergence between the densities $g_i$ and $f_i$. We assume that

$$D(g_i \,\|\, f_i) < \infty, \quad \forall i = 1, 2, \dots, T,$$

and

$$0 < I < \infty.$$
###### Theorem V.1

Let the information number $I$ be as defined in (55) and satisfy $0 < I < \infty$. Also, let $d$ be as in (54). Then, for any stopping time $\tau \in C_\alpha$, i.e., for any $\tau$ satisfying the false alarm constraint $P^\pi(\tau < \nu) \le \alpha$, we have

$$E^\pi[\tau - \nu \mid \tau \ge \nu] \ge \frac{|\log \alpha|}{I + d}(1 + o(1)), \quad \text{as } \alpha \to 0. \quad (56)$$

Here $o(1) \to 0$ as $\alpha \to 0$.

###### Proof:

The proof is provided in the appendix.

### V-B Optimality of Periodic-Shiryaev Algorithm

We now show that the periodic-Shiryaev algorithm (50) is asymptotically optimal for problem (3) as the false alarm constraint $\alpha \to 0$. We will establish the optimality by showing that the periodic-Shiryaev algorithm achieves the lower bound specified in Theorem V.1.

Let

$$Z_i = \log \frac{g_i(Y_i)}{f_i(Y_i)},$$

and define, for $\epsilon > 0$,

$$L^{(k)}_\epsilon = \sup\left\{n \ge 1 : \left|\frac{1}{n} \sum_{i=k}^{k+n-1} Z_i - I\right| > \epsilon\right\}. \quad (57)$$

We assume for simplicity that $\{f_i\}$ and $\{g_i\}$ are densities with respect to the Lebesgue measure on the real line. The results below can be easily extended to densities with respect to more general measures (including the counting measure).

###### Theorem V.2

Let

$$\int_{-\infty}^{\infty} \left(\log \frac{g_i(y)}{f_i(y)}\right)^2 g_i(y)\, dy < \infty, \quad \text{for } i = 1, 2, \dots, T. \quad (58)$$

Then, we have

$$E_k\left[L^{(k)}_\epsilon\right] < \infty, \ \forall \epsilon > 0, \ \forall k \ge 1, \qquad \sum_{k=1}^{\infty} \pi_k E_k\left[L^{(k)}_\epsilon\right] < \infty, \ \forall \epsilon > 0. \quad (59)$$

The implication of (59) is that with the threshold $A = A_\alpha = 1 - \alpha$ in (50), we have

$$E^\pi[\tau_{ps} - \nu \mid \tau_{ps} \ge \nu] \le \frac{|\log \alpha|}{I + d}(1 + o(1)), \quad \text{as } \alpha \to 0. \quad (60)$$
###### Proof:

The proof is provided in the appendix.

Thus, the periodic-Shiryaev algorithm achieves the lower bound and is asymptotically optimal. The arguments provided in the proofs of the theorems above can be extended to also establish asymptotic optimality with respect to higher order moments of the detection delay.

### V-C Numerical Results

In Fig. 3, we have plotted the average detection delay (ADD) as a function of the magnitude of the logarithm of the probability of false alarm (PFA) for the following set of parameters:

$$f_1 = f_2 = \mathcal{N}(0, 1), \qquad g_1 = \mathcal{N}(0.75, 1), \quad g_2 = \mathcal{N}(0.25, 1), \qquad \rho = 0.01. \quad (61)$$

The simulation values were obtained using Monte-Carlo sample paths. The values for the analysis curve in the figure were obtained by setting the probability of false alarm through the threshold $A = 1 - \alpha$ and using the first-order delay approximation $|\log \alpha| / (I + d)$ from (60). As can be observed from the figure, the analytical expression provides an accurate estimate of the delay.
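The analysis curve is straightforward to reproduce (a sketch under the parameters in (61); the closed form $D(\mathcal{N}(\mu_g, 1) \,\|\, \mathcal{N}(\mu_f, 1)) = (\mu_g - \mu_f)^2 / 2$ is standard):

```python
import math

rho = 0.01

def kl(mu_g, mu_f):
    """D(N(mu_g, 1) || N(mu_f, 1)) in nats."""
    return 0.5 * (mu_g - mu_f) ** 2

# Information number (55) for the parameters in (61), with T = 2:
I = 0.5 * (kl(0.75, 0.0) + kl(0.25, 0.0))
# Exponent (54) of the geometric prior:
d = -math.log(1.0 - rho)

def add_approx(alpha):
    """First-order delay approximation |log alpha| / (I + d) from (56)/(60)."""
    return abs(math.log(alpha)) / (I + d)
```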

In Fig. 4, we have plotted a typical sample path of the periodic-Shiryaev algorithm for the same set of parameters used in Fig. 3.

## VI Conclusion

We established the optimality of periodic policies for the optimal control of MDPs in which the cost structure and the transition probabilities are periodic, all with the same period. We then applied this result to solve an optimal stopping problem using the framework of partially observable MDPs. The optimal stopping problem we studied is the problem of detecting changes in i.p.i.d. processes. The exact optimality theory suggests that the optimal policy has multiple thresholds (alternating, in case the period is $T = 2$). This structural behavior, or its effect, is absent in the low false alarm regime, where we showed that using a single, fixed threshold is asymptotically optimal, as the probability of false alarm goes to zero. A Bayesian analysis often provides important insights into a problem. The insight we obtain from this paper is that when analyzing a non-Bayesian or minimax version of the problem studied here, one can conjecture that a single-threshold policy like the cumulative sum algorithm is not strictly optimal for all values of the false alarm rate [4].

###### Proof:

For $k \le n$, we have

$$P(\nu = k \mid Y_1, \dots, Y_n) = \frac{\pi_k \prod_{i=1}^{k-1} f_i(Y_i) \prod_{i=k}^{n} g_i(Y_i)}{\sum_{j=1}^{n} \pi_j \prod_{i=1}^{j-1} f_i(Y_i) \prod_{i=j}^{n} g_i(Y_i) + \Gamma_n \prod_{i=1}^{n} f_i(Y_i)},$$

where $\Gamma_n = P(\nu > n)$. This can be formally proved using the rigorous definition of conditional expectations via sub-sigma-algebras [21]. This implies

$$p_n = P(\nu \le n \mid Y_1, \dots, Y_n) = \frac{\sum_{k=1}^{n} \pi_k \prod_{i=1}^{k-1} f_i(Y_i) \prod_{i=k}^{n} g_i(Y_i)}{\sum_{k=1}^{n} \pi_k \prod_{i=1}^{k-1} f_i(Y_i) \prod_{i=k}^{n} g_i(Y_i) + \Gamma_n \prod_{i=1}^{n} f_i(Y_i)}.$$

Using this, we can obtain an expression for $1 - p_n$:

$$1 - p_n = \frac{\Gamma_n \prod_{i=1}^{n} f_i(Y_i)}{\sum_{k=1}^{n} \pi_k \prod_{i=1}^{k-1} f_i(Y_i) \prod_{i=k}^{n} g_i(Y_i) + \Gamma_n \prod_{i=1}^{n} f_i(Y_i)}.$$