 # A sequential test for the drift of a Brownian motion with a possibility to change a decision

We construct a Bayesian sequential test of two simple hypotheses about the value of the unobservable drift coefficient of a Brownian motion, with the possibility of changing the initial decision at subsequent moments of time for some penalty. Such a testing procedure allows one to correct the initial decision if it turns out to be wrong. The test is based on observation of the posterior mean process: it makes the initial decision and, possibly, changes it later, when this process crosses certain thresholds. The solution of the problem is obtained by reducing it to joint optimal stopping and optimal switching problems.

## 1 Introduction

We consider a problem of sequential testing of two simple hypotheses about the value of the unknown drift coefficient of a Brownian motion. In usual sequential testing problems (see e.g. the seminal works [10, 8, 4] or the recent monographs [1, 9]), a testing procedure must be terminated at some stopping time and a decision about the hypotheses must be made. In contrast, in the present paper we propose a new setting, where a testing procedure does not terminate and it is allowed to change the initial decision (for the price of paying some penalty) if, given later observations, it turns out that it is incorrect.

We will work in a Bayesian setting and assume that the drift coefficient has a known prior distribution on a set of two values. A decision rule consists of an initial decision $(\tau_0, d)$, where $\tau_0$ is the moment at which the decision is made and $d$ is a two-valued function showing which hypothesis is accepted initially, and a sequence of stopping times $\tau_1 \le \tau_2 \le \dots$, at which the decision can be changed later. The goal is to minimize a penalty function which consists of three parts: a penalty for the waiting time until the initial decision, a penalty for a wrong decision proportional to the time during which the corresponding wrong hypothesis is being accepted, and a penalty for each change of a decision.

This study was motivated by the paper [3], where a sequential multiple changepoint detection problem was considered. That problem consists in tracking the value of the unobservable drift coefficient of a Brownian motion, which is modeled by a telegraph process (a two-state Markov process) switching between two values at random times. In the present paper, we deal with a similar tracking procedure and a penalty function, but the difference is that the unobservable drift coefficient does not change. Among other results on multiple changepoint detection, one can mention the paper [6], where a tracking problem for a general two-state Markov process with a Brownian noise was considered, and the paper [2], which studied a tracking problem for a compound Poisson process.

We solve our problem by first representing it as a combination of an optimal stopping problem and an optimal switching problem (an optimal switching problem is an optimal control problem where the control process assumes only two values). The optimal stopping problem allows one to find the initial stopping time, while the subsequent moments when the decision is changed are found from the optimal switching problem. Consequently, the value function of the optimal switching problem becomes the payoff function of the optimal stopping problem. Then both problems are solved by reducing them to free-boundary problems associated with the generator of the posterior mean process of the drift coefficient. We consider only the symmetric case (i.e. type I and type II errors are of the same importance), in which the solution turns out to be of the following structure. First an observer waits until the posterior mean process exits from some interval $(-A, A)$ and at that moment of time makes the initial decision. Future changes of the decision occur when the posterior mean process crosses some thresholds $-B$ and $B$. The constants $A$, $B$ are found as unique solutions of certain equations.

The rest of the paper consists of three sections: Section 2 describes the problem, Section 3 states the main theorem, which provides the optimal decision rule, and Section 4 contains its proof.

## 2 The model and the optimality criterion

Let $(\Omega, \mathcal{F}, \mathsf{P})$ be a complete probability space. Suppose one can observe a process $X = (X_t)_{t \ge 0}$ defined on this probability space by the relation

$$X_t = \mu\theta t + B_t, \tag{1}$$

where $B = (B_t)_{t \ge 0}$ is a standard Brownian motion, $\mu$ is a known constant, and $\theta$ is a $\{-1, 1\}$-valued random variable independent of $B$. It is assumed that neither $\theta$ nor $B$ can be observed directly. The goal is to find out whether $\theta = 1$ or $\theta = -1$ by observing the process $X$ sequentially. Note that the case when the drift coefficient of $X$ can take on two arbitrary values $\mu_1 \neq \mu_2$ can be reduced to (1) by considering the process $X_t - \frac{1}{2}(\mu_1 + \mu_2)t$.

We will assume that the prior distribution of $\theta$ is known and is characterized by the probability $p = \mathsf{P}(\theta = 1)$. Recall that in usual settings of sequential testing problems an observer must choose a stopping time $\tau$ of the (completed and right-continuous) filtration $(\mathcal{F}^X_t)_{t \ge 0}$ generated by $X$, at which the observation is stopped, and an $\mathcal{F}^X_\tau$-measurable function $d$ with values $-1$ or $1$ that shows which of the two hypotheses is accepted at time $\tau$. The choice of $(\tau, d)$ depends on a particular optimality criterion which combines penalties for type I and type II errors, and a penalty for observation duration. But, in any case, a test terminates at time $\tau$.

In this paper we will focus on a setting where an observer can change a decision made initially at time $\tau$ and the testing procedure does not terminate.

By a decision rule we will call a triple $\delta = (\tau_0, d, (\tau_n)_{n \ge 1})$, where $\tau_0$ is an $(\mathcal{F}^X_t)$-stopping time, $d$ is an $\mathcal{F}^X_{\tau_0}$-measurable function which assumes values $\pm 1$, and $(\tau_n)_{n \ge 1}$ is a sequence of $(\mathcal{F}^X_t)$-stopping times such that $\tau_n \le \tau_{n+1}$ for all $n \ge 0$. At the moment $\tau_0$, the initial decision $d$ is made. Later, if necessary, an observer can change the decision to the opposite one, and the moments of change are represented by the sequence $(\tau_n)_{n \ge 1}$. Thus, if, for example, $d = 1$, then at $\tau_0$ an observer decides that $\theta = 1$ and at $\tau_1$ switches the opinion to $\theta = -1$; at $\tau_2$ switches back to $\theta = 1$, and so on. It may be the case that $\tau_n = +\infty$ starting from some $n$; then the decision is changed only a finite number of times (the optimal rule we construct below will have this property with probability 1).

With a given decision rule $\delta$, associate the $(\mathcal{F}^X_t)$-adapted process $D^\delta = (D^\delta_t)_{t \ge 0}$ which expresses the current decision at time $t$,

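$$D^\delta_t = \begin{cases} 0, & \text{if } t < \tau_0, \\ d, & \text{if } t \in [\tau_{2n}, \tau_{2n+1}), \\ -d, & \text{if } t \in [\tau_{2n+1}, \tau_{2n+2}), \end{cases}$$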

and define the Bayesian risk function

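$$R(\delta) = \mathsf{E}\Bigl(c_0\tau_0 + c_1\int_{\tau_0}^{\infty} I(D^\delta_t \neq \theta)\,dt + c_2\sum_{t > \tau_0} I(D^\delta_{t-} \neq D^\delta_t)\Bigr), \tag{2}$$

where $c_0, c_1, c_2 > 0$ are given constants.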

The problem that we consider consists in finding a decision rule $\delta^*$ which minimizes $R(\delta)$, i.e.

$$R(\delta^*) = \inf_{\delta} R(\delta).$$

Such a decision rule will be called optimal.

One can give the following interpretation to the terms under the expectation in (2). The term $c_0\tau_0$ is a penalty for a delay until making the initial decision. The next term is a penalty for making a wrong decision, which is proportional to the time during which the wrong hypothesis is being accepted. The last term is a penalty for changing a decision, in the amount $c_2$ for each change. Note that the problem we consider is symmetric (i.e. type I and type II errors are penalized in the same way); in principle, an asymmetric setting can be studied as well.
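To make the three terms in (2) concrete, the following minimal sketch evaluates the expression under the expectation for one realized decision path. The function name `realized_cost`, its arguments, and the finite `horizon` that truncates the infinite time integral are assumptions introduced here for illustration only.

```python
def realized_cost(theta, tau0, d, switch_times, horizon, c0, c1, c2):
    """Pathwise value of the expression under the expectation in (2).

    Assumption of this sketch: the time integral is truncated at `horizon`
    (with tau0 <= horizon), so only switches before the horizon are counted.
    """
    if tau0 > horizon:
        raise ValueError("this sketch assumes tau0 <= horizon")
    # Moments at which a decision starts to be in force, closed by the horizon.
    times = [tau0] + [t for t in switch_times if t < horizon] + [horizon]
    decision = d
    wrong_time = 0.0
    for start, end in zip(times[:-1], times[1:]):
        if decision != theta:          # wrong hypothesis accepted on [start, end)
            wrong_time += end - start
        decision = -decision           # the decision flips at each switch
    n_switches = len(times) - 2
    return c0 * tau0 + c1 * wrong_time + c2 * n_switches


# Initial decision -1 at time 2 (wrong, since theta = 1), corrected at time 5.
cost = realized_cost(theta=1, tau0=2.0, d=-1, switch_times=[5.0],
                     horizon=10.0, c0=2/3, c1=1.0, c2=1.5)
```

In this usage example the cost is made of a waiting penalty for 2 time units, a wrong-decision penalty for 3 time units, and one switching penalty.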

## 3 The main result

To state the main result about the optimal decision rule, introduce the posterior mean process

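$$M_t = \mathsf{E}(\theta \mid \mathcal{F}^X_t).$$

As follows from known results, the process $M = (M_t)_{t \ge 0}$ satisfies the stochastic differential equation

$$dM_t = \mu(1 - M_t^2)\,d\widetilde{B}_t, \qquad M_0 = 2p - 1, \tag{3}$$

where $\widetilde{B}$ is a Brownian motion with respect to the filtration $(\mathcal{F}^X_t)_{t \ge 0}$ (an innovation process, see, e.g., Chapter 7 in [5]), which satisfies the equation

$$d\widetilde{B}_t = dX_t - \mu M_t\,dt.$$

Representation (3) can be obtained either directly from filtering theorems (see Theorem 9.1 in [5]) or from the known equation for the posterior probability process $\pi_t = \mathsf{P}(\theta = 1 \mid \mathcal{F}^X_t)$ (see Chapter VI in [7]) since $M_t = 2\pi_t - 1$. In the explicit form, $M_t$ can be expressed through the observable process $X_t$ as

$$M_t = 1 - \frac{2(1 - p)}{p\,e^{2\mu X_t} + 1 - p}.$$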
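The explicit formula makes the posterior mean easy to compute from observations. The sketch below simulates the observable process on a time grid and maps it to the posterior mean; the function names, the grid size, and the seed are hypothetical choices made for illustration, and the parameter values are those of the figure caption in this section.

```python
import math
import random


def posterior_mean(x, p, mu):
    """Posterior mean M_t as a function of the observed value X_t = x."""
    return 1.0 - 2.0 * (1.0 - p) / (p * math.exp(2.0 * mu * x) + 1.0 - p)


def simulate_observations(theta, mu, horizon, n_steps, seed=0):
    """Sample X_t = mu * theta * t + B_t on an equally spaced time grid."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    xs = [0.0]
    for _ in range(n_steps):
        # Independent Gaussian increments with mean mu*theta*dt and variance dt.
        xs.append(xs[-1] + mu * theta * dt + rng.gauss(0.0, math.sqrt(dt)))
    return xs


xs = simulate_observations(theta=1, mu=1/3, horizon=50.0, n_steps=5000)
ms = [posterior_mean(x, p=0.5, mu=1/3) for x in xs]
```

At time zero the formula returns the prior mean $2p - 1$, in agreement with the initial condition in (3).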

Introduce the two thresholds $A$ and $B$, which depend on the parameters of the problem and define the switching boundaries for the optimal decision rule. The threshold $B$ is defined as the solution of the equation

$$\ln\frac{1 - B}{1 + B} + \frac{2B}{1 - B^2} = \frac{2\mu^2 c_2}{c_1}, \tag{4}$$

and the threshold $A$ is defined as the solution of the equation

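$$(2c_0 - c_1)\ln\frac{1 + A}{1 - A} + \frac{4c_0 A}{1 - A^2} + \frac{2c_1}{1 + A} = \frac{2c_1}{1 - B^2}. \tag{5}$$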

The next simple lemma shows that $A$ and $B$ are well-defined. Its proof is rather straightforward and is omitted.

###### Lemma.

Equations (4), (5) have unique solutions $A, B \in (0, 1)$. If , then .
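A minimal numerical sketch of how the two thresholds might be computed by bisection is given below. The equation for $B$ is (4). For $A$, the code uses the smooth-fit condition $V'(A) = U'(A, 1)$ from the Remark in Section 4, rearranged by hand from the explicit formulas for $U$ and $V$; treat that rearrangement, the function names, and the bracketing interval as assumptions of this sketch rather than as the paper's own procedure.

```python
import math


def bisect(f, lo, hi, tol=1e-12):
    """Bisection for a function with exactly one sign change on (lo, hi)."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def threshold_B(mu, c1, c2):
    """Solve equation (4) for B in (0, 1)."""
    rhs = 2.0 * mu**2 * c2 / c1

    def f(b):
        return math.log((1 - b) / (1 + b)) + 2 * b / (1 - b * b) - rhs

    return bisect(f, 1e-12, 1 - 1e-12)


def threshold_A(c0, c1, B):
    """Solve V'(A) = U'(A, 1) for A in (0, 1) (hand-rearranged, an assumption)."""
    def g(a):
        log_term = math.log((1 + a) / (1 - a))
        return ((c1 - 2 * c0) * log_term - 4 * c0 * a / (1 - a * a)
                - 2 * c1 / (1 + a) + 2 * c1 / (1 - B * B))

    return bisect(g, 1e-12, 1 - 1e-12)


# Parameter values from the caption of Figure 1.
B = threshold_B(mu=1/3, c1=1.0, c2=1.5)
A = threshold_A(c0=2/3, c1=1.0, B=B)
```

Bisection is used here instead of a library root finder only to keep the sketch free of dependencies.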

The following theorem, being the main result of the paper, provides the optimal decision rule in an explicit form.

###### Main Theorem.

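The optimal decision rule $\delta^*$ consists of the stopping time $\tau_0^*$ and the decision function $d^*$ defined by the formulas

$$\tau_0^* = \inf\{t \ge 0 : |M_t| \ge A\}, \qquad d^* = \operatorname{sgn} M_{\tau_0^*},$$

and the sequence of stopping times $(\tau_n^*)_{n \ge 1}$, which on the event $\{d^* = 1\}$ are defined by the formulas

$$\tau_{2k+1}^* = \inf\{t \ge \tau_{2k}^* : M_t \le -B\}, \qquad \tau_{2k+2}^* = \inf\{t \ge \tau_{2k+1}^* : M_t \ge B\}, \tag{6}$$

and on the event $\{d^* = -1\}$ by the formulas

$$\tau_{2k+1}^* = \inf\{t \ge \tau_{2k}^* : M_t \ge B\}, \qquad \tau_{2k+2}^* = \inf\{t \ge \tau_{2k+1}^* : M_t \le -B\} \tag{7}$$

(where $\inf\emptyset = +\infty$).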
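The threshold rule of the theorem can be sketched on a discretized path of the posterior mean as follows. The function name `apply_rule` and the use of grid indices in place of continuous stopping times are assumptions made for illustration; on a grid the path may overshoot a threshold, so the indices only approximate the stopping times in (6) and (7).

```python
def apply_rule(ms, A, B):
    """Apply the threshold rule to a discretized path `ms` of the posterior mean.

    Returns (index of the initial decision, initial decision d, list of
    indices at which the decision is switched). If the path never leaves
    (-A, A), returns (None, 0, []).
    """
    tau0 = next((i for i, m in enumerate(ms) if abs(m) >= A), None)
    if tau0 is None:
        return None, 0, []
    d = 1 if ms[tau0] > 0 else -1
    decision, switches = d, []
    for i in range(tau0, len(ms)):
        # Decision 1 is abandoned when M <= -B, decision -1 when M >= B.
        if (decision == 1 and ms[i] <= -B) or (decision == -1 and ms[i] >= B):
            decision = -decision
            switches.append(i)
    return tau0, d, switches


# Hypothetical path and thresholds, chosen only to exercise the rule.
path = [0.0, 0.2, -0.4, -0.5, 0.1, 0.6, 0.7, -0.6]
result = apply_rule(path, A=0.37, B=0.55)
```

For this path the rule decides $-1$ at index 2, switches to $1$ at index 5, and switches back at index 7.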

###### Example.

Figure 1 illustrates how the optimal decision rule works. In this example, we take $p = 0.5$, $\mu = 1/3$, $c_0 = 2/3$, $c_1 = 1$, $c_2 = 3/2$. The thresholds $A$ and $B$ can be found numerically.

The simulated path on the left graph has $\theta = 1$. The rule first waits until the process $M_t$ exits from the interval $(-A, A)$. Since in this example it exits through the lower boundary (at time $\tau_0^*$), the initial decision is $d^* = -1$ (incorrect). Then the rule waits until $M_t$ crosses the threshold $B$, and changes the decision to $\theta = 1$ at time $\tau_1^*$.

Figure 1: Left: the process $X_t$; right: the process $M_t$. Parameters: $p = 0.5$, $\mu = 1/3$, $c_0 = 2/3$, $c_1 = 1$, $c_2 = 3/2$.

## 4 Proof of the Main Theorem

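Let us denote by $\mathsf{P}_x$ and $\mathsf{E}_x$ the probability measure and the expectation under the assumption $\mathsf{P}(\theta = 1) = (1 + x)/2$, so the posterior mean process starts from the value $M_0 = x$. It is easy to verify that, for $t \ge \tau_0$,

$$\mathsf{P}_x(D^\delta_t \neq \theta \mid \mathcal{F}^X_t) = \frac{1 - M_t D^\delta_t}{2},$$

and, by taking intermediate conditioning with respect to $\mathcal{F}^X_t$ in (2), we can see that we need to solve the problem

$$V^*(x) = \inf_{\delta} \mathsf{E}_x\Bigl(c_0\tau_0 + \frac{c_1}{2}\int_{\tau_0}^{\infty}(1 - M_t D^\delta_t)\,dt + c_2\sum_{t > \tau_0} I(D^\delta_{t-} \neq D^\delta_t)\Bigr), \qquad x \in [-1, 1] \tag{8}$$

(by “to solve” we mean to find $\delta$ at which the infimum is attained for a given $x$; in passing we will also find the function $V^*(x)$ in an explicit form).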
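A short sketch of why the conditional probability takes this form, under the assumption that $t \ge \tau_0$, so that the current decision takes values in $\{-1, 1\}$ and is measurable with respect to the observations up to time $t$:

```latex
\begin{align*}
I(D_t^\delta \neq \theta) &= \tfrac12\bigl(1 - \theta D_t^\delta\bigr)
  && \text{since } \theta,\, D_t^\delta \in \{-1, 1\}, \\
\mathsf{P}_x\bigl(D_t^\delta \neq \theta \mid \mathcal{F}_t^X\bigr)
  &= \tfrac12\bigl(1 - D_t^\delta\, \mathsf{E}_x(\theta \mid \mathcal{F}_t^X)\bigr)
  && \text{since } D_t^\delta \text{ is } \mathcal{F}_t^X\text{-measurable}, \\
  &= \tfrac12\bigl(1 - M_t D_t^\delta\bigr)
  && \text{by the definition of } M_t .
\end{align*}
```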

Observe that there exists the limit $M_\infty := \lim_{t \to \infty} M_t = \theta$ a.s. Hence the solution of problem (8) should be looked for only among decision rules $\delta$ such that $D^\delta$ has a finite number of jumps and $D^\delta_\infty = M_\infty$ (note that the rule $\delta^*$ satisfies these conditions). In view of this, for a stopping time $\tau_0$ denote by $\mathcal{D}(\tau_0)$ the class of all $(\mathcal{F}^X_t)$-adapted càdlàg processes $D = (D_t)_{t \ge 0}$ such that, with probability 1, they assume values $\pm 1$ after $\tau_0$, have a finite number of jumps, and satisfy the condition $D_\infty = M_\infty$. Let $U^*(\tau_0)$ be the value of the following optimal switching problem:

$$U^*(\tau_0) = \inf_{D \in \mathcal{D}(\tau_0)} \mathsf{E}_x\Bigl(\frac{c_1}{2}\int_{\tau_0}^{\infty}(1 - M_t D_t)\,dt + c_2\sum_{t > \tau_0} I(D_{t-} \neq D_t)\Bigr). \tag{9}$$

Consequently, problem (8) can be written in the form

$$V^*(x) = \inf_{\tau_0} \mathsf{E}_x\bigl(c_0\tau_0 + U^*(\tau_0)\bigr). \tag{10}$$

Thus, to show that the decision rule $\delta^*$ is optimal, it will be enough to show that $\tau_0^*$ delivers the infimum in the problem $V^*$, and $D^{\delta^*}$ delivers the infimum in the problem $U^*(\tau_0^*)$. In order to do that, we are going to use the usual approach based on “guessing” a solution and then verifying it using Itô’s formula. Since this approach does not show how to actually find the functions $V^*$ and $U^*$, in the remark after the proof we provide heuristic arguments that can be used for that.

We will first deal with $U^*$. Let $B$ be the constant from (4). Introduce the “candidate” function $U(x, y)$, $x \in [-1, 1]$, $y \in \{-1, 1\}$, defined by

$$U(x, 1) = \frac{c_1(1 - x)}{4\mu^2}\Bigl(\ln\frac{1 + x}{1 - x} + \frac{2}{1 - B^2}\Bigr), \qquad x \in (-B, 1], \tag{11}$$

$$U(x, 1) = U(-x, 1) + c_2, \qquad x \in [-1, -B], \tag{12}$$

$$U(x, -1) = U(-x, 1), \qquad x \in [-1, 1] \tag{13}$$

(see Figure 2, which depicts the function $U(x, y)$, as well as the function $V(x)$ defined below, with the same parameters as in the example in the previous section).

Figure 2: The functions $V(x)$ and $U(x, y)$. The parameters $\mu, c_0, c_1, c_2$ are the same as in Figure 1.

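We are going to show that $U^*(\tau_0) = U(|M_{\tau_0}|, 1)$. Let $Lf$ denote the application of the generator of the process $M$ to a sufficiently smooth function $f$, i.e.

$$Lf(x) = \frac{\mu^2}{2}(1 - x^2)^2\,\frac{\partial^2}{\partial x^2} f(x).$$

By $U'$ and $\Delta U$ denote, respectively, the derivative with respect to the first argument, and the difference with respect to the second argument of $U$, i.e.

$$U'(x, y) = \frac{\partial U}{\partial x}(x, y), \qquad \Delta U(x, y) = U(x, y) - U(x, -y).$$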

From the above explicit construction (11)–(13), it is not difficult to check that has the following properties:

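1. $U(\cdot, y) \in C^1$ in $(-1, 1)$ for $y = \pm 1$, and $U(\cdot, y) \in C^2$ in $(-1, 1)$ except at points $x = -yB$;

2. $(1 - x^2)\,U'(x, y)$ is bounded for $x \in (-1, 1)$, $y = \pm 1$;

3. $LU(x, y) = -\frac{c_1}{2}(1 - xy)$ if $xy > -B$, and $LU(x, y) \ge -\frac{c_1}{2}(1 - xy)$ if $xy < -B$;

4. $\Delta U(x, y) = -c_2$ if $xy \ge B$, and $\Delta U(x, y) > -c_2$ if $xy < B$.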

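Consider any process $D \in \mathcal{D}(\tau_0)$ and let $(\tau_n)_{n \ge 1}$ be the sequence of the moments of its jumps after $\tau_0$. Property 1 allows one to apply Itô’s formula to the process $U(M_t, D_t)$, from which for any $s \ge \tau_0$ we obtain

$$U(M_s, D_s) = U(M_{\tau_0}, D_{\tau_0}) + \int_{\tau_0}^{s} LU(M_t, D_t)\,dt + \mu\int_{\tau_0}^{s}(1 - M_t^2)\,U'(M_t, D_t)\,d\widetilde{B}_t + \sum_{\tau_0 < t \le s} \Delta U(M_t, D_t)\,I(D_{t-} \neq D_t). \tag{14}$$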

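Take the conditional expectation of both sides of (14) with respect to $\mathcal{F}^X_{\tau_0}$. By 2, the integrand in the stochastic integral is uniformly bounded, so its expectation is zero. Passing to the limit $s \to \infty$ and using the equality $D_\infty = M_\infty$, which implies $U(M_s, D_s) \to 0$ as $s \to \infty$, we obtain

$$U(M_{\tau_0}, D_{\tau_0}) \le \mathsf{E}_x\Bigl(\frac{c_1}{2}\int_{\tau_0}^{\infty}(1 - M_t D_t)\,dt + c_2\sum_{t > \tau_0} I(D_t \neq D_{t-}) \Bigm| \mathcal{F}^X_{\tau_0}\Bigr), \tag{15}$$

where to get the inequality we used property 3 for the first term under the expectation and 4 for the second term. Taking the infimum of both sides of (15) over $D \in \mathcal{D}(\tau_0)$ we find

$$U(M_{\tau_0}, D_{\tau_0}) \le U^*(\tau_0). \tag{16}$$

On the other hand, if the process $D$ is such that $D_{\tau_0} = \operatorname{sgn} M_{\tau_0}$ (let $\operatorname{sgn} 0 = 1$, if necessary) and its jumps after $\tau_0$ are identified with the sequence $(\tau_n)_{n \ge 1}$ defined as in (6)–(7) but with arbitrary $\tau_0$ in place of $\tau_0^*$, then we would have the equality in (15), as follows from 3 and 4. Together with (16), this implies that $U^*(\tau_0) = U(|M_{\tau_0}|, 1)$ and the infimum in the definition of $U^*(\tau_0)$ is attained at this process $D$.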

Let us now consider the problem $V^*$. As follows from the above arguments, we can write it in the form

$$V^*(x) = \inf_{\tau_0} \mathsf{E}_x\bigl(c_0\tau_0 + U(|M_{\tau_0}|, 1)\bigr). \tag{17}$$

It is clear that it is enough to take the infimum only over stopping times with finite expectation.

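Let $A$ be the constant defined in (5), and put

$$K = \Bigl(\frac{c_1(1 - A)}{4\mu^2} + \frac{c_0 A}{2\mu^2}\Bigr)\ln\frac{1 + A}{1 - A} + \frac{c_1(1 - A)}{2\mu^2(1 - B^2)}. \tag{18}$$

Introduce the “candidate” function $V(x)$, $x \in [-1, 1]$:

$$V(x) = \frac{c_0 x}{2\mu^2}\ln\frac{1 - x}{1 + x} + K, \qquad |x| < A, \tag{19}$$

$$V(x) = U(|x|, 1), \qquad |x| \ge A. \tag{20}$$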

It is straightforward to check that has the following properties:

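1. $V \in C^1$ in $(-1, 1)$, and $V \in C^2$ in $(-1, 1)$ except at points $x = \pm A$;

2. $(1 - x^2)\,V'(x)$ is bounded for $x \in (-1, 1)$;

3. $LV(x) = -c_0$ if $|x| < A$, and $LV(x) \ge -c_0$ if $|x| > A$;

4. $V(x) = U(|x|, 1)$ if $|x| \ge A$, and $V(x) \le U(|x|, 1)$ if $|x| < A$.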

Applying Itô’s formula to the process $V(M_t)$ and taking the expectation, for any stopping time $\tau_0$ with $\mathsf{E}_x\tau_0 < \infty$ we obtain

$$\mathsf{E}_x V(M_{\tau_0}) = V(x) + \mathsf{E}_x\int_0^{\tau_0} LV(M_s)\,ds$$

(Itô’s formula can be applied in view of 1; the expectation of the stochastic integral, which appears in it, is zero in view of 2 and the finiteness of $\mathsf{E}_x\tau_0$).

From 3 and 4, we find

$$V(x) \le \mathsf{E}_x\bigl(c_0\tau_0 + U(|M_{\tau_0}|, 1)\bigr), \tag{21}$$

so, after taking the infimum over $\tau_0$, we get $V(x) \le V^*(x)$. On the other hand, for the stopping time $\tau_0^*$ we have the equality in (21), so $V(x) = V^*(x)$. Consequently, $\tau_0^*$ solves the problem $V^*$.

The proof is complete.

###### Remark.

The above proof does not explain how to find the functions $V$ and $U$. Here we provide arguments which are based on well-known ideas from the optimal stopping theory and allow one to do that. The reader is referred, e.g., to the monograph [7] for details.

Since the process $M$ is Markov, we can expect that the optimal process $D$ for $U^*$ should depend only on current values of $M$ and $D$. Moreover, it is natural to assume that $D$ should switch from $-1$ to $1$ when $M_t$ becomes close to $1$, and switch from $1$ to $-1$ when $M_t$ becomes close to $-1$. The symmetry of the problem suggests that there should be a threshold $B$ such that the switching occurs when $M_t$ crosses the levels $\pm B$. This means that the optimal sequence of stopping times is of the form (6)–(7). Consequently, in the set $\{(x, y) : xy > -B\}$, where $x$ corresponds to the value of $M_t$ and $y$ corresponds to the value of $D_t$, one should continue using the current value of $D_t$, while in the set $\{(x, y) : xy \le -B\}$ one should switch to the opposite one. In what follows, we will call these sets the continuation set and the switching set, respectively.

Next we need to find $B$. Introduce the value function $U(x, y)$ (cf. (9); it turns out to be the same function which appears in the proof):

$$U(x, y) = \inf_{D} \mathsf{E}_x\Bigl(\frac{c_1}{2}\int_0^{\infty}(1 - M_t D_t)\,dt + c_2 I(D_0 \neq y) + c_2\sum_{t > 0} I(D_{t-} \neq D_t)\Bigr),$$

where the infimum is taken over all càdlàg processes $D$ which are adapted to the filtration generated by $M$, take on values $\pm 1$, and have a finite number of jumps. In the switching set, we have

$$U(x, y) = U(x, -y) + c_2.$$

From the general theory (see Chapter III in [7]), we can expect that the value function in the continuation set solves the ODE

$$LU(x, y) = -\frac{c_1}{2}(1 - xy).$$

Its general solution can be found explicitly:

$$U_{\mathrm{gen}}(x, 1) = \frac{c_1(1 - x)}{4\mu^2}\ln\frac{1 + x}{1 - x} + K_1 x + K_2,$$

where $K_1$ and $K_2$ are constants. Since we have $U(1, 1) = 0$ (if $M_0 = 1$, then $M_t = 1$ for all $t \ge 0$ and the optimal process is $D_t \equiv 1$), we get $K_1 = -K_2$. To find $K_2$ and $B$, we can employ the continuous fit and smooth fit conditions, also known from the general theory, which state that at the boundary of the continuation set, i.e. at the points $(x, y)$ with $xy = -B$, the value function satisfies the equations

$$U(-B, 1) = U(-B, -1) + c_2, \qquad U'(-B, 1) = U'(-B, -1)$$

(here $x = -B$, $y = 1$; the pair $x = B$, $y = -1$ gives the same equations due to the symmetry of the problem). Solving these equations gives formulas (11)–(13) for $U$.
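The fit conditions and the ODE can be checked numerically. The sketch below builds the candidate function from (11)–(13), solves (4) for $B$ by bisection, and evaluates the ODE residual with a central finite difference. The helper names, the step size `h`, and the bracketing interval are hypothetical choices made for illustration; the parameter values are those of the example in Section 3.

```python
import math


def solve_B(mu, c1, c2, tol=1e-12):
    """Bisection for equation (4); its left-hand side is increasing in B."""
    rhs = 2.0 * mu**2 * c2 / c1
    lo, hi = 1e-12, 1 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log((1 - mid) / (1 + mid)) + 2 * mid / (1 - mid**2) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def make_U(mu, c1, c2, B):
    """Candidate function U(x, y) from (11)-(13), for -1 < x < 1."""
    def U(x, y):
        if y == -1:
            return U(-x, 1)                               # (13)
        if x <= -B:
            return U(-x, 1) + c2                          # (12)
        return c1 * (1 - x) / (4 * mu**2) * (
            math.log((1 + x) / (1 - x)) + 2 / (1 - B**2))  # (11)
    return U


def ode_residual(U, x, y, mu, c1, h=1e-4):
    """L U(x, y) + (c1 / 2)(1 - x y), with U'' from a central difference."""
    second = (U(x + h, y) - 2 * U(x, y) + U(x - h, y)) / h**2
    return 0.5 * mu**2 * (1 - x**2)**2 * second + 0.5 * c1 * (1 - x * y)


mu, c1, c2 = 1/3, 1.0, 1.5
B = solve_B(mu, c1, c2)
U = make_U(mu, c1, c2, B)
residual = ode_residual(U, 0.3, 1, mu, c1)       # a point of the continuation set
fit_gap = U(-B, 1) - U(-B + 1e-9, 1)             # continuous fit across x = -B
```

Both `residual` and `fit_gap` should be close to zero, up to discretization and rounding error.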

To find the function $V$ we use a similar approach. From the representation as a standard optimal stopping problem (17), we can expect that the optimal stopping time should be the first exit time of the process $M_t$ from some continuation set. Taking into account the original formulation of the problem as a sequential test, it is natural to assume that the initial decision should be made at a moment when the posterior mean becomes close to $1$ or $-1$, i.e. the continuation set for $V$ should be an interval $(-A, A)$. As follows from the general theory, $V$ in the continuation set satisfies the ODE

$$LV(x) = -c_0,$$

which has the general solution

$$V_{\mathrm{gen}}(x) = \frac{c_0 x}{2\mu^2}\ln\frac{1 - x}{1 + x} + K_3 x + K_4.$$
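As a quick check, one can verify directly that the logarithmic part of this general solution satisfies the ODE, while the linear part $K_3 x + K_4$ is annihilated by the generator. The auxiliary function $g$ below is notation introduced only for this sketch.

```latex
\begin{align*}
g(x) &= x \ln\frac{1-x}{1+x}, \\
g'(x) &= \ln\frac{1-x}{1+x} - \frac{2x}{1-x^2}, \\
g''(x) &= -\frac{2}{1-x^2} - \frac{2(1+x^2)}{(1-x^2)^2} = -\frac{4}{(1-x^2)^2}, \\
L\Bigl(\frac{c_0}{2\mu^2}\, g\Bigr)(x)
  &= \frac{\mu^2}{2}(1-x^2)^2 \cdot \frac{c_0}{2\mu^2} \cdot \Bigl(-\frac{4}{(1-x^2)^2}\Bigr) = -c_0 .
\end{align*}
```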

Due to the symmetry of the problem, we have $V(x) = V(-x)$, so $K_3 = 0$. Then the constants $K_4$ and $A$ can be found from the continuous fit and smooth fit conditions at $x = A$:

$$V(A) = U(A, 1), \qquad V'(A) = U'(A, 1).$$

These equations give the function $V$ defined in (19)–(20), with $K_4 = K$ from (18).

## References

[1] J. Bartroff, T. L. Lai, and M. Shih (2012). Sequential experimentation in clinical trials: design and analysis. Springer Science & Business Media, New York.

[2] E. Bayraktar and M. Ludkovski (2009). Sequential tracking of a hidden Markov chain using point process observations. Stochastic Processes and their Applications 119 (6), pp. 1792–1822.

[3] P. V. Gapeev (2015). Bayesian switching multiple disorder problems. Mathematics of Operations Research 41 (3), pp. 1108–1124.

[4] A. Irle and N. Schmitz (1984). On the optimality of the SPRT for processes with continuous time parameter. Statistics: A Journal of Theoretical and Applied Statistics 15 (1), pp. 91–104.

[5] R. S. Liptser and A. N. Shiryaev (2001). Statistics of random processes I, II. Springer-Verlag, Berlin.

[6] A. Muravlev, M. Urusov, and M. Zhitlukhin (2019). Sequential tracking of an unobservable two-state Markov process under Brownian noise. arXiv 1908.01162 (to appear in Sequential Analysis).

[7] G. Peskir and A. Shiryaev (2006). Optimal stopping and free-boundary problems. Birkhäuser Verlag, Basel.

[8] A. N. Shiryaev (1967). Two problems of sequential analysis. Cybernetics 3 (2), pp. 63–69.

[9] A. Tartakovsky, I. Nikiforov, and M. Basseville (2014). Sequential analysis: hypothesis testing and changepoint detection. CRC Press, Boca Raton.

[10] A. Wald and J. Wolfowitz (1948). Optimum character of the sequential probability ratio test. The Annals of Mathematical Statistics 19 (3), pp. 326–339.