Value of Information in Feedback Control

12/18/2018 ∙ by Touraj Soleymani, et al. ∙ University of Maryland, Technische Universität München

In this article, we investigate the impact of information on networked control systems, and illustrate how to quantify a fundamental property of stochastic processes that can enrich our understanding about such systems. To that end, we develop a theoretical framework for the joint design of an event trigger and a controller in optimal event-triggered control. We cover two distinct information patterns: perfect information and imperfect information. In both cases, observations are available at the event trigger instantly, but are transmitted to the controller sporadically with one-step delay. For each information pattern, we characterize the optimal triggering policy and optimal control policy such that the corresponding policy profile represents a Nash equilibrium. Accordingly, we quantify the value of information VoI_k as the variation in the cost-to-go of the system given an observation at time k. Finally, we provide an algorithm for approximation of the value of information, and synthesize a closed-form suboptimal triggering policy with a performance guarantee that can readily be implemented.


I Introduction

Networked control systems, wherein feedback loops are closed over communication networks, have received much attention in the last two decades [1]. One of the main challenges in networked control systems is resource constraints, caused by various limitations in communication, computation, and energy, which can severely affect the overall system performance. Traditionally, in a networked control system, observations of the process are periodically sampled and transmitted to the controller because periodic sampling and transmission facilitate the design of such a system [2]. However, it has since been recognized that not every sampled observation of the process has the same effect on the performance of a networked control system, and that one can employ a transmission mechanism, i.e., an event trigger, that transmits an observation only when a significant event occurs [3]. As a result, one can expect a reduction in the sampling rate for transmission in event-triggered control. This elegant idea has led to the extensive development and employment of event triggers in different contexts including consensus of multi-agent systems [4], distributed optimization [5], medium access control [6], and model predictive control [7].

In this article, we investigate the impact of information on networked control systems, and illustrate how to quantify a fundamental property of stochastic processes that can enrich our understanding of such systems. Undoubtedly, the transmission of an observation decreases the uncertainty of the controller, and hence increases the control performance in a networked control system. However, the magnitude of this increase in the control performance has remained unknown. Here, in the context of optimal event-triggered control, we quantify the value of information $\mathrm{VoI}_k$ as the variation in the cost-to-go of the system given an observation at time $k$. To that end, we develop a theoretical framework based on which the notion of value of information becomes meaningful.

In optimal event-triggered control, one primarily deals with a distributed optimization problem with an event trigger and a controller as the decision makers. Yet, this problem for the joint design of the event trigger and controller is in general intractable (see, e.g., [8, 9]). The reasons are that the optimal estimator at the controller is nonlinear with no analytical solution, that estimation and control are coupled due to a dual effect, and that the event trigger and controller have nonclassical information patterns. Nevertheless, one can characterize the solutions of this problem under a certain assumption, and study a trade-off between the sampling rate and control performance.

Here, following a game-theoretic analysis, we shed light on the structures of the optimal policies in optimal event-triggered control by restricting the information set of the controller to one with discarded negative information (i.e., information associated with non-transmitted observations). We cover two distinct information patterns: perfect information where observations are the states of the process, and imperfect information where observations are noisy outputs of the process. In both cases, observations are available at the event trigger instantly, but are transmitted to the controller sporadically with one-step delay.

I-A Related Work

In a seminal work in 1999, Åström and Bernhardsson [3] showed, for a first-order continuous-time stochastic process under a sampling rate constraint, that event-triggered sampling outperforms periodic sampling in the sense of mean error variance. This work later fostered extensive research in event-triggered control. In general, in a networked control system, an event trigger can be employed at the sensor side to reduce the sampling rate in the observation channel, or at the controller side to reduce the sampling rate in the control channel. We should point out that we are here interested in the former, in which the event trigger and controller are distributed. In the joint design of the event trigger and controller in optimal event-triggered control, one makes a trade-off between the sampling rate and control performance. To elucidate the essence of this problem, we herein neglect network-induced effects such as quantization, packet dropouts, and time-varying delays.

Several works have addressed optimal event-triggered estimation, and characterized the optimal triggering policies [10, 11, 12, 13]. In particular, Xu and Hespanha [10] studied optimal event-triggered estimation with perfect information by discarding the negative information. They searched in the space of stochastic triggering policies, and showed that the optimal triggering policy is indeed deterministic. Rabi and Baras [11] formulated optimal event-triggered estimation with perfect information as an optimal multiple stopping time problem by discarding the negative information, and showed that the optimal triggering policy for first-order systems is symmetric. Later, Lipsa and Martins [12] used majorization theory to study optimal event-triggered estimation with perfect information without discarding the negative information, and proved for first-order systems that the optimal estimator is linear and the optimal triggering policy is symmetric. Moreover, Molin and Hirche [13] developed an iterative algorithm for obtaining the optimal estimator and optimal triggering policy in optimal event-triggered estimation with perfect information that is applicable to systems with arbitrary noise distributions. They studied the convergence properties of the algorithm for first-order systems, and obtained a result that coincides with that in [12]. As explained before, in the joint design of the event trigger and controller a separation between estimation and control is not given a priori. Therefore, the results in the aforementioned studies do not apply directly to optimal event-triggered control.

There also exist a number of studies that have addressed optimal event-triggered control and characterized the optimal control policies [14, 15, 9, 16]. In particular, Molin and Hirche [14, 15] investigated optimal event-triggered control with perfect and imperfect information, and showed that the optimal control policy is a certainty-equivalence policy under the assumption that the triggering policy is a function of primitive variables. Ramesh et al. [9] studied the dual effect in optimal event-triggered control with perfect information, and proved that the dual effect in general exists. In addition, they showed that the certainty-equivalence principle holds if and only if the triggering policy is independent of the control policy. Recently, Demirel et al. [16] addressed optimal event-triggered control with imperfect information by adopting a stochastic triggering policy that is independent of the control policy, and proved that the optimal control policy is a certainty-equivalence policy. Unlike these studies, in addition to scrutinizing the notion of value of information in optimal event-triggered control, we herein characterize both the optimal triggering policy and the optimal control policy such that the corresponding policy profile represents a Nash equilibrium. Besides, we synthesize a closed-form suboptimal triggering policy with a performance guarantee that can readily be implemented. We cover both perfect and imperfect information, and show that our analysis is tractable and extensible to high-order systems.

A special class of event-triggered estimation and event-triggered control is sensor scheduling, in which open-loop triggering policies are employed. Sensor scheduling can be traced back to the 1970s; more recently, Trimpe and D’Andrea [17] and Leong et al. [18] adopted sensor scheduling for networked control systems, and obtained open-loop triggering policies in terms of the estimation error covariances. It is also worth mentioning that, in a rather different setup from the one considered in this study, Antunes and Heemels [19] considered a networked control system in which the event trigger and controller are both collocated with the sensor, and control inputs are transmitted to the process. They proposed an approximation algorithm, and showed that a performance improvement with respect to periodic control can be guaranteed. Our approximation algorithm is inspired by this idea. Nevertheless, unlike the above work, the event trigger and controller are distributed in our setup.

As mentioned earlier, in this article, we introduce the notion of value of information. Generally speaking, value of information is defined as the value that is assigned to the reduction of uncertainty from the decision maker’s perspective given a piece of information [20]. In other words, the value of information measures information beyond its probabilistic structure by considering the economic impact of uncertainty on the decision maker. The concept of value of information has widely been used in multiple disciplines including information economics [21], risk management [22], and stochastic programming [23]. Recently, closer to the applications of our work, value of information was adopted in sensor selection [24], shortest path optimization [25], and prioritization in medium access control [26].

I-B Contributions and Outline

Our main contributions, corresponding to each information pattern, are summarized as follows:

  1. We show, under a certain assumption, that a separation in the optimal designs of the event trigger and controller is guaranteed.

  2. We characterize the optimal triggering policy and optimal control policy such that the corresponding policy profile represents a Nash equilibrium.

  3. We quantify the value of information, and demonstrate that the optimal triggering policy transmits an observation whenever the value of information is positive.

  4. We provide an algorithm for approximation of the value of information, and synthesize a closed-form suboptimal triggering policy with a performance guarantee that can readily be implemented.

The remainder of the article is organized in the following way. We formulate the problem in Section II. We provide the main results in Section III. We present numerical examples in Section IV. Finally, we make concluding remarks in Section V.

II Problem Formulation

We first introduce notation and provide some definitions from stochastic control theory and game theory. Then, we describe the system model along with two distinct information patterns, and formulate the main problem of this study.

II-A Preliminaries

In the sequel, vectors, matrices, and sets are represented by lower-case, upper-case, and calligraphic letters such as $x$, $X$, and $\mathcal{X}$, respectively. The sequence of all vectors $x_t$ over the horizon is represented compactly, and so is the sequence of all vectors $x_t$ up to a specific time $k$. The indicator function of a subset $\mathcal{A}$ of a set $\mathcal{X}$ is denoted by $\mathbb{1}_{\mathcal{A}}(x)$, where $\mathbb{1}_{\mathcal{A}}(x) = 1$ if $x \in \mathcal{A}$ and $\mathbb{1}_{\mathcal{A}}(x) = 0$ otherwise. The identity matrix is denoted by $I$. For matrices $X$ and $Y$, the relations $X \succ Y$ and $X \succeq Y$ denote that $X - Y$ is positive definite and positive semi-definite, respectively. The probability distribution of a stochastic variable $x$ is represented by $\mathbb{P}(x)$. The expected value and covariance of $x$ are represented by $\mathbb{E}[x]$ and $\mathrm{cov}[x]$, respectively.

Let $\mathcal{I}_k$ be the information set of the controller and $\mathcal{I}_k^0$ be the information set of the controller when all controls are equal to zero. The control has no dual effect [27] of order $r \geq 2$ if

$\mathbb{E}\big[ M_k^{r,i} \mid \mathcal{I}_k \big] = \mathbb{E}\big[ M_k^{r,i} \mid \mathcal{I}_k^0 \big]$

where $M_k^{r,i}$ is the $r$th central moment of the $i$th component of the state conditioned on the information set. In other words, the control has no dual effect if the expected future uncertainty is not affected by the prior controls.

Consider a team game with two decision makers. Let $\gamma$ and $\mu$ be the policies of the first and the second decision makers, respectively, and $\Phi(\gamma, \mu)$ be the cost function. A policy profile $(\gamma^\star, \mu^\star)$ represents a Nash equilibrium [28] if and only if

$\Phi(\gamma^\star, \mu^\star) \leq \Phi(\gamma, \mu^\star)$ for all $\gamma$, and $\Phi(\gamma^\star, \mu^\star) \leq \Phi(\gamma^\star, \mu)$ for all $\mu$.

The optimality considered in this study is in the above sense.

II-B Optimal Event-Triggered Control Problem

Consider a stochastic process with linear discrete-time time-varying dynamics generated by the following state equation:

$x_{k+1} = A_k x_k + B_k u_k + w_k$   (1)

for $k = 0, \dots, N$ with initial condition $x_0$, where $x_k$ is the state of the process, $A_k$ is the state matrix, $B_k$ is the input matrix, $u_k$ is the control input to be decided by a controller, and $w_k$ is an i.i.d. Gaussian white noise with zero mean and covariance $W_k \succ 0$. It is assumed that the initial state $x_0$ is a Gaussian vector with mean $m_0$ and covariance $M_0$, and that the pair $(A_k, B_k)$ is controllable. We consider two distinct information patterns: perfect information and imperfect information. Accordingly, we employ an event trigger that determines whether an observation is transmitted or not. In particular, let $\delta_k \in \{0, 1\}$ be an event. The observation at time $k$ is transmitted if $\delta_k = 1$; otherwise, it is not transmitted.
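To make the setup concrete, the following numpy sketch simulates the state equation (1). The dimensions, matrices, and placeholder control input are illustrative assumptions, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and time-invariant matrices; the article allows
# A_k, B_k, and W_k to vary with time.
n, m, N = 2, 1, 50
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # state matrix A_k
B = np.array([[0.005], [0.1]])           # input matrix B_k
W = 0.01 * np.eye(n)                     # process-noise covariance W_k

x = rng.multivariate_normal(np.zeros(n), np.eye(n))  # x_0 ~ N(m_0, M_0)
for k in range(N):
    u = np.zeros(m)                                  # placeholder control input u_k
    w = rng.multivariate_normal(np.zeros(n), W)
    x = A @ x + B @ u + w                            # state equation (1)
```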

Fig. 1: Event-triggered control with perfect information. The exact value of the state is accessible. The state is given at the event trigger instantly, and is transmitted to the controller sporadically with one-step delay.
Fig. 2: Event-triggered control with imperfect information. The exact value of the state is not accessible. Instead, the noisy output is given at the event trigger instantly, and is transmitted to the controller sporadically with one-step delay.

In event-triggered control with perfect information (see Fig. 1), the exact value of the state is accessible. In this case, the state $x_k$ is given at the event trigger instantly, and is transmitted to the controller with one-step delay when $\delta_k = 1$. Hence, we have

$z_k = \begin{cases} x_k, & \text{if } \delta_k = 1, \\ \varnothing, & \text{otherwise}, \end{cases}$   (2)

where $z_k$ is the output of the event trigger with perfect information and $\varnothing$ denotes that no message is transmitted.

However, in event-triggered control with imperfect information (see Fig. 2), the exact value of the state is not accessible. Instead, a noisy output of the process is measured by a sensor, and is given by

$y_k = C_k x_k + v_k$   (3)

for $k = 0, \dots, N$, where $y_k$ is the output of the process, $C_k$ is the output matrix, and $v_k$ is an i.i.d. Gaussian white noise with zero mean and covariance $V_k \succ 0$. It is assumed that the pair $(A_k, C_k)$ is observable. In this case, the observation $y_k$ is given at the event trigger instantly, and is transmitted to the controller with one-step delay when $\delta_k = 1$. Hence, we have

$z_k = \begin{cases} y_k, & \text{if } \delta_k = 1, \\ \varnothing, & \text{otherwise}, \end{cases}$   (4)

where $z_k$ is the output of the event trigger with imperfect information.

Consider a finite time horizon $N$, and let $\gamma = \{\gamma_k\}_{k=0}^{N}$ and $\mu = \{\mu_k\}_{k=0}^{N}$ denote a randomized triggering policy and a randomized control policy, respectively. We measure the sampling rate by

$\mathcal{R} = \mathbb{E}\left[ \sum_{k=0}^{N} \theta_k \delta_k \right]$   (5)

where $\theta_k \geq 0$ is a weighting coefficient specifying the relative communication cost at each time. Moreover, we measure the control performance by

$\mathcal{J} = \mathbb{E}\left[ \sum_{k=0}^{N} \left( x_k^\top Q_k x_k + u_k^\top R_k u_k \right) + x_{N+1}^\top Q_{N+1} x_{N+1} \right]$   (6)

where $Q_k \succeq 0$ and $R_k \succ 0$ are weighting matrices.
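For intuition, both measures are straightforward to estimate empirically from simulated trajectories. The sketch below assumes the reconstructed forms of (5) and (6) above, i.e., a weighted transmission count and a finite-horizon quadratic cost.

```python
import numpy as np

def sampling_rate(deltas, thetas):
    """Empirical counterpart of (5): weighted number of transmissions."""
    return float(np.dot(thetas, deltas))

def control_cost(xs, us, Q, R, QN):
    """Empirical counterpart of (6) for one trajectory; xs has one more entry
    than us, with the terminal state weighted by QN."""
    J = sum(x @ Q @ x + u @ R @ u for x, u in zip(xs[:-1], us))
    return float(J + xs[-1] @ QN @ xs[-1])
```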

Both the event trigger and the controller seek to maximize the control performance such that the sampling rate is less than or equal to a level $r$. We study this problem under the following assumption:

Assumption 1

The information associated with non-transmitted observations, i.e., the information carried by the times at which $\delta_k = 0$, is discarded at the controller.

For a system that satisfies the above assumption, let $\mathcal{I}_k^e$ and $\mathcal{I}_k^c$ denote the admissible information sets of the event trigger and controller at time $k$, respectively, and let $\mathcal{G}$ and $\mathcal{M}$ denote the sets of the admissible triggering policies and admissible control policies, respectively. Then, $\mathcal{I}_k^c$ satisfies Assumption 1. Moreover, we have $\delta_k = \gamma_k(\mathcal{I}_k^e)$ where $\gamma_k$ is a measurable function of $\mathcal{I}_k^e$, and $u_k = \mu_k(\mathcal{I}_k^c)$ where $\mu_k$ is a measurable function of $\mathcal{I}_k^c$. For such a system, we have the following distributed optimization problem:

$\min_{\gamma \in \mathcal{G}, \, \mu \in \mathcal{M}} \ \mathcal{J} \quad \text{subject to} \quad \mathcal{R} \leq r$   (7)

In the sequel, we shall characterize the Nash equilibria in this problem with perfect and imperfect information.

III Main Results

We here present the main results of our study. All proofs are provided in the Appendix. We first reformulate the problem of interest. Then, we characterize the Nash equilibria. The perfect and imperfect information patterns are treated separately. Finally, we provide an approximation algorithm.

III-A Lagrange Multiplier and Riccati Equation

In order to reformulate the problem, we need the following theorem, which shows the convexity of the constraint set specified by the sampling rate.

Theorem 1

The constraint set specified by $\mathcal{R} \leq r$ is convex.

Following Theorem 1 and the theory of Lagrange multipliers [29], the existence of a Lagrange multiplier $\lambda \geq 0$ is guaranteed. Hence, we can reformulate (7) as

$\min_{\gamma \in \mathcal{G}, \, \mu \in \mathcal{M}} \ \mathcal{J} + \lambda \left( \mathcal{R} - r \right)$   (8)

We are interested in a general trade-off between the sampling rate and control performance. Therefore, we shall study this problem without specifying a particular Lagrange multiplier $\lambda$ for now. Equivalently, we can study the following problem:

$\min_{\gamma \in \mathcal{G}, \, \mu \in \mathcal{M}} \ \Phi(\gamma, \mu)$   (9)

where the cost function is given by

$\Phi(\gamma, \mu) = \mathbb{E}\left[ \sum_{k=0}^{N} \left( \ell_k \delta_k + x_k^\top Q_k x_k + u_k^\top R_k u_k \right) + x_{N+1}^\top Q_{N+1} x_{N+1} \right]$   (10)

where $\ell_k = \lambda \theta_k$.

Besides, associated with (7), we define the matrix $S_k$ such that it satisfies the following Riccati equation:

$S_k = Q_k + A_k^\top S_{k+1} A_k - \Gamma_k$   (11)

$\Gamma_k = A_k^\top S_{k+1} B_k \left( B_k^\top S_{k+1} B_k + R_k \right)^{-1} B_k^\top S_{k+1} A_k$   (12)

with initial condition $S_{N+1} = Q_{N+1}$. This Riccati equation will play an essential role in the structures of the optimal policies.
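The recursion in (11)-(12) is a backward pass; a minimal sketch, assuming the standard LQR structure reconstructed above, is:

```python
import numpy as np

def riccati(A, B, Q, R, Q_terminal, N):
    """Backward Riccati recursion of (11)-(12); A, B, Q, R are length-N lists of
    per-step matrices. Returns the matrices S_k and the feedback gains K_k."""
    S = [None] * (N + 1)
    K = [None] * N
    S[N] = Q_terminal
    for k in range(N - 1, -1, -1):
        M = B[k].T @ S[k + 1] @ B[k] + R[k]
        K[k] = np.linalg.solve(M, B[k].T @ S[k + 1] @ A[k])  # (B'SB + R)^{-1} B'SA
        Gamma = A[k].T @ S[k + 1] @ B[k] @ K[k]              # Gamma_k in (12)
        S[k] = Q[k] + A[k].T @ S[k + 1] @ A[k] - Gamma
    return S, K
```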

III-B Perfect Information

We first consider the basic case, which is event-triggered control with perfect information. In this case, only the controller needs to infer the state of the process. We derive the optimal estimator at the controller based on a Bayesian analysis. Let us define the admissible information set of the event trigger at time $k$ as the set of the current and prior states, i.e.,

$\mathcal{I}_k^e = \left\{ x_t \mid 0 \leq t \leq k \right\}$   (13)

and the admissible information set of the controller at time $k$ as the set of the prior transmitted states, i.e.,

$\mathcal{I}_k^c = \left\{ x_t \mid \delta_t = 1, \ 0 \leq t \leq k-1 \right\}$   (14)

The next proposition gives the optimal estimator with respect to the information set that can be used at the controller, and shows that such an estimator is linear.

Proposition 1

The conditional expectation $\hat{x}_k = \mathbb{E}[x_k \mid \mathcal{I}_k^c]$ with the following dynamics is the state estimate that minimizes the mean-square error at the controller:

$\hat{x}_{k+1} = A_k \hat{x}_k + B_k u_k + \delta_k A_k \left( x_k - \hat{x}_k \right)$   (15)

for $k = 0, \dots, N$ with initial condition $\hat{x}_0 = m_0$. Moreover, the error covariance $P_k = \mathrm{cov}[x_k \mid \mathcal{I}_k^c]$ is given by

$P_{k+1} = \left( 1 - \delta_k \right) A_k P_k A_k^\top + W_k$   (16)

for $k = 0, \dots, N$ with initial condition $P_0 = M_0$.
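A minimal sketch of one step of Proposition 1, under the propagate-or-reset reading of (15)-(16): a transmitted state resets the estimate exactly (up to the one-step delay), and otherwise the estimate is propagated open loop.

```python
import numpy as np

def controller_estimator_step(xhat, P, u_prev, delta_prev, x_prev, A, B, W):
    """One step of (15)-(16). If x_{k-1} was transmitted (delta_prev = 1), the
    estimate restarts from the exact delayed state; otherwise it is propagated."""
    if delta_prev:
        xhat_next = A @ x_prev + B @ u_prev   # exact state received with one-step delay
        P_next = W.copy()                     # only the process noise remains
    else:
        xhat_next = A @ xhat + B @ u_prev     # open-loop propagation, cf. (15)
        P_next = A @ P @ A.T + W              # error covariance, cf. (16)
    return xhat_next, P_next
```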

Remark 1

One can show that the structure of the optimal estimator at the controller is the same as (15) even without discarding the negative information (see, e.g., [9]). Our following results will still be valid in such a case. However, discarding the negative information yields a Gaussian conditional distribution at the controller, which helps us in the extension of our framework to the imperfect information pattern.

We design the optimal policies using backward induction. Let $e_k = x_k - \hat{x}_k$ be the estimation error associated with the estimator at the controller. Note that, in addition to the controller, the event trigger can obtain $e_k$. This is possible because the event trigger, which observes the states directly, can reproduce the information available to the controller. The next theorem characterizes the structures of the optimal triggering policy and optimal control policy such that the corresponding policy profile represents a Nash equilibrium, and proves that there exists a separation in the optimal designs of the event trigger and controller.

Theorem 2

In event-triggered control with perfect information, the optimal triggering policy is a symmetric threshold policy given by

$\delta_k^\star = \mathbb{1}\{\mathrm{VoI}_k > 0\}$   (17)

where $\mathrm{VoI}_k$ is the value of information at time $k$, defined in terms of the estimation error $e_k$ as

(18)

in which $\varrho_k$ is a variable that depends on the time index, and the optimal control policy is a certainty-equivalence policy given by

$u_k^\star = -K_k \hat{x}_k$   (19)

where

$K_k = \left( B_k^\top S_{k+1} B_k + R_k \right)^{-1} B_k^\top S_{k+1} A_k$   (20)

According to Theorem 2, the optimal triggering policy depends on the estimation error $e_k$, and is independent of the control policy. Besides, the error covariance in (16) does not depend on the control inputs. Hence, the control has no dual effect.
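The following sketch assembles Theorem 2 into one decision step. The quadratic value-of-information surrogate is hypothetical: the weight Gamma and price lam are placeholders for the exact variable in (18), which Algorithm 1 later approximates.

```python
import numpy as np

def voi_surrogate(e, Gamma, lam):
    """Hypothetical stand-in for (18): a quadratic form in the estimation error
    minus a communication price; a positive value triggers a transmission, cf. (17)."""
    return float(e @ Gamma @ e - lam)

def decision_step(x, xhat, K, Gamma, lam):
    """One decision step: trigger on positive VoI, then apply the CE control (19)."""
    transmit = voi_surrogate(x - xhat, Gamma, lam) > 0.0
    u = -K @ xhat                       # certainty-equivalence control (19)-(20)
    return transmit, u
```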

Remark 2

The value of information in (18) quantifies the variation in the cost-to-go of the system with perfect information. In light of this definition, it is certified that the optimal triggering policy transmits an observation whenever the value of information is positive.

Remark 3

It should be noted that the results here are consistent with those in [14, 9]. These works mainly studied the optimal control policy under different assumptions.

III-C Imperfect Information

Now, we extend the results presented above to event-triggered control with imperfect information. In this case, both the event trigger and the controller need to infer the state of the process. We derive the optimal estimators at the event trigger and controller based on a Bayesian analysis. Let us define the admissible information set of the event trigger at time $k$ as the set of the current and prior outputs, i.e.,

$\mathcal{I}_k^e = \left\{ y_t \mid 0 \leq t \leq k \right\}$   (21)

and the admissible information set of the controller at time $k$ as the set of the prior transmitted outputs, i.e.,

$\mathcal{I}_k^c = \left\{ y_t \mid \delta_t = 1, \ 0 \leq t \leq k-1 \right\}$   (22)

The next two propositions give the optimal estimators with respect to the information sets $\mathcal{I}_k^e$ and $\mathcal{I}_k^c$ that can be used at the event trigger and controller, respectively, and show that such estimators are linear.

Proposition 2

The conditional expectation $\hat{x}_k^e = \mathbb{E}[x_k \mid \mathcal{I}_k^e]$ with the following dynamics minimizes the mean-square error at the event trigger:

$\hat{x}_k^e = \hat{x}_{k|k-1}^e + L_k \left( y_k - C_k \hat{x}_{k|k-1}^e \right)$   (23)

$P_k^e = \left( I - L_k C_k \right) P_{k|k-1}^e$   (24)

where

$L_k = P_{k|k-1}^e C_k^\top \left( C_k P_{k|k-1}^e C_k^\top + V_k \right)^{-1}$   (25)

for $k = 0, \dots, N$, in which $\hat{x}_{k|k-1}^e = A_{k-1} \hat{x}_{k-1}^e + B_{k-1} u_{k-1}$ and $P_{k|k-1}^e = A_{k-1} P_{k-1}^e A_{k-1}^\top + W_{k-1}$ are the predicted estimate and covariance, with initial conditions $\hat{x}_{0|-1}^e = m_0$ and $P_{0|-1}^e = M_0$.

Proposition 3

The conditional expectation $\hat{x}_k^c = \mathbb{E}[x_k \mid \mathcal{I}_k^c]$ with the following dynamics minimizes the mean-square error at the controller:

$\hat{x}_k^c = A_{k-1} \left( \hat{x}_{k-1}^c + \delta_{k-1} L_{k-1}^c \left( y_{k-1} - C_{k-1} \hat{x}_{k-1}^c \right) \right) + B_{k-1} u_{k-1}$   (26)

$P_k^c = A_{k-1} \left( I - \delta_{k-1} L_{k-1}^c C_{k-1} \right) P_{k-1}^c A_{k-1}^\top + W_{k-1}$   (27)

where

$L_{k-1}^c = P_{k-1}^c C_{k-1}^\top \left( C_{k-1} P_{k-1}^c C_{k-1}^\top + V_{k-1} \right)^{-1}$   (28)

for $k = 1, \dots, N$ with initial conditions $\hat{x}_0^c = m_0$ and $P_0^c = M_0$.
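A sketch of the controller-side update under imperfect information, assuming a Kalman filter whose measurement update runs only when an output was transmitted, with the one-step delay folded into the prediction as in the reconstruction of (26)-(28):

```python
import numpy as np

def controller_kf_step(xhat, P, u_prev, delta_prev, y_prev, A, B, C, W, V):
    """One step of a delayed intermittent Kalman filter, cf. (26)-(28): if y_{k-1}
    was transmitted (delta_prev = 1), correct with it before propagating."""
    if delta_prev:
        S = C @ P @ C.T + V
        L = P @ C.T @ np.linalg.inv(S)        # Kalman gain, cf. (28)
        xhat = xhat + L @ (y_prev - C @ xhat)
        P = (np.eye(P.shape[0]) - L @ C) @ P
    xhat_next = A @ xhat + B @ u_prev         # time update through the dynamics
    P_next = A @ P @ A.T + W
    return xhat_next, P_next
```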

Remark 4

We recall that, given imperfect information, we need to employ two distinct estimators. Employing two distinct estimators for the event trigger and controller under imperfect information was also noted in [30, 15, 16]. Discarding the negative information here yields a Gaussian conditional distribution at the controller. Approximation of the optimal estimator at the controller when the negative information is not discarded was studied in [8].

We design the optimal policies using backward induction. Let $e_k = x_k - \hat{x}_k^c$ be the estimation error and $\nu_k = y_k - C_k \hat{x}_k^c$ be the innovation, both associated with the estimator at the controller. Note that, in addition to the controller, the event trigger can obtain $\hat{x}_k^c$. This is possible because the event trigger can reproduce the information available to the controller. Moreover, let $\tilde{e}_k = \hat{x}_k^e - \hat{x}_k^c$ be the mismatch estimation error associated with the estimators at the event trigger and controller. Consequently, we can obtain

$e_k = \left( x_k - \hat{x}_k^e \right) + \tilde{e}_k$   (29)

$\mathbb{E}\left[ e_k e_k^\top \mid \mathcal{I}_k^e \right] = P_k^e + \tilde{e}_k \tilde{e}_k^\top$   (30)
The next theorem characterizes the structures of the optimal triggering policy and optimal control policy such that the corresponding policy profile represents a Nash equilibrium, and proves that there exists a separation in the optimal designs of the event trigger and controller.

Theorem 3

In event-triggered control with imperfect information, the optimal triggering policy is a symmetric threshold policy given by

$\delta_k^\star = \mathbb{1}\{\mathrm{VoI}_k > 0\}$   (31)

where $\mathrm{VoI}_k$ is the value of information at time $k$, defined in terms of the mismatch estimation error $\tilde{e}_k$ as

(32)

in which $\varrho_k$ is a variable that depends on the time index and the error covariances, and the optimal control policy is a certainty-equivalence policy given by

$u_k^\star = -K_k \hat{x}_k^c$   (33)

where

$K_k = \left( B_k^\top S_{k+1} B_k + R_k \right)^{-1} B_k^\top S_{k+1} A_k$   (34)

According to Theorem 3, the optimal triggering policy depends on the mismatch estimation error $\tilde{e}_k$ and the error covariances, and is independent of the control policy. Besides, the error covariance in (27) does not depend on the control inputs. Hence, the control has no dual effect.

Remark 5

The value of information in (32) quantifies the variation in the cost-to-go of the system with imperfect information. In light of this definition, it is certified that the optimal triggering policy transmits an observation whenever the value of information is positive.

Remark 6

Related to our study here are the works in [15, 16]. In these works, it is assumed that, instead of an observation, the state estimate at the event trigger is transmitted to the controller whenever an event occurs.

III-D Approximation Algorithm with Guaranteed Performance

The optimal triggering policy provided above in each case depends on the variable $\varrho_k$. Although $\varrho_k$ can be computed recursively according to the procedure given in the proof of Theorem 2 or Theorem 3, its computation is in general expensive. We here provide a rollout algorithm [31] for approximating the variable $\varrho_k$ and the value of information $\mathrm{VoI}_k$, and accordingly synthesize a closed-form suboptimal triggering policy with a performance guarantee that can readily be implemented. The following algorithm gives an approximation of the variable $\varrho_k$.

Algorithm 1

Let $\gamma^p$ be a periodic policy with period $T$, and $\delta_k^p$ be the event at time $k$ under this policy. An approximation of the variable $\varrho_k$ associated with the periodic policy is given by

(35)

where the required terms are computed by propagating the error covariance forward under $\gamma^p$, with the convention that a transmission resets the covariance, for both perfect and imperfect information.
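One ingredient such a rollout needs is the cost-to-go of the base periodic policy. The sketch below assumes a per-step penalty of trace(Gamma_t P_t) and a covariance reset at each transmission, which is consistent with the perfect-information recursions (12) and (16) but is not the article's exact expression (35).

```python
import numpy as np

def periodic_cost_to_go(P, k, N, T, A, W, Gammas):
    """Approximate estimation-induced cost-to-go from step k under the base
    periodic policy (transmit every T steps); Gammas are the weights from (12),
    and A, W are taken time-invariant for brevity."""
    cost = 0.0
    for t in range(k, N):
        cost += float(np.trace(Gammas[t] @ P))
        if (t - k + 1) % T == 0:
            P = W.copy()             # a transmission resets the error covariance
        else:
            P = A @ P @ A.T + W      # open-loop covariance propagation, cf. (16)
    return cost
```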

The next theorem guarantees that given any periodic policy it is possible to synthesize a suboptimal triggering policy that outperforms it.

Theorem 4

Let $\gamma^p$ be a periodic policy with period $T$, and $\gamma^s$ be the suboptimal triggering policy obtained based on Algorithm 1 with the periodic policy $\gamma^p$. Then,

$\Phi(\gamma^s, \mu^\star) \leq \Phi(\gamma^p, \mu^\star)$   (36)

for both perfect and imperfect information.

In the next two propositions, we synthesize a closed-form suboptimal triggering policy with a performance guarantee for perfect and imperfect information.

Proposition 4

Let $\gamma^p$ be the periodic policy with period $T$. A suboptimal triggering policy that outperforms the periodic policy in event-triggered control with perfect information is given by

(37)

where

(38)
Proposition 5

Let $\gamma^p$ be the periodic policy with period $T$. A suboptimal triggering policy that outperforms the periodic policy in event-triggered control with imperfect information is given by

(39)

where

(40)

and an auxiliary recursion, computed for $k = 1, \dots, N$ from the initial conditions of the estimators in Propositions 2 and 3.

IV Numerical Examples

We provide two examples for the theoretical framework that we developed. In the first example, we consider a scalar process with dynamics of form (1) in which the state is accessible, with fixed initial conditions, noise variance, time horizon, and weighting coefficients. For this system, we obtained the suboptimal triggering policy and optimal control policy provided by Proposition 4 and Theorem 2, respectively. The trade-off curve between the sampling rate and control performance was numerically computed using different values of the Lagrange multiplier $\lambda$, and is depicted in Fig. 3. The achievable region is specified by the area above the trade-off curve. Note that this trade-off curve should be regarded as an upper bound.
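Since the example's exact parameter values are not reproduced above, the following Monte Carlo sketch uses assumed stand-ins (a, W, N, Q, R, and the lam grid) and a simplified threshold trigger in place of Proposition 4; it traces out a trade-off curve of the same kind as Fig. 3.

```python
import numpy as np

# Hedged reconstruction of the first example's setup: all values below are
# illustrative stand-ins, not the article's parameters.
a, W, N, Q, R = 1.2, 1.0, 100, 1.0, 0.25
rng = np.random.default_rng(1)

def run(lam, runs=200):
    """Monte Carlo estimate of (sampling rate, control cost) under a quadratic
    threshold trigger with price lam; a simplified stand-in for Proposition 4."""
    rates, costs = [], []
    for _ in range(runs):
        x, xhat, tx, cost = 0.0, 0.0, 0, 0.0
        for k in range(N):
            if (x - xhat) ** 2 > lam:     # hypothetical threshold test
                xhat, tx = x, tx + 1      # transmission (delay neglected here)
            u = -a * xhat                 # certainty-equivalence control
            cost += Q * x * x + R * u * u
            x = a * x + u + rng.normal(0.0, np.sqrt(W))
            xhat = a * xhat + u
        rates.append(tx / N)
        costs.append(cost / N)
    return float(np.mean(rates)), float(np.mean(costs))

# Sweeping lam traces out a trade-off curve of the kind shown in Fig. 3.
curve = [run(lam) for lam in (0.1, 0.5, 1.0, 2.0, 5.0)]
```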

In the second example, we consider an inverted pendulum on a cart observed by a sensor, which communicates with a controller through a controller area network. The continuous-time equations of motion, linearized around the unstable equilibrium, are expressed in terms of the pitch angle of the pendulum, the position of the cart, the force applied to the cart, the moment of inertia of the pendulum, the mass of the pendulum, the length to the pendulum's center of mass, the gravitational acceleration, the mass of the cart, and the coefficient of friction for the cart. The sensor can only measure the position and the pitch angle. The discrete-time dynamics of form (1), obtained by a zero-order-hold transformation at a fixed discretization sampling frequency, together with the sensor model of form (3) and the covariance matrices, specify the system. We fixed the time horizon, the Lagrange multiplier $\lambda$, and the weighting coefficients and matrices. For this system, we obtained the suboptimal triggering policy and optimal control policy provided by Proposition 5 and Theorem 3, respectively. For a realization of the system, we carried out a simulation experiment. The trajectories of the value of information, event, and control are shown in Fig. 4. Moreover, the trajectories of the position, velocity, pitch angle, and pitch rate are shown in Fig. 5. In this experiment, the value of information became positive only a few times, which led to the transmission of the observation at each of those times. Besides, we observe that the system could still achieve a good control performance while the sampling rate was substantially reduced with respect to the periodic policy.
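The zero-order-hold step can be reproduced with the standard augmented matrix exponential; a minimal sketch (the pendulum's continuous-time matrices and the sampling step are those chosen in the example):

```python
import numpy as np
from scipy.linalg import expm

def c2d_zoh(Ac, Bc, dt):
    """Zero-order-hold discretization of xdot = Ac x + Bc u into the form (1),
    via the augmented matrix exponential; dt is the discretization step."""
    n, m = Ac.shape[0], Bc.shape[1]
    M = np.zeros((n + m, n + m))
    M[:n, :n] = Ac
    M[:n, n:] = Bc
    E = expm(M * dt)
    return E[:n, :n], E[:n, n:]   # discrete-time A and B
```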

Fig. 3: Trade-off curve between the sampling rate and control performance in event-triggered control. The control performance is scaled by one tenth. High values in the horizontal axis represent low control performance, and vice versa.
Fig. 4: Trajectories of the value of information, event, and control. The value of information is scaled by one tenth. The dotted line in the diagram of the value of information represents the zero values.
Fig. 5: Trajectories of the position, velocity, pitch angle, and pitch rate. The solid lines represent the state components and the dotted lines represent the state estimate components at the controller.

V Conclusion

In this article, we quantified the value of information, which systematically gauges the instantaneous impact of information on a networked control system. In the course of our study, we developed a theoretical framework for the joint design of an event trigger and a controller in optimal event-triggered control with perfect and imperfect information. In each case, we characterized the optimal triggering policy and optimal control policy such that the corresponding policy profile represents a Nash equilibrium. In particular, we proved that the optimal triggering policy is a symmetric threshold policy and the optimal control policy is a certainty-equivalence policy. We demonstrated that the optimal triggering policy transmits an observation whenever the value of information is positive. Finally, we provided an algorithm for approximation of the value of information.

In general, our results may improve knowledge about decision making based on the value of information. Our ongoing research shows that we can exploit the framework developed here for studying the impact of reliability, resolution, and timeliness of information on a networked control system. In addition, the tractable framework developed here can be used as a foundation for future research in event-triggered control of complex systems. We propose that further research should be undertaken in the following directions. First, the framework should be extended to networks of interacting systems. Second, the team optimality gap of the Nash equilibria derived here should be investigated. An attempt in this direction was made in [14] based on a transformation to an equivalence class. However, as pointed out in [9], there are subtleties in defining an equivalence class for a state-dependent triggering policy due to the existence of a dual effect, which must be taken into account.

VI Appendix

We here present a few lemmas and then the proofs of the main results of the article.

Lemma 1

Let $S_k$ be a matrix that satisfies the following Riccati equation:

$S_k = Q_k + A_k^\top S_{k+1} A_k - \Gamma_k$   (41)

$\Gamma_k = A_k^\top S_{k+1} B_k \left( B_k^\top S_{k+1} B_k + R_k \right)^{-1} B_k^\top S_{k+1} A_k$   (42)

for all $k$ with initial condition $S_{N+1} = Q_{N+1}$. Then, the cost function is equal to

$\Phi = \mathbb{E}\Big[ x_0^\top S_0 x_0 + \sum_{k=0}^{N} \ell_k \delta_k + \sum_{k=0}^{N} w_k^\top S_{k+1} w_k + \sum_{k=0}^{N} \left( u_k + K_k x_k \right)^\top \left( B_k^\top S_{k+1} B_k + R_k \right) \left( u_k + K_k x_k \right) \Big]$   (43)

where $K_k$ is as in (20).
Proof:

Using the process dynamics (1) and the Riccati equation (41), we can write

(44)
(45)

Consequently, we find

where the first equality is an identity, and in the second equality we used (44) and (45) and also added and subtracted the term to and from the right-hand side. Rearranging the terms in the above relation, we find

Adding the term to both sides of the above relation and taking expectation, we obtain the result:

where in the second equality we used the fact that is independent of and