Linear Temporal Logic Satisfaction in Adversarial Environments using Secure Control Barrier Certificates

10/27/2019
by   Bhaskar Ramasubramanian, et al.

This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discrete-time dynamics. The temporal logic specification is given in safe-LTL_F, a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled as a two-player zero-sum discrete-time dynamic stochastic game with the CPS as defender. We formulate a dynamic programming based approach to determine a stationary defender policy that maximizes the probability of satisfaction of a safe-LTL_F formula over a finite time-horizon under any stationary adversary policy. We introduce secure control barrier certificates (S-CBCs), a generalization of barrier certificates and control barrier certificates that accounts for the presence of an adversary, and use S-CBCs to provide a lower bound on the above satisfaction probability. When the dynamics of the evolution of the system state has a specific underlying structure, we present a way to determine an S-CBC as a polynomial in the state variables using sum-of-squares optimization. An illustrative example demonstrates our approach.

1 Introduction

Cyber-physical systems (CPSs) use computing devices and algorithms to inform the working of a physical system [8]. These systems are ubiquitous, and vary in size and scale from energy systems to medical devices. The wide-spread influence of CPSs such as power systems and automobiles makes their safe operation critical. Although distributed algorithms and systems allow for more efficient sharing of information among parts of the system and across geographies, they also make the CPS vulnerable to attacks by an adversary who might gain access to the distributed system via multiple entry points. Attacks on distributed CPSs have been reported across multiple application domains [20], [43], [44], [46]. In these cases, the damage to the CPS was caused by the actions of a stealthy, intelligent adversary. Thus, methods designed to only account for modeling and sensing errors may not meet performance requirements in adversarial scenarios. Therefore, it is important to develop ways to specify and verify properties that a CPS must satisfy that will allow us to provide guarantees on the operation of the system while accounting for the presence of an adversary.

In order to verify the behavior of a CPS against a rich set of temporal specifications, techniques from formal methods can be used [9]. Properties like safety, stability, and priority can be expressed as formulas in linear temporal logic (LTL) [19]. These properties can then be verified using off-the-shelf model solvers [15], [28] that take these formulas as inputs. If the state space and the actions available to the agents are both finite and discrete, then the environment can be represented as a Markov decision process (MDP) [38] or a stochastic game [11]. These representations have also been used as abstractions of continuous-state, continuous-action dynamical system models [10], [32]. However, a significant shortcoming is that the computational complexity of abstracting the underlying system grows exponentially with the resolution of discretization desired [14], [21].

The method of barrier certificates (or barrier functions), which are functions of the states of the system, was introduced in [36]. Barrier functions provide a certificate that all trajectories of a system starting from a given initial set will not enter an unsafe region. The use of barrier functions does not require explicit computation of sets of reachable states, which is known to be undecidable for general dynamical systems [29], and moreover, it allows for the analysis of general nonlinear and stochastic dynamical systems. The authors of [36] further showed that if the states and inputs to the system have a particular structure, computationally efficient methods can be used to construct a barrier certificate.

Barrier certificates were used to determine probabilistic bounds on the satisfaction of an LTL formula by a discrete-time stochastic system in [22]. A more recent work by the same authors [23] used control barrier certificates to synthesize a policy in order to maximize the probability of satisfaction of an LTL formula.

Prior work that uses barrier certificates to study temporal logic satisfaction assumes a single agent, and does not study the case when the CPS is operating in an adversarial environment. To the best of our knowledge, this paper is the first to use barrier certificates to study temporal logic satisfaction for CPSs in adversarial environments. We introduce secure control barrier certificates (S-CBCs), and use them to determine probabilistic bounds on the satisfaction of an LTL formula under any adversary policy. Further, definitions of barrier certificates and control barrier certificates in prior work can be recovered as special cases of S-CBCs.

1.1 Contributions

In this paper, we consider the setting when there is an adversary whose aim is to ensure that the LTL formula is not satisfied by the CPS (defender). The temporal logic specification is given in safe-LTL_F, a fragment of LTL over traces of finite length. We make the following contributions:

  • We model the interaction between the CPS and adversary as a two-player dynamic stochastic game with the CPS as defender. The two players take their actions simultaneously, and these jointly influence the system dynamics.

  • We present a dynamic programming based approach to determine a stationary defender policy to maximize the probability of satisfaction of an LTL formula over a finite time-horizon under any stationary adversary policy.

  • In order to determine a lower bound on the above satisfaction probability, we define a new entity called secure control barrier certificates (S-CBCs). S-CBCs generalize barrier certificates and control barrier certificates to account for the presence of an adversary.

  • When the evolution of the state of the dynamic game can be expressed as polynomial functions of the states and inputs, we use sum-of-squares optimization to compute an S-CBC as a polynomial function of the states.

  • We present an illustrative example demonstrating our approach.

1.2 Outline of Paper

We summarize related work on control barrier certificates and temporal logic satisfaction in Section 2. Section 3 gives an overview of temporal logic and game-theoretic concepts that will be used to derive our results. The problem that is the focus of this paper is formulated in Section 4. Our solution approach is presented in Section 5, where we define a dynamic programming operator to synthesize a policy for the defender in order to maximize the probability of satisfaction of the LTL formula under any adversary policy. We define a notion of secure control barrier certificates to derive a lower bound on the satisfaction probability, and are able to explicitly compute an S-CBC under certain assumptions. Section 6 presents an illustrative example, and we conclude the paper in Section 7.

2 Related Work

The method of barrier functions was introduced in [36] to certify that all trajectories of a continuous-time system starting from a given initial set do not enter an unsafe region. Control barrier functions (CBFs) were used to provide guarantees on the safety of continuous-time nonlinear systems with affine inputs for an adaptive cruise control application in [6]. The notion of input-to-state CBFs that ensured the safety of nonlinear systems under arbitrary input disturbances was introduced in [24], and safety was characterized in terms of the invariance of a set whose computation depended on the magnitude of the disturbance. The authors of [45] relaxed the supermartingale condition that a barrier certificate had to satisfy in [36] in order to provide finite-time guarantees on the safety of a system. The verification and control of a finite-time safety property for continuous-time stochastic systems using barrier functions was recently presented in [41]. Barrier certificates were used to verify LTL formulas for a deterministic, continuous-time nonlinear dynamical system in [49]. Time-varying CBFs were used to accomplish tasks specified in signal temporal logic in [30]. A survey of the use of CBFs to design safety-critical controllers is presented in [5]. All of these works applied barrier certificates or CBFs to continuous-time dynamical systems, and none considered the effect of the actions of an adversarial player.

Barrier certificates in the discrete-time setting were used to analyze the reachable belief space of a partially observable Markov decision process (POMDP) with applications to verifying the safety of POMDPs in [2], and for privacy verification in POMDPs in [3]. The use of barrier certificates for the verification and synthesis of control policies for discrete-time stochastic systems to satisfy an LTL formula over a finite time horizon was presented in [22] and [23]. These papers also assumed a single agent, and did not account for the presence of an adversary.

The authors of [33] used barrier functions to solve a reference tracking problem for a continuous-time linear system subject to possible false data injection attacks by an adversary, with additional constraints on the safety and reachability of the system. Probabilistic reachability over a finite time horizon for discrete-time stochastic hybrid systems was presented in [1]. This was extended to a dynamic stochastic game setting when there were two competing agents in [18], and to the problem of ensuring the safety of a system that was robust to errors in the probability distribution of a disturbance input in [50]. These papers did not assume that a temporal specification had to be additionally satisfied.

Determining a policy for an agent in order to maximize the probability of satisfying an LTL formula in an environment specified by an MDP was presented in [19]. This setup was extended in [32] to the case of two agents, a defender and an adversary, who had competing objectives regarding the satisfaction of the LTL formula in an environment specified as a stochastic game. These papers assume that the states of the system are completely observable, which might not be true in every situation. The satisfaction of an LTL formula in partially observable environments represented as POMDPs was studied in [42], and the extension to partially observable stochastic games with two competing agents, each with its own observation of the state of the system, was formulated in [39].

3 Preliminaries

In this section, we give a brief introduction to linear temporal logic and discrete-time dynamic stochastic games. Wherever appropriate, we consider a probability space . We write to denote the measurable space equipped with the Borel algebra, and to denote the set of non-negative real numbers.

3.1 Linear Temporal Logic

Temporal logic frameworks enable the representation of, and reasoning about, temporal information on propositional statements. Linear temporal logic (LTL) is one such framework, where the progress of time is ‘linear’. An LTL formula [9] is defined over a set of atomic propositions $\Pi$, and can be written as:

$\varphi := \top \mid \sigma \mid \neg\varphi \mid \varphi \wedge \varphi \mid \mathbf{X}\varphi \mid \varphi \mathbf{U} \varphi$,

where $\sigma \in \Pi$, and $\mathbf{X}$ and $\mathbf{U}$ are temporal operators denoting the next and until operations. The semantics of LTL are defined over (infinite) words in $(2^{\Pi})^{\omega}$.

The syntax of linear temporal logic over finite traces, denoted LTL_F [17], is the same as that of LTL. The semantics of LTL_F is expressed in terms of finite-length words in $(2^{\Pi})^{*}$. We denote a word in $(2^{\Pi})^{*}$ by $w$, write $|w|$ to denote the length of $w$, and $w_i$, $0 \le i < |w|$, to denote the proposition at the $i$-th position of $w$. We write $(w, i) \models \varphi$ when the LTL_F formula $\varphi$ is true at the $i$-th position of $w$.

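Definition 1 (LTL_F Semantics)

The semantics of LTL_F can be recursively defined in the following way:

  1. $(w, i) \models \top$;

  2. $(w, i) \models \sigma$ iff $\sigma \in w_i$;

  3. $(w, i) \models \neg\varphi$ iff $(w, i) \not\models \varphi$;

  4. $(w, i) \models \varphi_1 \wedge \varphi_2$ iff $(w, i) \models \varphi_1$ and $(w, i) \models \varphi_2$;

  5. $(w, i) \models \mathbf{X}\varphi$ iff $i < |w| - 1$ and $(w, i+1) \models \varphi$;

  6. $(w, i) \models \varphi_1 \mathbf{U} \varphi_2$ iff $\exists j \in [i, |w|)$ such that $(w, j) \models \varphi_2$ and $(w, k) \models \varphi_1$ for all $k \in [i, j)$.

Finally, we write $w \models \varphi$ if and only if $(w, 0) \models \varphi$.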

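Moreover, the logic admits derived formulas of the form: i) $\varphi_1 \vee \varphi_2 := \neg(\neg\varphi_1 \wedge \neg\varphi_2)$; ii) $\varphi_1 \rightarrow \varphi_2 := \neg\varphi_1 \vee \varphi_2$; iii) $\mathbf{F}\varphi := \top \mathbf{U} \varphi$; iv) $\mathbf{G}\varphi := \neg\mathbf{F}\neg\varphi$. The set $\mathcal{L}(\varphi)$ comprises the language of finite-length words associated with the LTL_F formula $\varphi$. In this paper, we focus on a subset of LTL_F called safe-LTL_F [40], that explicitly considers only safety properties [26].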

Definition 2 (safe-LTL_F Formula)

An LTL_F formula is a safe-LTL_F formula if it can be written in positive normal form (PNF), using the temporal operators $\mathbf{X}$ (next) and $\mathbf{G}$ (always). In PNF, negations occur only adjacent to atomic propositions.

Next, we define an entity that will serve as an equivalent representation of an LTL_F formula, and will allow us to check if the formula is satisfied or not.

Definition 3 (Deterministic Finite Automaton)

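A deterministic finite automaton (DFA) is a quintuple $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ where $Q$ is a nonempty finite set of states, $\Sigma$ is a finite alphabet, $\delta: Q \times \Sigma \to Q$ is a transition function, $q_0 \in Q$ is the initial state, and $F \subseteq Q$ is a set of accepting states.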

Definition 4 (Accepting Runs)

A run of of length is a finite sequence of states such that for all and for some . The run is accepting if . We write to denote the set of all words accepted by .

Every formula over can be represented by a DFA with that accepts all and only those runs that satisfy , that is, [16]. The DFA can be constructed by using a tool like Rabinizer4 [25].
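As a minimal sketch of Definitions 3 and 4, the following Python snippet encodes a small DFA and checks whether the run induced by a finite word is accepting. The two-state automaton, the letters `safe` and `unsafe`, and the helper `accepts` are illustrative assumptions; they are not taken from the paper and do not reflect the output format of Rabinizer4.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

State = str
Symbol = Hashable


def accepts(
    delta: Dict[Tuple[State, Symbol], State],
    q0: State,
    accepting: Set[State],
    word: Iterable[Symbol],
) -> bool:
    """Run the DFA on a finite word and report whether the run is accepting."""
    q = q0
    for symbol in word:
        # A missing transition means the word is rejected.
        if (q, symbol) not in delta:
            return False
        q = delta[(q, symbol)]
    return q in accepting


# Hypothetical DFA for the safety property "always safe": the run stays in
# the accepting state q0 until the letter "unsafe" is read, after which it
# is trapped in the non-accepting state q1.
DELTA = {
    ("q0", "safe"): "q0",
    ("q0", "unsafe"): "q1",
    ("q1", "safe"): "q1",
    ("q1", "unsafe"): "q1",
}

word_ok = accepts(DELTA, "q0", {"q0"}, ["safe", "safe", "safe"])
word_bad = accepts(DELTA, "q0", {"q0"}, ["safe", "unsafe", "safe"])
```

Here `word_ok` corresponds to a word that satisfies the safety property and `word_bad` to one that violates it at the second position.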

3.2 Discrete-time Dynamic Stochastic Games

We model the interaction between the CPS (defender) and adversary as a two-player dynamic stochastic game that evolves according to some known (discrete-time) dynamics [7]. The evolution of the state of the game at each time step is affected by the actions of both players.

Definition 5 (Discrete-time Dynamic Stochastic Game)

A discrete-time dynamic stochastic game (DDSG) is a tuple , where and are Borel-measurable spaces representing the state-space and uncertainty space of the system, and are compact Borel spaces that denote the action sets of the defender and adversary, is a Borel-measurable transition function characterizing the evolution of the system, is an index-set denoting the stage of the game, is a set of atomic propositions, and is a labeling function that maps states to a subset of atomic propositions that are satisfied in that state.

The evolution of the state of the system is given by:

(1)

where

is a sequence of independent and identically distributed (i.i.d.) random variables with zero mean and bounded covariance.

In this paper, we focus on the Stackelberg setting with the defender as leader and adversary as follower. The leader selects its inputs anticipating the worst-case response by the adversary. We assume that the adversary can choose its action based on the action of the defender [18], and further, restrict our focus to stationary strategies for the two players. Due to the asymmetry in information available to the players, equilibrium strategies for the case when the game is zero-sum can be chosen to be deterministic strategies [13].

Definition 6 (Defender Strategy)

A stationary strategy for the defender is a sequence of Borel-measurable maps .

Definition 7 (Adversary Strategy)

A stationary strategy for the adversary is a sequence of Borel-measurable maps .
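To make the information pattern concrete, the sketch below simulates one trajectory of a scalar game in which the defender (leader) picks its action from the current state and the adversary (follower) picks its action after observing the defender's action. The function `simulate`, the linear dynamics `f_example`, and both policies are hypothetical choices made only for illustration; the Gaussian disturbance is likewise an assumption that is merely compatible with the zero-mean i.i.d. requirement.

```python
import random
from typing import Callable, List


def simulate(
    f: Callable[[float, float, float, float], float],
    defender: Callable[[float], float],
    adversary: Callable[[float, float], float],
    x0: float,
    horizon: int,
    noise_std: float,
    seed: int = 0,
) -> List[float]:
    """Return the states x_0, ..., x_horizon of one sampled trajectory."""
    rng = random.Random(seed)
    states = [x0]
    for _ in range(horizon):
        x = states[-1]
        u_d = defender(x)              # stationary defender policy
        u_a = adversary(x, u_d)        # follower observes the leader's action
        w = rng.gauss(0.0, noise_std)  # zero-mean i.i.d. disturbance
        states.append(f(x, u_d, u_a, w))
    return states


def f_example(x: float, u_d: float, u_a: float, w: float) -> float:
    # Hypothetical dynamics: both inputs and the noise enter additively.
    return 0.9 * x + u_d + 0.5 * u_a + w


def defender_example(x: float) -> float:
    return -0.5 * x


def adversary_example(x: float, u_d: float) -> float:
    # Pushes the state away from the origin.
    return 0.1 if x >= 0 else -0.1


trajectory = simulate(
    f_example, defender_example, adversary_example,
    x0=1.0, horizon=5, noise_std=0.05,
)
```

Both policies are stationary in the sense that the same map is applied at every stage of the game.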

4 Problem Formulation

For a DDSG , recall that the labeling function indicates which atomic propositions are true in each state.

Assumption 1

We restrict our attention to labeling functions of the form . Then, if , and will partition the state space as , where . We further assume that for all .

Remark 1

Through the remainder of the paper, we interchangeably use or to denote the state at time .

Given a sequence of states , using Assumption 1, if for all , then we can write .

Definition 8 (LTL Satisfaction by DDSG)

For a DDSG and a formula , we write to denote the probability that the evolution of the DDSG starting from under player policies and satisfies over the time horizon .

We are now ready to formally state the problem that this paper seeks to solve.

Problem 1

Given a discrete-time dynamic game that evolves according to the dynamics in Equation (1) and a formula , determine a policy for the defender, , that maximizes the probability of satisfying over the time horizon under any adversary policy for all for some . That is, compute:

(2)

5 Solution Approach

In this section, we present a dynamic programming approach to determine a solution to Problem 1. Our analysis is motivated by the treatment in [18] and [50].

We then introduce the notion of secure control barrier certificates (S-CBCs), and use these to provide a lower bound on the probability of satisfaction of the formula for a defender policy under any adversary policy in terms of the accepting runs of length less than or equal to the length of the time-horizon of interest of a DFA associated with . For systems whose evolution of states can be written as a polynomial function of states and inputs, we present a sum-of-squares optimization approach in order to compute an S-CBC.

S-CBCs generalize barrier certificates [22] and control barrier certificates [23] to account for the presence of an adversary. A difference between the treatment in this paper and that of [22], [23] is that we define S-CBCs for stochastic dynamic games, while the latter papers focus on stochastic systems with a single agent.

5.1 Dynamic Programming for Satisfaction

We introduce a dynamic programming (DP) operator that will allow us to recursively solve a Bellman equation related to Equation (2) backward in time. First, observe that we can write the satisfaction probability in Definition 8 as:

(3)

where is the expectation operator under the probability measure induced by agent policies and . is the indicator function, which takes value if its argument is true, and otherwise.

Assume that is a Borel-measurable function. A DP operator can then be characterized in the following way:

(4)
(5)

where is a probability measure on the Borel space .

The following result adapts Theorem 1 of [18] to the case of temporal logic formula satisfaction over a finite time-horizon.

Theorem 5.1

Assume that the DDSG has to satisfy a formula over horizon . Let the DP operator be defined as in Equation (5). Additionally, if is continuous, then,

(6)

where ( times) is the repeated composition of the operator .

Proof

Consider a particular pair of stationary agent policies and . For these policies, define measurable functions , :

(7)
(8)

Therefore, we have .

Now, consider strategies of the agents at a stage . Define the operator :

(9)

Expanding Equation (8) using the definition of the expectation operator will allow us to write .

The result follows by an induction argument which uses the fact that is a monotonic operator. We refer to [18] for details. Further, this procedure also guarantees the existence of a defender policy that will maximize the probability of satisfaction of under any adversary policy. ∎
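The backward recursion can be illustrated on a finite abstraction. The sketch below is not the operator of Equation (5), which acts on Borel spaces; it is a discretized analogue under the assumptions that states and actions are finite, that the specification is the invariance property "remain in the safe set at every step", and that transition probabilities are given as a table. All names (`max_min_safety_values`, `TRANS`, `SAFE`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Transition = Dict[Tuple[int, int, int], List[float]]


def max_min_safety_values(
    n_states: int,
    defender_actions: Sequence[int],
    adversary_actions: Sequence[int],
    trans: Transition,
    safe: Sequence[bool],
    horizon: int,
) -> List[float]:
    """Max-min probability of staying in the safe set for `horizon` steps."""
    # Terminal condition: indicator of the safe set.
    value = [1.0 if safe[x] else 0.0 for x in range(n_states)]
    for _ in range(horizon):
        new_value = []
        for x in range(n_states):
            if not safe[x]:
                new_value.append(0.0)
                continue
            # Defender maximizes against the worst-case adversary response.
            best = max(
                min(
                    sum(p * v for p, v in zip(trans[(x, u_d, u_a)], value))
                    for u_a in adversary_actions
                )
                for u_d in defender_actions
            )
            new_value.append(best)
        value = new_value
    return value


# Two states: 0 is safe, 1 is unsafe and absorbing.
SAFE = [True, False]
TRANS: Transition = {
    (0, 0, 0): [0.9, 0.1], (0, 0, 1): [0.8, 0.2],
    (0, 1, 0): [0.95, 0.05], (0, 1, 1): [0.7, 0.3],
    (1, 0, 0): [0.0, 1.0], (1, 0, 1): [0.0, 1.0],
    (1, 1, 0): [0.0, 1.0], (1, 1, 1): [0.0, 1.0],
}

values = max_min_safety_values(2, [0, 1], [0, 1], TRANS, SAFE, horizon=2)
```

In this toy instance the defender prefers action 0, since its worst case (0.8) is better than the worst case of action 1 (0.7), and the two-step value from state 0 is 0.8 squared.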

5.2 Secure Control Barrier Certificates

Definition 9

A continuous function is a secure control barrier certificate (S-CBC) for the DDSG if for any state and some constant ,

(10)

Intuitively, for some defender action , the increase in the value of an S-CBC is bounded from above along trajectories of under any adversary action .
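One plausible way to write the condition described above, consistent with the intuition that some defender action must work against every adversary action, is sketched below. The symbols $B$, $f$, $u_d$, $u_a$, $w$, $U_d$, $U_a$, $X$, and $c$ are assumed notation chosen for this sketch and may differ from the notation of Equation (10).

```latex
% Assumed notation (not taken from the text): B is the candidate S-CBC,
% f is the transition function of Equation (1), u_d and u_a are the
% defender and adversary actions, w is the disturbance, c >= 0 is a constant.
\inf_{u_d \in U_d} \; \sup_{u_a \in U_a} \;
  \mathbb{E}_{w}\!\left[ B\big(f(x, u_d, u_a, w)\big) \,\middle|\, x \right]
  \;\le\; B(x) + c
  \qquad \text{for all } x \in X .
```

Under this reading, the expected one-step growth of $B$ is at most $c$ when the defender plays a minimizing action, regardless of the adversary's response.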

Remark 2

S-CBCs generalize control barrier certificates and barrier certificates seen in prior work. If for every , then we recover the definition of a control barrier certificate [23]. The definition of a barrier certificate [22], [36] is obtained by additionally requiring that for every and . Here denotes stochastic equivalence of the respective stochastic processes [35]. In the latter case, when , the function is a supermartingale. For this case, along with some additional assumptions on the system dynamics, asymptotic guarantees on the satisfaction of properties over the infinite time-horizon can be established [36].

Remark 3

Although our definition of S-CBCs in Definition 9 bears resemblance to the notion of a worst-case barrier certificate introduced in [36], there are some distinctions. While the entity in [36] considers a dynamical system with a single disturbance input, our setting considers three terms that influence the evolution of the state of the system: we want to find a defender input that will allow the barrier function to satisfy a certain property under any adversary input and disturbance. A second point of difference is that while [36] focuses on asymptotic analysis, we consider properties over a finite time horizon.

We limit our attention to stationary strategies for both players. Studying the effects of other strategies is left as future work. The following preliminary result will be used subsequently to determine a bound on the probability of reaching a subset of states under particular agent policies over a finite time-horizon.

Lemma 1

Consider a DDSG and let be an S-CBC as in Definition 9 with constant . Then, for any and initial state , for a stationary defender policy, , the following holds under any stationary adversary policy :

(11)
Proof

The proof follows from the result of Chapter III, Theorem 3 and Corollary 2-1 in [27], Definition 9, and the fact that the agents adopt stationary policies. ∎
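A bound of this type can be evaluated numerically once a certificate is available. The sketch below assumes a specific form: for a nonnegative function $B$ whose expected one-step increase is at most $c$, the probability that $B$ reaches a level $\lambda > 0$ within $T$ steps is at most $(B(x_0) + cT)/\lambda$. This form is an assumption of the sketch (it follows from a supermartingale argument applied to $B(x_k) + c(T-k)$) and should be checked against Equation (11); the function name `level_crossing_bound` and its parameters are hypothetical.

```python
def level_crossing_bound(b0: float, c: float, horizon: int, level: float) -> float:
    """Upper bound on P(sup_{k < horizon} B(x_k) >= level | B(x_0) = b0).

    Assumes B is nonnegative and its expected one-step increase is at most c.
    The result is clipped to 1 because it bounds a probability.
    """
    if b0 < 0 or c < 0 or horizon < 0:
        raise ValueError("b0, c, and horizon must be nonnegative")
    if level <= 0:
        raise ValueError("level must be positive")
    return min(1.0, (b0 + c * horizon) / level)


# Hypothetical numbers: B(x_0) = 0.1, c = 0.01, five time steps, level 1.
bound = level_crossing_bound(b0=0.1, c=0.01, horizon=5, level=1.0)
```

The bound grows linearly in the horizon, which is why a small constant $c$ matters for long time horizons.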

Definition 10 (Reachability)

For the DDSG with dynamics in Equation (1), let and be the set of possible initial states and be disjoint from . Then, given , is reachable with respect to , if . That is, the probability of reaching a state in starting from in the time horizon is upper bounded by .

Theorem 5.2

With and known, and , assume there exists an S-CBC , stationary policies, and , and constant . Additionally, if there is a constant such that:

  1. for all ,

  2. for all ,

then the DDSG starting from is reachable with respect to .

Proof

Observe that . Therefore, starting from , and following the respective agent policies, . Since this should be true for arbitrary , we have:

The second line of the above system of inequalities follows by setting in Lemma 1, and the fact that for all . ∎

5.3 Automaton-Based Verification

In order to verify that under agent policies and , we need to establish that . To do this, we first construct a DFA , that accepts all and only those words over that do not satisfy the formula . We have the following result:

Lemma 2

[9] For and a DFA , the following is true:

The construction of can also be carried out in Rabinizer4 [25]. The accepting runs of of length less than or equal to can be computed using a depth-first search algorithm [47]. For the purposes of this section, it is important to understand that the accepting runs of of length less than or equal to will give a bound on the probability that a particular pair of agent policies will not satisfy over the time horizon . Using Definition 4 and following the treatment of [22] and [23] define the following terms (the reader is also referred to these works for an example that offers a detailed treatment of the procedure):

(12)
(13)
(14)
(15)

Intuitively, is the set of accepting runs in of length not greater than , and without counting any self-loops in the states of the DFA. The set is the set of runs in with the first state transition labeled by . For an element of , defines the set of paths of length augmented with a ‘loop-bound’. The ‘loop-bound’ is an indicator of the number of ‘self-loops’ the run in the DFA can make at state while still keeping its length less than or equal to . We assume that when the run cannot make a self-loop at .
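The enumeration of accepting runs without self-loops can be sketched with a depth-first search. The snippet below is an illustration under stated assumptions, not the construction of Equations (12)-(15): the DFA is reduced to its state graph (transition labels are dropped), run length is measured by the number of states visited, and the names `accepting_runs_without_self_loops` and `EDGES` are hypothetical.

```python
from typing import Dict, List, Set


def accepting_runs_without_self_loops(
    edges: Dict[str, Set[str]],
    q0: str,
    accepting: Set[str],
    max_states: int,
) -> List[List[str]]:
    """Enumerate runs from q0 that end in an accepting state, skip self-loops,
    and visit at most `max_states` states."""
    runs: List[List[str]] = []

    def dfs(path: List[str]) -> None:
        q = path[-1]
        if q in accepting:
            runs.append(list(path))
        if len(path) == max_states:
            return
        for nxt in sorted(edges.get(q, set())):
            if nxt != q:  # self-loops are accounted for separately
                path.append(nxt)
                dfs(path)
                path.pop()

    dfs([q0])
    return runs


# Hypothetical state graph of a DFA with accepting state q2.
EDGES = {
    "q0": {"q0", "q1", "q2"},
    "q1": {"q1", "q2"},
    "q2": {"q2"},
}

runs = accepting_runs_without_self_loops(EDGES, "q0", {"q2"}, max_states=3)
```

Successors are visited in sorted order so that the output is deterministic; cycles other than self-loops are cut off by the length limit.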

5.4 Satisfaction probability using S-CBCs and

In this section, we show that the accepting runs of of length less than or equal to give an upper bound on the probability that a particular pair of agent policies will not satisfy the formula . We use this in conjunction with the S-CBC to derive a lower bound on the probability that will be satisfied for a particular choice of defender policy under any adversary policy. Specifically, we use Theorem 5.2 over each accepting run of of length less than or equal to to give a bound on the overall satisfaction probability.

Theorem 5.3

Assume that the DDSG has to satisfy a formula over horizon . Let be the DFA corresponding to the negation of , and for this DFA, assume that the quantities in Equations (12)-(14) have been computed. Then, for some and all the maximum value of the probability of satisfaction of for a defender policy under any adversary policy satisfies the following inequality:

where is the set of paths of length with loop bound for in an accepting run of length in .

Proof

For , consider (Equation (13)) and the set (Equations (14) and (15)). Consider an element . From Theorem 5.2, for some stationary defender policy , the probability that a trajectory of starting from and reaching under stationary adversary policy over the time horizon is at most . Therefore, the probability of an accepting run in of length at most starting from is upper bounded by:

Now consider Equation (2) of Problem 1. We have the following set of equivalences and inequalities:

Theorem 5.3 generalizes Theorem 5.2 of [23] to provide a lower bound for a stationary defender policy that maximizes the probability that the formula is satisfied by the DDSG over the time horizon , starting from for some for any stationary adversary policy.

5.5 Computing an S-CBC

The use of barrier functions will circumvent the need to explicitly compute sets of reachable states, which is known to be undecidable for general dynamical systems [29]. However, computationally efficient methods can be used to construct a barrier certificate if the system dynamics can be expressed as a polynomial [36]. This will allow for determining bounds on the probability of satisfaction of the LTL formula without discretizing the state space. In contrast, if the underlying state space is continuous, computing the satisfaction probability and the corresponding agent policy using dynamic programming will necessitate a discretization of the state space in order to approximate the integral in Equation (5).

We propose a sum-of-squares (SOS) optimization [34] based approach that will allow us to compute an S-CBC if the evolution of the state of the DDSG has a specific structure. The key insight is that if a function can be written as a sum of squares of different polynomials, then it is non-negative.
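The key insight can be checked by hand on a small polynomial. The sketch below uses the Gram-matrix view of SOS: a polynomial $p(x) = z(x)^\top Q z(x)$ with a positive semidefinite matrix $Q$ is a sum of squares and hence non-negative. The polynomial $p(x) = x^2 - 2x + 2 = (x-1)^2 + 1$, the monomial basis $z(x) = [1, x]$, and the helper names are illustrative assumptions; this is not how SOSTOOLS is invoked, and a real SOS program searches for $Q$ with a semidefinite solver instead of fixing it.

```python
from typing import List

Matrix = List[List[float]]


def gram_value(q: Matrix, x: float) -> float:
    """Evaluate z(x)^T Q z(x) for the monomial basis z(x) = [1, x]."""
    z = [1.0, x]
    return sum(q[i][j] * z[i] * z[j] for i in range(2) for j in range(2))


def is_psd_2x2(q: Matrix) -> bool:
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    trace = q[0][0] + q[1][1]
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    return trace >= 0 and det >= 0


# Gram matrix of p(x) = x^2 - 2x + 2 in the basis [1, x].
Q = [[2.0, -1.0],
     [-1.0, 1.0]]

q_is_psd = is_psd_2x2(Q)
samples = [-2.0, 0.0, 1.0, 3.5]
matches = all(abs(gram_value(Q, x) - (x * x - 2 * x + 2)) < 1e-9 for x in samples)
```

Because `Q` is positive semidefinite and reproduces $p$, the polynomial is certified non-negative without evaluating it on a grid.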

Assumption 2

The sets in the DDSG are continuous, and in Equation (1) can be written as a polynomial in for any . Further, the sets in Assumption 1 can be represented by polynomial inequalities.

Proposition 1

Under the conditions of Assumption 2, suppose that sets , , and , where the inequalities are element-wise. Assume that there is an SOS polynomial , constants and , SOS (vector) polynomials , and , and polynomials corresponding to the entry in , such that:

(16)
(17)
(18)

are all SOS polynomials. Then, satisfies the conditions of Theorem 5.2, and is the corresponding defender policy.

Proof

The proof of this result follows in a manner similar to Lemma 7 in [49] and Lemma 5.6 in [23], and we do not present it here. ∎

The authors of [23] discuss an alternative approach in the case when the input set has finite cardinality. A similar treatment is beyond the scope of the present paper, and will be an interesting future direction of research.

6 Example

We present an example demonstrating our solution approach to Problem 1.

Example 1

Let the dynamics of the DDSG with , is a compact subset of , , and (and i.i.d.) be given by:

(19)
(20)

Let , and sets such that for , . The sets are defined by:

The aim for an agent is to determine a sequence of inputs such that starting from , for any sequence of adversary inputs , it avoids obstacles in its environment, defined by the sets and for units of time. The corresponding formula is . The DFA that accepts is shown in Figure 1. Suppose we are interested in determining a bound on the probability of being satisfied for a time-horizon of length . Using Equations (12) - (15), we have , and for .

Figure 1: The DFA that accepts for the formula and .

We use a sum-of-squares optimization toolbox, SOSTOOLS [37], along with SDPT3 [48], a semidefinite program solver. The barrier function was assumed to be a polynomial of degree two. For the case , we determine the smallest value of that will satisfy the conditions in Proposition 1 to compute an S-CBC. The output of the program was an S-CBC given by

The environment, the obstacles denoted by the sets , and the contours of the S-CBC are shown in Figure 2. We observe that is less than in some part of . A possible reason is that when solving for the second condition in Proposition 1, we work with the union of the sets and , which may lead to a conservative estimate of the S-CBC.

Figure 2: The regions along with the computed secure control barrier certificate (S-CBC): . The regions with red boundaries () denote obstacles in the environment. is the set from which the agent starts at time . The contours show the values of the S-CBC of degree 2 ranging from to .

From Theorem 5.2 and the computed value of , we have that

This bound is conservative in the sense that we consider defender inputs for only the extreme values of and . However, for the dynamics in Equation (20), if the last inequality in Proposition 1 is non-negative for both and , then for any , this quantity will be non-negative.

Determining methods to explicitly compute a defender policy and considering S-CBCs of higher degree are areas of future research.

7 Conclusion

This paper introduced a new class of barrier certificates to provide probabilistic guarantees on the satisfaction of temporal logic specifications for CPSs that