1 Introduction
Pushdown Systems (PDSs) are known to be a natural model for sequential programs [18]. Therefore, networks of pushdown systems are a natural model for concurrent programs where each PDS represents a sequential component of the system. In this context, Dynamic Pushdown Networks (DPNs) [6] were introduced by Bouajjani et al. as a natural model of multithreaded programs with procedure calls and thread creation. Intuitively, a DPN is a network of pushdown processes where each process, represented by a Pushdown System (PDS), can perform basic pushdown actions, call procedures, and spawn new instances of pushdown processes. Much previous research has investigated automated methods to verify DPNs. In [6, 15, 14, 9], the reachability analysis of DPNs is considered. The model-checking problem for DPNs against double-indexed properties, i.e., properties where the satisfiability of an atomic proposition depends on the control states of two or more threads, is undecidable [10]. In contrast, it is decidable to model-check DPNs against linear temporal logic (LTL) and computation tree logic (CTL) with single-indexed properties [19], i.e., properties where the satisfiability of an atomic proposition depends on the control states of only one thread.
CARET is a temporal logic of calls and returns [1]. This logic allows us to write linear temporal formulas while taking into account the matching between calls and returns. CARET is needed to describe several important properties such as malicious behaviors or API usage rules. Thus, to analyse such properties for multithreaded programs, we need to be able to check CARET formulas for DPNs. We tackle this problem in this paper. As LTL is a subclass of CARET, CARET model checking for DPNs with double-indexed properties is also undecidable. Thus, in this paper, we consider the model-checking problem for DPNs against single-indexed CARET formulas and show that it is decidable. A single-indexed CARET formula is a formula of the form $\bigwedge_{i=1}^{n} f_i$ where each $f_i$ is a CARET formula over a certain PDS $\mathcal{P}_i$. A DPN satisfies $\bigwedge_{i=1}^{n} f_i$ iff all instances of the PDS $\mathcal{P}_i$ created in the network satisfy the subformula $f_i$.
The model-checking problem of DPNs against single-indexed CARET formulas is non-trivial because the number of instances of pushdown processes in DPNs can be unbounded. It is not sufficient to check whether every PDS $\mathcal{P}_i$ satisfies the corresponding formula $f_i$. Indeed, we need to ensure that all instances of $\mathcal{P}_i$ created during a run of the DPN satisfy the formula $f_i$. Nor is it correct to check whether all possible instances of $\mathcal{P}_i$ satisfy $f_i$: an instance of $\mathcal{P}_i$ should not be checked if it is not created during the run of the DPN. In this paper, we solve these problems. We show that single-indexed CARET model checking is decidable for DPNs. To this end, we reduce the problem of checking whether a DPN satisfies a single-indexed CARET formula to the membership problem for Büchi Dynamic Pushdown Networks (BDPNs). Finally, we show that single-indexed CARET model checking is decidable for Dynamic Pushdown Networks communicating via nested locks.
Related work.
[5, 7, 2, 3] considered pushdown networks with communication between processes. However, these works consider only networks with a fixed number of threads. The model-checking problem for pushdown networks where synchronization between threads is ensured by a set of nested locks is considered in [12, 10, 11] for single-indexed LTL/CTL and double-indexed LTL. These works do not handle dynamic thread creation.
Multi-pushdown systems were considered in [13, 4] to represent multithreaded programs. These systems have only a finite number of stacks, and thus they cannot handle dynamic thread creation.
Pushdown networks with dynamic thread creation (DPNs) were introduced in [6]. The reachability problems of DPNs and their extensions are considered in [6, 9, 14, 15, 21]. [19] considers the model-checking problem of DPNs against single-indexed LTL and CTL, while [20] investigates the single-indexed LTL model-checking problem for DPNs with locks.
2 Linear Temporal Logic of Calls and Returns - CARET
In this section, we recall the definition of CARET [1]. A CARET formula is interpreted on an infinite path where each state on the path is associated with a tag in the set {call, ret, int}. A call-state denotes an invocation of a procedure of a program, while the corresponding ret-state denotes the ret statement of that procedure. A simple statement (neither a call nor a ret statement) is called an internal statement and its associated state is called an int-state.
Let $s_0 s_1 \ldots$ be an infinite path where each state on the path is associated with a tag in the set {call, ret, int}. Over this path, three kinds of successors are defined for every position $i$:

global-successor: The global-successor of $i$ is $i+1$.

abstract-successor: The abstract-successor of $i$ is determined by the tag associated with $s_i$.

If $s_i$ is a call, the abstract-successor of $i$ is the matching return point.

If $s_i$ is an int, the abstract-successor of $i$ is $i+1$.

If $s_i$ is a ret, the abstract-successor of $i$ is defined as $\bot$ (undefined).

caller-successor: The caller-successor of $i$ is the innermost unmatched call if there is such a call. Otherwise, it is defined as $\bot$.
A global-path is obtained by repeatedly applying the global-successor operator. Similarly, an abstract-path or a caller-path is obtained by repeatedly applying the abstract-successor or the caller-successor, respectively.
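The three successor notions can be illustrated on a finite prefix of a tagged path. The following is a minimal sketch, not part of the original construction: the function name `caret_successors` is hypothetical, and it assumes that the tag at a position describes the step taken from that position, so that the matching return point of a call is the position right after its matching ret. `None` stands for an undefined successor, or for one that lies beyond the end of the prefix.

```python
from typing import List, Optional, Tuple

Succ = List[Optional[int]]

def caret_successors(tags: List[str]) -> Tuple[Succ, Succ, Succ]:
    """Return (global, abstract, caller) successor indices for a finite prefix.

    Assumption: tags[i] labels the step leaving position i, so the return
    point of a call is the position right after its matching ret.
    """
    n = len(tags)
    # Global successor: simply the next position, if it is inside the prefix.
    glob: Succ = [i + 1 if i + 1 < n else None for i in range(n)]
    abstract: Succ = [None] * n
    caller: Succ = [None] * n
    open_calls: List[int] = []  # stack of calls that have not returned yet

    for i, tag in enumerate(tags):
        # Caller successor: innermost call still open when position i is reached.
        caller[i] = open_calls[-1] if open_calls else None
        if tag == "call":
            open_calls.append(i)
        elif tag == "int":
            abstract[i] = glob[i]
        elif tag == "ret":
            if open_calls:
                j = open_calls.pop()
                abstract[j] = glob[i]  # return point of the matching call
            # The abstract successor of a ret position stays undefined.
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return glob, abstract, caller

if __name__ == "__main__":
    prefix = ["call", "int", "call", "int", "ret", "int", "ret", "int"]
    for name, succ in zip(("global", "abstract", "caller"), caret_successors(prefix)):
        print(name, succ)
```

Following the abstract successors from a position skips over complete procedure invocations, which is what lets CARET formulas relate a call to the code that runs after it returns.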
Formal Definition. Given a finite set of atomic propositions AP. Let . A CARET formula over AP is defined as follows (where ):
Let . Let be an ω-word over . Let be the suffix of starting from . Let , , be the global-successor, abstract-successor and caller-successor of respectively. The satisfiability relation is defined inductively as follows:


, where , iff and or

iff or

iff

iff

iff and

iff and

(with ) iff there exists a sequence of positions where , for every : and
Then, iff . Other CARET operators can be expressed by the above operators: , , ,…
Closure. Let be a CARET formula over . The closure of , denoted , is the smallest set that contains , , and and satisfies the following properties:


if , then

if (with ), then

if , then

if (with ), then

if , and is not in the form then
Atoms. A set is an atom of if it satisfies the following properties:



or

where or and

A includes exactly one element of the set {call, ret, int}
Let be the set of atoms of . Let and be two atoms, we define the following predicates:


iff for every iff .

iff for every iff

iff for every iff .
We define (resp. ) to be a function which returns the caller-formulas (resp. abstract-formulas) in . Formally:



3 Dynamic Pushdown Networks (DPNs)
3.1 Definitions
Dynamic Pushdown Networks (DPNs) are a natural model for multithreaded programs [6]. To define CARET formulas over DPNs, we must extend this model to record whether a transition rule corresponds to a call, a ret, or a simple statement (neither call nor ret).
Definition 1.
A Dynamic Pushdown Network (DPN) is a set s.t. for every , is a Labelled Dynamic Pushdown System (DPDS), where is a finite set of control locations, for all , is a finite stack alphabet, and is a finite set of transition rules. Rules of are of the following form, where , :




Intuitively, there are two kinds of transition rules depending on the nature of . A rule with a suffix of the form is a non-spawn rule (it does not spawn a new process), while a rule with a suffix describes a spawn rule (a new process is spawned). A non-spawn step describes pushdown operations of one single process in the network. Roughly speaking, a call statement is described by a rule in the form . This rule usually models a statement of the form where is the control point of the program where the function call is made, is the entry point of the called procedure , and is the return point of the call; and can be used to encode various information, such as the return values of functions, shared data between procedures, etc. A return statement is modeled by a rule , while a rule is used to model a simple statement (neither a call nor a return). A spawn step additionally allows the creation of a new process. For instance, a rule of the form where describes that a process at control location and having on top of the stack can (1) change the control location to and modify the stack by replacing with and also (2) create a new instance of a process () starting at . Note that in this case, if is call, then is , and if is ret, then is .
A DPDS can be seen as a Pushdown System (PDS) if there are no spawn rules in . Generally speaking, a DPN consists of a set of PDSs running in parallel where each PDS can dynamically spawn new instances of PDSs in the set during the run. An initial local configuration of a newly created instance is called a Dynamically Created Local Initial Configuration (DCLIC). For every , let be the set of DCLICs that can be created by the DPDS .
A local configuration of an instance of a DPDS is a tuple where is the control location, is the stack content. A global configuration of is a multiset over , in which is a local configuration of an instance of which is running in parallel in the network .
A DPDS defines a transition relation as follows: if then for every where if , if . Let be the transitive and reflexive closure of , then, for every :



if and , then,
A local run of an instance of a DPDS starting at a local configuration is a sequence s.t. for every , is a local configuration of , for some . A global run of from a global configuration is a set of local runs (possibly infinite) where each local run describes the execution of one instance of a certain DPDS . Initially, consists of local runs of instances starting from , when a new instance is created, a new local run of this instance is added to . For example, when a DCLIC is created by a certain local run of , a new local run that starts at is added to . Note that from a global configuration, we can obtain a set of global runs because from a local configuration, we can have different local runs.
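The effect of a single rule on a global configuration can be sketched as follows. This is only an illustration under stated assumptions: the names `Rule`, `local_step` and `global_step` are hypothetical, stacks are written as tuples with the top symbol first, and a global configuration is represented as a Python `Counter` used as a multiset of local configurations.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple

# A local configuration: (control location, stack with the top symbol first).
Config = Tuple[str, Tuple[str, ...]]

@dataclass(frozen=True)
class Rule:
    p: str                          # source control location
    gamma: str                      # symbol required on top of the stack
    tag: str                        # "call", "ret" or "int"
    p2: str                         # target control location
    w: Tuple[str, ...]              # word that replaces gamma on the stack
    spawn: Optional[Config] = None  # initial configuration of a spawned instance

def local_step(cfg: Config, rule: Rule) -> Optional[Config]:
    """Apply a rule to one local configuration, or return None if it is not enabled."""
    p, stack = cfg
    if not stack or p != rule.p or stack[0] != rule.gamma:
        return None
    return (rule.p2, rule.w + stack[1:])

def global_step(glob: Counter, cfg: Config, rule: Rule) -> Counter:
    """Let one instance in the multiset `glob` fire `rule`; add the spawned instance, if any."""
    if glob[cfg] == 0:
        raise ValueError("no running instance is in this local configuration")
    nxt = local_step(cfg, rule)
    if nxt is None:
        raise ValueError("rule is not enabled in this local configuration")
    out = Counter(glob)
    out[cfg] -= 1
    if out[cfg] == 0:
        del out[cfg]
    out[nxt] += 1
    if rule.spawn is not None:
        out[rule.spawn] += 1  # the newly created instance starts running
    return out

if __name__ == "__main__":
    start = Counter({("p0", ("g0",)): 1})
    call_and_spawn = Rule("p0", "g0", "call", "p1", ("f", "g1"), spawn=("q0", ("h0",)))
    print(global_step(start, ("p0", ("g0",)), call_and_spawn))
```

In this encoding a call rule replaces the top symbol by two symbols, a ret rule by the empty word, and a spawn rule additionally adds one local configuration to the multiset, which is why the number of running instances is not bounded in advance.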
3.2 Single-indexed CARET for DPNs
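Given a DPN consisting of DPDSs $\mathcal{P}_1, \ldots, \mathcal{P}_n$, a single-indexed CARET formula is a formula of the form $\bigwedge_{i=1}^{n} f_i$ s.t. for every $i \in \{1, \ldots, n\}$, $f_i$ is a CARET formula in which the satisfiability of its atomic propositions depends only on the DPDS $\mathcal{P}_i$.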
Given a set of atomic propositions , let be a labeling function that associates each control location with a set of atomic propositions.
Let be a local run of the DPDS . We associate to each local configuration of a tag in as follows, where or :


If corresponds to a transition rule , then .
Then, we say that satisfies iff the word satisfies . A local configuration of satisfies (denoted ) iff there exists a local run starting from such that satisfies . If is the set of DCLICs created during the run , then we write . A DPN satisfies a single-indexed CARET formula iff there exists a global run s.t. for every , each local run of in satisfies the formula .
4 Applications
We show in this section how model checking single-indexed CARET for DPNs is necessary for concurrent malware detection.
Malware detection is nowadays a big challenge. Many malware programs are multithreaded and involve recursive procedures and dynamic thread creation. Therefore, DPNs can be used to model such programs. We show in what follows how single-indexed CARET for DPNs can describe malicious behaviors of concurrent malware.
More precisely, we show how this logic can specify email worms. To this end, let us consider a typical email worm: Bagle. Bagle is a multithreaded email worm. In the main thread, one of the first things the worm does is register itself in the registry listing so that it is started at boot time. Then, it performs several actions to hide itself from users. After this, the malware creates one thread (named Thread2) that listens on port 6777 to receive commands and also allows the attacker to upload a new file and execute it. This gives the attacker the ability to install new versions of the malware. In addition, the attacker can send a crafted byte sequence to this port to force the malware to kill itself and delete itself from the system. Thus, the attacker can remove the malware remotely. In the next step, the malware creates one more thread (named Thread3) which contacts a list of websites every 10 minutes to announce the infection of the current machine. The malware sends the port it is listening on as well as the IP address of the infected machine to these sites. At some point in the program, the malware spawns a further thread, named Thread4, which searches the local drives for valid email addresses. In this thread, for each email address found, the malware attaches itself to a message and sends it to this address.
Thus, Bagle is a multithreaded malware with dynamic thread creation, i.e., the main process can create threads to fulfill various tasks. DPNs are a good candidate to model Bagle since they allow dynamic thread creation. Let be a model of Bagle where is a PDS that represents the main process of the malware; are PDSs that model the code segments corresponding to Thread2, Thread3 and Thread4 respectively. Note that are designed to execute specific tasks, while is a main process able to dynamically create an arbitrary number of instances of to fulfill the required tasks.
We now show how the malicious behavior of the different threads can be described by a CARET formula. Let us start with the main process. The typical behaviour of this process is to add its own executable name to the registry listing so that it is started at boot time. To do this, the malware needs to invoke the API function GetModuleFileNameA with 0 and $x$ as parameters. GetModuleFileNameA puts the file name of the current executable at the memory address pointed to by $x$. After that, the malware calls the API function RegSetValueExA with the same $x$ as parameter. RegSetValueExA uses the file name stored at $x$ to add the malware to the registry key listing. This malicious behaviour can be specified by CARET as follows:
where the disjunction is taken over all possible memory addresses $x$.
Note that parameters are passed via the stack in binary programs. For succinctness, we use regular variable expressions to describe the requirement that $x$, or 0 and $x$, is on top of the stack. Then, this formula states that there is a call to the API GetModuleFileNameA with 0 and $x$ on the top of the stack (i.e., with 0 and $x$ as parameters), followed by a call to the API RegSetValueExA with $x$ on the top of the stack. Using the abstract-path operator guarantees that RegSetValueExA is called after GetModuleFileNameA terminates.
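The role of the abstract path in this specification can be illustrated on a finite execution trace. The sketch below is a simplification and not the model-checking procedure of this paper: the function names are hypothetical, each event is a triple (tag, API name, values on top of the stack), the tag of an event is assumed to describe the step leaving it, and only a finite prefix is inspected, whereas CARET is interpreted over infinite paths.

```python
from typing import List, Optional, Tuple

# Hypothetical event format: (tag, API name or None, values on top of the stack).
Event = Tuple[str, Optional[str], Tuple[int, ...]]

def abstract_successors(tags: List[str]) -> List[Optional[int]]:
    """Abstract successor of each position of a finite prefix (None if undefined)."""
    n = len(tags)
    succ: List[Optional[int]] = [None] * n
    open_calls: List[int] = []
    for i, tag in enumerate(tags):
        nxt = i + 1 if i + 1 < n else None
        if tag == "call":
            open_calls.append(i)
        elif tag == "int":
            succ[i] = nxt
        elif tag == "ret" and open_calls:
            succ[open_calls.pop()] = nxt  # return point of the matching call
    return succ

def registers_itself(trace: List[Event]) -> bool:
    """True if some call GetModuleFileNameA(0, x) is followed, on its abstract
    path, by a call to RegSetValueExA with the same x on top of the stack."""
    succ = abstract_successors([tag for tag, _, _ in trace])
    for i, (tag, api, args) in enumerate(trace):
        if tag == "call" and api == "GetModuleFileNameA" and args[:1] == (0,) and len(args) >= 2:
            x = args[1]
            j: Optional[int] = i
            while j is not None:  # walk the abstract path starting at the call
                t, a, ar = trace[j]
                if t == "call" and a == "RegSetValueExA" and ar[:1] == (x,):
                    return True
                j = succ[j]
    return False

if __name__ == "__main__":
    trace: List[Event] = [
        ("call", "GetModuleFileNameA", (0, 100)),
        ("ret", None, ()),
        ("int", None, ()),
        ("call", "RegSetValueExA", (100,)),
        ("ret", None, ()),
        ("int", None, ()),
    ]
    print(registers_itself(trace))
```

Because the search follows abstract successors, a call to RegSetValueExA that occurs while GetModuleFileNameA is still running is not reported; only calls made after it has returned, at the same procedure level, are.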
Similarly, the malicious behaviors of the Threads 2, 3 and 4 can be described by CARET formulas , and respectively .
Thus, the malicious behavior of the concurrent worm Bagle can be described by the single-indexed CARET formula .
5 Single-indexed CARET model checking for DPNs
In this section, we consider the CARET model-checking problem of DPNs. Let be a labeling function that associates each control location with a set of atomic propositions. Let be a DPN, be a single-indexed CARET formula.
5.1 Büchi DPNs (BDPNs)
Definition 2.
A Büchi DPDS (BDPDS) is a tuple s.t. is a DPDS, is the set of accepting control locations. A run of a BDPDS is accepted iff it visits infinitely often some control locations in .
Definition 3.
A Generalized Büchi DPDS (GBDPDS) is a tuple , where is a DPDS and is a set of sets of accepting control locations. A run of a GBDPDS is accepted iff it visits infinitely often some control locations in for every .
Given a BDPDS or a GBDPDS , let be a local configuration of . Then, let be the set of all pairs s.t. has an accepting run from and is the set of DCLICs generated during that run. We get the following properties:
Proposition 1.
Given a GBDPDS , we can effectively compute a BDPDS s.t. .
This result follows from the fact that a GBDPDS can be translated to a corresponding BDPDS by an approach similar to the translation from a generalized Büchi automaton to a corresponding Büchi automaton [8].
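The translation referred to here can be sketched on a plain finite automaton. The code below shows the standard counter construction for turning a generalized Büchi condition into a single Büchi condition; it is only an illustration, the function name `degeneralize` is hypothetical, and the stack and spawn components that a DPDS rule would carry are omitted.

```python
from typing import Hashable, Iterable, Set, Tuple

State = Hashable
Trans = Tuple[State, str, State]

def degeneralize(states: Iterable[State],
                 transitions: Iterable[Trans],
                 accepting_sets: Iterable[Iterable[State]]):
    """Counter construction: generalized Buchi condition -> single Buchi condition.

    A new state (q, k) means "waiting to see a state of the k-th accepting set".
    The counter advances when the current state q belongs to that set, and one
    full cycle of the counter witnesses a visit to every accepting set.
    """
    states = set(states)
    # With no accepting sets every run is accepting, so use the set of all states.
    sets = [set(f) for f in accepting_sets] or [set(states)]
    n = len(sets)

    new_states: Set[Tuple[State, int]] = {(q, k) for q in states for k in range(n)}
    new_transitions = set()
    for (q, a, q2) in transitions:
        for k in range(n):
            k2 = (k + 1) % n if q in sets[k] else k
            new_transitions.add(((q, k), a, (q2, k2)))
    # Accepting: states of the first set while the counter is at 0.
    new_accepting = {(q, 0) for q in sets[0]}
    return new_states, new_transitions, new_accepting

if __name__ == "__main__":
    result = degeneralize({"a", "b"},
                          {("a", "x", "b"), ("b", "x", "a")},
                          [{"a"}, {"b"}])
    for part in result:
        print(sorted(part))
```

A run of the resulting automaton visits its accepting states infinitely often exactly when the corresponding run of the original automaton visits every accepting set infinitely often.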
Definition 4.
A Büchi Dynamic Pushdown Network (BDPN) is a set s.t. for every , is a BDPDS. A (global) run of a BDPN is accepted iff all local runs in are accepting (local) runs.
Definition 5.
A Generalized Büchi Dynamic Pushdown Network (GBDPN) is a set s.t. for every , is a GBDPDS. A (global) run of a GBDPN is accepted iff all local runs in are accepting (local) runs.
Given a BDPN or a GBDPN , let be the set of all global configurations s.t. has an accepting run from . We get the following properties:
Proposition 2.
Given a GBDPN , we can effectively compute a BDPN s.t. .
This result is obtained due to the fact that we can translate each GBDPDS in to a corresponding BDPDS in .
Given a BDPN where . Let be the index of the local configuration . Let . Then, we get the following theorem:
Thus, from Proposition 2 and Theorem 5.1, we get that the membership problem of a GBDPN is decidable.
Theorem 5.2.
The membership problem of GBDPNs is decidable.
5.2 From CARET model checking of DPNs to the membership problem in BDPNs
Given a local run , let be the index of the DPDS corresponding to . Let be an initial global configuration of the DPN , then we say that satisfies iff has a global run starting from s.t. every local run in satisfies . Determining whether satisfies is a non-trivial problem since the number of global runs can be unbounded and the number of local runs of each global run can also be unbounded. Note that it is not sufficient to check whether every pushdown process $\mathcal{P}_i$ satisfies the corresponding CARET formula $f_i$. Indeed, we need to ensure that all instances of $\mathcal{P}_i$ created during a global run satisfy the formula $f_i$. Nor is it correct to check whether all possible instances of $\mathcal{P}_i$ satisfy the formula $f_i$: an instance of $\mathcal{P}_i$ should not be checked if it is not created during a global run. To solve these problems, we reduce the CARET model-checking problem for DPNs to the membership problem for GBDPNs. To do this, we compute a GBDPN where () is a GBDPDS s.t. (1) the problem of checking whether each instance of satisfies a CARET formula can be reduced to the membership problem of ; (2) if creates a new instance of starting from , which requires that ; must also create an instance of starting from a certain configuration (computed from ) from which has an accepting run. In what follows, we present how to compute such GBDPDSs.
Let (we explain later the need for these labels). Given a DPDS () and a corresponding CARET formula , we define as the set of atoms A () such that and . Our goal is, for every (), to compute a GBDPDS s.t. for every , satisfies iff there exists an atom where s.t. has an accepting run from .
GBDPDSs Computation.
Let us fix a DPDS in the DPN , a CARET formula in corresponding to the DPDS . In this section, we show how to compute such a GBDPDS corresponding to . Given a local configuration , let be the index of the DPDS corresponding to . We define as follows:


and } is the finite set of control locations of

is the finite set of stack symbols of .
The transition relation of is the smallest set of transition rules satisfying the following:


for every : for every ; such that:







implies ( and )

if ; where if


for every :


for every such that:







if ; where if


for every ; such that:








for every : for every , such that:








if ; where if

Let and be the set of formulas and formulas of respectively. The generalized Büchi accepting condition of is defined as: where



where where if then for every .

where where if then for every .
Given a configuration , let be the procedure to which belongs. For example, in Figure 1, , …, . Intuitively, we compute as a kind of product of and which ensures that: for every , satisfies iff there exists an atom s.t. has an accepting run from . To do this, we encode atoms of into control locations of . The form of control locations of is where contains all sub formulas of which are satisfied at the configuration , is a label to determine whether the execution of the procedure of , , terminates in the path . A configuration labeled with means that the execution of is finished in , i.e., the run will run through the procedure , reaches its ret statement and exits after that. On the contrary, labeled with means that in , the execution of the procedure never terminates, i.e., the run will be stuck in and never exits the procedure . Let