1 Introduction
The design of programs verifying some real-time specifications is a notoriously difficult problem, because such programs must take care of delicate timing issues, and are difficult to debug a posteriori. One research direction to ease the design of real-time software is to automatise the process. The situation may be modelled as a timed game, played by a controller and an antagonistic environment: they act, in a turn-based fashion, over a timed automaton [2], namely a finite automaton equipped with real-valued variables, called clocks, evolving at a uniform rate. A simple, yet realistic, objective for the controller is to reach a target location. We are thus looking for a strategy of the controller, that is, a recipe dictating how to play so that the target is reached no matter how the environment plays. Reachability timed games are decidable [4], and EXPTIME-complete [19].
Weighted extensions of these games have been considered in order to measure the quality of the winning strategy for the controller [9, 1]: when the controller has several winning strategies in a given reachability timed game, the quantitative version of the game helps choose a good one with respect to some metrics. This means that the game now takes place over a weighted (or priced) timed automaton [5, 3], where transitions are equipped with weights, and locations with rates of weights (the cost is then proportional to the time spent in this location, with the rate as proportional coefficient). While solving the optimal reachability problem on weighted timed automata has been shown to be PSPACE-complete [6] (i.e. the same complexity as the non-weighted version), weighted timed games are known to be undecidable [12]. This has led to many restrictions in order to regain decidability, the first and most interesting one being the class of strictly non-Zeno cost with only non-negative weights (in transitions and locations) [9]: this hypothesis requires that every execution of the timed automaton that follows a cycle of the region automaton has a weight far from 0 (in interval $[1, +\infty)$, for instance).
Negative weights are crucial when one wants to model energy or other resources that can grow or decrease during the execution of the system under study. In [16], we have recently extended the strictly non-Zeno cost restriction to weighted timed games in the presence of negative weights in transitions and/or locations. We have described there the class of divergent weighted timed games where each execution that follows a cycle of the region automaton has a weight far from 0, i.e. in $(-\infty, -1] \cup [1, +\infty)$. We were able to obtain a doubly-exponential-time algorithm to compute the values and almost-optimal strategies, while deciding the divergence of a weighted timed game is PSPACE-complete. These complexity results match the ones that could be obtained in the non-negative case from [9, 1].
The techniques used to obtain the results of [16] cannot be extended if the conditions are slightly relaxed. For instance, if we add the possibility for an execution of the timed automaton following a cycle of the region automaton to have weight exactly 0, the decision problem is known to be undecidable [10], even with non-negative weights only. For this extension, in the presence of non-negative weights only, an approximation schema has been proposed to compute arbitrarily close estimates of the optimal value [10]. To this end, the authors consider regions with a refined granularity so as to control the precision of the approximation. In this work, our contribution is twofold: first, we extend the class considered in [10] to the presence of negative weights; second, we show that the approximation can be obtained using a symbolic computation, based on the paradigm of value iteration.
More precisely, we define the class of almost-divergent weighted timed games where, for each strongly connected component (SCC) of the region automaton, executions following a cycle of this SCC have weights either all in $(-\infty, -1] \cup \{0\}$, or all in $\{0\} \cup [1, +\infty)$. In contrast, the divergent condition is equivalent to the same property on the strongly connected components, but without the presence of singleton $\{0\}$. Given an almost-divergent weighted timed game, an initial configuration and a threshold $\varepsilon > 0$, we compute a value that we guarantee to be $\varepsilon$-close to the optimal value when the play starts from that configuration. Moreover, we prove that deciding if a weighted timed game is almost-divergent is a PSPACE-complete problem.
In order to approximate almost-divergent weighted timed games, we first adapt the approximation schema of [10] to our setting. At the very core of their schema is the notion of kernels that collect all cycles of weight exactly 0 in the game. Then, a semi-unfolding of the game (in which kernels are not unfolded) of bounded depth is shown to be equivalent to the original game. Adapting this schema to negative weights requires addressing new issues:

The definition and the approximation of these kernels are much more intricate in our setting (see Sections 4 and 6). Indeed, with only non-negative weights, a cycle of weight $0$ only encounters locations and transitions with weight $0$. It is no longer the case with arbitrary weights, both for discrete weights on transitions (that could alternate between weight $+1$ and $-1$, e.g.) and continuous rates on locations: for this continuous part, this requires keeping track of the real-time dynamics of the game.

Some configurations may have value $-\infty$. While it is undecidable in general whether a configuration has value $-\infty$, we prove that it is decidable for almost-divergent weighted timed games (see Lemma 5).

The identification of an adequate bound to define an equivalent semi-unfolding of bounded depth is more difficult in our setting, as having guarantees on weight accumulation is harder (we can lose accumulated weight). We deal with this by evaluating how large the value of a configuration can be, provided it is not infinite. This is presented in Section 5.
We also develop, in Section 7, a more symbolic approximation schema, in the sense that it avoids the a priori refinement of regions. Instead, all computations are performed in a symbolic way using the techniques developed in [1]. This allows us to share as much as possible the different computations: comparing these schemas with the evaluation of MDPs or quantitative games like mean-payoff or discounted-payoff, it is the same improvement as when using value iteration techniques instead of techniques based on the unfolding of the model into a finite tree, which can contain the same location many times.
2 Weighted timed games
Clocks, guards and regions
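We let $X$ be a finite set of variables called clocks. A valuation of clocks is a mapping $\nu \colon X \to \mathbb{R}_{\geq 0}$. For a valuation $\nu$, $d \in \mathbb{R}_{\geq 0}$ and $Y \subseteq X$, we define the valuation $\nu + d$ as $(\nu + d)(x) = \nu(x) + d$, for all $x \in X$, and the valuation $\nu[Y \leftarrow 0]$ as $(\nu[Y \leftarrow 0])(x) = 0$ if $x \in Y$, and $(\nu[Y \leftarrow 0])(x) = \nu(x)$ otherwise. The valuation $\mathbf{0}$ assigns $0$ to every clock. A guard on clocks of $X$ is a conjunction of atomic constraints of the form $x \bowtie c$, where $\bowtie \in \{\leq, <, =, >, \geq\}$ and $c \in \mathbb{Q}$ (we allow for rational coefficients as we will refine the granularity in the following). Guard $\overline{g}$ is the closed version of a satisfiable guard $g$ where every open constraint $x < c$ or $x > c$ is replaced by its closed version $x \leq c$ or $x \geq c$. A valuation $\nu$ satisfies an atomic constraint $x \bowtie c$ if $\nu(x) \bowtie c$. The satisfaction relation is extended to all guards $g$ naturally, and denoted by $\nu \models g$. We let $\mathsf{Guard}(X)$ denote the set of guards over $X$.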
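The operations on valuations and guards described above can be illustrated by a minimal Python sketch. The representation is an assumption made for illustration only: a valuation is a dictionary from clock names to exact rationals, and a guard is a list of triples (clock, comparison, constant) read as a conjunction. The function names `delay`, `reset`, `satisfies` and `closure` are hypothetical.

```python
from fractions import Fraction
import operator

# Comparison symbols allowed in atomic constraints (assumed encoding).
OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       ">=": operator.ge, ">": operator.gt}

def delay(valuation, d):
    """Let d time units elapse: every clock grows by d."""
    return {x: v + d for x, v in valuation.items()}

def reset(valuation, clocks):
    """Reset the given clocks to 0 and leave the others unchanged."""
    return {x: (Fraction(0) if x in clocks else v) for x, v in valuation.items()}

def satisfies(valuation, guard):
    """A guard is a conjunction of atomic constraints (clock, op, constant)."""
    return all(OPS[op](valuation[x], c) for x, op, c in guard)

def closure(guard):
    """Closed version of a guard: strict comparisons become non-strict."""
    closed = {"<": "<=", ">": ">="}
    return [(x, closed.get(op, op), c) for x, op, c in guard]
```

For instance, a valuation on the boundary of an open constraint violates the guard but satisfies its closed version, which is the situation exploited by the corner-point abstraction.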
We rely on the crucial notion of regions, as introduced in the seminal work on timed automata [2]: intuitively, a region is a set of valuations that are all time-abstract bisimilar. We will need some refinement of regions, with respect to a granularity , with . Formally, with respect to the set of clocks and a constant , a region is a subset of valuations characterised by the vector and the order of fractional parts of , given as a partition of clocks: a valuation is in this region if () , for all clocks ; () for all ; () all clocks satisfy that have the same fractional part, for all . We denote by the set of regions, and we write as a shorthand for . We recover the traditional notion of region for . E.g., the figure on the right depicts regions as well as their refinement . For any integer guard , either all valuations of a given region satisfy , or none of them do. A region is said to be a time successor of the region if there exist , , and such that . Moreover, for , we let be the region where clocks of are reset.
Weighted timed games
A weighted timed game (WTG) is then a tuple where and are finite disjoint subsets of locations belonging to and , respectively, is a finite set of transitions, is the weight function, associating an integer weight with each transition and location, is a subset of target locations for player , and is a function mapping each target location and valuation of the clocks to a final weight of (possibly , , or ). The addition of target weights is not standard, but we will use it in the process of solving those games: anyway, it is possible to simply map each target location to the weight , allowing us to recover the standard definition. Without loss of generality, we suppose the absence of deadlocks except on target locations, i.e. for each location and valuation , there exists such that , and no transitions start in .
The semantics of a WTG is defined in terms of a game played on an infinite transition system whose vertices are configurations of the WTG. A configuration is a pair with a location and a valuation of the clocks. Configurations are split into players according to the location. A configuration is final if its location is a target location of . The alphabet of the transition system is given by and will encode the delay that a player wants to spend in the current location, before firing a certain transition. For every delay , transition and valuation , there is an edge if and . The weight of such an edge is given by . An example is depicted on Figure 1.
A finite play is a finite sequence of consecutive edges . We denote by the length of . The concatenation of two finite plays and , such that ends in the same configuration as starts, is denoted by . We let be the set of all finite plays in , whereas (resp. ) denote the finite plays that end in a configuration of (resp. ). A play is then a maximal sequence of consecutive edges (it is either infinite or it reaches ).
A strategy for (resp. ) is a mapping (resp. ) such that for all finite plays (resp. ) ending in nontarget configuration , there exists an edge . A play or finite play conforms to a strategy of (resp. ) if for all such that belongs to (resp. ), we have that . A strategy is memoryless if for all finite plays ending in the same configuration, we have that . For all strategies and of players and , respectively, and for all configurations , we let be the outcome of and , defined as the only play conforming to and and starting in .
The objective of is to reach a target configuration, while minimising the accumulated weight up to the target. Hence, we associate to every finite play its cumulated weight, taking into account both discrete and continuous costs: . Then, the weight of a play , denoted by , is defined by if is infinite (does not reach ), and if it ends in with . Then, for all locations and valuation , we let be the value of in , defined as , where the order of the infimum and supremum does not matter, since WTGs are known to be determined (the determinacy result is stated in [13] for WTGs, called priced timed games there, with one clock, but the proof does not use the assumption on the number of clocks). We say that a strategy of is $\varepsilon$-optimal if, for all , and all strategies of , . It is said to be optimal if this holds for $\varepsilon = 0$. A symmetric definition holds for optimal strategies of . If the game is clear from the context, we may drop the index from all previous notations.
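As an illustration of how discrete and continuous costs combine, here is a minimal Python sketch. It assumes the convention suggested by the introduction: an edge with delay d taken from a location of rate r through a transition of weight w costs r times d plus w. The encoding of a finite play as a list of triples and the names `edge_weight`, `cumulated_weight` and `play_weight` are assumptions made for this example.

```python
from fractions import Fraction

def edge_weight(rate, delay, transition_weight):
    """Continuous cost (rate times delay) plus discrete cost of the transition."""
    return rate * delay + transition_weight

def cumulated_weight(play):
    """Sum of edge weights along a finite play.

    `play` is a list of (rate of the source location, delay, transition weight).
    """
    return sum((edge_weight(r, d, w) for r, d, w in play), Fraction(0))

def play_weight(play, reaches_target, final_weight=Fraction(0)):
    """Weight of a maximal play: infinite if no target is reached,
    otherwise the cumulated weight plus the final weight of the target."""
    if not reaches_target:
        return float("inf")
    return cumulated_weight(play) + final_weight
```

With negative rates or transition weights, the cumulated weight may decrease along a play, which is exactly the phenomenon that the almost-divergence condition keeps under control on cycles.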
As usual in related work [1, 9, 10], we assume that the input WTGs have guards where all constants are integers, and all clocks are bounded, i.e. there is a constant such that every transition of the WTG is equipped with a guard such that implies for all clocks . We denote by (resp. , ) the maximal weight in absolute values of locations (resp. of transitions, edges) of , i.e. (resp. , ). We also assume that the output weight functions are piecewise linear with a finite number of pieces and are continuous on each region. Notice that the zero output weight function satisfies this property. Moreover, the computations we will perform in the following maintain this property as an invariant, and use it to prove their correctness.
Region and corner abstractions
The region automaton, or region game, (abbreviated as when ) of a game is the WTG with locations and all transitions with such that the model of guard (i.e. all valuations such that ) is a region , time successor of such that satisfies the guard , and . Distribution of locations to players, final locations and weights are taken according to . We call path a finite or infinite sequence of transitions in this automaton, and we denote by the paths. A play in is projected on a path in , by replacing every edge by the transition , where (resp. ) is the region containing (resp. ): we say that follows the path . It is important to notice that, even if is a cycle (i.e. starts and ends in the same location of the region game), there may exist plays following it in that are not cycles, due to the fact that regions are sets of valuations. By projecting away the region information of , we simply obtain:
Lemma 1
For all , regions , and , .
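Projecting a play onto a path of the region game amounts to replacing each valuation by the region that contains it. The following Python sketch computes a signature that identifies that region, assuming bounded clocks as stated above, so that no capping of large clock values is needed. The signature layout and the name `region_of` are hypothetical; refinement by a granularity is handled by scaling the valuation first.

```python
from fractions import Fraction
import math

def region_of(valuation, granularity=1):
    """Return a hashable signature of the region containing `valuation`.

    Two valuations get the same signature iff they agree on the integer parts,
    on which clocks have a zero fractional part, and on the ordering of the
    fractional parts (all taken after scaling by `granularity`).
    """
    scaled = {x: Fraction(v) * granularity for x, v in valuation.items()}
    int_parts = {x: math.floor(v) for x, v in scaled.items()}
    frac_parts = {x: scaled[x] - int_parts[x] for x in scaled}
    # Group clocks sharing the same fractional part, by increasing fractional part.
    classes = {}
    for x, f in frac_parts.items():
        classes.setdefault(f, set()).add(x)
    ordered = tuple(frozenset(classes[f]) for f in sorted(classes))
    zero = frozenset(x for x, f in frac_parts.items() if f == 0)
    return (tuple(sorted(int_parts.items())), zero, ordered)
```

Two valuations with the same signature for granularity 1 may be separated once the granularity is refined, which is what the refined regions are used for in the approximation.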
On top of regions, we will need the cornerpoint abstraction techniques introduced in [8]. A valuation is said to be a corner of a region , if it belongs to the topological closure and has coordinates multiple of (). We call corner state a triple that contains information about a location of the regiongame , and a corner of the region . Every region has at most corners. We now define the cornerpoint abstraction of a WTG as the WTG obtained as a refinement of where guards on transitions are enforced to stay on one of the corners of the current region: the locations of are all corner states of , associated to each player accordingly, and transitions are all such that there exists a transition of such that the model of guard is a corner satisfying the guard (recall that is the closed version of ), , and there exist two valuations , such that for some (the latter condition ensures that the transition between corners is not spurious). Because of this closure operation, we must also define properly the final weight function: we simply define it over the only valuation reachable in location (with ) by (the limit is well defined since is piecewise linear with a finite number of pieces on region ).
The WTG can be seen as a weighted game (with final weights), i.e. a WTG without clocks (which means that there are only weights on transitions), by removing guards, resets and rates of locations, and replacing the weights of transitions by the actual weight of jumping from one corner to another: a transition becomes an edge from to with weight (for all possible values of , which requires allowing multi-edges: the only case where several edges could link two corners using the same transition is when all clocks are reset in , in which case there is a choice for delay ). Note that delay is necessarily a rational of the form $k/N$ with $k \in \mathbb{N}$, since it must relate corners of regions. In particular, this proves that the cumulated weight of a finite play in is indeed a rational number with denominator $N$.
We will call corner play a play in the cornerpoint abstraction : it can also be interpreted as a timed execution in where all guards are closed (as explained in the definition above). It straightforwardly projects on a finite path in the region game : in this case, we say again that follows . Figure 2 depicts a play, its projected path in the region game and one of its associated corner plays.
Corner plays allow one to obtain faithful information on the plays that follow the same path:
Lemma 2
If is a finite path in , the set is an interval bounded by the minimum and the maximum values of the set .
Value iteration
We will rely on the value iteration algorithm described in [1] for a WTG .
If represents a value function—i.e. a mapping from configurations of to a value in —we denote by the image , for better readability, and by the function mapping each valuation to . One step of the game is summarised in the following operator mapping each value function to a value function defined by if , and otherwise
(1) 
where ranges over valid edges in . Then, starting from mapping every configuration to , except for the targets mapped to , we let for all . The value function represents the value , which is intuitively what can guarantee when forced to reach the target in at most steps.
More formally, we define the weight of a maximal play at horizon , as if reaches a target state in at most steps, and otherwise. Using this alternative definition of the weight of a play, we can obtain a new game value . Then, if is a tree of depth , if .
The mappings are piecewise linear for all , and preserves piecewise linearity over regions, so all iterates are piecewise linear with a finite number of pieces. In [1], it is proved that has a number of pieces (and can be computed within a complexity) exponential in and in the size of when . This result can be extended to handle negative weights in and output weights .
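To make the iteration concrete, here is a minimal Python sketch of the same scheme on a finite weighted game without clocks. This is a deliberate simplification and not the symbolic algorithm of [1]: values are plain numbers instead of piecewise linear functions, and the names `step` and `value_iteration` as well as the dictionary encoding are assumptions made for this example. It also assumes that non-target vertices have at least one outgoing edge.

```python
import math

def step(values, edges, owner, final):
    """One application of the operator: targets keep their final weight,
    Min vertices take the cheapest option, Max vertices the most expensive."""
    new_values = {}
    for v in values:
        if v in final:
            new_values[v] = final[v]
            continue
        options = [w + values[u] for (u, w) in edges[v]]
        new_values[v] = min(options) if owner[v] == "Min" else max(options)
    return new_values

def value_iteration(edges, owner, final, steps):
    """Start from +infinity everywhere except on targets, then iterate."""
    values = {v: final.get(v, math.inf) for v in edges}
    for _ in range(steps):
        values = step(values, edges, owner, final)
    return values

# Usage: a Min vertex "a", a Max vertex "b" and a target "t" of final weight 0.
edges = {"a": [("b", 1), ("t", 5)], "b": [("t", 2), ("a", -1)], "t": []}
owner = {"a": "Min", "b": "Max"}
final = {"t": 0}
print(value_iteration(edges, owner, final, 3))
```

After one step only the vertex with a direct edge to the target has a finite value, which matches the bounded-horizon reading of the iterates.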
3 Results
We consider the value problem that asks, given a WTG , a location and a threshold , to decide whether . In the context of timed games, optimal strategies may not exist. We generally focus on finding $\varepsilon$-optimal strategies, which guarantee the optimal value up to a small error $\varepsilon$. Moreover, when the value problem is undecidable, we also consider the approximation problem that consists, given a precision $\varepsilon$, in computing an $\varepsilon$-approximation of .
In the one-player case, computing the optimal value and an $\varepsilon$-optimal strategy for weighted timed automata is known to be PSPACE-complete [6]. In the two-player case, the value problem of WTGs (also called priced timed games in the literature) is undecidable with 3 clocks [12, 10], or even 2 clocks in the presence of negative weights [15] (for the existence problem asking if a strategy of player can guarantee a given threshold). To obtain decidability, one possibility is to limit the number of clocks to 1: then, there is an exponential-time algorithm to compute the value as well as $\varepsilon$-optimal strategies in the presence of non-negative weights only [7, 20, 17], whereas the problem is only known to be PTIME-hard. A similar result can be lifted to arbitrary weights, under restrictions on the resets of the clock in cycles [13].
The other possibility to obtain a decidability result [9, 16] is to enforce a semantical property of divergence (originally called strictly non-Zeno cost): it asks that every play following a cycle in the region automaton has weight far from $0$. It allows the authors to prove that playing for only a bounded number of steps is equivalent to the original game, which boils down to the problem of computing the value of a tree-shaped weighted timed game using the value iteration algorithm.
Other objectives, not directly related to optimal reachability, have been considered in [11] for weighted timed games, like mean-payoff and parity objectives. In this work, the authors manage to solve these problems for the so-called class of robust WTGs that they introduce. This class includes the class we consider, but is decidable in 2.
In [16], we generalised the strictly non-Zeno cost property of [9, 16] to weighted timed games with both positive and negative weights: we called them divergent weighted timed games. This article relaxes the divergence property, to introduce almost-divergent weighted timed games. We first define formally these classes of games. A cycle of the region game is said to be a positive cycle (resp. a 0-cycle, or a negative cycle) if every finite play following it has cumulated weight at least $1$ (resp. exactly $0$, or at most $-1$). A strongly connected component (SCC) of the region game is said to be positive (resp. negative) if every cycle in it is positive (resp. negative). An SCC of the region game is said to be non-negative (resp. non-positive) if every play following a cycle in it has cumulated weight either $0$ or at least $1$ (resp. either $0$ or at most $-1$).
Definition 1
A WTG is divergent if every SCC of its region game is either positive or negative. As a generalisation, a WTG is almost-divergent when every SCC of its region game is either non-negative or non-positive.
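The two conditions can be compared on a small Python sketch. It assumes that each cycle is summarised by the minimum and maximum weights of the corner plays following it (the bounds given by Lemma 2), and that the thresholds are $1$ and $-1$ as in the introduction. Since an SCC contains infinitely many cycles, the function only checks a finite sample of them, so it illustrates the definition rather than deciding it. The names `classify_cycle` and `classify_scc` are hypothetical.

```python
def classify_cycle(lo, hi):
    """Classify a cycle from the bounds [lo, hi] of the weights of its plays."""
    if lo == 0 and hi == 0:
        return "zero"
    if lo >= 1:
        return "positive"
    if hi <= -1:
        return "negative"
    return "other"  # some play has a weight strictly between -1 and 1, not 0

def classify_scc(cycle_bounds):
    """Return (divergent, almost_divergent) for a finite sample of cycles."""
    kinds = {classify_cycle(lo, hi) for lo, hi in cycle_bounds}
    divergent = kinds <= {"positive"} or kinds <= {"negative"}
    almost_divergent = kinds <= {"positive", "zero"} or kinds <= {"negative", "zero"}
    return divergent, almost_divergent
```

A sample mixing 0-cycles and positive cycles is accepted by the relaxed condition only, while a sample mixing positive and negative cycles is rejected by both.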
In [16], we showed that we can decide in doubly-exponential time the value problem for divergent WTGs. Unfortunately, it is shown in [10] that this problem is undecidable for almost-divergent WTGs (already with non-negative weights only, where almost-divergent WTGs are called simple). They propose a solution to the approximation problem, again with non-negative weights only. Our first result is the following extension of their result:
Theorem 3.1
Given an almost-divergent WTG , a location and $\varepsilon > 0$, we can compute an $\varepsilon$-approximation of in time doubly-exponential in the size of and polynomial in $1/\varepsilon$. Moreover, deciding if a WTG is almost-divergent is PSPACE-complete.
To obtain this result, we follow an approximation schema that we now outline. First, we will always reason on the region game of the almostdivergent WTG . The goal is to compute an approximation of for some state , with the region where every clock value is 0. As already recalled, techniques of [1] allow one to compute the (exact) values of a WTG played on a finite tree, using operator . The idea is thus to decompose as much as possible the game in a WTG over a tree. First, we decompose the region game into SCCs (left of Figure 3).
During the approximation process, we must think about the final weight functions as the previously computed approximations of the values of SCCs below the current one. We will keep as an invariant that final weight functions are piecewise linear functions with a finite number of pieces, and are continuous on each region.
For an SCC of the region game and an initial state provided by the SCC decomposition, we show that the game on the SCC is equivalent to a game on a tree built from a semi-unfolding (see middle of Figure 3) of the SCC from this state, of finite depth, with certain nodes of the tree being kernels. These kernels are some parts of the region game that contain all cycles of weight 0. The semi-unfolding is stopped either when reaching a final location, or when some location (or kernel) has been visited a certain fixed number of times: such locations deep enough are called stop leaves.
Our second result is a more symbolic approximation schema based on the value iteration only. It is more symbolic in the sense that it does not require the SCC decomposition, the computation of kernels nor the semiunfolding of the game in a tree.
Theorem 3.2
Let be an almostdivergent WTG such that for all configurations. Then the sequence converges towards and for every , we can compute an integer such that is an approximation of for all configurations.
Remark 1
In a weighted timed game, it is easy to detect the set of states with value $+\infty$: these are all the states from which $\mathsf{Min}$ cannot ensure reachability of a target location with final weight below $+\infty$. It can therefore be computed by an attractor computation, and is indeed a property constant on each region. In particular, removing those states from the region game does not affect the value of any other state and can be done in complexity linear in the size of the region game. We will therefore assume that the considered WTGs have no configurations with value $+\infty$.
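The attractor computation mentioned in this remark can be sketched in Python on a finite game graph, for instance the region game. This is a simple fixpoint loop written for readability: it is quadratic in the worst case, whereas a linear-time version would maintain predecessor lists and counters. The function name `attractor`, the player labels and the dictionary encoding are assumptions made for this example.

```python
def attractor(edges, owner, targets):
    """States from which the player "Min" can force reaching `targets`.

    `edges` maps each state to the list of its successors, and `owner` maps
    each non-target state to "Min" or "Max".
    """
    attr = set(targets)
    changed = True
    while changed:
        changed = False
        for state, successors in edges.items():
            if state in attr or not successors:
                continue
            if owner[state] == "Min":
                wins = any(s in attr for s in successors)  # one good move suffices
            else:
                wins = all(s in attr for s in successors)  # every move must be good
            if wins:
                attr.add(state)
                changed = True
    return attr
```

The states of value $+\infty$ are then the complement of the attractor of the targets whose final weight is not $+\infty$.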
4 Kernels of an almost-divergent WTG
The approximation procedure described before uses the so-called kernels in order to group together all cycles of weight 0. We study those kernels and give a characterisation allowing computability. Contrary to the non-negative case, the situation is more complex in our arbitrary case, since weights of both locations and transitions may differ from $0$ in the kernel. Moreover, it is not trivial (and may not be true in a non almost-divergent WTG) to know whether it is sufficient to consider only simple cycles, i.e. cycles without repetitions.
To answer these questions, let us first analyse the cycles of that we will encounter. Since we are in an almost-divergent game, by Lemma 2, all cycles of (with transitions of ) are either 0-cycles, positive cycles or negative cycles. Additionally, in an SCC of , we cannot find both positive and negative cycles by definition. Moreover, we can classify a cycle by looking only at the corner plays following it.
Lemma 3
A cycle is a 0-cycle iff there exists a corner play following it with weight $0$.
Proof
If the cycle is a 0-cycle, every such corner play will have weight $0$, by Lemma 2. Reciprocally, if such a corner play exists, all corner plays following the cycle have weight $0$: otherwise, by Lemma 2, the set of weights of the plays following the cycle would be an interval containing values that are non-zero and strictly between $-1$ and $1$, which would contradict the almost-divergence.
An important result is that 0-cycles are stable by rotation. This is not trivial because plays following a cycle can start and end in different valuations, therefore changing the starting state of the cycle could a priori change the plays that follow it and their weights.
Lemma 4
Let $\pi$ and $\pi'$ be paths of the region game. Then, $\pi\pi'$ is a 0-cycle iff $\pi'\pi$ is a 0-cycle.
Proof
Since is a cycle, and , so is correctly defined.
First, since there are finitely many corners, by constructing a long enough play following an iterate of , we can obtain a corner play that starts and ends in the same corner. Formally, we define two sequences of region corners and . We start by choosing any . Let be a corner of such that is accessible from by following . For every , let be a corner of such that is accessible from by following , and let be a corner of such that is accessible from by following . We stop the construction at the first such that there exists with . Additionally, we let and . This process is bounded since has at most corners.
For every , let be the weight of a play from to along , and let be the weight of a play from to along . The concatenation of the two plays has weight , since it follows the 0cycle . Therefore, all corner plays from to following have the same weight , and the same applies for . For every , the concatenation of and is a play from to , of weight , following . Since is a cycle, and the game is almostdivergent, all possible values of have the same sign.
Finally, we can construct a corner play from to by concatenating the plays . That play has weight . This implies that the terms , of constant sign, are all equal to . As a consequence, the concatenation of and is a corner play following of weight . By Lemma 3, we deduce that is a 0cycle.
We will now construct the kernel $K$ as the subgraph of the region game containing all 0-cycles. Formally, let $T_K$ be the set of transitions of the region game belonging to a simple 0-cycle, and $S_K$ be the set of states covered by $T_K$. We define the kernel $K$ as the subgraph of the region game defined by $S_K$ and $T_K$. Transitions outside $T_K$ with starting state in $S_K$ are called the output transitions of $K$. We define it using only simple 0-cycles in order to ensure its computability. However, we now show that this is of no harm, since the kernel contains exactly all the 0-cycles, which will be crucial in the approximation schema we present in Section 6.
Proposition 1
A cycle of the region game is entirely in the kernel if and only if it is a 0-cycle.
Proof
We prove that every 0cycle is in by induction on the length of the cycles. The initialisation contains only cycles of length , that are in by construction. If we consider a cycle of length , it is either simple or it can be rotated and decomposed into , and being smaller cycles. Let be a corner play following . We denote by the prefix of following and the suffix following . It holds that , and in an almostdivergent SCC this implies . Therefore, by Lemma 3 both and are 0cycles, and they must be in by induction hypothesis. Note that this reasoning proves that every cycle contained in a longer 0cycle is also a 0cycle.
We now prove that every cycle in is a 0cycle. By construction, every transition is part of a simple 0cycle. Thus, to every transition , we can associate a path such that is a simple 0cycle (rotate the simple cycle if necessary). We can prove (using both Lemmas 3 and 4) the following property by relying on another pumping argument on corners: If is a path in , then is a 0cycle of . Now, if is a cycle of in , there exists a cycle such that is a 0cycle, therefore is a 0cycle.
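The construction of the kernel from simple 0-cycles can be illustrated by a Python sketch on a finite graph with integer weights on edges only. This is a simplification: in a WTG, deciding whether a cycle of the region game is a 0-cycle goes through corner plays (Lemma 3), whereas here the weight of a cycle is simply the sum of its edge weights. The names `simple_cycles` and `kernel` are hypothetical, and the enumeration is exponential in the worst case.

```python
def simple_cycles(edges):
    """Enumerate the simple cycles of a graph, each exactly once.

    `edges` maps each node to a list of (successor, weight); every successor
    must itself be a key. Cycles are lists of (source, destination, weight).
    A cycle is discovered only from its smallest node, which avoids duplicates.
    """
    order = {node: i for i, node in enumerate(sorted(edges))}

    def explore(start, node, path, on_path):
        for succ, weight in edges[node]:
            if succ == start:
                yield path + [(node, succ, weight)]
            elif succ not in on_path and order[succ] > order[start]:
                yield from explore(start, succ,
                                   path + [(node, succ, weight)],
                                   on_path | {succ})

    for start in edges:
        yield from explore(start, start, [], {start})

def kernel(edges):
    """Transitions lying on a simple cycle of total weight 0, and their states."""
    transitions = set()
    for cycle in simple_cycles(edges):
        if sum(weight for _, _, weight in cycle) == 0:
            transitions.update(cycle)
    states = {source for source, _, _ in transitions}
    return states, transitions
```

On a graph whose cycles all have weight 0 or at least 1, every cycle made only of kernel transitions has weight 0, in line with Proposition 1.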
5 Semi-unfolding of almost-divergent WTGs
Given an almost-divergent WTG , we describe the construction of its semi-unfolding (as depicted in Figure 3). This crucially relies on the absence of states with value $-\infty$, so we explain how to deal with them first:
Lemma 5
In an SCC of the region game, the set of configurations with value $-\infty$ is a union of regions computable in time linear in the size of the region game.
Proof (Sketch of proof)
If the SCC is non-negative, the cumulated weight cannot decrease along a cycle, thus the only way to obtain value $-\infty$ is to jump to a final state with final weight $-\infty$. We can therefore compute this set of states with an attractor for $\mathsf{Min}$.
If the SCC is nonpositive, we let (resp. ) be the set of target states where is bounded (resp. has value ). We also define (resp. ), the set of transitions of whose end state belongs to (resp. ). Notice that the kernel cannot contain target states since they do not have outgoing transitions. We can prove that a configuration has value iff it belongs to a state where player can ensure the LTL formula on transitions: . The procedure to detect states thus consists of four attractor computations, which can be done in time linear in .
We can now assume that no states of have value , and that the output weight function maps all configurations to . Since is piecewise linear with finitely many pieces, is bounded. Let denote the bound of , ranging over all target configurations.
We now explain how to build the semiunfolding . We only build the semiunfolding of an SCC of starting from some state of the region game, since it is then easy to glue all the semiunfoldings together to get the one of the full game. Since every configuration has finite value, we can prove that values of the game are bounded by . As a consequence, we can find a bound linear in , and such that a play that visits some state outside the kernel more than times has weight strictly above , hence is useless for the value computation. This leads to considering the semiunfolding of (nodes in the kernel are not unfolded, see Figure 3) such that each node not in the kernel is encountered at most times along a branch: the end of each branch is called a stop leaf of the semiunfolding. In particular, the depth of is bounded by , and thus is polynomial in , and