1. Introduction
Formal synthesis of controllers enforcing complex specifications on cyber-physical systems has gained significant attention in the last few years. This is mainly due to the need for formally verified control strategies realizing complex tasks, which are usually expressed as temporal-logic specifications or as (in)finite strings over automata. There are several techniques and tools available that provide automated, correct-by-construction controller synthesis for cyber-physical systems by utilizing symbolic models (a.k.a. finite abstractions) (Tabuada, 2009; Belta et al., 2017), in which the uncountable continuous states and inputs are aggregated into finitely many symbolic states and inputs via quantization (a.k.a. discretization). The so-called symbolic controllers are then computed by utilizing algorithmic machinery from computer science and mapped back for use in the original systems. State-of-the-art tools to synthesize such controllers include SCOTS (Rungger and Zamani, 2016), pFaces (Khaled and Zamani, 2019), QUEST (Jagtap and Zamani, 2017), Pessoa (Mazo et al., 2010), CoSyMA (Mouelhi et al., 2013), and Uppaal Stratego (David et al., 2015). These tools output huge lists of state-action pairs (a.k.a. lookup tables) representing the controllers.
Storing these symbolic controllers in memory is a major problem because they usually need to run on embedded devices with limited memory. However, instead of storing the controllers as lookup tables, we can mitigate this problem by using decision trees (DTs) (Mitchell, 1997), which exploit the hidden structure of a controller to represent it more compactly. As shown in (Ashok et al., 2019b), DTs can be orders of magnitude smaller than lookup tables. Such a concise representation opens the door to better readability, understandability, and explainability of the controllers, while reducing memory requirements and preserving correctness guarantees. Moreover, human-understandable controllers may also provide insight into the models themselves, thus aiding their validation, as we illustrate in the example below.
Our setting is inherently different from the usual use of DTs in machine learning: there, in order to generalize well, DTs typically do not fit the training data exactly; in contrast, in this work, DTs must represent the given controllers exactly in order to preserve their correctness guarantees. Consequently, our requirements on DTs differ: besides small size and explainability, we also require a perfect fit. It is therefore necessary to thoroughly re-evaluate current DT-learning algorithms and, where needed, modify them.
A basic technique used to represent controllers more concisely is to determinize them, i.e. to give up (maximal) permissiveness and retain only a single action for each state. To this end, one can, for instance, keep the action with the minimum norm from a reference input when least-energy-consuming controllers are preferred (Meyer et al., 2017), or the previously applied action (if possible) when lazy controllers are preferred (Mazo et al., 2010; Mouelhi et al., 2013). Such a size reduction by determinization can be applied as pre-processing before learning the DT representation of the controller, typically also yielding a smaller DT. Alternatively, one can apply other kinds of reduction by determinization as post-processing after constructing the DT. For instance, the "safe pruning" of (Ashok et al., 2019b) modifies the DT constructed for the maximally permissive controller by merging its leaves in a bottom-up fashion, thereby reducing the size and partially determinizing the controller. In contrast, here we introduce a novel approach that determinizes the controller during the construction of the DT, with advantages over both pre- and post-processing methods. Firstly, since the choice of the action for each state greatly affects the size and structure of the DT, it is advantageous to guide the choice by the concrete, already-built part of the DT, compared to the a priori choices made by pre-processing approaches. Secondly, while post-processing approaches have to construct a large tree first, our new technique directly constructs an already reduced tree, avoiding the intermediate large one and thus scaling better.
Motivating Example
Consider a temperature control system running in a building with 10 rooms, with heaters installed in only 2 rooms, as described in (Jagtap and Zamani, 2017). The permissive controller maintaining the temperatures of all the rooms within a certain range obtained using SCOTS is a lookup table with 52,488 state-action pairs. By naively determinizing, we get a lookup table with 26,244 symbolic states (i.e. the domain of the controller) and their respective actions. Standard DT learning, e.g. (Breiman et al., 1984), applied to these two lookup tables yields DTs with 8,648 and 2,703 decision nodes, respectively. While this is an improvement, it is far from being explainable. With the help of our novel determinization strategy presented in Section 4.2, we are able to obtain a decision tree with only 3 (!) decision nodes, see Figure 1. Apart from obtaining a compact and easily implementable controller representation while preserving correctness guarantees, the result is so small that it is immediately explainable and, moreover, allows us to improve on the implementation: one can readily see that we only need to install temperature sensors in two rooms instead of all 10 rooms, which will help users to reduce the system deployment cost as well as the required bandwidth to transfer the state information to the controller. Only 4 symbols (the leaves of the tree) need to be transferred to realize the controller.
We also obtain a controller with very few nodes for the cruise-control model of (Larsen et al., 2015). From such a clear representation one immediately notices that the controller makes the car decelerate when the car in front of it is far away. This counter-intuitive behaviour has thus revealed a bug in the model, which did not actually describe the intended behaviour of the system.
The contribution of this paper can be summarized as follows:

We present dtControl, an open-source tool to convert formally verified controllers into decision trees while preserving their correctness guarantees. dtControl has a simple input format and already supports automated conversion of controllers generated by two state-of-the-art tools – Uppaal Stratego (David et al., 2015) and SCOTS (Rungger and Zamani, 2016). It supports several output formats, most importantly graphical output as DOT files, useful for further analysis and visual presentation, and C source code, useful for closed-loop simulation or for loading onto embedded devices.
We introduce a new technique for using arbitrary binary classifiers in the DTs and a novel approach for determinizing controllers during the DT learning. Our approach is tuned towards obtaining extremely small, explainable DTs. In 5 out of the 8 case studies where it is applicable (i.e. where the original controllers are nondeterministic), it produces trees with single-digit numbers of decision nodes.

We present a comprehensive evaluation of 8 DT-learning algorithms on 10 case studies.
Related Work
DTs (Mitchell, 1997, Chapter 3) are a well-known class of data structures, particularly known for their interpretability, used mostly by machine-learning practitioners in classification or regression tasks. Our work is based on well-known algorithms for decision-tree learning, namely CART (Breiman et al., 1984), C4.5 (Quinlan, 1993) and OC1 (Murthy et al., 1993).
There has been previous work on combining decision trees with classifiers, namely Perceptrons (Utgoff, 1988), logistic regression models (Landwehr et al., 2003), and piecewise functions (Neider et al., 2016; Christou and Efremidis, 2007; Ashok et al., 2019a). We generalize those approaches by allowing arbitrary binary classifiers to be used in our trees. Additionally, those methods are either restricted to only two labels, which is not applicable to controllers with more than two possible actions, or they only allow linear classifiers in leaf nodes (Ashok et al., 2019a; Neider et al., 2016). In contrast, our approach is applicable with an arbitrary number of actions and also leverages the power of linear classifiers in inner nodes.

An alternative to DTs are binary decision diagrams (BDDs) (Bryant, 1986). As seen in (Ashok et al., 2019b; Brázdil et al., 2018, 2015), BDDs have several disadvantages: firstly, due to their bit-level representation they do not retain the inherent flavour of strategies as maps from states to actions and, hence, are hardly explainable. Secondly, they are notoriously hard to minimize (Brázdil et al., 2018), also because finding the best variable ordering is NP-complete (Bryant, 1986). Moreover, BDDs only allow binary classification, so the actions have to be joined with the state space to represent a controller. The recent result in (Zapreev et al., 2018) discusses various heuristic-based determinization algorithms for BDDs representing controllers; however, these still suffer from the disadvantages of BDDs mentioned above. Algebraic decision diagrams (ADDs) (Bahar et al., 1997) are an extension of BDDs that allows more than two labels, i.e. every action can be associated with a leaf node. However, they still suffer from the same drawbacks as BDDs. In (Girard, 2013), ADDs are used for controller representation; however, no concrete algorithm is provided.

The formal methods community has made use of decision trees to represent controllers and counterexamples arising from model checking Markov decision processes, stochastic games and LTL synthesis (Brázdil et al., 2015; Ashok et al., 2019a, b; Brázdil et al., 2018). DTs have also been used to represent policies learnt by reinforcement learning (Pyeatt et al., 2001). However, in contrast to our paper, (Pyeatt et al., 2001) does not preserve safety guarantees, only considers axis-aligned splits and does not consider nondeterminism. (Julian et al., 2018) suggests the possibility of using regression trees for representing policies, whereas we consider classification trees.

2. Tool
dtControl is an easy-to-use open-source tool for post-processing memoryless symbolic controllers into various compact and more interpretable representations. We report the input and output formats as well as the algorithms that are currently supported. Note that the tool can easily be extended with new formats and algorithms. dtControl is distributed as an easy-to-install pip package (pip is a standard package-management system used to install and manage software packages written in Python; see https://pypi.org/project/dtcontrol/), along with a user and developer manual (available at https://dtcontrol.readthedocs.io/en/latest/).
Dependencies
Input formats
dtControl currently accepts controllers in three formats: (i) a raw comma-separated values (CSV) format, with each row consisting of a vector of state variables concatenated with a vector of input variables; (ii) a sparse matrix format used by SCOTS; and (iii) the raw strategy produced by Uppaal Stratego. More details about the various formats are described in the user manual.
Algorithms
dtControl offers a range of parameters to adjust the DT learning algorithm, which are described in Section 4.
Output formats
dtControl outputs the decision tree in the DOT graph representation language (for visual presentation of the tree), as well as C code that can be directly used for implementation; see Appendix A for the DOT and C output that dtControl produces for the DT in Figure 1. Additionally, dtControl reports statistics for every constructed tree, namely its size, the minimum number of bits required to represent the symbols in the obtained controller, and the construction time.
3. Preliminaries: Decision tree learning
A decision tree (DT) over a domain $X \subseteq \mathbb{R}^d$ with a set of labels $A$ is a tuple $\mathcal{T} = (T, \rho, \theta)$, where $T$ is a finite full binary tree (every node has exactly 0 or 2 children), $\theta$ assigns to every leaf node (node with 0 children) a label $\theta(\ell) \in A$, and $\rho$ assigns to every inner node (node with 2 children, also called decision node) of the tree a predicate, which is a boolean function $\rho(t) : X \to \{\mathit{true}, \mathit{false}\}$.
The semantics of a DT is as follows: given a state $x \in X$, there is a unique decision path through the tree starting from the root node (the only node with no parent) to a leaf node $\ell$. This means that the label for state $x$ is $\theta(\ell)$. The decision path is defined by starting at the root node, and then for each decision node $t$ evaluating the predicate on the state, i.e. computing $\rho(t)(x)$, and picking the left child if the predicate is true and the right child otherwise.
For example, consider the DT in Figure 1: $T$ has 7 nodes, 3 of which are decision nodes (including the root node) and 4 of which are leaf nodes. A state of the system is a vector of 10 temperatures. To find the decision for a given state, we first evaluate the predicate in the root node. If the temperature in the second room is smaller than 20.625, the predicate is true and we go to the left child. We evaluate the next predicate in the same fashion and arrive at a leaf node whose label gives us a safe control input, in this case to turn on both heaters.
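As a concrete illustration of this decision-path semantics, here is a minimal Python sketch; the node classes, the toy predicates, and the action names are our own inventions that only loosely mirror Figure 1, not dtControl's internals:

```python
# Hypothetical sketch of DT evaluation; node classes and predicates are
# illustrative only and do not come from dtControl's code base.

class Leaf:
    def __init__(self, label):
        self.label = label

class Node:
    def __init__(self, predicate, left, right):
        self.predicate = predicate  # boolean function on states
        self.left = left            # taken when the predicate is true
        self.right = right          # taken otherwise

def decide(tree, state):
    """Follow the unique decision path for `state` and return its label."""
    node = tree
    while isinstance(node, Node):
        node = node.left if node.predicate(state) else node.right
    return node.label

# A toy one-decision tree over 10 room temperatures:
tree = Node(lambda x: x[1] <= 20.625,
            Leaf("heat both rooms"),
            Leaf("heaters off"))

print(decide(tree, [21.0, 19.5] + [21.0] * 8))  # -> heat both rooms
```

The loop mirrors the definition exactly: each inner node evaluates its predicate on the state and descends left on true, right on false, until a leaf supplies the action.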
All DT-learning algorithms implemented in dtControl follow the same underlying structure: given a finite set $S$ of feature-label pairs $(x, a)$, they return a DT that represents $S$ precisely; this means that for every $(x, a) \in S$, the leaf node $\ell$ on the decision path for $x$ has the label $\theta(\ell) = a$. In the setting of this paper, $S$ is a controller, features are states, and labels are actions (we use the term actions instead of control inputs to avoid confusion, since the control inputs are the outputs of a DT).
To learn the DT, the algorithm tries to minimize the entropy of $S$, denoted $H(S)$, by splitting it according to a predicate. Formally, for a set $S$ of feature-label pairs,
$$H(S) = -\sum_{a \in A} p_a \log_2 p_a,$$
where $p_a = |\{(x, a') \in S \mid a' = a\}| / |S|$ is the empirical probability of label $a$ in $S$; the notation $|\cdot|$ denotes the cardinality of a set. The underlying algorithm works recursively as follows:
Base case: If $H(S) = 0$, i.e. all pairs share the same label $a$, then return the following DT: the tree consists of a single leaf node $\ell$ with $\theta(\ell) = a$; $\rho$ has an empty domain in this case, as there are no decision nodes.

Recursive case: If $H(S) > 0$, then $S$ needs to be split; for that, we use a predicate $p \in \mathrm{PREDS}$ which splits $S$ into $S_{\mathit{true}} = \{(x, a) \in S \mid p(x)\}$ and $S_{\mathit{false}} = S \setminus S_{\mathit{true}}$, where the set $\mathrm{PREDS}$ of predicates to pick from is a parameter of the algorithm that is discussed in Section 4.1. We pick the predicate that minimizes the entropy after the split, i.e.
$$p^{*} = \operatorname*{arg\,min}_{p \in \mathrm{PREDS}} \; \frac{|S_{\mathit{true}}|}{|S|} H(S_{\mathit{true}}) + \frac{|S_{\mathit{false}}|}{|S|} H(S_{\mathit{false}}).$$
Intuitively, the best predicate is the one which splits $S$ into two parts that are as homogeneous as possible. Given the best predicate $p^{*}$, we recursively call the algorithm on the two subsets resulting from the split, getting two DTs $\mathcal{T}_{\mathit{true}}$ and $\mathcal{T}_{\mathit{false}}$; the indices indicate whether the predicate was true or false on the respective subset. Then we return the following DT: the tree has a root node $t$, with the left child being the root of $\mathcal{T}_{\mathit{true}}$ and the right child the root of $\mathcal{T}_{\mathit{false}}$. $\theta$ uses $\theta_{\mathit{true}}$ for the leaves of the left subtree and $\theta_{\mathit{false}}$ for those of the right subtree. $\rho$ is defined similarly on the inner nodes of the left and right subtrees, with the addition that $\rho(t) = p^{*}$, i.e. the predicate of the root is the predicate used for the split.
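The base and recursive cases above can be sketched as follows. This is our own minimal Python rendition with axis-aligned predicates only (controllers as lists of (state, label) pairs, trees as nested tuples); the real dtControl implementation differs:

```python
import math
from collections import Counter

def entropy(S):
    """H(S) for a list of (state, label) pairs."""
    counts = Counter(label for _, label in S)
    n = len(S)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def split_entropy(S, pred):
    """Weighted entropy after splitting S on the predicate x_i <= c."""
    i, c = pred
    S_true = [s for s in S if s[0][i] <= c]
    S_false = [s for s in S if s[0][i] > c]
    if not S_true or not S_false:          # split separates nothing
        return float("inf")
    return (len(S_true) * entropy(S_true)
            + len(S_false) * entropy(S_false)) / len(S)

def candidate_predicates(S):
    for i in range(len(S[0][0])):          # one candidate "x_i <= c" per
        for c in sorted({x[i] for x, _ in S}):   # discrete value of x_i
            yield (i, c)

def learn(S):
    if entropy(S) == 0:                    # base case: pure set -> leaf
        return ("leaf", S[0][1])
    best = min(candidate_predicates(S), key=lambda p: split_entropy(S, p))
    i, c = best                            # recursive case: split and recurse
    S_true = [s for s in S if s[0][i] <= c]
    S_false = [s for s in S if s[0][i] > c]
    return ("node", best, learn(S_true), learn(S_false))

S = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "b")]
print(learn(S))  # ('node', (0, 1), ('leaf', 'a'), ('leaf', 'b'))
```

On the toy data the unique entropy-minimizing split is $x_0 \le 1$, which separates the two labels perfectly in a single decision node.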
The symbolic controllers designed by SCOTS and Uppaal Stratego are generated by correctbyconstruction synthesis procedures. In order to use these controllers for original systems (i.e. with infinite continuous states and inputs), we need to refine the controllers. For more details on refinement procedures, we kindly refer the interested reader to (Reissig et al., 2016; Tabuada, 2009; Larsen et al., 2018).
dtControl preserves the correctness guarantees by representing the symbolic controllers precisely, i.e. iterating until the entropy in all leaf nodes is 0. In the case of determinization, dtControl precisely represents one of the deterministic sub-controllers, which is chosen on-the-fly during the construction.
4. Methods
There are two parameters of dtControl: the set of predicates to consider (PREDS) and the way in which nondeterminism is handled. For each of these, dtControl implements existing ideas and introduces new ones. Here, we only report the highlevel ideas; for a more detailed description, refer to the user or developer manual.
4.1. Predicates
4.1.1. Existing idea: Axisaligned splits
In the standard algorithms, e.g. (Breiman et al., 1984; Quinlan, 1993), only axis-aligned splits are considered, i.e. predicates of the form $x_i \le c$, where $x_i$ is one of the state variables and $c \in \mathbb{R}$ is a constant. In our setting, the set of possible predicates is greatly restricted due to discretization (quantization): the number of splits to be evaluated for each variable $x_i$ is equal to the number of discrete values of $x_i$.
4.1.2. Existing idea: Oblique splits
Besides the standard axis-aligned splits, dtControl also supports predicates of the form $\sum_i a_i x_i \le c$, where $a_i, c \in \mathbb{R}$. These oblique predicates (Murthy et al., 1993) incorporate information from multiple state variables in a single split and thus have the potential to greatly simplify the induced decision tree (Ashok et al., 2019a). However, due to combinatorial explosion, it is too costly to simply enumerate all possible oblique predicates even in the discretized space, which is why different heuristics are employed (Murthy et al., 1993). In this regard, dtControl supports the usage of predicates obtained using (an adapted version of) the OC1 algorithm (Murthy et al., 1993).
4.1.3. New technique: Using binary machinelearnt classifiers
It is possible to find non-axis-aligned predicates splitting the controller by using classification techniques from machine learning. As our main goal is for the resulting tree to be explainable, we want to avoid complex predicates, and thus we restrict the classifiers we consider in two ways: (i) we only consider linear classifiers, and (ii) we restrict ourselves to binary classifiers, so that the resulting tree is binary.
We use these binary linear classifiers in a way that is similar to classical one-vs-the-rest classification, e.g. (Bishop, 2007, Chapter 4): for each action $a \in A$, we train a classifier that tries to separate the states with that action from the rest. We then pick the classifier whose predicate minimizes the entropy after the split, i.e. the one minimizing $\frac{|S_{\mathit{true}}|}{|S|} H(S_{\mathit{true}}) + \frac{|S_{\mathit{false}}|}{|S|} H(S_{\mathit{false}})$.
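A sketch of this one-vs-the-rest search, using scikit-learn's LogisticRegression as the binary linear classifier; the function names, the data layout, and the toy data are ours, not dtControl's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def entropy(labels):
    """H of a label multiset (0.0 for an empty or pure set)."""
    n = len(labels)
    if n == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / n
    return float(-(p * np.log2(p)).sum())

def split_entropy(y, mask):
    """Weighted entropy of the two halves induced by a boolean mask."""
    n = len(y)
    return (mask.sum() * entropy(y[mask])
            + (~mask).sum() * entropy(y[~mask])) / n

def best_linear_predicate(X, y):
    """Fit one binary classifier per action; keep the lowest-entropy split."""
    X, y = np.asarray(X), np.asarray(y)
    best_clf, best_h = None, float("inf")
    for action in np.unique(y):
        clf = LogisticRegression().fit(X, (y == action).astype(int))
        mask = clf.predict(X).astype(bool)   # the learnt halfspace
        h = split_entropy(y, mask)
        if h < best_h:
            best_clf, best_h = clf, h
    return best_clf

# Toy controller: action "u0" for small states, "u1" for large ones.
X = [[0.0], [1.0], [10.0], [11.0]]
y = ["u0", "u0", "u1", "u1"]
clf = best_linear_predicate(X, y)
```

Each trained classifier induces a linear predicate (its decision halfspace), and the entropy criterion from Section 3 selects among them, so the rest of the tree-building loop is unchanged.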
We considered various linear classification techniques, including logistic regression (Bishop, 2007, Chapter 4), linear support vector machines (SVMs) (Bishop, 2007, Chapter 7), Perceptrons (Bishop, 2007, Chapter 5), and Naive Bayes (Zhang, 2004). However, the latter two yielded significantly larger DTs in all of our experiments, so dtControl does not offer these algorithms to the end-user.

In summary, dtControl currently supports four possibilities for the set $\mathrm{PREDS}$: axis-aligned predicates, the modified oblique-split heuristic from (Murthy et al., 1993), and oblique splits obtained via either logistic regression or linear SVM classifiers. Due to the modular structure of the code, it is easy to extend the existing approaches or add new methods, as described in our developer manual.
4.2. Nondeterminism
In the general algorithm described in Section 3, for the sake of simplicity, we restricted our procedure to controllers that deterministically choose a single control input. In the case of nondeterministic (also called permissive) controllers, the tuples in the controller have the form $(x, U)$, where $U$ is now a set of admissible control inputs. One approach to handle nondeterminism is to simply assign a unique label to each such set, and hence reduce the setting to the case where for every state there is only a single label. This means that the DT algorithm can be used in exactly the same way as described in Section 3. This method retains all information that was initially present in the given controller.
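This unique-label reduction can be sketched in a few lines; the tuple encoding of the controller and the function name are ours, for illustration only:

```python
def unique_labels(S):
    """Map every distinct set of admissible actions to one fresh label."""
    table = {}                        # frozenset of actions -> fresh label
    relabelled = []
    for x, actions in S:
        label = table.setdefault(frozenset(actions), len(table))
        relabelled.append((x, label))
    return relabelled, table

# Two states share the admissible set {u1, u2}, so they share a label:
S = [((0,), {"u1", "u2"}), ((1,), {"u2"}), ((2,), {"u2", "u1"})]
relabelled, table = unique_labels(S)
print(relabelled)  # [((0,), 0), ((1,), 1), ((2,), 0)]
```

After this relabelling, each state carries exactly one label, so the deterministic algorithm of Section 3 applies unchanged.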
The disadvantage of handling nondeterminism like this is that the number of unique labels may be as large as $2^{|A|}$. In order to avoid this blow-up and optimize memory, one can decide to determinize the controller. If we have some knowledge about which value of a control input is optimal, e.g. from domain knowledge or because it was computed by an optimization algorithm as in Uppaal Stratego (David et al., 2015), this information can be used to eliminate the nondeterministic choice. Otherwise, one can use a standard determinization approach, e.g. picking the value with the minimum norm. The tree can then simply be constructed from the determinized labels. Additionally, we propose the following alternative to these determinization approaches.
Novel determinization approach: Maximal frequencies
Our new determinization technique MaxFreq aims to minimize the size of the resulting DT. The underlying general idea is simple: if many of the data points share the same label, a DT-learning algorithm should group them together under the common label. This idea naturally gives a determinization strategy when applied in our context.
Consider a set $S$ of pairs $(x, U)$ of states and sets of actions. The goal is to identify for each state a single action which can be assigned to it. Let $\mathit{freq} : A \to \mathbb{N}$ be the action-frequency function, which maps actions to their number of occurrences in $S$. Then, for each state $x$ with $(x, U) \in S$, we reassign to $x$ the single label which appears with the highest frequency; formally, our determinization procedure produces for each such state the action $a^{*} = \operatorname*{arg\,max}_{a \in U} \mathit{freq}(a)$.
Once we have determinized $S$, we can use any method presented in Section 4.1 to find a predicate for the current node. After the set is split, the procedure is applied recursively to both child nodes, recomputing the action frequencies each time.
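A minimal sketch of one MaxFreq step; the data layout and names are our own illustration, not dtControl's implementation:

```python
from collections import Counter

def max_freq_determinize(S):
    """Pick, for each state, its admissible action of highest global frequency."""
    freq = Counter(a for _, actions in S for a in actions)
    return [(x, max(actions, key=lambda a: freq[a])) for x, actions in S]

# "u2" is admissible in all three states, so every state keeps "u2":
S = [((0,), {"u1", "u2"}), ((1,), {"u2"}), ((2,), {"u2", "u3"})]
print(max_freq_determinize(S))  # [((0,), 'u2'), ((1,), 'u2'), ((2,), 'u2')]
```

In the full algorithm this step is re-run in every tree node after each split, so the frequencies are always computed relative to the data reaching the current node.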
In summary, dtControl offers 3 different possibilities to handle nondeterminism: unique labels retaining the information, determinizing upfront by picking the action with the minimal norm, and using the novel heuristic MaxFreq.
5. Experiments
Table 1. Number of rows of the lookup tables and number of decision paths of the constructed DTs. The columns CART, LinSVM, LogReg and OC1 represent the most permissive controller; MaxFreq, MaxFreqLC, MinNorm and MinNormLC the determinized controller. "—" marks runs without a reported result; "n/a" marks models where determinization is not applicable.

| Case Study | Lookup table | CART | LinSVM | LogReg | OC1 | MaxFreq | MaxFreqLC | MinNorm | MinNormLC |
|---|---|---|---|---|---|---|---|---|---|
| Single-input nondeterministic | | | | | | | | | |
| cartpole (Jagtap et al., 2018) | 271 | 127 | 126 | 100 | 92 | 6 | 7 | 56 | 39 |
| 2D Thermal (Girard, 2013) | 40,311 | 14 | 14 | 8 | 12 | 5 | 4 | 8 | 4 |
| helicopter (Jagtap et al., 2018) | 280,539 | 3,174 | 2,895 | 1,877 | — | 115 | 134 | 677 | 526 |
| cruise (Larsen et al., 2015) | 295,615 | 494 | 543 | 392 | 374 | 2 | 2 | 282 | 197 |
| dcdc (Rungger and Zamani, 2016) | 593,089 | 136 | 140 | 70 | 90 | 5 | 5 | 11 | 11 |
| Multi-input nondeterministic | | | | | | | | | |
| 10D Thermal (Jagtap and Zamani, 2017) | 26,244 | 8,649 | 67 | 74 | 2,263 | 4 | 10 | 2,704 | 28 |
| truck_trailer (Khaled and Zamani, 2019) | 1,386,211 | 169,195 | — | — | — | 21,598 | 12,611 | 95,417 | 30,888 |
| traffic (Swikir and Zamani, 2019) | 16,639,662 | 6,287 | — | — | — | 4,477 | 98 | 80 | 690 |
| Multi-input deterministic | | | | | | | | | |
| vehicle (Rungger and Zamani, 2016) | 48,018 | 6,619 | 6,592 | 5,195 | 4,886 | n/a | n/a | n/a | n/a |
| aircraft (Rungger et al., 2015) | 2,135,056 | 456,929 | 407,523 | — | — | n/a | n/a | n/a | n/a |
All experiments were conducted on a server with an Intel Xeon W-2123 processor clocked at 3.60 GHz and 64 GB RAM. We ran the unique-label approach with all 4 possible predicate classes (see Section 4.1): axis-aligned predicates (CART) (Breiman et al., 1984), oblique predicates with linear support-vector machines (LinSVM) and with logistic regression (LogReg), and the heuristic from (Murthy et al., 1993), called OC1. Note that all these resulting trees represent the maximally permissive controller for the finite abstraction. Additionally, on all the nondeterministic models we ran our novel determinization approach (see Section 4.2) with axis-aligned predicates (MaxFreq) and with oblique predicates (MaxFreqLC, where LC stands for linear classifier). For the results in Table 1, we used logistic regression as the linear classifier, because it reliably performed well. As a competitor for our determinization approach, we use a priori determinization with the minimum norm, again both with axis-aligned predicates (MinNorm) and with logistic regression for linear predicates (MinNormLC). Additionally, we compare to random a priori determinization, to get an impression of cases where MinNorm would not be a natural choice but no better one is given. However, since its results are always worse, we only report the numbers in Appendix B. Since some of the algorithms rely on randomization, we ran all experiments thrice and report the median.
We run the discussed algorithms on ten case studies, five of which are marked as multi-input, meaning that their control inputs are multi-dimensional, i.e. vectors with more than one component. All our algorithms handle this by assigning each multi-dimensional control input a single action label, and then working on these labels exactly as in the case of single-dimensional control inputs.
In order to compare the sizes of the controller representations fairly, we provide two different comparisons. The first, straightforward one compares the number of nodes in the DT with the number of rows in the lookup table, which we do in Table 2 in Appendix B. However, a practically more relevant comparison should reflect the number of state symbols needed to capture the behaviour of the controller; these can also be directly related to memory requirements. To this end, in Table 1 we report for DTs the number of decision paths, as these induce a partitioning of the state space into symbolic states. For more information on this and an example, see Figure 2 and the discussion in Section 6.
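Since every decision path ends in a distinct leaf, counting decision paths amounts to counting leaves. A sketch over a toy tuple-encoded tree (the encoding ("leaf", label) / ("node", pred, true_child, false_child) is only for this example):

```python
def num_decision_paths(tree):
    """Count leaves, i.e. decision paths, of a tuple-encoded DT."""
    if tree[0] == "leaf":
        return 1
    return num_decision_paths(tree[2]) + num_decision_paths(tree[3])

toy = ("node", None,
       ("node", None, ("leaf", "a"), ("leaf", "b")),
       ("leaf", "c"))
print(num_decision_paths(toy))  # 3 decision paths, vs 5 nodes in total
```

This illustrates why the path counts in Table 1 are smaller than the node counts in Table 2: a full binary tree with $n$ leaves has $2n - 1$ nodes.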
Besides comparing DTs to the lookup tables, we also compare them to BDDs. However, BDD nodes do not directly correspond to state symbols. Hence we refrain from the state-symbol comparison and do not report BDD sizes in Table 1, but only in Appendix B. There, we compare the number of nodes in the BDDs to the number of nodes (not decision paths) generated by our DT algorithms. The BDDs were generated using SCOTS for all models but the two from Uppaal Stratego, cruise and 2D Thermal; for these two, we used the dd and autoref Python libraries. The BDDs were minimized as much as possible by calling reordering heuristics until convergence. The results show that the DT algorithms which determinize, or which do not use oblique predicates, are more scalable, as they were able to compute a result for all case studies, while BDDs timed out on dcdc and traffic. Depending on the case study, BDD sizes are usually in the same order of magnitude as CART, sometimes better, sometimes worse. On the one hand, on 10D Thermal and truck_trailer, BDDs have an order of magnitude fewer nodes; on the other hand, CART is able to produce results for dcdc and traffic. Compared to MaxFreq, the only exception is truck_trailer, where the best BDD has a quarter of the size; on all other models, MaxFreq is at least one order of magnitude better.

6. Discussion
Table 1 shows that DTs are consistently smaller than lookup tables. In the case of DTs exactly representing the most permissive controller, our linear-classifier-based algorithm LogReg generally performs better than the standard DT-learning algorithm CART. An inspection of the trees showed that oblique splits indeed aid in this reduction. In order to save memory, however, our determinizing algorithms may be used. Here, MaxFreq and its linear-classifier variant MaxFreqLC easily outperform all other discussed algorithms, returning trees which can be drawn on a single sheet of paper in most of our case studies! The controller produced by MaxFreq for the cartpole case study is depicted in Figure 1(a).
Apart from the compact representation of the controllers and efficient determinization, dtControl makes controllers more understandable, which helps in analysing the systems and the corresponding controllers. A few such analyses were mentioned for the temperature-control example in the introduction. Another application is that dtControl learns how to efficiently partition the state space. In general, the tools synthesizing symbolic controllers use uniform partitioning, i.e. a uniform quantizer is used to discretize the state set; therefore, they need a large number of symbols to represent the state set. dtControl aggregates state symbols in which the same control input is admissible, reducing the number of symbols required. In other words, dtControl provides a scheme to design non-uniform quantizers (i.e. state encoders with non-uniform partitioning of the state set), illustrated in Figure 1(b).
The entries in Table 1 correspond to the necessary number of state symbols. For instance, consider the cartpole example in Table 1. The controller obtained using SCOTS requires 271 symbols to represent the domain of the controller, which implies that one needs to send 9 bits per time unit over the sensor-controller channel to achieve invariance. After processing the controller using dtControl with MaxFreq, we only need 6 symbols to represent the controller, corresponding to only 3 bits of information. One can directly relate this idea of constructing efficient static coders to the notion of invariance feedback entropy introduced in (Tomar et al., 2017), which characterizes the necessary state information required by any coder-controller to enforce the invariance condition in the closed loop. For example, in the case of cartpole, the theoretical lower bound on the average bit rate for any static coder-controller to achieve invariance is 1 (obtained through the invariance feedback entropy (Tomar et al., 2017)), which is not far from the 3 bits computed using dtControl.
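The bit counts quoted above are simply the ceiling of the base-2 logarithm of the number of symbols, which can be checked in one line:

```python
import math

def bits_needed(n_symbols):
    """Minimum number of bits to address n_symbols distinct symbols."""
    return math.ceil(math.log2(n_symbols))

# 271 lookup-table symbols -> 9 bits; 6 MaxFreq symbols -> 3 bits.
print(bits_needed(271), bits_needed(6))  # 9 3
```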
In summary, one can utilize the results provided in this paper for constructing efficient coder-controllers for invariance properties, which is an active topic in the domain of information-based control (Nair et al., 2007).
7. Conclusion
We presented dtControl, an open-source, easily extensible tool for post-processing controllers synthesized by various tools such as SCOTS and Uppaal Stratego into small, efficient and interpretable representations. The tool allows for a comparison between various representations in terms of size and performance, and also allows us to export the controller both as a graphic and as code. We also presented a new determinization technique, MaxFreq, which easily converts nondeterministic controllers into extremely small deterministic decision trees. Further algorithms for controller representation were thoroughly evaluated and made accessible to the end-user. We believe these small representations will not only allow us to save memory but also help us in understanding and validating the models. As for future work, dtControl can be extended with

different predicates: this can be other, possibly even nonlinear or nonbinary, machinelearning classifiers or richer algebraic predicates utilizing domain knowledge;

other impurity measures instead of entropy, which decide the predicate used for a split.
Acknowledgements.
This work was supported in part by the H2020 ERC Starting Grant AutoCPS (grant agreement no. 804639), the German Research Foundation (DFG) through the grants ZA 873/1-1 and KR 4890/2-1 (Statistical Unbounded Verification), and the TUM International Graduate School of Science and Engineering (IGSSE) grant 10.06 PARSEC.

References
Strategy representation by decision trees with linear classifiers. In QEST (1), pp. 109–128.
SOS: safe, optimal and small strategies for hybrid Markov decision processes. In QEST (1), D. Parker and V. Wolf (Eds.), pp. 147–164.
Algebraic decision diagrams and their applications. Formal Methods in System Design 10 (2/3), pp. 171–206.
Formal methods for discrete-time dynamical systems. Vol. 89, Springer.
Pattern recognition and machine learning, 5th edition. Information Science and Statistics, Springer.
Counterexample explanation by learning small strategies in Markov decision processes. In CAV (1), Lecture Notes in Computer Science, Vol. 9206, pp. 158–177.
Strategy representation by decision trees in reactive synthesis. In TACAS (1), Lecture Notes in Computer Science, Vol. 10805, pp. 385–407.
Classification and regression trees. Wadsworth.
Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers 100 (8), pp. 677–691.
An evolving oblique decision tree ensemble architecture for continuous learning applications. In AIAI, IFIP, Vol. 247, pp. 3–11.
Uppaal Stratego. In TACAS, Lecture Notes in Computer Science, Vol. 9035, pp. 206–211.
Low-complexity quantized switching controllers using approximate bisimulation. Nonlinear Analysis: Hybrid Systems 10, pp. 34–44.
Software fault tolerance for cyber-physical systems via full system restart. arXiv preprint arXiv:1812.03546.
QUEST: a tool for state-space quantization-free synthesis of symbolic controllers. In International Conference on Quantitative Evaluation of Systems, pp. 309–313.
Deep neural network compression for aircraft collision avoidance systems. CoRR abs/1810.04240.
pFaces: an acceleration ecosystem for symbolic control. In Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control, pp. 252–257.
Logistic model trees. In ECML, Lecture Notes in Computer Science, Vol. 2837, pp. 241–252.
Guaranteed control synthesis for continuous systems in Uppaal TIGA. In Cyber Physical Systems. Model-Based Design: 8th International Workshop, CyPhy 2018, and 14th International Workshop, WESE 2018, Turin, Italy, October 4–5, 2018, Revised Selected Papers, R. D. Chamberlain, W. Taha, and M. Törngren (Eds.), Lecture Notes in Computer Science, Vol. 11615, pp. 113–133.
Safe and optimal adaptive cruise control. In Correct System Design, Lecture Notes in Computer Science, Vol. 9360, pp. 260–277.
Pessoa: a tool for embedded controller synthesis. In International Conference on Computer Aided Verification, pp. 566–569.
Quantitative implementation strategies for safety controllers. arXiv preprint arXiv:1712.05278.
Machine learning. McGraw-Hill Series in Computer Science, McGraw-Hill.
CoSyMA: a tool for controller synthesis using multi-scale abstractions. In Proceedings of the 16th International Conference on Hybrid Systems: Computation and Control, pp. 83–88.
OC1: a randomized induction of oblique decision trees. In AAAI, pp. 322–327.
Feedback control under data rate constraints: an overview. Proceedings of the IEEE 95 (1), pp. 108–137.
Synthesizing piecewise functions by learning classifiers. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 186–203.
Scikit-learn: machine learning in Python. Journal of Machine Learning Research 12, pp. 2825–2830.
Decision tree function approximation in reinforcement learning. In Proceedings of the Third International Symposium on Adaptive Systems: Evolutionary Computation and Probabilistic Graphical Models, Vol. 2, pp. 70–77.
C4.5: programs for machine learning. Morgan Kaufmann.
Feedback refinement relations for the synthesis of symbolic controllers. IEEE Transactions on Automatic Control 62 (4), pp. 1781–1796.
SCOTS: a tool for the synthesis of symbolic controllers. In HSCC, pp. 99–104.
State space grids for low complexity abstractions. In 2015 54th IEEE Conference on Decision and Control (CDC), pp. 6139–6146.
Compositional synthesis of symbolic models for networks of switched systems. IEEE Control Systems Letters 3 (4), pp. 1056–1061.
Verification and control of hybrid systems: a symbolic approach. Springer Science & Business Media.
Invariance feedback entropy of uncertain control systems. arXiv preprint arXiv:1706.05242.
Perceptron trees: a case study in hybrid concept representations. In AAAI, pp. 601–606.
Optimal symbolic controllers determinization for BDD storage. In ADHS.
The optimality of Naive Bayes. In Proceedings of the Seventeenth International Florida Artificial Intelligence Research Society Conference, Miami Beach, Florida, USA, V. Barr and Z. Markov (Eds.), pp. 562–567.
Appendix A Output of dtControl for DT in Figure 1
The following is the C code for the DT in Figure 1; Figure 3 shows the corresponding DOT output.
Appendix B Additional experimental results
In Table 2, we compare our algorithms as described in Section 5 to the size of BDDs representing the controllers and to the idea of randomly determinizing the controller before applying the DT algorithms. Unlike in Table 1, we report the full number of nodes, not the number of decision paths, to make the comparison to BDDs fairer. For clarity, we did not include all the algorithms from Table 1. However, if needed, one can compute the number of nodes for every algorithm by multiplying the number of decision paths in Table 1 by two and subtracting one.