1 Introduction
Real-world systems often have asymmetric state spaces and do not admit a natural product structure. To model such systems with traditional graphical models such as Bayesian Networks (BNs), including their various static and dynamic variants, a set of distinct measurement variables first needs to be elicited. For many asymmetric processes, this is not always easy, natural or defensible to do. Forcing a random-variable-based description of such systems results in a model that stores many redundant structural zeros within its conditional probability tables (CPTs). For instance, a variable describing the postoperative health condition of a patient does not make sense if the patient died during the operation.
Additionally, asymmetric systems often exhibit context-specific conditional independences, where the independence relationship depends on the realisations of the conditioning set. Modifications of the BN have been proposed for embodying context-specific independences (e.g., Geiger and Heckerman (1996); Boutilier et al. (1996); Poole and Zhang (2003)), often using tree-like adaptations of the graph or the CPTs in the process.
Chain Event Graphs (CEGs) (Collazo et al., 2018) address both these shortcomings of BNs when it comes to asymmetric processes. CEGs are built from event trees, which provide an excellent and natural framework to describe the evolution of such processes (Shafer, 1996). Through a series of transformations involving the colouring of its vertices, an event tree is transformed into the more concise graph of a CEG. The topology of the CEG describes the partial or complete symmetries within the different ways in which a process might evolve. This allows conditional independences to be read directly from the graph and supports causal exploration. Moreover, the graphical properties of the CEG can be drawn directly from natural language descriptions provided by domain experts before the tree is populated with probabilities.
Example 1: The hypothetical event tree in Figure 1 shows the different infection strains, treatment alternatives and outcomes for an individual (at vertex ) who shows the first symptoms of a particular infection.
The CEG family contains several dynamic variants called DCEGs (Barclay et al., 2015; Collazo and Smith, 2018; Shenvi and Smith, 2019). In this paper, we look at the continuous time DCEG (CTDCEG), which is inspired by the flexibility offered by semi-Markov processes (SMPs) in modelling non-exponentially distributed holding times at the various states. First introduced as the extended DCEG in Barclay et al. (2015), this class was developed through a special case called the reduced DCEG, which was applied to modelling public health interventions (Shenvi and Smith, 2019) and criminal investigations (Bunnin and Smith, 2019).
CTDCEGs differ from dynamic BNs (DBNs) (Nicholson and Brady, 1994) for the reasons stated above and additionally because the latter model discrete time processes. A continuous time analogue of BNs is given by continuous time BNs (CTBNs) (Nodelman et al., 2002). CTBNs were inspired by Markov processes and represent the dynamics of structured multi-component processes. The holding times at the vertices are restricted to exponential distributions (except in Nodelman et al. (2005b)). The major difference between CTDCEGs and CTBNs is that the former describe the possible life-histories of a unit within a temporal process, while the latter describe a temporal process through the evolution of the components describing the process. Hence CTBNs may have variables co-evolving, while in a CTDCEG a unit can only be in one state at a time and the model describes the time evolution from that state. Secondly, exact inference in CTBNs is exponential in the number of components, and the approximate inference techniques are either (a) variational (Nodelman et al., 2005a; Saria et al., 2007; Cohn et al., 2009), typically using expectation propagation; or (b) stochastic approximations (Fan and Shelton, 2008; El-Hay et al., 2008; Fan et al., 2010); in both cases using approximations that exploit the exponential holding time constraint of such models. Lastly, under plausible assumptions, the transition probabilities and holding times can be modelled independently in a CTDCEG.
Interestingly, a visionary but currently underdeveloped class of graphical models called the Temporal Nodes BN (TNBN) (Arroyo-Figueroa and Sucar, 1999) has several similarities with the CTDCEG. Although some properties of propagation in a TNBN are similar in intuition to those in a CTDCEG, they are presented in a non-technical way with no formal justification. Besides, point temporal evidence cannot be propagated through the TNBN (Galán and Díez, 2002). Another important difference between TNBNs and CTDCEGs is that the former, as it uses BN semantics, has to assume that each effect is caused by only one of its parents. The CTDCEG circumvents this problem through its event tree construction and the colouring of its vertices.
In Section 2 we review the terminology and framework of the CTDCEG class. The new inference scheme for CTDCEGs proposed in this paper exposed a gap in the existing literature, which we fill in Section 3 by presenting for the first time a static continuous time CEG (CTCEG) and proving an extension of the standard propagation algorithm (Thwaites et al., 2008) for this class. Our inference scheme for propagating evidence in CTDCEGs, inspired by the scheme in Kjærulff (1992) for DBNs, is presented in Section 4. This splits the CTDCEG into three sub-models, two of which have linear time complexity for propagation. While here we apply this to a continuous time setting, the scheme works in exactly the same way for DCEGs with discrete time domains. Later, in Section 5, we explore what we believe to be a highly applicable novel class of models called mixed CEGs, whose vertices are partitioned into one set for which holding times are meaningful and another for which they are not. We briefly discuss how the methods developed in this paper extend to this new class.
2 The Continuous Time DCEG
2.1 Framework and Terminology
A CTDCEG consists of a graph describing the possible developments of a process. It exploits the symmetries within these evolutions and so provides a concise representation. It also facilitates the exploration of the underlying dependences and associates a well-defined probability model to the graph. Similar to a CEG, a CTDCEG is constructed from an event tree, albeit one which has a continuous time domain. For clarity we have chosen to use notation as close as possible to Thwaites et al. (2008).
An event tree is a directed graph representing the evolution of a process in continuous time with a (possibly infinite) vertex set and edge set . The set of non-leaf vertices, called situations, is represented by . Without loss of generality, we assume that the time spent by a unit in a situation is dependent on the situation it visits next, say . Hence, this conditional holding time can be associated with edge which goes from to . We denote this by variable . The variable indicates the unconditional holding time in situation .
Symmetries within the event tree are expressed through vertex and edge colourings associated with partitions called stages and clusters. Two situations are in the same stage when they can be hypothesised to share the same conditional transition distribution. For the purposes of this paper we also require that for situations in the same stage, the edges with the same estimated probability share the same edge label. However, this is not essential. In fact, the edges may be retrospectively labelled to explain the symmetries observed, one of the key features of this family of models. Similarly, two edges are in the same cluster when they share the same holding time distribution. The vertices and edges of the event tree are coloured according to their stage and cluster memberships respectively. Such a coloured tree is referred to as a hued tree. So the stages indicate equivalences in where a unit goes and clusters indicate equivalences about the speed of these transitions. Table 1 gives the non-trivial stages and clusters for the finite event tree in Example 1.
An additional partition called positions defines the vertex set of the resultant CTDCEG. Two situations in the hued tree are in the same position if and only if the subtrees rooted at these situations are isomorphic to each other where the isomorphism preserves structure and colouring. We denote the set of positions in by .
Table 1: Non-trivial stages and clusters for the event tree in Example 1 (columns: Stages, Clusters).
From a hued tree representation, we obtain a CTDCEG by coalescing situations in the same position into a single vertex and by collapsing all its leaves into a single sink vertex . Hence . Note that . Only the nodes which are in the same stage but not the same position retain their colouring in the CTDCEG. Here we focus on CTDCEGs with a finite representation. However, this is not necessary for our proposed scheme as the graph is first “unrolled” (see Section 2.2) to propagate evidence. This gives our scheme the flexibility of adapting easily if and when the structure of the graph needs to be partially or fully changed at a later time.
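The coalescing step can be illustrated with a minimal Python sketch for a finite hued tree. It assumes a hypothetical dictionary encoding of the tree (`children`, `stage`, `cluster`) that is not part of the original notation, and it identifies positions by comparing a canonical signature of each coloured subtree. Edge labels are used to align the edges of same-stage situations, as required in the text.

```python
from collections import defaultdict

def positions(children, stage, cluster, root):
    """Group the vertices of a finite hued tree into positions.

    children: dict situation -> list of (edge_label, child) pairs
    stage:    dict situation -> stage colour
    cluster:  dict (situation, child) -> cluster colour of that edge
    All three encodings are hypothetical, chosen only for this sketch.
    """
    signature = {}

    def sig(v):
        kids = children.get(v, [])
        if not kids:
            # Leaves carry no colour; they all collapse into the sink.
            signature[v] = ("sink",)
            return signature[v]
        # A subtree is described by its stage plus, for each outgoing edge,
        # the edge label, the edge's cluster and the child's signature.
        parts = sorted((label, cluster[(v, c)], sig(c)) for label, c in kids)
        signature[v] = (stage[v], tuple(parts))
        return signature[v]

    sig(root)
    groups = defaultdict(list)
    for v, s in signature.items():
        groups[s].append(v)
    return sorted(sorted(g) for g in groups.values())

# Toy hued tree (hypothetical, not the tree of Figure 1).
children = {
    "s0": [("a", "s1"), ("b", "s2")],
    "s1": [("x", "l1"), ("y", "l2")],
    "s2": [("x", "l3"), ("y", "l4")],
}
stage = {"s0": "u0", "s1": "u1", "s2": "u1"}
cluster = {
    ("s0", "s1"): "c0", ("s0", "s2"): "c0",
    ("s1", "l1"): "c1", ("s2", "l3"): "c1",
    ("s1", "l2"): "c2", ("s2", "l4"): "c2",
}
example_positions = positions(children, stage, cluster, "s0")
```

In this toy tree `s1` and `s2` share a stage and their outgoing edges share clusters, so they fall into one position. Changing the cluster of a single edge leaves them in the same stage but splits them into separate positions.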
Example 2: Suppose we consider reinfection from the same or different strain for individuals who were not hospitalised. Assume that recovery from a particular strain does not offer any added or reduced resistance to that or the other strains. This gives an infinite event tree whose fragments are represented in Figure 1. The repetition in structure and probabilities results in the CTDCEG in Figure 2 where returns are represented by backward arrows labelled “recovered”.
These backward edges representing a repetition of structure in the underlying hued tree are called cyclic edges. Additionally, we define passage-slices, which will play the role of time-slices in this continuous time setting. The first passage-slice is a subgraph of the CTDCEG starting at its root and following all possible developments of a unit until it either arrives at or up to the vertex from which it traverses along a cyclic edge (e.g. in Figure 2). The subsequent passage-slices are a collection of subgraphs of the CTDCEG such that each subgraph is rooted at a vertex into which a cyclic edge from the preceding passage-slice enters, . The termination of each subgraph is determined as described for above. Thus, the cyclic edges connect the passage-slices. In practice, the time interval of a passage-slice can be arbitrarily defined. In our example, all the passage-slices are identical.
The event tree notation introduced earlier extends in an obvious way to CTDCEGs. Transition probabilities in a CTDCEG can be written as the probability of an event defined using the set of its root-to-sink paths, . The probability of reaching a position is given by where is the union of all paths in passing through . Similarly, the probability of passing through is given by where is the union of all paths in passing through the edge . The holding time density of staying at position for time before transitioning along edge is denoted by . Let and .
2.2 Unrolling a CTDCEG
Denote by the set of cyclic edges connecting passage-slice to , . Any DCEG can be “unrolled” into an infinitely large CEG (Collazo and Smith, 2018; Shenvi and Smith, 2019) by connecting its passage-slices with the corresponding cyclic edges. This is analogous to unrolling a DBN.
Denote by a CTDCEG which has been unrolled from passage-slices to , for . All the edges in are collected into a sink node . The unrolling process may result in multiple sink nodes, which are then merged into a single sink node. Figure 3 shows a pictorial description of this process.
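A minimal sketch of the unrolling step follows, under the simplifying assumption made in the running example that all passage-slices are identical. The edge-list encoding, the function name `unroll` and the vertex names are hypothetical. Each vertex is copied once per passage-slice, cyclic edges connect consecutive copies, and the cyclic edges leaving the last slice are collected into the single sink.

```python
def unroll(slice_edges, cyclic_edges, n_slices, sink="w_inf"):
    """Unroll a CTDCEG with identical passage-slices into an acyclic graph.

    slice_edges:  list of (tail, head, label) edges inside one passage-slice
    cyclic_edges: list of (tail, head, label) edges that re-enter the next slice
    Returns a list of edges whose non-sink endpoints are (vertex, slice_index).
    """
    edges = []
    for k in range(n_slices):
        for u, v, label in slice_edges:
            # The sink is shared by all slices, so it is never indexed.
            head = sink if v == sink else (v, k)
            edges.append(((u, k), head, label))
        for u, v, label in cyclic_edges:
            # Cyclic edges out of the final slice are collected into the sink.
            head = (v, k + 1) if k + 1 < n_slices else sink
            edges.append(((u, k), head, label))
    return edges

# Hypothetical two-vertex slice with one cyclic edge labelled "recovered".
slice_edges = [("w0", "w1", "infected"), ("w1", "w_inf", "died")]
cyclic_edges = [("w1", "w0", "recovered")]
unrolled = unroll(slice_edges, cyclic_edges, n_slices=2)
```

Because every edge either stays within a slice, moves to the next slice or enters the sink, the result is acyclic and can be treated as a static graph for propagation.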
2.3 Semi-Markov Representation
A CTDCEG can be represented as an SMP (see Section 5 and the relevant appendices of Shenvi and Smith (2019)). Informally, an SMP is a stochastic process where the next state occupied by a unit in state is determined by the transition probabilities of its embedded Markov chain, and the distribution governing the time spent in state is determined by the choice of . The positions of a CTDCEG can be regarded as states in its corresponding SMP. Furthermore, depending on the evidence, only a small subset of nodes of the original CTDCEG might be relevant. Such a CTDCEG can be represented by a condensed SMP. The state-transition diagram of the SMP for the CTDCEG in Example 2 is identical to Figure 2 with the exception that the two edges from position to are merged into a single edge with a holding time distribution that is a mixture of their individual holding time distributions.
3 The Continuous Time CEG
A CTCEG is a static variant of the CTDCEG or, equivalently, a continuous time analogue of the discrete time CEG. A CTCEG is an acyclic event-based graphical model with a total ordering (coming out of its event tree construction) and vertices evolving at possibly varying time granularities, a semi-Markovian approach. It has one sink node to collect its leaves. For simplicity, here we consider time-homogeneous CTCEGs.
Say that a CTCEG is completely specified when are specified. The joint distribution of any events measurable with respect to the σ-algebra of can be obtained when is completely specified. Let be a path of a sequence of edges, and be a sequence of times at which transitions are made along the edges of . We can write this as a sequence of triples of vertex, edge and time spent at the vertex before going along the edge, for example, . The joint probability of and can be specified as
3.1 Compatible Evidence and Events
Evidence in a BN is typically in the form of instantiations of a subset of its variables. In a CEG, evidence takes the form of intrinsic events occurring (Thwaites et al., 2008). Consider an event given by in a CTCEG . Call an intrinsic event if the subgraph of , say induced by the root-to-sink paths of are exactly the same as the set of root-to-sink paths contained in . With respect to temporal evidence, here we consider only point evidence. Call the temporal evidence temporally compatible if we know all the transition times for the unit starting from the root of up to a certain depth. If evidence defines an intrinsic event and is temporally compatible, call it compatible evidence. In Section 3.2.1, we consider temporal evidence where we only know the transition time for some specific, non-root vertex or vertices.
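The intrinsic-event condition can be checked mechanically on a small acyclic graph. The sketch below reads the definition as follows: an event, given as a set of root-to-sink paths, is intrinsic when the subgraph induced by its edges contains exactly those paths and no others. The edge encoding `(tail, head, label)` and the function names are hypothetical.

```python
def root_to_sink_paths(edges, root, sink):
    """Enumerate all root-to-sink paths of an acyclic graph as tuples of edges."""
    out = {}
    for e in edges:
        out.setdefault(e[0], []).append(e)
    paths, stack = set(), [(root, ())]
    while stack:
        v, path = stack.pop()
        if v == sink:
            paths.add(path)
            continue
        for e in out.get(v, []):
            stack.append((e[1], path + (e,)))
    return paths

def is_intrinsic(event_paths, root, sink):
    """True if the subgraph induced by the event contains no extra paths."""
    induced_edges = {e for p in event_paths for e in p}
    return root_to_sink_paths(induced_edges, root, sink) == set(event_paths)

# Hypothetical graph: two parallel edges into w1, two parallel edges to the sink.
a = ("w0", "w1", "a")
b = ("w0", "w1", "b")
c = ("w1", "w_inf", "c")
d = ("w1", "w_inf", "d")

intrinsic_event = {(a, c), (a, d)}      # "first edge was a"
non_intrinsic_event = {(a, c), (b, d)}  # induced subgraph also holds (a, d), (b, c)
```

The second event is not intrinsic because the edges it uses also admit the paths `(a, d)` and `(b, c)`, which the event excludes.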
3.2 A Propagation Algorithm
For a process described by a CTCEG , compatible evidence about the temporal evolution of a unit makes the retrospective transition probabilities dependent on the corresponding holding time densities. Assume that the temporal information in gives the transition times at all vertices from the root to the sink of the CTCEG. While we may not know the exact vertices visited by the unit, we know the time the unit made its th transition. Observing typically revises the probability of not visiting several of the root-to-sink paths, either partially or entirely, to one. These paths or parts thereof can be deleted from the graph of to obtain a condensed representation given by the adapted CTCEG subgraph called the transporter CTCEG. Note that where represents the set of paths implied by the compatible evidence . The original staging structure within may be destroyed by the deletion of vertices and edges to obtain . In this sense, only preserves the conditional independences which are still valid after observing . A CTCEG is called minimal when it contains no two vertices in that have isomorphic subtrees preserving structure and colourings. While this is not essential, we will assume from now on that our transporter CTCEG is minimal.
We describe below a two-pass backward-forward message-passing algorithm which has two main steps: a backward step to calculate the potentials and emphases, and a forward step which revises the transition probabilities. Note that refers to probabilities in and to the updated probabilities in .
Denote by all the edges emanating from vertex . Let . The algorithm proceeds as follows.

For each edge , , set the t-potential and h-potential as
if and zero otherwise. The holding time at is indicated by . Set the t-emphasis and h-emphasis as follows
Now we say that the sink and all the positions in are accommodated.

For an edge , such that all of ’s children are accommodated, set the t-potential and h-potential as
if and zero otherwise. Set the emphases as
Position is accommodated when the potentials and emphases are calculated for all .

For all , the revised conditional transition probabilities are given by
Note that for the edges in the transporter , the holding time densities are invariant under the compatible evidence and are simply imported from the relevant edges in .
A proof of this result is presented in the appendix. Let denote the vertices that have edges terminating in and denote the edges terminating in . The pseudocode for the above algorithm is given in Algorithm 1. Here, the possible arrival time at position is denoted by . Note that this algorithm is also applicable to the discrete time setting where the holding time densities are replaced by the corresponding probability mass functions.
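The backward-forward structure of the algorithm can be sketched in Python as follows. This is not a transcription of Algorithm 1: as a simplifying assumption, the t-potential and h-potential of an edge are collapsed into a single weight, namely the prior edge probability times the holding time density evaluated at the observed transition time. When no densities are supplied the sketch reduces to the standard CEG propagation of Thwaites et al. (2008). The data encoding and all numbers are hypothetical.

```python
def propagate(edges, prob, order, sink, evidence_edges, density=None):
    """Backward-forward propagation on an acyclic event graph.

    edges:          list of (tail, head, label)
    prob:           dict edge -> prior transition probability
    order:          vertices in topological order, root first, sink last
    evidence_edges: set of edges lying on paths compatible with the evidence
    density:        optional dict edge -> holding time density value at the
                    observed transition time (assumed weighting, see lead-in)
    """
    density = density or {}
    out = {}
    for e in edges:
        out.setdefault(e[0], []).append(e)

    # Backward pass: potentials of edges, emphases of vertices.
    emphasis = {sink: 1.0}
    potential = {}
    for w in reversed(order):
        if w == sink:
            continue
        total = 0.0
        for e in out.get(w, []):
            if e in evidence_edges:
                potential[e] = prob[e] * density.get(e, 1.0) * emphasis[e[1]]
            else:
                potential[e] = 0.0
            total += potential[e]
        emphasis[w] = total

    # Forward pass: revised conditional transition probabilities.
    revised = {e: potential[e] / emphasis[e[0]]
               for e in edges if emphasis[e[0]] > 0}
    return revised, emphasis

a, b = ("w0", "w1", "a"), ("w0", "w2", "b")
c1, d1 = ("w1", "w_inf", "c"), ("w1", "w_inf", "d")
c2, d2 = ("w2", "w_inf", "c"), ("w2", "w_inf", "d")
edges = [a, b, c1, d1, c2, d2]
prob = {a: 0.6, b: 0.4, c1: 0.5, d1: 0.5, c2: 0.2, d2: 0.8}
order = ["w0", "w1", "w2", "w_inf"]
evidence = {a, b, c1, c2}   # the unit is known to have left along a "c" edge

revised, emphasis = propagate(edges, prob, order, "w_inf", evidence)
timed, _ = propagate(edges, prob, order, "w_inf", evidence,
                     density={a: 2.0, b: 0.5})
```

Without temporal evidence the probability of edge `a` rises from 0.6 to 0.3/0.38. With the hypothetical density values it becomes 0.6/0.64, showing how an observed transition time that is more typical of one edge shifts the revised probabilities further towards it.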
3.2.1 Incomplete Temporal Evidence
So far we have considered evidence where we knew all the arrival times from the root up to the sink. However, it may happen that the evidence contains the arrival times only up to some non-sink vertex . In this scenario, the t-potentials and both the emphases are as defined in Section 3.2. The h-potential is set to one for all vertices for which the transition time from is unknown. For the other vertices, define the h-potential as in Section 3.2. Thus when the transition time for a certain vertex is not known, we revert to the standard CEG propagation algorithm for that vertex. Observe that both these types of temporal evidence automatically remove the possibility of having visited any paths that contain fewer or more vertices than we have arrival times.
In several applications, there may be directed paths of varying lengths from the root to some vertex , denote them by . Evidence might only indicate that the unit has arrived at at some time , following the convention that the arrival time at the root is . In such cases, often the interest is in knowing the probabilities associated with the different paths in . Examples of domains where this may be the case are medicine, law and criminal justice.
To obtain these path probabilities, we first construct the transporter CTCEG and calculate the revised transition probabilities, denoted by , only using the non-temporal evidence in and the CEG propagation algorithm. For each path , let the random variable indicate the time it takes to get from vertex to in , when the unit goes along path . Then is a convolution of the holding time densities along the edges in . The probability that the unit travelled to from along path is given by
where . The holding time densities are invariant under any evidence observed and the path probability is obtained as
Alternatively, revised transition probabilities could be obtained by iteratively computing the holding time density on edges upstream of given the temporal evidence in . However, this requires integrating over the possible times the unit could have arrived at the intermediate vertices, which is a non-trivial task.
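The path probabilities described in this subsection can be sketched numerically. The sketch reads the text as a Bayes update: each candidate path is weighted by its revised path probability times the density of its total travel time evaluated at the observed arrival time, where that density is the convolution of the holding time densities along the path. The grid-based trapezoidal convolution, the function names and the exponential densities are illustrative assumptions.

```python
import math

def convolve(f, g, step):
    """Trapezoidal convolution of two densities sampled at 0, step, 2*step, ..."""
    result = [0.0]
    for i in range(1, len(f)):
        s = sum(f[j] * g[i - j] for j in range(i + 1))
        s -= 0.5 * (f[0] * g[i] + f[i] * g[0])   # trapezoid end-point correction
        result.append(step * s)
    return result

def travel_time_density(densities, t, step=0.01):
    """Density at time t of the sum of independent holding times on a path."""
    n = int(round(t / step)) + 1
    grid = [i * step for i in range(n)]
    conv = [densities[0](x) for x in grid]
    for d in densities[1:]:
        conv = convolve(conv, [d(x) for x in grid], step)
    return conv[-1]

def path_posteriors(paths, t, step=0.01):
    """paths: dict name -> (revised path probability, list of edge densities)."""
    weights = {name: p * travel_time_density(ds, t, step)
               for name, (p, ds) in paths.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def exponential(rate):
    return lambda x: rate * math.exp(-rate * x)

# Hypothetical example: a one-edge path and a two-edge path, equally likely
# a priori, all holding times exponential with rate 1, arrival observed at t = 2.
paths = {
    "short": (0.5, [exponential(1.0)]),
    "long": (0.5, [exponential(1.0), exponential(1.0)]),
}
posterior = path_posteriors(paths, t=2.0)
```

Here the two-edge path has travel time density t·exp(−t), which at t = 2 is twice that of the one-edge path, so the posterior is 1/3 for the short path and 2/3 for the long one. For non-exponential holding times only the density functions change.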
4 Dynamic Propagation
For a given CTDCEG , suppose the evidence pertains to a set of positions contained in passage-slices to , . Assume that is compatible with respect to the model . Then can be split into three models: present, past and future. The construction and propagation scheme within each of these models is described below.
Present model: The present model is given by . Propagating the compatible evidence in this model proceeds exactly as in a static CTCEG (see Section 3). Note that situations in the event tree of a CTDCEG which have a directed path between them could be in the same position, as the infinitely large subtrees rooted at them could be isomorphic. However, when a CTDCEG is unrolled and we look at a finite number of passage-slices, they can no longer be in the same position. This is necessary for writing (2) as (3) in the proof in the appendix.
Past model: The unrolled model gives the past model. Any evidence in the present model also affects the past model. However, this need not be propagated to the past model unless we need to re-estimate the past probability distributions or make inferences about the positions within the past passage-slices.
Passage-slices from the past model can be moved to the present model in a straightforward way for re-estimating the probabilities therein. For instance, for inference on evidence concerning positions in passage-slice , , can be incorporated into the present model as follows. First, the revised past model is given by . Next, the vertices and edges that are not visited with probability one, conditioned on evidence and , are deleted from . Denote this by . For an edge , if , then connect it to the relevant vertex in the present model. This gives us the revised present model. Propagation continues backwards from the root vertices of the original present model to the root vertices of the revised present model.
Future model: The graph of a finite CTDCEG as it applies to passage-slices , is first adapted to delete all the edges and vertices that will not be visited in future passage-slices with probability one given the evidence . Call this . The conditional transition probabilities at each position are revised as
Recall that the holding time distributions defined along the extant edges in are invariant under observing evidence and can be imported directly from . The adapted CTDCEG can now be represented by the state-transition diagram of a, possibly condensed, SMP (see Section 2.3). Forecasts concerning probabilities of future events are calculated using the transition matrix of its embedded Markov chain. Additionally, all inferences that can typically be made from a semi-Markov process or a CTDCEG can still be made in the standard way (Shenvi and Smith, 2019).
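As a small illustration of forecasting with the embedded Markov chain, the sketch below pushes a starting distribution through a transition matrix for a fixed number of transitions. The state names and probabilities are hypothetical and are not taken from the example in this paper. The calculation counts transitions rather than elapsed time; forecasts in calendar time would also need the holding time distributions of the SMP.

```python
def step_distribution(dist, P):
    """One transition of the embedded Markov chain: new = dist * P."""
    return {s: sum(dist[r] * P[r].get(s, 0.0) for r in P) for s in P}

def forecast(start, P, n_transitions):
    """Distribution over states after n transitions, starting from `start`."""
    dist = {s: 1.0 if s == start else 0.0 for s in P}
    for _ in range(n_transitions):
        dist = step_distribution(dist, P)
    return dist

# Hypothetical embedded chain: w0 (infected), w1 (treated), w_inf (absorbing).
P = {
    "w0": {"w1": 0.7, "w_inf": 0.3},
    "w1": {"w0": 0.9, "w_inf": 0.1},
    "w_inf": {"w_inf": 1.0},
}
after_two = forecast("w0", P, n_transitions=2)
```

After two transitions the unit is back at `w0` with probability 0.7 × 0.9 = 0.63 and has been absorbed with probability 0.3 + 0.7 × 0.1 = 0.37.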
The above scheme, although simple, is capable of making a wide range of inferences. While it is analogous to the dynamic inference scheme for DBNs in Kjærulff (1992), movement between the three models is much easier for the CTDCEG. This is because we do not need to reconstruct a junction tree and propagation is carried out directly on the vertices and edges of the adapted graphs of the concerned model. More importantly, the complexity of propagation in the past and present models is linear in the number of vertices they contain, whereas it is exponential in the maximal clique size in a junction tree. Another key advantage of the CTDCEG is that additional intrinsic events observed always lead to a simplification of the graph and thereby to efficiency gains.
5 Applications
5.1 A Simple Application
We now revisit Example 2. Suppose that for an individual who has had the infection twice in the past, we observe that they had the infection again, were treated for it and recovered from it for the third time. Suppose we also observe that the individual had the following transition times recorded as the number of days since they recovered last time . For this evidence, the present model is simply given by Figure 8 and propagation in this model requires only 32 operations: 8 t-potentials, 8 h-potentials, 5 t-emphases, 5 h-emphases and 6 revised edge probabilities.
{Strain 1}  

{Strain 2}  
The future model is represented by an SMP whose state-transition diagram is given in Figure 5. The two edges from to in Figure 8 are replaced by a single edge in the SMP. As re-infection from any of the strains is possible, the CTDCEG for the future model is identical to the original CTDCEG in Figure 2. Observe that acts as an absorbing state in the SMP. This example is explored in greater detail in the supplementary material.
It is instructive to compare this problem representation with alternative dynamic models: the DBN and the CTBN. A DBN for Example 2 could be represented by the two-time-slice DBN in Figure 6(a) where the variables , and represent the strain of the infection, the treatment type and outcome respectively. Given the significant asymmetries in the example, this DBN is an approximation and hides away structural zeros within its CPTs. It also does not graphically represent the lack of treatment options for strain 3 of the infection.
Figure 6(b) shows a CTBN for Example 2. Due to structural zeros, some of the conditional intensity matrices are null matrices (e.g. for treatment given the third strain of infection). Additionally, as seen from Table 2, the process contains non-exponential holding times which CTBNs were not designed to represent: CTBN propagation algorithms rely on exploiting the exponential nature of the holding times.
5.2 A Mixed CEG Application
For certain types of transitions, it is not natural to define a holding time at that vertex. For instance, a vertex categorising an individual into different risk categories may not be meaningfully associated with a holding time. Call a CEG a mixed CEG if its vertex set can be partitioned into two mutually exclusive subsets and such that transitions from vertices in are associated with holding times and those from vertices in are not. However, using a mixed CEG is a modelling choice. The modeller may choose to associate such a vertex with, for example, the time it takes a health practitioner to make the categorisation. Additionally, may contain vertices with holding time distributions in discrete as well as continuous time domains. In our experience such mixed systems are more common in the real world than homogeneously defined ones. In fact, the public health applications considered in Shenvi and Smith (2019) are mixed DCEGs although they were not recognised as such. For illustrative purposes, we consider one of these applications to emphasise the usefulness of mixed (D)CEGs.
Example 3: Consider the mixed DCEG in Figure 7 based on a real-world falls intervention (Eldridge et al., 2005). The transitions from , describe categorical events that are not naturally associated with holding times. However, all transitions from the remaining vertices are best described in conjunction with how long it took for such transitions to occur and such descriptions are of clinical importance.
For propagation in a mixed CEG, the h-potentials for edges emanating from vertices in are set to one. The remaining potentials and emphases are defined as in Section 3.2. Thus our methodology can be adapted in a straightforward way to the wide range of applications for which mixed CEGs are appropriate.
6 Discussion
We presented an evidence propagation scheme for the CTDCEG, a temporal, dynamic event-based graphical model, within a statistically grounded framework. This scheme is also applicable to all other DCEGs. We also filled the technical gaps in the literature for such a scheme to work by providing a propagation algorithm for a static CTCEG, and briefly explored the novel class of mixed CEGs.
We have demonstrated how the CTDCEG gives a better representation of a process when it contains significant asymmetries and when the evolution of the process can be described under some total ordering of the events. For dynamic processes with fewer asymmetries, where co-evolution of components is important and where holding times are known to be exponential, the CTBN should be preferred. On the other hand, the DBN is the ideal and most well-developed modelling tool available when the process meets all the conditions for the CTBN and all the components can be assumed to evolve at the same discrete time granularity.
Finally, in this paper we only looked at certain types of point temporal evidence. Methods to incorporate interval temporal evidence in the CT(D)CEG are yet to be devised. Additionally, certain types of point and interval temporal evidence can lead to confounding by making all the past subprocesses highly dependent on each other. One example of this is addressed in a very simple way in Section 3.2.1. In case of such evidence we suggest (a) deferral of its inclusion until further evidence is obtained, or (b) use of approximate inference schemes. The latter remains an open problem for this class of models.
Appendix
The CTCEG propagation algorithm states that for we have
where the potentials and emphases are as defined in Section 3.2 and is some compatible evidence.
Proof:
Given that a unit is in some position , the probability of transitioning along an edge emanating from is independent of the holding time at . This gives us
(1) 
Let denote the event tree underlying the CTCEG and , a subtree of , denote the event tree underlying the transporter CTCEG . By the definition of a position, each corresponds to a set of vertices in . Then, for , we can split this into two mutually exclusive subsets: representing the vertices of in and representing the vertices of not in . Additionally, the paths in and in are a union of the paths in and in respectively for . Every represents . Thus, for some represented by , we have
This allows us to write (1), for , as
(2) 
to be evaluated on the tree . There is no directed path from and for as their subtrees are isomorphic in . Hence we have that . So we can write (2) as
(3) 
For we have that . Also, notice that writing as a probability over paths of is equivalent to writing it over paths of , for . So we can write (3) as
for any .
The proofs for where , , and follow exactly as given in Thwaites et al. (2008). Additionally, we have for ,
by definition of and by the invariance of the holding time density given any compatible evidence . We use induction to prove that where , .
Step 1: Consider the positions . We have
for any and .
Step 2: Now consider any such that all the vertices into which terminate have . Then we have
However, in a tree, we have that . So we can write as
This completes the proof.
Supplementary Material
A Propagation in the present model
In this section, we continue our analysis of Example 2 given the evidence in Section 5.1. The potentials and emphases for the present model are given in Tables 3 and 4 respectively. Here refers to the edge associated with strain of the infection, .
Table 3: Potentials for the present model (columns: Edge, t-potential, h-potential).
Table 4: Emphases for the present model (columns: Vertex, t-emphasis, h-emphasis).
The revised transition probabilities are shown in Figure 8. Let paths be given by the sequences of edges