From Random Differential Equations to Structural Causal Models: the stochastic case

03/23/2018 ∙ by Stephan Bongers, et al. ∙ University of Amsterdam

Random Differential Equations provide a natural extension of Ordinary Differential Equations to the stochastic setting. We show how, and under which conditions, every equilibrium state of a Random Differential Equation (RDE) can be described by a Structural Causal Model (SCM), while preserving the causal semantics. This provides an SCM that captures the stochastic and causal behavior of the RDE, which can model both cycles and confounders. This enables the study of the equilibrium states of the RDE by applying the theory and statistical tools available for SCMs, for example, marginalizations and Markov properties, as we illustrate by means of an example. Our work thus provides a direct connection between two fields that so far have been developing in isolation.

1 Introduction

Uncertainty and random fluctuations are a very common feature of real dynamical systems. For example, most physical, financial, biochemical and engineering systems are subjected to time-varying external or internal random disturbances. These complex disturbances and their associated responses are most naturally described in terms of stochastic processes. A more realistic formulation of a dynamical system in terms of differential equations should involve such stochastic processes. This led to the fields of stochastic and random differential equations, where the latter deals with processes that are sufficiently regular. Random differential equations (RDEs) provide the most natural extension of ordinary differential equations to the stochastic setting and have been widely accepted as an important mathematical tool in modeling and analysis of numerous processes in physics and engineering systems (Bunke, 1972; Soong, 1973; Sobczyk, 1991; Rupp and Neckel, 2013).

These internal and external disturbances of RDEs are not only of stochastic nature, but they are also of causal nature. They are causal in the sense that the disturbance processes are affecting other processes of the system. This allows us to model interventions on RDEs by forcing certain processes to be of a certain form, e.g. moving an object to a fixed position. Perfect or surgical interventions break any other causal influences on the intervened processes, but other types of interventions also occur in practice.

Although random differential equations could, at least in principle, be used for modeling causal relationships between the processes, inferring such causal models from data is often difficult. A significant practical drawback of this modeling class is that obtaining time series data with sufficiently high temporal resolution is often costly, impractical or even impossible. Another issue is that if one only has access to a subset of the system’s processes, for example due to practical limitations on the measurability of some of the processes, then in general there need not exist an RDE on this subset of processes that could be estimated. A similar issue arises when the RDE contains exogenous latent confounding processes.

Structural causal models (SCMs), also known as structural equation models, are another well-studied causal modeling tool and have been widely applied in genetics, economics, engineering and the social sciences (Pearl, 2009; Spirtes et al., 2000; Bollen, 1989). One of the advantages of SCMs over other causal modeling tools is that they can deal with cyclic causal relationships (Spirtes, 1995; Hyttinen et al., 2012; Mooij et al., 2011; Forré and Mooij, 2017; Bongers et al., 2016). In particular, recent work has shown how one can apply Markov properties (Forré and Mooij, 2017), how one can deal with marginalization and how one can causally interpret these models in the cyclic setting (Bongers et al., 2016).

Over the years, several attempts have been made to interpret these structural causal models that include cyclic causal relationships. They can be derived from an underlying discrete-time or continuous-time dynamical system (Fisher, 1970; Iwasaki and Simon, 1994; Dash, 2005; Lacerda et al., 2008; Mooij et al., 2013). All these methods assume that the dynamical system under consideration converges to a single static equilibrium, with the exception of the analysis by Fisher (1970), who assumes that observations are time averages of a dynamical system. These assumptions give rise to a more parsimonious description of the causal relationships of the equilibrium states and ignore the complicated but decaying transient dynamics of the dynamical system. The assumption that the system has to equilibrate to a single static equilibrium is rather strong and limits the applicability of the theory, as many dynamical systems have multiple equilibrium states.

In this paper, we relax this condition and capture, under certain convergence assumptions, every random equilibrium state of the RDE in an SCM. Conversely, we show that under suitable conditions, every solution of the SCM corresponds to a sample-path solution of the RDE. Intuitively, the idea is that in the limit when time tends to infinity the random differential equations converge exactly to the structural equations of the SCM. Moreover, we show that this construction is compatible with interventions under similar convergence assumptions. We would like to stress that our construction automatically captures the stochastic behavior of the RDE in the associated SCM. It can deal with randomness in the initial conditions, in the coefficients and via the random inhomogeneous part (captured as additive noise in the SCM), thereby significantly extending the work by Mooij et al. (2013), who only consider the deterministic setting.

The advantage of SCMs over RDEs is that by not modeling the transient random dynamics of the RDE, one arrives at a more compact representation for learning and prediction purposes of random systems that have reached equilibrium. Another advantage is that one can marginalize over a subset of the system’s variables and get a more parsimonious representation that preserves the causal semantics (Bongers et al., 2016). Yet another advantage is that it is easier to deal with confounders within the framework of SCMs, as we only need to model the equilibrium distribution of these confounders, and do not need to model their dynamics.

The remainder of the paper is organized as follows: In Section 2 we review the necessary theory about stochastic processes in order to describe RDEs. In Section 3 we introduce random dynamical models, that define RDEs together with interventions, and we discuss convergence properties of those models. In Section 4 we introduce structural causal models. In Section 5 we present our main result, which builds the bridge between RDEs and SCMs. In Section 6 we give an example from chemical kinetics and Section 7 contains a discussion including some open problems. Proofs are provided in the Supplementary Material.

2 Preliminaries

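Let $T$ be a set. A stochastic process $X$ is an $\mathbb{R}^n$-valued function $X : T \times \Omega \to \mathbb{R}^n$ such that $X_t$ (which denotes $X(t, \cdot)$) is, for each $t \in T$, an $\mathcal{F}$-measurable function (i.e. a random variable) on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Here we assume the Borel $\sigma$-algebra on $\mathbb{R}^n$, that is, the smallest $\sigma$-algebra on $\mathbb{R}^n$ which contains all open $\epsilon$-balls. We will always assume that there exists some background probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which the random variables and processes live. Furthermore, $T$ will always denote an interval in $\mathbb{R}$ and has the meaning of time, if not stated otherwise. For each $\omega \in \Omega$ we have an $\mathbb{R}^n$-valued function $X(\cdot, \omega)$ from $T$ to $\mathbb{R}^n$, which is called a sample path or realization of $X$.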

Let two such processes and be equivalent, i.e., for all we have , then for all there are sets such that and holds for all . If, moreover, one can choose the sets independently of , that is such that holds for all , then we denote such an equivalence between the processes and by .

If the sample paths of a stochastic process are continuous on for almost all , then the process is called sample-path continuous on . For a sample-path continuous process and the function defined, with probability one, by

is a random variable, if it exists (Doob, 1953; Loéve, 1977). Moreover, a stochastic process is called sample-path differentiable on , if there exists a set such that and that for all the derivative

exists. The mapping is called the sample-path derivative of . Note that there is always a stochastic process such that . Similarly the sample-path integral is defined, that is, a stochastic process is called sample-path integrable on , if the integral222We use here Lebesgue integration, hence we assume the Lebesgue measure on the Lebesgue -algebra of Lebesgue measurable sets of . exists for almost all .

A random vector

can itself be seen as a stochastic process, that is the process defined by . This stochastic process is by definition sample-path continuous. Moreover, if a process is sample-path continuous on and there exists a random variable such that

almost surely, then we say that converges to and we will call the process convergent.

3 Random Differential Equations

Ordinary differential equations, which have the general form

(1)

provide a simple deterministic description of the dynamics of dynamical systems. The solution of an initial value problem consisting of differential equation (1) together with an initial value

(2)

represents the state of such a system at a later time, given that the state (2) was attained at the initial time. The inclusion of random effects in the dynamical system leads to a number of modifications that can be made to the formulation of the initial value problem (1), (2) (Gard, 1988). The first, and simpler, case arises when the initial value is replaced by a random variable. The second case arises when the deterministic function has random coefficients, i.e. it is replaced by a random function that depends on a stochastic process uncoupled with the solution process. As a special case, the deterministic function may be replaced by a random function with a random inhomogeneous part (i.e., additive noise). Of course, a combination of these cases could hold.

The inclusion of random effects in differential equations leads to two distinct classes of equations, for which the random processes have differentiable and non-differentiable sample paths, respectively. If the random processes occurring in a differential equation are sufficiently regular, i.e. have differentiable sample paths, then the majority of problems can be analyzed by use of methods which are analogous to those in the deterministic theory of differential equations; such equations are called random differential equations. The second class occurs when the inhomogeneous part is an irregular stochastic process such as Gaussian white noise. The equations are then written symbolically as stochastic differentials, but are interpreted as integral equations with Itô or Stratonovich stochastic integrals. These differential equations are called stochastic differential equations. In this paper, we will focus on random differential equations.

3.1 Observational random dynamical models

We will define a random differential equation in terms of an observational random dynamical model:

Definition 1.

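An observational random dynamical model (oRDM) is a tuple

$$\mathcal{R} := \langle T, \mathcal{I}, \mathcal{J}, \mathcal{X}, \mathcal{E}, F, E \rangle$$

where

  • $T \subseteq \mathbb{R}$ is a time interval,

  • $\mathcal{I}$ is a finite index set of endogenous processes,

  • $\mathcal{J}$ is a finite index set of exogenous processes,

  • $\mathcal{X} = \prod_{i \in \mathcal{I}} \mathcal{X}_i$ is the product of the codomains of the endogenous processes, where each codomain $\mathcal{X}_i = \mathbb{R}^{c_i}$,

  • $\mathcal{E} = \prod_{j \in \mathcal{J}} \mathcal{E}_j$ is the product of the codomains of the exogenous processes, where each codomain $\mathcal{E}_j = \mathbb{R}^{d_j}$,

  • $F : \mathcal{X} \times \mathcal{E} \to \mathcal{X}$ is a function that specifies the dynamics,

  • $E : T \times \Omega \to \mathcal{E}$ is an exogenous stochastic process.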

The oRDM gives the observational random dynamics of the random dynamical system, without any intervention from outside. The random dynamics are described in terms of random differential equations (Bunke, 1972):

Definition 2.

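A stochastic process $X : T \times \Omega \to \mathcal{X}$ is a sample-path solution of the random differential equations (RDE) associated to an oRDM $\mathcal{R}$ with dynamics function $F$ and exogenous process $E$,

$$\frac{dX}{dt} = F(X, E), \qquad (3)$$

if the ordinary differential equations (ODE)

$$\frac{d}{dt} X_t(\omega) = F(X_t(\omega), E_t(\omega))$$

are satisfied for almost all $\omega \in \Omega$.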

Let and let be a -dimensional random variable, where , such that a.s., then is called a sample-path solution of (3) with respect to the initial condition .

The sample-path solution of (3) with respect to the initial condition is called unique on if for an arbitrary pair , of sample-path solutions with respect to the initial conditions we have .

We associate an ordinary differential equation with each specific sample path. The solutions of these ordinary differential equations are the sample paths of a stochastic process, which is the sample-path solution of the random differential equation.

In particular, an oRDM is a deterministic dynamical model, if the background probability space is . In this setting, the associated RDE is just a single ODE.
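The sample-path view can be illustrated with a minimal Python sketch that integrates one ODE per outcome. Everything specific in it is an illustrative assumption and not taken from the paper: the scalar dynamics `F(x, e) = -(x - e)`, an exogenous process that is constant in time and drawn from a standard normal distribution, the explicit Euler scheme, and all function names.

```python
import random

def dynamics(x, e):
    # Hypothetical dynamics F(x, e) = -(x - e): the state relaxes towards e.
    return -(x - e)

def sample_path(x0, e_path, dt):
    """Explicit Euler integration of dX/dt = F(X, E) for one fixed outcome."""
    xs = [x0]
    for e in e_path:
        xs.append(xs[-1] + dt * dynamics(xs[-1], e))
    return xs

def simulate(n_paths=5, t_end=20.0, dt=0.01, seed=0):
    """Draw one exogenous value and one initial state per outcome and
    integrate the resulting ordinary differential equation."""
    rng = random.Random(seed)
    n_steps = round(t_end / dt)
    results = []
    for _ in range(n_paths):
        e = rng.gauss(0.0, 1.0)       # exogenous variable, constant in time
        x0 = rng.uniform(-2.0, 2.0)   # random initial condition
        results.append((e, sample_path(x0, [e] * n_steps, dt)))
    return results

if __name__ == "__main__":
    for e, path in simulate():
        print(f"e = {e:+.3f}, X_T = {path[-1]:+.3f}")
```

In this toy case every sample path approaches the value of the exogenous variable drawn for that outcome, so the limit is itself a random variable.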

Example 1 (Damped coupled harmonic oscillator).

Consider the well-known damped coupled harmonic oscillator, consisting of a one-dimensional system of point masses () with positions and momenta . They are coupled by springs, with spring constants and equilibrium lengths (), under influence of friction with friction coefficients and with fixed end-points and (see Figure 1).

Figure 1: Damped coupled harmonic oscillator for .

The equations of motion of this system are provided by the ODE:

Suppose that the equilibrium lengths are not constant, but are independent and normally distributed around a mean value with a certain variance. Then this ODE with random coefficients is actually a random differential equation modeled by an oRDM.
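A rough simulation of this example might look as follows. The sketch assumes the textbook equations of motion for a damped mass-spring chain (the rate of change of each position is its momentum divided by its mass, and each momentum changes under the two neighboring spring forces minus friction); the parameter values, the two-mass setup, the end-point positions and all names are hypothetical choices made for illustration.

```python
import random

def oscillator_rhs(q, p, lengths, k, m, b, q_left, q_right):
    """Right-hand side for D masses coupled by D+1 springs (assumed form).

    Spring i connects positions i and i+1, where positions 0 and D+1 are
    the fixed end-points q_left and q_right.
    """
    D = len(q)
    pos = [q_left] + list(q) + [q_right]
    dq = [p[i] / m[i] for i in range(D)]
    dp = []
    for i in range(1, D + 1):
        right = k[i] * (pos[i + 1] - pos[i] - lengths[i])         # spring on the right
        left = k[i - 1] * (pos[i] - pos[i - 1] - lengths[i - 1])  # spring on the left
        friction = b[i - 1] * p[i - 1] / m[i - 1]
        dp.append(right - left - friction)
    return dq, dp

def simulate_oscillator(lengths, D=2, q_right=3.0, t_end=60.0, dt=0.01):
    """Explicit Euler integration with unit masses, springs and friction."""
    k = [1.0] * (D + 1)
    m = [1.0] * D
    b = [1.0] * D
    q = [q_right * (i + 1) / (D + 1) for i in range(D)]  # evenly spaced start
    p = [0.0] * D
    for _ in range(round(t_end / dt)):
        dq, dp = oscillator_rhs(q, p, lengths, k, m, b, 0.0, q_right)
        q = [q[i] + dt * dq[i] for i in range(D)]
        p = [p[i] + dt * dp[i] for i in range(D)]
    return q, p

def draw_lengths(rng, D=2, mean=1.0, std=0.1):
    # Independent, normally distributed equilibrium lengths (random coefficients).
    return [rng.gauss(mean, std) for _ in range(D + 1)]
```

Each call to `draw_lengths` fixes one outcome; `simulate_oscillator` then solves the corresponding deterministic ODE, so repeated draws give a collection of sample paths of the RDE.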

If the oRDM is sufficiently regular, then the majority of problems for such models can be analyzed by use of methods which are analogous to those in the theory of ordinary differential equations (Bunke, 1972; Sobczyk, 1991; Rupp and Neckel, 2013).

Definition 3.

An oRDM is called regular if its dynamics function is continuous and its exogenous process is sample-path continuous.

For ordinary differential equations, the Lipschitz condition is a sufficient condition for the existence and uniqueness of a solution with respect to an initial value. Using results from the theory of ordinary differential equations, one can prove a similar sufficient condition for the random differential equations of regular oRDMs.

Theorem 1.

Consider a regular oRDM . If for almost all there exists a continuous function such that for each and the condition

is satisfied, where means the Euclidean norm in , then for any initial condition there exists a unique sample-path solution of the RDE (3) w.r.t. .

3.2 Intervened random dynamical models

Interventions on an observational random dynamical model can be modeled in different ways. Here we will consider interventions on the endogenous processes. We model an intervention on a subset of the endogenous processes by forcing those processes to be . This can be seen as a “surgical” intervention, since they break the causal influences on the intervened processes (Eberhardt, 2014). The random dynamics of the other processes are still untouched and are described in terms of the RDE associated to those processes, that is333For , we adopt the notation for .

This yields the following random dynamical model including interventions.

Definition 4.

A random dynamical model (RDM) is a tuple

where

  • is a time interval,

  • is a finite index set of endogenous processes,

  • is a finite index set of exogenous processes,

  • is a subset of intervened processes,

  • is the product of the codomains of the endogenous processes, where each codomain ,

  • is the product of the codomains of the exogenous processes, where each codomain ,

  • is a function that specifies the dynamics of the processes,

  • is an intervened stochastic process,

  • is an exogenous stochastic process.

If , then we call also a non-intervened random dynamical model, otherwise we will call it an intervened random dynamical model.

The (intervened) random dynamical model gives the (intervened) random dynamics of the random dynamical system, where the random dynamics are described by the following set of equations:

Definition 5.

A stochastic process is a sample-path solution of the (intervened) random differential equations associated to the (intervened) RDM ,

(4)

if and the ordinary differential equations

are satisfied for almost all .

Let and let be a -dimensional random variable, where , such that a.s., then is called a sample-path solution of the (intervened) RDE (4) with respect to the initial condition .

The sample-path solution of (4) with respect to the initial condition is called unique on if for an arbitrary pair , of sample-path solutions with respect to the initial conditions we have .

In particular, the non-intervened model , where is the terminal process , yields the same sample-path solutions as the observational random dynamical model . They describe the same random dynamics and in this sense the class of observational random dynamical models can be seen as a subclass of the class of random dynamical models.

Definition 6.

We call an RDM linear, if the function is given by

where and are matrices.

The function that defines the dynamics of the RDM encodes a functional structure that can be represented by a directed mixed graph.

Definition 7.

We define the functional graph of an RDM as the directed mixed graph with nodes , directed edges if and only if is a functional parent444Let and and consider a function . We call an a functional parent of w.r.t. , if there does not exist a function such that . of w.r.t.  and bidirected edges if and only if there exists a such that is a functional parent of both and w.r.t. .

For a linear RDM one would draw an edge if is non-zero and if both and are non-zero for some .
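For the linear case, the edge rule just described can be sketched in a few lines of Python. The sketch assumes dynamics of the form `F(x, e) = B x + C e`, with `B[j][i]` the coefficient of endogenous component `i` in component `j` of `F` and `C[j][k]` the coefficient of exogenous component `k`; this index convention, the decision to report self-loops, and the function name are assumptions made for illustration.

```python
def linear_functional_graph(B, C):
    """Edges of the functional graph for dynamics F(x, e) = B x + C e.

    Returns a set of directed edges (i, j), meaning i -> j, and a set of
    bidirected edges (i, j) with i < j.
    """
    n = len(B)
    n_exo = len(C[0]) if C else 0
    # i -> j whenever component i enters component j of the dynamics.
    directed = {(i, j) for j in range(n) for i in range(n) if B[j][i] != 0}
    # i <-> j whenever some exogenous component enters both i and j.
    bidirected = {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if any(C[i][k] != 0 and C[j][k] != 0 for k in range(n_exo))
    }
    return directed, bidirected
```

With this convention a non-zero diagonal entry of `B` shows up as a self-loop `(i, i)`; callers who want a graph without self-loops can filter those pairs out.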

The causal semantics of a random dynamical model can be modeled using interventions:

Definition 8.

Given an RDM , a subset and a stochastic process , the intervention maps to the intervened RDM .

Note that interventions on disjoint subsets of the endogenous processes commute.

Example 2 (Damped coupled harmonic oscillator).

Consider the damped coupled harmonic oscillator of Example 1. Its functional graph is depicted in Figure 2. We can perform an intervention on by moving the position of the mass to a fixed position . This is modeled by replacing the equation of motion of the position by the process , that defines the motion of moving the mass to the fixed position . Performing a similar intervention on the momentum usually does not lead to an RDM with sample-path solutions that converge to a certain random variable.

Figure 2: Functional graph of the RDM for the damped coupled harmonic oscillator of Example 1 for .

We can define a regularity condition for the RDM similar to the one for oRDMs.

Definition 9.

An RDM is called regular if its dynamics function is continuous and both its intervened process and its exogenous process are sample-path continuous.

The existence and uniqueness Theorem 1 generalizes to the RDM .

Corollary 1.

Consider a regular RDM . If for almost all there exists a continuous function such that for each and the condition

is satisfied, then for any initial condition such that a.s. there exists a unique sample-path solution of the RDE (4) w.r.t. .

Every linear RDM for which and are sample-path continuous is regular. Moreover, there always exists an , which is independent of , such that the condition of Corollary 1 holds, hence for every regular linear RDM there exists a unique sample-path solution for any initial condition (Bunke, 1972).

3.3 Steady random dynamical models

Here we consider an important subclass of regular RDMs that satisfies certain convergence properties.

Definition 10.

We call an RDM steady, if is regular, , the process converges to a random variable and the process converges to a random variable .

The class of steady RDMs is not stable under arbitrary interventions; that is, a steady RDM does not have to stay steady under an intervention. However, it is stable under the following class of interventions:

Definition 11.

We call an intervention a perfect intervention if the process converges to a random variable .

Note that for any perfect intervention , the process is sample-path continuous by definition.

Although steadiness of an RDM guarantees that the exogenous and intervened processes converge, it does not, in general, guarantee that any of its sample-path solutions converges. However:

Definition 12.

Given a steady RDM . If a sample-path solution converges to a random variable , then we say that the sample-path solution equilibrates and we call an equilibrium variable of the sample-path solution .

If a sample-path solution , that describes the behaviour of the system, equilibrates, then in particular we have

almost surely.

4 Structural Causal Models

Structural causal models (SCMs), also known as structural equation models, provide a probabilistic description of the causal semantics of a system. They are widely used for causal modeling purposes (Pearl, 2009; Spirtes et al., 2000; Bollen, 1989). In this paper, we will follow the terminology of Bongers et al. (2016).

Definition 13.

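A structural causal model (SCM) is a tuple

$$\mathcal{M} := \langle \mathcal{I}, \mathcal{J}, \mathcal{X}, \mathcal{E}, f, E \rangle$$

where

  • $\mathcal{I}$ is a finite index set of endogenous variables,

  • $\mathcal{J}$ is a finite index set of exogenous variables,

  • $\mathcal{X} = \prod_{i \in \mathcal{I}} \mathcal{X}_i$ is the product of the codomains of the endogenous variables, where each codomain $\mathcal{X}_i = \mathbb{R}^{c_i}$,

  • $\mathcal{E} = \prod_{j \in \mathcal{J}} \mathcal{E}_j$ is the product of the codomains of the exogenous variables, where each codomain $\mathcal{E}_j = \mathbb{R}^{d_j}$,

  • $f : \mathcal{X} \times \mathcal{E} \to \mathcal{X}$ is a measurable function that specifies the causal mechanism,

  • $E : \Omega \to \mathcal{E}$ is a random variable. (Here we slightly deviate from Bongers et al. (2016), who instead take an exogenous probability measure on $\mathcal{E}$.)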

The solutions are described in terms of structural equations.

Definition 14.

A random variable $X : \Omega \to \mathcal{X}$ is a solution of the SCM $\mathcal{M}$ if the structural equations

$$X = f(X, E)$$

are satisfied almost surely.

The causal mechanism encodes a functional structure that can be represented by a directed mixed graph.

Definition 15.

We define the functional graph of an SCM as the directed mixed graph with nodes , directed edges if and only if is a functional parent of w.r.t.  and bidirected edges if and only if there exists a such that is a functional parent of both and w.r.t. .

4.1 Intervened structural causal models

The causal semantics of a structural causal model can be modeled using perfect interventions (Pearl, 2009).

Definition 16.

Given an SCM , a subset and an endogenous variable , the intervention maps to the intervened model where the intervened causal mechanism is defined by:

We call an intervention a perfect intervention if .

Note that interventions on disjoint subsets of endogenous variables commute.
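In the spirit of Definition 16, an intervention that replaces the causal mechanism of selected variables by a fixed value can be sketched as below. The dictionary representation, the two-variable cyclic linear SCM, the solver based on fixed-point iteration and all names are illustrative assumptions; the iteration only converges because this particular mechanism is a contraction.

```python
def intervene(mechanisms, targets):
    """Return a new mechanism dict in which each targeted variable is
    replaced by a constant function; other mechanisms are left untouched."""
    new = dict(mechanisms)
    for name, value in targets.items():
        new[name] = lambda x, e, value=value: value
    return new

def solve_by_iteration(mechanisms, e, n_iter=200):
    """Fixed-point iteration x <- f(x, e); converges only for contractive f."""
    x = {name: 0.0 for name in mechanisms}
    for _ in range(n_iter):
        x = {name: f(x, e) for name, f in mechanisms.items()}
    return x

# Hypothetical cyclic linear SCM: X1 = 0.5*X2 + E1, X2 = 0.5*X1 + E2.
scm = {
    "X1": lambda x, e: 0.5 * x["X2"] + e["E1"],
    "X2": lambda x, e: 0.5 * x["X1"] + e["E2"],
}

if __name__ == "__main__":
    noise = {"E1": 1.0, "E2": 0.0}
    print(solve_by_iteration(scm, noise))
    print(solve_by_iteration(intervene(scm, {"X1": 2.0}), noise))
```

Because `intervene` copies the dictionary and overwrites only the targeted entries, applying it to disjoint sets of variables in either order gives the same mechanisms.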

5 From Steady RDMs to SCMs

We have now set the stage for constructing an SCM from an RDM under some convergence properties. Here, we will consider steady RDMs, as discussed in Section 3.3, for which the exogenous and intervened processes are well-behaved as time tends to infinity. For this class of RDMs we will see that the random differential equations that determine the sample-path solutions of the RDM play a role analogous to that of the structural equations that determine the solutions of the SCM.

Definition 17.

Given a steady RDM . Define the SCM associated to to be where the associated causal mechanism is defined by

with

and

Note that the steadiness of implies the measurability of . This leads to our first main result:

Theorem 2.

Given a steady RDM . If there exists a sample-path solution of that equilibrates to , then is a solution of the associated SCM .
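The statement can be checked numerically on a toy model. The sketch below assumes, as an illustration only, that for non-intervened components the associated causal mechanism takes the form `f(x, e) = x + F(x, e)`, so that fixed points of `f` are exactly the zeros of the dynamics `F`; the linear two-dimensional dynamics, the matrices `B` and `C`, the Euler integrator and all names are hypothetical.

```python
def F(x, e, B, C):
    """Linear dynamics F(x, e) = B x + C e for a two-dimensional state."""
    return [
        sum(B[i][j] * x[j] for j in range(2)) + sum(C[i][j] * e[j] for j in range(2))
        for i in range(2)
    ]

def f(x, e, B, C):
    """Assumed associated mechanism f(x, e) = x + F(x, e)."""
    Fx = F(x, e, B, C)
    return [x[i] + Fx[i] for i in range(2)]

def integrate(x0, e, B, C, t_end=40.0, dt=0.01):
    """Explicit Euler integration of dX/dt = F(X, e) with e constant in time."""
    x = list(x0)
    for _ in range(round(t_end / dt)):
        Fx = F(x, e, B, C)
        x = [x[i] + dt * Fx[i] for i in range(2)]
    return x

B = [[-1.0, 0.5], [0.5, -1.0]]   # stable: eigenvalues -0.5 and -1.5
C = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    e = [1.0, 0.0]
    x_T = integrate([0.0, 0.0], e, B, C)
    residual = [fi - xi for fi, xi in zip(f(x_T, e, B, C), x_T)]
    print("state at final time:", x_T)
    print("residual f(x, e) - x:", residual)
```

Once the sample path has settled, the residual `f(x, e) - x` is numerically zero, which is the structural-equation condition evaluated at the equilibrium.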

The converse does not hold in general; however, we have the following sufficient condition:

Proposition 1.

Consider a steady RDM such that (i.e., is constant in time). If is a solution for the associated SCM , then there exists a sample-path solution of that equilibrates to .

We can weaken the condition that has to be constant over time by imposing the following additional assumption on the model.

Proposition 2.

Consider a steady RDM for which (i) there exists an and a such that for all and (ii) for almost all there exists a continuous function such that for each and the condition

is satisfied. If is a solution for the associated SCM , then there exists a unique sample-path solution of that equilibrates to .

Consider the diagram in Figure 3. So far, we have defined each mapping in this diagram separately (see Definitions 8, 16 and 17). The next result shows that this diagram commutes:

Theorem 3.

Given a steady RDM , a subset and a process such that equilibrates to and equilibrates to . Then:

In other words, perfect intervention commutes with the mapping from steady RDM to SCM.

Figure 3: Commutative diagram relating an RDM, its associated SCM, the intervened RDM and the intervened SCM. This diagram shows that perfect intervention commutes with the mapping from steady RDM to SCM, as is made explicit in Theorem 3.

Example 3.

Consider a linear RDM where is of the form as in Definition 6 and is a random variable, that is a stochastic process that is constant in time. Then the associated SCM is where the causal mechanism is defined by

where .

Example 4 (Damped coupled harmonic oscillator).

Consider again the damped coupled harmonic oscillator of Example 1. The structural equations of the associated SCM are given by

These equations describe the equilibria of the positions and momenta. Figure 2 reflects the intuition that at equilibrium the position of each mass has a direct causal effect on the position of its neighbors. This can be seen more clearly by marginalizing over the momentum variables. Observing that the momentum variables always vanish at equilibrium, we can focus on the position variables as the variables of interest. We can marginalize over the momentum variables by solving each equation of w.r.t. itself and then substituting these in the equations of (Bongers et al., 2016). This yields the marginal model with the following structural equations

Resolving the self-loops of this marginal model by solving each equation w.r.t. itself gives the structural equations

and this model yields the same causal semantics for the position variables as the original model (Bongers et al., 2016). The functional graph associated to this model is depicted in Figure 4(a). If we now perform a perfect intervention on by moving the mass to a fixed position , then we get the graph as depicted in Figure 4(b). Because these models are uniquely solvable and linear, we can perform -separation w.r.t. both graphs and conclude that holds in the intervened model but not in the observational model (Forré and Mooij, 2017).

Figure 4: Functional graph of the marginal SCM associated to the damped coupled harmonic oscillator of Example 1 for after resolving the self-loops, under (a) no intervention and (b) perfect intervention on .

This example demonstrates that the equilibrium variables of the RDM can be studied by statistical tools applicable to SCMs. This sheds some new light on the concept of causality as expressed within the framework of structural causal models.
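The equilibrium equations of this example can be derived by hand under one explicit assumption: that the equations of motion have the standard form for a damped mass-spring chain. The symbols $m_i$ (masses), $k_i$ (spring constants), $\ell_i$ (equilibrium lengths) and $b_i$ (friction coefficients) are names chosen here for the quantities listed in Example 1, with $Q_0$ and $Q_{D+1}$ denoting the fixed end-points.

```latex
% Assumed equations of motion for i = 1, ..., D:
\begin{align*}
\frac{dQ_i}{dt} &= \frac{P_i}{m_i}, \\
\frac{dP_i}{dt} &= k_i (Q_{i+1} - Q_i - \ell_i)
                 - k_{i-1} (Q_i - Q_{i-1} - \ell_{i-1})
                 - \frac{b_i}{m_i} P_i .
\end{align*}
% At equilibrium both derivatives vanish, so P_i = 0 and
\begin{align*}
0 &= k_i (Q_{i+1} - Q_i - \ell_i) - k_{i-1} (Q_i - Q_{i-1} - \ell_{i-1}), \\
Q_i &= \frac{k_i (Q_{i+1} - \ell_i) + k_{i-1} (Q_{i-1} + \ell_{i-1})}{k_i + k_{i-1}} .
\end{align*}
```

Under this assumption, the last line expresses each equilibrium position in terms of the positions of its two neighbors only, which matches the intuition that at equilibrium each mass is directly affected by the masses next to it.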

6 Application: Chemical Kinetics

Figure 5: Simulation of the RDE associated to the basic enzyme reaction with random initial conditions under different interventions: (a) observational, (b) perfect intervention on $C$, (c) perfect intervention on $C$ and $S$.

Figure 6: Basic enzyme reaction

Chemical kinetics is the study of rates of chemical processes. The chemical processes are described by the chemical reactions which are often modeled through ordinary differential equations. A well-known chemical reaction is the basic enzyme reaction which is schematically represented in Figure 6 (Murray, 2002).

Figure 7: The functional graph of the SCM associated to the basic enzyme reaction under (a) perfect intervention on $C$ and (b) perfect intervention on $C$ and $S$.

It describes an enzyme $E$, binding to a substrate $S$, to form a complex $C$, which in turn releases a product $P$ while regenerating the original enzyme. The $k$’s, called the rate constants, quantify the rate of a chemical reaction. These chemical reactions satisfy the law of mass action, which states that the rate of a reaction is proportional to the product of the concentrations of the reactants. Applying this to the concentration processes $S$, $E$, $C$ and $P$ of the basic enzyme reaction, gives the RDE:

Although this RDE has no random coefficients (or random inhomogeneous part), randomness can enter the RDE via the initial conditions. In Figure 5(a) we simulated the RDE with fixed rate constants and random initial conditions. The randomness of the initial conditions evolves over time and is captured in the associated SCM at equilibrium. That is, the equilibrium states are described by the associated SCM:

where we removed the self-loops for convenience. This is an example of an SCM that is not uniquely solvable, which is illustrated in Figure 5(a) by the dispersion of the concentrations at large times. Hence this example cannot be treated with the theory of Mooij et al. (2013), which assumes no dependence on initial conditions.
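The dependence of the equilibrium on the initial conditions can be reproduced with a small simulation. The sketch below assumes the usual mass-action equations for the scheme in which enzyme and substrate bind reversibly to a complex that irreversibly releases product and enzyme; the rate-constant names `kf`, `kr`, `kc`, their values, the distribution of the initial concentrations and the Euler integrator are hypothetical choices, not the settings used for Figure 5.

```python
import random

def enzyme_rhs(s, e, c, p, kf, kr, kc):
    """Mass-action rates for E + S <-> C -> P + E (assumed parametrization)."""
    bind = kf * e * s                # rate of complex formation
    ds = -bind + kr * c
    de = -bind + (kr + kc) * c
    dc = bind - (kr + kc) * c
    dp = kc * c
    return ds, de, dc, dp

def simulate_enzyme(init, kf=1.0, kr=1.0, kc=1.0, t_end=60.0, dt=0.01):
    """Explicit Euler integration from init = (s0, e0, c0, p0)."""
    s, e, c, p = init
    for _ in range(round(t_end / dt)):
        ds, de, dc, dp = enzyme_rhs(s, e, c, p, kf, kr, kc)
        s, e, c, p = s + dt * ds, e + dt * de, c + dt * dc, p + dt * dp
    return s, e, c, p

def random_initial_condition(rng):
    # Hypothetical distribution of the initial concentrations (all positive).
    return tuple(rng.uniform(0.5, 1.5) for _ in range(4))

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        init = random_initial_condition(rng)
        final = simulate_enzyme(init)
        print([round(v, 4) for v in init], "->", [round(v, 4) for v in final])
```

In this sketch the substrate and complex concentrations decay to zero for every draw, while the final enzyme and product concentrations differ from draw to draw because the totals `e + c` and `s + c + p` are conserved along each sample path.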

Let us for the moment fix the concentration of the complex by performing a perfect intervention on $C$, as illustrated in Figure 5(b). From the functional graph of the associated intervened SCM in Figure 7(a) we can read off that performing another perfect intervention on the substrate $S$ should have no effect on the product $P$, as it would lead to the functional graph in Figure 7(b) where there is no directed path from $S$ to $P$. This prediction, based on the functional graph of the SCM associated to the RDM, is indeed verified by the simulations in Figures 5(b) and 5(c). Intuitively, this is also what one would expect, since the complex is the only element in the system that is capable of releasing the product.

This illustrates that random differential equations are capable of modeling randomness through the initial conditions, while the causal semantics at equilibrium of the dynamical system are parsimoniously described by the associated SCM.

7 Discussion

In this paper we built a bridge between the world of random differential equations and the world of structural causal models. This allows us to study a plethora of physical and engineering systems subject to time-varying random disturbances within the framework of structural causal models. We naturally extend the work of Mooij et al. (2013) to the stochastic setting, which allows us to address both cycles and confounders. In particular, we relaxed the condition that the dynamical system has to equilibrate to a single static equilibrium, and showed that if an RDE is sufficiently regular, all equilibrium sample-path solutions of the RDE are described by the solutions of the associated SCM, while preserving the causal semantics.

There are two interesting directions for future research. The first is relaxing the regularity assumption. Earlier work has shown that SCMs can be derived from stochastic differential equations (Hansen and Sokol, 2014); however, that work is restricted to the acyclic case. The second is relaxing the convergence assumption. Although the convergence assumption is a convenient and simplifying assumption, convergence of the stochastic processes is not always satisfied in practice. Recent work has shown that dynamic asymptotic behaviour of ordinary differential equations can be captured by dynamic structural causal models (Rubenstein et al., 2016). Other related work on discrete-time dynamical systems and causality that does not require a single static equilibrium assumption is that of Voortman et al. (2010).

Acknowledgements

Stephan Bongers and Joris Mooij were supported by NWO, the Netherlands Organization for Scientific Research (VIDI grant 639.072.410).

References

  • Bollen (1989) Bollen, K. (1989). Structural Equations with Latent Variables. John Wiley & Sons, New York, USA.
  • Bongers et al. (2016) Bongers, S., Peters, J., Schölkopf, B., and Mooij, J. M. (2016). Structural causal models: Cycles, marginalizations, exogenous reparametrizations and reductions. arXiv.org preprint, arXiv:1611.06221 [stat.ME].
  • Bunke (1972) Bunke, H. (1972). Gewöhnliche Differentialgleichungen mit zufälligen Parametern. Akademie, Berlin, DE.
  • Dash (2005) Dash, D. (2005). Restructuring dynamic causal systems in equilibrium. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005).
  • Doob (1953) Doob, J. (1953). Stochastic Processes. Wiley, New York, USA.
  • Eberhardt (2014) Eberhardt, F. (2014). Direct causes and the trouble with soft interventions. Erkenntnis, 79(4):755–777.
  • Fisher (1970) Fisher, F. M. (1970). A correspondence principle for simultaneous equation models. Econometrica, 38(1):73–92.
  • Forré and Mooij (2017) Forré, P. and Mooij, J. M. (2017). Markov properties for graphical models with cycles and latent variables. arXiv.org preprint, arXiv:1710.08775 [math.ST].
  • Gard (1988) Gard, T. (1988). Introduction to Stochastic Differential Equations, volume 114 of Monographs and textbooks in pure and applied mathematics. Marcel Dekker, New York, USA.
  • Hansen and Sokol (2014) Hansen, N. and Sokol, A. (2014). Causal interpretation of stochastic differential equations. Electron. J. Probab., 19.
  • Hyttinen et al. (2012) Hyttinen, A., Eberhardt, F., and Hoyer, P. O. (2012). Learning linear cyclic causal models with latent variables. The Journal of Machine Learning Research, 13(1):3387–3439.
  • Iwasaki and Simon (1994) Iwasaki, Y. and Simon, H. A. (1994). Causality and model abstraction. Artificial Intelligence, 67(1):143–194.
  • Lacerda et al. (2008) Lacerda, G., Spirtes, P. L., Ramsey, J., and Hoyer, P. O. (2008). Discovering cyclic causal models by independent components analysis. In Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI 2008).
  • Loéve (1977) Loéve, M. (1977). Probability Theory II, volume 46 of Graduate Texts in Mathematics. Springer, New York, USA, 4th edition.
  • Mooij et al. (2011) Mooij, J. M., Janzing, D., Heskes, T., and Schölkopf, B. (2011). On causal discovery with cyclic additive noise models. In Advances in Neural Information Processing Systems (NIPS 2011), pages 639–647.
  • Mooij et al. (2013) Mooij, J. M., Janzing, D., and Schölkopf, B. (2013). From ordinary differential equations to structural causal models: the deterministic case. In Nicholson, A. and Smyth, P., editors, Proceedings of the 29th Annual Conference on Uncertainty in Artificial Intelligence (UAI-13), pages 440–448. AUAI Press.
  • Murray (2002) Murray, J. (2002). Mathematical Biology. Springer, New York, USA, 3rd edition.
  • Pearl (2009) Pearl, J. (2009). Causality: Models, Reasoning, and Inference. Cambridge University Press, New York, USA, 2nd edition.
  • Rubenstein et al. (2016) Rubenstein, P. K., Bongers, S., Mooij, J. M., and Schölkopf, B. (2016). From deterministic ODEs to dynamic structural causal models. arXiv.org preprint, arXiv:1608.08028 [cs.AI].
  • Rupp and Neckel (2013) Rupp, F. and Neckel, T. (2013). Random Differential Equations in Scientific Computing. Versita, London, GB.
  • Sobczyk (1991) Sobczyk, K. (1991). Stochastic Differential Equations, volume 40 of Mathematics and its Applications. Springer.
  • Soong (1973) Soong, T. (1973). Random Differential Equations in Science and Engineering, volume 103 of Mathematics in Science and Engineering. Academic Press, New York, USA.
  • Spirtes (1995) Spirtes, P. (1995). Directed cyclic graphical representations of feedback models. In Proceedings of the Eleventh conference on Uncertainty in Artificial Intelligence (UAI 1995), pages 491–498.
  • Spirtes et al. (2000) Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search. The MIT Press, Cambridge, Massachusetts, 2nd edition.
  • Voortman et al. (2010) Voortman, M., Dash, D., and Druzdzel, M. J. (2010). Learning why things change: the difference-based causality learner. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI 2010).

Supplementary Material

Proofs

Proof of Theorem 1

Proof.

Continuity of and sample-path continuity of imply that for almost all the function is continuous on . Moreover, continuity of implies separate continuity of . That is, for each the function is continuous in and in particular measurable in . Hence for all the function is -measurable. Applying Theorem 1.2 in Bunke (1972) proves the result. ∎

Proof of Corollary 1

Proof.

Apply Theorem 1 to the regular oRDM . ∎

Proof of Theorem 2

Proof.

Let be a sample-path solution that equilibrates to . Then

which gives

where we used continuity of in the second equality, and steadiness in the last equality. This gives

and hence

Proof of Proposition 1

Proof.

Let be a solution of . Then the stochastic process defined by is a sample-path solution of that equilibrates to . ∎

Proof of Proposition 2

Proof.

Let be a solution of . Then by Corollary 1 there exists a unique sample-path solution w.r.t. the initial condition . Hence is the unique sample-path solution that equilibrates to . ∎

Proof of Theorem 3

Proof.

Applying the perfect intervention (by Definition 16) to yields the SCM where is defined by

Applying Definition 17 to the RDM yields the same SCM. ∎
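The commutation expressed by Theorem 3 can be checked numerically on a toy system. The sketch below is illustrative only and rests on assumptions not taken from the paper: a two-variable linear system with hypothetical coupling coefficient `A`, time-constant exogenous draws, a forward-Euler integrator, and a perfect intervention modelled as holding Y at the value `y_star` for all time. It compares intervening on the dynamics and then equilibrating with intervening directly on the equilibrium structural equations.

```python
import random

A = 0.5  # hypothetical effect of Y on X; Y's own dynamics are overridden below


def equilibrate_intervened_rde(e1, x0, y_star, dt=0.01, steps=5000):
    """Intervene on the dynamics first: Y is held at y_star for all time,
    then dX/dt = -X + A*Y + e1 is integrated with forward Euler."""
    x = x0
    for _ in range(steps):
        x += dt * (-x + A * y_star + e1)
    return x, y_star


def intervened_scm_solution(e1, y_star):
    """Intervene on the equilibrium model instead: replace Y's structural
    equation by Y = y_star and solve X = A*Y + e1."""
    return A * y_star + e1, y_star


def commutation_gap(n_samples=100, seed=1, y_star=2.0):
    """Largest difference in X between the two routes over random draws."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_samples):
        e1, x0 = rng.gauss(0, 1), rng.gauss(0, 1)
        x_dyn, _ = equilibrate_intervened_rde(e1, x0, y_star)
        x_scm, _ = intervened_scm_solution(e1, y_star)
        worst = max(worst, abs(x_dyn - x_scm))
    return worst


if __name__ == "__main__":
    print("largest gap between the two routes:", commutation_gap())
```

In this toy case both routes agree up to numerical error, which is the behaviour the theorem describes in general under its regularity assumptions.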