Learning soft interventions in complex equilibrium systems

12/10/2021
by   Michel Besserve, et al.

Complex systems often contain feedback loops that can be described as cyclic causal models. Intervening in such systems may lead to counter-intuitive effects, which cannot be inferred directly from the graph structure. After establishing a framework for differentiable interventions based on Lie groups, we take advantage of modern automatic differentiation techniques and their application to implicit functions in order to optimize interventions in cyclic causal models. We illustrate the use of this framework by investigating scenarios of transition to sustainable economies.


1 Introduction

Designing optimal interventions in complex systems, composed of many interacting parts, is a key objective in multiple fields. In the context of socio-economic systems, the design of public policies to improve economic and social welfare is a major source of scientific and political debate. However, environmental questions need to be traded off against the immediate needs of human activities, as their detrimental long-term impacts may considerably affect societies (Dearing et al., 2014; Sherwood and Huber, 2010). Interestingly, a priori intuitive interventions in such systems may lead to paradoxical outcomes. The rebound effect in energy economy, first reported by Jevons (1866), is paradigmatic: while the energy efficiency of devices may considerably increase due to technological improvements, this may trigger an overall increase of energy consumption due to increases in demand (Brockway et al., 2021). This suggests in particular that efficiency alone may not be the best way to foster a transition towards sustainability, and calls for a quantitative study of optimal interventions in such complex systems (Arrobbio and Padovan, 2018). As argued for the case of rebound effects (Wallenborn, 2018), such unexpected behaviors may reflect balanced causal relationships designed by evolution (Andersen, 2013), and feedback loops (Blom and Mooij, 2021) that maintain a system at an optimal “equilibrium” operating point independent of external perturbations, challenging the classical causal inference assumptions of faithfulness and acyclicity.

While interventions have been extensively investigated theoretically in the field of causality (Pearl, 2000; Imbens and Rubin, 2015), the case of systems incorporating feedback loops remains particularly challenging, and has therefore led to only limited applications to real-life complex systems. Many complex systems can be approximated as operating at an equilibrium point, which can be described by cyclic structural causal models (Bongers et al., 2016). Such models satisfy a self-consistent set of equations that, under unique solvability assumptions, fully identifies the operating point. While the effects of interventions in such models are typically not directly readable from the causal graph, the latter can be turned into a causal ordering graph, in which the effects of interventions that do not change the causal structure, referred to as soft interventions, are reflected in the directed paths (Blom et al., 2020). For practical and ethical reasons, soft interventions also arguably provide a more realistic account of changes that can be performed in real-life systems. While a restricted set of qualitative results exists for such interventions, their quantitative assessment and design in complex systems is made difficult by the analytical intractability of the self-consistency relations.

In this paper, we propose a framework for a general class of differentiable parametric soft interventions based on Lie groups and leverage recent technical and algorithmic developments allowing learning implicit functional relationships (Bai et al., 2019) to optimize such interventions. After defining Lie interventions and assaying their theoretical properties, we provide a computational framework to optimize them. We illustrate its application to economic models derived from real data, offering a novel approach to computational sustainability. Proofs are provided in Appendix B.

Related work.

Various types of economic equilibrium models (EEM) have been used to investigate macroeconomic effects of specific interventions (Wiebe et al., 2018; Wood et al., 2018). In contrast to such work, we develop a general optimization framework that allows the optimal design of interventions to achieve specific goals. A restricted set of EEMs have been investigated more extensively from an optimization perspective (see, e.g., Esteban-Bravo 2004); however, these are restricted to rather specific assumptions and constraints that allow optimization to be addressed with linear programming approaches. Instead, we optimize systems with arbitrary non-linearities, relying on automated differentiation and backpropagation algorithms, and leverage the causal structure of the models. In the field of causality, several studies investigate the relationship between the equilibrium of dynamical systems and structural causal models (SCM) (Mooij et al., 2013; Peters et al., 2020) and how the causal structure can be learnt from data. In contrast, we focus on the design of soft interventions in a system for which the SCM at equilibrium is known. While most work in the field of causality has focused on hard interventions, specificities of soft interventions have started to be investigated theoretically in general (possibly cyclic) structural causal models (Correa and Bareinboim, 2020; Blom et al., 2020). Moreover, recent work has investigated theoretically and empirically the design of extrapolations in generative feedforward (thus acyclic) neural networks using a soft intervention perspective (Besserve et al., 2021). The present work is, to the best of our knowledge, the first to investigate theoretically and algorithmically the design of such interventions in cyclic causal models. The algorithmic approach relies on modeling economic equilibrium with deep equilibrium models (Bai et al., 2019). This approach belongs to the category of implicit deep learning (El Ghaoui et al., 2021), which has been used in a variety of applications such as model predictive control (Amos et al., 2018) and multi-agent trajectory modelling (Geiger and Straehle, 2020).

2 Background

2.1 Transition to a sustainable economy

In the face of climate change and more generally of the increasing severity of environmental impacts of human activities, our societies face challenges to transition to more sustainable economies. An overarching difficulty is the complexity of the systems that need to be intervened on, which comprise tightly intertwined components, ranging from economic agents to a broad variety of ecosystems (Haberl et al., 2019).

A classical way to represent the economy and its impacts is through input-output (IO) multi-sector economic equilibrium models (Stadler et al., 2018), in which economic activities are divided into interdependent sectors and described by a positive $n$-dimensional output vector $\boldsymbol{x}$ (see Appendix A). We take as a guiding example the demand-driven model introduced by Leontief (1951), which is the basis of Input-Output analysis that can be used for environmental impact assessment. In this model, the sectors’ outputs at economic equilibrium are set as a function of the vector $\boldsymbol{y}$ gathering final demand for each product (consumed by users instead of being used to make another product). Satisfying the demand of all sectors implies the self-consistent equation

$$\boldsymbol{x} = A\boldsymbol{x} + \boldsymbol{y}, \tag{1}$$

where $A$ is the so-called technical coefficients matrix, with $A_{ij}$ the amount of product $i$ used as input to produce one unit of product $j$. (In this simplified model, each sector is in charge of the production of a single homogeneous product.)

An example of a technical coefficient matrix estimated from economic data is provided in Fig. 1(a). While such equilibria can be thought of as the asymptotic value of $\boldsymbol{x}$ in a dynamic model (see Appendix A), we focus our analysis on the equilibrium equations without consideration for the dynamics that give rise to them. In turn, the socio-economic impacts (e.g., employment) and environmental stressors (e.g., GHG emissions, water use, …) of each sector’s activity are gathered in a vector of impacts $\boldsymbol{r}$ such that

$$\boldsymbol{r} = S\boldsymbol{x}, \tag{2}$$

where $S$ is a footprint intensity matrix such that $S_{kj}$ is the amount of impact of type $k$ generated by one unit of output of sector $j$.
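To make the self-consistent equation concrete, the following minimal sketch computes the equilibrium of a demand-driven IO model for a two-sector economy, writing `x` for sector outputs, `A` for the technical coefficient matrix, `y` for final demand and `S` for the footprint intensity matrix. All numerical values are hypothetical, and plain fixed-point iteration is used here only for illustration; it converges when the spectral radius of `A` is below one.

```python
def leontief_equilibrium(A, y, tol=1e-12, max_iter=10_000):
    """Solve x = A x + y by fixed-point iteration, starting from x = y."""
    n = len(y)
    x = list(y)
    for _ in range(max_iter):
        x_new = [sum(A[i][j] * x[j] for j in range(n)) + y[i] for i in range(n)]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def impacts(S, x):
    """Impact vector r = S x (one row of S per impact type)."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in S]


# Hypothetical two-sector economy (illustrative numbers only).
A = [[0.1, 0.3],
     [0.2, 0.1]]      # A[i][j]: input of product i per unit of product j
y = [10.0, 20.0]      # final demand
S = [[0.5, 0.1]]      # a single impact type, e.g. emissions per unit output

x = leontief_equilibrium(A, y)
r = impacts(S, x)
print("equilibrium outputs:", x)
print("impacts:", r)
```

For this toy system the iteration agrees with the closed-form solution obtained by inverting the 2×2 matrix I − A by hand.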

In the long run, impacts of activities on the environment and planetary resources are likely to trigger feedback loops in various forms: shortage of renewable and non-renewable resources, drop in agricultural yields and major environmental migration, to name a few (see, e.g., Dearing et al. 2014; Sherwood and Huber 2010). To mitigate them, a reorganization of the global economy is required, which, for instance, may consist in intervening on the interactions between sectors reflected in the matrix $A$. However, choosing such interventions faces three challenges.

Figure 1: (a) Top left: technical coefficient matrix between 200 sectors and 49 world regions for 2011 (source: Exiobase 3, Stadler et al. 2018). Top right: magnification of the top left corner of the previous matrix. Blocks on the diagonal reflect the stronger dependency between domains within a region/country. Bottom right: putative example of cyclic dependency between different sectors. (b) Illustration of the causal graph for price rebound in a two sector economy. (c) Principle of deep equilibrium models.

Challenge 1: social acceptability.

Reducing the activity in a sector may lead to both positive environmental effects (by reducing the amount of stressors) and negative socio-economic impacts (such as reducing economic growth and employment). Political decision makers thus need to trade off environmental goals with the social acceptability of the chosen policies.

Challenge 2: recurrence between sectors.

The activities of different sectors are highly intertwined by their reciprocal demands, as illustrated by the graphical model at the bottom right of Fig. 1(a): electricity production through renewable energy requires wind turbines, which require metals, while the metal industry itself requires electricity to extract metals from ores and transform them. Such cycles make it challenging to anticipate the positive and negative system-wide consequences of interventions to reduce the activity of sectors with strong environmental footprints.

Challenge 3: rebound effects.

The complexity of the economic system also manifests itself through balancing mechanisms that reflect the utility maximization behavior of economic agents, such as the rebound effects mentioned in the introduction. Consider the model of eq. (1), which can be written as a function of final demand

$$\boldsymbol{x} = (I - A)^{-1}\boldsymbol{y}. \tag{3}$$

In practice, final demand is influenced by the price of each good and often modeled by a static demand curve $D_k$ for good $k$ such that $y_k = D_k(p_k)$. We thus model a final demand rebound through prices in the context of the Leontief model as follows. While an increase in the energy efficiency of the production of a particular good $k$ can be modeled as a decrease of $A_{Ek}$, where $E$ indicates the energy sector, this modification also affects the unit price $p_k$ if energy costs constitute a significant part of it. We define the price vector $\boldsymbol{p}$ of goods such that it is proportional to the energy required in all sectors to produce one unit of this good. In a simple linear model, these prices are thus also determined by the technical coefficient matrix through a self-consistent equation

$$\boldsymbol{p} = A^\top \boldsymbol{p} + \boldsymbol{\delta}_E,$$

where $\boldsymbol{\delta}_E$ is a canonical basis vector which takes value 1 for the energy sector and value 0 for all other sectors. For illustrative purposes, the overall causal model is shown in Fig. 1(b) in the case of a two-sector economy, with sector 1 being the energy sector. The price-based rebound mechanism then operates as follows: a decrease of $A_{Ek}$ will decrease the energy demand of sector $k$, but will also decrease the unit price of goods for sector $k$ (and downstream sectors consuming its goods), and because the demand curves are typically decreasing, this will increase the final demand for these products, which in turn will increase economic activity according to eq. (3), and their environmental footprint. One way to avoid such rebound is thus to simultaneously intervene on the unit price of energy through a tax policy, so that the price level is kept high and prevents increases of final demand (see Fig. 1(b)). Importantly, while eq. (3) provides a linear relationship between activity and final demand, once we assume $\boldsymbol{y}$ is price dependent, the system of equations becomes non-linear and finding an analytic expression of the economic equilibrium is nontrivial.
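The rebound mechanism described above can be illustrated numerically on a two-sector toy economy in which sector 0 is the energy sector. The sketch below rests on several assumptions that are not taken from the text: prices are taken to solve a linear price equation of the form p = Aᵀp + δ (with δ equal to one on the energy sector), final demand follows a constant-elasticity demand curve, and all numbers are hypothetical. It compares energy output before an efficiency gain, after the gain with prices held at their reference level, and after the gain with price-responsive demand.

```python
def solve2(M, b):
    """Solve a 2x2 linear system M v = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def prices(A):
    """Unit prices from the ASSUMED price equation p = A^T p + delta_E."""
    M = [[1 - A[0][0], -A[1][0]],
         [-A[0][1], 1 - A[1][1]]]          # I - A^T
    return solve2(M, [1.0, 0.0])           # delta_E selects the energy sector


def outputs(A, y):
    """Sector outputs from x = A x + y."""
    M = [[1 - A[0][0], -A[0][1]],
         [-A[1][0], 1 - A[1][1]]]          # I - A
    return solve2(M, y)


def demand(p, p_ref, y_ref, elasticity=0.5):
    """Hypothetical decreasing demand curve with constant price elasticity."""
    return [y_ref[k] * (p[k] / p_ref[k]) ** (-elasticity) for k in range(2)]


A_base = [[0.1, 0.4], [0.1, 0.1]]   # A[0][1]: energy used per unit of good 1
A_eff = [[0.1, 0.2], [0.1, 0.1]]    # efficiency gain: A[0][1] is halved
y_ref = [5.0, 20.0]
p_ref = prices(A_base)

x_base = outputs(A_base, y_ref)
# Prices held at p_ref (e.g. by a tax): final demand stays at y_ref.
x_no_rebound = outputs(A_eff, y_ref)
# Prices follow the efficiency gain: cheaper goods raise final demand.
x_rebound = outputs(A_eff, demand(prices(A_eff), p_ref, y_ref))

print("energy output, baseline:       ", x_base[0])
print("energy output, no rebound:     ", x_no_rebound[0])
print("energy output, price rebound:  ", x_rebound[0])
```

With these illustrative numbers, the price rebound cancels part of the energy saving, while holding prices fixed preserves it entirely.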

We propose to investigate interventions that address the above challenges by constrained optimization of soft interventions in a structural causal model framework.

2.2 Cyclic causal models

Interventions and their effects on systems have been investigated using Structural Causal Models (SCM) (Pearl, 2000). In this framework, relationships between observed variables $\{X_j\}$ are described by a set of structural assignments of the form

$$X_j := f_j(\mathrm{Pa}_j, \epsilon_j), \quad j = 1, \dots, d,$$

where $\mathrm{Pa}_j$ indicates the parents of variable $X_j$ in an associated directed causal graph, such as the one illustrated in Fig. 1(b), and $\epsilon_j$ is an exogenous variable. Interventions turn an SCM into a different one, by applying a modification to at least one of its elements. Broadly construed, interventions range from “hard” interventions that modify the structure of the graph to “soft” interventions that do not (Eberhardt and Scheines, 2007).

While in acyclic graphs interventions have generic effects on their descendants in the causal graph, and no effects on the parents, Blom et al. (2020) have shown that causal effects are less easy to read in graphs containing cycles. While some qualitative information can be gathered through the use of a causal ordering graph (see Appendix A), it is limited to specific graph structures. Anticipating the effect of interventions in cyclic graphs generally requires estimating the changes in the equilibrium point, which is typically non-trivial. While a variety of approaches may be used (e.g., based on root finding), designing optimal interventions for self-consistent equations that cannot be handled analytically is challenging, especially in high-dimensional systems. Recent work on deep neural networks has come up with techniques allowing gradient-descent-based optimization of such equilibrium models (Bai et al., 2019), which we will leverage.

2.3 Deep equilibrium models

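Bai et al. (2019) introduced deep learning architecture elements with input-output functional relationships between an input $\boldsymbol{u}$, an output $\boldsymbol{z}^\star$ and parameters $\boldsymbol{\theta}$ that are only defined through a self-consistent equation

$$\boldsymbol{z}^\star = f(\boldsymbol{z}^\star, \boldsymbol{u}; \boldsymbol{\theta}).$$

Assuming that for each value pair $(\boldsymbol{u}, \boldsymbol{\theta})$ there is a unique solution $\boldsymbol{z}^\star$, the gradient with respect to the input or the parameter (indicated by the “(.)” notation) can be obtained through another self-consistent equation

$$\frac{\partial \boldsymbol{z}^\star}{\partial (.)} = \frac{\partial f}{\partial \boldsymbol{z}^\star}\,\frac{\partial \boldsymbol{z}^\star}{\partial (.)} + \frac{\partial f}{\partial (.)}.$$

Overall, $\boldsymbol{z}^\star(\boldsymbol{u}, \boldsymbol{\theta})$ can be integrated as a layer in more complex differentiable models, which, as depicted in Fig. 1(c), can be understood as a cascade of multiple layers with identical functions and shared parameters, with specific accelerated root-finding approaches to compute the forward and backward passes (Bai et al., 2019). These layers are used to design differentiable interventions.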
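The two self-consistent equations of an equilibrium layer can be illustrated on a scalar toy model. The sketch below is not the implementation of Bai et al. (2019): the map `f`, the names `u` (input) and `w` (parameter), and the use of plain fixed-point iteration in place of an accelerated root finder are all assumptions made for illustration. The forward pass iterates the map to its fixed point; the backward pass iterates the linear gradient equation, and the result can be checked against a finite-difference estimate.

```python
import math


def f(z, u, w):
    """Hypothetical scalar layer; a contraction in z whenever |w| < 1."""
    return math.tanh(w * z + u)


def solve_fixed_point(u, w, tol=1e-13, max_iter=1000):
    """Forward pass: iterate z <- f(z, u, w) until convergence."""
    z = 0.0
    for _ in range(max_iter):
        z_new = f(z, u, w)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    raise RuntimeError("forward iteration did not converge")


def implicit_grad_w(u, w, tol=1e-13, max_iter=1000):
    """Backward pass: solve dz/dw = (df/dz) dz/dw + df/dw at the fixed point."""
    z = solve_fixed_point(u, w)
    slope = 1.0 - z * z          # tanh'(w z + u), since z = tanh(w z + u)
    df_dz = slope * w
    df_dw = slope * z
    g = 0.0
    for _ in range(max_iter):
        g_new = df_dz * g + df_dw
        if abs(g_new - g) < tol:
            return g_new
        g = g_new
    raise RuntimeError("backward iteration did not converge")


u, w = 0.5, 0.5
z_star = solve_fixed_point(u, w)
grad_implicit = implicit_grad_w(u, w)

# Sanity check against a central finite difference.
h = 1e-6
grad_fd = (solve_fixed_point(u, w + h) - solve_fixed_point(u, w - h)) / (2 * h)
print(z_star, grad_implicit, grad_fd)
```

The key point of the design is that the backward pass never differentiates through the forward iterations: it only needs the partial derivatives of `f` evaluated at the fixed point.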

2.4 Lie groups

With deep equilibrium models offering a differentiable framework for investigating the behavior of cyclic graphs, we can design differentiable soft interventions compatible with classical optimization frameworks. We will use the concept of Lie groups, which are smooth manifolds of transformations (see Appendix A for more background), in order to implement smooth soft interventions. In short, a group is a set of objects $G$ equipped with a group “multiplication” operation mapping $G \times G$ to $G$ and an inverse operation with the following properties:

  • (associativity) for all $g, h, k \in G$, $(gh)k = g(hk)$,

  • (identity element) there exists a unique identity element $e \in G$ such that for all $g \in G$, $eg = ge = g$,

  • (inverse) for all $g \in G$, there exists a unique element $g^{-1} \in G$ such that $g g^{-1} = g^{-1} g = e$.

Groups may be used as sets of transformations applied to objects in a set $\mathcal{X}$ through the definition of a group action operation “$\cdot$” mapping $G \times \mathcal{X}$ to $\mathcal{X}$, such that

  • (identity) for all $x \in \mathcal{X}$, $e \cdot x = x$,

  • (compatibility) for all $g, h \in G$, for all $x \in \mathcal{X}$, $(gh) \cdot x = g \cdot (h \cdot x)$.

A real Lie group is a group that is also a finite-dimensional real smooth manifold (see Appendix A), in which the group operations of multiplication and inversion are smooth maps. The differentiability of Lie groups comes with additional properties that we are going to exploit to design smooth interventions.
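As a concrete instance of these definitions, consider the standard textbook example of the strictly positive reals under multiplication, acting on the real line by scaling (the symbols $a$ and $b$ below are introduced only for this illustration). The group axioms and the action properties can be checked directly, and the exponential map provides a global smooth parameterization that is convenient for gradient-based optimization:

```latex
\begin{aligned}
G &= (\mathbb{R}_{>0}, \times), \qquad e = 1, \qquad g^{-1} = 1/g, \\
g \cdot x &= g\,x \quad (x \in \mathbb{R}), \\
\text{identity:}\quad & 1 \cdot x = x, \\
\text{compatibility:}\quad & (gh) \cdot x = g h x = g \cdot (h \cdot x), \\
\text{smooth chart:}\quad & g = \exp(a),\ a \in \mathbb{R}, \qquad \exp(a)\exp(b) = \exp(a + b).
\end{aligned}
```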

3 Lie interventions in cyclic models

3.1 Smooth causal graphical models

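We start by defining a deterministic smooth structural causal model (SSCM) as a set of variables related to each other through structural equations and associated to vertices in a directed graph as follows.

Definition 1 (SSCM).

A $d$-dimensional smooth structural causal model is a triplet $(\{\mathcal{X}_j \times \Theta_j\}_{j}, \mathcal{G}, \{f_j\}_{j})$ consisting of

  • two collections of smooth manifolds $\{\mathcal{X}_j\}_{j=1}^{d}$ and $\{\Theta_j\}_{j=1}^{d}$,

  • a directed graph $\mathcal{G}$ with set of vertices $\mathcal{V} = \{1, \dots, d\}$ and set of directed edges $\mathcal{E}$ between them, each vertex $j$ being associated to one variable $x_j \in \mathcal{X}_j$,

  • a set of structural assignments $\{x_j := f_j(\boldsymbol{x}_{\mathrm{Pa}_j}, \theta_j)\}_{j \in \mathcal{V}}$, where the $f_j$ are smooth maps, the $\theta_j \in \Theta_j$ are parameters, and $\boldsymbol{x}_{\mathrm{Pa}_j}$ are the variables indexed by the set $\mathrm{Pa}_j$ of parents of vertex $j$ in $\mathcal{G}$.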

Compared to classical definitions of SCMs (see, e.g., Peters et al. 2017), we have replaced exogenous random variables by deterministic parameters living on a manifold. This general definition does not prevent assigning random variables to some (components of) these parameters. In the cases considered here, $\mathcal{X}_j$ and $\Theta_j$ are subsets of Euclidean spaces. We are particularly interested in cyclic SCMs, where there exists at least one directed path linking one vertex to itself. As a consequence, the possible values achieved by each variable have to be chosen among the solutions of the self-consistent structural equation constraints. We assume the base causal model is locally uniquely solvable.

Definition 2.

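An SSCM is locally uniquely solvable around a reference point $(\boldsymbol{x}^0, \boldsymbol{\theta}^0)$ whenever there exists a neighborhood $U_{\boldsymbol{x}}$ of $\boldsymbol{x}^0$ and a neighborhood $U_{\boldsymbol{\theta}}$ of $\boldsymbol{\theta}^0$ such that for all $\boldsymbol{\theta} \in U_{\boldsymbol{\theta}}$ there exists a unique (self-consistent) solution $\boldsymbol{x}^*(\boldsymbol{\theta}) \in U_{\boldsymbol{x}}$ to the set of structural assignments $\{x_j = f_j(\boldsymbol{x}_{\mathrm{Pa}_j}, \theta_j)\}$.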

Note that this is adapted to our SSCM definition and differs from the unique solvability definition of Bongers et al. (2016), which was formulated for causal models with random exogenous variables. This property is guaranteed by a condition on the Jacobian of the structural equations.

Proposition 1.

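We say the SSCM is locally diffeomorphic at $(\boldsymbol{x}^0, \boldsymbol{\theta}^0)$ when $\boldsymbol{x}^0$ is a solution for $\boldsymbol{\theta}^0$ and the Jacobian of the mapping $\boldsymbol{x} \mapsto \boldsymbol{x} - \boldsymbol{f}(\boldsymbol{x}, \boldsymbol{\theta}^0)$ is invertible at $\boldsymbol{x}^0$. This implies that the SSCM is uniquely solvable around this reference point and that the local mapping $\boldsymbol{\theta} \mapsto \boldsymbol{x}^*(\boldsymbol{\theta})$ is smooth.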

In the context of IO analysis presented in Section 2.1, the variables $x_j$ can be seen as the sectors’ outputs and (possibly) unit prices. For eq. (1), the resulting SSCM thus contains the affine structural assignments associated to each component of $\boldsymbol{x}$

$$x_j := \sum_{k} A_{jk}\, x_k + y_j,$$

which are clearly smooth, and the $A_{jk}$’s may be assumed to be fixed or free parameters within an interval.
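For the affine assignments of the demand-driven IO model, the Jacobian condition for local unique solvability can be checked by hand. In the short derivation below, $\boldsymbol{x}$ denotes sector outputs, $A$ the technical coefficient matrix and $\boldsymbol{y}$ final demand; the residual map $F$ and the spectral-radius condition $\rho(A) < 1$ (a standard sufficient condition for invertibility, not stated in the text) are introduced here for illustration only:

```latex
\begin{aligned}
F(\boldsymbol{x}; A, \boldsymbol{y}) &= \boldsymbol{x} - A\boldsymbol{x} - \boldsymbol{y}, \\
\frac{\partial F}{\partial \boldsymbol{x}} &= I - A, \\
\rho(A) < 1 \;&\Longrightarrow\; (I - A)^{-1} = \sum_{m \ge 0} A^{m} \ \text{exists}, \\
\boldsymbol{x}^{*}(A, \boldsymbol{y}) &= (I - A)^{-1}\,\boldsymbol{y}.
\end{aligned}
```

Since matrix inversion is smooth on the set of invertible matrices, the equilibrium $\boldsymbol{x}^{*}$ depends smoothly on both $A$ and $\boldsymbol{y}$ in this case.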

3.2 Lie interventions

We will consider interventions parameterized by an element $g$ that turns the unintervened equilibrium solution $\boldsymbol{x}^*$ into the intervened equilibrium solution $\boldsymbol{x}^{(g)}$ over a range of values of $\boldsymbol{\theta}$. In particular, we define Lie interventions implemented through the action of a Lie group.

Definition 3 (Lie intervention).

A Lie intervention on an SSCM with a set of smooth structural assignments $\{f_j\}$ is a pair $(G, \varphi)$ where $G$ is a Lie group and $\varphi$ a smooth group action $\varphi: (g, f_j) \mapsto g \cdot f_j$. The action defines a family of intervened SSCMs with assignments $\{g \cdot f_j\}$, for $g$ in a neighborhood of the identity within $G$.

Note in particular that applying the identity element of the group leads to the original (unintervened) causal model. Such interventions preserve unique solvability.

Proposition 2 (Solvability).

For a Lie intervention on a locally diffeomorphic SSCM, there is a neighborhood of the identity in $G$ such that the intervention is soft, the family of intervened SSCMs is locally uniquely solvable and the local mapping $(g, \boldsymbol{\theta}) \mapsto \boldsymbol{x}^{(g)}(\boldsymbol{\theta})$ to the intervened solution is smooth.

3.3 Examples

Scalar multiplicative intervention.

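A simple way of intervening on an arbitrary system is to multiply one selected assignment by a strictly positive scalar coefficient. We can consider $\mathbb{R}_{>0}$ equipped with multiplication as a Lie group, such that the action of $g \in \mathbb{R}_{>0}$ on a structural assignment $f_j$ is

$$g \cdot f_j : (\boldsymbol{x}_{\mathrm{Pa}_j}, \theta_j) \mapsto g\, f_j(\boldsymbol{x}_{\mathrm{Pa}_j}, \theta_j).$$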

In the context of Input-Output models presented in Section 2.1, applying this intervention can be seen as reducing or increasing the demand for products of a specific sector. Reducing the demand for a sector with large GHG emissions is for example a relevant objective for the transition to a sustainable economy and may be implemented by public policy in various ways (taxes, norms, …). Such interventions are investigated in the field of industrial ecology (Wood et al., 2018).

Distributed multiplicative interventions.

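We can combine several scalar Lie interventions to act on a set of nodes indexed by $\mathcal{J} \subset \mathcal{V}$ instead of a single one. A group element is a strictly positive vector $\boldsymbol{g} = (g_j)_{j \in \mathcal{J}}$ acting on the assignments in $\mathcal{J}$ such that

$$\boldsymbol{g} \cdot f_j = g_j\, f_j, \quad j \in \mathcal{J}.$$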

For example, this can model simultaneous interventions on multiple economic sectors that may be optimized to achieve a global objective for the economic system.
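A distributed multiplicative intervention can be sketched on the affine assignments of a two-sector IO model. In the code below, the helper names `base_assignments`, `act` and `solve`, as well as all numerical values, are hypothetical; the equilibrium is computed by plain fixed-point iteration. The sketch checks that the identity element leaves the model unchanged and that applying two interventions in sequence matches applying their elementwise product, as required of a group action.

```python
def base_assignments(A, y):
    """Affine structural assignments f_j(x) = sum_k A[j][k] x[k] + y[j]."""
    n = len(y)
    return [lambda x, j=j: sum(A[j][k] * x[k] for k in range(n)) + y[j]
            for j in range(n)]


def act(g, fs):
    """Distributed multiplicative action: (g . f)_j = g_j * f_j."""
    return [lambda x, gj=gj, fj=fj: gj * fj(x) for gj, fj in zip(g, fs)]


def solve(fs, x0, tol=1e-12, max_iter=10_000):
    """Equilibrium of x = f(x) by fixed-point iteration."""
    x = list(x0)
    for _ in range(max_iter):
        x_new = [fj(x) for fj in fs]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical two-sector economy (illustrative numbers only).
A = [[0.1, 0.3], [0.2, 0.1]]
y = [10.0, 20.0]
fs = base_assignments(A, y)

x_ref = solve(fs, y)                       # unintervened equilibrium
x_id = solve(act([1.0, 1.0], fs), y)       # identity element: no change
g, h = [0.8, 1.0], [0.5, 1.2]
x_g = solve(act(g, fs), y)                 # scale down sector 0 only
x_seq = solve(act(h, act(g, fs)), y)       # apply g, then h
x_prod = solve(act([a * b for a, b in zip(g, h)], fs), y)  # apply g*h at once

print(x_ref, x_g, x_seq, x_prod)
```

Note that scaling down sector 0 alone also lowers the equilibrium output of sector 1, since both sectors are coupled through the cycle.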

Figure 2: (a) Illustration of a compartmentalized intervention: enforcing invariance of the green nodes allows each compartment to be independently influenced by its own (invariant) intervention. (b) Architecture for Lie intervention optimization. The equilibrium layer is controlled by the intervention parameters and a loss is applied to its output. (c) Schematic representation of the procedure to learn an invariant soft intervention in a network comprising an intervened node, an invariant node and an auxiliary node. A multilayer perceptron (MLP) learns the soft intervention enforcing invariance of the invariant node over a range of parameter values.

3.4 Invariant soft interventions

The rebound effect is paradigmatic of interventions that may trigger undesired effects that we wish to prevent. To this end, simultaneous interventions on other parts of a system have been considered in applications. For example, a rebound through prices can be prevented by a simultaneous auxiliary intervention of prices through taxes, such that the prices remain invariant to the overall intervention. Using the SSCM framework, we theoretically investigate the conditions under which some variables of the causal model can be maintained invariant to the Lie intervention on others.

Motivating example.

Consider a simple SSCM with parameters $\boldsymbol{\theta}$, subject to a distributed multiplicative Lie intervention $\boldsymbol{g} = (g_1, g_2)$ acting on two of its nodes (see Appendix C).

By choosing $g_2$ appropriately, the intervened equilibrium value of one component of the solution becomes insensitive to the multiplicative intervention $g_1$, such that it coincides with its unintervened value for arbitrary values of the parameters in a neighborhood of the reference parameter (see Appendix C). This result suggests that the influence of soft interventions ($g_1$ in this example) can be restricted to a subset of nodes by choosing a second intervention ($g_2$ in this example) on an auxiliary variable. However, it is unclear whether this result still holds when the functional assignments become non-linear.

To frame this question in a general setting, we introduce a class of soft interventions under invariance constraint.

Definition 4 (Invariant soft interventions).

Consider an SSCM with a Lie intervention from group $G$ on node $i$. The intervention leaves node $k$ invariant by leveraging node $l$ if, for an arbitrary group element $g$ in a neighborhood of the identity, there exists a soft intervention on node $l$, $\tilde{f}_l^{(g)}$, replacing the functional assignment $f_l$ such that the intervened node value satisfies $x_k^{(g)}(\boldsymbol{\theta}) = x_k^*(\boldsymbol{\theta})$ in a neighborhood of the reference parameter. Node $i$ is called the intervened node, node $k$ is called the invariant node, and node $l$ is called the auxiliary node.

Remark:

The soft intervention property is key, as it entails that the use of an auxiliary variable to enforce the invariance constraint must only exploit the information available to this node as defined by its parents in the unintervened graph (and no parameter values). This constraint makes deployment more realistic in a complex system, as intervening does not require supervision by an external entity monitoring the whole system.

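Let us denote by $\boldsymbol{x}_{-l}$ and $\boldsymbol{f}_{-l}$ the vector and mapping with the $l$-th component removed. We also define two quantities important for the existence of such interventions. The partial derivative $\partial x_k / \partial x_l$ is obtained by performing a hard intervention $do(x_l = v)$ leading to equilibrium value $\boldsymbol{x}^{do(x_l = v)}$, and computing the derivative $\mathrm{d} x_k^{do(x_l = v)} / \mathrm{d} v$. The Jacobian $J_{\boldsymbol{\theta} \to \mathrm{Pa}_l}$ is the Jacobian of the mapping $\boldsymbol{\theta} \mapsto \boldsymbol{x}^*_{\mathrm{Pa}_l}(\boldsymbol{\theta})$ from the parameters to the vector consisting of the parent nodes of $l$ at equilibrium. Based on these two quantities, we have the following sufficient condition.

Proposition 3.

Consider an SSCM locally diffeomorphic at $(\boldsymbol{x}^0, \boldsymbol{\theta}^0)$ with intervened/invariant/auxiliary triplet of nodes $(i, k, l)$. If the Jacobian of the mapping $\boldsymbol{x}_{-l} \mapsto \boldsymbol{x}_{-l} - \boldsymbol{f}_{-l}(\boldsymbol{x}, \boldsymbol{\theta}^0)$ is invertible, $J_{\boldsymbol{\theta} \to \mathrm{Pa}_l}$ has full column rank, and $\partial x_k / \partial x_l \neq 0$, then the intervention on $i$ leaves node $k$ invariant by leveraging node $l$.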

This result suggests that the above motivating example can be extended, in a neighborhood of the identity, beyond the linear case, when the number of free parameters considered remains low relative to the number of parents of the auxiliary node. However, as can be seen in the proof, the soft intervention on the auxiliary variable is given by an implicit function theorem, suggesting that non-parametric models and automated differentiation methods are necessary to learn it. This will be described in Sec. 4.

3.5 Compartmentalized interventions

Invariant interventions make it possible to restrict the propagation of effects to a subset of nodes. If a complex system can be partitioned into sparsely connected subsets of nodes, we can consider designing such interventions in order to modify the equilibrium values of each compartment independently of the others.

Definition 5.

Consider a partition of the SSCM nodes into compartments $\{C_m\}$. Interventions are compartmentalized when each of them affects only the values of a single compartment.

The following result guarantees that if the nodes influencing other compartments are made invariant, interventions on each compartment can be designed and performed independently of each other, as their effects remain confined to their own compartment.

Proposition 4.

Consider a partition $\{C_m\}$ of the SSCM nodes. If for each compartment $C_m$ there exists one Lie intervention performed on structural equations such that the intervened, auxiliary and invariant nodes belong to the compartment, and all nodes that have an arrow to another compartment are invariant, then those interventions are compartmentalized.

A fundamental aspect of this result is that, from the definition of invariant interventions, compartmentalization is valid over a range of parameters of the causal model (a neighborhood of the reference point) and a range of Lie intervention parameters (a neighborhood of the identity). This can be seen as a form of robustness of the interventional framework. An illustration of such an intervention is provided in Fig. 2(a), where the equilibria of two sparsely connected compartments are interdependent (notably, the causal ordering algorithm returns a single cluster merging both compartments). Enforcing invariance of the green nodes, each associated to one intervention within its compartment, makes Proposition 4 applicable.

4 Differentiable intervention design

To address Challenge 2 of Sec. 2.1, we design interventions with implicit layers (see Appendix D for details).

Differentiable architecture.

Base optimization relies on a differentiable architecture comprising one central module representing the cyclic SSCM. Essentially, the cyclic model is represented by an equilibrium layer following Bai et al. (2019), schematized in Fig. 1(c): the differentiable module is designed such that forward and backward passes through the equilibrium layer use Anderson acceleration to solve a fixed point equation. This equilibrium layer is cascaded if necessary with parametric layers to achieve specific goals. The architectures are implemented using the PyTorch library.

Figure 3: (a-b) Outcome of Lie intervention optimization of GHG emission reduction while preserving employment in France (a) and Germany (b). Tables indicate sectors with the largest % reduction of activity, for the $\lambda$ associated to the red cross in the above curve. (c) Outcome of Lie and invariant interventions on energy efficiency in a 3-sector economy. Top: unit price of the target sector, bottom: total energy demand.

Lie intervention optimization

We design an architecture around the equilibrium module to optimize a multiplicative intervention according to a loss, as represented in Fig. 2(b). Parameters of the Lie group element $g$ are optimized in order to minimize an objective achieved by the equilibrium solution of the cyclic causal model, and possibly controlled by external parameters. The total loss may include an additional regularization term, with regularization parameter $\lambda$, to enforce that some properties of the intervened system remain invariant or close to the original, non-intervened, equilibrium solution $\boldsymbol{x}^*$.
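The optimization loop can be sketched on a two-sector toy economy without any deep learning library. This is a simplified stand-in rather than the architecture described above: central finite differences replace automatic differentiation through the equilibrium layer, a direct 2×2 solve replaces the iterative root finder, a squared Euclidean penalty is assumed for the regularizer, and all names and numbers (`ghg`, `emp`, `lam`, the learning rate) are hypothetical. The intervention is parameterized as `g = exp(a)`, which keeps the coefficients strictly positive during gradient descent.

```python
import math

# Hypothetical two-sector economy; all numbers are illustrative assumptions.
A = [[0.1, 0.3], [0.2, 0.1]]   # technical coefficients
y = [10.0, 20.0]               # final demand
ghg = [1.0, 0.1]               # emissions per unit of output
emp = [0.5, 1.0]               # employment per unit of output
lam = 0.5                      # regularization weight


def equilibrium(a):
    """Intervened equilibrium x = diag(g) (A x + y), with g = exp(a)."""
    g = [math.exp(ai) for ai in a]
    m00, m01 = 1.0 - g[0] * A[0][0], -g[0] * A[0][1]
    m10, m11 = -g[1] * A[1][0], 1.0 - g[1] * A[1][1]
    b0, b1 = g[0] * y[0], g[1] * y[1]
    det = m00 * m11 - m01 * m10
    return [(b0 * m11 - m01 * b1) / det, (m00 * b1 - m10 * b0) / det]


x_ref = equilibrium([0.0, 0.0])  # unintervened solution (g = identity)


def emissions(a):
    return sum(s * xj for s, xj in zip(ghg, equilibrium(a)))


def loss(a):
    """Emissions plus a penalty on deviations from reference employment."""
    x = equilibrium(a)
    penalty = sum((e * (xj - xr)) ** 2 for e, xj, xr in zip(emp, x, x_ref))
    return emissions(a) + lam * penalty


def grad(fn, a, h=1e-6):
    """Central finite differences (stand-in for automatic differentiation)."""
    out = []
    for i in range(len(a)):
        up, dn = list(a), list(a)
        up[i] += h
        dn[i] -= h
        out.append((fn(up) - fn(dn)) / (2 * h))
    return out


a = [0.0, 0.0]                   # start at the identity element
for _ in range(3000):
    a = [ai - 5e-4 * di for ai, di in zip(a, grad(loss, a))]

g_opt = [math.exp(ai) for ai in a]
print("optimized g:", g_opt)
print("emissions before/after:", emissions([0.0, 0.0]), emissions(a))
```

Sweeping `lam` in such a sketch traces out a trade-off curve between emission reduction and employment preservation.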

Learning invariant interventions.

In order to enforce invariance of interventions based on Sec. 3.4, we follow the procedure exemplified in Fig. 2(c) for a 3-node network. We design two implicit layers with shared parameters $\boldsymbol{\theta}$, the first layer being unintervened and giving the corresponding equilibrium values of the nodes, and the second one being invariantly intervened, for a fixed value of the Lie intervention $g$ on the intervened node. In the intervened layer, we replace putative incoming arrows from the invariant node by arrows from the same node in the unintervened graph (as this replacement encodes the invariance assumption) and we replace the functional assignment of the auxiliary node by a multilayer perceptron, relying on its universal approximation properties to learn a soft intervention that will ensure invariance is satisfied. We use a least-squares penalty between the intervened and unintervened equilibrium values of the invariant node in order to train the MLP.
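The least-squares training objective described above can be written compactly. The notation is an assumption made for illustration: $\phi$ denotes the MLP weights, $\boldsymbol{\theta}_1, \dots, \boldsymbol{\theta}_N$ a sample of free parameter values, $k$ the invariant node, $x_k^{*}$ its equilibrium value in the unintervened layer, and $x_k^{(g,\phi)}$ its equilibrium value in the intervened layer, for a fixed Lie intervention $g$:

```latex
\min_{\phi} \; \frac{1}{N} \sum_{n=1}^{N}
\left( x_k^{(g,\phi)}(\boldsymbol{\theta}_n) - x_k^{*}(\boldsymbol{\theta}_n) \right)^{2}
```

The gradient with respect to $\phi$ flows through the intervened equilibrium layer, which is where implicit differentiation is needed.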

5 Experiments

The following toy and semi-synthetic experiments illustrate how our framework contributes to addressing the sustainability challenges presented in Sec. 2.1. Supplemental experiments can be found in Appendix E.

Optimization of multiplicative Lie interventions

In order to investigate Challenge 1, we optimize the IO demand driven model of eq. (1). The matrices and , as well as the final demands and sector output at equilibrium are estimated from the Exiobase 3 dataset (Stadler et al., 2018) for year 2007, using the Pymrio library (Stadler, 2021). While this database describes economic interactions across multiple countries, we perform a separate analysis of the economic equilibrium of each country by neglecting those interactions, an extracting the blocs of matrices and relevant to the country under consideration. We design a distributed multiplicative Lie intervention on the activity of all 200 sectors included in the database. The coefficient vector is then optimized in order to reduce the overall greenhouse gaz (GHG) emissions cumulated across sectors (estimated by one component of the stressor vector ), while enforcing that the overall employment distribution over the sectors stays closest to the non-intervened economy, in order to mitigate challenges associated to reorganizing of economic activities (e.g. mass unemployment and the need for large scale professional reorientation programs). Using the norm for regularization, this leads to the following loss:

where is the GHG emission intensity of each sector, and and the intervened and unintervened distributions of employment across sectors (estimated by entry-wise multiplication of with one row of matrix ). The graphs shown in Figs. 3-3 (top) illustrate the trade-off between employment preservation and GHG emission reduction achieved by varying for three different countries. Interestingly, the left tail of these curves reflects differences across countries, with Germany having less room than France for reducing emissions before employment starts to decrease significantly. Also notice that the sectors where activity is the most reduced differ across countries, likely influenced both by the overall structure of each economy and by technology-based differences in GHG intensities.
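To make the structure of this objective concrete, here is a minimal sketch on a made-up three-sector economy. Everything in it is an assumption for illustration: the equilibrium equation is taken to be x = exp(alpha) * (A x + y) with alpha the intervention coefficients, the regularizer is taken to be a Euclidean norm, and the names `equilibrium`, `loss_terms`, `ghg`, `labor` and `lam` as well as all numbers are hypothetical. It does not use Exiobase data or automatic differentiation.

```python
import math

def equilibrium(A, y, alpha, n_iter=500):
    """Fixed point of x = exp(alpha) * (A x + y), found by plain iteration.

    Assumed form of a multiplicative intervention on sector activity;
    alpha = 0 gives back the unintervened model x = A x + y.
    """
    n = len(y)
    scale = [math.exp(a) for a in alpha]
    x = list(y)
    for _ in range(n_iter):
        x = [scale[i] * (sum(A[i][j] * x[j] for j in range(n)) + y[i])
             for i in range(n)]
    return x

def loss_terms(A, y, ghg, labor, alpha):
    """Return (total emissions, distance between employment vectors)."""
    x_ref = equilibrium(A, y, [0.0] * len(y))
    x_int = equilibrium(A, y, alpha)
    emissions = sum(g * x for g, x in zip(ghg, x_int))
    # Employment per sector = labor intensity * output (entry-wise product).
    gap = math.sqrt(sum((l * (a - b)) ** 2
                        for l, a, b in zip(labor, x_int, x_ref)))
    return emissions, gap

def loss(A, y, ghg, labor, alpha, lam):
    """Emissions plus lam times the employment penalty (assumed Euclidean)."""
    emissions, gap = loss_terms(A, y, ghg, labor, alpha)
    return emissions + lam * gap

# Hypothetical toy economy: row sums of A are below 1, so iteration converges.
A = [[0.1, 0.2, 0.0],
     [0.1, 0.1, 0.3],
     [0.2, 0.0, 0.1]]
y = [1.0, 2.0, 1.0]        # final demand
ghg = [5.0, 1.0, 0.5]      # emission intensity per unit of output
labor = [0.2, 1.0, 0.8]    # employment per unit of output

baseline = loss(A, y, ghg, labor, [0.0, 0.0, 0.0], lam=1.0)
shrunk = loss(A, y, ghg, labor, [-0.2, 0.0, 0.0], lam=1.0)
print(baseline, shrunk)
```

Scaling down the emission-intensive first sector lowers the emission term and raises the employment penalty; the weight `lam` sets how the two are traded off, which is the role of the regularization coefficient varied in the figure.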

Control of rebound effects.

To illustrate how Challenge 3 of Sec. 2.1 can be addressed, we used our invariant intervention framework to prevent price rebound effects. We use a toy three-sector model, with one energy sector and one target sector for which energy efficiency is increased, modeled by a multiplicative Lie intervention on the energy requirements coefficient of the Leontief matrix. We use the final demand of this target sector as an invariant node, and achieve this by softly intervening on it through a modification of the unit price of this sector. The soft intervention is learnt using an MLP with two hidden layers (see Appendix D for details). Fig. 3 (top) compares three models: unintervened (called “reference” in the figure), Lie intervened (without enforcing invariance), and invariantly intervened. For a range of values of one parameter left free in the Leontief matrix, the results show that the invariant intervention maintains the price close to that of the unintervened model (Fig. 3, top), while this price is much lower for the Lie intervention (due to the rebound effect). The benefit of invariance is demonstrated by the effect on the overall energy demand of the economy (Fig. 3, top): the invariant intervention leads to a larger reduction of energy demand (relative to the unintervened system) than the Lie intervention, as the rebound through prices is prevented.

Compartmentalized interventions design.

We further illustrate the benefits of compartmentalized interventions for addressing Challenge 2 exposed in Sec. 2.1, in the context of high-dimensional multi-sector economic models, in Appendix E.

6 Discussion

We introduced a differentiable soft interventions framework for complex systems. We argue that such interventions are more likely to approximate policy changes that can actually be deployed in real-world socio-economic systems and to address key challenges of the transition to sustainable economies. Theoretical results and algorithmic tools are provided to design interventions with desirable invariance properties, under the assumption that the considered system is in equilibrium and that model parameters are known. Further work in this direction will need to address identifiability of the considered models from observational or experimental data, and extend the results to non-equilibrium settings.

Acknowledgments

The authors would like to thank Philipp Geiger for insightful discussions. This work was supported by the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039B; and by the Machine Learning Cluster of Excellence, EXC number 2064/1 - Project number 390727645.

References

  • Amos et al. (2018) B. Amos, I. D. J. Rodriguez, J. Sacks, B. Boots, and J. Z. Kolter. Differentiable MPC for end-to-end planning and control. arXiv preprint arXiv:1810.13400, 2018.
  • Andersen (2013) H. Andersen. When to expect violations of causal faithfulness and why it matters. Philosophy of Science, 80(5):672–683, 2013.
  • Arrobbio and Padovan (2018) O. Arrobbio and D. Padovan. A vicious tenacity: The efficiency strategy confronted with the rebound effect. Frontiers in Energy Research, 6:114, 2018.
  • Bai et al. (2019) S. Bai, J. Z. Kolter, and V. Koltun. Deep equilibrium models. arXiv preprint arXiv:1909.01377, 2019.
  • Besserve et al. (2021) M. Besserve, R. Sun, D. Janzing, and B. Schölkopf. A theory of independent mechanisms for extrapolation in generative models. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 6741–6749, 2021.
  • Blom and Mooij (2021) T. Blom and J. M. Mooij. Causality and independence in perfectly adapted dynamical systems. arXiv preprint arXiv:2101.11885, 2021.
  • Blom et al. (2020) T. Blom, M. M. van Diepen, and J. M. Mooij. Conditional independences and causal relations implied by sets of equations. arXiv preprint arXiv:2007.07183, 2020.
  • Bongers et al. (2016) S. Bongers, P. Forré, J. Peters, B. Schölkopf, J. M. Mooij, et al. Foundations of structural causal models with cycles and latent variables. arXiv preprint arXiv:1611.06221, 2016.
  • Brockway et al. (2021) P. E. Brockway, S. Sorrell, G. Semieniuk, M. K. Heun, and V. Court. Energy efficiency and economy-wide rebound effects: A review of the evidence and its implications. Renewable and Sustainable Energy Reviews, page 110781, 2021.
  • Correa and Bareinboim (2020) J. Correa and E. Bareinboim. General transportability of soft interventions: Completeness results. Advances in Neural Information Processing Systems, 33, 2020.
  • Dearing et al. (2014) J. A. Dearing, R. Wang, K. Zhang, J. G. Dyke, H. Haberl, M. S. Hossain, P. G. Langdon, T. M. Lenton, K. Raworth, S. Brown, et al. Safe and just operating spaces for regional social-ecological systems. Global Environmental Change, 28:227–238, 2014.
  • Eberhardt and Scheines (2007) F. Eberhardt and R. Scheines. Interventions and causal inference. Philosophy of science, 74(5):981–995, 2007.
  • El Ghaoui et al. (2021) L. El Ghaoui, F. Gu, B. Travacca, A. Askari, and A. Tsai. Implicit deep learning. SIAM Journal on Mathematics of Data Science, 3(3):930–958, 2021.
  • Esteban-Bravo (2004) M. Esteban-Bravo. Computing equilibria in general equilibrium models via interior-point methods. Computational Economics, 23(2):147–171, 2004.
  • Geiger and Straehle (2020) P. Geiger and C.-N. Straehle. Learning game-theoretic models of multiagent trajectories using implicit layers. arXiv preprint arXiv:2008.07303, 2020.
  • Haberl et al. (2019) H. Haberl, D. Wiedenhofer, S. Pauliuk, F. Krausmann, D. B. Müller, and M. Fischer-Kowalski. Contributions of sociometabolic research to sustainability science. Nature Sustainability, 2(3):173–184, 2019.
  • Imbens and Rubin (2015) G. W. Imbens and D. B. Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015.
  • Jevons (1866) W. S. Jevons. The coal question. Routledge, 1866.
  • Lee (2013) J. M. Lee. Smooth manifolds. In Introduction to Smooth Manifolds, pages 1–31. Springer, 2013.
  • Leontief (1951) W. W. Leontief. The structure of american economy, 1919-1939: an empirical application of equilibrium analysis. Technical report, 1951.
  • Mooij et al. (2013) J. Mooij, D. Janzing, and B. Schölkopf. From ordinary differential equations to structural causal models: the deterministic case. In Proceedings of the Twenty-Ninth Conference Annual Conference on Uncertainty in Artificial Intelligence, pages 440–448, Corvallis, OR, 2013. AUAI Press.
  • Paszke et al. (2019) A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala. Pytorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019.
  • Pearl (2000) J. Pearl. Causality: models, reasoning and inference, volume 29. Cambridge Univ Press, 2000.
  • Peters et al. (2017) J. Peters, D. Janzing, and B. Schölkopf. Elements of Causal Inference – Foundations and Learning Algorithms. MIT Press, 2017.
  • Peters et al. (2020) J. Peters, S. Bauer, and N. Pfister. Causal models for dynamical systems. arXiv preprint arXiv:2001.06208, 2020.
  • Sherwood and Huber (2010) S. C. Sherwood and M. Huber. An adaptability limit to climate change due to heat stress. Proceedings of the National Academy of Sciences, 107(21):9552–9555, 2010.
  • Stadler (2021) K. Stadler. Pymrio–a python based multi-regional input-output analysis toolbox. Journal of Open Research Software, 9(1), 2021.
  • Stadler et al. (2018) K. Stadler, R. Wood, T. Bulavskaya, C.-J. Södersten, M. Simas, S. Schmidt, A. Usubiaga, J. Acosta-Fernández, J. Kuenen, M. Bruckner, et al. Exiobase 3: Developing a time series of detailed environmentally extended multi-regional input-output tables. Journal of Industrial Ecology, 22(3):502–515, 2018.
  • Wallenborn (2018) G. Wallenborn. Rebounds are structural effects of infrastructures and markets. Frontiers in Energy Research, 6:99, 2018.
  • Wiebe et al. (2018) K. S. Wiebe, E. L. Bjelle, J. Többen, and R. Wood. Implementing exogenous scenarios in a global MRIO model for the estimation of future environmental footprints. Journal of Economic Structures, 7(1):1–18, 2018.
  • Wood et al. (2018) R. Wood, D. Moran, K. Stadler, D. Ivanova, K. Steen-Olsen, A. Tisserant, and E. G. Hertwich. Prioritizing consumption-based carbon policy based on the evaluation of mitigation potential using input-output methods. Journal of Industrial Ecology, 22(3):540–552, 2018.

Learning soft interventions in complex equilibrium systems. Appendices

Appendix A Additional background

a.1 Smooth manifolds

While many non-equivalent definitions exist for smooth manifolds, we follow Lee (2013) in defining smoothness as infinite continuous differentiability of functions. A diffeomorphism is then a smooth bijection whose inverse is also smooth.

For an n-dimensional topological manifold , an atlas is a collection of coordinate charts such that the ’s are open sets of covering it, and such that the mappings are homeomorphisms (continuous bijections with continuous inverses). Briefly, the atlas is smooth whenever the transition maps between charts are diffeomorphisms wherever they are well defined, and a smooth manifold is a topological manifold associated with a maximal smooth atlas.

A smooth map between two smooth manifolds and is a function such that for any chart and , is smooth whenever well defined.

a.2 Cyclic causal models

A classical type of hard intervention is the perfect intervention, which replaces the structural assignment of a given variable by an assignment , with constant (Blom et al., 2020). It thus eliminates the arrows in the causal graph pointing to this variable, and makes this variable deterministic.

In particular, tracing the effects of perfect interventions requires special assumptions. In contrast, the effects of soft interventions may be read from the so-called causal ordering graph, which can be built from the original SCM graph. Broadly construed, a unique causal ordering graph can be constructed with several algorithms (Blom et al., 2020). This is a directed cluster graph that contains groups of variables connected by oriented edges (starting from a single variable in a given cluster, and pointing to another cluster). By construction, the resulting graph between clusters entailed by these edges is directed and contains no cycles. As a consequence, the effect of a generic soft intervention on clustered variables can be easily read from this graph.

a.3 Link between equilibrium and dynamic models

The equilibrium of eq. (1) can be thought of as the asymptotic value of in a dynamic model (see Appendix A)

where the increase or decrease of the sectors’ activity is controlled by the imbalance between their demand and their current output . More generally, any fixed-point equation can be thought of as the equilibrium value of some dynamical system, for example by considering a numerical algorithm that converges to it. However, the relationship between dynamical systems and self-consistent equations is not one-to-one. Notably, we can rescale the time evolution of a stable dynamical system to create many others that converge to the same self-consistent solution. Moreover, by inverting the arrow of time, we can obtain systems for which the solution of the self-consistent equation is an unstable equilibrium. As mentioned in the main text, in this work we leave aside the dynamical aspects to focus on the equilibrium properties.
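The three remarks above (convergence to the equilibrium, time rescaling, and time reversal) can be checked numerically on a small example. The sketch below assumes the dynamics take the form dx/dt = rate * (A x + y - x), so that the stationary point solves x = A x + y; this form, the 2-sector matrix, and the names `simulate`, `rate` and `distance` are illustrative assumptions, not taken from the paper.

```python
def simulate(A, y, x0, rate=1.0, dt=0.01, n_steps=5000):
    """Explicit Euler integration of dx/dt = rate * (A x + y - x)."""
    n = len(y)
    x = list(x0)
    for _ in range(n_steps):
        # Demand faced by each sector: intermediate use plus final demand.
        demand = [sum(A[i][j] * x[j] for j in range(n)) + y[i]
                  for i in range(n)]
        # Output moves in proportion to the demand/output imbalance.
        x = [x[i] + dt * rate * (demand[i] - x[i]) for i in range(n)]
    return x

def distance(u, v):
    """Largest absolute coordinate difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))

# Hypothetical 2-sector technical coefficients and final demand.
A = [[0.2, 0.3],
     [0.1, 0.4]]
y = [1.0, 2.0]

# Closed-form solution of (I - A) x = y for the 2x2 case.
det = (1 - A[0][0]) * (1 - A[1][1]) - A[0][1] * A[1][0]
x_star = [((1 - A[1][1]) * y[0] + A[0][1] * y[1]) / det,
          (A[1][0] * y[0] + (1 - A[0][0]) * y[1]) / det]

slow = simulate(A, y, [0.0, 0.0], rate=1.0)       # reference time scale
fast = simulate(A, y, [0.0, 0.0], rate=3.0)       # rescaled time
start = [x_star[0] + 1e-3, x_star[1]]             # tiny perturbation
unstable = simulate(A, y, start, rate=-1.0)       # reversed arrow of time

print(distance(slow, x_star), distance(fast, x_star), distance(unstable, x_star))
```

Both positive rates end up at the same point, while the time-reversed system moves away from it even when started very close, so the equilibrium equation alone does not pin down the dynamics.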

a.4 MRIO models

Multi-regional input-output models are built based on macro-economic information, notably that provided by the National Accounts of the countries involved in the model. The technical coefficient matrix of eq. (1) is computed from so-called Supply and Use Tables that form the basis of National Accounts. The unit used to measure output is frequently monetary (e.g., EUR) due to the data collection process and to allow a homogeneous treatment of the economic flows. However, under homogeneity and linearity assumptions, the output of each sector may be converted into appropriate physical units using unit prices and material flow data. Moreover, there also exist hybrid MRIO models which include information regarding physical flows in the economy (energy, raw materials, …) that is combined with monetary information to ensure the best level of self-consistency.

Appendix B Proof of main text results

b.1 Proof of Proposition 1

Proof.

Assuming the SSCM is locally diffeomorphic entails that the Jacobian of is invertible at . Then the Jacobian of is also invertible at (due to its block triangular structure). Using the inverse function theorem for smooth maps between smooth manifolds (Lee, 2013, Theorem 4.5), this implies that there exist connected open neighborhoods of and such that

is a diffeomorphism. As a consequence, self-consistent solutions in are given by . It is a submanifold of the same dimension as for the following reasons:

  • is a manifold diffeomorphic to and thus has the same dimension (Lee, 2013, Theorem 2.17),

  • is an open submanifold because is open, and thus has the same dimension as (Lee, 2013, Proposition 5.1)

  • has the same dimension as because it is diffeomorphic to it (Lee, 2013, Propositions 5.3 and 2.17).

Let us now define the cartesian projection

We want to establish that there exists an open neighborhood of such that there is a unique self-consistent solution for each parameter choice in this set. is a smooth embedding because it is an injective smooth immersion and is open, by Lee (2013, Proposition 4.22) ( is open because is a smooth submersion and thus open by Proposition 4.28 in Lee (2013), and is also open as the restriction of a diffeomorphism). As a consequence, is an embedded submanifold of diffeomorphic to (by Lee (2013, Proposition 5.2)). Since we have shown that the dimension of is the dimension of , then is a submanifold of codimension zero (it has the same dimension as its ambient manifold) and is thus an open submanifold of (Proposition 5.1 in Lee (2013)). As a consequence, is open, such that there is an open neighborhood of included in it. Then for any parameter chosen in this neighborhood, there is at least one solution to the self-consistency equation, by definition of the image. If there were two distinct solutions for this parameter, the mapping would not be a diffeomorphism. ∎

b.2 Proof of Proposition 2

Proof.

We extend the smooth parameterization of function by to get a smooth parameterization of the intervened functional assignments by . Indeed, the mapping

is smooth as a composition of the following smooth maps

where the smoothness of each transformation stems from the definition of SSCM and Lie interventions, respectively. Proposition 1 applied around the extended parameter implies that there exists a neighborhood of this point such that the intervened system is uniquely solvable and the mapping from the extended parameter to the solution is smooth. There exists moreover a product neighborhood (this is a basic property of neighborhoods on product spaces). By continuity of the partial derivative of the intervened functional assignment (due to smoothness of the Lie group action), dependency on the parents of the intervened variables is preserved in a neighborhood of the identity, such that the intervention is soft in the considered neighborhood. ∎

b.3 Proof of Proposition 3

Proof.

The Lie intervention parameterized by guarantees that solvability of the SSCM is preserved in a neighborhood of the identity (Proposition 2), and we denote the unique solution in such a neighborhood, with . The Jacobian is the Jacobian of the mapping from the parameters to the vector consisting of the parent nodes of at equilibrium. Because this Jacobian has full column rank, there exists a neighborhood of such that for any fixed in it, the mapping is injective in a neighborhood of the reference parameter. As a consequence, the restriction to its image is a diffeomorphic map between manifolds. Let us denote its inverse.

Consider the SSCM obtained by performing a hard intervention . Because the original SSCM is locally diffeomorphic at , is a smooth assignment, and the Jacobian of the mapping is additionally invertible, this hard-intervened system is also locally diffeomorphic at (exploiting the block diagonality of the Jacobian of its assignment). As a consequence, a Lie intervention with parameter on node of this (already hard-intervened) system leads to a smooth intervened equilibrium .

Let us recall that the partial derivative corresponds to the derivative with respect to the value of the hard intervention. The assumption thus entails, by the inverse function theorem, that there also exists a smooth mapping such that in a neighborhood of . As a consequence, the mapping defined as is a soft intervention replacing that achieves the same equilibrium values as the above hard-intervened system under Lie interventions, and in particular satisfies the invariance constraint . ∎

b.4 Proof of Proposition 4

Proof.

We proceed iteratively by adding one intervention after the next. The first intervention, on compartment , leaves the equilibrium values of the remaining compartments invariant, as the only node from influencing them is invariant.

Given that satisfy invariance with respect to each other’s interventions, consider intervening on . As receives inputs from intervened upon compartments only through invariant nodes, the invariant intervention on it can be designed identically as for the non-intervened system. Moreover, invariance of the outgoing node ensures that the equilibrium values of the (potentially intervened upon) other compartments are preserved. ∎

Appendix C Additional theoretical results

c.1 Motivating example of Sec. 3.4

Let us restate the unintervened assignments of this example.

The equilibrium solution then writes

Applying multiplicative Lie interventions on both and leads to the assignments

which leads to the intervened equilibrium

We can thus notice that choosing makes the intervened equilibrium value invariant for any choice of parameters .

Appendix D Methods

Following Bai et al. (2019), we implemented implicit layers using the PyTorch library (Paszke et al., 2019). We use Anderson acceleration with a coefficient of 2.0 to iteratively compute the fixed points of the implicit layers, for both forward and backward passes, with a maximum of 5000 iterations and a tolerance of 0.0001.
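As a rough illustration of how an implicit layer yields gradients, the sketch below differentiates a scalar fixed point x* = f(x*, theta) with the implicit function theorem, dx*/dtheta = (1 - df/dx)^{-1} df/dtheta, and compares the result with a finite difference. It is deliberately simplified and is not the implementation described above: plain fixed-point iteration stands in for Anderson acceleration, the derivative is written by hand instead of using PyTorch autograd, and the function `f` and all names are hypothetical.

```python
import math

def f(x, theta):
    """Toy contractive assignment: |df/dx| <= 0.5, so iteration converges."""
    return 0.5 * math.tanh(x) + theta

def solve_fixed_point(theta, tol=1e-12, max_iter=5000):
    """Forward pass: iterate x <- f(x, theta) until the update is below tol."""
    x = 0.0
    for _ in range(max_iter):
        x_new = f(x, theta)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def implicit_grad(theta):
    """Backward pass: dx*/dtheta from the implicit function theorem."""
    x_star = solve_fixed_point(theta)
    df_dx = 0.5 * (1.0 - math.tanh(x_star) ** 2)   # partial derivative in x
    df_dtheta = 1.0                                # partial derivative in theta
    return df_dtheta / (1.0 - df_dx)

theta = 0.3
h = 1e-5
finite_diff = (solve_fixed_point(theta + h) - solve_fixed_point(theta - h)) / (2 * h)
print(implicit_grad(theta), finite_diff)
```

The gradient only needs the partial derivatives at the converged point, not the history of iterations, which is why the forward solver can be swapped for a faster one without changing the backward computation.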

Optimization of interventions is done using backpropagation with adaptive moment estimation (Adam), with a learning rate of 0.001. Soft interventions to enforce invariance are learned with perceptrons with two hidden layers, with 20 and 10 hidden units in the first and second hidden layers respectively, and all layers have ReLU activation functions.

Appendix E Additional experiments

e.1 Compartmentalized interventions

We design a two-compartment Leontief model according to Fig. 2. We optimize two invariant interventions, on compartment 1 and on compartment 2, to follow the conditions of Prop. 4. The results in Fig. 4 show the invariance of the invariant node of compartment 1 to the values of both and (Fig. 4), while the intervened node of the same compartment changes value in a way similar to the (non-invariant) Lie intervention, only as a function of (Fig. 4).

Figure 4: Result of invariant interventions on a two-compartment cyclic graph. The unintervened node values (in black) are compared to invariant interventions (II, solid lines) and their corresponding Lie interventions (LI, dashed lines) (without enforcing invariance), for different values of the pair of Lie intervention parameters . (a) Value of the invariant node in compartment 1. (b) Value of the intervened node in compartment 1.