CITRIS: Causal Identifiability from Temporal Intervened Sequences

by Phillip Lippe, et al.

Understanding the latent causal factors of a dynamical system from visual observations is a crucial step towards agents reasoning in complex environments. In this paper, we propose CITRIS, a variational autoencoder framework that learns causal representations from temporal sequences of images in which underlying causal factors have possibly been intervened upon. In contrast to the recent literature, CITRIS exploits temporality and observed intervention targets to identify scalar and multidimensional causal factors, such as 3D rotation angles. Furthermore, by introducing a normalizing flow, CITRIS can easily be extended to leverage and disentangle representations obtained by pretrained autoencoders. Extending previous results on scalar causal factors, we prove identifiability in a more general setting, in which only some components of a causal factor are affected by interventions. In experiments on 3D rendered image sequences, CITRIS outperforms previous methods on recovering the underlying causal variables. Moreover, using pretrained autoencoders, CITRIS can even generalize to unseen instantiations of causal factors, opening future research areas in sim-to-real generalization for causal representation learning.
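
To make the data setting in the abstract concrete, the following is a minimal toy sketch of a temporal sequence in which causal factors may be intervened upon and the intervention targets are observed. It is not the paper's data pipeline: the function name `simulate_sequence`, the linear dynamics, the noise scales, and the intervention probability are all illustrative assumptions, and the rendering of factors into images is omitted entirely.

```python
import random


def simulate_sequence(num_steps, num_factors=3, intervention_prob=0.2, seed=0):
    """Toy temporal process with observed binary intervention targets.

    Hypothetical illustration only: the dynamics and all parameters are
    assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Initial values of the latent causal factors.
    factors = [rng.gauss(0.0, 1.0) for _ in range(num_factors)]
    sequence = []
    for _ in range(num_steps):
        # One binary target per factor: 1 means the factor is intervened upon.
        targets = [
            1 if rng.random() < intervention_prob else 0
            for _ in range(num_factors)
        ]
        new_factors = []
        for i in range(num_factors):
            if targets[i]:
                # Intervened: the value is drawn independently of the past.
                value = rng.gauss(0.0, 1.0)
            else:
                # Not intervened: depends on the previous time step only
                # (own previous value plus one assumed parent factor).
                parent = factors[(i - 1) % num_factors]
                value = 0.8 * factors[i] + 0.2 * parent + rng.gauss(0.0, 0.1)
            new_factors.append(value)
        factors = new_factors
        sequence.append((list(factors), targets))
    return sequence


if __name__ == "__main__":
    for step, (values, targets) in enumerate(simulate_sequence(5)):
        print(step, [round(v, 3) for v in values], targets)
```

In the setting the abstract describes, a model would see only images generated from the factor values together with the targets, never the factor values themselves; the sketch exposes them solely to show the structure of the sequence.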



iCITRIS: Causal Representation Learning for Instantaneous Temporal Effects

Causal representation learning is the task of identifying the underlying...

CausalVAE: Structured Causal Disentanglement in Variational Autoencoder

Learning disentanglement aims at finding a low dimensional representatio...

Neuro-Causal Factor Analysis

Factor analysis (FA) is a statistical tool for studying how observed var...

Weakly supervised causal representation learning

Learning high-level causal representations together with a causal model ...

Disentanglement of Latent Representations via Sparse Causal Interventions

The process of generating data such as images is controlled by independe...

Variational Causal Dynamics: Discovering Modular World Models from Interventions

Latent world models allow agents to reason about complex environments wi...

CIParsing: Unifying Causality Properties into Multiple Human Parsing

Existing methods of multiple human parsing (MHP) apply statistical model...
