ModeRNN: Harnessing Spatiotemporal Mode Collapse in Unsupervised Predictive Learning

10/08/2021
by Zhiyu Yao, et al.

Learning predictive models for unlabeled spatiotemporal data is challenging in part because visual dynamics can be highly entangled in real scenes, which makes existing approaches prone to overfitting some modes of the underlying physical processes while failing to reason about others. We name this phenomenon spatiotemporal mode collapse and explore it for the first time in predictive learning. The key to addressing it is to give the model a strong inductive bias for discovering the compositional structures of latent modes. To this end, we propose ModeRNN, which introduces a novel method for learning structured hidden representations between recurrent states. The core idea is to first extract the various components of visual dynamics using a set of spatiotemporal slots with independent parameters. Because multiple space-time patterns may co-exist in a sequence, we then use learnable importance weights to adaptively aggregate the slot features into a unified hidden representation, which is used to update the recurrent states. Across the entire dataset, different modes produce different responses over the mixture of slots, which strengthens the ability of ModeRNN to build structured representations and thus prevents mode collapse. Unlike existing models, ModeRNN is shown to prevent spatiotemporal mode collapse and to benefit further from learning mixed visual dynamics.
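The slot-and-aggregate idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the layer types, the scoring function, or the state update rule, so everything below is an assumption made for illustration. In particular, the sketch uses flat vectors instead of spatiotemporal feature maps, a softmax over scores from a learned projection as the "importance weights", and a simple residual tanh update for the recurrent state. The names `init_params`, `slot_mixture_step`, `W_slots`, `W_score`, and `W_update` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(rng, num_slots, input_dim, hidden_dim):
    """Create hypothetical parameters: one independent weight matrix per slot."""
    concat_dim = input_dim + hidden_dim
    return {
        # Independent parameters for each of the K slots (assumed linear maps).
        "W_slots": rng.normal(0.0, 0.1, (num_slots, concat_dim, hidden_dim)),
        # Learned projection that scores each slot feature (assumed form).
        "W_score": rng.normal(0.0, 0.1, (hidden_dim, 1)),
        # Maps the aggregated representation into the state update (assumed).
        "W_update": rng.normal(0.0, 0.1, (hidden_dim, hidden_dim)),
    }


def slot_mixture_step(params, x_t, h_prev):
    """One recurrent step: extract slot features, weight them, update the state.

    x_t:    (batch, input_dim) current input
    h_prev: (batch, hidden_dim) previous recurrent state
    Returns the next state and the per-slot importance weights.
    """
    z = np.concatenate([x_t, h_prev], axis=-1)  # (batch, input_dim + hidden_dim)

    # Each slot applies its own parameters to the same input/state pair.
    slots = np.tanh(np.einsum("bd,kdh->bkh", z, params["W_slots"]))  # (batch, K, hidden)

    # Importance weights: one score per slot, normalized across slots.
    scores = (slots @ params["W_score"])[..., 0]  # (batch, K)
    weights = softmax(scores, axis=-1)            # (batch, K), rows sum to 1

    # Aggregate slot features into a single hidden representation.
    mixed = np.einsum("bk,bkh->bh", weights, slots)  # (batch, hidden)

    # Use the aggregated representation to update the recurrent state.
    h_next = np.tanh(h_prev + mixed @ params["W_update"])
    return h_next, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng, num_slots=4, input_dim=3, hidden_dim=5)
    h = np.zeros((2, 5))
    for _ in range(3):  # unroll a few steps on random inputs
        h, w = slot_mixture_step(params, rng.normal(size=(2, 3)), h)
    print(h.shape, w.shape)
```

Under these assumptions, inputs that follow different dynamics would yield different weight vectors over the same set of slots, which is the sense in which the abstract describes modes as different responses on mixtures of slots.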


