Predicting the Future with Transformational States

03/26/2018
by Andrew Jaegle, et al.

An intelligent observer looks at the world and sees not only what is, but what is moving and what can be moved. In other words, the observer sees how the present state of the world can transform in the future. We propose a model that predicts future images by learning to represent the present state and its transformation, given only a sequence of images. To do so, we introduce an architecture with a latent state composed of two components designed to capture (i) the present image state and (ii) the transformation between present and future states, respectively. We couple this latent state with a recurrent neural network (RNN) core that predicts future frames by applying the accumulated state transformation to past states with a learned operator. We describe how this model can be integrated into an encoder-decoder convolutional neural network (CNN) architecture that uses weighted residual connections to combine representations of the past with representations of the future. Qualitatively, our approach generates image sequences that are stable and capture realistic motion over multiple predicted frames, without requiring adversarial training. Quantitatively, our method achieves prediction results comparable to the state of the art on standard image prediction benchmarks (Moving MNIST, KTH, and UCF101).
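The core idea (a latent state split into a present-state component and a transformation component, with a recurrent core that accumulates the transformation and a learned operator that rolls the state forward) can be sketched in a few lines. This is a minimal, illustrative toy: the random linear maps, the additive "operator", and the vector-valued "frames" below are all stand-ins I have chosen for clarity, not the paper's actual learned CNN encoder-decoder or RNN core.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # latent dimensionality (illustrative only)

# Random linear maps stand in for the learned CNN encoder/decoder and the
# RNN core; in the actual model these are trained end-to-end on image sequences.
W_state = rng.normal(scale=0.1, size=(D, D))  # encoder head: present-state component
W_trans = rng.normal(scale=0.1, size=(D, D))  # encoder head: transformation component
W_rnn   = rng.normal(scale=0.1, size=(D, D))  # recurrent core over transformations
W_dec   = rng.normal(scale=0.1, size=(D, D))  # decoder back to "frame" space

def encode(frame):
    """Split a frame embedding into (state, transformation) components."""
    return W_state @ frame, W_trans @ frame

def predict_future(frames, n_future=3):
    """Accumulate the transformation over past frames, then apply it
    repeatedly to roll the state forward and decode predicted frames."""
    state, trans = encode(frames[0])
    for frame in frames[1:]:
        s, t = encode(frame)
        state = s                           # state component tracks the newest frame
        trans = np.tanh(W_rnn @ trans + t)  # recurrent core accumulates the transformation
    preds = []
    for _ in range(n_future):
        state = np.tanh(state + trans)      # a simple additive stand-in for the learned operator
        preds.append(W_dec @ state)         # decode the predicted latent to a "frame"
    return preds

past = [rng.normal(size=D) for _ in range(4)]  # four toy "frames" as vectors
future = predict_future(past, n_future=3)
print(len(future), future[0].shape)            # 3 (8,)
```

Note the division of labor: the transformation component is updated recurrently across the whole observed sequence, while the state component simply tracks the latest observation; prediction then reuses the same accumulated transformation at every future step.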


research
04/01/2020

Future Video Synthesis with Object Motion Prediction

We present an approach to predict future video frames given a sequence o...
research
10/04/2022

Enhancing Spatiotemporal Prediction Model using Modular Design and Beyond

Predictive learning uses a known state to generate a future state over a...
research
03/04/2018

Egocentric Basketball Motion Planning from a Single First-Person Image

We present a model that uses a single first-person image to generate an ...
research
10/25/2021

MoDeRNN: Towards Fine-grained Motion Details for Spatiotemporal Predictive Learning

Spatiotemporal predictive learning (ST-PL) aims at predicting the subseq...
research
11/09/2022

Trackerless freehand ultrasound with sequence modelling and auxiliary transformation over past and future frames

Three-dimensional (3D) freehand ultrasound (US) reconstruction without a...
research
06/13/2019

IntrinSeqNet: Learning to Estimate the Reflectance from Varying Illumination

Intrinsic image decomposition describes an image based on its reflectanc...
research
02/09/2021

Robust Motion In-betweening

In this work we present a novel, robust transition generation technique ...
