Action-conditioned Benchmarking of Robotic Video Prediction Models: a Comparative Study

10/07/2019
by   Manuel Serra Nunes, et al.
0

A defining characteristic of intelligent systems is the ability to make action decisions based on the anticipated outcomes. Video prediction systems have been demonstrated as a solution for predicting how the future will unfold visually, and thus, many models have been proposed that are capable of predicting future frames based on a history of observed frames (and sometimes robot actions). However, a comprehensive method for determining the fitness of different video prediction models at guiding the selection of actions is yet to be developed. Current metrics assess video prediction models based on human perception of frame quality. In contrast, we argue that if these systems are to be used to guide action, necessarily, the actions the robot performs should be encoded in the predicted frames. In this paper, we are proposing a new metric to compare different video prediction models based on this argument. More specifically, we propose an action inference system and quantitatively rank different models based on how well we can infer the robot actions from the predicted frames. Our extensive experiments show that models with high perceptual scores can perform poorly in the proposed action inference tests and thus, may not be suitable options to be used in robot planning systems.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/21/2023

Combining Vision and Tactile Sensation for Video Prediction

In this paper, we explore the impact of adding tactile sensation to vide...
research
03/17/2022

Video Prediction at Multiple Scales with Hierarchical Recurrent Networks

Autonomous systems not only need to understand their current environment...
research
06/24/2021

FitVid: Overfitting in Pixel-Level Video Prediction

An agent that is capable of predicting what happens next can perform a v...
research
05/01/2020

A Naturalness Evaluation Database for Video Prediction Models

The study of video prediction models is believed to be a fundamental app...
research
11/17/2020

Mutual Information Based Method for Unsupervised Disentanglement of Video Representation

Video Prediction is an interesting and challenging task of predicting fu...
research
05/17/2021

Adaptive Video Encoding For Different Video Codecs

By 2022, we expect video traffic to reach 82 Undoubtedly, the abundance ...

Please sign up or login with your details

Forgot password? Click here to reset