Semantic Prediction: Which One Should Come First, Recognition or Prediction?

10/06/2021
by Hafez Farazi, et al.

The ultimate goal of video prediction is not to forecast future pixel values from previous frames. Rather, it is to discover valuable internal representations from the vast amount of available unlabeled video data, in a self-supervised fashion, for use in downstream tasks. One of the primary downstream tasks is interpreting a scene's semantic composition and using it for decision-making. For example, by predicting human movements, an observer can anticipate human activities and collaborate in a shared workspace. Given a pre-trained video prediction model and a pre-trained semantic extraction model, there are two main ways to obtain predicted semantics: first predict future frames and then extract semantics from them, or first extract semantics from the observed frames and then predict future semantics. We investigate these two configurations on synthetic and real datasets, using the Local Frequency Domain Transformer Network (LFDTN) as the video prediction model and U-Net as the semantic extraction model.
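The two orderings can be illustrated with a minimal sketch. It does not use LFDTN or U-Net: `toy_predict` (linear extrapolation of the last two frames) and `toy_segment` (per-pixel thresholding) are hypothetical stand-ins chosen only to keep the example short, and `moving_square` is an assumed synthetic input. The point is that the two compositions are different functions and need not give the same result.

```python
import numpy as np


def toy_predict(frames):
    """Hypothetical stand-in for a video predictor (NOT LFDTN):
    linearly extrapolates the next frame from the last two."""
    return 2.0 * frames[-1] - frames[-2]


def toy_segment(frame, threshold=0.5):
    """Hypothetical stand-in for a semantic extractor (NOT U-Net):
    marks every pixel above the threshold as foreground."""
    return (frame > threshold).astype(np.float64)


def predict_then_segment(frames):
    # Configuration 1: predict the next frame, then extract semantics from it.
    return toy_segment(toy_predict(frames))


def segment_then_predict(frames):
    # Configuration 2: extract semantics per frame, then predict the next mask.
    masks = np.stack([toy_segment(f) for f in frames])
    return toy_predict(masks)


def moving_square(num_frames=3, size=8, side=2):
    """Assumed synthetic input: a bright square moving one pixel right per frame."""
    frames = np.zeros((num_frames, size, size))
    for t in range(num_frames):
        frames[t, 3:3 + side, t:t + side] = 1.0
    return frames


frames = moving_square()
semantics_a = predict_then_segment(frames)
semantics_b = segment_then_predict(frames)
print("identical outputs:", np.array_equal(semantics_a, semantics_b))
```

With these stand-ins the first configuration always yields a binary mask, while the second extrapolates masks directly and can leave the valid label range, which is one simple way the order of the two stages can matter.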

research
10/04/2020

PTUM: Pre-training User Model from Unlabeled User Behaviors via Self-supervision

User modeling is critical for many personalized web services. Many exist...
research
12/13/2020

InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees

Building deep learning models on source code has found many successful s...
research
07/12/2020

TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech

We introduce a self-supervised speech pre-training method called TERA, w...
research
03/01/2019

Frequency Domain Transformer Networks for Video Prediction

The task of video prediction is forecasting the next frames given some p...
research
05/31/2023

Representation Reliability and Its Impact on Downstream Tasks

Self-supervised pre-trained models extract general-purpose representatio...
research
11/22/2020

Video SemNet: Memory-Augmented Video Semantic Network

Stories are a very compelling medium to convey ideas, experiences, socia...
research
06/28/2018

A Computational Theory for Life-Long Learning of Semantics

Semantic vectors are learned from data to express semantic relationships...
