Time-Aware and View-Aware Video Rendering for Unsupervised Representation Learning

11/26/2018
by Shruti Vyas, et al.

The recent success of deep learning has led to various effective representation learning methods for videos. However, current approaches to video representation require large amounts of human-labeled data for effective learning. We present an unsupervised representation learning framework that encodes scene dynamics in videos captured from multiple viewpoints. The proposed framework has two main components: a Representation Learning Network (RL-NET), which learns a representation with the help of a Blending Network (BL-NET), and a Video Rendering Network (VR-NET), which is used for video synthesis. The framework takes as input video clips from different viewpoints and times, learns an internal representation, and uses this representation to render a video clip from an arbitrary given viewpoint and time. The ability of the proposed network to render video frames from arbitrary viewpoints and times enables it to learn a meaningful and robust representation of the scene dynamics. We demonstrate the effectiveness of the proposed method in rendering view-aware as well as time-aware video clips on two real-world datasets, UCF-101 and NTU-RGB+D. To further validate the learned representation, we use it for view-invariant activity classification, where we observe a significant improvement (~26%) in performance on the NTU-RGB+D dataset compared to existing state-of-the-art methods.
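The data flow described in the abstract (encode each input clip, blend the per-clip features, then render a clip at a query viewpoint and time) can be illustrated with a toy sketch. This is not the authors' implementation: the `Clip` type and the functions `encode`, `blend`, and `render` are hypothetical stand-ins for RL-NET, BL-NET, and VR-NET, and simple arithmetic replaces the neural networks, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clip:
    """Hypothetical container for one input clip."""
    frames: List[float]  # one scalar per frame, standing in for pixel data
    view: float          # assumed scalar viewpoint (e.g. camera angle)
    time: float          # assumed scalar timestamp of the clip


def encode(clip: Clip) -> List[float]:
    """Stand-in for RL-NET: map a clip to a feature tagged with its view and time."""
    mean_intensity = sum(clip.frames) / len(clip.frames)
    return [mean_intensity, clip.view, clip.time]


def blend(features: List[List[float]]) -> List[float]:
    """Stand-in for BL-NET: merge per-clip features (here, an element-wise mean)."""
    count = len(features)
    width = len(features[0])
    return [sum(f[i] for f in features) / count for i in range(width)]


def render(representation: List[float], query_view: float,
           query_time: float, num_frames: int) -> List[float]:
    """Stand-in for VR-NET: produce frames for the requested view and time."""
    content, ref_view, ref_time = representation
    # Toy conditioning: shift the content by the distance to the query.
    offset = (query_view - ref_view) + (query_time - ref_time)
    return [content + offset for _ in range(num_frames)]


# Two clips of the same scene from different viewpoints and times.
clips = [Clip([1.0, 3.0], view=0.0, time=0.0),
         Clip([3.0, 5.0], view=2.0, time=2.0)]
representation = blend([encode(c) for c in clips])
rendered = render(representation, query_view=1.0, query_time=1.0, num_frames=3)
```

In an unsupervised setup of this kind, the rendered clip would be compared against a real clip recorded at the query viewpoint and time, so no human labels are needed; the sketch covers only the forward pass.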


research
06/07/2021

Novel View Video Prediction Using a Dual Representation

We address the problem of novel view video prediction; given a set of in...
research
09/06/2018

Unsupervised Learning of View-invariant Action Representations

The recent success in human action recognition with deep learning method...
research
01/07/2017

Unsupervised Learning of Long-Term Motion Dynamics for Videos

We present an unsupervised representation learning approach that compact...
research
08/14/2023

Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

Due to the flexible representation of arbitrary-shaped scene text and si...
research
04/10/2018

DeepQoE: A unified Framework for Learning to Predict Video QoE

Motivated by the prowess of deep learning (DL) based techniques in predi...
research
07/12/2019

AVD: Adversarial Video Distillation

In this paper, we present a simple yet efficient approach for video repr...
research
09/11/2020

SA-Net: A deep spectral analysis network for image clustering

Although supervised deep representation learning has attracted enormous ...
