Two-Stream Oriented Video Super-Resolution for Action Recognition

03/13/2019
by   Haochen Zhang, et al.
0

We study the video super-resolution (SR) problem not for visual quality, but for facilitating video analytics tasks, e.g. action recognition. The popular action recognition methods based on convolutional networks, exemplified by two-stream networks, are not directly applicable on videos of different spatial resolutions. This can be remedied by performing video SR prior to recognition, which motivates us to improve the SR procedure for recognition accuracy. Tailored for two-stream action recognition networks, we propose two video SR methods for the spatial and temporal streams respectively. On the one hand, we observe that the added details by image SR methods can be either helpful or harmful for recognition, and we propose an optical-flow guided weighted mean-squared-error loss for our spatial-oriented SR (SoSR) network. On the other hand, we observe that existing video SR methods incur temporal discontinuity between frames, which also worsens the recognition accuracy, and we propose a siamese network for our temporal-oriented SR (ToSR) that emphasizes the temporal continuity between consecutive frames. We perform experiments using two state-of-the-art action recognition networks and two well-known datasets--UCF101 and HMDB51. Results demonstrate the effectiveness of our proposed SoSR and ToSR in improving recognition accuracy.

READ FULL TEXT

page 3

page 4

page 5

page 8

research
01/06/2020

Deep Video Super-Resolution using HR Optical Flow Estimation

Video super-resolution (SR) aims at generating a sequence of high-resolu...
research
03/13/2020

Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?

Recent advances of deep learning lead to great success of image and vide...
research
11/22/2020

Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition

A key challenge in video enhancement and action recognition is to fuse u...
research
10/12/2016

Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions

Deep convolutional neural networks (ConvNets) have been recently shown t...
research
04/14/2022

Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling

Temporal modeling is crucial for video super-resolution. Most of the vid...
research
01/11/2018

Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition

A major emerging challenge is how to protect people's privacy as cameras...
research
07/06/2023

RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution

Smartphones equipped with a multi-camera system comprising multiple came...

Please sign up or login with your details

Forgot password? Click here to reset