Fast Retinomorphic Event Stream for Video Recognition and Reinforcement Learning

05/16/2018
by   Wanjia Liu, et al.

Good temporal representations are crucial for video understanding, and the state-of-the-art video recognition framework is based on two-stream networks. In such a framework, besides the regular ConvNets responsible for RGB frame inputs, a second network is introduced to handle the temporal representation, usually optical flow (OF). However, OF and other task-oriented flows are computationally costly and are thus typically pre-computed. Critically, this prevents the two-stream approach from being applied to reinforcement learning (RL) applications such as video game playing, where the next state depends on the current state and action choices. Inspired by the early vision systems of mammals and insects, we propose a fast event-driven representation (EDR) that models several major properties of early retinal circuits: (1) logarithmic input response, (2) multi-timescale temporal smoothing to filter noise, and (3) bipolar (ON/OFF) pathways for primitive event detection [12]. Trading directional information for speed (> 9000 fps), EDR enables fast real-time inference and learning in video applications that require interaction between an agent and the world, such as game playing, virtual robotics, and domain adaptation. In this vein, we use EDR to demonstrate performance improvements over state-of-the-art reinforcement learning algorithms for Atari games, something that has not been possible with pre-computed OF. Moreover, in UCF-101 video action recognition experiments, we show that EDR performs near the state of the art in accuracy while achieving a 1,500x speedup in input representation processing compared to optical flow.
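
The three properties listed in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the function name `edr_step`, the use of two exponential moving averages, and the parameters `alpha_fast`, `alpha_slow`, and `threshold` are all assumptions chosen only to show one plausible way the pieces could fit together. It is written in plain Python for readability; a practical version would be vectorized.

```python
import math

def edr_step(frame, state, alpha_fast=0.5, alpha_slow=0.1, threshold=0.05):
    """One hypothetical EDR-style update for a grayscale frame.

    frame: list of rows of non-negative pixel intensities.
    state: (fast, slow) running averages from the previous call, or None.
    Returns (on, off, new_state). All parameter values are illustrative.
    """
    # (1) Logarithmic input response (log1p keeps zero intensity at zero).
    log_frame = [[math.log1p(p) for p in row] for row in frame]

    if state is None:
        # Start both averages at the first frame, so no events fire yet.
        state = (log_frame, log_frame)
    fast, slow = state

    # (2) Multi-timescale smoothing: two exponential moving averages
    # with different update rates (an assumed form of the smoothing).
    new_fast = [[f + alpha_fast * (x - f) for f, x in zip(fr, xr)]
                for fr, xr in zip(fast, log_frame)]
    new_slow = [[s + alpha_slow * (x - s) for s, x in zip(sr, xr)]
                for sr, xr in zip(slow, log_frame)]

    # (3) Bipolar pathways: ON responds to brightening, OFF to darkening.
    # The threshold suppresses small fluctuations in either direction.
    on = [[max(f - s - threshold, 0.0) for f, s in zip(fr, sr)]
          for fr, sr in zip(new_fast, new_slow)]
    off = [[max(s - f - threshold, 0.0) for f, s in zip(fr, sr)]
           for fr, sr in zip(new_fast, new_slow)]
    return on, off, (new_fast, new_slow)
```

Calling `edr_step` once per frame and passing the returned state back in gives a streaming representation that needs no pre-computation, which is the property the abstract relies on for RL settings. Note that the ON and OFF maps carry only the sign and size of a brightness change, not its direction of motion, matching the trade-off described above.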

research
06/07/2022

TadML: A fast temporal action detection with Mechanics-MLP

Temporal Action Detection (TAD) is a crucial but challenging task in vide...
research
02/23/2018

Real-Time End-to-End Action Detection with Two-Stream Networks

Two-stream networks have been very successful for solving the problem of...
research
04/26/2016

Real-time Action Recognition with Enhanced Motion Vector CNNs

The deep two-stream architecture exhibited excellent performance on vide...
research
05/08/2018

FFNet: Video Fast-Forwarding via Reinforcement Learning

For many applications with limited computation, communication, storage a...
research
05/25/2019

Exploring Temporal Information for Improved Video Understanding

In this dissertation, I present my work towards exploring temporal infor...
research
12/29/2019

Speeding up reinforcement learning by combining attention and agency features

When playing video-games we immediately detect which entity we control a...
