Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks

02/02/2020
by Joonatan Mänttäri, et al.

A number of techniques for interpretability have been presented for deep learning in computer vision, typically with the goal of understanding what it is that the networks have actually learned underneath a given classification decision. However, when it comes to deep video architectures, interpretability is still in its infancy and we do not yet have a clear concept of how we should decode spatiotemporal features. In this paper, we present a study comparing how 3D convolutional networks and convolutional LSTM networks learn features across temporally dependent frames. This is the first comparison of two video models that both convolve to learn spatial features but that have principally different methods of modeling time. Additionally, we extend the concept of meaningful perturbation introduced by Fong and Vedaldi (2017) to the temporal dimension to search for the most meaningful part of a sequence for a classification decision.


