Excitation Backprop for RNNs

11/18/2017
by   Sarah Adel Bargal, et al.
0

Deep models are state-of-the-art for many vision tasks including video action recognition and video captioning. Models are trained to caption or classify activity in videos, but little is known about the evidence used to make such decisions. Grounding decisions made by deep networks has been studied in spatial visual content, giving more insight into model predictions for images. However, such studies are relatively lacking for models of spatiotemporal visual content - videos. In this work, we devise a formulation that simultaneously grounds evidence in space and time, in a single pass, using top-down saliency. We visualize the spatiotemporal cues that contribute to a deep model's classification/captioning output using the model's internal representation. Based on these spatiotemporal cues, we are able to localize segments within a video that correspond with a specific action, or phrase from a caption, without explicitly optimizing/training for these tasks.

READ FULL TEXT

page 1

page 4

page 5

page 6

page 8

research
03/27/2016

Recurrent Mixture Density Network for Spatiotemporal Visual Attention

In many computer vision tasks, the relevant information to solve the pro...
research
03/24/2015

Unsupervised Video Analysis Based on a Spatiotemporal Saliency Detector

Visual saliency, which predicts regions in the field of view that draw t...
research
06/06/2022

A Deeper Dive Into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic Information

Deep spatiotemporal models are used in a variety of computer vision task...
research
04/19/2021

What can human minimal videos tell us about dynamic recognition models?

In human vision objects and their parts can be visually recognized from ...
research
09/15/2019

Multitask Learning to Improve Egocentric Action Recognition

In this work we employ multitask learning to capitalize on the structure...
research
03/17/2023

Leaping Into Memories: Space-Time Deep Feature Synthesis

The success of deep learning models has led to their adaptation and adop...
research
11/03/2022

Quantifying and Learning Static vs. Dynamic Information in Deep Spatiotemporal Networks

There is limited understanding of the information captured by deep spati...

Please sign up or login with your details

Forgot password? Click here to reset