Leaping Into Memories: Space-Time Deep Feature Synthesis

03/17/2023
by Alexandros Stergiou, et al.

The success of deep learning models has led to their adaptation and adoption by prominent video understanding methods. The majority of these approaches encode features in a joint space-time modality for which the inner workings and learned representations are difficult to visually interpret. We propose LEArned Preconscious Synthesis (LEAPS), an architecture-agnostic method for synthesizing videos from the internal spatiotemporal representations of models. Using a stimulus video and a target class, we prime a fixed space-time model and iteratively optimize a video initialized with random noise. We incorporate additional regularizers to improve the feature diversity of the synthesized videos as well as the cross-frame temporal coherence of motions. We quantitatively and qualitatively evaluate the applicability of LEAPS by inverting a range of spatiotemporal convolutional and attention-based architectures trained on Kinetics-400, which to the best of our knowledge has not been previously accomplished.
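The abstract describes a classic model-inversion loop: freeze the space-time classifier, initialize a video with random noise, and iteratively optimize it toward a target class under regularizers for temporal coherence and feature diversity. Below is a minimal illustrative sketch in PyTorch, not the authors' implementation: the `leaps_sketch` function, its hyperparameters, and the specific coherence and diversity terms are simplifying assumptions standing in for the regularizers and stimulus-priming procedure defined in the paper.

```python
import torch
import torch.nn.functional as F

def leaps_sketch(model, stimulus, target_class, steps=2000, lr=0.05,
                 tv_weight=1e-4, div_weight=1e-3):
    """Illustrative LEAPS-style video synthesis by model inversion.

    Assumptions (not from the paper): `model` is a frozen space-time
    classifier mapping (B, C, T, H, W) clips to class logits, and
    `stimulus` is a single real clip (B=1) of the target class. The
    paper primes internal feature statistics with the stimulus; here
    its logits serve only as a crude stand-in.
    """
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)

    # Video initialized with random noise and optimized directly.
    video = torch.randn_like(stimulus, requires_grad=True)
    opt = torch.optim.Adam([video], lr=lr)

    with torch.no_grad():
        ref_logits = model(stimulus)

    target = torch.tensor([target_class])

    for _ in range(steps):
        opt.zero_grad()
        logits = model(video)

        # Class objective: drive the synthesized clip toward target_class.
        cls_loss = F.cross_entropy(logits, target)

        # Temporal-coherence stand-in: penalize abrupt frame-to-frame
        # changes along the time axis (dim 2 of B, C, T, H, W).
        tv = (video[:, :, 1:] - video[:, :, :-1]).abs().mean()

        # Diversity stand-in: discourage collapsing exactly onto the
        # stimulus response (illustrative placeholder only).
        div = -F.mse_loss(logits, ref_logits)

        loss = cls_loss + tv_weight * tv + div_weight * div
        loss.backward()
        opt.step()

    return video.detach()
```

A practical run would additionally clamp or renormalize `video` to the model's expected input range after each step; this sketch only shows the overall shape of the inversion loop the abstract alludes to.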

Related research

02/09/2021 · Is Space-Time Attention All You Need for Video Understanding?
We present a convolution-free approach to video classification built exc...

04/06/2020 · Deep Space-Time Video Upsampling Networks
Video super-resolution (VSR) and frame interpolation (FI) are traditiona...

07/11/2007 · The Trade-offs with Space Time Cube Representation of Spatiotemporal Patterns
Space time cube representation is an information visualization technique...

08/20/2020 · Causal Future Prediction in a Minkowski Space-Time
Estimating future events is a difficult task. Unlike humans, machine lea...

11/26/2018 · Evolving Space-Time Neural Architectures for Videos
In this paper, we present a new method for evolving video CNN models to ...

05/30/2019 · AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures
Learning to represent videos is a very challenging task both algorithmic...

11/18/2017 · Excitation Backprop for RNNs
Deep models are state-of-the-art for many vision tasks including video a...
