Memory-augmented Attention Modelling for Videos

11/07/2016
by Rasool Fakoor, et al.

We present a method to improve video description generation by modeling higher-order interactions between video frames and described concepts. By storing the past visual attention over the video associated with previously generated words, the system can decide what to look at and describe in light of what it has already looked at and described. This enables not only more effective local attention, but also tractable consideration of the video sequence while generating each word. Evaluation on the challenging and popular MSVD and Charades datasets demonstrates that the proposed architecture outperforms previous video description approaches without requiring external temporal video features.
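As a rough illustration of the idea in the abstract, the following is a minimal PyTorch sketch of frame attention conditioned on a memory of previously attended contexts paired with previously generated words. The class name, layer shapes, additive scoring, and the concatenated (context, word) memory slots are all illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryAugmentedAttention(nn.Module):
    """Sketch: attend over video frames while conditioning on a memory of
    (past attended context, past generated word) pairs. All names and
    dimensions here are assumptions for illustration."""

    def __init__(self, frame_dim, word_dim, hidden_dim):
        super().__init__()
        self.frame_key = nn.Linear(frame_dim, hidden_dim)
        self.mem_key = nn.Linear(frame_dim + word_dim, hidden_dim)
        self.mem_val = nn.Linear(frame_dim + word_dim, hidden_dim)
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def read_memory(self, h, memory):
        # Summarize what was already attended and described: a soft read
        # over the stored (context, word) slots, keyed by the decoder state.
        if memory is None:
            return h.new_zeros(h.size(0), self.query.out_features)
        keys = self.mem_key(memory)                                      # (B, S, H)
        logits = torch.bmm(keys, self.query(h).unsqueeze(2)).squeeze(2)  # (B, S)
        w = F.softmax(logits, dim=1)
        return torch.bmm(w.unsqueeze(1), self.mem_val(memory)).squeeze(1)

    def step(self, frames, h, memory):
        # frames: (B, T, frame_dim); h: decoder state (B, hidden_dim).
        m = self.read_memory(h, memory)              # summary of past attention
        q = torch.tanh(self.fuse(torch.cat([h, m], dim=1)))
        e = self.score(torch.tanh(self.frame_key(frames) + q.unsqueeze(1))).squeeze(2)
        alpha = F.softmax(e, dim=1)                  # where to look now, given the past
        context = torch.bmm(alpha.unsqueeze(1), frames).squeeze(1)  # (B, frame_dim)
        return context, alpha

    @staticmethod
    def write_memory(memory, context, word_emb):
        # Append the new (attended context, emitted word) pair as a memory slot.
        slot = torch.cat([context, word_emb], dim=1).unsqueeze(1)   # (B, 1, F+W)
        return slot if memory is None else torch.cat([memory, slot], dim=1)

# Toy usage with made-up sizes.
attn = MemoryAugmentedAttention(frame_dim=512, word_dim=256, hidden_dim=512)
frames = torch.randn(2, 30, 512)     # 30 encoded frames per video
h = torch.randn(2, 512)              # current decoder hidden state
memory = None
context, alpha = attn.step(frames, h, memory)
memory = attn.write_memory(memory, context, torch.randn(2, 256))
```

In this sketch the memory grows by one slot per generated word, so consulting the full history of past attention stays a single soft-attention pass per step rather than re-attending over the entire video for every previous word.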

Related research

05/03/2015 · Sequence to Sequence -- Video to Text
Real-world videos often have complex dynamics; and methods for generatin...

01/11/2017 · Attention-Based Multimodal Fusion for Video Description
Currently successful methods for video description are based on encoder-...

08/06/2020 · Fine-grained Iterative Attention Network for Temporal Language Localization in Videos
Temporal language localization in videos aims to ground one video segmen...

07/20/2019 · Order Matters: Shuffling Sequence Generation for Video Prediction
Predicting future frames in natural video sequences is a new challenge t...

08/13/2022 · Memory Efficient Temporal Visual Graph Model for Unsupervised Video Domain Adaptation
Existing video domain adaptation (DA) methods need to store all temporal c...

11/07/2019 · Diversified Co-Attention towards Informative Live Video Commenting
We focus on the task of Automatic Live Video Commenting (ALVC), which ai...

04/25/2023 · TCR: Short Video Title Generation and Cover Selection with Attention Refinement
With the widespread popularity of user-generated short videos, it become...
