Towards Long-Form Video Understanding

06/21/2021
by Chao-Yuan Wu, et al.

Our world offers a never-ending stream of visual stimuli, yet today's vision systems only accurately recognize patterns within a few seconds. These systems understand the present, but fail to contextualize it in past or future events. In this paper, we study long-form video understanding. We introduce a framework for modeling long-form videos and develop evaluation protocols on large-scale datasets. We show that existing state-of-the-art short-term models are limited for long-form tasks. A novel object-centric transformer-based video recognition architecture performs significantly better on 7 diverse tasks. It also outperforms comparable state-of-the-art on the AVA dataset.

12/12/2018

Long-Term Feature Banks for Detailed Video Understanding

To understand the world, we humans constantly need to relate the present...
01/20/2022

MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition

While today's video recognition systems parse snapshots or short clips a...
07/07/2021

Long Short-Term Transformer for Online Action Detection

In this paper, we present Long Short-term TRansformer (LSTR), a new temp...
12/04/2021

An Annotated Video Dataset for Computing Video Memorability

Using a collection of publicly available links to short form video clips...
12/02/2018

How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos

Understanding web instructional videos is an essential branch of video u...
05/09/2020

Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events

Along with the development of the modern smart city, human-centric video...
10/10/2019

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

Computer vision has undergone a dramatic revolution in performance, driv...