ZSTAD: Zero-Shot Temporal Activity Detection

03/12/2020
by   Lingling Zhang, et al.
12

An integral part of video analysis and surveillance is temporal activity detection, which means to simultaneously recognize and localize activities in long untrimmed videos. Currently, the most effective methods of temporal activity detection are based on deep learning, and they typically perform very well with large scale annotated videos for training. However, these methods are limited in real applications due to the unavailable videos about certain activity classes and the time-consuming data annotation. To solve this challenging problem, we propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected. We design an end-to-end deep network based on R-C3D as the architecture for this solution. The proposed network is optimized with an innovative loss function that considers the embeddings of activity labels and their super-classes while learning the common semantics of seen and unseen activities. Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.

READ FULL TEXT

page 2

page 6

research
03/31/2020

Revisiting Few-shot Activity Detection with Class Similarity Control

Many interesting events in the real world are rare making preannotated m...
research
03/23/2021

Incrementally Zero-Shot Detection by an Extreme Value Analyzer

Human beings not only have the ability to recognize novel unseen classes...
research
12/25/2018

Similarity R-C3D for Few-shot Temporal Activity Detection

Many activities of interest are rare events, with only a few labeled exa...
research
11/07/2018

Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning

Although promising results have been achieved in video captioning, exist...
research
03/08/2017

A Pursuit of Temporal Accuracy in General Activity Detection

Detecting activities in untrimmed videos is an important but challenging...
research
11/30/2017

Budget-Aware Activity Detection with A Recurrent Policy Network

In this paper, we address the challenging problem of effi- cient tempora...
research
05/26/2023

Discovering Novel Actions in an Open World with Object-Grounded Visual Commonsense Reasoning

Learning to infer labels in an open world, i.e., in an environment where...

Please sign up or login with your details

Forgot password? Click here to reset