Elaborative Rehearsal for Zero-shot Action Recognition

08/05/2021
by   Shizhe Chen, et al.

The growing number of action classes poses a new challenge for video understanding, making Zero-Shot Action Recognition (ZSAR) a thriving research direction. The ZSAR task aims to recognize target (unseen) actions without any training examples by using semantic representations to bridge seen and unseen actions. However, because actions are complex and diverse, it remains challenging to represent action classes semantically and to transfer knowledge from seen data. In this work, we propose an ER-enhanced ZSAR model inspired by Elaborative Rehearsal (ER), an effective human memory technique that involves elaborating a new concept and relating it to known concepts. Specifically, we expand each action class into an Elaborative Description (ED) sentence, which is more discriminative than a class name and less costly than manually defined attributes. Besides directly aligning class semantics with videos, we incorporate objects from the video as Elaborative Concepts (EC) to improve video semantics and generalization from seen to unseen actions. Our ER-enhanced ZSAR model achieves state-of-the-art results on three existing benchmarks. Moreover, we propose a new ZSAR evaluation protocol on the Kinetics dataset to overcome the limitations of current benchmarks, and we demonstrate the first case where ZSAR performance is comparable to few-shot learning baselines in this more realistic setting. We will release our code and collected EDs at https://github.com/DeLightCMU/ElaborativeRehearsal.
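
The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that some encoder has already mapped each ED sentence, each video, and the objects recognized in each video into one shared embedding space, and it fuses the two similarity scores with a fixed weight. The function names, the `alpha` weight, and the toy vectors are all hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so that dot products are cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_scores(video_emb, object_emb, class_emb, alpha=0.5):
    """Score one video against every class description.

    video_emb:  (d,)   embedding of the video itself (assumed given)
    object_emb: (d,)   embedding of objects found in the video (assumed given)
    class_emb:  (C, d) one embedding per class description sentence
    alpha:      hypothetical weight between the video and object scores
    """
    classes = l2_normalize(class_emb)
    s_video = classes @ l2_normalize(video_emb)    # (C,) cosine similarities
    s_object = classes @ l2_normalize(object_emb)  # (C,) cosine similarities
    return alpha * s_video + (1.0 - alpha) * s_object


def predict(video_emb, object_emb, class_emb, class_names, alpha=0.5):
    """Return the name of the highest-scoring class; no class needs training videos."""
    scores = zero_shot_scores(video_emb, object_emb, class_emb, alpha)
    return class_names[int(np.argmax(scores))]


if __name__ == "__main__":
    # Toy 3-dimensional embeddings, invented purely for illustration.
    names = ["class_a", "class_b", "class_c"]
    class_emb = np.eye(3)
    video_emb = np.array([0.6, 0.5, 0.0])   # ambiguous between class_a and class_b
    object_emb = np.array([0.0, 1.0, 0.0])  # objects point towards class_b
    print(predict(video_emb, object_emb, class_emb, names, alpha=1.0))
    print(predict(video_emb, object_emb, class_emb, names, alpha=0.5))
```

In the toy example the video embedding alone is ambiguous, and the object embedding tips the decision, which is the intuition behind adding object concepts to the video semantics. A learned fusion would likely replace the fixed `alpha` in practice.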

