Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

03/03/2020
by   Biagio Brattoli, et al.

Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state of the art by a wide margin. Our code, evaluation procedure and model weights are available at github.com/bbrattoli/ZeroShotVideoClassification.
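
The abstract does not spell out how a trained model assigns a video to a class it has never seen. One common ZSL setup, assumed here purely for illustration, maps both the video and the class names into a shared embedding space and picks the nearest class. The sketch below shows only that inference step. The vectors, class names, and function names are hypothetical toy values, not taken from the paper or its repository; in practice the video embedding would come from the trained 3D CNN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(video_embedding, class_embeddings):
    """Return the class name whose embedding is most similar to the video's.

    Assumption: videos and class names live in the same embedding space.
    """
    return max(
        class_embeddings,
        key=lambda name: cosine_similarity(video_embedding, class_embeddings[name]),
    )


# Hypothetical embeddings for classes that were absent during training.
unseen_classes = {
    "archery": [0.9, 0.1, 0.0],
    "surfing": [0.0, 0.2, 0.9],
    "juggling": [0.1, 0.9, 0.1],
}

# Hypothetical output of the visual model for one test video.
video_embedding = [0.1, 0.3, 0.8]

predicted = zero_shot_classify(video_embedding, unseen_classes)
print(predicted)
```

Because the class set is just a dictionary passed in at inference time, it can be swapped for a different test dataset without retraining, which is the property the abstract emphasizes.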

research
03/07/2022

Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language

Learning to classify video data from classes not included in the trainin...
research
07/20/2022

Temporal and cross-modal attention for audio-visual zero-shot learning

Audio-visual generalised zero-shot learning for video classification req...
research
03/10/2022

Zero-Shot Action Recognition with Transformer-based Video Semantic Embedding

While video action recognition has been an active area of research for s...
research
08/21/2023

Image-free Classifier Injection for Zero-Shot Classification

Zero-shot learning models achieve remarkable results on image classifica...
research
03/29/2022

Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification

Most methods tackle zero-shot video classification by aligning visual-se...
research
06/27/2019

Few-Shot Video Classification via Temporal Alignment

There is a growing interest in learning a model which could recognize no...
research
02/15/2021

One Line To Rule Them All: Generating LO-Shot Soft-Label Prototypes

Increasingly large datasets are rapidly driving up the computational cos...
