
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

03/03/2020
by   Biagio Brattoli, et al.
Amazon
University of Heidelberg
California Institute of Technology

Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem: a model is trained once and then generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state of the art by a wide margin. Our code, evaluation procedure and model weights are available at github.com/bbrattoli/ZeroShotVideoClassification.
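
The abstract does not spell out how a trained model assigns a label from classes it never saw. One common ZSL recipe, assumed here purely for illustration and not confirmed by the text above, is to map each video into a semantic embedding space and pick the test class whose embedding is nearest. In the minimal sketch below, the function names, the three-dimensional vectors and the class names are all hypothetical; in practice the video embedding would come from the trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_predict(video_embedding, class_embeddings):
    """Return the class whose semantic embedding is closest to the video.

    class_embeddings maps class names (unseen during training) to vectors
    in the same space as video_embedding.
    """
    return max(
        class_embeddings,
        key=lambda name: cosine_similarity(video_embedding, class_embeddings[name]),
    )


# Toy, made-up embeddings for three test classes (assumption, not real data).
test_classes = {
    "archery": [0.9, 0.1, 0.0],
    "surfing": [0.0, 0.8, 0.6],
    "juggling": [0.1, 0.2, 0.9],
}

# Hypothetical output of the video network for one test clip.
video_embedding = [0.1, 0.7, 0.5]

prediction = zero_shot_predict(video_embedding, test_classes)
print(prediction)
```

Because the classifier is only a nearest-neighbor lookup over class embeddings, switching to a new test dataset means swapping the `test_classes` dictionary, with no retraining of the video model.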



Code Repositories

ZeroShotVideoClassification

Zero-shot video classification by end-to-end training of 3D convolutional neural networks

