VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

09/28/2021
by Hu Xu, et al.

We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot video and text understanding, without using any labels on downstream tasks. VideoCLIP trains a transformer for video and text by contrasting temporally overlapping positive video-text pairs with hard negatives from nearest neighbor retrieval. Our experiments on a diverse series of downstream tasks, including sequence-level text-video retrieval, VideoQA, token-level action localization, and action segmentation, show state-of-the-art performance that surpasses prior work and in some cases even outperforms supervised approaches. Code is made available at https://github.com/pytorch/fairseq/tree/main/examples/MMPT.
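
To make the contrastive objective concrete, the following is a minimal sketch of a generic symmetric InfoNCE loss over a batch of paired video and text embeddings. It is not the authors' implementation: the function names, the L2 normalization, and the temperature value of 0.07 are all assumptions chosen for illustration, and plain in-batch negatives stand in for the retrieval-mined hard negatives described in the abstract.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm (an assumption, not taken from the paper)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def log_softmax_at(scores, target):
    """Log-probability of index `target` under a softmax over `scores`."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_z


def symmetric_info_nce(video_emb, text_emb, temperature=0.07):
    """Hypothetical helper: video-to-text plus text-to-video InfoNCE.

    Row i of `video_emb` and row i of `text_emb` are treated as a positive
    pair; every other row in the batch acts as a negative.
    """
    video = [normalize(v) for v in video_emb]
    text = [normalize(t) for t in text_emb]
    n = len(video)
    # sim[i][j]: scaled similarity between video i and text j
    sim = [[dot(v, t) / temperature for t in text] for v in video]
    # Video-to-text: each video should pick out its own text.
    v2t = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Text-to-video: each text should pick out its own video.
    t2v = -sum(
        log_softmax_at([sim[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return v2t + t2v


# Toy batch: matched pairs should score a much lower loss than mismatched ones.
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = symmetric_info_nce(aligned, aligned)
loss_shuffled = symmetric_info_nce(aligned, shuffled)
```

In this sketch the choice of negatives is determined entirely by which examples share a batch, so a hard-negative strategy would amount to building batches from clips that are close to each other under some retrieval index.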

research
03/24/2022

FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks

Large-scale pretrained image-text models have shown incredible zero-shot...
research
07/03/2022

Can Language Understand Depth?

Besides image classification, Contrastive Language-Image Pre-training (C...
research
04/21/2023

Contrastive Language, Action, and State Pre-training for Robot Learning

In this paper, we introduce a method for unifying language, action, and ...
research
08/23/2021

TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment

Contrastive learning has been widely used to train transformer-based vis...
research
04/13/2023

Verbs in Action: Improving verb understanding in video-language models

Understanding verbs is crucial to modelling how people and objects inter...
research
04/12/2023

CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks

Contrastive Language-Image Pre-training (CLIP) is a powerful multimodal ...
research
03/20/2021

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

Action segmentation refers to inferring boundaries of semantically consi...