Self-Supervised Video Similarity Learning

04/06/2023
by Giorgos Kordopatis-Zilos, et al.

We introduce S^2VS, a video similarity learning approach with self-supervision. Self-Supervised Learning (SSL) is typically used to train deep models on a proxy task so that they transfer well to target tasks after fine-tuning. Here, in contrast to prior work, SSL is used to perform video similarity learning directly and to address multiple retrieval and detection tasks at once, without any labeled data. This is achieved by learning via instance discrimination with task-tailored augmentations, using the widely used InfoNCE loss together with an additional loss that operates jointly on self-similarity and hard-negative similarity. We benchmark our method on tasks where video relevance is defined at varying granularity, ranging from video copies to videos depicting the same incident or event. We learn a single universal model that achieves state-of-the-art performance on all tasks, surpassing previously proposed methods that use labeled data. The code and pretrained models are publicly available at: <https://github.com/gkordo/s2vs>
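To make the instance-discrimination objective concrete, the following is a minimal sketch of a standard InfoNCE loss in plain Python. It is not the authors' implementation: it assumes each video is summarized by a single embedding vector, that similarity is cosine similarity, and that the temperature is 0.07, none of which is stated in the abstract. The function names (`l2_normalize`, `cosine_similarity_matrix`, `info_nce`) are hypothetical. The additional loss on self-similarity and hard-negative similarity is not defined in the abstract, so it is left out here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity_matrix(a, b):
    """Return the matrix of cosine similarities between rows of a and rows of b."""
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def info_nce(view_a, view_b, temperature=0.07):
    """Mean InfoNCE loss over a batch.

    Row i of view_a and row i of view_b are assumed to be embeddings of two
    augmentations of the same video (the positive pair); every other row of
    view_b acts as a negative for row i of view_a.
    """
    sims = cosine_similarity_matrix(view_a, view_b)
    losses = []
    for i, row in enumerate(sims):
        logits = [s / temperature for s in row]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the positive among all candidates.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two "videos" with 2-D embeddings (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(anchors, [[0.9, 0.1], [0.1, 0.9]])   # positives match their anchors
shuffled = info_nce(anchors, [[0.1, 0.9], [0.9, 0.1]])  # positives are swapped
print(aligned, shuffled)
```

In this toy batch the loss is small when each augmented view lies closest to its own source and large when the views are swapped, which is the behavior instance discrimination relies on.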


