3D-CSL: self-supervised 3D context similarity learning for Near-Duplicate Video Retrieval

11/10/2022
by Rui Deng, et al.

In this paper, we introduce 3D-CSL, a compact pipeline for Near-Duplicate Video Retrieval (NDVR), and explore a novel self-supervised learning strategy for video similarity learning. Most previous methods extract spatial features from each frame separately and then design various complex mechanisms to learn the temporal correlations among the frame features. However, by that point part of the spatiotemporal dependencies has already been lost. To address this, 3D-CSL extracts global spatiotemporal dependencies in videos end-to-end with a 3D transformer and finds a good balance between efficiency and effectiveness by matching at the clip level. Furthermore, we propose a two-stage self-supervised similarity learning strategy to optimize the entire network. First, we propose PredMAE to pretrain the 3D transformer with a video prediction task. Second, we propose ShotMix, a novel video-specific augmentation, and FCS loss, a novel triplet loss, to further improve the similarity learning results. Experiments on FIVR-200K and CC_WEB_VIDEO demonstrate the superiority and reliability of our method, which achieves state-of-the-art performance on clip-level NDVR.
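
To make the idea of clip-level matching with a triplet objective concrete, the following is a minimal sketch and not the method from the paper. It assumes each video is already represented by a list of clip embeddings (plain lists of floats). The aggregation rule in `video_similarity` (best-matching reference clip per query clip, then averaged) and the plain margin-based `triplet_margin_loss` are illustrative assumptions; the FCS loss, ShotMix, and PredMAE are not specified in the abstract and are not reproduced here. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def video_similarity(query_clips, ref_clips):
    """Aggregate clip-level similarities into one video-level score.

    Assumption: each query clip is scored by its best-matching reference
    clip, and these best scores are averaged. The paper may aggregate differently.
    """
    best = [max(cosine(q, r) for r in ref_clips) for q in query_clips]
    return sum(best) / len(best)


def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Standard margin-based triplet loss on cosine similarity.

    This is the generic formulation, not the FCS loss from the paper.
    The loss is zero once the positive is more similar to the anchor
    than the negative by at least `margin`.
    """
    return max(0.0, cosine(anchor, negative) - cosine(anchor, positive) + margin)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]   # two clip embeddings for the query video
    ref = [[2.0, 0.0], [0.0, 3.0]]     # same directions, different magnitudes
    print(video_similarity(query, ref))
    print(triplet_margin_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]))
```

Matching on clip embeddings rather than on per-frame features keeps the number of pairwise comparisons small, which is one way to read the efficiency and effectiveness trade-off mentioned above.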

