Learning Cross-modal Contrastive Features for Video Domain Adaptation

08/26/2021
by Donghyun Kim, et al.

Learning transferable and domain adaptive feature representations from videos is important for video-relevant tasks such as action recognition. Existing video domain adaptation methods mainly rely on adversarial feature alignment, which has been derived from the RGB image space. However, video data is usually associated with multi-modal information, e.g., RGB and optical flow, and thus it remains a challenge to design a better method that considers the cross-modal inputs under the cross-domain adaptation setting. To this end, we propose a unified framework for video domain adaptation, which simultaneously regularizes cross-modal and cross-domain feature representations. Specifically, we treat each modality in a domain as a view and leverage the contrastive learning technique with properly designed sampling strategies. As a result, our objectives regularize feature spaces, which originally lack the connection across modalities or have less alignment across domains. We conduct experiments on domain adaptive action recognition benchmark datasets, i.e., UCF, HMDB, and EPIC-Kitchens, and demonstrate the effectiveness of our components against state-of-the-art algorithms.
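
The abstract describes treating each modality as a view and applying contrastive learning across views. As a rough illustration of that idea only, the sketch below computes a generic InfoNCE-style loss between RGB and optical-flow features of the same clips. It assumes that the two modalities of one clip form the positive pair and that the other clips in the batch serve as negatives. The function name `cross_modal_info_nce`, the `temperature` value, and the sampling scheme are hypothetical and are not taken from the paper, whose exact objectives and sampling strategies are not given in the abstract.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_modal_info_nce(rgb_feats, flow_feats, temperature=0.1):
    """Mean InfoNCE loss over a batch of paired RGB / flow features.

    Assumption (not from the paper): rgb_feats[i] and flow_feats[i] come
    from the same clip and form the positive pair; flow_feats[j], j != i,
    act as negatives for rgb_feats[i].
    """
    if not rgb_feats or len(rgb_feats) != len(flow_feats):
        raise ValueError("expected two non-empty batches of equal size")

    rgb = [l2_normalize(v) for v in rgb_feats]
    flow = [l2_normalize(v) for v in flow_feats]

    total = 0.0
    for i, anchor in enumerate(rgb):
        # Cosine similarities to every flow feature, scaled by temperature.
        logits = [dot(anchor, f) / temperature for f in flow]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the positive pair.
        total += log_denom - logits[i]
    return total / len(rgb)


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [0.0, 1.0]]
    flow_aligned = [[1.0, 0.0], [0.0, 1.0]]
    flow_swapped = [[0.0, 1.0], [1.0, 0.0]]
    print(cross_modal_info_nce(rgb, flow_aligned))
    print(cross_modal_info_nce(rgb, flow_swapped))
```

In this toy example the loss is small when the RGB and flow features of the same clip point in the same direction and large when they are swapped, which is the behavior a cross-modal contrastive term is meant to encourage. A cross-domain term would need a rule for choosing positives across source and target videos, which this sketch does not attempt.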


