Video alignment using unsupervised learning of local and global features

04/13/2023
by   Niloofar Fakhfour, et al.

In this paper, we tackle the problem of video alignment: matching the frames of a pair of videos that contain similar actions. The main challenge is that an accurate correspondence must be established despite differences in how the action is executed and in the appearance of the two videos. We introduce an unsupervised alignment method that uses both global and local features of the frames. In particular, we extract features for each video frame using three machine vision tools: person detection, pose estimation, and a VGG network. The features are then processed and combined to construct a multidimensional time series that represents the video. The resulting time series are used to align videos of the same action with a novel version of dynamic time warping named Diagonalized Dynamic Time Warping (DDTW). The main advantage of our approach is that it requires no training, so it can be applied to any new type of action without collecting training samples for it. For evaluation, we consider the video synchronization and phase classification tasks on the Penn Action dataset. To evaluate the video synchronization task effectively, we also present a new metric called Enclosed Area Error (EAE). The results show that our method outperforms previous state-of-the-art methods, such as TCC and other self-supervised and supervised methods.
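To illustrate the alignment step, the sketch below implements classic dynamic time warping on two multidimensional time series, where each row stands for one frame's feature vector. It is not the DDTW variant proposed in the paper, whose details are not given in this abstract. The function name `dtw_align`, the Euclidean frame distance, and the toy inputs are assumptions chosen only for illustration.

```python
import math


def dtw_align(a, b):
    """Classic DTW between two sequences of feature vectors.

    a, b: lists of equal-dimension feature vectors, one per frame.
    Returns (total_cost, path), where path is a list of (i, j)
    pairs matching frame i of `a` to frame j of `b`.
    NOTE: plain DTW, not the paper's DDTW; an illustrative assumption.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = minimal cost of aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the last cell to the first to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path


# Toy example: the second "video" holds its middle frame one step longer.
video_a = [[0.0], [1.0], [2.0]]
video_b = [[0.0], [1.0], [1.0], [2.0]]
cost, path = dtw_align(video_a, video_b)
print(cost, path)
```

In the toy example, frame 1 of the first sequence is matched to both frames 1 and 2 of the second, which is how a warping path absorbs differences in execution speed between two performances of the same action.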

research
12/06/2022

Self-supervised and Weakly Supervised Contrastive Learning for Frame-wise Action Representations

Previous work on action representation learning focused on global repres...
research
08/24/2023

PoseSync: Robust pose based video synchronization

Pose based video synchronization can have applications in multiple domain...
research
10/19/2016

Learning Robust Video Synchronization without Annotations

Aligning video sequences is a fundamental yet still unsolved component f...
research
06/06/2019

Learning Temporal Pose Estimation from Sparsely-Labeled Videos

Modern approaches for multi-person pose estimation in video require larg...
research
11/17/2021

Learning to Align Sequential Actions in the Wild

State-of-the-art methods for self-supervised sequential action alignment...
research
01/24/2018

Unsupervised learning from videos using temporal coherency deep networks

In this work we address the challenging problem of unsupervised learning...
research
11/20/2017

Self-Similarity Based Time Warping

In this work, we explore the problem of aligning two time-ordered point ...
