A Light-Weight Contrastive Approach for Aligning Human Pose Sequences

03/07/2023
by Robert T. Collins, et al.

We present a simple unsupervised method for learning an encoder mapping short 3D pose sequences into embedding vectors suitable for sequence-to-sequence alignment by dynamic time warping. Training samples consist of temporal windows of frames containing 3D body points such as mocap markers or skeleton joints. A light-weight, 3-layer encoder is trained using a contrastive loss function that encourages embedding vectors of augmented sample pairs to have cosine similarity 1, and similarity 0 with all other samples in a minibatch. When multiple scripted training sequences are available, temporal alignments inferred from an initial round of training are harvested to extract additional, cross-performance match pairs for a second phase of training to refine the encoder. In addition to being simple, the proposed method is fast to train, making it easy to adapt to new data using different marker sets or skeletal joint layouts. Experimental results illustrate ease of use, transferability, and utility of the learned embeddings for comparing and analyzing human behavior sequences.
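The two ingredients described in the abstract, a contrastive objective over augmented sample pairs and alignment of embedding sequences by dynamic time warping, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss formula, so the squared-error form below (pushing the minibatch cosine-similarity matrix toward the identity) is an assumption, as are the function names `pairwise_target_loss` and `dtw_align`. The encoder itself is omitted; the inputs stand in for its output embeddings.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def pairwise_target_loss(z_a, z_b):
    """Assumed loss form: mean squared error between the cosine-similarity
    matrix of two augmented views and the identity matrix.

    z_a, z_b: (B, D) embeddings of two augmentations of the same B windows.
    Diagonal entries (matched pairs) are pushed toward 1, all others toward 0.
    """
    sim = l2_normalize(z_a) @ l2_normalize(z_b).T  # (B, B) cosine similarities
    target = np.eye(sim.shape[0])
    return np.mean((sim - target) ** 2)

def dtw_align(emb_x, emb_y):
    """Classic dynamic time warping on cosine distance.

    emb_x: (N, D) and emb_y: (M, D) embedding sequences.
    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    cost = 1.0 - l2_normalize(emb_x) @ l2_normalize(emb_y).T
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the end of both sequences to the start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [
            (acc[i - 1, j - 1], i - 1, j - 1),  # diagonal: advance both
            (acc[i - 1, j], i - 1, j),          # advance x only
            (acc[i, j - 1], i, j - 1),          # advance y only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
        path.append((i - 1, j - 1))
    return acc[n, m], path[::-1]

# Toy usage: y repeats the first frame of x, so DTW should map
# both of y's first two frames onto x's first frame.
x = np.eye(3)
y = np.eye(3)[[0, 0, 1, 2]]
total, path = dtw_align(x, y)
loss = pairwise_target_loss(np.eye(4), np.eye(4))
```

In this toy case the warping path pairs frame 0 of `x` with frames 0 and 1 of `y`. Index pairs of this kind, taken from alignments between different performances, are what the second training phase described above would harvest as additional match pairs.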


