Unsupervised Discriminative Embedding for Sub-Action Learning in Complex Activities

04/30/2021
by Sirnam Swetha, et al.

Action recognition and detection in long untrimmed video sequences have received increasing attention from the research community. However, annotating complex activities is time-consuming and challenging in practice. Therefore, recent works have started to tackle the problem of unsupervised learning of sub-actions in complex activities. This paper proposes a novel approach for unsupervised sub-action learning in complex activities. The proposed method maps both visual and temporal representations to a latent space where the sub-actions are learnt discriminatively in an end-to-end fashion. To this end, we propose to learn sub-actions as latent concepts, and a novel discriminative latent concept learning (DLCL) module aids in learning them. The proposed DLCL module builds on the idea of latent concepts to learn compact representations in the latent embedding space in an unsupervised way. The result is a set of latent vectors that can be interpreted as cluster centers in the embedding space. The latent space itself is formed by a joint visual and temporal embedding capturing the visual similarity and temporal ordering of the data. Our joint learning with the discriminative latent concept module is novel and eliminates the need for explicit clustering. We validate our approach on three benchmark datasets and show that the proposed combination of visual-temporal embedding and discriminative latent concepts allows learning robust action representations in an unsupervised setting.
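
To make the idea of latent vectors acting as cluster centers more concrete, the following is a minimal illustrative sketch and not the paper's DLCL module. It assumes frame embeddings and latent concept vectors are already available as plain lists of floats, scores each frame against every concept with a dot product, and assigns the frame to the highest-scoring concept. The names `soft_assign` and `assign_frames`, the softmax scoring, and the toy values are all assumptions made for illustration.

```python
import math


def soft_assign(embedding, concepts, temperature=1.0):
    """Softmax over dot-product similarities between one embedding and each concept.

    Hypothetical helper: the scoring function is an assumption, not the paper's.
    """
    scores = [
        sum(e * c for e, c in zip(embedding, concept)) / temperature
        for concept in concepts
    ]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def assign_frames(embeddings, concepts):
    """Label each frame embedding with the index of its most similar concept."""
    labels = []
    for emb in embeddings:
        probs = soft_assign(emb, concepts)
        labels.append(max(range(len(probs)), key=probs.__getitem__))
    return labels


# Toy data (assumed): two latent concepts and four frame embeddings in 2-D.
concepts = [[1.0, 0.0], [0.0, 1.0]]
frames = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 1.0]]
labels = assign_frames(frames, concepts)
print(labels)
```

In this toy setting the first two frames fall to the first concept and the last two to the second, which is the sense in which learned latent vectors can stand in for an explicit clustering step. In the described method the embedding and the latent concepts are learned jointly, which this sketch does not attempt.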

01/29/2020

Joint Visual-Temporal Embedding for Unsupervised Learning of Actions in Untrimmed Sequences

Understanding the structure of complex activities in videos is one of th...
04/08/2019

Unsupervised learning of action classes with continuous temporal embedding

The task of temporally detecting and segmenting actions in untrimmed vid...
03/26/2018

Unsupervised Learning and Segmentation of Complex Activities from Video

This paper presents a new method for unsupervised segmentation of comple...
03/25/2022

Unsupervised Learning of Temporal Abstractions with Slot-based Transformers

The discovery of reusable sub-routines simplifies decision-making and pl...
07/16/2020

Learning End-to-End Action Interaction by Paired-Embedding Data Augmentation

In recognition-based action interaction, robots' responses to human acti...
12/04/2021

Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations

Learning visual concepts from raw images without strong supervision is a...
05/29/2023

Concept Decomposition for Visual Exploration and Inspiration

A creative idea is often born from transforming, combining, and modifyin...
