Unsupervised Human Action Detection by Action Matching

12/02/2016
by Basura Fernando, et al.

We propose a new task of unsupervised action detection by action matching. Given two long videos, the objective is to temporally detect all pairs of matching video segments. A pair of video segments is matched if both segments share the same human action. The task is category independent (it does not matter what action is being performed), and no supervision is used to discover such video segments. Unsupervised action detection by action matching allows us to align videos in a meaningful manner. As such, it can be used to discover new action categories or as an action proposal technique within, say, an action detection pipeline. Moreover, it is a useful pre-processing step for generating video highlights, e.g., from sports videos. We present an effective and efficient method for unsupervised action detection. We use an unsupervised temporal encoding method and exploit the temporal consistency in human actions to obtain candidate action segments. We evaluate our method on this challenging task using three activity recognition benchmarks, namely, the MPII Cooking activities dataset, the THUMOS15 action detection benchmark and a new dataset called the IKEA dataset. On the MPII Cooking dataset we detect action segments with a precision of 21.6 [...] of 11.7 [...]. Similarly, on the THUMOS dataset we obtain 18.4 [...] 5094 ground truth action segment pairs.
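
To make the task's output concrete, the following is a minimal sketch of how detected segment pairs might be scored against ground truth segment pairs. It is not the authors' evaluation protocol: the temporal IoU criterion, the 0.5 threshold, the greedy one-to-one matching, and all names (`temporal_iou`, `pair_matches`, `precision_recall`) are assumptions chosen for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]          # (start, end) of a segment, in frames or seconds
SegmentPair = Tuple[Segment, Segment]  # (segment in video A, segment in video B)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two temporal intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def pair_matches(pred: SegmentPair, gt: SegmentPair, thr: float) -> bool:
    """Assumed criterion: both segments of the pair must overlap their
    ground truth counterparts with temporal IoU of at least thr."""
    return (temporal_iou(pred[0], gt[0]) >= thr
            and temporal_iou(pred[1], gt[1]) >= thr)


def precision_recall(preds: List[SegmentPair],
                     gts: List[SegmentPair],
                     thr: float = 0.5) -> Tuple[float, float]:
    """Greedy one-to-one matching of predicted pairs to ground truth pairs."""
    matched_gt = set()
    true_pos = 0
    for p in preds:
        for i, g in enumerate(gts):
            if i not in matched_gt and pair_matches(p, g, thr):
                matched_gt.add(i)   # each ground truth pair is credited once
                true_pos += 1
                break
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(gts) if gts else 0.0
    return precision, recall


# Hypothetical detections and annotations for one video pair.
preds = [((0, 10), (5, 15)), ((20, 30), (40, 50))]
gts = [((1, 10), (5, 14)), ((60, 70), (80, 90))]
print(precision_recall(preds, gts))  # (0.5, 0.5)
```

Because the task is category independent, a scoring scheme of this kind needs only segment boundaries and never an action label for the predictions.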


