Watch-n-Patch: Unsupervised Learning of Actions and Relations

03/11/2016
by   Chenxia Wu, et al.

There is large variation in the activities that humans perform in their everyday lives. We consider modeling these composite human activities, which comprise multiple basic-level actions, in a completely unsupervised setting. Our model learns high-level co-occurrence and temporal relations between the actions. We treat a video as a sequence of short-term action clips, each containing human-words and object-words. An activity is characterized by a set of action-topics and object-topics indicating which actions are present and which objects are being interacted with. We then propose a new probabilistic model relating the words and the topics. It allows us to model the long-range action relations that commonly exist in composite activities, which has been challenging for previous work. We apply our model to unsupervised action segmentation and clustering, and to a novel application that detects forgotten actions, which we call action patching. For evaluation, we contribute a new, challenging RGB-D activity video dataset recorded with the new Kinect v2, which contains several human daily activities as compositions of multiple actions interacting with different objects. Moreover, we develop a robotic system that watches people and reminds them of forgotten actions by applying our action patching algorithm. Our robotic setup can be easily deployed on any assistive robot.
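To illustrate the kind of unsupervised word-to-topic assignment the abstract describes, here is a minimal sketch of an LDA-style topic model over action clips, trained with collapsed Gibbs sampling. The vocabulary, clip representation, and hyperparameters are illustrative assumptions, not the paper's actual model, which additionally captures long-range temporal relations between actions.

```python
import numpy as np

# Hypothetical vocabulary of human-words / object-words extracted from clips
# (illustrative only; the real features come from RGB-D skeleton and object tracks).
VOCAB = ["reach-fridge", "hold-milk", "pour", "reach-microwave", "press-button"]
W = len(VOCAB)

def gibbs_topic_model(clips, n_topics, n_iter=200, alpha=0.1, beta=0.1, seed=0):
    """Collapsed Gibbs sampling for a simple LDA-style topic model.

    clips: list of clips, each a list of word indices into VOCAB.
    Returns a (n_clips, n_topics) array of per-clip topic proportions,
    which can be used to cluster clips into action-topics.
    """
    rng = np.random.default_rng(seed)
    # z[d][i] is the topic assigned to word i of clip d.
    z = [rng.integers(n_topics, size=len(c)) for c in clips]
    ndk = np.zeros((len(clips), n_topics))  # clip-topic counts
    nkw = np.zeros((n_topics, W))           # topic-word counts
    nk = np.zeros(n_topics)                 # words per topic
    for d, c in enumerate(clips):
        for i, w in enumerate(c):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(n_iter):
        for d, c in enumerate(clips):
            for i, w in enumerate(c):
                # Remove current assignment, resample from the conditional.
                k = z[d][i]
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + W * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    theta = ndk + alpha
    return theta / theta.sum(axis=1, keepdims=True)
```

Clustering clips by their dominant topic gives an unsupervised action segmentation in the same spirit as the paper; detecting a clip whose expected topic never appears would be the starting point for action patching.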


