Grounding Predicates through Actions

09/29/2021
by Toki Migimatsu, et al.

Symbols representing abstract states such as "dish in dishwasher" or "cup on table" allow robots to reason over long horizons by hiding details unnecessary for high-level planning. Current methods for learning to identify symbolic states in visual data require large amounts of labeled training data, but manually annotating such datasets is prohibitively expensive due to the combinatorial number of predicates in images. We propose a novel method for automatically labeling symbolic states in large-scale video activity datasets by exploiting the known pre- and post-conditions of actions. This automatic labeling scheme requires only weak supervision in the form of an action label describing which action is demonstrated in each video. We apply our framework to an existing large-scale human activity dataset. We train predicate classifiers to identify symbolic relationships between objects when prompted with object bounding boxes, achieving 0.93 test accuracy. We further demonstrate that these predicate classifiers, trained on human data, transfer to robot environments in a real-world task planning domain.
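The automatic labeling scheme described in the abstract can be sketched in a few lines: if an action's pre- and post-conditions are known, the frames before the action weakly inherit the precondition labels and the frames after inherit the postcondition labels. The sketch below is illustrative only; the action names, predicate strings, and function names are hypothetical, not the paper's actual implementation.

```python
# Hypothetical sketch: derive weak predicate labels for a video from
# the known pre- and post-conditions of the action it demonstrates.
# All action and predicate names here are illustrative assumptions.

# Pre- and post-conditions per action, as (predicate, truth value) pairs.
ACTION_CONDITIONS = {
    "put_cup_on_table": {
        "pre":  [("on(cup, table)", False), ("holding(cup)", True)],
        "post": [("on(cup, table)", True),  ("holding(cup)", False)],
    },
    "open_dishwasher": {
        "pre":  [("open(dishwasher)", False)],
        "post": [("open(dishwasher)", True)],
    },
}

def label_video(action: str) -> dict:
    """Return weak predicate labels for a video demonstrating `action`:
    preconditions hold in frames before the action, postconditions after."""
    conds = ACTION_CONDITIONS[action]
    return {
        "first_frame": dict(conds["pre"]),
        "last_frame": dict(conds["post"]),
    }

labels = label_video("put_cup_on_table")
```

These automatically generated (predicate, truth value) pairs would then serve as training targets for the bounding-box-prompted predicate classifiers, with the action label as the only human-provided supervision.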


Related research

- Transferable Task Execution from Pixels through Deep Planning Domain Learning (03/08/2020): While robots can learn models to solve many manipulation tasks from raw ...
- Action sequencing using visual permutations (08/03/2020): Humans can easily reason about the sequence of high level actions needed...
- Symbolic State Estimation with Predicates for Contact-Rich Manipulation Tasks (03/04/2022): Manipulation tasks often require a robot to adjust its sensorimotor skil...
- SLAC: A Sparsely Labeled Dataset for Action Classification and Localization (12/26/2017): This paper describes a procedure for the creation of large-scale video d...
- Video Caption Dataset for Describing Human Actions in Japanese (03/10/2020): In recent years, automatic video caption generation has attracted consid...
- Discovering Novel Actions in an Open World with Object-Grounded Visual Commonsense Reasoning (05/26/2023): Learning to infer labels in an open world, i.e., in an environment where...
- A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets (02/21/2022): Intent classifiers are vital to the successful operation of virtual agen...
