GIFT: Generalizable Interaction-aware Functional Tool Affordances without Labels

06/28/2021
by Dylan Turpin, et al.

Tool use requires reasoning about the fit between an object's affordances and the demands of a task. Visual affordance learning can benefit from goal-directed interaction experience, but current techniques rely on human labels or expert demonstrations to generate this data. In this paper, we describe a method that grounds affordances in physical interactions instead, thus removing the need for human labels or expert policies. We use an efficient sampling-based method to generate successful trajectories that provide contact data, which are then used to reveal affordance representations. Our framework, GIFT, operates in two phases: first, we discover visual affordances from goal-directed interaction with a set of procedurally generated tools; second, we train a model to predict new instances of the discovered affordances on novel tools in a self-supervised fashion. In our experiments, we show that GIFT can leverage a sparse keypoint representation to predict grasp and interaction points to accommodate multiple tasks, such as hooking, reaching, and hammering. GIFT outperforms baselines on all tasks and matches a human oracle on two of three tasks using novel tools.
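To make the first phase more concrete, the following is a minimal toy sketch of the general idea of discovering keypoints from successful sampled interactions. It is not the authors' implementation: the 2D grid tool, the `toy_hook_success` predicate, and every function name here are hypothetical stand-ins chosen for illustration, and a simple mean over successful contacts replaces whatever affordance-discovery procedure the paper actually uses. The second phase (training a visual model to predict keypoints on novel tools) is not shown.

```python
import random

def sample_successful_contacts(tool_points, succeeds, n_samples=500, seed=0):
    """Randomly sample (grasp, contact) point pairs and keep the successful ones.

    `succeeds` is a hypothetical stand-in for running a task rollout
    and checking whether the goal was reached.
    """
    rng = random.Random(seed)
    successes = []
    for _ in range(n_samples):
        grasp = rng.choice(tool_points)
        contact = rng.choice(tool_points)
        if succeeds(grasp, contact):
            successes.append((grasp, contact))
    return successes

def mean_point(points):
    """Average a non-empty list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def discover_keypoints(successes):
    """Summarize successful interactions as one grasp and one contact keypoint."""
    if not successes:
        return None
    grasps = [g for g, _ in successes]
    contacts = [c for _, c in successes]
    return mean_point(grasps), mean_point(contacts)

# Toy L-shaped tool on an integer grid: a handle along the x axis
# and a head rising from its far end.
handle = [(x, 0) for x in range(10)]
head = [(10, y) for y in range(1, 5)]
tool = handle + head

def toy_hook_success(grasp, contact):
    # Assumed rule for illustration only: hooking works when the tool is
    # held near the base of the handle and touches the object with the
    # upper part of the head.
    grasp_ok = grasp[1] == 0 and grasp[0] <= 3
    contact_ok = contact[0] == 10 and contact[1] >= 2
    return grasp_ok and contact_ok

successes = sample_successful_contacts(tool, toy_hook_success)
keypoints = discover_keypoints(successes)
if keypoints is not None:
    grasp_kp, contact_kp = keypoints
    print("grasp keypoint:", grasp_kp)
    print("contact keypoint:", contact_kp)
```

In this toy setting the keypoints come purely from which sampled interactions succeeded, with no human labels involved, which mirrors the motivation stated in the abstract even though the mechanics are greatly simplified.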


