Learning Geometric Representations of Objects via Interaction

09/11/2023
by Alfredo Reichlin, et al.

We address the problem of learning representations from observations of a scene involving an agent and an external object the agent interacts with. To this end, we propose a representation learning framework that extracts the location in physical space of both the agent and the object from unstructured observations of arbitrary nature. Our framework relies on the actions performed by the agent as the only source of supervision, while assuming that the object is displaced by the agent via unknown dynamics. We provide a theoretical foundation and formally prove that an ideal learner is guaranteed to infer an isometric representation, disentangling the agent from the object and correctly extracting their locations. We empirically evaluate our framework on a variety of scenarios, showing that it outperforms vision-based approaches such as a state-of-the-art keypoint extractor. Moreover, we demonstrate how the extracted representations enable the agent to solve downstream tasks efficiently via reinforcement learning.
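To make the idea of using actions as the only supervision concrete, here is a minimal sketch and not the authors' implementation. It assumes, beyond what the abstract states, that each action is a displacement vector living in the same space as the agent's representation, so that a good encoder should move by exactly that vector. The function name `equivariance_loss` and the toy data are hypothetical.

```python
import numpy as np

def equivariance_loss(z_before, z_after, actions):
    """Mean squared gap between the encoded displacement and the action.

    Assumption (not from the abstract): actions are displacements in the
    same space as the representation, so ideally z_after - z_before == action.
    """
    residual = (z_after - z_before) - actions
    return float(np.mean(np.sum(residual ** 2, axis=1)))

# Toy data: true 2-D agent positions and the displacements applied to them.
rng = np.random.default_rng(0)
positions = rng.uniform(-1.0, 1.0, size=(5, 2))
actions = rng.uniform(-0.1, 0.1, size=(5, 2))
next_positions = positions + actions

# A representation equal to the true position up to a constant offset
# satisfies the constraint, so its loss is zero up to rounding error.
offset = np.array([3.0, -2.0])
loss_translated = equivariance_loss(positions + offset,
                                    next_positions + offset,
                                    actions)

# A representation that rescales distances is not isometric: its encoded
# displacement is 2 * action, so the loss is the mean squared action norm.
loss_scaled = equivariance_loss(2.0 * positions,
                                2.0 * next_positions,
                                actions)

print(loss_translated, loss_scaled)
```

In this toy setting the loss vanishes for a representation that preserves distances and is strictly positive for one that distorts them, which is the intuition behind recovering locations from actions alone. The handling of the object, which the abstract says moves under unknown dynamics, is deliberately left out of this sketch.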

research
06/19/2019

Unsupervised Learning of Object Structure and Dynamics from Videos

Extracting and predicting object structure and dynamics from videos with...
research
02/18/2022

KINet: Keypoint Interaction Networks for Unsupervised Forward Modeling

Object-centric representation is an essential abstraction for physical r...
research
06/03/2023

MA2CL:Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning

Recent approaches have utilized self-supervised auxiliary tasks as repre...
research
04/09/2021

GATSBI: Generative Agent-centric Spatio-temporal Object Interaction

We present GATSBI, a generative model that can transform a sequence of r...
research
06/19/2019

Unsupervised Learning of Object Keypoints for Perception and Control

The study of object representations in computer vision has primarily foc...
research
09/30/2022

An information-theoretic approach to unsupervised keypoint representation learning

Extracting informative representations from videos is fundamental for th...
research
06/16/2020

Learning About Objects by Learning to Interact with Them

Much of the remarkable progress in computer vision has been focused arou...
