Entity Abstraction in Visual Model-Based Reinforcement Learning

10/28/2019
by   Rishi Veerapaneni, et al.

This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before. We present object-centric perception, prediction, and planning (OP3), which to the best of our knowledge is the first entity-centric dynamic latent variable framework for model-based reinforcement learning that acquires entity representations from raw visual observations without supervision and uses them to predict and plan. OP3 enforces entity abstraction (symmetric processing of each entity representation with the same locally-scoped function), which enables it to scale to numbers and configurations of objects different from those seen in training. The key technical challenge is grounding these entity representations to actual objects in the environment. We frame this variable binding problem as an inference problem and develop an interactive inference algorithm that uses temporal continuity and interactive feedback to bind information about object properties to the entity variables. On block-stacking tasks, OP3 generalizes to novel block configurations and to more objects than observed during training, outperforming an oracle model that assumes access to object supervision and achieving two to three times better accuracy than a state-of-the-art video prediction model.
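The idea of entity abstraction described above can be illustrated with a small sketch. This is not OP3's actual architecture: the function name `entity_step`, the weight matrices `w_self` and `w_pair`, and the tanh-based pairwise interaction are all assumptions chosen for illustration. The sketch only shows the property the abstract names: one shared, locally-scoped function is applied symmetrically to every entity, so the same parameters handle any number of entities.

```python
import numpy as np


def entity_step(entities, w_self, w_pair):
    """Apply one shared, locally scoped update to every entity.

    entities: array of shape (K, D), one latent vector per entity.
    w_self, w_pair: (D, D) weight matrices shared by all entities
        (hypothetical parameters, not taken from the paper).
    """
    k, d = entities.shape
    updated = np.empty_like(entities)
    for i in range(k):
        # Aggregate pairwise effects of every other entity on entity i.
        effect = np.zeros(d)
        for j in range(k):
            if j != i:
                effect += np.tanh((entities[i] + entities[j]) @ w_pair)
        # The same weights are used for every i, so no entity is special.
        updated[i] = np.tanh(entities[i] @ w_self + effect)
    return updated


rng = np.random.default_rng(0)
w_self = rng.normal(size=(4, 4))
w_pair = rng.normal(size=(4, 4))

# The same weights apply to scenes with 3 or 5 entities.
three = entity_step(rng.normal(size=(3, 4)), w_self, w_pair)
five = entity_step(rng.normal(size=(5, 4)), w_self, w_pair)
print(three.shape, five.shape)
```

Because the weights do not depend on the entity index or on the entity count, reordering the input rows simply reorders the output rows, and a model of this shape trained on a few objects can at least be evaluated on more.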


