CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

06/02/2023
by Ayush Agrawal, et al.

This paper introduces a novel method for determining the best room in which to place an object, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or policies learned through reinforcement learning (RL) for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifically, it (a) encodes a knowledge graph of prior human preferences about the room location of different objects in home environments, (b) incorporates vision-language features to support multimodal queries based on images or text, and (c) uses a graph network to learn object-room affinities based on embeddings of the prior knowledge and the vision-language features. We demonstrate that our approach provides better estimates of the most appropriate location of objects from a benchmark set of object categories than state-of-the-art baselines.
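As a rough illustration of what "object-room affinity" could mean once embeddings are available, the sketch below ranks rooms for a query object by cosine similarity. It is not the authors' implementation: the abstract does not state the scoring function, so cosine similarity is an assumption, and the names `cosine`, `rank_rooms`, and the three-dimensional toy vectors are hypothetical stand-ins for embeddings that a graph network over vision-language features would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_rooms(object_vec, room_vecs):
    """Return (room, score) pairs sorted from highest to lowest affinity."""
    scores = {room: cosine(object_vec, vec) for room, vec in room_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy embeddings; real ones would be learned, not hand-written.
room_vecs = {
    "kitchen": [1.0, 0.1, 0.0],
    "bathroom": [0.0, 1.0, 0.1],
    "bedroom": [0.1, 0.0, 1.0],
}
toothbrush = [0.1, 0.9, 0.2]

ranking = rank_rooms(toothbrush, room_vecs)
best_room = ranking[0][0]
print(best_room, ranking)
```

In this toy setup the query vector is closest to the "bathroom" vector, so that room ranks first. The same ranking step would apply whether the query embedding came from an image or from text, as long as both live in a shared embedding space.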

research
07/21/2022

TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors

We introduce TIDEE, an embodied agent that tidies up a disordered scene ...
research
10/27/2021

SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation

Natural language instructions for visual navigation often use scene desc...
research
08/31/2019

Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs

Recently, biomedical version of embeddings obtained from language models...
research
10/08/2020

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

Text-based games have emerged as an important test-bed for Reinforcement...
research
05/09/2023

TidyBot: Personalized Robot Assistance with Large Language Models

For a robot to personalize physical assistance effectively, it must lear...
research
06/26/2019

Generalization to Novel Objects using Prior Relational Knowledge

To solve tasks in new environments involving objects unseen during train...
research
05/13/2019

Joint Object and State Recognition using Language Knowledge

The state of an object is an important piece of knowledge in robotics ap...