Structured Latent Variable Models for Articulated Object Interaction

05/26/2023
by Emily Liu, et al.

In this paper, we investigate a scenario in which a robot learns a low-dimensional representation of a door given a video of the door opening or closing. This representation can be used to infer door-related parameters and predict the outcomes of interacting with the door. Current machine-learning-based approaches in the doors domain rely primarily on labelled datasets. However, the large quantity of available door data suggests the feasibility of a semi-supervised approach based on pretraining. To exploit the hierarchical structure of the dataset, in which each door has multiple associated images, we pretrain with a structured latent variable model known as a neural statistician. The neural statistician enforces separation between shared context-level variables (common across all images associated with the same door) and instance-level variables (unique to each individual image). We first demonstrate that the neural statistician is able to learn an embedding that enables reconstruction and sampling of realistic door images. Then, we evaluate the correspondence of the learned embeddings to human-interpretable parameters in a series of supervised inference tasks. We found that a pretrained neural statistician encoder outperformed analogous context-free baselines when predicting door handedness, size, angle location, and configuration from door images. Finally, in a visual bandit door-opening task with a variety of door configurations, we found that neural statistician embeddings achieve lower regret than context-free baselines.
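
The split between context-level and instance-level variables can be illustrated with a small toy sketch. It is not the paper's model: a neural statistician learns its encoders and uses variational posteriors, whereas the code below uses a fixed, hand-written feature function and plain mean pooling. The names `encode_image`, `context_embedding`, and `instance_embeddings`, and the pixel values, are hypothetical and chosen only for illustration.

```python
from statistics import fmean

def encode_image(image):
    """Hypothetical per-image encoder: (mean, max) of the pixel values."""
    return [fmean(image), max(image)]

def context_embedding(image_set):
    """Context-level statistic: mean-pool per-image features over the set.

    Mean pooling is order-invariant, so the result depends only on which
    images belong to the door, not on the order in which they arrive.
    """
    feats = [encode_image(img) for img in image_set]
    n = len(feats)
    return [sum(f[k] for f in feats) / n for k in range(len(feats[0]))]

def instance_embeddings(image_set, context):
    """Instance-level codes: what each image adds beyond the shared context."""
    return [
        [f - c for f, c in zip(encode_image(img), context)]
        for img in image_set
    ]

# Toy "door": three frames of four pixels each (made-up values).
door_frames = [
    [0.0, 0.5, 0.5, 1.0],
    [0.25, 0.25, 0.75, 0.75],
    [0.5, 0.5, 0.5, 1.0],
]

context = context_embedding(door_frames)               # shared across all frames
instances = instance_embeddings(door_frames, context)  # one code per frame

# Reordering the frames leaves the context embedding unchanged.
context_reordered = context_embedding(list(reversed(door_frames)))
```

In this sketch the context vector plays the role of the shared door-level variable, while each residual plays the role of an image-specific variable. A context-free baseline, by contrast, would encode each frame on its own with no pooled statistic.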

