An Efficient Image-to-Image Translation HourGlass-based Architecture for Object Pushing Policy Learning

08/02/2021
by Marco Ewerton, et al.

Humans effortlessly solve pushing tasks in everyday life, but unlocking these capabilities remains a challenge in robotics because physics models of these tasks are often inaccurate or unattainable. State-of-the-art data-driven approaches learn to compensate for these inaccuracies or replace the approximate physics models altogether. Nevertheless, approaches like Deep Q-Networks (DQNs) suffer from local optima in large state-action spaces. Furthermore, they rely on well-chosen deep learning architectures and learning paradigms. In this paper, we propose to frame the learning of pushing policies (where to push and how) by DQNs as an image-to-image translation problem and exploit an Hourglass-based architecture. We present an architecture that combines a predictor of which pushes lead to changes in the environment with a state-action value predictor dedicated to the pushing task. Moreover, we investigate positional information encoding to learn position-dependent policy behaviors. We demonstrate in simulation experiments with a UR5 robot arm that our overall architecture helps the DQN learn faster and achieve higher performance in a pushing task involving objects with unknown dynamics.
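
To make the image-to-image framing concrete, the following is a minimal sketch of how a push could be chosen from per-pixel network outputs. It is not the authors' implementation. The array shapes, the function names `select_push` and `add_coord_channels`, the 0.5 threshold, the fallback rule, and the use of normalized coordinate channels as the positional encoding are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def add_coord_channels(image):
    """Append two channels holding normalized row and column coordinates.

    Assumption: this is one common positional encoding, not necessarily
    the one studied in the paper. `image` has shape (C, H, W).
    """
    image = np.asarray(image, dtype=float)
    _, h, w = image.shape
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, h), np.linspace(-1.0, 1.0, w), indexing="ij"
    )
    return np.concatenate([image, ys[None], xs[None]], axis=0)


def select_push(q_maps, change_prob, threshold=0.5):
    """Pick the highest-valued push among those predicted to change the scene.

    q_maps:      (n_directions, H, W) predicted state-action values, one per
                 pixel (where to push) and push direction (how to push).
    change_prob: same shape, predicted probability that the push changes
                 the environment.
    The threshold is a hypothetical hyperparameter.
    """
    q_maps = np.asarray(q_maps, dtype=float)
    change_prob = np.asarray(change_prob, dtype=float)
    if q_maps.shape != change_prob.shape:
        raise ValueError("q_maps and change_prob must have the same shape")

    # Exclude pushes that are predicted to leave the scene unchanged.
    masked = np.where(change_prob >= threshold, q_maps, -np.inf)
    if not np.isfinite(masked).any():
        # Assumed fallback: if every push is masked out, use the raw Q-values.
        masked = q_maps

    direction, row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(direction), int(row), int(col)
```

In this sketch the network outputs are treated as images aligned with the input, so the selected index directly names a pixel location and a push direction. Gating the value map with the change predictor is one plausible way to combine the two predictors described above.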


