Towards real-world navigation with deep differentiable planners

08/08/2021
by   Shu Ishida, et al.

We train embodied neural networks to plan and navigate unseen complex 3D environments, with an emphasis on real-world deployment. Rather than requiring prior knowledge of the agent or environment, the planner learns to model the state transitions and rewards. To avoid the potentially hazardous trial-and-error of reinforcement learning, we focus on differentiable planners such as Value Iteration Networks (VIN), which are trained offline from safe expert demonstrations. Although they work well in small simulations, we address two major limitations that hinder their deployment. First, we observe that current differentiable planners struggle to plan long-term in environments with high branching complexity. While they should ideally learn to assign low rewards to obstacles in order to avoid collisions, we posit that the constraints imposed on the network are not strong enough to guarantee that it learns sufficiently large penalties for every possible collision. We therefore impose a structural constraint on the value iteration that explicitly models impossible actions. Second, we extend the model to work with a limited-perspective camera under translation and rotation, which is crucial for real-robot deployment. Many VIN-like planners assume a 360-degree or overhead view without rotation. In contrast, our method uses a memory-efficient lattice map to aggregate CNN embeddings of partial observations, and models rotational dynamics explicitly using a 3D state-space grid (translation and rotation). Our proposals significantly improve semantic navigation and exploration in several 2D and 3D environments, succeeding in settings that are otherwise challenging for this class of methods. To the best of our knowledge, we are the first to successfully perform differentiable planning on the difficult Active Vision Dataset, which consists of real images captured from a robot.
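To make the core idea concrete, the following is a minimal, hand-coded sketch of tabular value iteration on a 2D grid with a hard mask over impossible actions. It is an illustration only, not the paper's model: in a VIN, the reward map and the mask below would be predicted by convolutional networks rather than specified by hand, and the iteration would run inside a differentiable planner. The action set, grid size, and masking scheme here are assumptions chosen for clarity.

```python
import numpy as np

def value_iteration(reward, impossible, gamma=0.95, iters=100):
    """Tabular value iteration with a hard action mask.

    reward:     (H, W) array, per-cell reward.
    impossible: (A, H, W) bool array; True where action a may not be
                taken from that cell (e.g. a move into an obstacle).
    """
    H, W = reward.shape
    # Assumed action set: stay, up, down, left, right.
    actions = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    assert impossible.shape == (len(actions), H, W)
    v = np.zeros((H, W))
    for _ in range(iters):
        q = np.empty((len(actions), H, W))
        for a, (dy, dx) in enumerate(actions):
            # Value of the successor cell after taking action (dy, dx);
            # moving off the grid is treated as impossible (-inf).
            succ = np.full((H, W), -np.inf)
            sy = slice(max(dy, 0), H + min(dy, 0))
            sx = slice(max(dx, 0), W + min(dx, 0))
            ty = slice(max(-dy, 0), H + min(-dy, 0))
            tx = slice(max(-dx, 0), W + min(-dx, 0))
            succ[ty, tx] = v[sy, sx]
            q[a] = reward + gamma * succ
            # Structural constraint: forbid masked actions outright,
            # rather than hoping a learned penalty is large enough.
            q[a][impossible[a]] = -np.inf
        v = q.max(axis=0)
    return v
```

The design choice this sketch highlights is the difference between *penalising* a collision (a finite negative reward the optimiser may trade off against other terms) and *structurally excluding* it (the masked Q-value can never win the max, so no gradient pressure is needed to keep it low).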


