Learning Navigation Behaviors End to End

by Hao-Tien Lewis Chiang, et al.

A longstanding goal of behavior-based robotics is to solve high-level navigation tasks using end to end navigation behaviors that directly map sensors to actions. Navigation behaviors, such as reaching a goal or following a path without collisions, can be learned from exploration and interaction with the environment, but are constrained by the type and quality of a robot's sensors, dynamics, and actuators. Traditional motion planning handles varied robot geometry and dynamics, but typically assumes high-quality observations. Modern vision-based navigation typically considers imperfect or partial observations, but simplifies the robot action space. With both approaches, the transition from simulation to reality can be difficult. Here, we learn two end to end navigation behaviors that avoid moving obstacles: point to point and path following. These policies receive noisy lidar observations and output robot linear and angular velocities. We train these policies in small, static environments with Shaped-DDPG, an adaptation of the Deep Deterministic Policy Gradient (DDPG) reinforcement learning method which optimizes reward and network architecture. Over 500 meters of on-robot experiments show that these policies generalize to new environments and moving obstacles, are robust to sensor, actuator, and localization noise, and can serve as robust building blocks for larger navigation tasks. The path following and point to point policies are 83




