
Observation Space Matters: Benchmark and Optimization Algorithm

by Joanne Taery Kim, et al.

Recent advances in deep reinforcement learning (deep RL) enable researchers to solve challenging control problems, from simulated environments to real-world robotic tasks. However, deep RL algorithms are known to be sensitive to the problem formulation, including the observation space, action space, and reward function. Numerous choices exist for observation spaces, but they are often designed solely from prior knowledge because established principles are lacking. In this work, we conduct benchmark experiments to verify common design choices for observation spaces, such as Cartesian transformation, binary contact flags, a short history, or global positions. We then propose a search algorithm that finds optimal observation spaces by examining various candidate spaces and removing unnecessary observation channels with a Dropout-Permutation test. We demonstrate that our algorithm significantly improves learning speed compared to manually designed observation spaces, and we analyze the proposed algorithm under different hyperparameter settings.
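The core idea of pruning unnecessary observation channels can be illustrated with a permutation-style importance test: shuffle one channel of recorded observations, and if the policy's performance does not degrade, that channel likely carries no useful information. The sketch below is a minimal illustration of this general idea only; the function names, the scoring setup, and the use of a plain NumPy policy are assumptions, not the paper's actual Dropout-Permutation implementation.

```python
import numpy as np

def channel_importance(policy, observations, score_fn, n_shuffles=10, seed=0):
    """Estimate each observation channel's contribution to performance.

    policy:       callable mapping a batch of observations to actions
    observations: (n_steps, n_channels) array of recorded observations
    score_fn:     callable scoring a batch of actions (higher is better)
    Returns an array of mean score drops per channel; a near-zero drop
    suggests the channel is unnecessary and could be removed.
    """
    rng = np.random.default_rng(seed)
    baseline = score_fn(policy(observations))
    n_channels = observations.shape[1]
    scores = np.zeros(n_channels)
    for ch in range(n_channels):
        drops = []
        for _ in range(n_shuffles):
            shuffled = observations.copy()
            rng.shuffle(shuffled[:, ch])  # destroy this channel's information
            drops.append(baseline - score_fn(policy(shuffled)))
        scores[ch] = np.mean(drops)
    return scores

# Toy example: a "policy" that only reads channel 0, so shuffling
# channels 1 and 2 causes no score drop at all.
rng = np.random.default_rng(1)
obs = rng.normal(size=(200, 3))
target = obs[:, 0]  # the behavior the policy should reproduce
policy = lambda o: o[:, 0]
score_fn = lambda a: -np.mean((a - target) ** 2)  # negative MSE

scores = channel_importance(policy, obs, score_fn)
```

In this toy setup, channel 0 shows a clearly positive score drop while channels 1 and 2 score exactly zero, marking them as removable. The paper's method additionally searches over candidate observation spaces rather than only pruning a fixed one.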


Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning

Learning to locomote is one of the most common tasks in physics-based an...

Transfer RL across Observation Feature Spaces via Model-Based Regularization

In many reinforcement learning (RL) applications, the observation space ...

Generalising Discrete Action Spaces with Conditional Action Trees

There are relatively few conventions followed in reinforcement learning ...

Partial Observability during DRL for Robot Control

Deep Reinforcement Learning (DRL) has made tremendous advances in both s...

Defining the problem of Observation Learning

This article defines and formulates the problem of observation learning ...

Investigating Generalisation in Continuous Deep Reinforcement Learning

Deep Reinforcement Learning has shown great success in a variety of cont...

Understanding reinforcement learned crowds

Simulating trajectories of virtual crowds is a commonly encountered task...