Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning

10/09/2020
by Daniele Reda, et al.

Learning to locomote is one of the most common tasks in physics-based animation and deep reinforcement learning (RL). A learned policy is the product of the problem to be solved, as embodied by the RL environment, and the RL algorithm. While enormous attention has been devoted to RL algorithms, much less is known about the impact of design choices for the RL environment. In this paper, we show that environment design matters in significant ways and document how it can contribute to the brittle nature of many RL results. Specifically, we examine choices related to state representations, initial state distributions, reward structure, control frequency, episode termination procedures, curriculum usage, the action space, and the torque limits. We aim to stimulate discussion around such choices, which in practice strongly impact the success of RL when applied to continuous-action control problems of interest to animation, such as learning to locomote.

research
10/27/2020

Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines?

Reinforcement learning (RL) algorithms should learn as much as possible ...
research
06/10/2020

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

In recent years, on-policy reinforcement learning (RL) has been successf...
research
03/13/2019

Task-oriented Design through Deep Reinforcement Learning

We propose a new low-cost machine-learning-based methodology which assis...
research
11/02/2020

Observation Space Matters: Benchmark and Optimization Algorithm

Recent advances in deep reinforcement learning (deep RL) enable research...
research
11/03/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Reinforcement learning (RL) for continuous control typically employs dis...
research
04/24/2020

Self-Paced Deep Reinforcement Learning

Generalization and reuse of agent behaviour across a variety of learning...
research
08/11/2020

Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning

Deep Reinforcement Learning (RL) has shown great success in learning com...
