Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

11/03/2021
by   Tim Seyde, et al.

Reinforcement learning (RL) for continuous control typically employs distributions whose support covers the entire action space. In this work, we investigate the colloquially known phenomenon that trained agents often prefer actions at the boundaries of that space. We draw theoretical connections to the emergence of bang-bang behavior in optimal control, and provide extensive empirical evaluation across a variety of recent RL algorithms. We replace the normal Gaussian by a Bernoulli distribution that solely considers the extremes along each action dimension - a bang-bang controller. Surprisingly, this achieves state-of-the-art performance on several continuous control benchmarks - in contrast to robotic hardware, where energy and maintenance cost affect controller choices. Since exploration, learning, and the final solution are entangled in RL, we provide additional imitation learning experiments to reduce the impact of exploration on our analysis. Finally, we show that our observations generalize to environments that aim to model real-world challenges and evaluate factors to mitigate the emergence of bang-bang solutions. Our findings emphasize challenges for benchmarking continuous control algorithms, particularly in light of potential real-world applications.
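
To make the idea of a Bernoulli policy over action extremes concrete, the following is a minimal sketch and not the authors' implementation. It assumes a policy that emits one logit per action dimension and draws an independent Bernoulli sample per dimension to choose between the lower and upper action bound. The function name `sample_bang_bang`, the logit values, and the bounds of -1 and 1 are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_bang_bang(logits, low, high, rng):
    """Sample an action whose every dimension is either `low` or `high`.

    Each dimension is an independent Bernoulli draw; sigmoid(logit) is the
    probability of picking the upper bound. (Illustrative assumption, not
    taken from the paper.)
    """
    logits = np.asarray(logits, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    # Probability of choosing the upper bound in each dimension.
    p_high = 1.0 / (1.0 + np.exp(-logits))
    # One uniform draw per dimension decides which extreme is taken.
    pick_high = rng.random(logits.shape) < p_high
    return np.where(pick_high, high, low)


rng = np.random.default_rng(0)
logits = np.array([2.0, -1.0, 0.0])  # hypothetical policy outputs
low = np.full(3, -1.0)               # assumed lower action bounds
high = np.full(3, 1.0)               # assumed upper action bounds
action = sample_bang_bang(logits, low, high, rng)
```

Every entry of `action` lies on a boundary of the action space, whereas a Gaussian policy would place probability mass on the interior as well.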


research
10/19/2021

Continuous Control with Action Quantization from Demonstrations

In Reinforcement Learning (RL), discrete actions, as opposed to continuo...
research
10/09/2020

Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning

Learning to locomote is one of the most common tasks in physics-based an...
research
02/19/2019

Investigating Generalisation in Continuous Deep Reinforcement Learning

Deep Reinforcement Learning has shown great success in a variety of cont...
research
06/10/2020

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

In recent years, on-policy reinforcement learning (RL) has been successf...
research
01/02/2020

Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics

Many real-world control problems involve both discrete decision variable...
research
09/20/2022

Optimizing Crop Management with Reinforcement Learning and Imitation Learning

Crop management, including nitrogen (N) fertilization and irrigation man...
research
01/02/2022

Toward Causal-Aware RL: State-Wise Action-Refined Temporal Difference

Although it is well known that exploration plays a key role in Reinforce...
