Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective

01/31/2019
by Anirudh Vemula, et al.

Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem. We examine these black-box methods closely to identify the situations in which they are worse than action space exploration methods and those in which they are superior. Through simple theoretical analyses, we prove that the complexity of exploration in parameter space depends on the dimensionality of the parameter space, while the complexity of exploration in action space depends on both the dimensionality of the action space and the horizon length. We also demonstrate this empirically by comparing simple exploration methods on several model problems, including contextual bandits, linear regression, and reinforcement learning in continuous control.
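To make the contrast concrete, here is a minimal sketch of the two exploration styles as two-point zeroth-order gradient estimators on a one-step linear regression problem. Everything in it is an illustrative assumption rather than the authors' implementation: the quadratic loss, the linear policy `a = W @ x`, the function names, the sizes `p` (action dimension) and `d` (input dimension), and the smoothing radius `delta`. Because the problem has a single step, the horizon dependence mentioned in the abstract is not exercised; only the dimensionality effect is.

```python
import numpy as np


def unit_vector(rng, shape):
    """Sample a direction uniformly from the unit sphere of the given shape."""
    v = rng.standard_normal(shape)
    return v / np.linalg.norm(v)


def loss(W, x, W_star):
    """Squared error between the predicted action W @ x and the target W_star @ x."""
    diff = W @ x - W_star @ x
    return float(diff @ diff)


def param_space_grad(rng, W, x, W_star, delta=1e-2):
    """Two-point estimate obtained by perturbing all p * d parameters at once."""
    U = unit_vector(rng, W.shape)
    scale = W.size / (2.0 * delta)
    gap = loss(W + delta * U, x, W_star) - loss(W - delta * U, x, W_star)
    return scale * gap * U


def action_space_grad(rng, W, x, W_star, delta=1e-2):
    """Two-point estimate obtained by perturbing only the p-dimensional action,
    then applying the chain rule through the known linear policy."""
    action = W @ x
    target = W_star @ x
    u = unit_vector(rng, action.shape)

    def action_loss(a):
        return float((a - target) @ (a - target))

    scale = action.size / (2.0 * delta)
    gap = action_loss(action + delta * u) - action_loss(action - delta * u)
    grad_action = scale * gap * u
    return np.outer(grad_action, x)  # d(loss)/dW = d(loss)/da * x^T


def true_grad(W, x, W_star):
    """Analytic gradient of the squared error with respect to W."""
    return 2.0 * np.outer(W @ x - W_star @ x, x)


def mean_sq_error(estimator, n_trials=2000, p=2, d=20, seed=0):
    """Average squared Frobenius error of an estimator against the true gradient."""
    rng = np.random.default_rng(seed)
    W = np.zeros((p, d))
    W_star = rng.standard_normal((p, d))
    x = rng.standard_normal(d)
    g = true_grad(W, x, W_star)
    errors = [np.sum((estimator(rng, W, x, W_star) - g) ** 2) for _ in range(n_trials)]
    return float(np.mean(errors))


if __name__ == "__main__":
    print("parameter space MSE:", mean_sq_error(param_space_grad))
    print("action space MSE:   ", mean_sq_error(action_space_grad))
```

In this one-step setting with a small action dimension (`p = 2`) and a larger parameter count (`p * d = 40`), the action space estimator should show a much lower error, since its variance scales with `p` rather than `p * d`. This matches the dimensionality argument in the abstract, but it says nothing about long horizons, where the abstract indicates action space exploration becomes more costly.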


