Toward Causal-Aware RL: State-Wise Action-Refined Temporal Difference

01/02/2022
by Hao Sun, et al.

Although exploration is well known to play a key role in Reinforcement Learning (RL), prevailing exploration strategies for continuous control tasks rely mainly on naive isotropic Gaussian noise. Such noise ignores the causal relationship between the action space and the task, and treats all action dimensions as equally important. In this work, we propose to conduct interventions on the primal action space to discover the causal relationship between the action space and the task reward. We propose State-Wise Action-Refined (SWAR), a method that addresses action space redundancy and promotes causality discovery in RL. We formulate causality discovery in RL tasks as a state-dependent action space selection problem and propose two practical algorithms as solutions. The first approach, TD-SWAR, detects task-related actions during temporal difference learning, while the second approach, Dyn-SWAR, reveals important actions through dynamic model prediction. Empirically, both methods help explain the decisions made by RL agents and improve learning efficiency in action-redundant tasks.
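To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's TD-SWAR or Dyn-SWAR algorithm. It shows isotropic Gaussian exploration noise applied to every action dimension, followed by a state-dependent binary mask that keeps only the dimensions assumed to be task-related. The function names, the mask values, and the choice of 0.0 as the replacement value for masked dimensions are all illustrative assumptions; in the paper the selection is learned rather than hand-written.

```python
import numpy as np


def isotropic_gaussian_exploration(action, sigma, rng):
    """Add Gaussian noise of the same scale to every action dimension."""
    return action + sigma * rng.standard_normal(action.shape)


def refine_action(action, mask, default=0.0):
    """Keep dimensions where mask is 1; replace the others with a default.

    Hypothetical helper: the mask stands in for a state-dependent
    selection of task-related action dimensions.
    """
    mask = np.asarray(mask, dtype=bool)
    return np.where(mask, action, default)


rng = np.random.default_rng(0)
action = np.array([0.5, -0.2, 0.1, 0.9])

# Assumption for illustration: in this state only dims 0 and 3 matter.
mask = np.array([1, 0, 0, 1])

noisy = isotropic_gaussian_exploration(action, sigma=0.1, rng=rng)
refined = refine_action(noisy, mask)
```

Under these assumptions, the noise on dimensions 1 and 2 is discarded, so exploration effort is spent only on the dimensions the mask selects.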


research
06/06/2017

Parameter Space Noise for Exploration

Deep reinforcement learning (RL) methods generally engage in exploratory...
research
12/06/2021

Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning

Deep reinforcement learning (RL) agents are becoming increasingly profic...
research
10/19/2021

Continuous Control with Action Quantization from Demonstrations

In Reinforcement Learning (RL), discrete actions, as opposed to continuo...
research
09/06/2018

Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Learning how to act when there are many available actions in each state ...
research
05/30/2022

DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

Muscle-actuated organisms are capable of learning an unparalleled divers...
research
05/11/2022

Characterizing the Action-Generalization Gap in Deep Q-Learning

We study the action generalization ability of deep Q-learning in discret...
research
11/03/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Reinforcement learning (RL) for continuous control typically employs dis...
