Ranking Policy Decisions

08/31/2020
by Hadrien Pouget, et al.

Policies trained via Reinforcement Learning (RL) are often needlessly complex, making them more difficult to analyse and interpret. In a run with n time steps, a policy decides n times on an action to take, even when only a tiny subset of these decisions delivers value over selecting a simple default action. Given a pre-trained policy, we propose a black-box method based on statistical fault localisation that ranks the states of the environment according to the importance of the decisions made in those states. We evaluate our ranking method by creating new, simpler policies by pruning the decisions identified as unimportant, and measuring the impact on performance. Our experimental results on a diverse set of standard benchmarks (gridworld, CartPole, Atari games) show that in some cases fewer than half of the decisions made contribute to the expected reward. We furthermore show that the decisions made in the most frequently visited states are not the most important for the expected reward.
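To make the idea of ranking states by fault localisation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each run replaces the policy's decision with the default action in a random subset of states, that a run either passes or fails, and that states are scored with the Ochiai measure from the fault-localisation literature; the abstract does not name a specific measure. The names `rank_states`, `prune_policy`, and `toy_run_passes`, as well as the five-state toy task, are hypothetical.

```python
import math
import random


def rank_states(states, run_passes, n_runs=200, mutation_rate=0.5, seed=0):
    """Rank states by how strongly replacing their decision with the
    default action is associated with failed runs (Ochiai score).

    run_passes(mutated) is a hypothetical callback: it receives the set of
    states in which the default action was used and returns True if the
    run still counts as a success.
    """
    rng = random.Random(seed)
    # ef: mutated and run failed, ep: mutated and run passed,
    # nf: not mutated and run failed.
    counts = {s: {"ef": 0, "ep": 0, "nf": 0} for s in states}

    for _ in range(n_runs):
        mutated = {s for s in states if rng.random() < mutation_rate}
        passed = run_passes(mutated)
        for s in states:
            if s in mutated:
                counts[s]["ep" if passed else "ef"] += 1
            elif not passed:
                counts[s]["nf"] += 1

    def ochiai(c):
        denom = math.sqrt((c["ef"] + c["nf"]) * (c["ef"] + c["ep"]))
        return c["ef"] / denom if denom else 0.0

    scores = {s: ochiai(counts[s]) for s in states}
    ranking = sorted(states, key=lambda s: scores[s], reverse=True)
    return ranking, scores


def prune_policy(policy, ranking, keep, default_action):
    """Keep the original decision only in the `keep` top-ranked states."""
    important = set(ranking[:keep])
    return {s: (a if s in important else default_action)
            for s, a in policy.items()}


def toy_run_passes(mutated):
    # Assumed toy task: only the decision taken in state 2 matters.
    return 2 not in mutated


ranking, scores = rank_states(list(range(5)), toy_run_passes)
policy = {s: 1 for s in range(5)}  # hypothetical original policy
pruned = prune_policy(policy, ranking, keep=1, default_action=0)
```

In this toy task the single decisive state rises to the top of the ranking, and the pruned policy keeps the original decision only there while falling back to the default action everywhere else.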

research
11/16/2021

Causal policy ranking

Policies trained via reinforcement learning (RL) are often very complex ...
research
10/08/2021

Training Transition Policies via Distribution Matching for Complex Tasks

Humans decompose novel complex tasks into simpler ones to exploit previo...
research
08/22/2022

Improving Sample Efficiency in Evolutionary RL Using Off-Policy Ranking

Evolution Strategy (ES) is a powerful black-box optimization technique b...
research
03/16/2022

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act

Traditionally, Reinforcement Learning (RL) aims at deciding how to act o...
research
03/19/2018

An Adaptable System to Support Provenance Management for the Public Policy-Making Process in Smart Cities

Government policies aim to address public issues and problems and theref...
research
03/17/2023

Policy/mechanism separation in the Warehouse-Scale OS

"As many of us know from bitter experience, the policies provided in ext...
research
06/06/2020

Understanding Finite-State Representations of Recurrent Policy Networks

We introduce an approach for understanding finite-state machine (FSM) re...
