Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks

11/21/2022
by Anton Dereventsov, et al.

This work examines the behavior of reinforcement learning systems in personalization environments and details how policy entropy differs with the type of learning algorithm used. We demonstrate that Policy Optimization agents often possess low-entropy policies during training, which in practice results in agents prioritizing certain actions and avoiding others. Conversely, we show that Q-Learning agents are far less susceptible to such behavior and generally maintain high-entropy policies throughout training, which is often preferable in real-world applications. We provide a wide range of numerical experiments as well as theoretical justification to show that these differences in entropy are due to the type of learning being employed.
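As a rough illustration of the quantity discussed above, the following sketch computes the Shannon entropy of a policy's action distribution at a single state. The abstract does not state how entropy is measured, so the function name `policy_entropy`, the use of natural logarithms, and the two example distributions are assumptions made purely for illustration.

```python
import numpy as np


def policy_entropy(probs):
    """Shannon entropy (in nats) of one action distribution.

    Hypothetical helper for illustration; not taken from the paper.
    """
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()      # normalize in case the input does not sum to 1
    nz = p[p > 0]        # drop zero entries, since 0 * log(0) is taken as 0
    return float(-(nz * np.log(nz)).sum())


# Illustrative distributions over four actions (made-up values).
uniform = np.full(4, 0.25)                    # every action equally likely
peaked = np.array([0.97, 0.01, 0.01, 0.01])   # one action dominates

print("uniform policy entropy:", policy_entropy(uniform))
print("peaked policy entropy: ", policy_entropy(peaked))
```

The uniform distribution attains the maximum possible entropy for four actions, ln 4 (about 1.386 nats), while the peaked distribution scores much lower. A low-entropy policy in this sense is one that concentrates probability on a few actions and rarely selects the rest.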


research
11/03/2019

Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning

Two hitherto disconnected threads of research, diverse exploration (DE) ...
research
04/21/2021

Policy Fusion for Adaptive and Customizable Reinforcement Learning Agents

In this article we study the problem of training intelligent agents usin...
research
05/25/2020

Policy Entropy for Out-of-Distribution Classification

One critical prerequisite for the deployment of reinforcement learning s...
research
09/19/2022

Measuring Interventional Robustness in Reinforcement Learning

Recent work in reinforcement learning has focused on several characteris...
research
10/05/2019

Attention-based Fault-tolerant Approach for Multi-agent Reinforcement Learning Systems

The aim of multi-agent reinforcement learning systems is to provide inte...
research
11/27/2018

Understanding the impact of entropy in policy learning

Entropy regularization is commonly used to improve policy optimization i...
research
06/26/2020

Q-Learning with Differential Entropy of Q-Tables

It is well-known that information loss can occur in the classic and simp...
