Q-Learning with Differential Entropy of Q-Tables

06/26/2020
by Tung D. Nguyen, et al.

It is well known that information loss can occur in the classic, simple Q-learning algorithm. Entropy-based policy search methods were introduced to replace Q-learning and to design algorithms that are more robust against information loss. We conjecture that the reduction in performance during prolonged Q-learning training sessions is caused by a loss of information, a loss that is not visible when one examines only the cumulative reward and leaves the Q-learning algorithm itself unchanged. We introduce Differential Entropy of Q-tables (DE-QT) as an information loss detector that is external to the Q-learning algorithm. The behaviour of DE-QT over training episodes is analyzed to find an appropriate stopping criterion during training. The results reveal that DE-QT can detect the most appropriate stopping point for the classic Q-learning algorithm, where a high success rate is balanced against high efficiency.
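The abstract does not say how DE-QT is computed, so the following is only an illustrative sketch of how the differential entropy of a Q-table could be tracked from outside an unmodified tabular Q-learning loop. It assumes, purely for illustration, that the Q-table entries are summarised by a fitted Gaussian, whose differential entropy is 0.5 * ln(2 * pi * e * variance). The function names, the Gaussian assumption, and the default hyperparameters are hypothetical and are not taken from the paper.

```python
import math
from statistics import pvariance


def gaussian_differential_entropy(q_table):
    """Differential entropy (in nats) of a Gaussian fitted to all Q-table entries.

    ASSUMPTION: the Gaussian fit is an illustrative choice, not necessarily
    the estimator used by the authors.
    """
    values = [q for row in q_table for q in row]
    var = pvariance(values)
    if var <= 0.0:
        # All entries identical: the fitted density is degenerate.
        return float("-inf")
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


def q_update(q_table, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, left unchanged by the detector."""
    target = r + gamma * max(q_table[s_next])
    q_table[s][a] += alpha * (target - q_table[s][a])


# Usage: the entropy is read from the Q-table after an update,
# without touching the learning rule itself.
q = [[0.0, 0.0], [0.0, 0.0]]
q_update(q, s=0, a=0, r=1.0, s_next=1)
entropy_after_update = gaussian_differential_entropy(q)
```

Logging this value once per episode would give a curve that can be inspected alongside the cumulative reward; which feature of that curve marks a good stopping point is the question the paper investigates, and it is not encoded in this sketch.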


