Analyzing the Hidden Activations of Deep Policy Networks: Why Representation Matters

03/11/2021
by Trevor A. McInroe, et al.

We analyze the hidden activations of neural network policies of deep reinforcement learning (RL) agents and show, empirically, that it is possible to know a priori whether a state representation will lend itself to fast learning. RL agents in high-dimensional states have two main learning burdens: (1) to learn an action-selection policy and (2) to learn to distinguish useful from non-useful information in a given state. By learning a latent representation of these high-dimensional states with an auxiliary model, the latter burden is effectively removed, thereby leading to accelerated training progress. We examine this phenomenon across tasks in the PyBullet Kuka environment, where an agent must learn to control a robotic gripper to pick up an object. Our analysis reveals how neural network policies learn to organize their internal representation of the state space throughout training. The results from this analysis provide three main insights into how deep RL agents learn. First, a well-organized internal representation within the policy network is a prerequisite to learning good action-selection. Second, a poor initial representation can cause an unrecoverable collapse within a policy network. Third, a good initial representation allows an agent's policy network to organize its internal representation even before any training begins.
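
The abstract does not say how the hidden activations are collected or how "well-organized" is measured, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a small fully connected policy with tanh hidden layers and random weights, two hypothetical groups of input states, and a made-up `separation_score` (distance between group centroids divided by the mean within-group spread) as a stand-in for how cleanly the last hidden layer separates the two groups. All names, layer sizes, and the metric itself are illustrative assumptions.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random dense layer as (weights, biases); weights[j][i] maps input i to unit j."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, activation=None):
    """Apply one dense layer to input vector x, with an optional activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def policy_forward(layers, state):
    """Return (action_logits, hidden_activations) for a single state."""
    hidden = []
    h = state
    for layer in layers[:-1]:
        h = dense(layer, h, math.tanh)
        hidden.append(h)  # record the activations of every hidden layer
    return dense(layers[-1], h), hidden


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def separation_score(group_a, group_b):
    """Hypothetical metric: centroid distance divided by mean within-group spread."""
    ca, cb = centroid(group_a), centroid(group_b)
    between = math.dist(ca, cb)
    spread = [math.dist(v, ca) for v in group_a] + [math.dist(v, cb) for v in group_b]
    within = sum(spread) / len(spread)
    return between / within if within > 0 else float("inf")


rng = random.Random(0)
# Assumed architecture: 8-dimensional state, two hidden layers of 16 units, 4 actions.
layers = [make_layer(8, 16, rng), make_layer(16, 16, rng), make_layer(16, 4, rng)]

# Two hypothetical groups of states, e.g. states that call for different actions.
states_a = [[rng.gauss(1.0, 0.3) for _ in range(8)] for _ in range(50)]
states_b = [[rng.gauss(-1.0, 0.3) for _ in range(8)] for _ in range(50)]

# Collect last-hidden-layer activations of the untrained policy for each group.
hidden_a = [policy_forward(layers, s)[1][-1] for s in states_a]
hidden_b = [policy_forward(layers, s)[1][-1] for s in states_b]

score = separation_score(hidden_a, hidden_b)
print(f"separation of last hidden layer before training: {score:.3f}")
```

Under these assumptions, computing the same score for different state representations before any training, and again at checkpoints during training, would be one simple way to compare how each representation shapes the policy network's internal organization.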

