Deep Successor Reinforcement Learning

06/08/2016
by Tejas D. Kulkarni, et al.

Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components -- a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state and the reward predictor maps states to scalar rewards. The value function of a state can be computed as the inner product between the successor map and the reward weights. In this paper, we present DSR, which generalizes SR within an end-to-end deep reinforcement learning framework. DSR has several appealing properties including: increased sensitivity to distal reward changes due to factorization of reward and world dynamics, and the ability to extract bottleneck states (subgoals) given successor maps trained under a random policy. We show the efficacy of our approach on two diverse environments given raw pixel observations -- simple grid-world domains (MazeBase) and the Doom game engine.
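
The decomposition described above can be illustrated in a small tabular setting. The following is a minimal sketch, not the deep architecture of the paper: it assumes a known transition matrix `P` under a fixed policy, a discount factor `gamma`, and a per-state reward vector `r`, all of which are hypothetical values chosen for illustration. Under those assumptions the successor map has the closed form M = (I - gamma P)^-1, and the value function is the product of M with the reward vector.

```python
import numpy as np

def successor_map(P, gamma):
    """Tabular successor map: expected discounted future state occupancy.

    M[s, s'] = sum over t of gamma^t * Pr(s_t = s' | s_0 = s),
    which equals (I - gamma * P)^-1 for a fixed-policy transition matrix P.
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

def value_from_sr(M, r):
    """Value of each state as the inner product of its successor row with r."""
    return M @ r

# Hypothetical 4-state chain: 0 -> 1 -> 2 -> 3, with state 3 absorbing.
P = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
])
gamma = 0.9

# Reward of 1 for occupying state 3 (an assumption for this example).
r = np.array([0.0, 0.0, 0.0, 1.0])

M = successor_map(P, gamma)
V = value_from_sr(M, r)

# If the reward moves to state 1, the same successor map is reused;
# only the reward vector changes, so no dynamics need to be relearned.
r_new = np.array([0.0, 1.0, 0.0, 0.0])
V_new = value_from_sr(M, r_new)

print("V     =", V)
print("V_new =", V_new)
```

Reusing `M` with a different reward vector is the tabular analogue of the sensitivity to reward changes mentioned in the abstract: the world dynamics and the reward are stored separately, so a reward change only requires updating `r`.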


research
09/30/2018

Using State Predictions for Value Regularization in Curiosity Driven Deep Reinforcement Learning

Learning in sparse reward settings remains a challenge in Reinforcement ...
research
04/07/2021

Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation

In recent years, there are great interests as well as challenges in appl...
research
04/13/2021

Reward Shaping with Dynamic Trajectory Aggregation

Reinforcement learning, which acquires a policy maximizing long-term rew...
research
06/22/2019

A neurally plausible model learns successor representations in partially observable environments

Animals need to devise strategies to maximize returns while interacting ...
research
09/07/2023

A State Representation for Diminishing Rewards

A common setting in multitask reinforcement learning (RL) demands that a...
research
06/18/2019

Inferred successor maps for better transfer learning

Humans and animals show remarkable flexibility in adjusting their behavi...
research
03/11/2022

Active Phase-Encode Selection for Slice-Specific Fast MR Scanning Using a Transformer-Based Deep Reinforcement Learning Framework

Purpose: Long scan time in phase encoding for forming complete K-space m...
