
Physically Embedded Planning Problems: New Challenges for Reinforcement Learning
Recent work in deep reinforcement learning (RL) has produced algorithms ...

Importance Weighted Policy Learning and Adaptation
The ability to exploit prior experience to solve novel problems rapidly ...

Action and Perception as Divergence Minimization
We introduce a unified objective for action and perception of intelligen...

Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion
Modern Reinforcement Learning (RL) algorithms promise to solve difficult...

Data-efficient Hindsight Off-policy Option Learning
Solutions to most complex tasks can be decomposed into simpler, intermed...

Critic Regularized Regression
Offline reinforcement learning (RL), also known as batch RL, offers the ...

RL Unplugged: Benchmarks for Offline Reinforcement Learning
Offline methods for reinforcement learning have the potential to help br...

dm_control: Software and Tasks for Continuous Control
The dm_control software package is a collection of Python libraries and ...

Simple Sensor Intentions for Exploration
Modern reinforcement learning algorithms can learn solutions to increasi...

A Distributional View on Multi-Objective Policy Optimization
Many real-world problems require trading off multiple competing objectiv...

Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning
Standard planners for sequential decision making (including Monte Carlo ...

Value-driven Hindsight Modelling
Value estimation is a critical component of the reinforcement learning (...

Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics
Many real-world control problems involve both discrete decision variable...

Hindsight Credit Assignment
We consider the problem of efficient credit assignment in reinforcement ...

Reusable neural skill embeddings for vision-guided whole body movement and object manipulation
Both in simulation settings and robotics, there is an ambition to produc...

Quinoa: a Q-function You Infer Normalized Over Actions
We present an algorithm for learning an approximate action-value soft Q...

Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions
A plethora of problems in AI, engineering and the sciences are naturally...

Stabilizing Transformers for Reinforcement Learning
Owing to their ability to both effectively integrate information over lo...

Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Humans are masters at quickly learning many complex tasks, relying on an...

A Generalized Training Approach for Multiagent Learning
This paper investigates a population-based training regime based on game...

V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
Some of the most successful applications of deep reinforcement learning ...

Regularized Hierarchical Policies for Compositional Transfer in Robotics
The successful application of flexible, general learning algorithms  s...

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Direct optimization is an appealing approach to differentiating through ...

Meta reinforcement learning as task inference
Humans achieve efficient learning by relying on prior knowledge about th...

Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for buildi...

Information asymmetry in KL-regularized RL
Many real world tasks exhibit rich structure that is repeated across dif...

Exploiting Hierarchy for Learning and Transfer in KL-regularized RL
As reinforcement learning agents are tasked with solving more challengin...

The Termination Critic
In this work, we consider the problem of autonomously discovering behavi...

Emergent Coordination Through Competition
We study the emergence of cooperative behaviors in reinforcement learnin...

Value constrained model-free continuous control
The naive application of Reinforcement Learning algorithms to continuous...

Credit Assignment Techniques in Stochastic Computation Graphs
Stochastic computation graphs (SCGs) provide a formalism to represent st...

Self-supervised Learning of Image Embedding for Continuous Control
Operating directly from raw high dimensional sensory inputs like images ...

Relative Entropy Regularized Policy Iteration
We present an off-policy actor-critic algorithm for Reinforcement Learni...

Entropic Policy Composition with Generalized Policy Improvement and Divergence Correction
Deep reinforcement learning (RL) algorithms have made great strides in r...

Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures
This paper addresses the problem of evaluating learning systems in safet...

Neural probabilistic motor primitives for humanoid control
We focus on the problem of learning a single motor module that can flexi...

Hierarchical visuomotor control of humanoids
We aim to build complex humanoid agents that integrate perception, motor...

Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
Learning policies on data synthesized by models can in principle quench ...

Maximum a Posteriori Policy Optimisation
We introduce a new algorithm for reinforcement learning called Maximum a...

Mix&Match - Agent Curricula for Reinforcement Learning
We introduce Mix&Match (M&M) - a training framework designed to facilita...

Relational inductive biases, deep learning, and graph networks
Artificial intelligence (AI) has undergone a renaissance recently, makin...

Graph networks as learnable physics engines for inference and control
Understanding and interacting with everyday physical scenes requires ric...

Distributed Distributional Deterministic Policy Gradients
This work adopts the very successful distributional perspective on reinf...

Learning by Playing - Solving Sparse Reward Tasks from Scratch
We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm ...

Reinforcement and Imitation Learning for Diverse Visuomotor Skills
We propose a model-free deep reinforcement learning method that leverage...

Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
We propose a general and model-free approach for Reinforcement Learning ...

Imagination-Augmented Agents for Deep Reinforcement Learning
We introduce Imagination-Augmented Agents (I2As), a novel architecture f...

Learning model-based planning from scratch
Conventional wisdom holds that model-based planning is a powerful approa...

Distral: Robust Multitask Reinforcement Learning
Most deep reinforcement learning algorithms are data inefficient in comp...

Emergence of Locomotion Behaviours in Rich Environments
The reinforcement learning paradigm allows, in principle, for complex be...
Nicolas Heess
PhD student at the Institute for Adaptive and Neural Computation, University of Edinburgh