
-
Neural Production Systems
Visual environments are structured, consisting of distinct objects or en...
-
Counterfactual Credit Assignment in Model-Free Reinforcement Learning
Credit assignment in reinforcement learning is the problem of measuring ...
-
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning ...
-
Behavior Priors for Efficient Reinforcement Learning
As we deploy reinforcement learning agents to solve increasingly challen...
-
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification
Many real-world physical control systems are required to satisfy constra...
-
Learning Dexterous Manipulation from Suboptimal Experts
Learning dexterous manipulation in high-dimensional state-action spaces ...
-
Local Search for Policy Iteration in Continuous Control
We present an algorithm for local, regularized, policy improvement in re...
-
Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban
Intelligent robots need to achieve abstract objectives using concrete, s...
-
Learning to swim in potential flow
Fish swim by undulating their bodies. These propulsive motions require c...
-
Physically Embedded Planning Problems: New Challenges for Reinforcement Learning
Recent work in deep reinforcement learning (RL) has produced algorithms ...
-
Importance Weighted Policy Learning and Adaptation
The ability to exploit prior experience to solve novel problems rapidly ...
-
Action and Perception as Divergence Minimization
We introduce a unified objective for action and perception of intelligen...
-
Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion
Modern Reinforcement Learning (RL) algorithms promise to solve difficult...
-
Data-efficient Hindsight Off-policy Option Learning
Solutions to most complex tasks can be decomposed into simpler, intermed...
-
Critic Regularized Regression
Offline reinforcement learning (RL), also known as batch RL, offers the ...
-
RL Unplugged: Benchmarks for Offline Reinforcement Learning
Offline methods for reinforcement learning have the potential to help br...
-
dm_control: Software and Tasks for Continuous Control
The dm_control software package is a collection of Python libraries and ...
-
Simple Sensor Intentions for Exploration
Modern reinforcement learning algorithms can learn solutions to increasi...
-
A Distributional View on Multi-Objective Policy Optimization
Many real-world problems require trading off multiple competing objectiv...
-
Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning
Standard planners for sequential decision making (including Monte Carlo ...
-
Value-driven Hindsight Modelling
Value estimation is a critical component of the reinforcement learning (...
-
Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics
Many real-world control problems involve both discrete decision variable...
-
Hindsight Credit Assignment
We consider the problem of efficient credit assignment in reinforcement ...
-
Reusable neural skill embeddings for vision-guided whole body movement and object manipulation
Both in simulation settings and robotics, there is an ambition to produc...
-
Quinoa: a Q-function You Infer Normalized Over Actions
We present an algorithm for learning an approximate action-value soft Q-...
-
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions
A plethora of problems in AI, engineering and the sciences are naturally...
-
Stabilizing Transformers for Reinforcement Learning
Owing to their ability to both effectively integrate information over lo...
-
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Humans are masters at quickly learning many complex tasks, relying on an...
-
A Generalized Training Approach for Multiagent Learning
This paper investigates a population-based training regime based on game...
-
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
Some of the most successful applications of deep reinforcement learning ...
-
Regularized Hierarchical Policies for Compositional Transfer in Robotics
The successful application of flexible, general learning algorithms -- s...
-
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Direct optimization is an appealing approach to differentiating through ...
-
Meta reinforcement learning as task inference
Humans achieve efficient learning by relying on prior knowledge about th...
-
Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for buildi...
-
Information asymmetry in KL-regularized RL
Many real world tasks exhibit rich structure that is repeated across dif...
-
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL
As reinforcement learning agents are tasked with solving more challengin...
-
The Termination Critic
In this work, we consider the problem of autonomously discovering behavi...
-
Emergent Coordination Through Competition
We study the emergence of cooperative behaviors in reinforcement learnin...
-
Value constrained model-free continuous control
The naive application of Reinforcement Learning algorithms to continuous...
-
Credit Assignment Techniques in Stochastic Computation Graphs
Stochastic computation graphs (SCGs) provide a formalism to represent st...
-
Self-supervised Learning of Image Embedding for Continuous Control
Operating directly from raw high dimensional sensory inputs like images ...
-
Relative Entropy Regularized Policy Iteration
We present an off-policy actor-critic algorithm for Reinforcement Learni...
-
Entropic Policy Composition with Generalized Policy Improvement and Divergence Correction
Deep reinforcement learning (RL) algorithms have made great strides in r...
-
Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures
This paper addresses the problem of evaluating learning systems in safet...
-
Neural probabilistic motor primitives for humanoid control
We focus on the problem of learning a single motor module that can flexi...
-
Hierarchical visuomotor control of humanoids
We aim to build complex humanoid agents that integrate perception, motor...
-
Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
Learning policies on data synthesized by models can in principle quench ...
-
Maximum a Posteriori Policy Optimisation
We introduce a new algorithm for reinforcement learning called Maximum a...
-
Mix&Match - Agent Curricula for Reinforcement Learning
We introduce Mix&Match (M&M) - a training framework designed to facilita...
-
Relational inductive biases, deep learning, and graph networks
Artificial intelligence (AI) has undergone a renaissance recently, makin...