
-
Bootstrapped Representation Learning on Graphs
Current state-of-the-art self-supervised learning methods for graph neur...
-
Geometric Entropic Exploration
Exploration is essential for solving complex Reinforcement Learning (RL)...
-
Counterfactual Credit Assignment in Model-Free Reinforcement Learning
Credit assignment in reinforcement learning is the problem of measuring ...
-
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning ...
-
The Advantage Regret-Matching Actor-Critic
Regret minimization has played a key role in online learning, equilibriu...
-
Monte-Carlo Tree Search as Regularized Policy Optimization
The combination of Monte-Carlo tree search (MCTS) with deep reinforcemen...
-
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-su...
-
Navigating the Landscape of Multiplayer Games to Probe the Drosophila of AI
Multiplayer games have a long history in being used as key testbeds for ...
-
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
Learning a good representation is an essential component for deep reinfo...
-
Leverage the Average: an Analysis of Regularization in RL
Building upon the formalism of regularized Markov decision processes, we...
-
Taylor Expansion Policy Optimization
In this work, we investigate the application of Taylor expansions in rei...
-
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
In this paper we investigate the Follow the Regularized Leader dynamics ...
-
Hindsight Credit Assignment
We consider the problem of efficient credit assignment in reinforcement ...
-
Conditional Importance Sampling for Off-Policy Learning
The principal contribution of this paper is a conceptual framework for o...
-
Adaptive Trade-Offs in Off-Policy Learning
A great variety of off-policy learning algorithms exist in the literatur...
-
A Generalized Training Approach for Multiagent Learning
This paper investigates a population-based training regime based on game...
-
Multiagent Evaluation under Incomplete Information
This paper investigates the evaluation of learned multiagent strategies ...
-
Neural Replicator Dynamics
In multiagent learning, agents interact in inherently nonstationary envi...
-
α-Rank: Multi-Agent Evaluation by Evolution
We introduce α-Rank, a principled evolutionary dynamics methodology, for...
-
The Termination Critic
In this work, we consider the problem of autonomously discovering behavi...
-
Statistics and Samples in Distributional Reinforcement Learning
We present a unifying framework for designing and analysing distribution...
-
World Discovery Models
As humans we are driven by a strong desire for seeking novelty in our wo...
-
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement
The ability to transfer skills across tasks has the potential to scale u...
-
Optimistic optimization of a Brownian
We address the problem of optimizing a Brownian motion. We consider a (r...
-
Universal Successor Features Approximators
The ability of a reinforcement learning (RL) agent to learn about many r...
-
Neural Predictive Belief Representations
Unsupervised representation learning has succeeded with excellent result...
-
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
Optimization of parameterized policies for reinforcement learning (RL) i...
-
Autoregressive Quantile Networks for Generative Modeling
We introduce autoregressive implicit quantile networks (AIQN), a fundame...
-
Implicit Quantile Networks for Distributional Reinforcement Learning
In this work, we build on recent advances in distributional reinforcemen...
-
Maximum a Posteriori Policy Optimisation
We introduce a new algorithm for reinforcement learning called Maximum a...
-
Observe and Look Further: Achieving Consistent Performance on Atari
Despite significant advances in the field of deep Reinforcement Learning...
-
Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery
Reinforcement learning (RL) agents performing complex tasks must be able...
-
A Study on Overfitting in Deep Reinforcement Learning
Recent years have witnessed significant progresses in deep Reinforcement...
-
An Analysis of Categorical Distributional Reinforcement Learning
Distributional approaches to value-based reinforcement learning model th...
-
Learning to Search with MCTSnets
Planning problems are among the most important and well-studied problems...
-
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
In this work we aim to solve a large collection of tasks using a single ...
-
Distributional Reinforcement Learning with Quantile Regression
In reinforcement learning an agent interacts with the environment by tak...
-
The Uncertainty Bellman Equation and Exploration
We consider the exploration/exploitation problem in reinforcement learni...
-
A Distributional Perspective on Reinforcement Learning
In this paper we argue for the fundamental importance of the value distr...
-
Noisy Networks for Exploration
We introduce NoisyNet, a deep reinforcement learning agent with parametr...
-
Observational Learning by Reinforcement Learning
Observational learning is a type of learning that occurs as a function o...
-
The Cramer Distance as a Solution to Biased Wasserstein Gradients
The Wasserstein probability metric has received much attention from the ...
-
The Reactor: A Sample-Efficient Actor-Critic Architecture
In this work we present a new reinforcement learning agent, called React...
-
Automated Curriculum Learning for Neural Networks
We introduce a method for automatically selecting the path, or syllabus,...
-
Minimax Regret Bounds for Reinforcement Learning
We consider the problem of provably optimal exploration in reinforcement...
-
Learning to reinforcement learn
In recent years deep reinforcement learning (RL) systems have attained s...
-
Combining policy gradient and Q-learning
Policy gradient is an efficient technique for improving a policy in a re...
-
Successor Features for Transfer in Reinforcement Learning
Transfer in reinforcement learning refers to the notion that generalizat...
-
Memory-Efficient Backpropagation Through Time
We propose a novel approach to reduce memory consumption of the backprop...
-
Safe and Efficient Off-Policy Reinforcement Learning
In this work, we take a fresh look at some old and new algorithms for of...