
The Importance of Pessimism in Fixed-Dataset Policy Optimization
We study worst-case guarantees on the expected return of fixed-dataset p...

Representations for Stable Off-Policy Reinforcement Learning
Reinforcement learning with function approximation can be unstable and e...

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
We present a distributional approach to theoretical analyses of reinforc...

Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces
Despite the wealth of research into provably efficient reinforcement lea...

On Catastrophic Interference in Atari 2600 Games
Model-free deep reinforcement learning algorithms are troubled with poor...

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Text-based games are a natural challenge domain for deep reinforcement l...

Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
This paper provides an empirical evaluation of recently developed explor...

DeepMDP: Learning Continuous Latent Space Models for Representation Learning
Many reinforcement learning (RL) tasks provide the agent with high-dimen...

Statistics and Samples in Distributional Reinforcement Learning
We present a unifying framework for designing and analysing distribution...

Hyperbolic Discounting and Learning over Multiple Horizons
Reinforcement learning (RL) typically defines a discount factor as part ...

Distributional reinforcement learning with linear function approximation
Despite many algorithmic advances, our theoretical understanding of prac...

The Hanabi Challenge: A New Frontier for AI Research
From the early days of computing, games have been important testbeds for...

A Geometric Perspective on Optimal Representations for Reinforcement Learning
This paper proposes a new approach to representation learning based on g...

Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue
We consider the problem of designing an artificial agent capable of inte...

The Value Function Polytope in Reinforcement Learning
We establish geometric and topological properties of the space of value ...

A Comparative Analysis of Expected and Distributional Reinforcement Learning
Since their introduction a year ago, distributional approaches to reinfo...

Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
In this paper we revisit the method of off-policy corrections for reinfo...

Dopamine: A Research Framework for Deep Reinforcement Learning
Deep reinforcement learning (deep RL) research has grown significantly i...

An Introduction to Deep Reinforcement Learning
Deep reinforcement learning is the combination of reinforcement learning...

Approximate Exploration through State Abstraction
Although exploration in reinforcement learning is well understood from a...

Count-Based Exploration with the Successor Representation
The problem of exploration in reinforcement learning is well-understood ...

An Analysis of Categorical Distributional Reinforcement Learning
Distributional approaches to value-based reinforcement learning model th...

Distributional Reinforcement Learning with Quantile Regression
In reinforcement learning an agent interacts with the environment by tak...

A Distributional Perspective on Reinforcement Learning
In this paper we argue for the fundamental importance of the value distr...

The Cramer Distance as a Solution to Biased Wasserstein Gradients
The Wasserstein probability metric has received much attention from the ...

The Reactor: A Sample-Efficient Actor-Critic Architecture
In this work we present a new reinforcement learning agent, called React...

Automated Curriculum Learning for Neural Networks
We introduce a method for automatically selecting the path, or syllabus,...

Safe and Efficient Off-Policy Reinforcement Learning
In this work, we take a fresh look at some old and new algorithms for of...

Unifying Count-Based Exploration and Intrinsic Motivation
We consider an agent's uncertainty about its environment and the problem...

Q(λ) with Off-Policy Corrections
We propose and analyze an alternate approach to off-policy multi-step te...

Increasing the Action Gap: New Operators for Reinforcement Learning
This paper introduces new optimality-preserving operators on Q-functions...

Compress and Control
This paper describes a new information-theoretic policy evaluation techn...

The Arcade Learning Environment: An Evaluation Platform for General Agents
In this article we introduce the Arcade Learning Environment (ALE): both...
Marc G. Bellemare
Research Scientist at Google Brain, Adjunct Professor at McGill University