
Taylor Expansion of Discount Factors
In practical reinforcement learning (RL), the discount factor used for e...

Revisiting Peng's Q(λ) for Modern Reinforcement Learning
Off-policy multi-step reinforcement learning algorithms consist of conse...

On The Effect of Auxiliary Tasks on Representation Dynamics
While auxiliary tasks play a key role in shaping the representations lea...

Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning ...

Revisiting Fundamentals of Experience Replay
Experience replay is central to off-policy algorithms in deep reinforcem...

Navigating the Landscape of Multiplayer Games to Probe the Drosophila of AI
Multiplayer games have a long history in being used as key testbeds for ...

From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
In this paper we investigate the Follow the Regularized Leader dynamics ...

Conditional Importance Sampling for Off-Policy Learning
The principal contribution of this paper is a conceptual framework for o...

Adaptive Trade-Offs in Off-Policy Learning
A great variety of off-policy learning algorithms exist in the literatur...

A Generalized Training Approach for Multiagent Learning
This paper investigates a population-based training regime based on game...

Multiagent Evaluation under Incomplete Information
This paper investigates the evaluation of learned multiagent strategies ...

Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for buildi...

Orthogonal Estimation of Wasserstein Distances
Wasserstein distances are increasingly used in a wide variety of applica...

α-Rank: Multi-Agent Evaluation by Evolution
We introduce α-Rank, a principled evolutionary dynamics methodology, for...

Statistics and Samples in Distributional Reinforcement Learning
We present a unifying framework for designing and analysing distribution...

Antithetic and Monte Carlo kernel estimators for partial rankings
In the modern age, rankings data is ubiquitous and it is useful for a va...

Gaussian Process Behaviour in Wide Deep Neural Networks
Whilst deep neural networks have shown great empirical success, there is...

Structured Evolution with Compact Architectures for Scalable Policy Optimization
We present a new method of blackbox optimization via gradient approximat...

An Analysis of Categorical Distributional Reinforcement Learning
Distributional approaches to value-based reinforcement learning model th...

Distributional Reinforcement Learning with Quantile Regression
In reinforcement learning an agent interacts with the environment by tak...

The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings
We examine a class of embeddings based on structured random matrices wit...

Magnetic Hamiltonian Monte Carlo
Hamiltonian Monte Carlo (HMC) exploits Hamiltonian dynamics to construct...

Black-box α-divergence Minimization
Black-box alpha (BB-α) is a new approximate inference method based on th...
Mark Rowland