
Stable Opponent Shaping in Differentiable Games
A growing number of learning methods are actually games which optimise m...
A Survey of Reinforcement Learning Informed by Natural Language
To be successful in real-world tasks, Reinforcement Learning (RL) needs ...
The StarCraft Multi-Agent Challenge
In the last few years, deep multi-agent reinforcement learning (RL) has ...
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning
Gradient-based methods for optimisation of objectives in stochastic sett...
On the Pitfalls of Measuring Emergent Communication
How do we know if communication is emerging in a multi-agent system? The...
Capacity, Bandwidth, and Compositionality in Emergent Language Learning
Many recent works have discussed the propensity, or lack thereof, for em...
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
In many real-world settings, a team of agents must coordinate its behavi...
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Many real-world problems, such as network packet routing and urban traff...
Counterfactual Multi-Agent Policy Gradients
Cooperative multi-agent systems can be naturally used to model many real...
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
The score function estimator is widely used for estimating gradients of ...
Fake News in Social Networks
We model the spread of news as a social learning game on a network. Agen...
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
In many real-world settings, a team of agents must coordinate their beha...
The Mechanics of n-Player Differentiable Games
The cornerstone underpinning deep learning is the guarantee that gradien...
Pommerman: A Multi-Agent Playground
We present Pommerman, a multi-agent environment based on the classic con...
Differentiable Game Mechanics
Deep learning is built on the foundational guarantee that gradient desce...
"Other-Play" for Zero-Shot Coordination
We consider the problem of zero-shot coordination - constructing AI agen...
Improving Policies via Search in Cooperative Partially Observable Games
Recent superhuman results in games have largely been achieved in a varie...
Jakob Foerster
