
Replay-Guided Adversarial Environment Design
Deep reinforcement learning (RL) agents may successfully generalize to n...
Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers
Self-supervised pretraining of large-scale transformer models on text c...
Implicit Communication as Minimum Entropy Coupling
In many common-payoff games, achieving good performance requires players...
Centralized Model and Exploration Policy for Multi-Agent RL
Reinforcement learning (RL) in partially observable, fully cooperative m...
Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings
Search is an important tool for computing effective policies in single ...
A New Formalism, Method and Open Issues for Zero-Shot Coordination
In many coordination problems, independently reasoning humans are able t...
Quasi-Equivalence Discovery for Zero-Shot Emergent Communication
Effective communication is an important skill for enabling information e...
Off-Belief Learning
The standard problem setting in Dec-POMDPs is self-play, where the goal ...
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Over the last decade, a single algorithm has changed many facets of our ...
Exploring Zero-Shot Emergent Communication in Embodied Multi-Agent Populations
Effective communication is an important skill for enabling information e...
The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets
For neural models to garner widespread public trust and ensure fairness,...
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
In many real-world settings, a team of agents must coordinate its behavi...
"Other-Play" for Zero-Shot Coordination
We consider the problem of zero-shot coordination: constructing AI agen...
Improving Policies via Search in Cooperative Partially Observable Games
Recent superhuman results in games have largely been achieved in a varie...
Capacity, Bandwidth, and Compositionality in Emergent Language Learning
Many recent works have discussed the propensity, or lack thereof, for em...
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning
Gradient-based methods for optimisation of objectives in stochastic sett...
A Survey of Reinforcement Learning Informed by Natural Language
To be successful in real-world tasks, Reinforcement Learning (RL) needs ...
Differentiable Game Mechanics
Deep learning is built on the foundational guarantee that gradient desce...
On the Pitfalls of Measuring Emergent Communication
How do we know if communication is emerging in a multi-agent system? The...
The StarCraft Multi-Agent Challenge
In the last few years, deep multi-agent reinforcement learning (RL) has ...
Stable Opponent Shaping in Differentiable Games
A growing number of learning methods are actually games which optimise m...
Pommerman: A Multi-Agent Playground
We present Pommerman, a multi-agent environment based on the classic con...
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
In many real-world settings, a team of agents must coordinate their beha...
The Mechanics of n-Player Differentiable Games
The cornerstone underpinning deep learning is the guarantee that gradien...
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
The score function estimator is widely used for estimating gradients of ...
Fake News in Social Networks
We model the spread of news as a social learning game on a network. Agen...
Counterfactual Multi-Agent Policy Gradients
Cooperative multi-agent systems can be naturally used to model many real...
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Many real-world problems, such as network packet routing and urban traff...
Jakob Foerster