
Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games
Hindsight rationality is an approach to playing multiagent, general-sum...

Solving Common-Payoff Games with Approximate Policy Iteration
For artificially intelligent learning systems to have widespread applica...

Hindsight and Sequential Rationality of Correlated Play
Driven by recent successes in two-player, zero-sum game solving and play...

Useful Policy Invariant Shaping from Arbitrary Advice
Reinforcement learning is a powerful learning paradigm in which agents c...

The Advantage Regret-Matching Actor-Critic
Regret minimization has played a key role in online learning, equilibriu...

Sound Search in Imperfect Information Games
Search has played a fundamental role in computer game research since the...

Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
Sample-based planning is a powerful family of algorithms for generating ...

Sample-Efficient Model-based Actor-Critic for an Interactive Dialogue Task
Human-computer interactive systems that rely on machine learning are bec...

Approximate exploitability: Learning a best response in large games
A common metric in games of imperfect information is exploitability, i.e...

Alternative Function Approximation Parameterizations for Solving Games: An Analysis of f-Regression Counterfactual Regret Minimization
Function approximation is a powerful approach for structuring large deci...

Low-Variance and Zero-Variance Baselines for Extensive-Form Games
Extensive-form games (EFGs) are a common model of multiagent interactio...

Rethinking Formal Models of Partially Observable Multiagent Decision Making
Multiagent decision-making problems in partially observable environments...

Ease-of-Teaching and Language Structure from Emergent Communication
Artificial agents have been shown to learn to communicate when needed to...

The Hanabi Challenge: A New Frontier for AI Research
From the early days of computing, games have been important testbeds for...

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
When observing the actions of others, humans carry out inferences about ...

Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
Optimization of parameterized policies for reinforcement learning (RL) i...

Generalization and Regularization in DQN
Deep reinforcement learning (RL) algorithms have shown an impressive abi...

Solving Large Extensive-Form Games with Strategy Constraints
Extensive-form games are a common model for multiagent interactions with...

Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Learning strategies for imperfect information games from samples of inte...

Count-Based Exploration with the Successor Representation
The problem of exploration in reinforcement learning is well-understood ...

The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces
Dyna is an architecture for reinforcement learning agents that interleav...

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker
Artificial intelligence has seen several breakthroughs in recent years, ...

AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games
Evaluating agent performance when outcomes are stochastic and agents use...

Solving Games with Functional Regret Estimation
We propose a novel online learning method for minimizing regret in large...

Domain-Independent Optimistic Initialization for Reinforcement Learning
In Reinforcement Learning (RL), it is common to use optimistic initializ...

Partition Tree Weighting
This paper introduces the Partition Tree Weighting technique, an efficie...

The Arcade Learning Environment: An Evaluation Platform for General Agents
In this article we introduce the Arcade Learning Environment (ALE): both...

On Local Regret
Online learning aims to perform nearly as well as the best hypothesis in...

No-Regret Learning in Extensive-Form Games with Imperfect Recall
Counterfactual Regret Minimization (CFR) is an efficient no-regret learn...

A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning
We consider the problem of simultaneously learning to linearly combine a...

Alignment Based Kernel Learning with a Continuous Set of Base Kernels
The success of kernel-based learning methods depend on the choice of ker...
Michael Bowling