Michael Bowling

research

∙ 02/23/2023

Targeted Search Control in AlphaZero for Effective Policy Improvement

AlphaZero is a self-play reinforcement learning algorithm that achieves ...

0 Alexandre Trudeau, et al. ∙

research

∙ 12/20/2022

Settling the Reward Hypothesis

The reward hypothesis posits that, "all of what we mean by goals and pur...

4 Michael Bowling, et al. ∙

research

∙ 11/02/2022

Over-communicate no more: Situated RL agents learn concise communication protocols

While it is known that communication facilitates cooperation in multi-ag...

0 Aleksandra Kalinowska, et al. ∙

research

∙ 06/04/2022

Interpolating Between Softmax Policy Gradient and Neural Replicator Dynamics with Capped Implicit Exploration

Neural replicator dynamics (NeuRD) is an alternative to the foundational...

1 Dustin Morrill, et al. ∙

research

∙ 05/24/2022

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games: Corrections

Hindsight rationality is an approach to playing general-sum games that p...

5 Dustin Morrill, et al. ∙

research

∙ 05/22/2022

Should Models Be Accurate?

Model-based Reinforcement Learning (MBRL) holds promise for data-efficie...

39 Esra'a Saleh, et al. ∙

research

∙ 12/06/2021

Player of Games

Games have a long history of serving as a benchmark for progress in arti...

0 Martin Schmid, et al. ∙

research

∙ 11/15/2021

The Partially Observable History Process

We introduce the partially observable history process (POHP) formalism f...

4 Dustin Morrill, et al. ∙

research

∙ 10/29/2021

Learning to Be Cautious

A key challenge in the field of reinforcement learning is to develop age...

8 Montaser Mohammedalamen, et al. ∙

research

∙ 02/13/2021

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

Hindsight rationality is an approach to playing multi-agent, general-sum...

4 Dustin Morrill, et al. ∙

research

∙ 01/11/2021

Solving Common-Payoff Games with Approximate Policy Iteration

For artificially intelligent learning systems to have widespread applica...

0 Samuel Sokota, et al. ∙

research

∙ 12/10/2020

Hindsight and Sequential Rationality of Correlated Play

Driven by recent successes in two-player, zero-sum game solving and play...

1 Dustin Morrill, et al. ∙

research

∙ 11/02/2020

Useful Policy Invariant Shaping from Arbitrary Advice

Reinforcement learning is a powerful learning paradigm in which agents c...

0 Paniz Behboudian, et al. ∙

research

∙ 08/27/2020

The Advantage Regret-Matching Actor-Critic

Regret minimization has played a key role in online learning, equilibriu...

0 Audrūnas Gruslys, et al. ∙

research

∙ 06/15/2020

Sound Search in Imperfect Information Games

Search has played a fundamental role in computer game research since the...

0 Michal Šustr, et al. ∙

research

∙ 06/10/2020

Marginal Utility for Planning in Continuous or Large Discrete Action Spaces

Sample-based planning is a powerful family of algorithms for generating ...

0 Zaheen Farraz Ahmad, et al. ∙

research

∙ 04/28/2020

Sample-Efficient Model-based Actor-Critic for an Interactive Dialogue Task

Human-computer interactive systems that rely on machine learning are bec...

0 Katya Kudashkina, et al. ∙

research

∙ 04/20/2020

Approximate exploitability: Learning a best response in large games

A common metric in games of imperfect information is exploitability, i.e...

10 Finbarr Timbers, et al. ∙

research

∙ 12/06/2019

Alternative Function Approximation Parameterizations for Solving Games: An Analysis of f-Regression Counterfactual Regret Minimization

Function approximation is a powerful approach for structuring large deci...

0 Ryan D'Orazio, et al. ∙

research

∙ 07/22/2019

Low-Variance and Zero-Variance Baselines for Extensive-Form Games

Extensive-form games (EFGs) are a common model of multi-agent interactio...

0 Trevor Davis, et al. ∙

research

∙ 06/26/2019

Rethinking Formal Models of Partially Observable Multiagent Decision Making

Multiagent decision-making problems in partially observable environments...

5 Vojtěch Kovařík, et al. ∙

research

∙ 06/06/2019

Ease-of-Teaching and Language Structure from Emergent Communication

Artificial agents have been shown to learn to communicate when needed to...

0 Fushan Li, et al. ∙

research

∙ 02/01/2019

The Hanabi Challenge: A New Frontier for AI Research

From the early days of computing, games have been important testbeds for...

54 Nolan Bard, et al. ∙

research

∙ 11/04/2018

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

When observing the actions of others, humans carry out inferences about ...

0 Jakob N. Foerster, et al. ∙

research

∙ 10/21/2018

Actor-Critic Policy Optimization in Partially Observable Multiagent Environments

Optimization of parameterized policies for reinforcement learning (RL) i...

8 Sriram Srinivasan, et al. ∙

research

∙ 09/29/2018

Generalization and Regularization in DQN

Deep reinforcement learning (RL) algorithms have shown an impressive abi...

0 Jesse Farebrother, et al. ∙

research

∙ 09/20/2018

Solving Large Extensive-Form Games with Strategy Constraints

Extensive-form games are a common model for multiagent interactions with...

0 Trevor Davis, et al. ∙

research

∙ 09/09/2018

Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines

Learning strategies for imperfect information games from samples of inte...

12 Martin Schmid, et al. ∙

research

∙ 07/31/2018

Count-Based Exploration with the Successor Representation

The problem of exploration in reinforcement learning is well-understood ...

4 Marlos C. Machado, et al. ∙

research

∙ 06/05/2018

The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces

Dyna is an architecture for reinforcement learning agents that interleav...

0 G. Zacharias Holland, et al. ∙

research

∙ 01/06/2017

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker

Artificial intelligence has seen several breakthroughs in recent years, ...

0 Matej Moravčík, et al. ∙

research

∙ 12/20/2016

AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games

Evaluating agent performance when outcomes are stochastic and agents use...

0 Neil Burch, et al. ∙

research

∙ 11/28/2014

Solving Games with Functional Regret Estimation

We propose a novel online learning method for minimizing regret in large...

0 Kevin Waugh, et al. ∙

research

∙ 10/16/2014

Domain-Independent Optimistic Initialization for Reinforcement Learning

In Reinforcement Learning (RL), it is common to use optimistic initializ...

0 Marlos C. Machado, et al. ∙

research

∙ 11/03/2012

Partition Tree Weighting

This paper introduces the Partition Tree Weighting technique, an efficie...

0 Joel Veness, et al. ∙

research

∙ 07/19/2012

The Arcade Learning Environment: An Evaluation Platform for General Agents

In this article we introduce the Arcade Learning Environment (ALE): both...

0 Marc G. Bellemare, et al. ∙

research

∙ 06/14/2012

On Local Regret

Online learning aims to perform nearly as well as the best hypothesis in...

0 Michael Bowling, et al. ∙

research

∙ 05/03/2012

No-Regret Learning in Extensive-Form Games with Imperfect Recall

Counterfactual Regret Minimization (CFR) is an efficient no-regret learn...

0 Marc Lanctot, et al. ∙

research

∙ 05/01/2012

A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning

We consider the problem of simultaneously learning to linearly combine a...

0 Arash Afkanpour, et al. ∙

research

∙ 12/20/2011

Alignment Based Kernel Learning with a Continuous Set of Base Kernels

The success of kernel-based learning methods depends on the choice of ker...

0 Arash Afkanpour, et al. ∙
