AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

08/07/2023
by Michael Mathieu, et al.

StarCraft II is one of the most challenging simulated reinforcement learning environments; it is partially observable, stochastic, and multi-agent, and mastering it requires strategic planning over long time horizons with real-time low-level execution. It also has an active professional competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of its challenging nature and because Blizzard has released a massive dataset of millions of StarCraft II games played by human players. This paper leverages that dataset to establish a benchmark, called AlphaStar Unplugged, introducing unprecedented challenges for offline reinforcement learning. We define a dataset (a subset of Blizzard's release), tools standardizing an API for machine learning methods, and an evaluation protocol. We also present baseline agents, including behavior cloning and offline variants of actor-critic and MuZero. We improve the state of the art of agents using only offline data, and we achieve a 90% win rate against the previously published AlphaStar behavior cloning agent.
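As a rough illustration of the simplest baseline named in the abstract, behavior cloning trains a policy to maximize the likelihood of the actions humans took in the logged games. The sketch below is not the paper's implementation: it assumes a flat, hypothetical set of discrete actions (the real StarCraft II action space is structured and far larger), and the names `softmax` and `behavior_cloning_loss` are invented for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def behavior_cloning_loss(policy_logits, human_actions):
    """Mean negative log-likelihood of the logged human actions.

    policy_logits: one list of per-action logits per timestep.
    human_actions: index of the action the human took at each timestep.
    (Assumption: a flat discrete action set, for illustration only.)
    """
    if not human_actions or len(policy_logits) != len(human_actions):
        raise ValueError("need one logged action per timestep")
    nll = 0.0
    for logits, action in zip(policy_logits, human_actions):
        probs = softmax(logits)
        nll -= math.log(probs[action])
    return nll / len(human_actions)


# Toy replay: 3 timesteps, 4 hypothetical discrete actions.
logits = [[2.0, 0.1, 0.1, 0.1], [0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 3.0, 0.1]]
actions = [0, 1, 2]
loss = behavior_cloning_loss(logits, actions)
```

Minimizing this loss with respect to the policy parameters needs no environment interaction, which is what makes behavior cloning a natural starting point for an offline benchmark.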

research
06/07/2021

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

Learning from datasets without interaction with environments (Offline Le...
research
07/21/2023

Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization

Offline reinforcement learning (RL) has received considerable attention ...
research
05/17/2021

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

Offline Reinforcement Learning promises to learn effective policies from...
research
07/25/2018

Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches

Reinforcement Learning (RL) is a learning paradigm concerned with learni...
research
06/10/2017

ACCNet: Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning

Communication is a critical factor for the big multi-agent world to stay...
research
10/04/2021

Learning to Assist Agents by Observing Them

The ability of an AI agent to assist other agents, such as humans, is an...
research
03/27/2018

Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings

With the advent of sequential matching (of supply and demand) systems (u...
