Combinational Q-Learning for Dou Di Zhu

01/24/2019
by Yang You, et al.

Deep reinforcement learning (DRL) has gained a lot of attention in recent years and has been shown to play Atari games and Go at or above human level. However, those games are assumed to have a small, fixed number of actions, and agents for them can be trained with a simple CNN. In this paper, we study a special class of popular Asian card games called Dou Di Zhu, in which two adversarial groups of agents must consider numerous card combinations at each time step, leading to a huge number of actions. We propose a novel method to handle combinatorial actions, which we call combinational Q-learning (CQL). We employ a two-stage network to reduce the action space and leverage order-invariant max-pooling operations to extract relationships between primitive actions. Results show that our method outperforms state-of-the-art methods such as naive Q-learning and A3C. We develop an easy-to-use card game environment and train all agents adversarially from scratch, with knowledge of the game rules only, and verify that our agents are comparable to human players. Our code to reproduce all reported results will be available online.
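The abstract does not spell out the network, so the following is only a minimal sketch of what an order-invariant max-pooling step could look like, not the authors' architecture. It assumes each primitive action is given as a feature vector, applies one shared layer to every primitive, and takes an element-wise maximum over the set. The function name `embed_combination`, the feature sizes, and the random weights are all hypothetical and chosen purely for illustration.

```python
import numpy as np


def embed_combination(primitive_features, weight, bias):
    """Pool a set of primitive-action features into one order-invariant vector.

    primitive_features: array of shape (num_primitives, in_dim)
    weight, bias: parameters of a shared per-primitive layer (hypothetical)
    """
    # The same linear layer plus ReLU is applied to every primitive action.
    hidden = np.maximum(primitive_features @ weight + bias, 0.0)
    # Element-wise max over the set axis discards the ordering of primitives.
    return hidden.max(axis=0)


# Illustrative sizes only: 3 primitives, 4 input features, 8 hidden units.
rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 8))
bias = rng.normal(size=8)
primitives = rng.normal(size=(3, 4))
shuffled = primitives[[2, 0, 1]]  # same primitives in a different order

pooled = embed_combination(primitives, weight, bias)
pooled_shuffled = embed_combination(shuffled, weight, bias)
```

Because the maximum is taken over the set axis, `pooled` and `pooled_shuffled` agree regardless of the order in which the primitives are listed, which is the property the abstract refers to as order invariance.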


