
A Novel Reward Shaping Function for Single-Player Mahjong

05/06/2023 ∙ by Kai Jun Chen, et al.

Mahjong is a complex game with an intractably large state space and extremely sparse rewards, which makes it challenging to develop an agent that can play it. To overcome this, the ShangTing function was adopted as a reward shaping function. This was combined with a forward-search algorithm to create an agent capable of completing a winning hand in single-player Mahjong (in an average of 35 actions over 10,000 games). To increase performance, we propose a novel bonus reward shaping function that assigns higher relative values to synergistic Mahjong hands. In a simulated 1-v-1 battle, the agent using the new reward function outperformed the agent using the default ShangTing function, winning an average of 1.37 over 1,000 games.
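To make the idea of pairing a shaping score with a forward search concrete, here is a minimal sketch under strong simplifying assumptions. It is not the ShangTing function or the authors' search: `max_melds` is a hypothetical stand-in score that only counts complete melds (triplets or runs of three) in a single suit, and `best_discard` is a hypothetical one-step lookahead that ignores pairs, honor tiles, and future draws.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def max_melds(counts):
    """Largest number of disjoint melds (triplets or runs) in one suit.

    `counts` is a tuple of 9 integers: tiles held of rank 1..9.
    This is a toy stand-in for a shaping score, not the ShangTing function.
    """
    # Locate the lowest rank that is still present in the hand.
    for i, c in enumerate(counts):
        if c > 0:
            break
    else:
        return 0  # empty hand: no melds

    tiles = list(counts)

    # Option 1: leave one tile of this rank out of every meld.
    tiles[i] -= 1
    best = max_melds(tuple(tiles))
    tiles[i] += 1

    # Option 2: use three tiles of this rank as a triplet.
    if tiles[i] >= 3:
        tiles[i] -= 3
        best = max(best, 1 + max_melds(tuple(tiles)))
        tiles[i] += 3

    # Option 3: use this tile as the start of a run (i, i+1, i+2).
    if i + 2 < 9 and tiles[i + 1] > 0 and tiles[i + 2] > 0:
        for j in (i, i + 1, i + 2):
            tiles[j] -= 1
        best = max(best, 1 + max_melds(tuple(tiles)))

    return best


def best_discard(counts):
    """One-step lookahead: the rank (1..9) whose discard keeps the score highest.

    Ties are broken in favour of the lowest rank.
    """
    best_rank, best_score = None, -1
    for i, c in enumerate(counts):
        if c == 0:
            continue
        remaining = counts[:i] + (c - 1,) + counts[i + 1:]
        score = max_melds(remaining)
        if score > best_score:
            best_rank, best_score = i + 1, score
    return best_rank


# Hand holding ranks 1-6 and a lone 9: two runs are possible (1-2-3, 4-5-6).
hand = (1, 1, 1, 1, 1, 1, 0, 0, 1)
print(max_melds(hand), best_discard(hand))
```

Because the score changes whenever a discard breaks or preserves a meld, the search gets a signal on every action rather than only when a hand is won, which is the motivation for shaping a sparse reward.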

Kai Jun Chen
Lok Him Lai
Zi Iun Lai
