Regulating Reward Training by Means of Certainty Prediction in a Neural Network-Implemented Pong Game

09/23/2016
by Matt Oberdorfer, et al.

We present the first reinforcement-learning model to self-improve its reward-modulated training, implemented through a continuously improving "intuition" neural network. An agent was trained to play the arcade video game Pong under two reward-based alternatives: one in which the paddle was placed randomly during training, and a second in which the paddle was simultaneously trained on three additional neural networks so that it could develop a sense of "certainty" about how likely its own predicted paddle position was to return the ball. If the agent was less than 95% certain of returning the ball, the policy used an intuition neural network to place the paddle. We trained both architectures for an equal number of epochs and tested learning performance by letting the trained programs play against a near-perfect opponent. We found that the reinforcement-learning model that uses an intuition neural network to place the paddle during reward training quickly overtakes the simple architecture in its ability to outplay the near-perfect opponent, and that it outscores that opponent by an increasingly wide margin as training continues.
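
The certainty-gated placement rule described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `choose_paddle_position` and the three callables standing in for the position, certainty, and intuition networks are hypothetical, as is the choice to keep the agent's own prediction when certainty meets the threshold. Only the 95 threshold comes from the abstract, read here as a probability of 0.95.

```python
from typing import Callable, Sequence

# Threshold taken from the abstract, interpreted as a probability (assumption).
CERTAINTY_THRESHOLD = 0.95

State = Sequence[float]


def choose_paddle_position(
    state: State,
    predict_position: Callable[[State], float],
    predict_certainty: Callable[[State, float], float],
    intuition_position: Callable[[State], float],
    threshold: float = CERTAINTY_THRESHOLD,
) -> float:
    """Pick a paddle position for one training step.

    All three callables are hypothetical stand-ins for trained networks:
    - predict_position: the agent's own predicted paddle position
    - predict_certainty: estimated probability that the position returns the ball
    - intuition_position: fallback placement from the "intuition" network
    """
    proposed = predict_position(state)
    certainty = predict_certainty(state, proposed)
    if certainty < threshold:
        # Below the threshold: defer to the intuition network.
        return intuition_position(state)
    # Assumption: at or above the threshold, keep the agent's own prediction.
    return proposed


# Toy stand-ins so the sketch runs without any trained model.
def toy_position(state: State) -> float:
    return state[0]  # pretend the first feature is the predicted position


def toy_certainty(state: State, position: float) -> float:
    return state[1]  # pretend the second feature is the certainty


def toy_intuition(state: State) -> float:
    return 0.5  # always suggest the middle of the court
```

Calling `choose_paddle_position([0.2, 0.99], toy_position, toy_certainty, toy_intuition)` keeps the agent's own prediction, while a state with certainty 0.40 falls back to the intuition placement. The random-placement baseline from the abstract would correspond to replacing the fallback with a random draw.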
