Does DQN really learn? Exploring adversarial training schemes in Pong

03/20/2022
by Bowen He, et al.

In this work, we study two self-play training schemes, Chainer and Pool, and show that they lead to improved agent performance in Atari Pong compared to a standard DQN agent trained against the built-in Atari opponent. To measure agent performance, we define a robustness metric that captures how difficult it is to learn a strategy that beats the agent's learned policy. By playing against past versions of themselves, Chainer and Pool agents are able to target weaknesses in their policies and improve their resistance to attack. Agents trained using these methods score well on our robustness metric and can easily defeat the standard DQN agent. We conclude by using linear probing to examine what internal structures the different agents develop to play the game. We show that training agents with Chainer or Pool leads to richer network activations, with greater predictive power for estimating critical game-state features, compared to the standard DQN agent.
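The linear-probing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which layers are probed, which game-state features are predicted, or how probes are fit, so everything below is an assumption: the helper names `fit_linear_probe` and `probe_r2`, the ridge penalty, the choice of a ball position as the target, and the synthetic "activations" that stand in for a trained network's hidden layer. The idea it shows is only the general one: fit a linear map from frozen activations to a game-state feature, then compare held-out predictive power between two sets of activations.

```python
import numpy as np


def fit_linear_probe(activations, targets, ridge=1e-3):
    """Fit a ridge-regularized linear map (with bias) from activations to a target."""
    # Append a constant column so the probe can learn an offset.
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)


def probe_r2(weights, activations, targets):
    """Coefficient of determination of the probe on the given data."""
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    residual = targets - X @ weights
    return 1.0 - np.mean(residual ** 2) / np.var(targets)


# Synthetic stand-ins (assumption): in practice these would be hidden-layer
# activations recorded from each trained agent on the same game frames.
rng = np.random.default_rng(0)
n_frames, n_units = 2000, 16
ball_y = rng.uniform(0.0, 1.0, size=n_frames)  # hypothetical game-state feature

# "Rich" activations encode ball_y linearly plus noise; "poor" ones are pure noise.
direction = rng.normal(size=n_units)
rich_acts = rng.normal(size=(n_frames, n_units)) + 3.0 * np.outer(ball_y, direction)
poor_acts = rng.normal(size=(n_frames, n_units))

# Fit on the first 1500 frames, evaluate on the held-out remainder.
train, test = slice(0, 1500), slice(1500, n_frames)
rich_r2 = probe_r2(
    fit_linear_probe(rich_acts[train], ball_y[train]), rich_acts[test], ball_y[test]
)
poor_r2 = probe_r2(
    fit_linear_probe(poor_acts[train], ball_y[train]), poor_acts[test], ball_y[test]
)
```

A higher held-out score for one agent's activations would suggest that its network encodes the feature more linearly; the numbers produced here reflect only the synthetic data, not any result from the paper.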


