Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark

06/23/2021
by Stefan O'Toole, et al.

We propose new width-based planning and learning algorithms and apply them to the Atari-2600 benchmark. The algorithms are motivated by a careful analysis of the design decisions made in previous width-based planners. We benchmark the new algorithms on the Atari-2600 games and show that our best-performing algorithm, RIW_C+CPV, outperforms the previously introduced width-based planning and learning algorithms π-IW(1), π-IW(1)+ and π-HIW(n, 1). Furthermore, we present a taxonomy of the Atari-2600 games according to some of their defining characteristics. This analysis of the games provides further insight into the behaviour and performance of the width-based algorithms introduced. In particular, for games with large branching factors and for games with sparse meaningful rewards, RIW_C+CPV outperforms π-IW(1), π-IW(1)+ and π-HIW(n, 1).

research
06/09/2021

Planning for Novelty: Width-Based Algorithms for Common Problems in Control, Planning and Reinforcement Learning

Width-based algorithms search for solutions through a general definition...
research
01/15/2021

Hierarchical Width-Based Planning and Learning

Width-based search methods have demonstrated state-of-the-art performanc...
research
12/15/2020

General Policies, Serializations, and Planning Width

It has been observed that in many of the benchmark planning domains, ato...
research
04/12/2019

Deep Policies for Width-Based Planning in Pixel Domains

Width-based planning has demonstrated great success in recent years due ...
research
06/15/2018

Improving width-based planning with compact policies

Optimal action selection in decision problems characterized by sparse, d...
research
09/30/2021

Width-Based Planning and Active Learning for Atari

Width-based planning has shown promising results on Atari 2600 games usi...
research
02/15/2022

On the cartesian product of well-orderings

The width of a well partial ordering (wpo) is the ordinal rank of the se...