Width-Based Planning and Active Learning for Atari

09/30/2021
by Benjamin Ayton, et al.

Width-based planning has shown promising results on Atari 2600 games using pixel input, while using substantially fewer environment interactions than reinforcement learning. Recent width-based approaches compute a feature vector for each screen, using either a hand-designed feature set or a variational autoencoder (VAE) trained on game screens, and prune screens that have no novel features during the search. In this paper, we explore how to account for uncertainty in the features generated by a VAE during width-based planning. Our primary contribution is the introduction of active learning to maximize the utility of screens observed during planning. Experimental results demonstrate that active learning strategies increase gameplay scores relative to alternative width-based approaches given equal numbers of environment interactions.
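
The novelty-based pruning mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest width-1 novelty test over discrete feature vectors: a screen is kept only if at least one (feature index, value) pair has not been seen earlier in the search. The class name `NoveltyTable` and the toy feature vectors are hypothetical and chosen for illustration; they are not taken from the paper, and the sketch does not model the VAE, feature uncertainty, or the active learning step.

```python
from typing import Iterable, Set, Tuple

# An "atom" is a (feature index, feature value) pair.
Atom = Tuple[int, int]


class NoveltyTable:
    """Tracks which atoms have been seen so far in a search (hypothetical helper)."""

    def __init__(self) -> None:
        self.seen: Set[Atom] = set()

    def is_novel(self, features: Iterable[int]) -> bool:
        """Return True if the feature vector contains at least one unseen atom.

        All atoms of the vector are recorded as seen, so a later screen with
        the same features would be pruned.
        """
        atoms = {(index, value) for index, value in enumerate(features)}
        new_atoms = atoms - self.seen
        self.seen |= new_atoms
        return bool(new_atoms)


# Toy usage: feature vectors stand in for per-screen features.
table = NoveltyTable()
screens = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
kept = [s for s in screens if table.is_novel(s)]
```

In this toy run, the second screen repeats the first and would be pruned, while the third introduces a new value for the last feature and would be kept.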

research
08/29/2018

Learning a Policy for Opportunistic Active Learning

Active learning identifies data points to label that are expected to be ...
research
02/25/2017

Generative Adversarial Active Learning

We propose a new active learning by query synthesis approach using Gener...
research
04/12/2019

Deep Policies for Width-Based Planning in Pixel Domains

Width-based planning has demonstrated great success in recent years due ...
research
12/16/2020

Planning From Pixels in Atari With Learned Symbolic Representations

Width-based planning methods have been shown to yield state-of-the-art p...
research
03/31/2019

Variational Adversarial Active Learning

Active learning aims to develop label-efficient algorithms by sampling t...
research
06/23/2021

Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark

We propose new width-based planning and learning algorithms applied over...
research
01/10/2018

Planning with Pixels in (Almost) Real Time

Recently, width-based planning methods have been shown to yield state-of...
