Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft

12/07/2021
by Vinicius G. Goecks, et al.

Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signals unless they are defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined task with performance metrics that drive the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, together with an estimated odometry map, are then combined into a state machine, designed based on human knowledge of the tasks, that breaks them down into a natural hierarchy and controls which macro behavior the learning agent should follow at any instant. We compare this hybrid intelligence approach to both end-to-end machine learning and purely engineered solutions, which are then judged by human evaluators. The codebase is available at https://github.com/viniciusguigo/kairos_minerl_basalt.
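To make the state-machine idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a learned navigation policy and a goal classifier are available as plain callables, and it leaves out the odometry map entirely. All names (`HybridAgent`, `navigate`, `goal_score`, `scripted_actions`, the action strings, and the 0.9 threshold) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, List


class Mode(Enum):
    NAVIGATE = auto()  # learned imitation policy chooses the action
    SCRIPTED = auto()  # engineered macro behavior takes over
    DONE = auto()      # task-specific sequence has finished


@dataclass
class HybridAgent:
    # Hypothetical stand-ins: observation -> action, observation -> probability.
    navigate: Callable[[object], str]
    goal_score: Callable[[object], float]
    # Hypothetical hand-written action sequence for the final subtask.
    scripted_actions: List[str]
    threshold: float = 0.9  # assumed switching threshold, not from the paper
    mode: Mode = Mode.NAVIGATE
    _step: int = 0

    def act(self, obs: object) -> str:
        if self.mode is Mode.NAVIGATE:
            # Switch to the scripted behavior once the classifier is confident.
            if self.goal_score(obs) >= self.threshold:
                self.mode = Mode.SCRIPTED
            else:
                return self.navigate(obs)
        if self.mode is Mode.SCRIPTED:
            if self._step < len(self.scripted_actions):
                action = self.scripted_actions[self._step]
                self._step += 1
                return action
            self.mode = Mode.DONE
        return "noop"
```

In this sketch, the classifier decides when control passes from the learned policy to the engineered sequence, which is one simple way to read "controls which macro behavior the learning agent should follow at any instant". A fuller version would presumably have one such hierarchy per task.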

research
07/05/2021

The MineRL BASALT Competition on Learning from Human Feedback

The last decade has seen a significant increase of interest in deep lear...
research
08/09/2021

Knowledge accumulating: The general pattern of learning

Artificial Intelligence has been developed for decades with the achievem...
research
12/02/2020

DERAIL: Diagnostic Environments for Reward And Imitation Learning

The objective of many real-world tasks is complex and difficult to proce...
research
12/20/2016

Unsupervised Perceptual Rewards for Imitation Learning

Reward function design and exploration time are arguably the biggest obs...
research
06/05/2021

Zero-shot Task Adaptation using Natural Language

Imitation learning and instruction-following are two common approaches t...
research
03/23/2023

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition

To facilitate research in the direction of fine-tuning foundation models...
research
01/24/2023

Language-guided Task Adaptation for Imitation Learning

We introduce a novel setting, wherein an agent needs to learn a task fro...
