Supervising strong learners by amplifying weak experts

10/19/2018
by Paul Christiano, et al.

Many real-world learning tasks involve complex or hard-to-specify objectives, and using an easier-to-specify proxy can lead to poor performance or misaligned behavior. One solution is to have humans provide a training signal by demonstrating or judging performance, but this approach fails if the task is too complicated for a human to evaluate directly. We propose Iterated Amplification, an alternative training strategy which progressively builds up a training signal for difficult problems by combining solutions to easier subproblems. Iterated Amplification is closely related to Expert Iteration (Anthony et al., 2017; Silver et al., 2017), except that it uses no external reward function. We present results in algorithmic environments, showing that Iterated Amplification can efficiently learn complex behaviors.
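The core loop described above — a weak expert decomposing a hard question into easier subquestions, delegating those to the current model, and using the combined answers as new training targets — can be sketched as follows. This is a minimal toy illustration, not the authors' implementation; all function names (`amplify`, `iterated_amplification`, etc.) are hypothetical, and "training" is simplified to copying amplified answers into a lookup table rather than fitting a learned model.

```python
# Toy sketch of Iterated Amplification (hypothetical names; not the
# paper's implementation). A "weak expert" answers a hard question by
# splitting it into subquestions, delegating those to the current
# model, and combining the sub-answers. The amplified answers then
# become the next round's training targets.

def model_answer(model, question):
    """The learned model's current answer (here, a simple table)."""
    return model.get(question, 0)

def amplify(model, question, decompose, combine):
    """Weak expert: decompose, delegate subquestions to the model,
    then combine the sub-answers into an answer."""
    subquestions = decompose(question)
    sub_answers = [model_answer(model, q) for q in subquestions]
    return combine(question, sub_answers)

def iterated_amplification(questions, decompose, combine, iterations=5):
    """Repeatedly train the model to imitate its amplified self."""
    model = {}
    for _ in range(iterations):
        # Build amplified targets from the current model, then "train"
        # by copying them in (a real system would fit a model to them).
        targets = {q: amplify(model, q, decompose, combine)
                   for q in questions}
        model.update(targets)
    return model

# Toy task: compute the sum 1 + 2 + ... + n, decomposed recursively
# as n plus the answer to the easier subquestion "sum up to n-1".
decompose = lambda n: [n - 1] if n > 1 else []
combine = lambda n, subs: n + sum(subs)

model = iterated_amplification(range(1, 6), decompose, combine)
print(model[5])  # -> 15 once enough iterations have propagated answers
```

Each iteration propagates correct answers one decomposition step further, so no external reward function is needed: the training signal is built entirely from the expert's decomposition plus the model's own earlier answers.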


