Metacontrol for Adaptive Imagination-Based Optimization

05/07/2017
by Jessica B. Hamrick, et al.

Many machine learning systems are built to solve the hardest examples of a particular task, which often makes them large and expensive to run, especially on the easier examples, which might require much less computation. For an agent with a limited computational budget, this "one-size-fits-all" approach may lead the agent to waste valuable computation on easy examples while not spending enough on hard ones. Rather than learning a single, fixed policy for solving all instances of a task, we introduce a metacontroller that learns to optimize a sequence of "imagined" internal simulations over predictive models of the world in order to construct a more informed, and more economical, solution. The metacontroller component is a model-free reinforcement learning agent that decides both how many iterations of the optimization procedure to run and which model to consult on each iteration. The models (which we call "experts") can be state transition models, action-value functions, or any other mechanism that provides information useful for solving the task, and can be learned on-policy or off-policy in parallel with the metacontroller. When the metacontroller, controller, and experts were trained with "interaction networks" (Battaglia et al., 2016) as expert models, our approach was able to solve a challenging decision-making problem under complex non-linear dynamics. The metacontroller learned to adapt the amount of computation it performed to the difficulty of the task, and learned how to choose which experts to consult by factoring in both their reliability and their individual computational resource costs. This allowed the metacontroller to achieve a lower overall cost (task loss plus computational cost) than more traditional fixed-policy approaches. These results demonstrate that our approach is a powerful framework for using...
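The loop described in the abstract (decide whether to keep computing, pick an expert, refine the proposed action, then pay task loss plus computational cost) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy linear world in `true_outcome`, the two hand-written experts, the proportional controller update, and especially `threshold_policy`, which is a fixed rule standing in for the learned model-free reinforcement learning metacontroller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

History = List[Tuple[int, float]]  # (expert index, predicted error) per iteration


@dataclass
class Expert:
    name: str
    cost: float                        # computational cost per consultation
    predict: Callable[[float], float]  # imagined outcome of a proposed action


def true_outcome(action: float) -> float:
    # Toy world dynamics (assumption): the outcome is twice the action.
    return 2.0 * action


# Hypothetical experts: one cheap but biased, one accurate but more expensive.
EXPERTS = [
    Expert("cheap", cost=0.01, predict=lambda a: 2.0 * a + 0.1),
    Expert("accurate", cost=0.05, predict=lambda a: 2.0 * a),
]


def threshold_policy(history: History, tol: float = 0.2) -> Optional[int]:
    # Hand-written stand-in for the learned metacontroller (assumption).
    # Returns an expert index, or None to stop imagining and act.
    if not history:
        return 0                       # start with the cheap expert
    last_error = abs(history[-1][1])
    if last_error < tol:
        return None                    # good enough: stop computing
    return 0 if last_error >= 1.0 else 1


def run_episode(target: float,
                experts: List[Expert],
                choose: Callable[[History], Optional[int]],
                max_steps: int = 20,
                step_size: float = 0.25) -> dict:
    action = 0.0
    history: History = []
    compute_cost = 0.0
    for _ in range(max_steps):
        idx = choose(history)
        if idx is None:
            break
        expert = experts[idx]
        # Imagined evaluation of the current proposal by the chosen expert.
        error = target - expert.predict(action)
        # Controller update (assumption): simple proportional correction.
        action += step_size * error
        compute_cost += expert.cost
        history.append((idx, error))
    task_loss = (target - true_outcome(action)) ** 2
    return {
        "steps": len(history),
        "task_loss": task_loss,
        "compute_cost": compute_cost,
        "total_cost": task_loss + compute_cost,
    }


if __name__ == "__main__":
    for target in (0.1, 4.0):
        print(target, run_episode(target, EXPERTS, threshold_policy))
```

In this toy setting the rule stops after fewer consultations on the target near the starting point than on the distant one, which mirrors the idea of spending less computation on easier instances. A learned metacontroller would replace `threshold_policy` with a policy trained to minimize `total_cost`.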


