
MAGIC: Learning Macro-Actions for Online POMDP Planning using Generator-Critic

by Yiyuan Lee, et al.

When robots operate in the real world, they need to handle uncertainties in sensing, acting, and the environment. Many tasks also require reasoning about the long-term consequences of robot decisions. The partially observable Markov decision process (POMDP) offers a principled approach to planning under uncertainty. However, its computational complexity grows exponentially with the planning horizon. We propose to use temporally extended macro-actions to cut down the effective planning horizon and thus the exponential factor of the complexity. We introduce Macro-Action Generator-Critic (MAGIC), an algorithm that learns a macro-action generator from data and uses the learned macro-actions to perform long-horizon planning. MAGIC learns the generator using experience provided by an online planner, and in turn conditions the planner on the generated macro-actions. We evaluate MAGIC on several long-term planning tasks, showing that it significantly outperforms planning with primitive actions, planning with hand-crafted macro-actions, and naive reinforcement learning, both in simulation and on a real robot.
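To make the horizon argument concrete, the following is a minimal sketch of the arithmetic only, not of MAGIC itself. It assumes a full search tree that branches on every action and every observation at each step, and it assumes (as a simplification not stated in the abstract) that a macro-action of fixed length counts as a single step with one observation branch. The function names and the numbers used are hypothetical.

```python
def effective_depth(horizon: int, macro_length: int) -> int:
    """Number of planning steps needed to cover `horizon` primitive steps
    when each step executes a macro-action of `macro_length` primitives."""
    return -(-horizon // macro_length)  # ceiling division


def tree_size(num_actions: int, num_observations: int, depth: int) -> int:
    """Number of leaves in a full tree that branches on every action and
    every observation at each of `depth` steps (worst case, no pruning)."""
    return (num_actions * num_observations) ** depth


# Hypothetical problem sizes, chosen only for illustration.
horizon = 12          # primitive steps to look ahead
num_actions = 4       # candidate actions (or macro-actions) per step
num_observations = 3  # observation branches per step
macro_length = 4      # primitive actions per macro-action

primitive_leaves = tree_size(num_actions, num_observations, horizon)
macro_leaves = tree_size(
    num_actions, num_observations, effective_depth(horizon, macro_length)
)

print(f"primitive actions: depth {horizon}, {primitive_leaves} leaves")
print(f"macro-actions:     depth {effective_depth(horizon, macro_length)}, "
      f"{macro_leaves} leaves")
```

Under these assumptions the exponent drops from the horizon to the horizon divided by the macro-action length, which is the reduction in the "exponential factor" that the abstract refers to. The catch, and the motivation for learning the macro-actions rather than fixing them by hand, is that the planner can only be as good as the small set of macro-actions it is given.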

