Learning Quickly to Plan Quickly Using Modular Meta-Learning
Multi-object manipulation problems in continuous state and action spaces can be solved by planners that search over sampled values for the continuous parameters of operators. The efficiency of these planners depends critically on the effectiveness of the samplers used, but effective sampling in turn depends on details of the robot, environment, and task. Our strategy is to learn functions called specializers that generate values for continuous operator parameters, given a state description and values for the discrete parameters. Rather than trying to learn a single specializer for each operator from large amounts of data on a single task, we take a modular meta-learning approach. We train on multiple tasks and learn a variety of specializers that, on a new task, can be quickly adapted using relatively little data -- thus, our system "learns quickly to plan quickly" using these specializers. We validate our approach experimentally in simulated 3D pick-and-place tasks with continuous state and action spaces.
READ FULL TEXT