Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation

06/23/2023
by Massimiliano Patacchiola, et al.

In this paper, we explore few-shot imitation learning for control problems, which involves learning to imitate a target policy from a limited set of offline rollouts. This setting has been relatively under-explored despite its relevance to robotics and control applications. State-of-the-art methods for few-shot imitation rely on meta-learning, which is expensive to train because it requires access to a distribution over tasks (rollouts from many target policies and variations of the base environment). Given this limitation, we investigate an alternative approach, fine-tuning, a family of methods that pretrain on a single dataset and then fine-tune on unseen domain-specific data. Recent work has shown that fine-tuners outperform meta-learners in few-shot image classification tasks, especially when the data is out-of-domain. Here we evaluate to what extent this holds for control problems, proposing a simple yet effective baseline that relies on two stages: (i) training a base policy online via reinforcement learning (e.g. Soft Actor-Critic) on a single base environment, and (ii) fine-tuning the base policy via behavioral cloning on a few offline rollouts of the target policy. Despite its simplicity, this baseline is competitive with meta-learning methods across a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment. Importantly, the proposed approach is practical and easy to implement, as it does not need any complex meta-training protocol. As a further contribution, we release an open-source dataset called iMuJoCo (iMitation MuJoCo), consisting of 154 variants of popular OpenAI-Gym MuJoCo environments with associated pretrained target policies and rollouts, which the community can use to study few-shot imitation learning and offline reinforcement learning.
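The second stage of the baseline, fine-tuning a base policy via behavioral cloning on a few offline rollouts, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a linear deterministic policy, a mean-squared-error cloning loss, and plain gradient descent. The base policy that stage (i) would obtain via reinforcement learning is replaced here by fixed random weights, and the target policy is a hypothetical perturbation of it. All function and variable names are illustrative.

```python
import random

def policy_action(weights, bias, state):
    """Linear deterministic policy (an assumption): action = W @ state + b."""
    return [sum(w * s for w, s in zip(row, state)) + b
            for row, b in zip(weights, bias)]

def bc_loss(weights, bias, rollouts):
    """Mean squared error between policy actions and target-policy actions."""
    total, count = 0.0, 0
    for state, target in rollouts:
        pred = policy_action(weights, bias, state)
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
        count += len(target)
    return total / count

def fine_tune_bc(weights, bias, rollouts, lr=0.3, epochs=300):
    """Fine-tune a copy of the base policy by gradient descent on the BC loss."""
    w = [row[:] for row in weights]  # copy so the base policy stays untouched
    b = bias[:]
    n = len(rollouts)
    for _ in range(epochs):
        grad_w = [[0.0] * len(row) for row in w]
        grad_b = [0.0] * len(b)
        for state, target in rollouts:
            pred = policy_action(w, b, state)
            for i, (p, t) in enumerate(zip(pred, target)):
                err = 2.0 * (p - t) / (n * len(target))  # d(loss)/d(pred_i)
                grad_b[i] += err
                for j, s in enumerate(state):
                    grad_w[i][j] += err * s
        for i in range(len(w)):
            b[i] -= lr * grad_b[i]
            for j in range(len(w[i])):
                w[i][j] -= lr * grad_w[i][j]
    return w, b

rng = random.Random(0)
STATE_DIM, ACTION_DIM, NUM_PAIRS = 3, 2, 20

# Stand-in for stage (i): a base policy with fixed random weights
# (in the paper's setting it would come from RL, e.g. Soft Actor-Critic).
base_w = [[rng.uniform(-1, 1) for _ in range(STATE_DIM)] for _ in range(ACTION_DIM)]
base_b = [rng.uniform(-1, 1) for _ in range(ACTION_DIM)]

# Hypothetical target policy: a perturbed version of the base policy.
target_w = [[v + rng.uniform(-0.5, 0.5) for v in row] for row in base_w]
target_b = [v + rng.uniform(-0.5, 0.5) for v in base_b]

# A few offline (state, action) pairs recorded from the target policy.
rollouts = []
for _ in range(NUM_PAIRS):
    state = [rng.uniform(-1, 1) for _ in range(STATE_DIM)]
    rollouts.append((state, policy_action(target_w, target_b, state)))

loss_before = bc_loss(base_w, base_b, rollouts)
tuned_w, tuned_b = fine_tune_bc(base_w, base_b, rollouts)
loss_after = bc_loss(tuned_w, tuned_b, rollouts)
print(f"BC loss before fine-tuning: {loss_before:.6f}")
print(f"BC loss after fine-tuning:  {loss_after:.6f}")
```

In a realistic setting the policy would be a neural network and the rollouts would come from a simulator such as MuJoCo; the linear model is used here only to keep the sketch dependency-free.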

