A Markov Decision Process Approach to Active Meta Learning

09/10/2020
by Bingjia Wang, et al.

In supervised learning, we fit a single statistical model to a given data set under the assumption that the data come from a single task. This yields models that are well tuned for a specific use but adapt poorly to new contexts. By contrast, in meta-learning the data are associated with many tasks, and we seek a model that performs well on all of them simultaneously, in pursuit of greater generalization. One challenge in meta-learning is how to exploit relationships between tasks and classes, which the commonly used random or cyclic passes through the data overlook. In this work, we propose actively selecting the samples on which to train by discerning covariates within and between meta-training sets. Specifically, we cast the problem of selecting a sample from a number of meta-training sets as either a multi-armed bandit or a Markov Decision Process (MDP), depending on how one encapsulates correlation across tasks. We develop scheduling schemes based on the Upper Confidence Bound (UCB), the Gittins Index, and tabular MDPs solved with linear programming, where the reward is the statistical accuracy, scaled so that it is a time-invariant function of state and action. Across a variety of experimental contexts, we observe significant reductions in the sample complexity of the active selection schemes relative to cyclic or i.i.d. sampling, demonstrating the merit of exploiting covariates in practice.
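
To make the bandit view concrete, the following is a minimal sketch of UCB-style scheduling over meta-training sets. It uses the textbook UCB1 rule and is not taken from the paper. Each arm stands for one meta-training set, and pulling an arm means training on a sample drawn from that set. The names `ucb1_select`, `run_ucb`, and `reward_fns`, and the exploration constant `c`, are hypothetical. The reward callables are placeholders for whatever scaled accuracy signal one chooses to use.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the arm (meta-training set) to sample next.

    counts[i] is how often arm i has been chosen, means[i] is its running
    average reward, and t is the current round (1-based). The constant c
    is an assumed exploration weight; c=2 gives the textbook UCB1 rule.
    """
    # Try every arm once before trusting any confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb(reward_fns, horizon, seed=0):
    """Schedule `horizon` selections over the arms described by reward_fns.

    Each entry of reward_fns is a hypothetical callable taking a
    random.Random instance and returning a reward in [0, 1], standing in
    for a scaled accuracy measured after training on the chosen set.
    """
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, means, t)
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    # Three hypothetical meta-training sets with Bernoulli rewards.
    arms = [lambda rng, p=p: float(rng.random() < p) for p in (0.2, 0.5, 0.8)]
    counts, means = run_ucb(arms, horizon=2000)
    print("selections per set:", counts)
    print("estimated mean reward:", [round(m, 3) for m in means])
```

This sketch treats the arms as independent, which corresponds only to the bandit formulation. Capturing correlation across tasks, as the MDP formulation does, would need a state that summarizes past selections, and that is not modeled here.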


