Chameleon: Learning Model Initializations Across Tasks With Different Schemas

09/30/2019
by Lukas Brinkmeyer, et al.

Parametric models, and particularly neural networks, require a weight initialization as a starting point for gradient-based optimization. In current practice, this is usually accomplished with some form of random initialization. Instead, recent work shows that a specific initial parameter set can be learned from a population of tasks, i.e., dataset and target variable for supervised learning tasks. Using this learned initial parameter set leads to faster convergence on new tasks (model-agnostic meta-learning). Currently, methods for learning model initializations are limited to populations of tasks sharing the same schema, i.e., the same number, order, type, and semantics of predictor and target variables. In this paper, we address the problem of meta-learning parameter initializations across tasks with different schemas, i.e., where the number of predictors varies across tasks while some variables are shared. We propose Chameleon, a model that learns to align different predictor schemas to a common representation, using permutations and masks of the predictors of the training tasks at hand. In experiments on real-life data sets, we show that Chameleon can successfully learn parameter initializations across tasks with different schemas, providing an average accuracy lift of 26% over random initialization and of 5% over a state-of-the-art method for learning model initializations with a fixed schema. To the best of our knowledge, this is the first work on learning model initializations across tasks with different schemas.
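The abstract's core idea can be illustrated with a toy sketch of the permutation-and-mask augmentation: tasks whose predictor matrices have different widths are zero-padded to a common width, with a mask marking which columns hold real predictors. This is a minimal illustrative stand-in, not the authors' implementation; the function name and fixed padding scheme are assumptions for the example.

```python
import numpy as np

def align_task(X, k_max, rng):
    """Pad a task's predictor matrix of shape (n, k) to a common
    width k_max, after randomly permuting its columns -- a simplified
    stand-in for the permutation/mask augmentation described in the
    abstract (illustrative sketch, not the paper's code)."""
    n, k = X.shape
    perm = rng.permutation(k)          # shuffle predictor order
    padded = np.zeros((n, k_max))      # zero out the missing predictors
    padded[:, :k] = X[:, perm]
    mask = np.zeros(k_max)
    mask[:k] = 1.0                     # 1 where a real predictor exists
    return padded, mask

rng = np.random.default_rng(0)
X_small = rng.normal(size=(4, 3))     # a task with only 3 predictors
aligned, mask = align_task(X_small, k_max=5, rng=rng)
print(aligned.shape)                  # (4, 5)
```

A meta-learner trained on many such permuted and masked views could then learn an alignment to a common representation, so that an initialization meta-learned on one schema transfers to another.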

Related research:

- 10/28/2019 — HIDRA: Head Initialization across Dynamic targets for Robust Architectures — "The performance of gradient-based optimization strategies depends heavil..."
- 05/27/2019 — Dataset2Vec: Learning Dataset Meta-Features — "Machine learning tasks such as optimizing the hyper-parameters of a mode..."
- 06/13/2019 — Learning to Forget for Meta-Learning — "Few-shot learning is a challenging problem where the system is required ..."
- 11/15/2021 — Optimizing Unlicensed Coexistence Network Performance Through Data Learning — "Unlicensed LTE-WiFi coexistence networks are undergoing consistent densi..."
- 10/16/2021 — Meta-Learning with Adjoint Methods — "Model Agnostic Meta-Learning (MAML) is widely used to find a good initia..."
- 08/19/2021 — Learning-to-learn non-convex piecewise-Lipschitz functions — "We analyze the meta-learning of the initialization and step-size of lear..."
- 04/18/2023 — Parameterized Neural Networks for Finance — "We discuss and analyze a neural network architecture, that enables learn..."
