Meta-learning for downstream aware and agnostic pretraining

06/06/2021
by Hongyin Luo, et al.

Neural network pretraining is gaining attention due to its outstanding performance in natural language processing applications. However, pretraining usually relies on a predefined sequence of tasks to learn general linguistic cues, and the lack of a mechanism for choosing appropriate tasks during pretraining makes learning and knowledge encoding inefficient. We thus propose using meta-learning to select the tasks that provide the most informative learning signals in each episode of pretraining. With the proposed method, we aim to improve the computation and memory efficiency of both the pretraining process and the resulting networks while maintaining performance. In this preliminary work, we describe the algorithm and its two variants, downstream-aware and downstream-agnostic pretraining. We also summarize our experiment plan; empirical results will be reported in future work.
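The abstract describes the idea only at a high level, so the snippet below is a minimal illustrative sketch rather than the authors' algorithm: in each episode, every candidate pretraining task is scored by how much a trial update on that task would reduce a proxy objective, and the highest-scoring task is used for the real update. The toy regression tasks, the virtual-step scoring rule, and the downstream proxy batch are all assumptions made for illustration.

```python
"""Sketch of meta-learned task selection during pretraining (illustrative only)."""
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny shared encoder standing in for the network being pretrained.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Hypothetical pretraining tasks, each yielding (inputs, targets) batches.
def make_task(scale):
    def sample_batch():
        x = torch.randn(8, 16)
        return x, scale * x  # toy regression target, different per task
    return sample_batch

tasks = {"task_a": make_task(1.0), "task_b": make_task(0.5), "task_c": make_task(-1.0)}

# Assumed held-out downstream batch for the downstream-aware variant.
downstream_x = torch.randn(8, 16)
downstream_y = downstream_x.clone()

def proxy_loss(model, downstream_aware=True):
    """Proxy objective used to score candidate tasks.

    downstream-aware: loss on a held-out downstream batch.
    downstream-agnostic: average loss over the pretraining tasks themselves.
    """
    if downstream_aware:
        return loss_fn(model(downstream_x), downstream_y)
    losses = []
    for sample_batch in tasks.values():
        x, y = sample_batch()
        losses.append(loss_fn(model(x), y))
    return torch.stack(losses).mean()

def score_task(sample_batch, downstream_aware=True):
    """Estimate how much one trial step on this task reduces the proxy loss."""
    trial = copy.deepcopy(encoder)
    trial_opt = torch.optim.SGD(trial.parameters(), lr=1e-2)
    x, y = sample_batch()
    trial_opt.zero_grad()
    loss_fn(trial(x), y).backward()
    trial_opt.step()
    with torch.no_grad():
        before = proxy_loss(encoder, downstream_aware)
        after = proxy_loss(trial, downstream_aware)
    return (before - after).item()  # larger = more informative learning signal

for episode in range(5):
    # Select the task whose trial update most improves the proxy objective.
    scores = {name: score_task(fn) for name, fn in tasks.items()}
    best = max(scores, key=scores.get)

    # Train on the selected task for this episode.
    x, y = tasks[best]()
    optimizer.zero_grad()
    loss = loss_fn(encoder(x), y)
    loss.backward()
    optimizer.step()
    print(f"episode {episode}: selected {best}, loss {loss.item():.4f}")
```

Switching `downstream_aware` to `False` in the scoring call gives the downstream-agnostic variant, which judges a task only by how much it helps the other pretraining tasks rather than a specific downstream objective.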

