ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data

05/25/2022
by   Xiaochuang Han, et al.

Large pretrained language models have been performing increasingly well on a variety of downstream tasks via prompting. However, it remains unclear where the model learns the task-specific knowledge from, especially in a zero-shot setup. In this work, we want to find evidence of the model's task-specific competence in its pretraining and are specifically interested in locating a very small subset of the pretraining data that directly supports the model on the task. We call such a subset supporting data evidence and propose a novel method, ORCA, to effectively identify it by iteratively using gradient information related to the downstream task. This supporting data evidence offers interesting insights into the prompted language models: in the tasks of sentiment analysis and textual entailment, BERT shows a substantial reliance on BookCorpus, the smaller of BERT's two pretraining corpora, as well as on pretraining examples that mask out synonyms of the task verbalizers.
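The abstract only states that ORCA iteratively uses gradient information related to the downstream task; the full procedure is given in the paper. As a rough, hypothetical illustration of the general idea rather than the authors' implementation, the sketch below scores candidate pretraining examples by the cosine similarity between each example's training gradient and the gradient of a downstream task loss, then keeps the top-scoring examples as candidate supporting data evidence. The toy model, the random data, and the single-pass (non-iterative) selection are all assumptions made for brevity.

# Hypothetical sketch (not the authors' ORCA code): rank candidate pretraining
# examples by how well their gradients align with a downstream-task gradient.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a pretrained model: a tiny 2-class classifier.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))

def flat_grad(loss):
    # Gradient of `loss` w.r.t. all model parameters, flattened into one vector.
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

# Downstream "prompted task" batch (random toy data in this sketch).
task_x, task_y = torch.randn(16, 32), torch.randint(0, 2, (16,))
task_grad = flat_grad(F.cross_entropy(model(task_x), task_y))

# Candidate "pretraining" examples to score (also random toys here).
cand_x, cand_y = torch.randn(200, 32), torch.randint(0, 2, (200,))

scores = []
for i in range(cand_x.size(0)):
    g = flat_grad(F.cross_entropy(model(cand_x[i:i + 1]), cand_y[i:i + 1]))
    # High cosine similarity: this example's gradient points in roughly the
    # same direction as the task gradient, i.e. it "supports" the task.
    scores.append(F.cosine_similarity(g, task_grad, dim=0).item())

top_k = 10
supporting = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_k]
print("Candidate supporting data evidence (indices):", supporting)

In the paper, the selection is iterative and operates on BERT's actual masked-language-modeling pretraining examples; the single pass above is only meant to convey the gradient-matching intuition.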

Related research

08/05/2021  Robust Transfer Learning with Pretrained Language Models through Adapters
Transfer learning with large pretrained transformer-based language model...

09/28/2022  Downstream Datasets Make Surprisingly Good Pretraining Corpora
For most natural language processing tasks, the dominant practice is to ...

01/06/2022  Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis
In recent years, pretrained language models have revolutionized the NLP ...

02/15/2022  Impact of Pretraining Term Frequencies on Few-Shot Reasoning
Pretrained Language Models (LMs) have demonstrated ability to perform nu...

06/26/2023  Understanding In-Context Learning via Supportive Pretraining Data
In-context learning (ICL) improves language models' performance on a var...

04/28/2022  On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
Many recent studies on large-scale language models have reported success...

03/16/2022  C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References
We consider the problem of pretraining a two-stage open-domain question ...
