ELECTRA is a Zero-Shot Learner, Too

07/17/2022
by   Shiwen Ni, et al.

Recently, for few-shot and even zero-shot learning, the new "pre-train, prompt, and predict" paradigm has achieved remarkable results compared with the "pre-train, fine-tune" paradigm. After the success of prompt-based GPT-3, a series of prompt learning methods based on masked language models (MLMs) such as BERT and RoBERTa became popular and widely used. However, ELECTRA, another efficient pre-trained model that is discriminative rather than generative, has been largely neglected. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using our novel prompt learning method based on replaced token detection (RTD). Experimental results show that the ELECTRA model with RTD-prompt learning achieves surprisingly strong, state-of-the-art zero-shot performance. Numerically, compared to MLM-RoBERTa-large and MLM-BERT-large, our RTD-ELECTRA-large improves by about 8.4 points on average across all 15 tasks. On the SST-2 task in particular, our RTD-ELECTRA-large achieves an astonishing 90.1% accuracy without any training data. Overall, the pre-trained replaced token detection model performs better in zero-shot learning than the pre-trained masked language models. The source code is available at: https://github.com/nishiwen1214/RTD-ELECTRA.
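
To make the idea of RTD-based prompting concrete, here is a minimal sketch of one plausible reading of it, not the authors' implementation: each candidate label word is placed into a prompt, a discriminator scores how likely that word is to be a "replaced" token, and the label whose word looks least replaced is chosen. The function names, the prompt prefix "It was", the label words, and the toy scorer are all assumptions made for illustration; in practice the scorer would be a pre-trained ELECTRA discriminator.

```python
from typing import Callable, Dict, List, Tuple

# Assumed interface: given tokens and a position, return the probability
# that the token at that position was replaced (as an RTD discriminator would).
ReplacedProb = Callable[[List[str], int], float]


def rtd_zero_shot(
    text: str,
    label_words: Dict[str, str],
    replaced_prob: ReplacedProb,
    prefix: Tuple[str, ...] = ("It", "was"),  # hypothetical prompt template
) -> Tuple[str, Dict[str, float]]:
    """Pick the label whose word looks least 'replaced' inside the prompt."""
    context = text.split()
    scores: Dict[str, float] = {}
    for label, word in label_words.items():
        # Prompt = input text + template prefix + candidate label word.
        tokens = context + list(prefix) + [word]
        scores[label] = replaced_prob(tokens, len(tokens) - 1)
    best = min(scores, key=scores.get)
    return best, scores


def toy_replaced_prob(tokens: List[str], index: int) -> float:
    """Toy stand-in for a real discriminator; NOT a trained model."""
    positive_cues = {"wonderful", "moving", "great"}
    negative_cues = {"dull", "boring", "terrible"}
    context = {t.lower() for t in tokens[:index]}
    pos = len(context & positive_cues)
    neg = len(context & negative_cues)
    word = tokens[index]
    fits = (word == "great" and pos >= neg) or (word == "terrible" and neg > pos)
    # A word that fits its context is unlikely to have been replaced.
    return 0.1 if fits else 0.9


label, scores = rtd_zero_shot(
    "A wonderful and moving film",
    {"positive": "great", "negative": "terrible"},
    toy_replaced_prob,
)
print(label, scores)
```

Because the scorer is passed in as a function, the toy heuristic can be swapped for a real discriminator without changing the selection logic; no labeled examples are used at any point, which is what makes the setup zero-shot.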


