CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models

02/07/2021
by Yusheng Su, et al.

Fine-tuning pre-trained language models (PLMs) has recently demonstrated its effectiveness on various downstream NLP tasks. However, in many low-resource scenarios, conventional fine-tuning strategies cannot sufficiently capture the semantic features important to downstream tasks. To address this issue, we introduce a novel framework, CSS-LM, which improves the fine-tuning phase of PLMs via contrastive semi-supervised learning. Given a target task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and the original labeled instances to help PLMs capture crucial task-related semantic features. Experimental results show that CSS-LM achieves better results than conventional fine-tuning on a series of downstream tasks in few-shot settings, and outperforms the latest supervised contrastive fine-tuning strategies. Our datasets and source code will be made publicly available.
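To make the two steps described above concrete, here is a minimal PyTorch sketch of the general idea: rank unlabeled instances by embedding similarity to the task to obtain positives and negatives, then apply an InfoNCE-style contrastive loss. The function names, the top-k/bottom-k retrieval heuristic, and the exact loss form are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn.functional as F

def retrieve_contrastive_instances(query_emb, corpus_embs, k=8):
    """Score unlabeled instances by cosine similarity to the task
    embedding: the top-k serve as positives (semantically related
    at the domain/class level), the bottom-k as negatives."""
    sims = F.cosine_similarity(query_emb.unsqueeze(0), corpus_embs, dim=-1)
    pos_idx = sims.topk(k).indices
    neg_idx = (-sims).topk(k).indices
    return pos_idx, neg_idx

def contrastive_loss(anchor, positives, negatives, tau=0.07):
    """InfoNCE-style objective: pull retrieved positives toward the
    anchor representation and push retrieved negatives away."""
    anchor = F.normalize(anchor, dim=-1)        # (1, d)
    positives = F.normalize(positives, dim=-1)  # (k, d)
    negatives = F.normalize(negatives, dim=-1)  # (k, d)
    logits = torch.cat([anchor @ positives.t(),
                        anchor @ negatives.t()], dim=-1) / tau
    log_prob = F.log_softmax(logits, dim=-1)
    # Treat each retrieved positive as a correct match and average.
    return -log_prob[:, :positives.size(0)].mean()

In the paper's setting, a contrastive term of this kind over the retrieved unlabeled instances would be combined with the standard supervised fine-tuning loss on the labeled few-shot data; the sketch above only illustrates the semi-supervised half.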


