Span Fine-tuning for Pre-trained Language Models

08/29/2021
by   Rongzhou Bao, et al.

Pre-trained language models (PrLMs) must carefully manage input units when training on very large texts with vocabularies of millions of words. Previous work has shown that incorporating span-level information over consecutive words during pre-training can further improve PrLM performance. However, because span-level clues are introduced and fixed at pre-training time, these methods are time-consuming and lack flexibility. To alleviate this inconvenience, this paper presents a novel span fine-tuning method for PrLMs, which allows the span setting to be determined adaptively by the specific downstream task during the fine-tuning phase. In detail, each sentence processed by the PrLM is segmented into multiple spans according to a pre-sampled dictionary. The segmentation information is then passed, together with the PrLM's representation outputs, through a hierarchical CNN module to produce a span-enhanced representation. Experiments on the GLUE benchmark show that the proposed span fine-tuning method significantly enhances the PrLM while offering greater flexibility in an efficient way.
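To make the pipeline concrete, here is a minimal sketch of the two steps the abstract describes: segmenting a tokenized sentence into spans via a dictionary, then producing span-level representations from the PrLM's token outputs. The greedy longest-match segmentation and the mean-pooling step are illustrative assumptions (the paper's actual module is a hierarchical CNN, and its dictionary-sampling procedure is not specified here); all function names are hypothetical.

```python
import numpy as np

def segment_into_spans(tokens, dictionary, max_len=3):
    """Greedily match the longest multi-word entry from a pre-sampled
    dictionary; unmatched tokens become single-token spans.
    Returns a list of (start, end) index pairs covering the sentence."""
    spans, i = [], 0
    while i < len(tokens):
        end = i + 1  # default: single-token span
        for length in range(min(max_len, len(tokens) - i), 1, -1):
            if " ".join(tokens[i:i + length]) in dictionary:
                end = i + length
                break
        spans.append((i, end))
        i = end
    return spans

def span_enhance(hidden, spans):
    """Pool token representations within each span (mean pooling here,
    as a simplified stand-in for the paper's hierarchical CNN module)."""
    return np.stack([hidden[s:e].mean(axis=0) for s, e in spans])

tokens = ["new", "york", "is", "a", "city"]
dictionary = {"new york"}          # pre-sampled span dictionary (toy example)
spans = segment_into_spans(tokens, dictionary)
hidden = np.random.rand(len(tokens), 8)   # stand-in for PrLM token outputs
span_repr = span_enhance(hidden, spans)
print(spans)            # [(0, 2), (2, 3), (3, 4), (4, 5)]
print(span_repr.shape)  # (4, 8)
```

In the paper's setting, `span_repr` would be combined with the original token-level outputs to yield the final span-enhanced representation consumed by the downstream task head; because segmentation happens at fine-tuning time, the dictionary can be chosen per task.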

