Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification

05/21/2023
by   Renliang Sun, et al.

Randomly masking text spans in ordinary texts during pre-training hardly allows models to acquire the ability to generate simple texts, which can hurt the performance of pre-trained models on text simplification tasks. In this paper, we propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts. We continue pre-training BART, a representative model, to obtain SimpleBART. SimpleBART consistently and significantly improves on BART's results in lexical simplification, sentence simplification, and document-level simplification tasks. Finally, we compare SimpleBART with several representative large language models (LLMs).


