Enhancing Pre-trained Language Model with Lexical Simplification

12/30/2020
by Rongzhou Bao, et al.

For both human readers and pre-trained language models (PrLMs), lexical diversity may lead to confusion and inaccuracy when understanding the underlying semantic meaning of a given sentence. Lexical simplification (LS), which substitutes complex words with simpler alternatives, is a recognized method for reducing such lexical diversity and thereby improving the understandability of sentences. In this paper, we leverage LS and propose a novel approach that effectively improves the performance of PrLMs in text classification. A rule-based simplification process is first applied to a given sentence; the PrLM is then encouraged to predict the true label of that sentence with auxiliary inputs from its simplified version. Even with strong PrLMs (BERT and ELECTRA) as baselines, our approach further improves performance on various text classification tasks.
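
As a rough illustration of the idea described in the abstract, the sketch below applies a dictionary-based substitution rule to a sentence and pairs the original with its simplified version. It is not the authors' implementation: the rule table `SIMPLE_SUBSTITUTIONS`, the helper names `simplify` and `build_classifier_input`, and the choice to pass the simplified sentence as the second segment of a sentence pair are all assumptions made for this example.

```python
import re

# Hypothetical complex -> simple substitution rules (illustrative only;
# the paper's actual rule set is not given in the abstract).
SIMPLE_SUBSTITUTIONS = {
    "utilize": "use",
    "commence": "begin",
    "purchase": "buy",
    "sufficient": "enough",
}

WORD_RE = re.compile(r"[A-Za-z]+")


def simplify(sentence, rules=SIMPLE_SUBSTITUTIONS):
    """Replace each word found in `rules` with its simpler alternative."""

    def replace(match):
        word = match.group(0)
        simple = rules.get(word.lower())
        if simple is None:
            return word  # no rule for this word: keep it unchanged
        # Keep the capitalisation of the original word's first letter.
        return simple.capitalize() if word[0].isupper() else simple

    return WORD_RE.sub(replace, sentence)


def build_classifier_input(sentence, rules=SIMPLE_SUBSTITUTIONS):
    """Return (original, simplified) so the simplified text can serve as
    an auxiliary input alongside the original sentence (an assumption
    about how the auxiliary input is supplied)."""
    return sentence, simplify(sentence, rules)


if __name__ == "__main__":
    text, text_pair = build_classifier_input("We utilize sufficient data.")
    print(text)       # original sentence
    print(text_pair)  # simplified sentence
```

With a Hugging Face tokenizer, the two returned strings could be encoded together as `tokenizer(text, text_pair)`, so that a BERT or ELECTRA classifier sees both versions. Whether the paper combines them this way is not stated in the abstract.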


