Chinese Lexical Simplification

10/14/2020
by   Jipeng Qiang, et al.

Lexical simplification, the process of replacing complex words in a given sentence with simpler alternatives of equivalent meaning, has attracted much attention in many languages. Although the richness of the Chinese vocabulary makes text very difficult to read for children and non-native speakers, there has been no research on the Chinese lexical simplification (CLS) task. To circumvent difficulties in acquiring annotations, we manually create the first benchmark dataset for CLS, which can be used to evaluate lexical simplification systems automatically. To enable a more thorough comparison, we present five types of methods as baselines for generating substitute candidates for a complex word: a synonym-based approach, a word embedding-based approach, a pretrained language model-based approach, a sememe-based approach, and a hybrid approach. Finally, we design an experimental evaluation of these baselines and discuss their advantages and disadvantages. To the best of our knowledge, this is the first study of the CLS task.
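As a rough illustration of what a word embedding-based substitute generator might look like, the following is a minimal sketch and not the authors' implementation. The embedding table `TOY_EMBEDDINGS`, its vector values, and the function names `cosine` and `generate_candidates` are all hypothetical; a real system would load pretrained Chinese word vectors instead of hand-written ones.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
# A real system would load pretrained Chinese word vectors.
TOY_EMBEDDINGS = {
    "艰难": [0.9, 0.1, 0.0],
    "困难": [0.85, 0.15, 0.05],
    "难": [0.8, 0.2, 0.1],
    "容易": [-0.7, 0.3, 0.1],
    "苹果": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def generate_candidates(complex_word, embeddings, top_k=2):
    """Return the top_k words most similar to complex_word, excluding itself."""
    target = embeddings[complex_word]
    scored = [
        (word, cosine(target, vec))
        for word, vec in embeddings.items()
        if word != complex_word
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_k]]


candidates = generate_candidates("艰难", TOY_EMBEDDINGS)
print(candidates)
```

Similarity in embedding space alone guarantees neither that a candidate is simpler nor that it preserves the meaning, so a sketch like this would normally be followed by a separate ranking or filtering step.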


