COMPILING: A Benchmark Dataset for Chinese Complexity Controllable Definition Generation

09/29/2022
by Jiaxin Yuan, et al.

The definition generation task aims to automatically generate a definition for a word in a specific context. However, because datasets covering different complexity levels are lacking, the definitions that models produce tend to stay at a single complexity level. This paper proposes a novel task: generating definitions for a word at controllable complexity levels. To support it, we introduce COMPILING, a dataset that provides detailed information about Chinese definitions, with each definition labeled with its complexity level. The COMPILING dataset includes 74,303 words and 106,882 definitions. To the best of our knowledge, it is the largest dataset for the Chinese definition generation task. We select several representative generation methods as baselines for this task and evaluate them; the results indicate that our dataset helps models generate definitions at different complexity levels. We believe that the COMPILING dataset will benefit further research on complexity-controllable definition generation.
