Building an Ellipsis-aware Chinese Dependency Treebank for Web Text

01/20/2018
by   Xuancheng Ren, et al.

Web 2.0 has brought with it a wealth of user-generated data revealing people's thoughts, experiences, and knowledge, which is a valuable source for many tasks, such as information extraction and knowledge base construction. However, the colloquial nature of these texts poses new challenges for current natural language processing techniques, which are better adapted to the formal form of the language. Ellipsis is a common linguistic phenomenon in which words are left out because they can be understood from the context, especially in spoken utterances. It hinders the improvement of dependency parsing, which is of great importance for tasks that rely on the meaning of the sentence. To promote research in this area, we are releasing a Chinese dependency treebank of 319 weibos, containing 572 sentences with omissions restored and contexts preserved.

research
04/09/2019

A Unified Model for Joint Chinese Word Segmentation and Dependency Parsing

Chinese word segmentation and dependency parsing are two fundamental tas...
research
10/16/2017

BKTreebank: Building a Vietnamese Dependency Treebank

Dependency treebank is an important resource in any language. In this pa...
research
05/20/2021

Dependency Parsing with Bottom-up Hierarchical Pointer Networks

Dependency parsing is a crucial step towards deep language understanding...
research
10/15/2019

On Constructing a Knowledge Base of Chinese Criminal Cases

We are developing a knowledge base over Chinese judicial decision docume...
research
09/18/2020

fastHan: A BERT-based Joint Many-Task Toolkit for Chinese NLP

We present fastHan, an open-source toolkit for four basic tasks in Chine...
research
12/18/2017

A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction

Abbreviation is a common phenomenon across languages, especially in Chin...
research
05/01/2022

Conventions and Mutual Expectations – understanding sources for web genres

Genres can be understood in many different ways. They are often perceive...
