Semi-Supervised Methods for Out-of-Domain Dependency Parsing

10/04/2018
by Juntao Yu, et al.

Dependency parsing is an important natural language processing task that assigns syntactic trees to texts. Thanks to the wider availability of dependency corpora and to improved parsing and machine learning techniques, the accuracy of supervised parsing systems has improved significantly. However, because they are trained by supervised learning, these systems rely heavily on manually annotated training corpora. They work reasonably well on in-domain data, but their performance drops significantly when they are tested on out-of-domain texts. To bridge the performance gap between in-domain and out-of-domain parsing, this thesis investigates three semi-supervised techniques for out-of-domain dependency parsing: co-training, self-training and dependency language models. These approaches use easily obtainable unlabelled data to improve out-of-domain parsing accuracy without the need for expensive corpus annotation. Evaluations on several English domains and on multi-lingual data show improvements in parsing accuracy. Overall, this work surveys semi-supervised methods for out-of-domain dependency parsing, in which I extend and compare a number of important semi-supervised methods in a unified framework. The comparison shows that self-training works as well as co-training on out-of-domain parsing, while dependency language models can improve both in-domain and out-of-domain accuracy.


