CPTAM: Constituency Parse Tree Aggregation Method

01/19/2022
by Adithya Kulkarni, et al.

Diverse Natural Language Processing tasks employ constituency parsing to understand the syntactic structure of a sentence according to a phrase structure grammar. Many state-of-the-art constituency parsers have been proposed, but they may produce different results for the same sentence, especially on corpora outside their training domains. This paper adopts the truth discovery idea to aggregate constituency parse trees from different parsers by estimating their reliability in the absence of ground truth. Our goal is to consistently obtain high-quality aggregated constituency parse trees. We formulate the constituency parse tree aggregation problem in two steps: structure aggregation and constituent label aggregation. Specifically, we propose the first truth discovery solution for tree structures by minimizing the weighted sum of Robinson-Foulds (RF) distances, a classic symmetric distance metric between two trees. Extensive experiments are conducted on benchmark datasets in different languages and domains. The experimental results show that our method, CPTAM, outperforms the state-of-the-art aggregation baselines. We also demonstrate that the weights estimated by CPTAM can adequately evaluate constituency parsers in the absence of ground truth.
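To make the structure aggregation step more concrete, the following is a minimal Python sketch of the general idea, not the authors' implementation. It assumes each unlabeled parse tree is reduced to the set of (start, end) token spans covered by its internal nodes, so that the RF distance becomes the size of the symmetric difference of two span sets. The weighted-majority selection of spans and the negative-log weight update are assumptions borrowed from common truth discovery practice; the names `spans`, `rf_distance`, and `aggregate` are hypothetical, and constituent label aggregation is left out.

```python
import math


def spans(tree):
    """Return the (start, end) token spans of all internal nodes.

    A tree is a nested tuple of children; a leaf is a string token.
    """
    out = set()

    def walk(node, start):
        if isinstance(node, str):      # a leaf consumes one token
            return start + 1
        end = start
        for child in node:
            end = walk(child, end)
        out.add((start, end))          # record this constituent's span
        return end

    walk(tree, 0)
    return out


def rf_distance(a, b):
    """RF distance between two span sets: size of their symmetric difference."""
    return len(a ^ b)


def aggregate(parses, rounds=10, eps=1e-9):
    """Alternate between consensus selection and parser weight estimation.

    parses[k][i] is the span set produced by parser k for sentence i.
    Assumes at least two parsers. Returns (consensus span sets, weights).
    """
    weights = [1.0] * len(parses)
    consensus = []
    for _ in range(rounds):
        half = sum(weights) / 2
        consensus = []
        for i in range(len(parses[0])):
            support = {}
            for w, parser_output in zip(weights, parses):
                for s in parser_output[i]:
                    support[s] = support.get(s, 0.0) + w
            # Keep spans backed by more than half of the total weight
            # (assumed update rule: weighted majority).
            consensus.append({s for s, v in support.items() if v > half})
        # Total RF distance of each parser to the current consensus.
        dists = [
            sum(rf_distance(c, p[i]) for i, c in enumerate(consensus)) + eps
            for p in parses
        ]
        norm = sum(dists)
        # Assumed update rule: parsers closer to the consensus get more weight.
        weights = [-math.log(d / norm) for d in dists]
    return consensus, weights


# Three hypothetical parser outputs for "the cat sat on the mat".
tree_a = (("the", "cat"), ("sat", ("on", ("the", "mat"))))
tree_b = (("the", "cat"), ("sat", ("on", ("the", "mat"))))
tree_c = ((("the", "cat"), "sat"), ("on", ("the", "mat")))

parses = [[spans(tree_a)], [spans(tree_b)], [spans(tree_c)]]
consensus, weights = aggregate(parses)
```

In this toy input the third parser disagrees with the other two on a single bracket, so it ends up farther from the consensus and receives the lowest weight. In practice the weights would be estimated over a whole corpus rather than one sentence.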


