Improved Latent Tree Induction with Distant Supervision via Span Constraints

09/10/2021
by Zhiyang Xu, et al.

For over thirty years, researchers have developed and analyzed methods for latent tree induction as an approach to unsupervised syntactic parsing. Nonetheless, modern systems still fall too far short of their supervised counterparts to be of practical use for structural annotation of text. In this work, we present a technique that uses distant supervision in the form of span constraints (i.e., phrase bracketing) to improve performance in unsupervised constituency parsing. Using a relatively small number of span constraints, we can substantially improve the output of DIORA, an already competitive unsupervised parsing system. Compared with full parse tree annotation, span constraints can be acquired with minimal effort, for example by using a lexicon derived from Wikipedia to find exact text matches. Our experiments show that span constraints based on entities improve constituency parsing on the English WSJ Penn Treebank by more than 5 F1. Furthermore, our method extends to any domain where span constraints are easily attainable, and as a case study we demonstrate its effectiveness by parsing biomedical text from the CRAFT dataset.
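
To make the idea of lexicon-derived span constraints concrete, the following is a minimal sketch of exact text matching over a tokenized sentence. It is not the authors' implementation: the function name `find_span_constraints`, the `max_len` cap, the lowercase matching, the restriction to spans of two or more tokens, and the toy lexicon are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def find_span_constraints(
    tokens: List[str],
    lexicon: Set[Tuple[str, ...]],
    max_len: int = 5,  # hypothetical cap on phrase length
) -> List[Span]:
    """Return spans of 2+ tokens whose lowercased text exactly matches a lexicon entry."""
    lowered = [t.lower() for t in tokens]
    spans: List[Span] = []
    for start in range(len(lowered)):
        # Single-token spans bracket nothing, so begin at length 2.
        for length in range(2, max_len + 1):
            end = start + length
            if end > len(lowered):
                break
            if tuple(lowered[start:end]) in lexicon:
                spans.append((start, end))
    return spans


# Toy lexicon standing in for entries derived from a resource such as Wikipedia.
toy_lexicon = {("new", "york"), ("new", "york", "city")}
sentence = "She moved to New York City last year".split()
constraints = find_span_constraints(sentence, toy_lexicon)
print(constraints)
```

Each returned pair marks a token range that a parser could be encouraged to treat as a single bracket. How such constraints enter training is described in the full paper, not in this sketch.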


