Co-training an Unsupervised Constituency Parser with Weak Supervision

10/05/2021
by Nickil Maveli, et al.

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify whether a node dominates a specific span in a sentence. There are two types of classifiers: an inside classifier that acts on a span, and an outside classifier that acts on everything outside of a given span. Through self-training and co-training with the two classifiers, we show that the interplay between them helps improve the accuracy of both and, as a result, lets them parse effectively. A seed bootstrapping technique prepares the data to train these classifiers. Our analyses further validate that such an approach, in conjunction with weak supervision using prior branching knowledge of a known language (left/right-branching) and minimal heuristics, injects strong inductive bias into the parser, achieving 63.1 F1 on the English (PTB) test set. In addition, we show the effectiveness of our architecture by evaluating on treebanks for Chinese (CTB) and Japanese (KTB), where we achieve new state-of-the-art results. [For code or data, please contact the authors.]
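To make the inside/outside distinction in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It only shows which tokens each kind of classifier would see for a candidate span, and which spans a purely left- or right-branching tree contains (the kind of branching prior the abstract mentions). All function names and the half-open `(start, end)` span convention are assumptions made for this example.

```python
from typing import List, Tuple

# Assumption: a span is a half-open pair (start, end) of token indices.
Span = Tuple[int, int]


def inside_view(tokens: List[str], span: Span) -> List[str]:
    """Tokens covered by the span: what an inside classifier would look at."""
    start, end = span
    return tokens[start:end]


def outside_view(tokens: List[str], span: Span) -> Tuple[List[str], List[str]]:
    """Left and right context of the span: what an outside classifier would look at."""
    start, end = span
    return tokens[:start], tokens[end:]


def right_branching_spans(n: int) -> List[Span]:
    """Multi-token spans of a purely right-branching binary tree over n tokens."""
    return [(i, n) for i in range(n - 1)]


def left_branching_spans(n: int) -> List[Span]:
    """Multi-token spans of a purely left-branching binary tree over n tokens."""
    return [(0, j) for j in range(n, 1, -1)]


if __name__ == "__main__":
    tokens = "the cat sat down".split()
    span = (1, 3)
    print(inside_view(tokens, span))
    print(outside_view(tokens, span))
    print(right_branching_spans(len(tokens)))
    print(left_branching_spans(len(tokens)))
```

In a co-training setup of the kind the abstract describes, the two views would feed two separate classifiers whose confident predictions serve as extra training labels for each other; that training loop is deliberately left out here because the abstract does not specify it.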


research
10/07/2020

Unsupervised Parsing via Constituency Tests

We propose a method for unsupervised parsing based on the linguistic not...
research
07/18/2016

Dependency Language Models for Transition-based Dependency Parsing

In this paper, we present an approach to improve the accuracy of a stron...
research
05/20/2021

Dependency Parsing with Bottom-up Hierarchical Pointer Networks

Dependency parsing is a crucial step towards deep language understanding...
research
09/10/2021

Improved Latent Tree Induction with Distant Supervision via Span Constraints

For over thirty years, researchers have developed and analyzed methods f...
research
01/27/2021

PPT: Parsimonious Parser Transfer for Unsupervised Cross-Lingual Adaptation

Cross-lingual transfer is a leading technique for parsing low-resource l...
research
04/03/2019

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders

We introduce deep inside-outside recursive autoencoders (DIORA), a fully...
research
10/29/2021

Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence

Unsupervised constituency parsing has been explored much but is still fa...
