Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction

09/10/2018
by Lifeng Jin, et al.

There have been several recent attempts to improve the accuracy of grammar induction systems by bounding the recursive complexity of the induction model (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016; Jin et al., 2018). Modern depth-bounded grammar inducers have been shown to be more accurate than early unbounded PCFG inducers, but this technique has never been compared against unbounded induction within the same system, in part because most previous depth-bounding models are built around sequence models, the complexity of which grows exponentially with the maximum allowed depth. The present work instead applies depth bounds within a chart-based Bayesian PCFG inducer (Johnson et al., 2007b), where bounding can be switched on and off, and then samples trees with and without bounding. Results show that depth-bounding is indeed effective in significantly limiting the search space of the inducer and thereby increasing the accuracy of the resulting parsing model. Moreover, parsing results on English, Chinese and German show that this bounded model, with a new inference technique, produces parse trees that match or exceed the accuracy of state-of-the-art constituency-based grammar induction models.
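The claim that depth-bounding limits the inducer's search space can be illustrated with a toy sketch. Assuming, as a rough proxy for the left-corner memory-depth measure used in this line of work, that a tree's depth is the maximum number of right-then-left turns on any root-to-node path (so purely left- or right-branching trees are cheap, while center embeddings cost memory), the fraction of binary bracketings of a sentence that survive a small depth bound shrinks quickly:

```python
# Binary trees as nested tuples: a leaf is an int (word index),
# an internal node is a pair (left, right).

def embedding_depth(tree, zigzags=0, from_right=False):
    """Max number of right-then-left turns on any root-to-node path.

    This is a simplified proxy for center-embedding depth, not the
    exact depth measure of any particular published model.
    """
    if not isinstance(tree, tuple):
        return zigzags
    left, right = tree
    # Descending into the left child of a right child starts a new
    # center embedding, which costs one extra unit of memory.
    return max(
        embedding_depth(left, zigzags + (1 if from_right else 0), False),
        embedding_depth(right, zigzags, True),
    )

def all_binary_trees(words):
    """Enumerate every binary bracketing of a word sequence."""
    if len(words) == 1:
        return [words[0]]
    trees = []
    for split in range(1, len(words)):
        for l in all_binary_trees(words[:split]):
            for r in all_binary_trees(words[split:]):
                trees.append((l, r))
    return trees

words = list(range(8))
trees = all_binary_trees(words)          # Catalan(7) = 429 bracketings
bounded = [t for t in trees if embedding_depth(t) <= 1]
print(len(trees), len(bounded))
```

Under this proxy measure a right-branching analysis has depth 1, a left-branching one depth 0, and a doubly center-embedded tree such as `(0, ((1, (2, 3)), 4))` has depth 2, so a bound of 1 prunes it from the hypothesis space. A chart-based inducer can apply the same idea by refusing to build chart items whose depth annotation exceeds the bound, rather than enumerating trees as this sketch does.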
