On the Role of Supervision in Unsupervised Constituency Parsing

10/06/2020
by Haoyue Shi, et al.

We analyze several recent unsupervised constituency parsing models, which are tuned with respect to the parsing F_1 score on the Wall Street Journal (WSJ) development set (1,700 sentences). We introduce strong baselines for these models by training an existing supervised parsing model (Kitaev and Klein, 2018) on the same labeled examples that they access. When trained on the 1,700 examples, or even on only 50 examples with 5 more for development, such a few-shot parsing approach can outperform all the unsupervised parsing methods by a significant margin. Few-shot parsing can be further improved by a simple data augmentation method and self-training. This suggests that, in order to arrive at fair conclusions, we should carefully consider the amount of labeled data used for model development. We propose two protocols for future work on unsupervised parsing: (i) use fully unsupervised criteria for hyperparameter tuning and model selection; (ii) use as few labeled examples as possible for model development, and compare to few-shot parsing trained on the same labeled examples.
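
For readers unfamiliar with the metric used for tuning, the following is a minimal sketch of an unlabeled bracketing F_1 score for a single sentence. It is illustrative only: the abstract does not specify the evaluation conventions (for example, whether trivial spans or punctuation are counted, or whether scores are averaged per sentence or over the corpus), so the span representation and the function name `bracket_f1` are assumptions, not the authors' evaluation code.

```python
from typing import Set, Tuple

# A constituent is represented as a (start, end) token span, end exclusive.
# This representation is an assumption made for illustration.
Span = Tuple[int, int]


def bracket_f1(pred: Set[Span], gold: Set[Span]) -> float:
    """Unlabeled bracketing F_1 between predicted and gold span sets."""
    if not pred and not gold:
        return 1.0  # nothing to predict and nothing predicted
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans for a five-token sentence.
gold = {(0, 5), (0, 2), (2, 5), (3, 5)}
pred = {(0, 5), (0, 2), (2, 5), (2, 4)}
print(bracket_f1(pred, gold))
```

Selecting hyperparameters or checkpoints by maximizing such a score on a labeled development set is the form of supervision that the abstract argues should be accounted for.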


