Unsupervised and Few-shot Parsing from Pretrained Language Models

06/10/2022
by Zhiyuan Zeng, et al.

Pretrained language models are generally acknowledged to encode syntax [Tenney et al., 2019, Jawahar et al., 2019, Hewitt and Manning, 2019]. In this article, we propose UPOA, an Unsupervised constituent Parsing model that computes an Outside Association score, derived solely from the self-attention weight matrices learned in a pretrained language model, as the syntactic distance for span segmentation. We further propose an enhanced version, UPIO, which exploits both inside and outside association scores to estimate the likelihood of a span. Experiments with UPOA and UPIO reveal that the linear projection matrices for the query and key in the self-attention mechanism play an important role in parsing. We therefore extend the unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few annotated trees to learn better linear projection matrices for parsing. Experiments on the Penn Treebank demonstrate that our unsupervised parsing model UPIO achieves results comparable to the state of the art on short sentences (length ≤ 10). Our few-shot parsing model FPIO, trained with only 20 annotated trees, outperforms a previous few-shot parsing method trained with 50 annotated trees. Experiments on cross-lingual parsing show that both the unsupervised and few-shot parsing methods outperform previous methods on most languages of SPMRL [Seddah et al., 2013].
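To make the idea concrete, below is a minimal Python sketch (using the HuggingFace transformers library) of top-down parsing driven by a pretrained model's self-attention weights: a span is split recursively at the point where its two halves attend to each other least, on the intuition that words inside the same constituent associate strongly. The split_score function and the fixed layer/head choice are illustrative stand-ins, not the exact UPOA/UPIO formulas from the paper.

```python
# Illustrative sketch only: scores splits by cross-span attention, a simplified
# proxy for the association scores described in the abstract.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased", output_attentions=True)
model.eval()

def attention_matrix(words, layer=9, head=0):
    """Word-to-word self-attention from one (arbitrarily chosen) layer/head."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        att = model(**enc).attentions[layer][0, head]  # (seq, seq), incl. specials
    # Crude word-piece-to-word alignment: keep the first piece of each word.
    ids = enc.word_ids(0)
    first = [ids.index(i) for i in range(len(words))]
    return att[first][:, first]  # (n_words, n_words)

def split_score(att, i, k, j):
    """Mean attention between [i, k) and [k, j); lower = weaker association."""
    return att[i:k, k:j].mean().item() + att[k:j, i:k].mean().item()

def parse(att, words, i, j):
    """Greedy top-down binary parsing at the weakest cross-span association."""
    if j - i <= 1:
        return words[i]
    k = min(range(i + 1, j), key=lambda k: split_score(att, i, k, j))
    return (parse(att, words, i, k), parse(att, words, k, j))

words = "the cat sat on the mat".split()
print(parse(attention_matrix(words), words, 0, len(words)))
```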
