Cross-Domain Generalization of Neural Constituency Parsers

07/09/2019
by   Daniel Fried, et al.
0

Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing -- but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/13/2023

Why Can't Discourse Parsing Generalize? A Thorough Investigation of the Impact of Data Diversity

Recent advances in discourse parsing performance create the impression t...
research
01/26/2023

Neural-Symbolic Inference for Robust Autoregressive Graph Parsing via Compositional Uncertainty Quantification

Pre-trained seq2seq models excel at graph semantic parsing with rich ann...
research
10/22/2020

Meta-Learning for Domain Generalization in Semantic Parsing

The importance of building semantic parsers which can be applied to new ...
research
08/27/2018

Zero-shot Transfer Learning for Semantic Parsing

While neural networks have shown impressive performance on large dataset...
research
12/04/2021

Hierarchical Neural Data Synthesis for Semantic Parsing

Semantic parsing datasets are expensive to collect. Moreover, even the q...
research
04/29/2020

A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging

Part of speech tagging is a fundamental NLP task often regarded as solve...
research
02/10/2023

Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization

This work addresses the cross-corpora generalization issue for the low-r...

Please sign up or login with your details

Forgot password? Click here to reset