Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

06/01/2021
by   Mark Anderson, et al.

Søgaard (2020) obtained results suggesting that the fraction of test-set trees isomorphic to trees in the training set accounts for a non-trivial amount of the variation in parser performance. As with other statistical analyses in NLP, the results were based on evaluating linear regressions. However, the study had methodological issues and used a small sample, leading to unreliable results. We present a replication study in which we also bin sentences by length, and find that performance varies with graph isomorphism for only a small subset of sentences. Further, the correlation observed between parser performance and graph isomorphism in the wild disappears when controlling for covariants. However, in a controlled experiment, where covariants are kept fixed, we do observe a strong correlation. We suggest that conclusions drawn from statistical analyses like this need to be tempered and that controlled experiments can complement them by more readily teasing factors apart.
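To make the central quantity concrete, the following is a minimal sketch of how the fraction of test trees isomorphic to some training tree could be computed. It is not the authors' code. It assumes unlabeled dependency trees given as lists of 1-based head indices (0 for the root) and treats isomorphism as unordered rooted-tree isomorphism, which may differ from the definition used in the paper. The function names `canonical_form` and `isomorphic_fraction` are hypothetical.

```python
from typing import Dict, List, Sequence


def canonical_form(heads: Sequence[int]) -> str:
    """Return a string that is identical for isomorphic unordered rooted trees.

    Assumption: heads[i] is the 1-based head of token i+1, and 0 marks the root.
    The input is assumed to be a well-formed tree (no cycles).
    """
    children: Dict[int, List[int]] = {i: [] for i in range(len(heads) + 1)}
    for dependent, head in enumerate(heads, start=1):
        children[head].append(dependent)

    def encode(node: int) -> str:
        # Sorting the child encodings makes the result independent of word order.
        return "(" + "".join(sorted(encode(c) for c in children[node])) + ")"

    return encode(0)  # node 0 is the artificial root


def isomorphic_fraction(train: Sequence[Sequence[int]],
                        test: Sequence[Sequence[int]]) -> float:
    """Fraction of test trees whose shape also occurs in the training set."""
    if not test:
        return 0.0
    seen = {canonical_form(tree) for tree in train}
    return sum(canonical_form(tree) in seen for tree in test) / len(test)


# Toy data (illustrative only, not from any treebank).
train_trees = [[2, 0, 2]]            # middle token heads both neighbours
test_trees = [[0, 1, 1], [0, 1, 2]]  # same shape as above; a chain
fraction = isomorphic_fraction(train_trees, test_trees)
```

In this toy case the first test tree has the same shape as the training tree and the chain does not, so `fraction` is 0.5. A per-treebank value like this could then serve as one regressor alongside covariants such as sentence length.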

