Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation

by   Xiang Hu, et al.

Recently CKY-based models show great potential in unsupervised grammar induction thanks to their human-like encoding paradigm, which runs recursively and hierarchically, but requires O(n^3) time-complexity. Recursive Transformer based on Differentiable Trees (R2D2) makes it possible to scale to large language model pre-training even with complex tree encoder by introducing a heuristic pruning method. However, the rule-based pruning approach suffers from local optimum and slow inference issues. In this paper, we fix those issues in a unified method. We propose to use a top-down parser as a model-based pruning method, which also enables parallel encoding during inference. Typically, our parser casts parsing as a split point scoring task, which first scores all split points for a given sentence, and then recursively splits a span into two by picking a split point with the highest score in the current span. The reverse order of the splits is considered as the order of pruning in R2D2 encoder. Beside the bi-directional language model loss, we also optimize the parser by minimizing the KL distance between tree probabilities from parser and R2D2. Our experiments show that our Fast-R2D2 improves performance significantly in grammar induction and achieves competitive results in downstream classification tasks.


page 1

page 2

page 3

page 4


Dependency Grammar Induction with a Neural Variational Transition-based Parser

Dependency grammar induction is the task of learning dependency syntax w...

A Span-based Linearization for Constituent Trees

We propose a novel linearization of a constituent tree, together with a ...

Pika parsing: parsing in reverse solves the left recursion and error recovery problems

A recursive descent parser is built from a set of mutually-recursive fun...

R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling

Human language understanding operates at multiple levels of granularity ...

A Fast Unified Model for Parsing and Sentence Understanding

Tree-structured neural networks exploit valuable syntactic parse informa...

Unsupervised Parsing via Constituency Tests

We propose a method for unsupervised parsing based on the linguistic not...

An Empirical Study of Compound PCFGs

Compound probabilistic context-free grammars (C-PCFGs) have recently est...

Please sign up or login with your details

Forgot password? Click here to reset