R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling

07/02/2021
by Xiang Hu, et al.

Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY-style binary trees to emulate the composition process. We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruned tree induction algorithm that enables encoding in just a linear number of composition steps. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.
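To make the composition process concrete, below is a minimal sketch (not the authors' code) of differentiable CKY-style composition: each span's representation is a softmax-weighted mixture over its candidate split points, which is what keeps the induced tree structure end-to-end differentiable. The class name SoftCKYEncoder, the MLP composition function (the paper composes spans with a shared Transformer block), and the full O(n^3) chart (the paper's pruned induction algorithm reduces encoding to a linear number of composition steps) are all simplifying assumptions for illustration.

```python
# A minimal sketch of differentiable CKY-style composition, assuming a
# simple MLP composition function in place of R2D2's shared Transformer
# block and an unpruned chart in place of its pruned induction algorithm.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftCKYEncoder(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, dim: int):
        super().__init__()
        # Composition function: maps two child span vectors to one parent.
        self.compose = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        self.score = nn.Linear(dim, 1)  # plausibility score of a composed span

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, dim) leaf (word) embeddings.
        n = x.size(0)
        chart = {(i, i): x[i] for i in range(n)}  # span (i, j) -> vector
        for width in range(1, n):
            for i in range(n - width):
                j = i + width
                # Compose every split point k: left [i..k], right [k+1..j].
                pairs = torch.stack(
                    [torch.cat([chart[(i, k)], chart[(k + 1, j)]])
                     for k in range(i, j)]
                )
                cands = self.compose(pairs)              # (width, dim)
                w = F.softmax(self.score(cands), dim=0)  # soft split choice
                # A differentiable mixture over splits instead of a hard
                # argmax, so tree induction can be trained end to end.
                chart[(i, j)] = (w * cands).sum(dim=0)
        return chart[(0, n - 1)]  # root vector for the whole sentence

# Usage: encode a 5-token "sentence" of random embeddings.
enc = SoftCKYEncoder(dim=16)
root = enc(torch.randn(5, 16))
print(root.shape)  # torch.Size([16])
```

In this sketch, reading off the argmax split at each chart cell yields an unsupervised binary parse, while the root vector serves as the sentence representation; the paper's bidirectional pre-training objective additionally predicts each word from its left and right abstraction nodes.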


Related research

12/01/2020
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
There are two major classes of natural language grammars – the dependenc...

11/02/2017
Neural Language Modeling by Jointly Learning Syntax and Lexicon
We propose a neural language model capable of unsupervised syntactic str...

03/01/2022
Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
Recently CKY-based models show great potential in unsupervised grammar i...

09/14/2019
Tree Transformer: Integrating Tree Structures into Self-Attention
Pre-training Transformer from large-scale raw texts and fine-tuning on t...

12/15/2022
Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking
Masked language modeling (MLM) has been widely used for pre-training eff...

03/30/2020
A Hierarchical Transformer for Unsupervised Parsing
The underlying structure of natural language is hierarchical; words comb...
