Linguistic dependencies and statistical dependence

04/18/2021
by Jacob Louis Hoover, et al.

What is the relationship between linguistic dependencies and statistical dependence? Building on earlier work in NLP and cognitive science, we study this question. We introduce a contextualized version of pointwise mutual information (CPMI), using pretrained language models to estimate probabilities of words in context. Extracting dependency trees that maximize CPMI, we compare the resulting structures against gold dependencies. Overall, we find that these maximum-CPMI trees correspond to linguistic dependencies more often than trees extracted from a non-contextual PMI estimate, but only roughly as often as a simple baseline formed by connecting adjacent words. We also provide evidence that the extent to which the two kinds of dependency align cannot be explained by the distance between words or by the category of the dependency relation. Finally, our analysis sheds some light on the differences between large pretrained language models, specifically in the kinds of inductive biases they encode.
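The tree-extraction and evaluation steps described above can be illustrated with a minimal sketch. It assumes a symmetric matrix of pairwise word scores is already available; the numbers below are invented for illustration and are not CPMI values from the paper, and the language-model scoring step is not shown. The function names are hypothetical, and a plain maximum spanning tree (Prim's algorithm) stands in for whatever extraction procedure the authors use, which may impose further constraints such as projectivity.

```python
def max_spanning_tree(scores):
    """Return the undirected maximum spanning tree over word positions,
    as a sorted list of (i, j) pairs with i < j, using Prim's algorithm."""
    n = len(scores)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                if best is None or scores[i][j] > best[0]:
                    best = (scores[i][j], i, j)
        _, i, j = best
        edges.append((min(i, j), max(i, j)))
        in_tree.add(j)
    return sorted(edges)


def adjacent_baseline(n):
    """Baseline tree that connects each word to its right-hand neighbour."""
    return [(i, i + 1) for i in range(n - 1)]


def undirected_accuracy(predicted, gold):
    """Fraction of gold edges recovered, ignoring edge direction."""
    gold_set = {tuple(sorted(e)) for e in gold}
    hits = sum(1 for e in predicted if tuple(sorted(e)) in gold_set)
    return hits / len(gold_set)


# Toy sentence with hypothetical gold edges and hypothetical scores.
words = ["dogs", "often", "chase", "cats"]
gold = [(0, 2), (1, 2), (2, 3)]
scores = [
    [0.0, 0.3, 2.0, 0.5],
    [0.3, 0.0, 1.0, 0.1],
    [2.0, 1.0, 0.0, 1.8],
    [0.5, 0.1, 1.8, 0.0],
]

tree = max_spanning_tree(scores)
tree_accuracy = undirected_accuracy(tree, gold)
baseline_accuracy = undirected_accuracy(adjacent_baseline(len(words)), gold)
```

On this toy input the extracted tree recovers all three gold edges, while the adjacent-word baseline recovers two of them. The abstract reports that on real data the baseline is roughly as accurate as the maximum-CPMI trees, so the toy numbers should not be read as representative.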

research
04/12/2021

On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies

We study how masking and predicting tokens in an unsupervised fashion ca...
research
04/29/2020

Do Neural Language Models Show Preferences for Syntactic Formalisms?

Recent work on the interpretability of deep neural language models has c...
research
10/11/2020

Do Language Embeddings Capture Scales?

Pretrained Language Models (LMs) have been shown to possess significant ...
research
04/25/2021

Reranking Machine Translation Hypotheses with Structured and Web-based Language Models

In this paper, we investigate the use of linguistically motivated and co...
research
12/10/2020

Infusing Finetuning with Semantic Dependencies

For natural language processing systems, two kinds of evidence support t...
research
07/30/2015

Information-theoretical analysis of the statistical dependencies among three variables: Applications to written language

We develop the information-theoretical concepts required to study the st...
