Generalized Optimal Linear Orders

08/13/2021
by Rishi Bommasani, et al.

The sequential structure of language, and the order of words in a sentence specifically, plays a central role in human language processing. Consequently, in designing computational models of language, the de facto approach is to present sentences to machines with the words in the same order as in the original human-authored sentence. This work questions the implicit assumption that this is desirable and aims to bring theoretical soundness to the treatment of word order in natural language processing. In this thesis, we begin by uniting the disparate treatments of word order in cognitive science, psycholinguistics, computational linguistics, and natural language processing under a flexible algorithmic framework. We proceed to use this heterogeneous theoretical foundation as the basis for exploring new word orders with an undercurrent of psycholinguistic optimality. In particular, we focus on notions of dependency length minimization, given the difficulties that both human and computational language processing have in handling long-distance dependencies. We then discuss algorithms for finding optimal word orders efficiently despite the combinatorial space of possibilities. We conclude by addressing the implications of these word orders for human language and their downstream impact when integrated into computational models.
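To make the notion of dependency length concrete, the sketch below computes the total dependency length of a word order and finds a minimizing order by exhaustive search. It is only an illustration under stated assumptions: the function names, the example sentence, and its dependency edges are hypothetical and not taken from the thesis, word tokens are assumed to be unique, and the brute-force search stands in for the efficient algorithms the abstract refers to.

```python
from itertools import permutations


def total_dependency_length(order, edges):
    """Sum of |position(head) - position(dependent)| over all dependency edges."""
    position = {word: i for i, word in enumerate(order)}
    return sum(abs(position[head] - position[dep]) for head, dep in edges)


def brute_force_min_order(words, edges):
    """Try every permutation; only feasible for very short sentences (n! orders)."""
    return min(permutations(words),
               key=lambda order: total_dependency_length(order, edges))


# Hypothetical example: (head, dependent) pairs for a five-word sentence.
words = ["the", "dog", "chased", "a", "cat"]
edges = [("dog", "the"), ("chased", "dog"), ("chased", "cat"), ("cat", "a")]

original_cost = total_dependency_length(words, edges)
best_order = brute_force_min_order(words, edges)
best_cost = total_dependency_length(best_order, edges)
```

The factorial growth of the search space in this sketch is what motivates the efficient algorithms discussed in the thesis.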

research
08/04/2020

Word meaning in minds and machines

Machines show an increasingly broad set of linguistic competencies, than...
research
12/30/2020

Out of Order: How important is the sequential order of words in a sentence in Natural Language Understanding tasks?

Do state-of-the-art natural language understanding models care about wor...
research
06/04/2014

A Geometric Method to Obtain the Generation Probability of a Sentence

"How to generate a sentence" is the most critical and difficult problem ...
research
05/09/2023

Estimating related words computationally using language model from the Mahabharata – an Indian epic

'Mahabharata' is the most popular among many Indian pieces of literature...
research
10/09/2015

Human languages order information efficiently

Most languages use the relative order between words to encode meaning re...
research
05/19/2020

Human-like general language processing

Using language makes human beings surpass animals in wisdom. To let mach...
research
09/10/2021

Integrating Approaches to Word Representation

The problem of representing the atomic elements of language in modern ne...
