Text analysis and deep learning: A network approach

10/08/2021
by   Ingo Marquart, et al.

Much of the information available to applied researchers is contained in written or spoken language. Deep language models such as BERT have achieved unprecedented success in many applications of computational linguistics, yet much less is known about how these models can be used to analyze an existing body of text. We propose a novel method that combines transformer models with network analysis to form a self-referential representation of language use within a corpus of interest. Our approach produces linguistic relations that are strongly consistent with the underlying model, together with mathematically well-defined operations on them, while reducing the number of discretionary choices of representation and distance measures. It is, to the best of our knowledge, the first unsupervised method to extract semantic networks directly from deep language models. We illustrate the approach with a semantic analysis of the term "founder". Using the entire corpus of the Harvard Business Review from 1980 to 2020, we find that ties in our network track the semantics of discourse over time and across contexts, identifying and relating clusters of semantic and syntactic relations. Finally, we discuss how this method can also complement and inform analyses of the behavior of deep learning models.
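The abstract's core idea of turning a language model's output into network ties can be sketched roughly as follows. The sketch below is an illustration, not the authors' implementation: it stubs in hypothetical substitution probabilities (which, in an approach like the one described, would come from a masked language model such as BERT) and aggregates them into weighted ties from a focal term to its likely substitutes. The `substitution_probs` function, the example contexts, and all numbers are invented for demonstration.

```python
from collections import defaultdict

def substitution_probs(token, context):
    """Stub for a masked language model's substitution distribution:
    the probability of each replacement word when `token` is masked
    in `context`. All values here are invented for illustration."""
    stub = {
        ("founder", "the founder led the company"): {
            "entrepreneur": 0.41, "ceo": 0.33, "owner": 0.12,
        },
        ("founder", "the founder of the startup"): {
            "creator": 0.52, "entrepreneur": 0.21, "inventor": 0.09,
        },
    }
    return stub.get((token, context), {})

def build_semantic_network(occurrences, threshold=0.1):
    """Aggregate substitution probabilities over all occurrences of each
    focal token into weighted directed ties (token -> substitute),
    keeping only ties whose average weight clears `threshold`."""
    weights = defaultdict(float)
    counts = defaultdict(int)
    for token, context in occurrences:
        counts[token] += 1
        for sub, p in substitution_probs(token, context).items():
            weights[(token, sub)] += p
    return {
        edge: total / counts[edge[0]]
        for edge, total in weights.items()
        if total / counts[edge[0]] >= threshold
    }

# Two hypothetical occurrences of the focal term "founder".
occurrences = [
    ("founder", "the founder led the company"),
    ("founder", "the founder of the startup"),
]
network = build_semantic_network(occurrences)
for (src, dst), w in sorted(network.items(), key=lambda kv: -kv[1]):
    print(f"{src} -> {dst}: {w:.2f}")
```

Averaging over occurrences and thresholding is one simple way to turn per-context model outputs into a corpus-level network; the paper's actual aggregation and distance choices may differ.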


