Polling Latent Opinions: A Method for Computational Sociolinguistics Using Transformer Language Models

04/15/2022
by Philip Feldman, et al.

Analysis of social media text for sentiment, topics, and other properties depends first on the selection of keywords and phrases used to create the research corpora. However, the keywords that researchers choose may occur infrequently, leading to the errors that arise from small samples. In this paper, we use the capacity of Transformer Language Models such as the GPT series for memorization, interpolation, and extrapolation to learn the linguistic behaviors of a subgroup within larger corpora of Yelp reviews. We then use prompt-based queries to generate synthetic text that can be analyzed to produce insights into specific opinions held by the populations on which the models were trained. Once the model has learned these behaviors, more specific sentiment queries can be made of it, with high levels of accuracy compared to traditional keyword searches. We show that even when a specific keyphrase is rare or entirely absent in the training corpora, the GPT model can accurately generate large volumes of text with the correct sentiment.
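The polling idea described in the abstract (prompt a trained model many times, then measure the sentiment of the generated text) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_generate` and `toy_sentiment` are hypothetical stand-ins for a fine-tuned GPT model and a sentiment classifier, and the prompt, word lists, and sample sizes are invented for illustration only.

```python
import random
from collections import Counter
from typing import Callable, Dict

# ASSUMPTION: stand-in for sampling from a fine-tuned GPT model.
# A real setup would return a model-generated continuation of the prompt.
def toy_generate(prompt: str, rng: random.Random) -> str:
    endings = [
        "great and I would come back.",
        "terrible and overpriced.",
        "good, with friendly staff.",
        "fine, nothing special.",
    ]
    return f"{prompt} {rng.choice(endings)}"

# ASSUMPTION: a tiny word lexicon standing in for a real sentiment classifier.
POSITIVE = {"great", "good", "friendly"}
NEGATIVE = {"terrible", "overpriced"}

def toy_sentiment(text: str) -> str:
    """Label text by comparing counts of positive and negative lexicon words."""
    words = {w.strip(".,!").lower() for w in text.split()}
    pos = len(words & POSITIVE)
    neg = len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

def poll(
    prompt: str,
    generate: Callable[[str, random.Random], str],
    classify: Callable[[str], str],
    n_samples: int = 200,
    seed: int = 0,
) -> Dict[str, float]:
    """Generate n_samples synthetic texts for a prompt and return label shares."""
    rng = random.Random(seed)
    counts = Counter(classify(generate(prompt, rng)) for _ in range(n_samples))
    return {
        label: counts[label] / n_samples
        for label in ("positive", "negative", "neutral")
    }

if __name__ == "__main__":
    shares = poll("The vegetarian options were", toy_generate, toy_sentiment)
    for label, share in shares.items():
        print(f"{label}: {share:.2%}")
```

Because the generator and classifier are passed in as callables, the toy stand-ins could be swapped for an actual model and classifier without changing `poll`. The number of samples is chosen by the analyst rather than by how often a keyphrase happens to occur in the corpus, which is the contrast with keyword search that the abstract draws.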

research
04/20/2021

Analyzing COVID-19 Tweets with Transformer-based Language Models

This paper describes a method for using Transformer-based Language Model...
research
01/12/2023

The Keyword Explorer Suite: A Toolkit for Understanding Online Populations

We have developed a set of Python applications that use large language m...
research
10/06/2020

Investigating African-American Vernacular English in Transformer-Based Text Generation

The growth of social media has encouraged the written use of African Ame...
research
03/29/2021

Contextual Text Embeddings for Twi

Transformer-based language models have been changing the modern Natural ...
research
08/07/2023

Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining

Opinion mining, also known as sentiment analysis, is a subfield of natur...
research
03/08/2021

Language Models have a Moral Dimension

Artificial writing is permeating our lives due to recent advances in lar...
research
05/19/2022

Mapping Complex Technologies via Science-Technology Linkages; The Case of Neuroscience – A transformer based keyword extraction approach

In this paper, we present an efficient deep learning based approach to e...
