A framework for streamlined statistical prediction using topic models

04/15/2019
by Vanessa Glenny, et al.

In the Humanities and Social Sciences, there is increasing interest in approaches to information extraction, prediction, intelligent linkage, and dimension reduction that are applicable to large text corpora. Because work in these fields is grounded in traditional statistical techniques, there is a need for frameworks that allow advanced NLP techniques such as topic modelling to be incorporated within classical methodologies. This paper provides a classical, supervised, statistical learning framework for prediction from text, using topic models as a data reduction method and the topics themselves as predictors, alongside typical statistical tools for predictive modelling. We illustrate the framework with two applications: one in a Social Sciences context (applied animal behaviour) and one in a Humanities context (narrative analysis). The results show that topic regression models perform comparably to their much less efficient equivalents that use individual words as predictors.
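The idea of using topics as predictors can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' implementation: it assumes LDA fitted by a small collapsed Gibbs sampler as the topic model, logistic regression as the "typical statistical tool", and a made-up eight-document toy corpus with binary labels. The function names (`fit_lda_gibbs`, `fit_logistic`, `predict_proba`) and all hyperparameter values are hypothetical.

```python
import math
import random


def fit_lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns document-topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # counts per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # counts per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []
    for d, doc in enumerate(docs):
        # Start from a random topic assignment for every token.
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Remove the token, then resample its topic given all others.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic proportions: each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]


def predict_proba(coefs, x):
    """Logistic model: coefs[0] is the intercept, coefs[1:] the slopes."""
    z = coefs[0] + sum(c * xj for c, xj in zip(coefs[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Logistic regression fitted by plain batch gradient descent."""
    coefs = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(coefs)
        for xi, yi in zip(X, y):
            err = predict_proba(coefs, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        coefs = [c - lr * g / len(X) for c, g in zip(coefs, grad)]
    return coefs


# Made-up toy corpus (illustration only): label 1 = animal text, 0 = narrative text.
texts = [
    "dog barks dog runs dog plays",
    "cat sleeps cat purrs dog plays",
    "dog runs cat plays dog barks",
    "cat purrs cat sleeps cat plays",
    "hero journey plot hero story",
    "story plot villain hero journey",
    "plot story journey villain story",
    "villain hero story plot journey",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

docs = [t.split() for t in texts]
theta = fit_lda_gibbs(docs, n_topics=2)   # data reduction: words -> 2 topics
# Proportions sum to 1, so drop the last topic to avoid collinearity
# with the intercept.
X = [row[:-1] for row in theta]
coefs = fit_logistic(X, labels)
fitted = [predict_proba(coefs, x) for x in X]
```

Here each document is reduced from a vector over the whole vocabulary to two topic proportions, and the regression then has a single slope rather than one coefficient per word. The `fitted` values are in-sample probabilities only; a real analysis would evaluate predictions on held-out documents.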


