User Ex Machina: Simulation as a Design Probe in Human-in-the-Loop Text Analytics

01/06/2021
by Anamaria Crisan, et al.

Topic models are widely used analysis techniques for clustering documents and surfacing thematic elements of text corpora. These models remain challenging to optimize and often require a "human-in-the-loop" approach where domain experts use their knowledge to steer and adjust them. However, the fragility, incompleteness, and opacity of these models mean that even minor changes could induce large and potentially undesirable changes in the resulting model. In this paper, we conduct a simulation-based analysis of human-centered interactions with topic models, with the objective of measuring the sensitivity of topic models to common classes of user actions. We find that user interactions have impacts that differ in magnitude but often negatively affect the quality of the resulting model in a way that can be difficult for the user to evaluate. We suggest incorporating sensitivity and "multiverse" analyses into topic model interfaces to surface and overcome these deficiencies.


