ClioQuery: Interactive Query-Oriented Text Analytics for Comprehensive Investigation of Historical News Archives

04/10/2022
by Abram Handler, et al.
Historians and archivists often find and analyze the occurrences of query words in newspaper archives to help answer fundamental questions about society. But much work in text analytics focuses on helping people investigate other textual units, such as events, clusters, ranked documents, entity relationships, or thematic hierarchies. Informed by a study of the needs of historians and archivists, we thus propose ClioQuery, a text analytics system uniquely organized around the analysis of query words in context. ClioQuery applies text simplification techniques from natural language processing to help historians quickly and comprehensively gather and analyze all occurrences of a query word across an archive. It also pairs these new NLP methods with more traditional features like linked views and in-text highlighting to help engender trust in summarization techniques. We evaluate ClioQuery with two separate user studies, in which historians explain how ClioQuery's novel text simplification features can help facilitate historical research. We also evaluate ClioQuery with a separate quantitative comparison study, which shows that it helps crowdworkers find and remember historical information. Such results suggest possible new directions for text analytics in other query-oriented settings.
