Matching with Text Data: An Experimental Evaluation of Methods for Matching Documents and of Measuring Match Quality

01/02/2018
by Reagan Mozer, et al.

How should one perform matching in observational studies when the units are text documents? The lack of randomized assignment of documents into treatment and control groups may lead to systematic differences between groups on high-dimensional and latent features of text such as topical content and sentiment. Standard balance metrics, used to measure the quality of a matching method, fail in this setting. We decompose text matching methods into two parts: (1) a text representation, and (2) a distance metric, and present a framework for measuring the quality of text matches experimentally using human subjects. We consider 28 potential methods, and find that representing text as term vectors and matching on cosine distance significantly outperform alternative representations and distance metrics. We apply our chosen method to a substantive debate in the study of media bias using a novel data set of front-page news articles from thirteen news sources. Media bias is composed of topic selection bias and presentation bias; using our matching method to control for topic selection, we find that both components contribute significantly to media bias, though some news sources rely on one component more than the other.
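
To make the two-part decomposition concrete, here is a minimal sketch of one representation/distance pairing named in the abstract: raw term-count vectors compared by cosine distance. It is not the authors' implementation. The function names (`term_vector`, `cosine_distance`, `match_documents`), the whitespace tokenization, nearest-neighbor matching with replacement, and the `caliper` value are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a document as a sparse bag-of-words count vector.

    Assumption: lowercase + whitespace tokenization, chosen for brevity.
    """
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty document shares no terms with anything.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def match_documents(treated, control, caliper=0.8):
    """Match each treated document to its nearest control document.

    Matching is with replacement. Treated documents whose nearest control
    is farther than `caliper` (a hypothetical threshold) are left unmatched.
    Returns {treated_index: (control_index, distance)}.
    """
    control_vecs = [term_vector(doc) for doc in control]
    matches = {}
    if not control_vecs:
        return matches
    for i, doc in enumerate(treated):
        vec = term_vector(doc)
        distances = [cosine_distance(vec, c) for c in control_vecs]
        j = min(range(len(distances)), key=distances.__getitem__)
        if distances[j] <= caliper:
            matches[i] = (j, distances[j])
    return matches


if __name__ == "__main__":
    treated = ["senate passes budget bill", "storm hits coastal towns"]
    control = [
        "coastal towns brace for storm",
        "budget bill passes in senate",
        "local team wins championship",
    ]
    for t_idx, (c_idx, dist) in match_documents(treated, control).items():
        print(f"{treated[t_idx]!r} -> {control[c_idx]!r} (distance {dist:.3f})")
```

Swapping `term_vector` for another representation, or `cosine_distance` for another metric, gives a different method in the same two-part family. That is the sense in which the abstract describes a set of candidate methods.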

research
10/02/2018

Predicting Factuality of Reporting and Bias of News Media Sources

We present a study on predicting the factuality of reporting and bias of...
research
03/27/2018

Sampling the News Producers: A Large News and Feature Data Set for the Study of the Complex Media Landscape

The complexity and diversity of today's media landscape provides many ch...
research
05/20/2021

Enabling News Consumers to View and Understand Biased News Coverage: A Study on the Perception and Visualization of Media Bias

Traditional media outlets are known to report political news in a biased...
research
10/31/2019

Predicting the Politics of an Image Using Webly Supervised Data

The news media shape public opinion, and often, the visual bias they con...
research
08/31/2021

Machine-Learning media bias

We present an automated method for measuring media bias. Inferring which...
research
12/01/2022

Inference of Media Bias and Content Quality Using Natural-Language Processing

Media bias can significantly impact the formation and development of opi...
research
01/25/2019

Assessing Partisan Traits of News Text Attributions

On the topic of journalistic integrity, the current state of accurate, i...
