Log In Sign Up

Fairness for Whom? Understanding the Reader's Perception of Fairness in Text Summarization

by   Anurag Shandilya, et al.

With the surge in user-generated textual information, there has been a recent increase in the use of summarization algorithms for providing an overview of the extensive content. Traditional metrics for evaluation of these algorithms (e.g. ROUGE scores) rely on matching algorithmic summaries to human-generated ones. However, it has been shown that when the textual contents are heterogeneous, e.g., when they come from different socially salient groups, most existing summarization algorithms represent the social groups very differently compared to their distribution in the original data. To mitigate such adverse impacts, some fairness-preserving summarization algorithms have also been proposed. All of these studies have considered normative notions of fairness from the perspective of writers of the contents, neglecting the readers' perceptions of the underlying fairness notions. To bridge this gap, in this work, we study the interplay between the fairness notions and how readers perceive them in textual summaries. Through our experiments, we show that reader's perception of fairness is often context-sensitive. Moreover, standard ROUGE evaluation metrics are unable to quantify the perceived (un)fairness of the summaries. To this end, we propose a human-in-the-loop metric and an automated graph-based methodology to quantify the perceived bias in textual summaries. We demonstrate their utility by quantifying the (un)fairness of several summaries of heterogeneous socio-political microblog datasets.


page 1

page 6


Beyond ROUGE Scores in Algorithmic Summarization: Creating Fairness-Preserving Textual Summaries

As the amount of textual information grows rapidly, text summarization a...

Fairness-Preserving Text Summarzation

As the amount of textual information grows rapidly, text summarization a...

Improving abstractive summarization with energy-based re-ranking

Current abstractive summarization systems present important weaknesses w...

Towards Human-Free Automatic Quality Evaluation of German Summarization

Evaluating large summarization corpora using humans has proven to be exp...

On Faithfulness and Factuality in Abstractive Summarization

It is well known that the standard likelihood training and approximate d...

Intweetive Text Summarization

The amount of user generated contents from various social medias allows ...

A Semantically Motivated Approach to Compute ROUGE Scores

ROUGE is one of the first and most widely used evaluation metrics for te...