Reconciliation k-median: Clustering with Non-Polarized Representatives

02/27/2019
by Bruno Ordozgoiti, et al.

We propose a new variant of the k-median problem in which the objective function models not only the cost of assigning data points to cluster representatives but also a penalty for disagreement among the representatives. The problem is motivated by applications in which we want to cluster data while avoiding representatives that are too far from each other. For example, we may want to summarize a set of news sources while avoiding ideologically extreme articles, in order to reduce polarization. To solve the proposed formulation, we adopt the local-search algorithm of Arya et al. We show that the algorithm provides a provable approximation guarantee, which becomes constant under a mild assumption on the minimum number of points in each cluster. We evaluate the formulation and the algorithm experimentally on datasets inspired by the motivating applications: data extracted from Twitter, US Congress voting records, and popular news sources. The results show that our objective can lead to less polarized groups of representatives without significant loss in representation fidelity.
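As a rough illustration of the idea, the following Python sketch combines an assignment cost with a penalty on disagreement among representatives and improves it by single swaps. The abstract does not state the exact objective, so the form used here (sum of distances to the nearest representative, plus a weight `lam` times the sum of pairwise distances between representatives) is an assumption, as are the names `reconciliation_cost` and `local_search`. The search is a plain single-swap procedure in the spirit of Arya et al.; it omits the improvement threshold usually needed for polynomial running time and should not be read as the authors' implementation.

```python
from itertools import combinations


def reconciliation_cost(points, reps, dist, lam):
    """Assumed objective: assignment cost plus lam times the sum of
    pairwise distances between the chosen representatives."""
    # Each point pays the distance to its closest representative.
    assignment = sum(min(dist(p, r) for r in reps) for p in points)
    # Penalty for representatives that are far apart (assumed form).
    disagreement = sum(dist(a, b) for a, b in combinations(reps, 2))
    return assignment + lam * disagreement


def local_search(points, k, dist, lam):
    """Single-swap local search: replace one representative at a time
    while the assumed objective strictly decreases."""
    reps = list(points[:k])  # arbitrary starting solution
    best = reconciliation_cost(points, reps, dist, lam)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in points:
                if cand in reps:
                    continue
                # Try swapping representative i for the candidate point.
                trial = reps[:i] + [cand] + reps[i + 1:]
                cost = reconciliation_cost(points, trial, dist, lam)
                if cost < best:
                    reps, best, improved = trial, cost, True
                    break
            if improved:
                break
    return reps, best


if __name__ == "__main__":
    # Toy one-dimensional data: two well-separated groups.
    data = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
    absolute = lambda a, b: abs(a - b)
    for lam in (0.0, 1.0):
        print(lam, local_search(data, 2, absolute, lam))
```

Setting `lam` to 0 recovers the ordinary k-median objective; larger values trade representation fidelity for representatives that lie closer together.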


