Fair k-Center: a Coreset Approach in Low Dimensions
Center-based clustering techniques are fundamental in some areas of machine learning, such as data summarization. Generic k-center algorithms can produce biased cluster representatives, so there has been recent interest in fair k-center clustering. Our main theoretical contributions are two new (3+ϵ)-approximation algorithms for the fair k-center problem in (1) the dynamic incremental (i.e., one-pass streaming) model and (2) the MapReduce model. Our dynamic incremental algorithm is the first such algorithm for this problem (previous streaming algorithms required two passes), and our MapReduce algorithm improves upon the previous approximation factor of (17+ϵ). Both algorithms work by maintaining a small coreset that represents the full point set, and their analysis requires that the underlying metric have finite doubling dimension. We also provide related heuristics for higher-dimensional data and experimental results that compare the performance of our algorithms with existing ones.
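To make the terms in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes Euclidean points, one group label per point, and a per-group quota of centers as the fairness constraint. The names `greedy_coreset`, `is_fair`, and `kcenter_cost` are hypothetical. The summary step is the classical farthest-point (Gonzalez) traversal, swapped in only to illustrate what a small representative subset looks like.

```python
import math
from collections import Counter


def greedy_coreset(points, size):
    """Farthest-point (Gonzalez) traversal: return indices of `size` spread-out points.

    Illustrative stand-in for a coreset; not the construction from the paper.
    """
    chosen = [0]  # start from an arbitrary point (index 0)
    # d[i] = distance from point i to its nearest chosen point so far
    d = [math.dist(points[0], p) for p in points]
    while len(chosen) < min(size, len(points)):
        nxt = max(range(len(points)), key=d.__getitem__)  # farthest remaining point
        chosen.append(nxt)
        d = [min(d[i], math.dist(points[nxt], points[i])) for i in range(len(points))]
    return chosen


def is_fair(centers, groups, quotas):
    """Assumed fairness notion: exactly quotas[g] centers come from each group g."""
    counts = Counter(groups[c] for c in centers)
    return set(counts) <= set(quotas) and all(
        counts.get(g, 0) == k for g, k in quotas.items()
    )


def kcenter_cost(points, centers):
    """k-center objective: the largest distance from any point to its nearest center."""
    return max(min(math.dist(p, points[c]) for c in centers) for p in points)


# Toy data: two well-separated pairs, each pair containing one point per group.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
groups = ["a", "b", "a", "b"]
quotas = {"a": 1, "b": 1}

centers = greedy_coreset(points, 2)
```

On this toy input the traversal happens to return a fair set of centers. In general it does not, which is why fair k-center needs dedicated algorithms rather than a generic k-center routine.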