Sparse Partitioning Around Medoids

09/05/2023
by Lars Lenssen, et al.

Partitioning Around Medoids (PAM, k-Medoids) is a popular clustering technique for use with arbitrary distance functions or similarities, where each cluster is represented by its most central object, called the medoid or the discrete median. In operations research, this family of problems is also known as the facility location problem (FLP). FastPAM recently introduced a speedup for large k that makes PAM applicable to larger problems, but the method still has a runtime quadratic in N. In this chapter, we discuss a sparse and asymmetric variant of this problem, to be used, for example, on graph data such as road networks. By exploiting sparsity, we can avoid the quadratic runtime and memory requirements and make the method scalable to even larger problems, as long as we are able to build a small enough graph of sufficient connectivity to perform local optimization. Furthermore, we consider asymmetric cases, where the set of possible medoids is not identical to the set of points to be covered (or, in the facility location interpretation, where the possible facility locations are not identical to the consumer locations). Because of sparsity, it may be impossible to cover all points with just k medoids when k is too small, which would render the problem unsolvable and which breaks common heuristics for finding a good starting condition. We hence consider determining k as part of the optimization problem. We propose to first construct a greedy initial solution with a larger k, and then to optimize it by alternating between PAM-style "swap" operations, which improve the result by replacing medoids with better alternatives, and "remove" operations, which reduce k, until neither allows further improvement of the result quality. We demonstrate the usefulness of this method on a problem from electrical engineering, with the input graph derived from cartographic data.
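
The alternating scheme described above (greedy start with a larger k, then "swap" and "remove" steps until neither helps) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it evaluates every move exhaustively instead of using FastPAM-style incremental updates, and all names (`dist`, `total_cost`, `greedy_init`, `optimize`) are hypothetical. The abstract does not state why removing a medoid can improve quality, so the sketch assumes a fixed per-medoid cost `facility_cost`, as in uncapacitated facility location. Sparsity is modelled by storing only the distances that exist; a missing entry means the candidate cannot cover that consumer.

```python
import math

def total_cost(dist, medoids, consumers, facility_cost):
    """Nearest-medoid distances plus an ASSUMED fixed cost per medoid.
    A consumer that no selected medoid can reach makes the cost infinite."""
    total = facility_cost * len(medoids)
    for c in consumers:
        total += min((dist[m].get(c, math.inf) for m in medoids), default=math.inf)
    return total

def greedy_init(dist, consumers):
    """Greedy cover: keep adding the candidate that covers the most uncovered consumers."""
    uncovered, medoids = set(consumers), set()
    while uncovered:
        best = max(dist, key=lambda m: len(uncovered.intersection(dist[m])))
        gained = uncovered.intersection(dist[best])
        if not gained:
            raise ValueError("some consumers cannot be covered by any candidate")
        medoids.add(best)
        uncovered -= gained
    return frozenset(medoids)

def swap_moves(medoids, candidates):
    """All solutions obtained by replacing one medoid with one non-medoid candidate."""
    return [(medoids - {m}) | {c}
            for m in sorted(medoids) for c in sorted(candidates - medoids)]

def remove_moves(medoids, candidates):
    """All solutions obtained by dropping one medoid (reduces k by one)."""
    return [medoids - {m} for m in sorted(medoids)]

def optimize(dist, consumers, facility_cost):
    """Alternate best-improvement swap and remove steps until neither helps."""
    score = lambda s: total_cost(dist, s, consumers, facility_cost)
    candidates = frozenset(dist)
    medoids = greedy_init(dist, consumers)
    changed = True
    while changed:
        changed = False
        for make_moves in (swap_moves, remove_moves):
            best = min(make_moves(medoids, candidates), key=score, default=None)
            if best is not None and score(best) < score(medoids):
                medoids, changed = best, True
    return medoids, score(medoids)

# Toy asymmetric instance: candidate sites are letters, consumers are integers.
dist = {
    "A": {1: 1.0, 2: 1.0, 3: 2.0},
    "B": {4: 2.0, 5: 1.0, 6: 1.0},
    "D": {1: 1.0, 2: 1.0, 3: 1.0},
    "M": {2: 2.0, 3: 1.0, 4: 1.0, 5: 2.0},
}
consumers = [1, 2, 3, 4, 5, 6]
medoids, cost = optimize(dist, consumers, facility_cost=3.0)
print(sorted(medoids), cost)
```

On this toy instance the greedy start selects three sites (M, A, and B), a "remove" step then drops the redundant M, and a subsequent "swap" replaces A with D, leaving two medoids. A practical implementation would restrict swap candidates to graph neighbors and update costs incrementally rather than rescoring every solution from scratch.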
