Metricizing the Euclidean Space towards Desired Distance Relations in Point Clouds

by Stefan Rass, et al.

Given a set of points in the Euclidean space ℝ^ℓ with ℓ>1, the pairwise distances between the points are determined by their spatial location and the metric d that we endow ℝ^ℓ with. Hence, the distance d(𝐱,𝐲)=δ between two points is fixed by the choice of 𝐱, 𝐲, and d. We study the related problem of fixing the value δ and the points 𝐱,𝐲, and asking whether there is a topological metric d that computes the desired distance δ. We demonstrate this problem to be solvable by constructing a metric that simultaneously gives desired pairwise distances between up to O(√(ℓ)) many points in ℝ^ℓ. We then introduce the notion of an ε-semimetric d̃ to formulate our main result: for all ε>0, for all m≥ 1, for any choice of m points 𝐲_1,…,𝐲_m∈ℝ^ℓ, and all chosen sets of values {δ_ij≥ 0: 1≤ i<j≤ m}, there exists an ε-semimetric d̃:ℝ^ℓ×ℝ^ℓ→ℝ such that d̃(𝐲_i,𝐲_j)=δ_ij, i.e., the desired distances are achieved, irrespective of the topology that the Euclidean or other norms would induce. We showcase our results by using them to attack unsupervised learning algorithms, specifically k-Means and density-based (DBSCAN) clustering algorithms. These have many applications in artificial intelligence, and running them with externally provided distance measures constructed as shown here can make clustering algorithms produce results that are pre-determined and hence malleable. This demonstrates that the results of clustering algorithms may not generally be trustworthy unless there is a standardized and fixed prescription to use a specific distance function.
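
The effect described in the abstract can be illustrated with a small sketch. This is not the construction from the paper: the ε-semimetric is defined on all of ℝ^ℓ, whereas the hypothetical `prescribed_distance` below is only a lookup table on a finite point set, and `threshold_clusters` is a simplified stand-in for density-based clustering (points are linked whenever their distance is at most `eps`). All function names, points, and δ values are assumptions chosen for illustration.

```python
from itertools import combinations
import math


def euclidean(p, q):
    """Ordinary Euclidean distance between two coordinate tuples."""
    return math.dist(p, q)


def prescribed_distance(points, deltas):
    """Build a distance on the given finite point set from a table of
    chosen values deltas[(i, j)] with i < j (illustrative only)."""
    index = {p: i for i, p in enumerate(points)}

    def d(p, q):
        i, j = index[p], index[q]
        if i == j:
            return 0.0
        return deltas[(min(i, j), max(i, j))]

    return d


def threshold_clusters(points, dist, eps):
    """Group point indices into connected components of the graph that
    links two points whenever dist(p, q) <= eps (union-find)."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four points on a line: two obvious Euclidean pairs.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]

# Chosen distances that pair up the far-apart points instead.
# The values 1 and 5 satisfy the triangle inequality on every triple.
deltas = {
    (0, 1): 5.0, (2, 3): 5.0,
    (0, 2): 1.0, (1, 3): 1.0,
    (0, 3): 5.0, (1, 2): 5.0,
}

euclid_result = threshold_clusters(points, euclidean, eps=2.0)
forged_result = threshold_clusters(points, prescribed_distance(points, deltas), eps=2.0)
print(euclid_result)
print(forged_result)
```

With the Euclidean distance the neighbouring points are grouped together, while the prescribed table groups each point with its far-away counterpart, although the point coordinates are unchanged. The same clustering routine thus returns whatever grouping the supplied distance function encodes.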


