Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds

10/31/2018
by Kry Yik Chau Lui, et al.

In this paper, we investigate dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR map can achieve perfect precision and perfect recall simultaneously. Thus a continuous DR map must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure "how" wrong the retrieved data is. We therefore propose a new measure based on Wasserstein distance that comes with a similar theoretical guarantee. A key technical step in our proofs is a particular optimization problem of the L_2-Wasserstein distance over a constrained set of distributions. We provide a complete solution to this optimization problem, which may be of independent interest on the technical side.
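To make the retrieval setting concrete, the following is a minimal finite-sample sketch, not the paper's construction. It assumes that the "relevant" set of a query point is its k nearest neighbours in the original space and the "retrieved" set is its k nearest neighbours after the DR map; the function names `knn_indices` and `precision_recall`, the synthetic Gaussian data, and the coordinate projection used as the DR map are all illustrative assumptions.

```python
import numpy as np


def knn_indices(points, query_idx, k):
    """Return the set of indices of the k nearest neighbours of
    points[query_idx] (Euclidean distance), excluding the query itself."""
    dists = np.linalg.norm(points - points[query_idx], axis=1)
    dists[query_idx] = np.inf  # never count the query as its own neighbour
    return set(np.argsort(dists)[:k].tolist())


def precision_recall(high, low, query_idx, k_relevant, k_retrieved):
    """Precision and recall of the low-dimensional neighbourhood of one query.

    relevant  = k_relevant nearest neighbours in the original space
    retrieved = k_retrieved nearest neighbours in the reduced space
    """
    relevant = knn_indices(high, query_idx, k_relevant)
    retrieved = knn_indices(low, query_idx, k_retrieved)
    hits = len(relevant & retrieved)
    return hits / len(retrieved), hits / len(relevant)


# Hypothetical data: 200 Gaussian points in R^3, reduced to R^2 by
# dropping the last coordinate (a Lipschitz continuous map).
rng = np.random.default_rng(0)
high = rng.normal(size=(200, 3))
low = high[:, :2]

precision, recall = precision_recall(high, low, query_idx=0,
                                     k_relevant=10, k_retrieved=10)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Because both neighbourhoods have the same size here, precision and recall coincide; choosing `k_retrieved` larger than `k_relevant` typically raises recall at the cost of precision, which is the kind of trade-off the abstract refers to. Note that neither quantity says how far away the wrongly retrieved points are, which is the gap the proposed Wasserstein-based measure is meant to address.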

research
01/15/2021

Multi-point dimensionality reduction to improve projection layout reliability

In ordinary Dimensionality Reduction (DR), each data instance in an m-di...
research
03/01/2018

A more globally accurate dimensionality reduction method using triplets

We first show that the commonly used dimensionality reduction (DR) metho...
research
06/18/2017

Dimensionality Reduction using Similarity-induced Embeddings

The vast majority of Dimensionality Reduction (DR) techniques rely on se...
research
04/15/2023

Dimensionality Reduction as Probabilistic Inference

Dimensionality reduction (DR) algorithms compress high-dimensional data ...
research
04/28/2014

Conditional Density Estimation with Dimensionality Reduction via Squared-Loss Conditional Entropy Minimization

Regression aims at estimating the conditional mean of output given input...
research
11/03/2019

Unimodal-uniform Constrained Wasserstein Training for Medical Diagnosis

The labels in medical diagnosis task are usually discrete and successive...
research
08/26/2023

Class-constrained t-SNE: Combining Data Features and Class Probabilities

Data features and class probabilities are two main perspectives when, e....