Unrolling Swiss Cheese: Metric repair on manifolds with holes

by Anna C. Gilbert, et al.

For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high-dimensional space and, because of this high-dimensional structure, most algorithms are inefficient. The typical solution is to reduce the dimension of the input data using a standard dimension reduction algorithm such as ISOMAP, LAPLACIAN EIGENMAPS or LLE. This approach, however, does not always work in practice, as these algorithms require somewhat ideal data. Unfortunately, most data sets either have missing entries or unacceptably noisy values. That is, real data are far from ideal and we cannot use these algorithms directly. In this paper, we focus on the case of missing data. Some techniques, such as matrix completion, can be used to fill in missing data, but these methods do not capture the non-linear structure of the manifold. Here, we present a new algorithm, MR-MISSING, that extends these previous algorithms and can be used to compute a low-dimensional representation of data sets with missing entries. We demonstrate the effectiveness of our algorithm with three experiments: we visually verify it on synthetic manifolds; we numerically compare our projections against those computed by first filling in data using nlPCA and mDRUR on the MNIST data set; and we show that we can do classification on MNIST with missing data. We also provide a theoretical guarantee for MR-MISSING under some simplifying assumptions.


