Matrix optimization based Euclidean embedding with outliers

by   Qian Zhang, et al.

Euclidean embedding from noisy observations containing outlier errors is an important and challenging problem in statistics and machine learning. Many existing methods would struggle with outliers due to a lack of detection ability. In this paper, we propose a matrix optimization based embedding model that can produce reliable embeddings and identify the outliers jointly. We show that the estimators obtained by the proposed method satisfy a non-asymptotic risk bound, implying that the model provides a high accuracy estimator with high probability when the order of the sample size is roughly the degree of freedom up to a logarithmic factor. Moreover, we show that under some mild conditions, the proposed model also can identify the outliers without any prior information with high probability. Finally, numerical experiments demonstrate that the matrix optimization-based model can produce configurations of high quality and successfully identify outliers even for large networks.


page 1

page 2

page 3

page 4


Convex Optimization Learning of Faithful Euclidean Distance Representations in Nonlinear Dimensionality Reduction

Classical multidimensional scaling only works well when the noisy distan...

Exploring Outliers in Crowdsourced Ranking for QoE

Outlier detection is a crucial part of robust evaluation for crowdsource...

High-dimensional outlier detection using random projections

There exist multiple methods to detect outliers in multivariate data in ...

Truncated Cauchy Non-negative Matrix Factorization

Non-negative matrix factorization (NMF) minimizes the Euclidean distance...

High-dimensional inference robust to outliers with l1-norm penalization

This paper studies inference in the high-dimensional linear regression m...

Bib2vec: An Embedding-based Search System for Bibliographic Information

We propose a novel embedding model that represents relationships among s...

Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels

The problem of estimating subjective visual properties from image and vi...