Outlier Detection and Data Clustering via Innovation Search

12/30/2019
by Mostafa Rahmani, et al.

Innovation Search was originally proposed as a data clustering method in which the directions of innovation are used to compute the adjacency matrix, and Innovation Pursuit was shown to notably outperform self-representation-based subspace clustering methods. In this paper, we show that the directions of innovation can also be used to design a provable and strong robust PCA method, that is, a PCA method that is robust to outliers. The proposed approach, dubbed iSearch, uses the direction search optimization problem to compute an optimal direction corresponding to each data point. iSearch uses these directions to measure the innovation of each data point and identifies the most innovative data points as the outliers. Analytical performance guarantees are derived for the proposed robust PCA method under different models for the distribution of the outliers, including randomly distributed outliers, clustered outliers, and linearly dependent outliers. In addition, we study outlier detection in a union of subspaces and show that iSearch provably recovers the span of the inliers when the inliers lie in a union of subspaces. We also present theoretical results showing that the proposed measure of innovation remains stable in the presence of noise and that the performance of iSearch is robust to noisy data. In challenging scenarios in which the outliers are close to each other or close to the span of the inliers, iSearch is shown to remarkably outperform most existing methods. The presented method shows that the directions of innovation are a useful representation of the data that can be used for both data clustering and outlier detection.
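The scoring idea described above (one direction per data point, an innovation value per point, and the most innovative points flagged as outliers) can be illustrated with a minimal sketch. The abstract does not state the optimization problem, so everything concrete here is an assumption: the direction for a point d is taken as the minimizer of ||D^T c||_2 subject to c^T d = 1, an l2 surrogate chosen because it has a closed form, and the innovation value is taken as 1 / ||D^T c||_1. The names `solve` and `innovation_values` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes A is square and non-singular (the data must span the full space).
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def innovation_values(points):
    """Return one innovation value per data point (assumed l2 surrogate)."""
    # Normalize every point to unit Euclidean norm.
    unit = []
    for p in points:
        norm = math.sqrt(sum(x * x for x in p))
        unit.append([x / norm for x in p])
    dim = len(unit[0])
    # G = D D^T, where the columns of D are the normalized points.
    G = [[sum(p[i] * p[j] for p in unit) for j in range(dim)] for i in range(dim)]
    values = []
    for d in unit:
        g = solve(G, d)                           # G^{-1} d
        scale = sum(a * b for a, b in zip(d, g))  # d^T G^{-1} d
        c = [x / scale for x in g]                # direction with c^T d = 1
        # l1 norm of the projections of all points onto the direction c.
        l1 = sum(abs(sum(ci * pi for ci, pi in zip(c, p))) for p in unit)
        values.append(1.0 / l1)
    return values


# Toy data (hypothetical): five inliers on the x-axis and one outlier.
points = [[1, 0], [2, 0], [-1, 0], [3, 0], [0.5, 0], [1, 1]]
values = innovation_values(points)
outlier_index = max(range(len(values)), key=values.__getitem__)
print(outlier_index, [round(v, 3) for v in values])
```

On this toy data the off-axis point receives the largest innovation value, because the direction computed for it is orthogonal to every inlier. The l2 surrogate is used only to keep the sketch dependency-free; an l1 direction search would need a linear programming solver.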

