Finding Inner Outliers in High Dimensional Space

05/05/2014
by Zhana Bao, et al.

Outlier detection in large-scale databases is a significant and complex problem in the field of knowledge discovery. Because data distributions are obscure and uncertain in high-dimensional space, most existing solutions rely on two intuitive assumptions: first, that outliers lie extremely far from other points in high-dimensional space; second, that outliers are clearly distinguishable in projected subspaces. However, in the more complicated case where outliers are hidden among the normal points in every dimension, existing detection methods fail to find such inner outliers. In this paper, we propose a method based on two successive dimension projections, which integrates primary subspace outlier detection with a secondary point projection between subspaces and sums the resulting weight values for each point. Each point's local density ratio is computed separately in the twice-projected dimensions. The outliers are then the points with the largest total weight. The proposed method finds all inner outliers in synthetic test datasets with dimensionality ranging from 100 to 10,000. The experimental results also show that the algorithm works in low-dimensional space and achieves perfect performance in high-dimensional space. For this reason, the proposed approach has considerable potential for multimedia applications, where it could help process images or video with large numbers of attributes.
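
The idea of scoring points by a local density ratio in projected dimensions and summing the weights can be illustrated with a small sketch. This is not the paper's algorithm: it omits the secondary point projection between subspaces, and it assumes a simple k-nearest-neighbour density in each single dimension. The function names, the parameter `k`, and the synthetic data layout (two clusters per dimension with one point placed between them) are all hypothetical choices made for illustration.

```python
import random


def density_ratio_scores(points, k=5):
    """Sum, over every dimension, a 1-D local density ratio for each point.

    Simplified illustration only: density is the inverse of the mean distance
    to the k nearest neighbours within one dimension, and the ratio compares
    the neighbours' average density with the point's own density.
    """
    n = len(points)
    dims = len(points[0])
    scores = [0.0] * n
    for j in range(dims):
        col = [p[j] for p in points]
        neighbours = []
        density = []
        for i in range(n):
            # k nearest neighbours of point i within this single dimension
            nearest = sorted(
                (abs(col[i] - col[m]), m) for m in range(n) if m != i
            )[:k]
            mean_dist = sum(dist for dist, _ in nearest) / k
            neighbours.append([m for _, m in nearest])
            density.append(1.0 / (mean_dist + 1e-12))
        for i in range(n):
            neighbour_density = sum(density[m] for m in neighbours[i]) / k
            # A ratio well above 1 means the point sits in a sparser region
            # than its neighbours in this dimension.
            scores[i] += neighbour_density / density[i]
    return scores


def make_demo_data(n_normal=100, dims=20, seed=0):
    """Hypothetical data: index 0 lies between two clusters in every dimension."""
    rng = random.Random(seed)
    data = [[5.0] * dims]  # assumed inner outlier, inside the overall value range
    for _ in range(n_normal):
        # Each normal coordinate is drawn near 0 or near 10, chosen at random.
        data.append(
            [rng.gauss(rng.choice((0.0, 10.0)), 0.5) for _ in range(dims)]
        )
    return data


scores = density_ratio_scores(make_demo_data())
top = max(range(len(scores)), key=scores.__getitem__)
```

In this sketch the point at index 0 is never extreme in any single dimension, yet its summed ratio is the largest because it falls in the sparse gap between clusters in every dimension. Real data would need the subspace selection and the second projection step that the abstract describes.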
