AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction

02/23/2018
by   Francisco J. Pulgar, et al.

High dimensionality, i.e. data having a large number of variables, tends to be a challenge for most machine learning tasks, including classification. A classifier usually builds a model representing how a set of inputs explains the outputs. The larger the set of inputs and/or outputs, the more complex that model becomes. There is a family of classification algorithms, known as lazy learning methods, which does not build a model. One of the best known members of this family is the kNN algorithm. Its strategy relies on searching for a set of nearest neighbors, using the input variables as position vectors and computing the distances among them. These distances lose significance in high-dimensional spaces. Therefore kNN, like many other classifiers, tends to degrade in performance as the number of input variables grows. In this work AEkNN, a new kNN-based algorithm with built-in dimensionality reduction, is presented. Aiming to obtain a new representation of the data, having a lower dimensionality but more informative features, AEkNN internally uses autoencoders. From these new feature vectors the computed distances should be more significant, thus providing a way to choose better neighbors. An experimental evaluation of the new proposal is conducted, analyzing several configurations and comparing them against the classical kNN algorithm. The conclusions obtained demonstrate that AEkNN offers better predictive and runtime performance.
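The idea described in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: it trains a one-hidden-layer MLP as an autoencoder (reconstructing its own input), takes the hidden-layer activations as the lower-dimensional representation, and runs a standard kNN classifier in that encoded space. The dataset, layer size, and hyperparameters are arbitrary choices for the sketch.

```python
# Hypothetical AEkNN-style pipeline sketch (assumptions: iris data, a
# 2-unit hidden layer, k=5 neighbors -- none of these come from the paper).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: train an autoencoder. Here a one-hidden-layer MLP is fit to
# reconstruct its own input, so the hidden layer learns a compressed code.
ae = MLPRegressor(hidden_layer_sizes=(2,), activation="relu",
                  max_iter=2000, random_state=0)
ae.fit(X_tr, X_tr)

def encode(X):
    # Step 2: the hidden-layer activations (ReLU of the first affine map)
    # are the new, lower-dimensional feature vectors.
    return np.maximum(0.0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Step 3: run kNN in the 2-dimensional encoded space instead of the raw
# 4-dimensional input space, so neighbor distances use the learned features.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(encode(X_tr), y_tr)
acc = accuracy_score(y_te, knn.predict(encode(X_te)))
print(f"encoded dim: {encode(X_te).shape[1]}, test accuracy: {acc:.2f}")
```

In this sketch distances are computed between 2-dimensional codes rather than the original inputs, which is the mechanism the abstract argues makes the chosen neighbors more meaningful in high-dimensional settings.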


