A new hashing based nearest neighbors selection technique for big datasets

04/05/2020
by   Jude Tchaye-Kondi, et al.
0

KNN has the reputation to be the word simplest but efficient supervised learning algorithm used for either classification or regression. KNN prediction efficiency highly depends on the size of its training data but when this training data grows KNN suffers from slowness in making decisions since it needs to search nearest neighbors within the entire dataset at each decision making. This paper proposes a new technique that enables the selection of nearest neighbors directly in the neighborhood of a given observation. The proposed approach consists of dividing the data space into subcells of a virtual grid built on top of data space. The mapping between the data points and subcells is performed using hashing. When it comes to select the nearest neighbors of a given observation, we firstly identify the cell the observation belongs by using hashing, and then we look for nearest neighbors from that central cell and cells around it layer by layer. From our experiment performance analysis on publicly available datasets, our algorithm outperforms the original KNN in time efficiency with a prediction quality as good as that of KNN it also offers competitive performance with solutions like KDtree

READ FULL TEXT
research
12/01/2019

Active Search for Nearest Neighbors

In pattern recognition or machine learning, it is a very fundamental tas...
research
09/01/2021

Under-bagging Nearest Neighbors for Imbalanced Classification

In this paper, we propose an ensemble learning algorithm called under-ba...
research
03/26/2021

Applying k-nearest neighbors to time series forecasting : two new approaches

K-nearest neighbors algorithm is one of the prominent techniques used in...
research
01/22/2018

Scalable Secure Computation of Statistical Functions with Applications to k-Nearest Neighbors

Given a set S of n d-dimensional points, the k-nearest neighbors (KNN) i...
research
11/04/2022

Improving the Predictive Performances of k Nearest Neighbors Learning by Efficient Variable Selection

This paper computationally demonstrates a sharp improvement in predictiv...
research
08/16/2021

Evolving Fuzzy k-Nearest Neighbors Using an Enhanced Sine Cosine Algorithm: Case Study of Lupus Nephritis

Because of its simplicity and effectiveness, fuzzy K-nearest neighbors (...
research
05/30/2022

A k nearest neighbours classifiers ensemble based on extended neighbourhood rule and features subsets

kNN based ensemble methods minimise the effect of outliers by identifyin...

Please sign up or login with your details

Forgot password? Click here to reset