COREclust: a new package for a robust and scalable analysis of complex data

05/25/2018
by Camille Champion et al.

In this paper, we present COREclust, a new R package dedicated to the detection of representative variables in high-dimensional spaces with a potentially limited number of observations. Variable-set detection is based on an original graph clustering strategy, called the CORE-clustering algorithm, that detects CORE-clusters, i.e. variable sets that have a user-defined size range and in which each variable is very similar to at least one other variable. Representative variables are then robustly estimated as the CORE-cluster centers. This strategy is entirely coded in C++ and wrapped in R using the Rcpp package. Particular effort has been made to keep its algorithmic cost reasonable so that it can be used on large datasets. After motivating our work, we explain the CORE-clustering algorithm as well as a greedy extension of this algorithm. We then show how to use the package and present results obtained on synthetic and real data.
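To make the notion of a size-bounded variable set concrete, the following Python sketch groups variables from a similarity matrix. It is not the CORE-clustering algorithm of the paper and does not use the COREclust API. It is a simple greedy single-linkage grouping with a size cap, written under assumptions: the function name `size_bounded_clusters`, its parameters `min_size` and `max_size`, and the choice of the center as the member with the largest total similarity to its cluster are all hypothetical and chosen for illustration only.

```python
import numpy as np


def size_bounded_clusters(sim, min_size, max_size):
    """Illustrative greedy grouping (NOT the paper's CORE-clustering algorithm).

    sim      : symmetric (p, p) similarity matrix between p variables.
    min_size : groups with fewer members are discarded (assumed parameter).
    max_size : two groups are merged only if the result stays within this size.
    Returns a list of (center, members) pairs.
    """
    p = sim.shape[0]
    parent = list(range(p))  # union-find forest over variables
    size = [1] * p

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Visit variable pairs from most to least similar, so every merge links
    # a variable to one of the most similar variables still available to it.
    rows, cols = np.triu_indices(p, k=1)
    order = np.argsort(-sim[rows, cols], kind="stable")
    for e in order:
        a, b = find(int(rows[e])), find(int(cols[e]))
        if a != b and size[a] + size[b] <= max_size:
            parent[b] = a
            size[a] += size[b]

    groups = {}
    for v in range(p):
        groups.setdefault(find(v), []).append(v)

    result = []
    for members in groups.values():
        if len(members) < min_size:
            continue
        # Assumed notion of center: member with the largest total similarity
        # to the other members of its group.
        sub = sim[np.ix_(members, members)]
        center = members[int(np.argmax(sub.sum(axis=1)))]
        result.append((center, members))
    return result


# Toy similarity matrix: variables 0-2 form one tight group, 3-4 another.
sim = np.full((5, 5), 0.1)
np.fill_diagonal(sim, 1.0)
sim[0, 1] = sim[1, 0] = 0.9
sim[1, 2] = sim[2, 1] = 0.9
sim[0, 2] = sim[2, 0] = 0.7
sim[3, 4] = sim[4, 3] = 0.8

clusters = size_bounded_clusters(sim, min_size=2, max_size=3)
for center, members in clusters:
    print("center:", center, "members:", members)
```

In practice the similarity matrix could be, for instance, the absolute Pearson correlation between variables; that choice is also an assumption of this sketch rather than something stated in the abstract.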

