DeepAI AI Chat
Log In Sign Up

Distributed Spatial Data Clustering as a New Approach for Big Data Analysis

by   Malika Bendechache, et al.

In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with data clustering task of the analysis. The approach consists of two main phases, the first phase executes a clustering algorithm on local data, assuming that the datasets was already distributed among the system processing nodes. The second phase deals with the local clusters aggregation to generate global clusters. This approach not only generates local clusters on each processing node in parallel, but also facilitates the formation of global clusters without prior knowledge of the number of the clusters, which many partitioning clustering algorithm require. In this study, this approach was applied on spatial datasets. The proposed aggregation phase is very efficient and does not involve the exchange of large amounts of data between the processing nodes. The experimental results show that the approach has super linear speed up, scales up very well, and can take advantage of the recent programming models, such as MapReduce model, as its results are not affected by the types of communications.


Hierarchical Aggregation Approach for Distributed clustering of spatial datasets

In this paper, we present a new approach of distributed clustering for s...

Distributed Clustering Algorithm for Spatial Data Mining

Distributed data mining techniques and mainly distributed clustering are...

Knowledge Map: Toward a New Approach Supporting the Knowledge Management in Distributed Data Mining

Distributed data mining (DDM) deals with the problem of finding patterns...

Parallel Computation of PDFs on Big Spatial Data Using Spark

We consider big spatial data, which is typically produced in scientific ...

DAOC: Stable Clustering of Large Networks

Clustering is a crucial component of many data mining systems involving ...

Super-klust: Another Way of Piecewise Linear Classification

With our previous study, the Super-k algorithm, we have introduced a nov...