Performance Evaluation of Linear Regression Algorithm in Cluster Environment

09/14/2020
by Cinantya Paramita, et al.

Cluster computing was introduced as an alternative to supercomputers, and it can address problems that supercomputers cannot handle effectively. In this paper, we evaluate the performance of cluster computing by running a data mining technique in a cluster environment. The experiment attempts to predict flight delays using the linear regression algorithm, with Apache Spark as the cluster computing framework. The results show that a cluster of 5 PCs with equal specifications can increase computational performance by up to 39.76 [...] to the cluster can make the process significantly faster.
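As a rough illustration of the regression technique named in the abstract, the following is a minimal single-machine sketch of ordinary least squares with one feature. It is not the authors' Spark implementation: the function names, the choice of departure delay as the only feature, and the toy values are all hypothetical and serve only to show what fitting and predicting with a linear model involves.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the target value for a single feature value x."""
    return intercept + slope * x


# Hypothetical toy data (minutes), NOT taken from the paper:
# arrival delay is constructed as departure delay + 2.
departure_delay = [0.0, 5.0, 10.0, 20.0, 30.0]
arrival_delay = [2.0, 7.0, 12.0, 22.0, 32.0]

slope, intercept = fit_simple_linear_regression(departure_delay, arrival_delay)
predicted = predict(slope, intercept, 15.0)
print(slope, intercept, predicted)
```

In a cluster setting, a framework such as Apache Spark would distribute the data and the fitting computation across nodes; this sketch keeps everything in memory on one machine to show the model itself.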

research
06/07/2022

High-performance computing for super-resolution microscopy on a cluster of computers

Multiple signal classification algorithm (MUSICAL) provides a super-reso...
research
08/04/2019

A Data Structure Perspective to the RDD-based Apriori Algorithm on Spark

During the recent years, a number of efficient and scalable frequent ite...
research
07/05/2016

Algorithms for Generalized Cluster-wise Linear Regression

Cluster-wise linear regression (CLR), a clustering problem intertwined w...
research
02/21/2019

Dynamic task scheduling in computing cluster environments

In this study, a cluster-computing environment is employed as a computat...
research
04/20/2018

Analyzing astronomical data with Apache Spark

We investigate the performances of Apache Spark, a cluster computing fra...
research
05/21/2022

Experiences with task-based programming using cluster nodes as OpenMP devices

Programming a distributed system, such as a cluster, requires extended u...