The Coming Age of Pervasive Data Processing

06/21/2019
by   Jan S. Rellermeyer, et al.
0

Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of data-intensive workloads, the ever-increasing demand of applications have made us reconsider the traditional ways of scaling (e.g., scale-out) and seek new opportunities for improving the performance. In order to prepare for an era where data collection and processing occur on a wide range of devices, from powerful HPC machines to small embedded devices, it is crucial to investigate and eliminate the potential sources of inefficiency in the current state of the art platforms. In this paper, we address the current and upcoming challenges of pervasive data processing and present directions for designing the next generation of large-scale data processing systems.

READ FULL TEXT
research
10/26/2021

Evaluating Serverless Architecture for Big Data Enterprise Applications

In this paper, we investigate serverless computing for performing large ...
research
07/06/2017

A Survey on Geographically Distributed Big-Data Processing using MapReduce

Hadoop and Spark are widely used distributed processing frameworks for l...
research
05/16/2018

Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications

Over the past decade, the fourth paradigm of data-intensive science rapi...
research
12/02/2019

Large-scale text processing pipeline with Apache Spark

In this paper, we evaluate Apache Spark for a data-intensive machine lea...
research
10/20/2019

Micro-level Modularity of Computaion-intensive Programs in Big Data Platforms: A Case Study with Image Data

With the rapid advancement of Big Data platforms such as Hadoop, Spark, ...
research
07/31/2019

Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark

Real-world data from diverse domains require real-time scalable analysis...
research
03/09/2022

Exoshuffle: Large-Scale Shuffle at the Application Level

Shuffle is a key primitive in large-scale data processing applications. ...

Please sign up or login with your details

Forgot password? Click here to reset