Templating Shuffles

07/21/2022
by Qizhen Zhang, et al.

Cloud data centers are rapidly evolving. At the same time, large-scale data analytics applications require non-trivial performance tuning that is often specific to the application, the workload, and the data center infrastructure. We propose TeShu, which makes network shuffling an extensible, unified service layer common to all data analytics. Because the optimal shuffle depends on a myriad of factors, TeShu introduces parameterized shuffle templates, which are instantiated through accurate and efficient sampling so that TeShu can adapt dynamically to different application workloads and data center layouts. Our experimental results with real-world graph workloads show that TeShu efficiently enables shuffling optimizations that improve performance and adapt to a variety of scenarios.
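
To make the idea of a sampled, parameterized shuffle more concrete, the following is a minimal sketch and not TeShu's actual interface. It assumes key-value records with additive values, and every name in it (`estimate_reduction_ratio`, `combine`, `shuffle`, `combine_threshold`, `sample_rate`) is hypothetical. The shuffle samples the outgoing records, estimates how much a local combine would shrink them, and only combines when the estimate falls below a threshold.

```python
import random
from collections import defaultdict


def estimate_reduction_ratio(records, sample_rate=0.1, seed=0):
    """Crude estimate of (distinct keys / records) from a random sample.

    Hypothetical helper: a low ratio suggests that combining by key
    before sending would shrink the data substantially.
    """
    rng = random.Random(seed)
    sample = [r for r in records if rng.random() < sample_rate]
    if not sample:
        return 1.0  # no evidence of redundancy, so assume none
    distinct_keys = len({key for key, _ in sample})
    return distinct_keys / len(sample)


def combine(records):
    """Sum values per key (assumes an additive combiner)."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return list(totals.items())


def shuffle(records, num_workers, combine_threshold=0.5, sample_rate=0.1):
    """Hash-partition records, combining first only if sampling says it pays off.

    combine_threshold and sample_rate stand in for template parameters;
    their defaults are illustrative assumptions, not tuned values.
    """
    ratio = estimate_reduction_ratio(records, sample_rate)
    if ratio < combine_threshold:
        records = combine(records)
    partitions = [[] for _ in range(num_workers)]
    for key, value in records:
        partitions[hash(key) % num_workers].append((key, value))
    return partitions


if __name__ == "__main__":
    skewed = [(i % 4, 1) for i in range(1000)]   # few distinct keys
    unique = [(i, 1) for i in range(1000)]       # every key distinct
    print(sum(len(p) for p in shuffle(skewed, 2)))
    print(sum(len(p) for p in shuffle(unique, 2)))
```

In this sketch the same shuffle call sends far fewer records for the skewed input than for the all-distinct input, because the sampled estimate triggers the combine step only in the first case. A real system would also have to account for the data center layout, which this example leaves out.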

research
01/20/2021

Neural-based Modeling for Performance Tuning of Spark Data Analytics

Cloud data analytics has become an integral part of enterprise business ...
research
04/28/2016

Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

While cluster computing frameworks are continuously evolving to provide ...
research
02/01/2018

Towards Reliable (and Efficient) Job Executions in a Practical Geo-distributed Data Analytics System

Geo-distributed data analytics are increasingly common to derive useful ...
research
04/24/2022

Taming Hybrid-Cloud Fast and Scalable Graph Analytics at Twitter

We have witnessed a boosted demand for graph analytics at Twitter in rec...
research
01/29/2023

Accelerating Graph Analytics on a Reconfigurable Architecture with a Data-Indirect Prefetcher

The irregular nature of memory accesses of graph workloads makes their p...
research
07/18/2018

BOLT: A Practical Binary Optimizer for Data Centers and Beyond

Performance optimization for large-scale applications has recently becom...
research
06/30/2020

Lachesis: Automated Generation of Persistent Partitionings for UDF-Centric Analytics

Persistent partitioning is effective in avoiding expensive shuffling ope...
