Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing

07/05/2022
by Chenghao Lyu, et al.

Big data processing at the production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, placement of parallel instances on machines, and resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the scale and complexity of big data systems while having to meet stringent time constraints for scheduling. This paper presents a MaxCompute-based integrated system to support multi-objective resource optimization via fine-grained instance-level modeling and optimization. We propose a new architecture that breaks RO into a series of simpler problems, new fine-grained predictive models, and novel optimization methods that exploit these models to make effective instance-level recommendations in a hierarchical MOO framework. Evaluation using production workloads shows that our new RO system could reduce 37-72 optimizer and scheduler, while running in 0.02-0.23s.

research
12/03/2018

Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments

This chapter presents software architectures of the big data processing ...
research
02/27/2020

Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings

Query processing over big data is ubiquitous in modern clouds, where the...
research
08/11/2020

DV-ARPA: Data Variety Aware Resource Provisioning for Big Data Processing in Accumulative Applications

In Cloud Computing, the resource provisioning approach used has a great ...
research
08/19/2020

FIRM: An Intelligent Fine-Grained Resource Management Framework for SLO-Oriented Microservices

Modern user-facing latency-sensitive web services include numerous distr...
research
12/26/2020

Toward Compact Data from Big Data

Bigdata is a dataset of which size is beyond the ability of handling a v...
research
10/13/2020

PIUMA: Programmable Integrated Unified Memory Architecture

High performance large scale graph analytics is essential to timely anal...
research
04/05/2018

Big enterprise registration data imputation: Supporting spatiotemporal analysis of industries in China

Big, fine-grained enterprise registration data that includes time and lo...
