Scalable Infrastructure for Workload Characterization of Cluster Traces

05/23/2022
by Thomas van Loo, et al.

Workload characterization has recently been pursued as a way to gain a foothold in the emerging serverless cloud market, especially for the large production cloud clusters of providers such as Google and AWS. Although analyzing and characterizing real workloads from a large production cloud cluster benefits cloud providers, researchers, and everyday users, analyzing the workload traces of these clusters has been arduous because the data are heterogeneous. This article proposes a scalable infrastructure based on Google's Dataproc for analyzing the workload traces of cloud environments. We evaluate the proposed infrastructure using the Google cluster-usage-traces-v3 workload traces. We perform workload characterization on this dataset, focusing on the heterogeneity of the workload, the variations in job durations, aspects of resource consumption, and the overall availability of resources provided by the cluster. The findings reported in the paper will benefit cloud infrastructure providers and users in managing cloud computing resources, especially on serverless platforms.

research
08/04/2023

A Deep Dive into the Google Cluster Workload Traces: Analyzing the Application Failure Characteristics and User Behaviors

Large-scale cloud data centers have gained popularity due to their high ...
research
11/16/2021

On the Potential of Execution Traces for Batch Processing Workload Optimization in Public Clouds

With the growing amount of data, data processing workloads and the manag...
research
03/06/2020

Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider

Function as a Service (FaaS) has been gaining popularity as a way to dep...
research
03/19/2018

Cloud Workload Prediction based on Workflow Execution Time Discrepancies

Infrastructure as a service clouds hide the complexity of maintaining th...
research
10/19/2022

Miners in the Cloud: Measuring and Analyzing Cryptocurrency Mining in Public Clouds

Cryptocurrencies, arguably the most prominent application of blockchains...
research
12/16/2022

Mystique: Accurate and Scalable Production AI Benchmarks Generation

Building and maintaining large AI fleets to efficiently support the fast...
research
11/03/2021

Implementing a scalable and elastic computing environment based on Cloud Containers

In this article we look at the potential of cloud containers and we prov...
