ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases

03/18/2020
by Guang Chao Wang, et al.

Deploying big-data Machine Learning (ML) services in a cloud environment presents a challenge to the cloud vendor: sizing the cloud container configuration correctly for any given customer use case. OracleLabs has developed an automated framework that uses nested-loop Monte Carlo simulation to autonomously scale customer ML use cases of any size across the range of cloud CPU-GPU "Shapes" (configurations of CPUs and/or GPUs in cloud containers available to end customers). Moreover, the OracleLabs and NVIDIA authors have collaborated on an ML benchmark study that analyzes the compute cost and GPU acceleration of any ML prognostic algorithm and assesses the reduction in compute cost in a cloud container comprising conventional CPUs and NVIDIA GPUs.
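The nested-loop Monte Carlo idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the swept parameters (number of signals and number of observations), the function names `synthetic_signals`, `stand_in_workload`, and `cost_surface`, and the pairwise-covariance computation, which is only a placeholder for whichever ML prognostic algorithm is being scoped. The sketch times the placeholder workload on randomly generated data for every parameter combination and averages over repeated trials, yielding a compute-cost surface of the kind that could then be compared against candidate container Shapes.

```python
import random
import statistics
import time


def synthetic_signals(n_signals, n_obs, rng):
    """Random Gaussian time series, a stand-in for customer telemetry."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(n_signals)]


def stand_in_workload(signals):
    """Placeholder ML step: sample covariance between every pair of signals.

    This is NOT the paper's algorithm; it is a hypothetical workload whose
    cost grows with both the number of signals and the number of observations.
    """
    means = [statistics.fmean(s) for s in signals]
    n_obs = len(signals[0])
    cov = []
    for i, si in enumerate(signals):
        row = []
        for j, sj in enumerate(signals):
            acc = sum((a - means[i]) * (b - means[j]) for a, b in zip(si, sj))
            row.append(acc / (n_obs - 1))
        cov.append(row)
    return cov


def cost_surface(signal_counts, obs_counts, trials=3, seed=0):
    """Return {(n_signals, n_obs): mean wall-clock seconds} over random trials."""
    rng = random.Random(seed)
    surface = {}
    for n_signals in signal_counts:        # outer loop: workload width
        for n_obs in obs_counts:           # inner loop: workload length
            timings = []
            for _ in range(trials):        # Monte Carlo repetitions
                data = synthetic_signals(n_signals, n_obs, rng)
                start = time.perf_counter()
                stand_in_workload(data)
                timings.append(time.perf_counter() - start)
            surface[(n_signals, n_obs)] = statistics.fmean(timings)
    return surface


if __name__ == "__main__":
    for key, secs in cost_surface([2, 4, 8], [50, 100]).items():
        print(key, f"{secs:.6f} s")
```

Running the same sweep on each candidate Shape and comparing the resulting surfaces is one plausible way to use such measurements. The timings themselves depend entirely on the host machine, so no expected output is shown.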
