Green Carbon Footprint for Model Inference Serving via Exploiting Mixed-Quality Models and GPU Partitioning

04/19/2023
by Baolin Li, et al.

This paper addresses the challenge of mitigating carbon emissions from large-scale high-performance computing (HPC) systems and datacenters that host machine learning (ML) inference services. ML inference underpins modern technology products, but it is also a significant contributor to datacenter compute cycles and carbon emissions. We introduce Clover, a carbon-friendly ML inference service runtime system that balances performance, accuracy, and carbon emissions through mixed-quality models and GPU resource partitioning. Our experimental results demonstrate that Clover substantially reduces carbon emissions while maintaining high accuracy and meeting service-level agreement (SLA) targets, making it a promising step toward carbon neutrality in HPC systems and datacenters.
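The abstract describes Clover's core trade-off: serve each request with a model-quality variant and a GPU partition size chosen to minimize carbon while still meeting latency SLAs and an accuracy target. A minimal sketch of that selection idea, assuming illustrative names and numbers (the `Config` fields, candidate values, and `pick_config` helper are hypothetical, not from the paper):

```python
# Hypothetical sketch of a Clover-style configuration search. Each candidate
# pairs a model variant (higher quality -> more energy) with a GPU partition
# size (MPS/MIG-style); we pick the lowest-carbon option that still meets the
# latency SLA and an accuracy floor. All numbers below are made up.
from dataclasses import dataclass

@dataclass
class Config:
    model: str           # model variant (e.g., distilled vs. full)
    gpu_fraction: float  # fraction of the GPU assigned to this service
    accuracy: float      # offline-measured accuracy of the variant
    latency_ms: float    # p99 latency under this partition size
    carbon_g: float      # grams CO2e per 1k requests (energy x grid intensity)

def pick_config(candidates, sla_ms, min_accuracy):
    """Return the lowest-carbon feasible configuration, or None."""
    feasible = [c for c in candidates
                if c.latency_ms <= sla_ms and c.accuracy >= min_accuracy]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.carbon_g)

candidates = [
    Config("full",      1.00, 0.80, 40.0, 120.0),
    Config("full",      0.50, 0.80, 90.0,  70.0),
    Config("distilled", 0.50, 0.76, 45.0,  45.0),
    Config("distilled", 0.25, 0.76, 95.0,  30.0),
]
best = pick_config(candidates, sla_ms=50.0, min_accuracy=0.75)
print(best.model, best.gpu_fraction)  # distilled 0.5
```

Note how the cheapest option overall (distilled at a quarter of the GPU) is rejected for violating the SLA, so the search settles on the smallest-carbon configuration among those that remain feasible.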


Related research

09/01/2021  Multi-model Machine Learning Inference Serving with GPU Spatial Partitioning
As machine learning techniques are applied to a widening range of applic...

06/06/2023  FaaSwap: SLO-Aware, GPU-Efficient Serverless Inference via Model Swapping
The dynamic request patterns of machine learning (ML) inference workload...

04/13/2021  Using Machine Learning at Scale in HPC Simulations with SmartSim: An Application to Ocean Climate Modeling
We demonstrate the first climate-scale, numerical ocean simulations impr...

12/05/2019  Merlin: Enabling Machine Learning-Ready HPC Ensembles
With the growing complexity of computational and experimental facilities...

03/09/2023  GPU-enabled Function-as-a-Service for Machine Learning Inference
Function-as-a-Service (FaaS) is emerging as an important cloud computing...

04/18/2022  Dynamic Network Adaptation at Inference
Machine learning (ML) inference is a real-time workload that must comply...

10/17/2022  Scaling up Trustless DNN Inference with Zero-Knowledge Proofs
As ML models have increased in capabilities and accuracy, so has the com...
