Understanding the Energy Consumption of HPC Scale Artificial Intelligence

This paper contributes to a better understanding of the energy consumption trade-offs of HPC-scale Artificial Intelligence (AI), and more specifically of Deep Learning (DL) algorithms. To this end, we developed Benchmark-Tracker, a benchmark tool that evaluates the speed and energy consumption of DL algorithms in HPC environments. We used hardware counters and Python libraries to collect energy information through software, which enabled us to instrument a known AI benchmark tool and to evaluate the energy consumption of numerous DL algorithms and models. Through an experimental campaign, we present a case example of how Benchmark-Tracker can measure the computing speed and energy consumption of DL training and inference, and of how it can help explain the energy behavior of DL algorithms on HPC platforms. This work is a step toward better understanding the energy consumption of Deep Learning in HPC, and it contributes a new tool that helps HPC DL developers balance the use of HPC infrastructure in terms of speed and energy consumption.
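As a rough illustration of what collecting energy information through software from hardware counters can look like, the sketch below reads the Linux powercap (RAPL) energy counter before and after a workload. It is not the tool's actual implementation: the sysfs path, the choice of RAPL, and the helper names `read_counter`, `energy_joules`, and `measure` are all assumptions made for this example, and the paper's abstract does not say which counters or libraries were used.

```python
import time
from pathlib import Path

# Assumed location of the Linux powercap RAPL package-0 domain (not from the paper).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_counter(name, base=RAPL_DIR):
    """Read one integer counter file; RAPL energy values are in microjoules."""
    return int((base / name).read_text().strip())


def energy_joules(start_uj, end_uj, max_range_uj):
    """Energy between two readings, correcting for a single counter wraparound."""
    delta = end_uj - start_uj
    if delta < 0:  # the counter wrapped back to zero during the measurement
        delta += max_range_uj
    return delta / 1e6


def measure(workload, base=RAPL_DIR):
    """Run workload() and return (elapsed seconds, joules) for it (hypothetical helper)."""
    max_range = read_counter("max_energy_range_uj", base)
    start_e = read_counter("energy_uj", base)
    start_t = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start_t
    end_e = read_counter("energy_uj", base)
    return elapsed, energy_joules(start_e, end_e, max_range)
```

On a machine that exposes this interface, `measure(lambda: train_one_epoch())` (with `train_one_epoch` standing in for any DL workload) would return both quantities the abstract mentions, speed and energy. Reading `energy_uj` may require elevated privileges, and this counter covers the CPU package only, so GPU energy would need a separate source.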

