Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models

09/15/2019
by   Yuxin Wang, et al.
0

Deep neural networks (DNNs) have become widely used in many AI applications. Yet, training a DNN requires a huge amount of calculations and it takes a long time and energy to train a satisfying model. Nowadays, many-core AI accelerators (e.g., GPUs and TPUs) play a key role in training DNNs. However, different many-core processors from different vendors perform very differently in terms of performance and power consumption. To investigate the differences among several popular off-the-shelf processors (i.e., Intel CPU, Nvidia GPU, AMD GPU and Google TPU) in training DNNs, we carry out a detailed performance and power evaluation on these processors by training multiple types of benchmark DNNs including convolutional neural networks (CNNs), recurrent neural networks (LSTM), Deep Speech and transformers. Our evaluation results make two valuable directions for end-users and vendors. For the end-users, the evaluation results provide a guide for selecting a proper accelerator for training DNN models. For the vendors, some advantage and disadvantage revealed in our evaluation results could be useful for future architecture design and software library optimization.

READ FULL TEXT
research
05/19/2022

Multi-DNN Accelerators for Next-Generation AI Systems

As the use of AI-powered applications widens across multiple domains, so...
research
11/16/2017

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs

Deep learning frameworks have been widely deployed on GPU servers for de...
research
07/24/2019

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning

Training deep learning models is compute-intensive and there is an indus...
research
01/31/2023

Tricking AI chips into Simulating the Human Brain: A Detailed Performance Analysis

Challenging the Nvidia monopoly, dedicated AI-accelerator chips have beg...
research
05/19/2021

High performance and energy efficient inference for deep learning on ARM processors

We evolve PyDTNN, a framework for distributed parallel training of Deep ...
research
08/29/2019

Survey and Benchmarking of Machine Learning Accelerators

Advances in multicore processors and accelerators have opened the flood ...
research
02/20/2021

An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks

Edge TPUs are a domain of accelerators for low-power, edge devices and a...

Please sign up or login with your details

Forgot password? Click here to reset