Demystifying the MLPerf Benchmark Suite

08/24/2019
by Snehil Verma, et al.

MLPerf, an emerging machine learning benchmark suite, strives to cover a broad range of machine learning applications. We present a study of its characteristics and of how the MLPerf benchmarks differ from earlier deep learning benchmarks such as DAWNBench and DeepBench. We find that application benchmarks such as MLPerf, although rich in kernels, exhibit different features from kernel benchmarks such as DeepBench. The MLPerf benchmark suite contains a diverse set of models, which helps expose various bottlenecks in the system. Based on our findings, a dedicated low-latency interconnect between GPUs in multi-GPU systems is required for optimal distributed deep learning training. We also observe variation in scaling efficiency across the MLPerf models; this variation highlights the importance of smart scheduling strategies for multi-GPU training. Another observation is that CPU utilization increases with the number of GPUs used for training. Corroborating prior work, we also observe and quantify the improvements possible from compiler optimizations, mixed-precision training, and the use of Tensor Cores.
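
The abstract does not define scaling efficiency, so the following is only a minimal sketch under a common convention: speedup over a single GPU divided by the number of GPUs, where 1.0 means ideal linear scaling. The function name and the timings are hypothetical and are not taken from the paper.

```python
def scaling_efficiency(time_one_gpu, time_n_gpus, n_gpus):
    """Fraction of ideal linear scaling achieved on n_gpus.

    Assumed definition (not from the paper): speedup / n_gpus,
    where speedup = time_one_gpu / time_n_gpus.
    """
    if n_gpus < 1 or time_one_gpu <= 0 or time_n_gpus <= 0:
        raise ValueError("times must be positive and n_gpus must be >= 1")
    speedup = time_one_gpu / time_n_gpus
    return speedup / n_gpus


# Hypothetical training times in minutes, for illustration only.
example = scaling_efficiency(time_one_gpu=100.0, time_n_gpus=30.0, n_gpus=4)
print(f"scaling efficiency on 4 GPUs: {example:.2f}")
```

Under this convention, a model whose efficiency drops sharply as GPUs are added gains little from a larger allocation, which is one way to read the abstract's point about scheduling.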

