A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning

01/29/2019
by Tal Ben-Nun, et al.

We introduce Deep500: the first customizable benchmarking infrastructure that enables fair comparison of the plethora of deep learning frameworks, algorithms, libraries, and techniques. The key idea behind Deep500 is its modular design, where deep learning is factorized into four distinct levels: operators, network processing, training, and distributed training. Our evaluation illustrates that Deep500 is customizable (enables combining and benchmarking different deep learning codes) and fair (uses carefully selected metrics). Moreover, Deep500 is fast (incurs negligible overheads), verifiable (offers infrastructure to analyze correctness), and reproducible. Finally, as the first distributed and reproducible benchmarking system for deep learning, Deep500 provides software infrastructure to utilize the most powerful supercomputers for extreme-scale workloads.
