The Framework Tax: Disparities Between Inference Efficiency in Research and Deployment

02/13/2023
by Jared Fernandez, et al.

Increased focus on the deployment of machine learning systems has led to rapid improvements in hardware accelerator performance and neural network model efficiency. However, the resulting reductions in floating point operations and increases in accelerator throughput have not directly translated to improvements in real-world inference latency. We demonstrate that these discrepancies can be largely attributed to misalignments between model architectures and the capabilities of the underlying hardware, caused by bottlenecks introduced by deep learning frameworks. We denote this phenomenon the framework tax, and observe that the disparity is growing as hardware speeds increase over time. In this work, we examine this phenomenon through a series of case studies analyzing the effects of model design decisions, framework paradigms, and hardware platforms on total model latency. Based on our findings, we provide actionable recommendations to ML researchers and practitioners aimed at narrowing the gap between efficient ML model research and practice.
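As a rough illustration of the kind of overhead the abstract describes (not code from the paper), the sketch below uses PyTorch to time a deliberately small model in eager mode and again after torch.jit.script compilation. On a workload this small, the gap between the two timings is dominated by framework dispatch overhead rather than floating point work. The model size, iteration counts, and choice of torch.jit.script are illustrative assumptions, not the paper's benchmark setup.

```python
import time
import torch
import torch.nn as nn

# Hypothetical toy model: a small MLP whose per-layer compute is tiny,
# so per-operator framework dispatch can dominate end-to-end latency.
model = nn.Sequential(
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
).eval()

x = torch.randn(1, 256)  # batch size 1, as in latency-sensitive inference

def benchmark(fn, inp, warmup=10, iters=100):
    """Return mean wall-clock latency in milliseconds per forward pass.
    Note: on a GPU, torch.cuda.synchronize() would be needed around the
    timed region; this CPU-only sketch relies on wall-clock time alone."""
    with torch.no_grad():
        for _ in range(warmup):
            fn(inp)
        start = time.perf_counter()
        for _ in range(iters):
            fn(inp)
        end = time.perf_counter()
    return (end - start) / iters * 1e3

eager_ms = benchmark(model, x)
scripted = torch.jit.script(model)  # compile the module to a TorchScript graph
scripted_ms = benchmark(scripted, x)

print(f"eager:    {eager_ms:.3f} ms / forward")
print(f"scripted: {scripted_ms:.3f} ms / forward")
```

Scaling the hidden width up or down in this sketch shows the effect the abstract points to: as the arithmetic per operator shrinks, the eager-mode latency stops tracking the floating point operation count and settles at a framework-imposed floor.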


Related research

09/09/2021 · SONIC: A Sparse Neural Network Inference Accelerator with Silicon Photonics for Energy-Efficient Deep Learning
06/21/2023 · Subgraph Stationary Hardware-Software Inference Co-Design
08/23/2023 · An Open-Source ML-Based Full-Stack Optimization Framework for Machine Learning Accelerators
12/07/2022 · CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework
07/31/2019 · Tuning Algorithms and Generators for Efficient Edge Inference
11/09/2022 · Profiling and Improving the PyTorch Dataloader for High-Latency Storage: A Technical Report
05/08/2020 · Measuring the Algorithmic Efficiency of Neural Networks
