IOS: Inter-Operator Scheduler for CNN Acceleration

11/02/2020
by Yaoyao Ding, et al.

To accelerate CNN inference, existing deep learning frameworks focus on optimizing intra-operator parallelization. However, a single operator can no longer fully utilize the available parallelism given the rapid advances in high-performance hardware, resulting in a large gap between the peak performance and the real performance. This performance gap is more severe under smaller batch sizes. In this work, we extensively study the parallelism between operators and propose Inter-Operator Scheduler (IOS) to automatically schedule the execution of multiple operators in parallel. IOS utilizes dynamic programming to find a scheduling policy specialized for the target hardware. IOS consistently outperforms state-of-the-art libraries (e.g., TensorRT) by 1.1 to 1.5x on modern CNN benchmarks.
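To make the dynamic-programming idea concrete, here is a minimal sketch of a stage-based inter-operator scheduler on a toy operator graph. It is not the IOS implementation. The graph, the names `OPS`, `DEPS`, `stage_latency` and `schedule`, and the cost model are all assumptions made for illustration: a stage's latency is taken to be its longest operator, stretched when the combined utilization exceeds the device. A real scheduler would instead profile each candidate stage on the target hardware.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical toy graph: op -> (latency in ms when run alone,
#                                fraction of the device it keeps busy).
OPS = {"a": (1.0, 0.4), "b": (1.0, 0.4), "c": (2.0, 0.9), "d": (1.0, 1.0)}
# op -> set of operators that must finish before it starts.
DEPS = {"a": set(), "b": set(), "c": set(), "d": {"a", "b", "c"}}


def stage_latency(stage, ops):
    """Assumed cost model for running all ops in `stage` concurrently."""
    longest = max(ops[op][0] for op in stage)
    demand = sum(ops[op][1] for op in stage)
    # Oversubscribing the device (demand > 1) stretches the stage.
    return longest * max(1.0, demand)


def schedule(ops=OPS, deps=DEPS):
    """Return (total latency, tuple of stages) minimizing the assumed cost."""
    names = sorted(ops)

    @lru_cache(maxsize=None)
    def best(done):
        # `done` is a frozenset of operators that have already executed.
        if len(done) == len(names):
            return 0.0, ()
        # Operators whose predecessors have all finished.
        ready = [op for op in names if op not in done and deps[op] <= done]
        result = (float("inf"), ())
        # Try every non-empty subset of ready operators as the next stage.
        for k in range(1, len(ready) + 1):
            for stage in combinations(ready, k):
                rest_cost, rest_plan = best(done | frozenset(stage))
                cost = stage_latency(stage, ops) + rest_cost
                if cost < result[0]:
                    result = (cost, (stage,) + rest_plan)
        return result

    return best(frozenset())


if __name__ == "__main__":
    total, plan = schedule()
    print(total, plan)
```

Under this assumed cost model, the search weighs running everything at once against running operators one by one and can settle on a mix of the two, which is the kind of hardware-specific trade-off the abstract describes. The subset enumeration is exponential in the number of ready operators, so the sketch only suits small graphs.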


