HEP-BNN: A Framework for Finding Low-Latency Execution Configurations of BNNs on Heterogeneous Multiprocessor Platforms

Binarized Neural Networks (BNNs) use binarized weights and activations, which significantly reduces computation and memory demands compared to full-precision NNs. On a heterogeneous multiprocessor platform consisting of a CPU and a GPU, the choice of device on which each BNN layer executes can affect inference performance, i.e., accuracy and latency. Such platforms are commonly available for BNN workloads, but using them effectively requires an efficient strategy for mapping the workload onto the devices. In this work, we propose a framework that generates efficient BNN layer-to-device mappings (i.e., a suitable parallel configuration for each layer of the model) for execution platforms comprising a CPU and a CUDA-capable GPU. We evaluate the framework with two BNN architectures using two well-known datasets, Fashion-MNIST and CIFAR-10, on three hardware platforms with different characteristics. The results show that, compared to a fully parallelized GPU implementation, the configurations generated by our framework are up to 2x, 2.6x, and 11.8x faster on the three tested platforms, respectively.
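To make the idea of a layer-to-device mapping concrete, the following is a minimal sketch and not the framework's actual search procedure, which the abstract does not describe. It assumes a simple additive latency model: each layer has a known latency per device, and switching devices between consecutive layers costs a fixed transfer time. The function name `best_mapping`, the `transfer_cost` parameter, and all latency numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def best_mapping(
    layer_latency: List[Dict[str, float]],
    transfer_cost: float,
) -> Tuple[float, List[str]]:
    """Return (total latency, device per layer) with minimal summed latency.

    Assumption (not from the paper): total latency is the sum of per-layer
    latencies plus a fixed cost whenever consecutive layers change device.
    """
    devices = list(layer_latency[0])
    # best[d] = cheapest (latency, mapping) for the layers seen so far,
    # given that the most recent layer runs on device d.
    best = {d: (layer_latency[0][d], [d]) for d in devices}
    for costs in layer_latency[1:]:
        nxt = {}
        for d in devices:
            candidates = []
            for p in devices:
                prev_latency, prev_map = best[p]
                switch = transfer_cost if p != d else 0.0
                candidates.append((prev_latency + switch + costs[d], prev_map + [d]))
            nxt[d] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])


# Made-up per-layer latencies in milliseconds, for illustration only.
layers = [
    {"cpu": 1.0, "gpu": 3.0},
    {"cpu": 5.0, "gpu": 1.0},
    {"cpu": 1.0, "gpu": 2.0},
]
latency, mapping = best_mapping(layers, transfer_cost=0.5)
print(latency, mapping)  # 4.0 ['cpu', 'gpu', 'cpu']
```

With these illustrative numbers, running every layer on the GPU would cost 6.0 ms under the same model, so the mixed mapping is faster. A real system would measure the latencies on the target hardware and would likely need a richer cost model than this one.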


