Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning

07/14/2020
by   Shauharda Khadka, et al.

As modern neural networks have grown to billions of parameters, meeting tight latency budgets has become increasingly challenging. Approaches like compression, sparsification and network pruning have proven effective at tackling this problem - but they rely on modifying the underlying network. In this paper, we look at a complementary approach: optimizing how tensors are mapped to on-chip memory in an inference accelerator while leaving the network parameters untouched. Since different memory components trade off capacity for bandwidth differently, a sub-optimal mapping can result in high latency. We introduce evolutionary graph reinforcement learning (EGRL) - a method combining graph neural networks, reinforcement learning (RL) and evolutionary search - that aims to find the optimal mapping to minimize latency. Furthermore, a set of fast, stateless policies guides the evolutionary search to improve sample-efficiency. We train and validate our approach directly on the Intel NNP-I chip for inference using a batch size of 1. EGRL outperforms policy-gradient, evolutionary search and dynamic programming baselines on BERT, ResNet-101 and ResNet-50. We achieve a 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
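To make the mapping problem concrete, here is a minimal sketch of the evolutionary-search ingredient alone, under loud assumptions. It is not the authors' EGRL: there is no graph neural network or RL policy, and latency comes from a made-up cost model rather than measurements on the NNP-I. The memory names, capacities, bandwidths, and the helpers `latency`, `mutate` and `evolve` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical memory levels: (name, capacity in bytes, bandwidth in bytes per microsecond).
# These numbers are illustrative assumptions, not NNP-I specifications.
MEMORIES = [("DRAM", 10**9, 50), ("LLC", 2_000_000, 200), ("SRAM", 400_000, 1000)]

def latency(mapping, sizes):
    """Toy cost: per-tensor transfer time, plus a large penalty if a memory overflows."""
    used = [0] * len(MEMORIES)
    total = 0.0
    for size, mem in zip(sizes, mapping):
        used[mem] += size
        total += size / MEMORIES[mem][2]
    for level, (_, capacity, _) in enumerate(MEMORIES):
        if used[level] > capacity:
            total += 1e6  # invalid mapping: capacity exceeded
    return total

def mutate(mapping, rng, rate=0.1):
    """Reassign each tensor to a random memory with probability `rate`."""
    return [rng.randrange(len(MEMORIES)) if rng.random() < rate else m for m in mapping]

def evolve(sizes, generations=200, population=32, elites=4, seed=0):
    """Elitist evolutionary search over tensor-to-memory mappings."""
    rng = random.Random(seed)
    # Start from the all-DRAM mapping, which is valid whenever the tensors fit in DRAM.
    pop = [[0] * len(sizes) for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda m: latency(m, sizes))
        parents = pop[:elites]  # elites survive unchanged, so the best never gets worse
        pop = parents + [mutate(rng.choice(parents), rng) for _ in range(population - elites)]
    return min(pop, key=lambda m: latency(m, sizes))

if __name__ == "__main__":
    rng = random.Random(1)
    sizes = [rng.randint(10_000, 500_000) for _ in range(20)]
    best = evolve(sizes)
    print("all-DRAM latency:", latency([0] * len(sizes), sizes))
    print("evolved latency: ", latency(best, sizes))
```

Because the elites are carried over unchanged, the evolved mapping is never worse than the all-DRAM starting point under this toy cost. In the paper's setting, the fitness signal would instead be latency measured on hardware, and learned policies would guide the search.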

research
03/18/2020

Placement Optimization with Deep Reinforcement Learning

Placement Optimization is an important problem in systems and chip desig...
research
02/05/2021

GNN-RL Compression: Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning

Model compression is an essential technique for deploying deep neural ne...
research
06/11/2021

DECORE: Deep Compression with Reinforcement Learning

Deep learning has become an increasingly popular and powerful option for...
research
04/22/2020

Chip Placement with Deep Reinforcement Learning

In this work, we present a learning-based approach to chip placement, on...
research
08/06/2018

On Optimizing Deep Convolutional Neural Networks by Evolutionary Computing

Optimization for deep networks is currently a very active area of resear...
research
05/11/2023

Optimizing Memory Mapping Using Deep Reinforcement Learning

Resource scheduling and allocation is a critical component of many high ...
