HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks

11/21/2022
by   Zining Zhang, et al.

Efficient neural network inference requires that the underlying tensor programs be thoroughly tuned before they are deployed to production environments. Typically, an enormous number of tensor program candidates must be explored to find the one with the best performance. This is necessary for neural network products to meet the high demands of real-world applications such as natural language processing and autonomous driving. Auto-schedulers are being developed to remove the need for human intervention. However, because the search space is gigantic and intelligent search guidance is lacking, current auto-schedulers require hours to days of tuning time to find the best-performing tensor program for an entire neural network. In this paper, we propose HARL, a reinforcement learning (RL) based auto-scheduler specifically designed for efficient tensor program exploration. HARL uses a hierarchical RL architecture in which learning-based decisions are made at every level of search granularity. It also automatically adjusts exploration configurations in real time for faster performance convergence. As a result, HARL improves tensor operator performance by 22% and search speed by 4.3x compared to the state-of-the-art auto-scheduler. Inference performance and search speed are also significantly improved on end-to-end neural networks.
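The abstract does not spell out the algorithm, so the following is only a toy sketch of what "learning-based decisions at more than one level of granularity" could look like. It is not HARL's method. It assumes two levels (which task to tune, then which schedule choice to try), swaps in plain UCB1 bandits for the RL agents, replaces hardware measurement with a lookup in a hypothetical `latency_tables` list, and uses a simple decay of the exploration constant as a stand-in for adaptive exploration. All names here (`UCBBandit`, `hierarchical_search`, `latency_tables`) are illustrative.

```python
import math


class UCBBandit:
    """UCB1 bandit over a fixed number of arms (stand-in for an RL agent)."""

    def __init__(self, n_arms, c=1.0):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.c = c  # exploration constant

    def select(self):
        # Try every arm once before relying on the UCB score.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            v + self.c * math.sqrt(2.0 * math.log(total) / n)
            for v, n in zip(self.values, self.counts)
        ]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # Incremental mean of the rewards observed for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def hierarchical_search(latency_tables, rounds=200):
    """Two-level search over hypothetical per-task latency tables.

    latency_tables[t][k] is the (assumed, synthetic) latency of schedule
    choice k for task t. Returns the best latency found per task.
    """
    top = UCBBandit(len(latency_tables))             # level 1: pick a task
    low = [UCBBandit(len(t)) for t in latency_tables]  # level 2: pick a choice
    best = [math.inf] * len(latency_tables)

    for _ in range(rounds):
        task = top.select()
        choice = low[task].select()
        latency = latency_tables[task][choice]  # stand-in for a measurement

        # Top-level reward: relative improvement of the task's best latency.
        if math.isinf(best[task]):
            gain = 0.0
        else:
            gain = max(0.0, (best[task] - latency) / best[task])
        best[task] = min(best[task], latency)

        # Low-level reward: lower latency maps to a reward closer to 1.
        low[task].update(choice, 1.0 / (1.0 + latency))
        top.update(task, gain)

        # Assumed adaptation rule: explore less as a task is visited more.
        low[task].c = max(0.1, low[task].c * 0.98)

    return best


if __name__ == "__main__":
    tables = [[5.0, 3.0, 4.0, 6.0], [2.0, 2.5, 1.0, 1.5]]
    print(hierarchical_search(tables))
```

In this sketch the top-level bandit shifts its budget toward tasks that are still improving, while each low-level bandit concentrates on the schedule choices that have measured well. A real auto-scheduler would replace the lookup table with compilation and on-device timing.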


research
10/25/2021

Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance

Today's auto-tuners (e.g., AutoTVM, Ansor) generate efficient tensor pro...
research
01/14/2022

Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation

Auto-scheduling for tensor programs is a process where a search algorith...
research
06/11/2020

Ansor: Generating High-Performance Tensor Programs for Deep Learning

High-performance tensor programs are crucial to guarantee efficient exec...
research
11/07/2022

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

Tensor program tuning is a non-convex objective optimization problem, to...
research
07/16/2021

Boosting the Convergence of Reinforcement Learning-based Auto-pruning Using Historical Data

Recently, neural network compression schemes like channel pruning have b...
research
05/31/2022

HW-Aware Initialization of DNN Auto-Tuning to Improve Exploration Time and Robustness

The process of optimizing the latency of DNN operators with ML models an...
research
01/01/2022

FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity

Deploying various deep learning (DL) models efficiently has boosted the ...
