TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

11/07/2022
by Yi Zhai, et al.

Tensor program tuning is a non-convex optimization problem for which search-based approaches have proven effective. At the core of these approaches lies the design of the cost model. Although deep learning-based cost models perform significantly better than other methods, they still fall short in two respects. First, their feature extraction relies heavily on expert-level domain knowledge of hardware architectures; even so, the extracted features are often unsatisfactory and require separate designs for CPUs and GPUs. Second, a cost model trained on one hardware platform usually performs poorly on another, a problem we call cross-hardware unavailability. To address these problems, we propose TLP and MTL-TLP. TLP is a deep learning-based cost model that facilitates tensor program tuning. Instead of extracting features from the tensor program itself, TLP extracts features from the schedule primitives. We treat schedule primitives as a tensor language, making TLP a Tensor Language Processing task: predicting tensor program latency with the cost model is thereby transformed into a natural language processing (NLP) regression task. MTL-TLP combines multi-task learning with TLP to cope with the cross-hardware unavailability problem. We incorporate these techniques into the Ansor framework and conduct detailed experiments. Results show that TLP speeds up the average search time by 9.1X and 3.0X on CPU and GPU workloads, respectively, compared to the state-of-the-art implementation. MTL-TLP achieves speed-ups of 4.7X and 2.9X on CPU and GPU workloads, respectively, using only 7% of the target hardware data.
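The abstract describes the approach only at a conceptual level. Below is a minimal sketch of the stated idea: tokenize a schedule-primitive sequence as if it were text, encode it with a Transformer, and regress latency. The class name, vocabulary size, model dimensions, and the per-hardware regression heads (standing in for the multi-task MTL-TLP variant) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TLPSketch(nn.Module):
    """Sketch: schedule primitives as tokens -> Transformer -> latency."""

    def __init__(self, vocab_size=512, d_model=128, n_heads=4,
                 n_layers=2, n_tasks=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # One regression head per hardware platform (task); n_tasks=1
        # reduces this multi-task sketch to the single-platform case.
        self.heads = nn.ModuleList(
            [nn.Linear(d_model, 1) for _ in range(n_tasks)])

    def forward(self, tokens, task=0):
        # tokens: (batch, seq_len) integer ids of schedule primitives
        h = self.encoder(self.embed(tokens))    # (batch, seq_len, d_model)
        h = h.mean(dim=1)                       # pool over the sequence
        return self.heads[task](h).squeeze(-1)  # predicted latency score

# Usage: a shared encoder with two hardware-specific heads.
model = TLPSketch(n_tasks=2)
ids = torch.randint(0, 512, (8, 32))
print(model(ids, task=0).shape)  # torch.Size([8])
```

In this reading, multi-task training would share the encoder across hardware platforms while each platform keeps its own head, which is one plausible way to realize the cross-hardware transfer the abstract describes.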


Related research

05/21/2018 · Learning to Optimize Tensor Programs
We introduce a learning-based framework to optimize tensor programs for ...

06/11/2020 · Ansor: Generating High-Performance Tensor Programs for Deep Learning
High-performance tensor programs are crucial to guarantee efficient exec...

11/21/2022 · HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks
To efficiently perform inference with neural networks, the underlying te...

10/29/2022 · Enabling Data Movement and Computation Pipelining in Deep Learning Compiler
Pipelining between data loading and computation is a critical tensor pro...

01/15/2022 · Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization
Achieving efficient execution of machine learning models has attracted s...

10/18/2022 · Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs
As deep learning models nowadays are widely adopted by both cloud servic...

06/22/2020 · Similarity Search with Tensor Core Units
Tensor Core Units (TCUs) are hardware accelerators developed for deep ne...
