Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization

01/15/2022
by Zhihe Zhao, et al.

Efficient execution of machine learning models has attracted significant attention recently. To generate tensor programs efficiently, DNN compilers rely on a key component: a cost model that predicts the performance of each candidate configuration on a specific device. However, with the rapid emergence of new hardware platforms, training a domain-specific predictor for every new platform is increasingly labor-intensive. Moreover, current cost model designs cannot transfer features between different hardware accelerators efficiently and effectively. In this paper, we propose Moses, a simple and efficient design based on the lottery ticket hypothesis, which takes full advantage of the features that are transferable to the target device via domain adaptation. Compared with state-of-the-art approaches, Moses achieves up to 1.53X efficiency gain in the search stage and 1.41X inference speedup on challenging DNN benchmarks.
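The abstract does not spell out the mechanism, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a toy linear cost model, a magnitude-based mask in the spirit of the lottery ticket hypothesis to decide which weights count as "transferable", and fine-tuning of only the remaining weights on a few target-device measurements. All names (`predict_cost`, `magnitude_mask`, `adapt`, `top_k`), the selection criterion, and the numbers are hypothetical.

```python
def predict_cost(weights, features):
    """Toy linear cost model: predicted latency is a weighted sum of features."""
    return sum(w * f for w, f in zip(weights, features))


def magnitude_mask(weights, keep_ratio):
    """Mark the largest-magnitude weights as transferable (1), the rest as 0.

    Assumption: magnitude is used here only as a stand-in criterion.
    """
    k = max(1, round(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(weights))]


def adapt(weights, mask, samples, lr=0.1, epochs=200):
    """Fine-tune only the non-transferable weights (mask == 0) with SGD
    on (features, measured_cost) pairs collected on the target device."""
    w = list(weights)
    for _ in range(epochs):
        for features, measured in samples:
            err = predict_cost(w, features) - measured
            for i, f in enumerate(features):
                if mask[i] == 0:
                    w[i] -= lr * err * f
    return w


def top_k(weights, candidates, k):
    """Rank candidate configurations by predicted cost and keep the k cheapest,
    so that only those need to be measured on real hardware."""
    return sorted(candidates, key=lambda c: predict_cost(weights, c))[:k]


# Weights of a cost model trained on a source device (made-up values).
source_weights = [2.0, 0.1, 1.5, 0.05]
mask = magnitude_mask(source_weights, keep_ratio=0.5)

# A handful of measurements from the target device (made-up values).
target_samples = [([1.0, 1.0, 0.0, 0.0], 3.0), ([0.0, 0.0, 1.0, 1.0], 2.0)]
target_weights = adapt(source_weights, mask, target_samples)

candidates = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
best = top_k(target_weights, candidates, k=2)
```

The point of the sketch is the division of labor: the masked weights carry over unchanged from the source device, so only a small part of the model has to be re-learned from target-device measurements before the cost model can guide the search again.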

