Learning to Optimize Tensor Programs

05/21/2018
by   Tianqi Chen, et al.
0

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems. However, existing systems rely on manually optimized libraries such as cuDNN where only a narrow range of server class GPUs are well-supported. The reliance on hardware-specific operator libraries limits the applicability of high-level graph optimizations and incurs significant engineering costs when deploying to new hardware targets. We use learning to remove this engineering burden. We learn domain-specific statistical cost models to guide the search of tensor operator implementations over billions of possible program variants. We further accelerate the search by effective model transfer across workloads. Experimental results show that our framework delivers performance competitive with state-of-the-art hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPU.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/12/2018

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

There is an increasing need to bring machine learning to a wide diversit...
research
06/11/2020

Ansor : Generating High-Performance Tensor Programs for Deep Learning

High-performance tensor programs are crucial to guarantee efficient exec...
research
10/25/2021

Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance

Today's auto-tuners (e.g., AutoTVM, Ansor) generate efficient tensor pro...
research
11/07/2022

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

Tensor program tuning is a non-convex objective optimization problem, to...
research
07/11/2022

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning

Sparse tensors are rapidly becoming critical components of modern deep l...
research
04/16/2017

In-Datacenter Performance Analysis of a Tensor Processing Unit

Many architects believe that major improvements in cost-energy-performan...
research
11/20/2016

Deep Tensor Convolution on Multicores

Deep convolutional neural networks (ConvNets) of 3-dimensional kernels a...

Please sign up or login with your details

Forgot password? Click here to reset