LoopTune: Optimizing Tensor Computations with Reinforcement Learning

09/04/2023
by Dejan Grubisic, et al.

Advanced compiler technology is crucial for enabling machine learning applications to run on novel hardware, but traditional compilers fail to deliver performance, popular auto-tuners have long search times, and expert-optimized libraries introduce unsustainable costs. To address this, we developed LoopTune, a deep reinforcement learning compiler that optimizes tensor computations in deep learning models for the CPU. LoopTune optimizes tensor traversal order while using the ultra-fast lightweight code generator LoopNest to perform hardware-specific optimizations. With a novel graph-based representation and action space, LoopTune speeds up LoopNest by 3.2x, generating code an order of magnitude faster than TVM, 2.8x faster than MetaSchedule, and 1.08x faster than AutoTVM, while consistently performing at the level of the hand-tuned library NumPy. Moreover, LoopTune tunes code on the order of seconds.
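
To make "tensor traversal order" concrete, the following is a minimal sketch in plain Python, not LoopTune's or LoopNest's actual API. It assumes a small matrix multiplication and a hypothetical helper, `matmul_with_order`, that runs the same computation under each of the six possible loop nestings. Every ordering produces the same result; what differs on real hardware is the memory access pattern, which is the kind of choice a loop-order tuner searches over.

```python
from itertools import permutations


def matmul_with_order(a, b, order):
    """Multiply a (m x k) by b (k x n) using the loop nesting in `order`.

    `order` is a permutation of the index names "i", "j", "k",
    listed from the outermost loop to the innermost loop.
    This helper is hypothetical and exists only for illustration.
    """
    m, k_dim, n = len(a), len(b), len(b[0])
    extents = {"i": m, "j": n, "k": k_dim}
    c = [[0] * n for _ in range(m)]
    idx = {}
    for x in range(extents[order[0]]):
        idx[order[0]] = x
        for y in range(extents[order[1]]):
            idx[order[1]] = y
            for z in range(extents[order[2]]):
                idx[order[2]] = z
                i, j, k = idx["i"], idx["j"], idx["k"]
                # Same arithmetic in every ordering; only the visit order changes.
                c[i][j] += a[i][k] * b[k][j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]

# One result per loop order: "ijk", "ikj", "jik", "jki", "kij", "kji".
results = {"".join(p): matmul_with_order(a, b, p) for p in permutations("ijk")}
```

Because the inputs are integers, all six orderings agree exactly. With floating-point data the results could differ by rounding, since the accumulation order changes.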

research
11/23/2021

Generating GPU Compiler Heuristics using Reinforcement Learning

GPU compilers are complex software programs with many optimizations spec...
research
05/02/2022

LoopStack: a Lightweight Tensor Algebra Compiler Stack

We present LoopStack, a domain specific compiler stack for tensor operat...
research
05/17/2023

ACRoBat: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time

Dynamic control flow is an important technique often used to design expr...
research
03/14/2019

Stripe: Tensor Compilation via the Nested Polyhedral Model

Hardware architectures and machine learning (ML) libraries evolve rapidl...
research
09/14/2017

Weld: Rethinking the Interface Between Data-Intensive Applications

Data analytics applications combine multiple functions from different li...
research
09/23/2019

Compiler-Level Matrix Multiplication Optimization for Deep Learning

An important linear algebra routine, GEneral Matrix Multiplication (GEMM...
research
09/16/2023

Accelerating In-Browser Deep Learning Inference on Diverse Edge Clients through Just-in-Time Kernel Optimizations

Web applications are increasingly becoming the primary platform for AI s...
