Large Language Models for Compiler Optimization

09/11/2023
by Chris Cummins, et al.

We explore the novel application of Large Language Models to code optimization. We present a 7B-parameter transformer model trained from scratch to optimize LLVM assembly for code size. The model takes unoptimized assembly as input and outputs a list of compiler options that best optimize the program. Crucially, during training, we ask the model to predict the instruction counts before and after optimization, as well as the optimized code itself. These auxiliary learning tasks significantly improve the optimization performance of the model and deepen its understanding of the code. We evaluate on a large suite of test programs. Our approach achieves a 3.0% improvement in reducing instruction counts over the compiler, outperforming two state-of-the-art baselines that require thousands of compilations. Furthermore, the model shows surprisingly strong code reasoning abilities, generating compilable code 91% of the time and perfectly emulating the output of the compiler 70% of the time.
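To make the training setup concrete, the sketch below shows one way a single training record and the headline metric could be represented. It is a minimal illustration under stated assumptions, not the authors' data format: the `TrainingExample` class, its field names, the text layout produced by `build_target`, and the `improvement_over_baseline` helper are all hypothetical, and the pass names are placeholders. The only parts taken from the abstract are the input (unoptimized LLVM assembly), the primary output (a list of compiler options), and the auxiliary targets (instruction counts before and after optimization, plus the optimized code).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TrainingExample:
    """One hypothetical training record; names and layout are assumptions."""
    unoptimized_ir: str    # model input: unoptimized LLVM assembly
    pass_list: List[str]   # primary target: compiler options to apply
    count_before: int      # auxiliary target: instruction count of the input
    count_after: int       # auxiliary target: instruction count after optimization
    optimized_ir: str      # auxiliary target: the optimized code itself


def build_target(ex: TrainingExample, with_auxiliary: bool = True) -> str:
    """Serialize the text the model is asked to produce (assumed layout).

    With with_auxiliary=False only the pass list is emitted, which is all
    that is needed when the model is used purely to pick compiler options.
    """
    lines = ["passes: " + " ".join(ex.pass_list)]
    if with_auxiliary:
        lines.append(f"instruction count before: {ex.count_before}")
        lines.append(f"instruction count after: {ex.count_after}")
        lines.append("optimized code:")
        lines.append(ex.optimized_ir)
    return "\n".join(lines)


def improvement_over_baseline(
    baseline_counts: Sequence[int], model_counts: Sequence[int]
) -> float:
    """Percentage reduction in total instruction count relative to a baseline.

    This is one plausible reading of "improvement in reducing instruction
    counts over the compiler"; the exact aggregation is an assumption.
    """
    if len(baseline_counts) != len(model_counts):
        raise ValueError("need one baseline count and one model count per program")
    baseline_total = sum(baseline_counts)
    if baseline_total == 0:
        raise ValueError("baseline instruction count must be positive")
    return 100.0 * (baseline_total - sum(model_counts)) / baseline_total


# A toy record: the add of zero is redundant, so one instruction remains.
example = TrainingExample(
    unoptimized_ir="define i32 @f(i32 %x) {\n  %a = add i32 %x, 0\n  ret i32 %a\n}",
    pass_list=["instcombine", "simplifycfg"],  # placeholder pass names
    count_before=2,
    count_after=1,
    optimized_ir="define i32 @f(i32 %x) {\n  ret i32 %x\n}",
)
```

Calling `build_target(example)` yields the pass list followed by the auxiliary predictions, while `improvement_over_baseline` compares per-program instruction counts from a baseline compiler run against those obtained with the model's pass lists.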

