Improved Basic Block Reordering

09/12/2018
by Andy Newell, et al.

Basic block reordering is an important step in profile-guided binary optimization. The state of the art for basic block reordering is to maximize the number of fall-through branches. However, we demonstrate that such orderings may result in suboptimal instruction cache and I-TLB performance. We propose a new algorithm that relies on a model combining the effects of fall-through and caching behavior. As the details of modern processor caching are quite complex and often unknown, we show how to use machine learning to select parameters that best trade off different caching effects to maximize binary performance. An extensive evaluation on a variety of applications, including Facebook production workloads, the open-source compiler Clang, and SPEC CPU 2006 benchmarks, indicates that the new method outperforms existing block reordering techniques, improving the resulting performance of large-scale data-center applications. We have open-sourced the code of the new algorithm as part of a post-link binary optimization tool, BOLT.
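
To make the contrast between a pure fall-through objective and an objective that also accounts for caching more concrete, here is a minimal sketch in Python. It is not the model from the paper: the function name `layout_score`, the linear distance discount, and the parameters `window` and `jump_weight` are all hypothetical, chosen only for illustration. The abstract says the actual parameters are selected with machine learning, and no such tuning is attempted here.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source block, target block)


def layout_score(
    order: List[str],
    sizes: Dict[str, int],
    counts: Dict[Edge, int],
    window: int = 64,          # hypothetical: max distance (bytes) at which a jump earns credit
    jump_weight: float = 0.1,  # hypothetical: credit of a zero-length jump relative to a fall-through
) -> float:
    """Score a block layout; higher is better.

    A fall-through (target starts exactly where the source ends) earns the
    full branch count. Any other jump shorter than `window` bytes earns a
    partial credit that decays linearly with distance. Setting
    `jump_weight` to 0 reduces this to a plain fall-through count.
    """
    # Compute the start address of every block in the given order.
    addr: Dict[str, int] = {}
    offset = 0
    for block in order:
        addr[block] = offset
        offset += sizes[block]

    score = 0.0
    for (src, dst), count in counts.items():
        src_end = addr[src] + sizes[src]
        dist = abs(addr[dst] - src_end)
        if addr[dst] == src_end:
            score += count  # fall-through
        elif dist < window:
            score += count * jump_weight * (1.0 - dist / window)
    return score


# Toy profile: A branches to B (hot) and C (warm); X is a large cold block.
sizes = {"A": 16, "B": 16, "C": 16, "X": 128}
counts = {("A", "B"): 100, ("A", "C"): 50}

near = ["A", "B", "C", "X"]  # C stays close to A
far = ["A", "B", "X", "C"]   # the cold block pushes C far away

fallthrough_near = layout_score(near, sizes, counts, jump_weight=0.0)
fallthrough_far = layout_score(far, sizes, counts, jump_weight=0.0)
combined_near = layout_score(near, sizes, counts)
combined_far = layout_score(far, sizes, counts)

print(fallthrough_near, fallthrough_far)
print(combined_near, combined_far)
```

In this toy profile both layouts have the same fall-through count, so a fall-through-only objective cannot tell them apart, while the combined score prefers the layout that keeps the warm target `C` close to its predecessor. This is only meant to illustrate why the two objectives can disagree.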
