Modern Multicore CPUs are not Energy Proportional: Opportunity for Bi-objective Optimization for Performance and Energy

10/15/2019
by   Semyon Khokhriakov, et al.
0

Energy proportionality is the key design goal followed by architects of modern multicore CPUs. One of its implications is that optimization of an application for performance will also optimize it for energy. In this work, we show that energy proportionality does not hold true for multicore CPUs. This finding creates the opportunity for bi-objective optimization of applications for performance and energy. We propose and study the first application-level method for bi-objective optimization of multithreaded data-parallel applications for performance and energy. The method uses two decision variables, the number of identical multithreaded kernels (threadgroups) executing the application and the number of threads in each threadgroup, with the workload always partitioned equally between the threadgroups. We experimentally demonstrate the efficiency of the method using four highly optimized multithreaded data-parallel applications, 2D fast Fourier transform based on FFTW and Intel MKL, and dense matrix-matrix multiplication using OpenBLAS and Intel MKL. Four modern multicore CPUs are used in the experiments. The experiments show that optimization for performance alone results in the increase in dynamic energy consumption by up to 89 dynamic energy alone degrades the performance by up to 49 bi-objective optimization problem, the method determines up to 11 globally Pareto-optimal solutions. Finally, we propose a qualitative dynamic energy model employing performance monitoring counters as parameters, which we use to explain the discovered energy nonproportionality and the Pareto-optimal solutions determined by our method. The model shows that the energy nonproportionality in our case is due to the activity of the data translation lookaside buffer (dTLB), which is disproportionately energy expensive.

READ FULL TEXT

page 7

page 10

page 11

page 12

page 13

page 18

page 19

page 20

research
04/12/2022

Coevolutionary Pareto Diversity Optimization

Computing diverse sets of high quality solutions for a given optimizatio...
research
01/20/2022

The Energy-Delay Pareto Front in Cache-enabled Integrated Access and Backhaul mmWave HetNets

In this paper, to address backhaul capacity bottleneck and concurrently ...
research
08/16/2018

Novel Model-based Methods for Performance Optimization of Multithreaded 2D Discrete Fourier Transform on Multicore Processors

In this paper, we use multithreaded fast Fourier transforms provided in ...
research
09/06/2022

Novel bi-objective optimization algorithms minimizing the max and sum of vectors of functions

We study a bi-objective optimization problem, which for a given positive...
research
01/05/2022

Dynamic GPU Energy Optimization for Machine Learning Training Workloads

GPUs are widely used to accelerate the training of machine learning work...
research
07/14/2021

A New Parallel Algorithm for Sinkhorn Word-Movers Distance and Its Performance on PIUMA and Xeon CPU

The Word Movers Distance (WMD) measures the semantic dissimilarity betwe...

Please sign up or login with your details

Forgot password? Click here to reset