1 Introduction
To perform big data analysis or provide Internet services, more and more organizations are building internal datacenters or renting hosted ones. As a result, DC (datacenter) computing has become a new paradigm of computing. The market share of DC appears to have outweighed that of HPC (High Performance Computing), with HPC accounting for only about 20% of the total [1].
To measure the performance of DC, the wall clock time is used as the ground truth metric. In practice, several user-perceived metrics, derivatives of the wall clock time, are used to measure application-specific systems, such as transactions per minute for online transaction systems [2] and input data processed per second for big data analysis systems [25]. However, these user-perceived metrics have two limitations. First, different user-perceived metrics cannot be used for apples-to-apples comparisons. For example, transactions per minute (TPM) and data processing capability per second (GB/s) cannot be compared directly. Second, user-perceived metrics can hardly measure the upper bound performance of computer systems, which is the foundation of the performance model. In contrast, a single computation-centric metric like FLOPS is simple but powerful in the system and architecture community. The values of such a metric can be obtained from the micro-architecture of the system, from a dedicated micro-benchmark, or from a real-world workload. By combining these different values, we can build an upper bound model, which allows us to better understand the performance ceiling of a computer system and then guide the system co-design. As the most important computation-centric metric, FLOPS (FLoating-point Operations Per Second) [11] and its upper bound model, the Roofline model, have driven the progress of computing technology, not limited to high performance computing (HPC), for many years. So a natural question arises: what is the metric for DC, and is it still FLOPS?
Different from HPC, DC has many unique characteristics. For typical DC workloads, the average floating point instruction ratio is only 1% and the average FLOPS efficiency is only 0.1%, while the average IPC (Instructions Per Cycle) is 1.1 (the theoretical IPC of the experimental platform is 4). We also found that the FLOPS gap between two systems equipped with Intel Xeon and Intel Atom processors is 12X, but the average user-perceived performance gap of the DC workloads is only 7.4X. These observations imply that FLOPS is inappropriate for measuring DC systems, from the perspectives of both the performance gaps between different systems and the system efficiency. OPS (operations per second) is another computation-centric metric. OPS [27] was initially proposed for digital processing systems and is defined as the number of 16-bit addition operations per second. The definition of OPS has been extended to artificial intelligence processors, such as Google's Tensor Processing Unit [36, 23] and the Cambricon processor [24, 17]. All of them are defined in terms of a specific operation, such as specific matrix multiplication operations. However, these matrix operations are only a fraction of the diverse operations in DC workloads. So OPS is only suitable for specific accelerators, not generalized DC computing systems. In this paper, inspired by FLOPS [11], Basic OPerations per Second (BOPS for short) is proposed to evaluate DC computing systems. The contributions of this paper are as follows.
First, we propose BOPs (Basic OPerations), which include the integer and floating point computations of arithmetic, logical, comparing and array addressing operations. We define BOPS as the average number of BOPs completed per second. BOPs can be calculated at the source code level of the application, independently of the underlying system implementation, so it is fair for evaluating different computing systems and facilitates the co-design of systems and architectures. We also take three systems equipped with three typical Intel processors as examples to illustrate that BOPS not only truly reflects the performance gaps across different DC systems, but also reflects the system efficiency of a DC system. The bias between the BOPS gap and the average user-perceived performance gap is no more than 11%, and the BOPS efficiency of the Sort workload reaches 32%.
Second, by using several typical micro-benchmarks, we measure the upper bound performance under different optimization methods, and then we propose a BOPS-based Roofline model, named DC-Roofline. DC-Roofline not only depicts the performance ceilings of DC workloads on the target systems, but also helps guide the optimization of DC computing systems. For Sort, a typical DC kernel workload, the performance improvement reaches 4.4X.
Third, a real-world DC workload often has millions of lines of code and tens of thousands of functions, so it is not easy to use the DC-Roofline model directly. We propose a new optimization methodology: we profile the hotspot functions of the real-world workload and extract the corresponding kernel workloads; the real-world application then gains a performance benefit by merging the optimization methods of the kernel workloads, which are obtained under the guidance of DC-Roofline. Through experiments, we demonstrate that Redis, a typical real-world workload, gains a 1.2X performance improvement.
The remainder of the paper is organized as follows. Section 2 summarizes the related work. Section 3 states the background and motivations. Section 4 defines BOPs and reports how to use it. Section 5 introduces the BOPS-based DC-Roofline model and its usage. Section 6 presents how to optimize the performance of real-world DC workloads under the DC-Roofline model. Section 7 draws the conclusion.
2 Related Work
In this section, the related work of this paper is introduced from two perspectives: metrics, and performance models.
2.1 Metrics
The wall clock time is the most basic performance metric for a computer system [39], and almost all other performance metrics are derived from it. Based on the wall clock time, performance metrics can be classified into two categories. One is the user-perceived metrics, which can be intuitively perceived by the user, such as the TPM (transactions per minute) metric. The other is the computation-centric metrics, which are related to specific computation operations, such as FLOPS (FLoating-point Operations Per Second).
User-perceived metrics can be further classified into two categories: metrics for the whole system and metrics for components of the system. Examples of the former include the data sorted in one minute (MinuteSort), which measures the sorting capability of a system [5], and transactions per minute (TPM) for online transaction systems [2]. Examples of the latter include SPECspeed/SPECrate for the CPU component [3], input/output operations per second (IOPS) for the storage component [14], and the data transfer latency for the network component [13].
There are many computation-centric metrics. FLOPS (FLoating-point Operations Per Second) is a computation-centric metric for measuring computer systems, especially in fields of scientific computing that make heavy use of floating-point calculations [11]. The wide recognition of FLOPS indicates the maturation of high performance computing. MIPS (Million Instructions Per Second) [22] is another famous computation-centric metric, defined as the number of millions of instructions the processor can execute per second. The main limitation of MIPS is that it is architecture-dependent. There are many derivatives of MIPS, including MWIPS and DMIPS [28], which use synthetic workloads to evaluate floating point operations and integer operations, respectively. The WSMeter metric [37], which is defined as the quota-weighted sum of the MIPS (IPC) of a job, is also a derivative of MIPS, and hence is also architecture-dependent. Unfortunately, modern datacenters are heterogeneous, consisting of different generations of hardware.
OPS (operations per second) is another computation-centric metric. OPS [27] was initially proposed for digital processing systems and is defined as the number of 16-bit addition operations per second. The definition of OPS has since been extended to Intel's Ubiquitous High-Performance Computing [35] and to artificial intelligence processors, such as Google's Tensor Processing Unit [36, 23] and the Cambricon processor [24, 17]. All of these definitions are in terms of a specific operation. For example, OPS is defined in terms of 8-bit matrix multiplication operations for the TPU and 16-bit integer operations for the Cambricon processor. However, the workloads of modern DCs are comprehensive and complex, and a bias toward a specific operation cannot ensure evaluation fairness.
For each kind of metric, corresponding tools or benchmarks [39] are proposed to calculate its values. For the user-perceived SPECspeed/SPECrate metrics, SPEC CPU [3] is the benchmark suite that measures the CPU component. For the TPM metric, TPC-C [2] is the benchmark suite that measures online transaction systems. Another example is the Sort benchmark [5] for MinuteSort. For computation-centric metrics, Whetstone [21] and Dhrystone [32] are the measurement tools for MWIPS and DMIPS, respectively. HPL [11] is a widely-used measurement tool for FLOPS. As a micro-benchmark, HPL demonstrates a sophisticated design: the proportion of floating-point additions to multiplications in HPL is 1:1, so as to fully utilize the FPU of modern processors.
2.2 Performance model
A performance model can depict and predict the performance of a specific system. There are two categories of performance models: the analytical model, which uses stochastic/statistical analytical methods to depict and predict system performance, and the bound model, which is relatively simpler and only depicts the performance bound or bottleneck of a system. Previous works [34, 29, 9, 37] use stochastic/statistical analytical models to predict system performance. However, distributed and parallel systems always have many uncertain behaviors, so it is hard to build accurate prediction models for them; instead, bound and bottleneck analysis is more suitable. Amdahl's Law [7] is one of the famous performance bound models for parallel processing computer systems, and it has also been used for big data systems [16]. The Roofline model [33] is another famous performance bound model. The original Roofline model adopts FLOPS as the metric. Based on the definition of operational intensity (OI), i.e., the total number of floating point operations divided by the total number of bytes of memory access, the Roofline model can depict the upper bound performance of a given workload when different optimization strategies are applied on the target system.
3 Background and Motivations
3.1 Background
The diversity and complexity of modern DC workloads raise great challenges for quantitatively depicting the performance of a computer system across multiple domains: algorithm, programming, compiling, system development, and architecture design. The computation-centric metric and the performance model are two key elements for quantitatively depicting the performance of a system [39, 33].
3.1.1 The Computation-centric Metric
For the system and architecture community, a computation-centric metric, such as FLOPS, is a fundamental yardstick to reflect the running performance and the gaps across different systems or architectures. Generally, a computation-centric metric has a performance upper bound on a specific architecture according to the micro-architecture design. For example, the peak FLOPS is computed as follows.
$$FLOPS_{peak} = CPU\_number \times Core\_number \times Core\_frequency \times FLOPs\_per\_cycle \qquad (1)$$
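For instance, for the Intel Xeon E5645 platform used later in this paper (1 CPU, 6 cores, 2.4 GHz), and assuming 4 double-precision floating-point operations per cycle per core, which is consistent with the 57.6 GFLOPS peak reported in Section 4:
$$FLOPS_{peak} = 1 \times 6 \times 2.4\,\mathrm{GHz} \times 4 = 57.6\ \mathrm{GFLOPS}$$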
A measurement tool is used to measure the performance of systems and architectures in terms of the metric, and to report the gap between the real value and the theoretical peak. For example, HPL [11] is a widely-used measurement tool in terms of FLOPS. The FLOPS efficiency of a specific system is the ratio of HPL's FLOPS to the peak FLOPS.
3.1.2 The Upper Bound Performance Model
The computation-centric metric is the foundation of the system performance model, and the performance model of a computer system can depict and predict the workload performance on the specific system. For example, the Roofline model [33] is a famous upper bound model based on FLOPS. Many system optimization works [19, 15] have been performed on the basis of the Roofline model in the HPC domain.
$$Attainable\ FLOPS = \min(FLOPS_{peak},\ BW_{peak} \times OI) \qquad (2)$$
The above equation indicates that the attainable performance bound of a workload on a specific platform is limited by the computing capacity of the processor and the bandwidth of the memory. $FLOPS_{peak}$ and $BW_{peak}$ are the peak computation performance and the peak memory bandwidth of the platform, and the operation intensity (i.e., $OI$) is the total number of floating point operations divided by the total number of bytes of memory access. To identify the bottleneck and guide the optimization, ceilings (for example, the ILP and SIMD optimizations) can be added to provide performance tuning guidance.
3.2 Requirements of the DC computing metric
We define the requirements from the following three perspectives.
First, the metric should reflect the performance gaps among different DC systems. User-perceived metrics always reflect the running performance. For example, data processed per second (GB/s), which divides the input data size by the total running time, is a user-perceived metric and effectively reflects the data processing capability. The computation-centric metric should preserve this characteristic and reflect the performance gap.
Second, the metric should facilitate hardware and software co-design of DC systems. For the co-design of different layers of the system stack, i.e., application, software system and hardware system, the metric should support measurement at different levels, spanning the source code, binary code, and hardware instruction levels.
Third, the metric should reflect the upper bound performance of a specific system. The metric should be sensitive to design decisions and reflect the theoretical performance upper bound. The gap between the real and theoretical values is then useful for understanding performance bottlenecks and guiding optimizations.
3.3 The characteristics of DC Workloads
In this subsection, the characteristics of DC workloads and traditional benchmarks are depicted. We choose the DCMIX benchmark suite as the DC workloads. DCMIX is designed for modern datacenter computing systems; it has 17 typical datacenter workloads and includes both online service and data analysis workloads. The latencies of DCMIX workloads range from microseconds to minutes, and the applications involve big data, artificial intelligence, transaction processing databases, and so on. As shown in Table 1, there are two types of benchmarks in DCMIX: Micro-benchmarks (kernel workloads) and Component benchmarks (real DC workloads).
Table 1: The DCMIX workloads.
Name  Type  Domain  Category 
Sort  offline analytics  Big Data  MicroBench 
Count  offline analytics  Big Data  MicroBench 
MD5  offline analytics  Big Data  MicroBench 
Multiply  offline analytics  AI  MicroBench 
FFT  offline analytics  AI  MicroBench 
Union  offline analytics  OLTP  MicroBench 
Redis  online service  Big Data  Component 
Xapian  online service  Big Data  Component 
Masstree  online service  Big Data  Component 
Bayes  offline analytics  Big Data  Component 
Imgdnn  online service  AI  Component 
Moses  online service  AI  Component 
Sphinx  online service  AI  Component 
Alexnet  offline analytics  AI  Component 
Silo  online service  OLTP  Component 
Shore  online service  OLTP  Component 
For traditional benchmarks, we choose HPCC, PARSEC, and SPEC CPU. We use HPCC 1.4, a representative HPC benchmark suite, and run all seven of its benchmarks. PARSEC is a benchmark suite composed of multi-threaded programs, and we deploy PARSEC 3.0. For SPEC CPU2006, we run the official floating-point benchmarks (SPECFP) with the first reference inputs.
The experimental platform is the same as that in Section 4, and we choose the Intel Xeon E5645, a typical brawny-core processor (out-of-order execution, four-wide instruction issue).
We choose GIPS (Giga Instructions Per Second) and GFLOPS (Giga FLoating-point Operations Per Second) as the performance metrics, both derived from the wall clock time. Corresponding to the performance metrics, we choose IPC, CPU utilization and memory bandwidth utilization as the efficiency metrics.
As shown in Fig. 1 (note that the Y axis is logarithmic), the average GFLOPS of DC workloads is two orders of magnitude lower than that of the traditional benchmarks, while the GIPS of DC workloads is of the same order of magnitude as the traditional benchmarks. Furthermore, the average IPC of DC workloads is 1.1 versus 1.4 for the traditional benchmarks, and the average CPU utilization of DC workloads is 70% versus 80% for the traditional benchmarks. These metrics imply that DC workloads utilize system resources about as efficiently as the traditional benchmarks, so the poor FLOPS efficiency does not stem from low execution efficiency. In fact, the floating point operation intensity of DC workloads (0.05 on average) is much lower, which leads to the low FLOPS efficiency.
In order to analyze the execution characteristics of DC workloads, we further examine the instruction mixture. Fig. 2 shows the retired instruction breakdown, from which we make three observations. First, for DC workloads, the ratio of integer to floating point operations is 38, while the ratios for HPCC, PARSEC and SPECFP are 0.3, 0.4, and 0.02, respectively. This is the main reason why FLOPS does not work in DC computing. Second, DC workloads have more branch instructions, with a ratio of 19%, while the ratios of HPCC, PARSEC and SPECFP are 16%, 11%, and 9%, respectively. Third, the data-movement-related operations, whose ratio is roughly 73%, include load, store, and address calculation (the address calculation instructions come from the floating point and integer instructions). Adding branch instructions, the combined ratio of data-movement-related operations and branch instructions reaches 92%.
3.4 What should be included in our new metric for DC computing
The FLOPS of DC workloads is only 0.04 GFLOPS on average (only 0.1% of the peak), which implies that the FLOPS value is far from reflecting the real efficiency of DC systems. Furthermore, if the Roofline model is used in terms of FLOPS, the achieved performance is at most 0.1% of the peak FLOPS, so we need a new metric for DC computing systems.
The new metric should consider both integer and floating-point operations. However, floating point and integer instructions are diverse, and not all of them should be counted. Our fundamental principle is that we only consider the efficient work defined by the source code, and the metric should correspond with the characteristics of DC workloads, so we will not include every floating point and integer operation. Following the rule of choosing a representative minimum subset of the operations in DC workloads, BOPs should be calculated through analyzing the source code (architecture-independent). We analyze the floating point and integer instruction breakdown of the DCMIX micro-benchmarks by inserting analysis code into the source code, and we classify all operations into four classes. The first class is array addressing computations (related to data movement operations), such as loading or storing the value of an array element A[i]; the second class is arithmetic or logical operations, such as a + b; the third class is comparing operations (related to conditional branches), such as i < j; and the fourth class is all the other operations. We find that 47% of the total floating point and integer instructions belong to address calculation, 22% belong to branch-related calculation, and 30% belong to arithmetic or logical operations. The first three classes reflect the efficient work defined by the source code, so we include them in BOPs.
4 BOPs
BOPS (Basic OPerations per Second) is the average number of BOPs (Basic OPerations) completed per second by a specific workload. In this section, we present the definition of BOPs and how to measure BOPs with or without available source code, and then introduce how to use BOPS.
4.1 BOPs Definition
In our definition, BOPs include the integer and floating point computations of arithmetic, logical, comparing and array addressing operations. The arithmetic or logical, comparing, and array addressing operations correspond to calculation, conditional-branch-related, and data-movement-related operations, respectively. The detailed operations of BOPs are shown in Table 2. Each operation in Table 2 is counted as 1 except for N-dimensional array addressing, and all operations are normalized to 64-bit operations. For arithmetic or logical operations, the number of BOPs is counted as the number of corresponding arithmetic or logical operations. For array addressing operations, we take a one-dimensional array as the example: loading the value of A[i] implies the addition of an offset to the base address of A, so the number of BOPs increases by one; the same reasoning applies to multi-dimensional arrays. For comparing operations, we transform them into subtraction operations; for example, we transform i < j into the subtraction i - j, so the number of BOPs increases by one. Through the definition of BOPs, we can see that, in comparison with FLOPS, BOPS concerns not only the floating-point operations but also the integer operations. On the other hand, like FLOPs, BOPs normalizes all operations into 64-bit operations, and each operation is counted as 1.
Table 2: The operations included in BOPs and their normalized values.
Operations  Normalized value 
Add  1 
Subtract  1 
Multiply  1 
Divide  1 
Bitwise operation  1 
Logic operation  1 
Compare operation  1 
One-dimensional array addressing  1 
N-dimensional array addressing  N 
The delays of different operations are not considered in the normalized calculation of BOPs, because the delays can be extremely different across diverse micro-architecture platforms. For example, the delay of a division on the Intel Xeon E5645 processor is about 7-12 cycles, while on the Intel Atom D510 processor the delay can reach up to 38 cycles [6]. Hence, considering delays in the normalization would make the calculation architecture-dependent.
4.2 How to Measure BOPs
BOPs can be measured at either the source code level or the instruction level.
4.2.1 Source-code level measurement
We can calculate BOPs from the source code of a workload. This method needs some manual work (inserting the counting code), but it is independent of the underlying system implementation, so it is fair for evaluating and comparing different system and architecture implementations. In the example sketched below, no BOPs are counted in lines 1 and 2, because they are variable declarations. Line 3 consists of a loop command and two integer operations, and the number of corresponding BOPs is (1+1) * 100 = 200 for the integer operations, while the loop command itself is not counted. Line 5 consists of array addressing operations and addition operations: the array addressing operations are counted as 100 * 1 and the addition operations as 100 * 1. So the total number of BOPs in the example program is 200 + 200 = 400.
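The original example program is not reproduced here; the following minimal C++ sketch is consistent with the description above, with the statements numbered in the comments so that the references to lines 1, 2, 3 and 5 can be followed:

```cpp
int count_example() {
    int a[100] = {0};              // line 1: declaration, no BOPs counted
    int sum = 0;                   // line 2: declaration, no BOPs counted
    for (int i = 0; i < 100; i++)  // line 3: i < 100 and i++ are two integer
    {                              //         operations: (1+1) * 100 = 200 BOPs
        sum = sum + a[i];          // line 5: a[i] addressing (100 * 1) plus the
    }                              //         addition (100 * 1) = 200 BOPs
    return sum;                    // total BOPs for this fragment: 200 + 200 = 400
}
```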
To measure BOPs at the source code level, we need to insert the counting code. For the BOPs count we turn on the debug flag, and for the performance test we turn off the debug flag. In the example code, cmp_count is the counter for the comparing operations, adr_count for the array addressing operations, and ari_count for the arithmetic operations; BOPs is the sum of these three counters.
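A possible instrumentation of the sketch above is shown below (our own illustration of the approach; the counter names follow the text, while the BOPS_COUNT debug flag is an assumed name):

```cpp
#ifdef BOPS_COUNT
long long cmp_count = 0, adr_count = 0, ari_count = 0;  // BOPs counters
#endif

int count_example_instrumented(const int a[100]) {
    int sum = 0;
    for (int i = 0; i < 100; i++) {
#ifdef BOPS_COUNT
        cmp_count++;   // comparing operation: i < 100
        ari_count++;   // arithmetic operation: i++
        adr_count++;   // array addressing: a[i]
        ari_count++;   // arithmetic operation: sum + a[i]
#endif
        sum = sum + a[i];
    }
    // BOPs = cmp_count + adr_count + ari_count; with the flag turned off,
    // the counting code disappears and the performance run is unaffected.
    return sum;
}
```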
Another issue we need to take into consideration is the system built-in library functions. For system-level functions such as strcmp(), we implement equivalent user-level functions manually and then count the number of BOPs by inserting the counting code. For the micro-benchmark workloads, we can insert the counting code easily by analyzing the source code. For the component benchmarks or real applications, we first profile the execution time of the real DC workload and find out the top hotspot functions; second, we analyze these hotspot functions and insert the counting code into them; then we can count the BOPs of the real DC workload (more details are in Section 6).
4.2.2 Instruction-level measurement under the X86_64 architecture
The source-code level measurement requires analyzing the source code, which is costly, especially for complex system stacks (e.g., Hadoop software stacks). Instruction-level measurement avoids this analysis cost and the requirement of having the source code, but it is architecture-dependent. We propose an instruction-level approach that obtains BOPs from hardware performance counters. Since different types of processors provide different performance counter events, for convenience we introduce an approximate but simple instruction-level measurement method for the X86_64 architecture: we obtain the total number of instructions (ins), branch instructions (branch_ins), load instructions (load_ins) and store instructions (store_ins) from the hardware performance counters, and BOPs can then be calculated according to the following equation.
$$BOPs \approx ins - branch\_ins - load\_ins - store\_ins \qquad (3)$$
Note that this approximate method counts all of the floating point and integer instructions under the X86_64 architecture, which does not exactly conform to the BOPs definition. So it is only suitable for BOPS-based optimizations, not for performance evaluation across different computer systems.
4.3 How to Measure the system with BOPS
4.3.1 The Peak BOPS of the System
BOPS is the average number of BOPs completed per second by a specific workload. The peak BOPS can be calculated from the micro-architecture parameters with the following equation.
$$BOPS_{peak} = CPU\_number \times Core\_number \times Core\_frequency \times BOPs\_per\_cycle \qquad (4)$$
For our Intel Xeon E5645 experimental platform, the CPU number is 1, the core number is 6, the core frequency is 2.4 GHz, and BOPs per cycle is 6 (the E5645 is equipped with two 128-bit SSE FPUs and three 128-bit SSE ALUs, and according to the execution port design it can execute three 128-bit operations, i.e., six 64-bit operations, per cycle). So $BOPS_{peak} = 1 \times 6 \times 2.4\,\mathrm{GHz} \times 6 = 86.4$ GBOPS.
4.3.2 The BOPS Measuring Tool
We provide a BOPS measurement tool to obtain the BOPS value. At present, we choose Sort from DCMIX as the first BOPS measurement tool; to deal with the diversity of DC workloads, we will develop a series of representative workloads as BOPS measurement tools. The scale of the Sort workload is 8E8 records, and its BOPs count is 324E9. Please note that the BOPs value changes as the data scale or request number changes.
4.3.3 Measure the System with BOPS
The measurement tool can be used to measure the real performance of the workload on the specific system. Furthermore, the BOPS efficiency can be calculated by the following equation.
$$BOPS\_efficiency = \frac{BOPS_{real}}{BOPS_{peak}} \qquad (5)$$
For example, Sort has 324E9 BOPs. We run Sort on the Xeon E5645 platform and the execution time is 11.5 seconds, so $BOPS_{real} = 324 \times 10^9 / 11.5 \approx 28.2$ GBOPS. For the Xeon E5645 platform, the peak BOPS is 86.4 GBOPS and the real performance of Sort is 28.2 GBOPS, so the efficiency is 32%.
4.4 Evaluations
4.4.1 Experimental Platforms and workloads
We choose all micro-benchmarks of DCMIX as the DC workloads, and choose the wall clock time as the user-perceived performance metric, which we obtain by collecting the wall clock time of each workload on the specific system. Three systems equipped with three typical Intel processors are chosen as the experimental platforms: Intel Xeon E5310, Intel Xeon E5645 and Intel Atom D510. The former two are typical brawny-core processors (out-of-order execution, four-wide instruction issue), while the Intel Atom D510 is a typical wimpy-core processor (in-order execution, two-wide instruction issue). Each experimental platform consists of one node. The detailed settings of the platforms are shown in Table 3.
Table 3: Configurations of the experimental platforms.
Intel Xeon E5645: 6 cores @ 2.4 GHz; L1 D-Cache 6 × 32 KB; L1 I-Cache 6 × 32 KB; L2 Cache 6 × 256 KB; L3 Cache 12 MB 
Intel Xeon E5310: 4 cores @ 1.6 GHz; L1 D-Cache 4 × 32 KB; L1 I-Cache 4 × 32 KB; L2 Cache 2 × 4 MB; no L3 Cache 
Intel Atom D510: 2 cores @ 1.6 GHz; L1 D-Cache 2 × 24 KB; L1 I-Cache 2 × 32 KB; L2 Cache 2 × 512 KB; no L3 Cache 
4.4.2 Overview
As shown in Fig. 3, we take all micro-benchmarks of DCMIX as workloads and show their performance on the three different systems under the DC-Roofline model (more details of the DC-Roofline model are given in Section 5). From Fig. 3, we can see that all performance metrics are unified to BOPS, including the theoretical peak performance of each system (the peaks of E5645, E5310 and D510) and the workload performance on each specific system (such as Sort on the E5645, E5310 and D510 platforms). So we can not only analyze the performance gaps across different systems, but also analyze the efficiency of a specific system.
4.4.3 The Performance Gaps across Different Experimental Platforms
For the performance gap between E5310 and E5645, the BOPS gap is 2.3X (38.4 GBOPS vs. 86.4 GBOPS), the FLOPS gap is 2.3X (25.6 GFLOPS vs. 57.6 GFLOPS), and the gap of the average user-perceived performance metric (the wall clock time) is 2.1X. This implies that both the FLOPS and BOPS metrics can reflect the user-perceived performance gap (the bias is only 10%). However, for the performance gap between D510 and E5645, the FLOPS gap is 12X (4.8 GFLOPS vs. 57.6 GFLOPS), the BOPS gap is 6.7X (12.8 GBOPS vs. 86.4 GBOPS), and the gap of the average user-perceived performance metric is 7.4X. This implies that the FLOPS metric cannot reflect the user-perceived performance gap (12X vs. 7.4X, a bias of 62%), while BOPS can (6.7X vs. 7.4X, a bias of only 9%). Furthermore, for the performance gap between D510 and E5310, the FLOPS gap is 5.3X, the BOPS gap is 3X, and the gap of the average user-perceived performance metric is 3.4X. Again, the FLOPS metric cannot reflect the user-perceived performance gap (5.3X vs. 3.4X, a bias of 56%), but BOPS can (3X vs. 3.4X, a bias of only 11%).
This is because E5645/E5310 and D510 have totally different micro-architectures: E5645/E5310 are designed for high performance floating point computing (out-of-order execution, four-wide instruction issue), while D510 is a low-power microprocessor for mobile computing (in-order execution, two-wide instruction issue). So FLOPS cannot reflect the performance gaps of DC workloads across different micro-architecture platforms (Xeon vs. Atom).
4.4.4 The Efficiency of Experimental Platforms
We use the Sort workload as the measurement tool to evaluate the efficiency of the DC systems. The peak BOPS is obtained by Equation 4, the real BOPs values are obtained by the source-code level measurement, and the BOPS efficiency is obtained by Equation 5. In our experiments, the BOPS efficiencies of E5645, E5310 and D510 are 32%, 20% and 21%, respectively, while their average FLOPS efficiencies are 0.1%, 0.2%, and 0.1%, respectively. So the FLOPS value is far from reflecting the real efficiency of DC systems, while the BOPS value is more reasonable. Furthermore, as shown in Fig. 3, we can use BOPS to build the upper bound performance model for DC workloads.
4.5 Summary
As an effective metric for DC, BOPS not only truly reflects the performance gaps across different systems, but also reflects the efficiency of DC systems. All of these are foundations of quantitative analysis for DC systems.
5 DC-Roofline Model
In this section, we present the DC-Roofline model, which depicts the upper bound performance for a given DC workload on a specific system. Then, we introduce use cases of DC kernel workload optimization based on the DC-Roofline model.
5.1 DC-Roofline Definition
The DC-Roofline model is inspired by the Roofline model [33]. The definitions of the DC-Roofline model are described as follows.
Definition 1: The Peak Performance of DC
We choose $BOPS_{peak}$ (which can be obtained from Equation 4) as the peak performance metric of DC in the DC-Roofline model.
Definition 2: Operation Intensity of DC
Operation intensity (OI) in the DC-Roofline model is the ratio of BOPs to the memory traffic. The memory traffic ($M$) is the total number of bytes exchanged between the CPU and the memory. The operation intensity ($OI$) is obtained from the following equation.
$$OI = \frac{BOPs}{M} \qquad (6)$$
where the memory traffic ($M$) is calculated by multiplying the total number of memory accesses by 64 bytes, and the total number of memory accesses can be obtained from hardware performance counters. Section 4.2 describes how to calculate BOPs.
Definition 3: The Upper Bound Performance of DC. The attained performance bound of a given workload is depicted as follows.
$$Attainable\ BOPS = \min(BOPS_{peak},\ BW_{peak} \times OI) \qquad (7)$$
where $BW_{peak}$ is the peak memory bandwidth of the system.
5.2 Adding Ceilings for DC-Roofline
We add three ceilings, ILP, SIMD, and Prefetching, to specify the performance upper bounds under specific tuning settings. Among them, ILP and SIMD reflect computation limitations, while Prefetching reflects memory access limitations. We conduct the experiments on the Intel Xeon E5645 platform.
5.2.1 Prefetching Ceiling
We use the Stream benchmark as the measurement tool for the Prefetching ceiling. We improve the memory bandwidth by enabling the prefetching option in the system BIOS, which increases the peak memory bandwidth from 13.2 GB/s to 13.8 GB/s.
5.2.2 ILP and SIMD Ceilings
We add two calculation ceilings: SIMD and ILP. SIMD is a common method for HPC performance improvement, which performs the same operation on multiple data elements simultaneously; modern processors have at least 128-bit wide SIMD instructions (i.e., SSE, AVX, etc.). In the next subsection, we will show that SIMD also suits DC workloads. ILP efficiency can be described by the IPC efficiency (the peak IPC of the E5645 is 4). We add the ILP ceiling with an IPC of no more than 2 (according to our experiments, the IPC of all workloads is no more than 2), and the SIMD ceiling with the SIMD upper bound performance. We use the following equation to estimate the ILP and SIMD ceilings:
$$Ceiling = BOPS_{peak} \times E_{IPC} \times S_{SIMD} \qquad (8)$$
where $E_{IPC}$ is the IPC efficiency of the workload and $S_{SIMD}$ is the SIMD scale (for the E5645, the value is 1 under SIMD and 0.5 under SISD). Therefore, the ILP (Instruction-Level Parallelism) ceiling is $86.4 \times (2/4) = 43.2$ GBOPS when the IPC is 2. Based on the ILP ceiling, the SIMD ceiling is $43.2 \times 0.5 = 21.6$ GBOPS when SIMD is not used.
Definition 4: The Upper Bound Performance of DC Under Ceilings
The attained performance bound of a given workload under ceilings is described as follows.
$$Attainable\ BOPS_{ceiling} = \min(Ceiling_{compute},\ Ceiling_{memory} \times OI) \qquad (9)$$
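To make Definitions 2-4 concrete, the following minimal C++ sketch (our own illustration, not a tool from the paper) evaluates the DC-Roofline bound; the roof and ceiling values plugged in below are the Xeon E5645 numbers reported in this section, and the function and variable names are our own:

```cpp
#include <algorithm>
#include <cstdio>

// Equation 6: operation intensity is BOPs divided by memory traffic in bytes.
double operation_intensity(double bops, double mem_bytes) {
    return bops / mem_bytes;
}

// Equations 7 and 9: the attainable performance is bounded by the (possibly
// lowered) compute roof and by the (possibly lowered) memory roof times OI.
double attainable_gbops(double compute_roof_gbops, double mem_roof_gbps, double oi) {
    return std::min(compute_roof_gbops, mem_roof_gbps * oi);
}

int main() {
    double oi_sort = 2.2;  // OI of the optimized Sort workload (Section 5.4)
    // Full roof (Equation 7): 86.4 GBOPS peak, 13.2 GB/s peak bandwidth.
    std::printf("bound at peak roof: %.1f GBOPS\n",
                attainable_gbops(86.4, 13.2, oi_sort));
    // ILP ceiling of 43.2 GBOPS with prefetching-improved bandwidth (Equation 9).
    std::printf("bound under ILP ceiling: %.1f GBOPS\n",
                attainable_gbops(43.2, 13.8, oi_sort));
    return 0;
}
```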
5.3 Visualized DC-Roofline Model
The visualized DC-Roofline model on the Intel Xeon E5645 platform is shown in Fig. 4. The diagonal line represents the memory bandwidth bound, and the horizontal roof line shows the peak BOPS; the ridge point, where the diagonal and the horizontal roof meet, allows us to evaluate the performance of the system. Similar to the original Roofline model, ceilings, which imply the performance upper bounds under specific tuning settings, can be added to the DC-Roofline model. There are three ceilings (ILP, SIMD, and Prefetching) in the figure.
5.4 DC-Roofline Model Usage
Like the Roofline model, the DC-Roofline model is also suitable for kernel optimization. We illustrate how to use the DC-Roofline model to optimize the performance of DC kernel workloads on the Intel Xeon E5645 platform. For this platform, the peak BOPS is 86.4 GBOPS according to Equation 4, and the peak memory bandwidth is 13.2 GB/s as measured by the Stream benchmark [26]. The kernel workloads are the micro-benchmarks of DCMIX.
5.4.1 Optimizations under the DC-Roofline Model
Memory Bandwidth Optimization: we improve the memory bandwidth by enabling the prefetching option, which increases the peak memory bandwidth from 13.2 GB/s to 13.8 GB/s on the E5645 platform. Compiled Optimization: compiler optimization is the basic optimization for the calculation; we improve the calculation performance by adding the -O3 compiling option (i.e., gcc -O3). OI Optimization: for the same workload, a higher OI means better locality; we modify the programs of the workload to reduce data movement and increase the OI. SIMD Optimization: we apply the SIMD technique to the DC workloads and change the programs of the workload from SISD to SIMD through SSE rewriting.
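As an illustration of the SIMD Optimization step (a generic sketch of SSE rewriting, not the actual DCMIX source code), the fragment below converts a scalar integer addition loop into its SSE2 form, performing four 32-bit additions per instruction:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Scalar (SISD) version: one 32-bit addition per loop iteration.
void add_sisd(const int* a, const int* b, int* c, int n) {
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

// SIMD version: four 32-bit additions per iteration using 128-bit SSE registers.
void add_simd(const int* a, const int* b, int* c, int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(c + i), _mm_add_epi32(va, vb));
    }
    for (; i < n; i++)  // handle the remaining elements
        c[i] = a[i] + b[i];
}
```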
5.4.2 Optimizations for the Sort Workload
The Sort workload sorts an integer array of a specific scale, and the sorting algorithm is merge sort; the program is implemented in C++. Fig. 5 shows the optimization trajectory. In the first step, we perform the Memory Bandwidth Optimization, and BOPS increases from 6.4 GBOPS to 6.5 GBOPS. In the second step, we perform the Compiled Optimization, improving the performance to 6.8 GBOPS. In the original source code, data is loaded and processed on disk, and the OI of Sort is 1.4; in the third step, we perform the OI Optimization by revising the source code to load and process all data in memory, which increases the OI of Sort to 2.2 and its performance to 9.5 GBOPS. In the fourth step, we apply the SIMD technique and change Sort from SISD to SIMD through SSE rewriting. With the SSE Sort, the attained performance is 28.2 GBOPS, which is 32% of the peak BOPS. Under the guidance of the DC-Roofline model, the overall improvement is 4.4X.
5.4.3 The Optimization Summary
Finally, we take all workloads as a whole to show their performance improvements. For the Sort workload, we execute four optimizations (Memory Bandwidth Optimizations, Compiled Optimizations, OI Optimizations and SIMD Optimizations). For other workloads, we execute two basic optimizations (Memory Bandwidth Optimizations and Compiled Optimizations). As shown in Fig. 6, all workloads have achieved performance improvements ranging from 1.1X to 4.4X.
Moreover, we can observe the workload efficiency from the DC-Roofline model. As shown in Fig. 6, a workload that is closer to its ceiling has a higher efficiency. For example, the efficiency of Sort is 65% under the ILP ceiling, and that of MD5 is 66% under the SIMD ceiling. The efficiency is calculated as follows.
$$Workload\_efficiency = \frac{BOPS_{attained}}{Ceiling} \qquad (10)$$
5.5 Roofline Model vs. DC-Roofline Model
Fig. 7 shows the final results using the Roofline model (the left Y axis) and the DC-Roofline model (the right Y axis), respectively; note that the Y axis is logarithmic. From the figure, we can see that, using the Roofline model in terms of FLOPS, the achieved performance is at most 0.1% of the peak FLOPS. In comparison, the result using the DC-Roofline model reaches 32% of the peak BOPS. So the DC-Roofline model is better suited as the upper bound performance model for DC.
5.6 Summary
As the upper bound performance model for DC, DC-Roofline not only truly reflects the performance ceilings of the target DC system, but also helps to guide the optimization of DC workloads.
6 Optimizing the Real DC Workload under the DC-Roofline Model
As a real DC workload often has millions of lines of code and tens of thousands of functions, it is not easy to use the DC-Roofline model directly (the Roofline or DC-Roofline model is designed for kernel program optimization). In this section, we propose an optimization methodology for real DC workloads based on the DC-Roofline model. We take the Redis workload as an example to illustrate the methodology, and the experimental results show that the performance of Redis is improved by 1.2X under its guidance.
6.1 The Optimization Methodology for the Real DC Workloads
Fig. 8 demonstrates the optimization methodology for real DC workloads. First, we profile the execution time of the real DC workload and find out the top N hotspot functions. Second, we analyze these hotspot functions (merging functions with the same properties) and build K Kernels (K is less than or equal to N). As an independent workload, each Kernel's code is based on the source code of the real workload and implements a part of its functions (the specific hotspot functions). Third, we optimize these Kernels through the DC-Roofline model, respectively. Fourth, we merge the optimization methods of the Kernels and optimize the real DC workload.
6.2 The Optimization for the Redis Workload
Redis is a distributed, in-memory key-value database with durability. It supports different kinds of abstract data structures and is widely used in modern Internet services. Redis V4.0.2 has about 200,000 lines of code and thousands of functions.
6.2.1 The Experimental Methodology
Redis, version 4.0.2, is deployed in standalone mode. We choose Redis-Benchmark as the workload generator. For the Redis-Benchmark settings, the total request number is 10 million and 1,000 parallel clients are created to simulate a concurrent environment; the query operation of each client is the SET operation. We choose queries per second (QPS) as the user-perceived performance metric. The platform is the Intel Xeon E5645, the same as in Section 5.
6.2.2 The hotspot Functions of Redis
There are 20 functions that occupy 69% of the execution time. These functions can be classified into three categories. The first category is dictionary table management, such as dictFind(), dictSdsKeyCompare(), lookupKey(), and siphash(). The second category is memory management, such as malloc_usable_size(), malloc(), free(), zmalloc(), and zfree(). The last category is the encapsulation of system functions. The first two categories take 55% of the total execution time.
6.2.3 The Kernels of Redis
Based on the hotspot function analysis, we build two Kernels: one for memory management, called MMK, and the other for dictionary table management, called DTM. Each Kernel is constructed from the corresponding hotspot functions and can run as an independent workload. Please note that the Kernels and the Redis workload share the same client queries.
6.2.4 The Optimizations of DTM
According to the optimization methods of the DC-Roofline model (proposed in Section 5), we perform the related optimizations. As the prefetching option has already been enabled in Section 5, we execute the following three optimizations: Compiled Optimization, OI Optimization and SIMD Optimization. We perform the Compiled Optimization by adding the -O3 compiling option (gcc -O3); as shown in Table 4, it improves the OI of DTM from 1.5 to 3.5 and BOPS from 0.4 G to 3 G. In the DTM, the rapid increase of key-value pairs triggers reallocation of the dictionary table space, which brings a large data movement cost; we avoid this operation by pre-allocating a large table space, an optimization we call NO_REHASH (sketched after Table 4). Using NO_REHASH, we improve the OI of DTM from 3.5 to 4 and BOPS from 3 G to 3.1 G. Hash operations are the main operations in the DTM; we replace the default SipHash algorithm with the HighwayHash algorithm, which is implemented with SIMD instructions, an optimization we call SIMD_HASH. Using SIMD_HASH, we improve the OI of DTM from 4 to 4.7 and BOPS from 3.1 G to 3.7 G.
Table 4: Optimizations of DTM.
Type  OI  GBOPS 
Original Version  1.5  0.4 
Compiled Optimization  3.5  3 
OI Optimization  4  3.2 
SIMD Optimization  4.7  3.7 
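The NO_REHASH idea can be illustrated with the following sketch, which is our own illustration using a standard C++ container rather than Redis' dict implementation. Pre-allocating the bucket array for the expected number of key-value pairs avoids the repeated rehashing, and thus the data movement, that incremental growth would otherwise trigger:

```cpp
#include <string>
#include <unordered_map>

// Without reserve(), the table grows and rehashes repeatedly as entries are
// added, moving every stored element each time. Allocating the expected
// capacity up front avoids that data movement.
void build_table(std::unordered_map<std::string, std::string>& table,
                 std::size_t expected_entries) {
    table.reserve(expected_entries);  // NO_REHASH: allocate buckets once
    for (std::size_t i = 0; i < expected_entries; i++)
        table.emplace("key:" + std::to_string(i), "value");
}
```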
6.2.5 The Optimizations of MMK
According to the optimization methods of the DC-Roofline model, we perform two optimizations: Compiled Optimization and OI Optimization. We perform the Compiled Optimization by adding the -O3 compiling option (gcc -O3); as shown in Table 5, it improves the OI of MMK from 3.1 to 3.2 and BOPS from 2.2 G to 2.4 G. To reduce the data movement cost, we replace the default malloc implementation with the Jemalloc allocator in the MMK. Jemalloc is a general purpose malloc implementation that emphasizes fragmentation avoidance and scalable concurrency support; we call this optimization JE_MALLOC. Using JE_MALLOC, we improve the OI of MMK from 3.2 to 90 and BOPS from 2.4 G to 2.7 G.
Table 5: Optimizations of MMK.
Type  OI  GBOPS 
Original Version  3.1  2.2 
Compiled Optimization  3.2  2.4 
OI Optimization  90  2.7 
6.2.6 The Optimizations of Redis
Merging the above optimizations of DTM and MMK, we optimize the Redis workload. Fig. 9 shows the optimization trajectory: we execute the Compiled Optimization (adding the -O3 compiling option), the OI Optimization (NO_REHASH and JE_MALLOC), and the SIMD Optimization (SIMD_HASH) one by one. Please note that the peak performance of the system is 14.4 GBOPS in Fig. 9, because Redis is a single-threaded server and we deploy it on a single CPU core. As shown in Fig. 9, the OI of the Redis workload is improved from 2.9 to 3.8, and BOPS from 2.8 G to 3.4 G. Accordingly, the QPS of Redis is improved from 122,000 requests/s to 146,000 requests/s.
7 Conclusion
This paper proposes a new computation-centric metric, BOPS, to measure the efficiency of DC computing systems. The metric is independent of the underlying system and hardware implementations, and can be calculated through analyzing the source code.
With several typical micro-benchmarks, we attain the upper bound performance, and then we propose a BOPS-based Roofline model, named DC-Roofline, as a ceiling performance model to guide the design and optimization of DC computing systems.
As a real-world DC workload often has millions of lines of code and tens of thousands of functions, it is not easy to use the DC-Roofline model directly. We propose a new optimization methodology: we profile the hotspot functions of the real-world workload and extract the corresponding kernel workloads; the real-world application then gains a performance benefit by merging the optimization methods of the kernel workloads. Through experiments, we demonstrate that Redis, a typical real-world workload, gains a 1.2X performance improvement.
References
 [1] Data center growth. https://www.enterprisetech.com.
 [2] http://www.tpc.org/tpcc.
 [3] http://www.spec.org/cpu.
 [4] http://top500.org/.
 [5] “Sort benchmark home page,” http://sortbenchmark.org/.
 [6] “Technical report,” http://www.agner.org/optimize/microarchitecture.pdf.
 [7] G. M. Amdahl, “Validity of the single processor approach to achieving large scale computing capabilities,” pp. 483–485, 1967.
 [8] C. Bienia, S. Kumar, J. P. Singh, and K. Li, “The parsec benchmark suite: characterization and architectural implications,” pp. 72–81, 2008.
 [9] E. L. Boyd, W. Azeem, H. S. Lee, T. Shih, S. Hung, and E. S. Davidson, “A hierarchical approach to modeling and improving the performance of scientific applications on the ksr1,” vol. 3, pp. 188–192, 1994.
 [10] S. P. E. Corporation, “Specweb2005: Spec benchmark for evaluating the performance of world wide web servers,” http://www.spec.org/web2005/, 2005.
 [11] J. Dongarra, P. Luszczek, and A. Petitet, “The linpack benchmark: past, present and future,” Concurrency and Computation: Practice and Experience, vol. 15, no. 9, pp. 803–820, 2003.
 [12] W. Gao, L. Wang, J. Zhan, C. Luo, D. Zheng, Z. Jia, B. Xie, C. Zheng, Q. Yang, and H. Wang, “A dwarfbased scalable big data benchmarking methodology,” arXiv preprint arXiv:1711.03229, 2017.
 [13] Neal Cardwell, Stefan Savage, and Thomas E Anderson. Modeling tcp latency. 3:1742–1751, 2000.
 [14] William Josephson, Lars Ailo Bongo, Kai Li, and David Flynn. Dfs: A file system for virtualized flash storage. ACM Transactions on Storage, 6(3):14, 2010.
 [15] Samuel Williams, Dhiraj D Kalamkar, Amik Singh, Anand M Deshpande, Brian Van Straalen, Mikhail Smelyanskiy, Ann S Almgren, Pradeep Dubey, John Shalf, and Leonid Oliker. Optimization of geometric multigrid for emerging multi and manycore processors. page 96, 2012.
 [16] Daniel Richins, Tahrina Ahmed, Russell Clapp, and Vijay Janapa Reddi. Amdahl’s law in big data analytics: Alive and kicking in tpcxbb (bigbench). In High Performance Computer Architecture (HPCA), 2018 IEEE International Symposium on, pages 630–642. IEEE, 2018.

 [17] Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang, Liqiang He, Jia Wang, Ling Li, Tianshi Chen, Zhiwei Xu, Ninghui Sun, et al. Dadiannao: A machine-learning supercomputer. In International Symposium on Microarchitecture, pages 609–622, 2014.
 [18] Ethan R Mollick. Establishing Moore's law. IEEE Annals of the History of Computing, 28(3):62–75, 2006.
 [19] Shoaib Kamil, Cy P Chan, Leonid Oliker, John Shalf, and Samuel Williams. An autotuning framework for parallel multicore stencil computations. pages 1–12, 2010.
 [20] W. Gao, J. Zhan, L. Wang, C. Luo, D. Zheng, R. Ren, C. Zheng, G. Lu, J. Li, Z. Cao, Z. Shujie, and H. Tang, “Bigdatabench: A dwarfbased big data and artificial intelligence benchmark suite,” Technical Report, Institute of Computing Technology, Chinese Academy of Sciences, 2017.
 [21] S. Harbaugh and J. A. Forakis, “Timing studies using a synthetic whetstone benchmark,” ACM Sigada Ada Letters, no. 2, pp. 23–34, 1984.
 [22] R. Jain, The art of computer systems performance analysis. John Wiley & Sons Chichester, 1991, vol. 182.
 [23] N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, and A. Borchers, “Indatacenter performance analysis of a tensor processing unit,” pp. 1–12, 2017.

 [24] S. Liu, Z. Du, J. Tao, D. Han, T. Luo, Y. Xie, Y. Chen, and T. Chen, “Cambricon: An instruction set architecture for neural networks,” in International Symposium on Computer Architecture, 2016, pp. 393–405.
 [25] C. Luo, J. Zhan, Z. Jia, L. Wang, G. Lu, L. Zhang, C. Xu, and N. Sun, “Cloudrank-d: benchmarking and ranking cloud computing systems for data processing applications,” Frontiers of Computer Science, vol. 6, no. 4, pp. 347–362, 2012.
 [26] J. D. Mccalpin, “Stream: Sustainable memory bandwidth in high performance computers,” 1995.
 [27] M. Nakajima, H. Noda, K. Dosaka, K. Nakata, M. Higashida, O. Yamamoto, K. Mizumoto, H. Kondo, Y. Shimazu, K. Arimoto et al., “A 40gops 250mw massively parallel processor based on matrix architecture,” in SolidState Circuits Conference, 2006. ISSCC 2006. Digest of Technical Papers. IEEE International. IEEE, 2006, pp. 1616–1625.
 [28] U. Pesovic, Z. Jovanovic, S. Randjic, and D. Markovic, “Benchmarking performance and energy efficiency of microprocessors for wireless sensor network applications,” in MIPRO, 2012 Proceedings of the 35th International Convention. IEEE, 2012, pp. 743–747.

 [29] M. M. Tikir, L. Carrington, E. Strohmaier, and A. Snavely, “A genetic algorithms approach to modeling the performance of memory-bound computations,” pp. 1–12, 2007.
 [30] K. Ueno and T. Suzumura, “Highly scalable graph search for the graph500 benchmark,” in International Symposium on HighPerformance Parallel and Distributed Computing, 2012, pp. 149–160.
 [31] L. Wang, J. Zhan, C. Luo, Y. Zhu, Q. Yang, Y. He, W. Gao, Z. Jia, Y. Shi, S. Zhang, C. Zheng, G. Lu, K. Zhan, X. Li, and B. Qiu, “BigDataBench: a big data benchmark suite from internet services,” in HPCA 2014. IEEE, 2014.
 [32] R. Weicker, “Dhrystone: a synthetic systems programming benchmark,” Communications of The ACM, vol. 27, no. 10, pp. 1013–1030, 1984.
 [33] S. Williams, A. Waterman, and D. Patterson, “Roofline: an insightful visual performance model for multicore architectures,” Communications of the ACM, vol. 52, no. 4, pp. 65–76, 2009.
 [34] Thomasian, “Analytic Queueing Network Models for Parallel Processing of Task Systems,” IEEE Transactions on Computers, vol. 35, no. 12, pp. 1045–1054, 1986.
 [35] Nicholas P Carter, Aditya Agrawal, Shekhar Borkar, Romain Cledat, Howard David, Dave Dunning, Joshua B Fryman, Ivan Ganev, Roger A Golliver, Rob C Knauerhase, et al. Runnemede: An architecture for ubiquitous highperformance computing. pages 198–209, 2013.
 [36] Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. Tensorflow: a system for largescale machine learning. operating systems design and implementation, pages 265–283, 2016.
 [37] Jaewon Lee, Changkyu Kim, Kun Lin, Liqun Cheng, Rama Govindaraju, and Jangwoo Kim. Wsmeter: A performance evaluation methodology for google’s production warehousescale computers. In Proceedings of the TwentyThird International Conference on Architectural Support for Programming Languages and Operating Systems, pages 549–563. ACM, 2018.
 [38] Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Daoyi Zheng, Rui Ren, Chen Zheng, Gang Lu, Jingwei Li, Zheng Cao, et al. Bigdatabench: A dwarfbased big data and ai benchmark suite. arXiv preprint arXiv:1802.08254, 2018.
 [39] John L Hennessy and David A Patterson. Computer architecture: a quantitative approach. Elsevier, 2012.