BOPS, Not FLOPS! A New Metric, Measuring Tool, and Roofline Performance Model For Datacenter Computing

by   Lei Wang, et al.

The past decades have witnessed FLOPS (Floating-point Operations per Second) serve as an important computation-centric performance metric: it guides computer architecture evolution, bridges hardware and software co-design, and provides quantitative performance numbers for system optimization. However, for emerging datacenter computing (in short, DC) workloads, such as internet services and big data analytics, previous work reports that on modern CPU architectures the average proportion of floating-point instructions is only 1% and the average FLOPS efficiency is only 0.1%, while the average CPU utilization is high, up to 63%. These contradictory numbers imply that FLOPS is inappropriate for evaluating DC computer systems. To address this issue, we propose a new computation-centric metric, BOPS (Basic OPerations per Second). In our definition, basic operations include all arithmetic, logical, comparison, and array-addressing operations on integer and floating-point data. BOPS is the average number of BOPs (Basic OPerations) completed each second. We then present a dwarf-based measuring tool to evaluate DC computer systems in terms of the new metric. On the basis of BOPS, we also propose a new roofline performance model for DC computing. Through experiments, we demonstrate that the new metric BOPS, the measuring tool, and the new performance model indeed facilitate DC computer system design and optimization.
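To make the definitions concrete, here is a minimal sketch in Python of how BOPS and a BOPS-based roofline bound could be computed for a toy kernel. The per-element operation count, the peak/bandwidth figures, and all function names are illustrative assumptions, not the paper's actual measuring tool or normalization rules:

```python
import time

# Illustrative sketch only: operation counts and machine parameters
# below are assumptions, not the paper's tool or measured data.

def vector_add(a, b):
    """Each iteration performs roughly 1 loop comparison, 2 array-
    addressing operations, and 1 floating-point add, i.e. about
    4 BOPs per element under this simple counting."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

def measured_bops(n, elapsed, bops_per_element=4):
    """BOPS = total basic operations completed / elapsed seconds."""
    return n * bops_per_element / elapsed

def dc_roofline(peak_bops, mem_bandwidth, intensity):
    """BOPS-based roofline: attainable performance is bounded either
    by peak basic-operation throughput or by memory bandwidth times
    operation intensity (BOPs per byte)."""
    return min(peak_bops, mem_bandwidth * intensity)

n = 1_000_000
a, b = [1.0] * n, [2.0] * n
t0 = time.perf_counter()
vector_add(a, b)
elapsed = time.perf_counter() - t0

achieved = measured_bops(n, elapsed)
# Hypothetical machine: 100 GBOPS peak, 50 GB/s memory bandwidth.
# The kernel moves ~24 bytes (3 doubles) per ~4 BOPs.
bound = dc_roofline(100e9, 50e9, 4 / 24)
print(f"achieved ~{achieved:.3e} BOPS, roofline bound {bound:.3e} BOPS")
```

Under this counting, the toy kernel's operation intensity (about 0.17 BOPs per byte) places it in the bandwidth-bound region of the roofline, which is the kind of diagnosis the model is meant to support.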





