AVX-512 extension to OpenQCD 1.6

06/15/2018
by Ed Bennett, et al.

We publish an extension of openQCD-1.6 that adds AVX-512 vector instructions, implemented with Intel intrinsics. Recent Intel processors support extended instruction sets that operate on 512-bit wide vectors, increasing both the capacity for floating-point operations and the available register memory. Making optimal use of these capabilities requires reorganising data and floating-point operations to fit the wider vector units. We report on the implementation and performance of the AVX-512 openQCD extension on clusters using Intel Knights Landing and Xeon Scalable (Skylake) CPUs. In complete HMC trajectories with physically relevant parameters we observe a performance increase of 5
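To illustrate what reorganising data for 512-bit vectors can look like, here is a minimal sketch in C. It is not taken from the openQCD extension: the function name `caxpy_soa` and the split real/imaginary array layout are assumptions chosen for illustration. It computes z += a * x for complex numbers and processes eight doubles per AVX-512 instruction when the compiler targets AVX-512F, falling back to a scalar loop otherwise.

```c
#include <stddef.h>
#ifdef __AVX512F__
#include <immintrin.h>
#endif

/* Hypothetical helper, not part of openQCD.
 * Computes z[i] += a * x[i] for complex numbers whose real and imaginary
 * parts are stored in separate arrays (an assumed layout), so that eight
 * complex entries map onto the eight double lanes of a 512-bit register. */
void caxpy_soa(size_t n, double a_re, double a_im,
               const double *x_re, const double *x_im,
               double *z_re, double *z_im)
{
    size_t i = 0;

#ifdef __AVX512F__
    /* Broadcast the scalar coefficient into all eight lanes. */
    const __m512d are = _mm512_set1_pd(a_re);
    const __m512d aim = _mm512_set1_pd(a_im);

    for (; i + 8 <= n; i += 8) {
        __m512d xr = _mm512_loadu_pd(x_re + i);
        __m512d xi = _mm512_loadu_pd(x_im + i);
        __m512d zr = _mm512_loadu_pd(z_re + i);
        __m512d zi = _mm512_loadu_pd(z_im + i);

        /* Real part: z_re += a_re * x_re - a_im * x_im */
        zr = _mm512_fmadd_pd(are, xr, zr);
        zr = _mm512_fnmadd_pd(aim, xi, zr);

        /* Imaginary part: z_im += a_re * x_im + a_im * x_re */
        zi = _mm512_fmadd_pd(are, xi, zi);
        zi = _mm512_fmadd_pd(aim, xr, zi);

        _mm512_storeu_pd(z_re + i, zr);
        _mm512_storeu_pd(z_im + i, zi);
    }
#endif

    /* Scalar loop: handles the remainder, or all entries without AVX-512. */
    for (; i < n; i++) {
        z_re[i] += a_re * x_re[i] - a_im * x_im[i];
        z_im[i] += a_re * x_im[i] + a_im * x_re[i];
    }
}
```

With GCC or Clang the vector path is compiled in when AVX-512F is enabled (for example with `-mavx512f`); without it the same source builds and runs through the scalar loop. Keeping real and imaginary parts in separate arrays lets the complex multiplication use only lane-wise fused multiply-adds, with no shuffles inside the loop.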

research
03/29/2006

Implementation of float-float operators on graphics hardware

The Graphic Processing Unit (GPU) has evolved into a powerful and flexib...
research
02/24/2017

An analysis of core- and chip-level architectural features in four generations of Intel server processors

This paper presents a survey of architectural features among four genera...
research
06/15/2011

A Characterization of the SPARC T3-4 System

This technical report covers a set of experiments on the 64-core SPARC T...
research
06/19/2018

LazyFP: Leaking FPU Register State using Microarchitectural Side-Channels

Modern processors utilize an increasingly large register set to facilita...
research
01/13/2020

The Two-Pass Softmax Algorithm

The softmax (also called softargmax) function is widely used in machine ...
research
01/07/2022

A SIMD algorithm for the detection of epistatic interactions of any order

Epistasis is a phenomenon in which a phenotype outcome is determined by ...
research
05/04/2023

A Quantitative Analysis and Guideline of Data Streaming Accelerator in Intel 4th Gen Xeon Scalable Processors

As semiconductor power density is no longer constant with the technology...
