The ARM Scalable Vector Extension

by Nigel Stephens, et al.

This article describes the ARM Scalable Vector Extension (SVE). Several goals guided the design of the architecture. First was the need to extend the vector processing capability associated with the ARM AArch64 execution state to better address the computational requirements in domains such as high-performance computing, data analytics, computer vision, and machine learning. Second was the desire to introduce an extension that can scale across multiple implementations, both now and into the future, allowing CPU designers to choose the vector length most suitable for their power, performance, and area targets. Finally, the architecture should avoid imposing a software development cost as the vector length changes and, where possible, reduce that cost by improving the reach of compiler auto-vectorization technologies. SVE achieves these goals. It allows implementations to choose a vector register length between 128 and 2,048 bits. It supports a vector-length agnostic programming model that lets code run and scale automatically across all vector lengths without recompilation. In addition, it introduces several innovative features that begin to overcome some of the traditional barriers to auto-vectorization.
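The vector-length agnostic model described above can be illustrated with a small portable simulation. This is only a sketch in plain Python, not SVE code and not the authors' implementation: the function names `whilelt` and `daxpy_vla` are hypothetical, the lane width is assumed to be 64 bits, and the vector length is assumed to be a multiple of 128 bits. The point it tries to show is that the loop never hard-codes the number of lanes; a per-iteration predicate masks off lanes that fall past the end of the data, so the same code yields the same result at every vector length.

```python
def whilelt(base, n, lanes):
    """Build a predicate: lane i is active while base + i < n.

    Hypothetical helper, loosely modelled on a loop-bound predicate.
    """
    return [base + i < n for i in range(lanes)]


def daxpy_vla(a, x, y, vl_bits):
    """Compute a * x[i] + y[i] for every i, one simulated vector per step.

    vl_bits is the simulated vector length; 64-bit lanes are assumed.
    """
    if vl_bits % 128 != 0 or not 128 <= vl_bits <= 2048:
        raise ValueError("vector length must be a multiple of 128 in [128, 2048]")
    lanes = vl_bits // 64          # lanes per vector, discovered at run time
    n = len(x)
    out = list(y)
    base = 0
    while base < n:
        pred = whilelt(base, n, lanes)
        # Only active lanes are read and written, so no scalar tail loop is needed.
        for lane, active in enumerate(pred):
            if active:
                out[base + lane] = a * x[base + lane] + y[base + lane]
        base += lanes
    return out


# Run the same code at every simulated vector length from 128 to 2,048 bits.
x = [float(i) for i in range(10)]
y = [1.0] * 10
results = {vl: daxpy_vla(2.0, x, y, vl) for vl in range(128, 2049, 128)}
```

Every entry in `results` should hold the same list, regardless of how many lanes each simulated vector had. On real hardware the predicate and the lane count would come from the instruction set rather than from Python lists.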





