Adaptable Register File Organization for Vector Processors

11/09/2021
by Cristóbal Ramírez Lazo, et al.

Modern scientific applications are becoming more diverse, and the vector lengths in those applications vary widely. Contemporary Vector Processors (VPs) are designed either for short vector lengths, e.g., Fujitsu A64FX with 512-bit ARM SVE vector support, or for long vectors, e.g., NEC Aurora Tsubasa with a 16 Kbit Maximum Vector Length (MVL). Unfortunately, both approaches have drawbacks. On the one hand, short-vector VP designs struggle to provide high efficiency for applications featuring long vectors with high Data Level Parallelism (DLP). On the other hand, long-vector VP designs waste resources and underutilize the Vector Register File (VRF) when executing low-DLP applications with short vector lengths. Long-vector VP implementations are therefore limited to a specialized subset of applications, in which relatively high DLP must be present to achieve excellent performance with high efficiency. To overcome these limitations, we propose an Adaptable Vector Architecture (AVA) that offers the best of both worlds. AVA is designed for short vectors (MVL = 16 elements) and is thus area- and energy-efficient. However, AVA can reconfigure the MVL, which lets it exploit the benefits of a longer-vector microarchitecture (up to 128 elements) when abundant DLP is present. We model AVA on the gem5 simulator and evaluate its performance with six applications taken from the RiVEC Benchmark Suite. To obtain area and power consumption metrics, we model AVA on McPAT for 22 nm technology. Our results show that by reconfiguring our small VRF (8 KB) and using our novel issue queue scheme, AVA yields a 2x speedup over the default configuration for short vectors. Additionally, AVA shows competitive performance when compared to a long-vector VP, while saving 50% of the area.
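The trade-off behind reconfiguring the MVL can be illustrated with simple capacity arithmetic: for a fixed-size VRF, longer vector registers mean fewer of them fit. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The 8 KB VRF size and the MVL endpoints (16 and 128 elements) come from the abstract; the 64-bit element width, the intermediate MVL values (32 and 64), and the function names are assumptions made for this example.

```python
# Capacity arithmetic for a fixed-size vector register file (VRF).
# From the abstract: 8 KB VRF, MVL of 16 elements, reconfigurable up to 128.
# Assumptions for illustration: 64-bit elements; intermediate MVLs 32 and 64.

VRF_BYTES = 8 * 1024   # 8 KB VRF (stated in the abstract)
ELEMENT_BYTES = 8      # assumed 64-bit elements (not stated in the abstract)


def registers_for_mvl(mvl, vrf_bytes=VRF_BYTES, element_bytes=ELEMENT_BYTES):
    """Return how many full-length vector registers fit in the VRF."""
    if mvl <= 0:
        raise ValueError("mvl must be positive")
    return vrf_bytes // (mvl * element_bytes)


def capacity_table(mvls=(16, 32, 64, 128)):
    """Map each candidate MVL to the number of registers the VRF can hold."""
    return {mvl: registers_for_mvl(mvl) for mvl in mvls}


if __name__ == "__main__":
    for mvl, regs in capacity_table().items():
        print(f"MVL={mvl:3d} elements -> {regs:2d} registers in the VRF")
```

Under these assumptions, each doubling of the MVL halves the number of registers the same storage can hold, which is the tension a reconfigurable design has to manage when it trades register count for vector length.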


