High-Throughput Parallel Viterbi Decoder on GPU Tensor Cores

11/27/2020
by Alireza Mohammadidoost, et al.

Many research works have implemented the Viterbi decoding algorithm on GPU rather than FPGA, because the GPU platform offers considerable flexibility in addition to strong performance. The recently introduced Tensor cores in modern GPU architectures add substantial computing capability. This paper proposes a novel parallel implementation of the Viterbi decoding algorithm based on Tensor cores in modern GPU architectures. The proposed parallel algorithm is optimized to utilize the computing power of Tensor cores efficiently. Experiments show considerable throughput improvements over previous works.
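To see why Viterbi decoding can map onto matrix hardware at all, note that its add-compare-select (ACS) step is a matrix-vector product in the min-plus ("tropical") semiring. The sketch below illustrates that reformulation in NumPy; the 4-state trellis, branch metrics, and function names are illustrative assumptions, not the paper's actual kernel, which performs the analogous operation with Tensor core GEMMs.

```python
import numpy as np

INF = np.inf

def acs_step(path_metrics, M):
    """One add-compare-select step as a min-plus matrix-vector product.

    new[j] = min_i (old[i] + M[i, j]), where M[i, j] is the branch
    metric of trellis edge i -> j (INF if no such edge exists).
    Broadcasting the old metrics over columns and reducing over rows
    is exactly the tropical analogue of a matrix-vector multiply.
    """
    return np.min(path_metrics[:, None] + M, axis=0)

# Illustrative 4-state trellis stage (states 00, 01, 10, 11): only the
# edges that exist in a K=3, rate-1/2 convolutional code are finite.
M = np.array([
    [0.0, INF, 2.0, INF],   # from state 00
    [1.0, INF, 1.0, INF],   # from state 01
    [INF, 2.0, INF, 0.0],   # from state 10
    [INF, 1.0, INF, 1.0],   # from state 11
])

pm = np.array([0.0, INF, INF, INF])  # decoding starts in the all-zero state
pm = acs_step(pm, M)
print(pm)  # updated path metrics after one trellis stage
```

Because each stage is structurally a matrix product, many stages can be batched and tiled the same way ordinary GEMMs are, which is what makes the operation a candidate for Tensor core acceleration.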


Related research

11/18/2020: High-Throughput and Memory-Efficient Parallel Viterbi Decoder for Convolutional Codes on GPU
This paper describes a parallel implementation of Viterbi decoding algor...

03/08/2019: Analyzing GPU Tensor Core Potential for Fast Reductions
The Nvidia GPU architecture has introduced new computing elements such a...

08/29/2023: Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library
NVIDIA Tensor Core is a mixed-precision matrix-matrix multiplication and...

01/15/2020: GPU Tensor Cores for fast Arithmetic Reductions
This work proposes a GPU tensor core approach that encodes the arithmeti...

11/19/2018: Modeling Deep Learning Accelerator Enabled GPUs
The efficacy of deep learning has resulted in its use in a growing numbe...

09/22/2022: Computing Double Precision Euclidean Distances using GPU Tensor Cores
Tensor cores (TCs) are a type of Application-Specific Integrated Circuit...

09/11/2020: Fast LDPC GPU Decoder for Cloud RAN
The GPU as a digital signal processing accelerator for cloud RAN is inve...
