Signal Processing for a Reverse-GPS Wildlife Tracking System: CPU and GPU Implementation Experiences

05/21/2020
by Yaniv Rubinpur, et al.

We present robust high-performance implementations of signal-processing tasks performed by a high-throughput wildlife tracking system called ATLAS. The system tracks radio transmitters attached to wild animals by estimating the time of arrival of radio packets at multiple receivers (base stations). Time-of-arrival estimation of wideband radio signals is computationally expensive, especially in acquisition mode (when the time of transmission is not known, not even approximately). These computations are a bottleneck that limits the throughput of the system. We developed a sequential high-performance CPU implementation of the computations a few years ago, and more recently a GPU implementation. Both strive to balance performance with simplicity, maintainability, and development effort, as most real-world codes do. The paper reports on the two implementations and carefully evaluates their performance. The evaluation indicates that the GPU implementation dramatically improves performance and power-performance relative to the sequential CPU implementation running on a desktop CPU typical of the computers in current base stations. Performance improves by more than 50X on a high-end GPU and by more than 4X on a GPU platform that consumes almost 5 times less power than the CPU platform. Performance-per-Watt ratios also improve (by more than 16X), and so do price-performance ratios.
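To make the cost of acquisition mode concrete, here is a minimal sketch of time-of-arrival estimation by brute-force cross-correlation. It is an illustration only, not the ATLAS implementation: the abstract does not specify the estimator, and the function names `estimate_toa` and `simulate`, the random ±1 code, and all parameter values are assumptions chosen for the example.

```python
import random

def estimate_toa(received, template):
    """Return the sample offset at which `template` best matches `received`.

    Hypothetical helper: correlates the known code against every candidate
    offset, as needed when the arrival time is not known even approximately.
    """
    n, m = len(received), len(template)
    best_offset, best_score = 0, float("-inf")
    for offset in range(n - m + 1):
        # Correlation of the template with the buffer at this offset.
        score = sum(received[offset + k] * template[k] for k in range(m))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset

def simulate(true_offset=300, code_len=128, buf_len=1024, noise=0.3, seed=1):
    """Build a noisy buffer containing one copy of a random +/-1 code.

    All values here are illustrative assumptions, not ATLAS parameters.
    """
    rng = random.Random(seed)
    template = [rng.choice((-1.0, 1.0)) for _ in range(code_len)]
    received = [rng.gauss(0.0, noise) for _ in range(buf_len)]
    for k, chip in enumerate(template):
        received[true_offset + k] += chip
    return received, template

received, template = simulate()
print("estimated arrival offset:", estimate_toa(received, template))
```

The search visits every offset and does work proportional to the code length at each one, so the cost grows with the product of buffer length and code length. This suggests why acquisition can become a throughput bottleneck; FFT-based correlation is a common way to reduce that cost, though whether and how ATLAS uses it is not stated in the abstract.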


