__host__ __device__ – Generic programming in Cuda

08/09/2023
by Thomas Mejstrik, et al.

We present patterns for Cuda/C++ for writing safe generic code that works on both the host and the device side. Writing templated functions in Cuda/C++ for both the CPU and the GPU poses a problem: in general, both the __host__ and the __device__ versions of a function are instantiated, which leads to many compiler warnings or errors.
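The problem can be illustrated with a minimal sketch. It is not taken from the paper: the names `HOST_DEVICE`, `HostOnlyBox`, `PortableBox`, and `read_value` are hypothetical, and the macro fallback is only there so that the same file also builds with an ordinary C++ compiler.

```cpp
// Hypothetical helper macro: expands to the Cuda annotations under nvcc
// and to nothing under a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// A type whose accessor carries no annotation, so under Cuda it is
// implicitly a __host__ function.
struct HostOnlyBox {
    int v;
    int value() const { return v; }
};

// A type whose accessor is callable from both host and device code.
struct PortableBox {
    int v;
    HOST_DEVICE int value() const { return v; }
};

// Generic function annotated for both sides. Under nvcc, instantiating it
// with HostOnlyBox means the __host__ __device__ template calls a
// host-only function, which is the kind of instantiation that is expected
// to draw a compiler diagnostic, even if the function is only ever called
// on the host.
template <typename T>
HOST_DEVICE int read_value(const T& box) {
    return box.value();
}
```

Calling `read_value(PortableBox{42})` is fine on both sides, whereas `read_value(HostOnlyBox{41})` is valid as host code but is the sort of call that produces the warnings or errors described above when compiled as Cuda.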

