GPU Programming - Speeding Up the 3D Surface Generator VESTA

01/26/2015 ∙ by B. R. Schlei, et al.

The novel "Volume-Enclosing Surface exTraction Algorithm" (VESTA) generates triangular isosurfaces from computed tomography volumetric images and/or three-dimensional (3D) simulation data. Here, we present various benchmarks for GPU-based code implementations of both VESTA and the current state-of-the-art Marching Cubes Algorithm (MCA). One major result of this study is that VESTA runs significantly faster than the MCA.




1 Introduction

NVIDIA's toolkits (cf., e.g., Ref. [1]) for the development of CUDA®-based software contain, among many other things, example code for an extended version [2] of the original MCA [3]. Here, we compare the performance of this code with our CUDA®- and ANSI-C-based implementation of VESTA [4] on a Linux-based (i.e., openSUSE) PC with a GeForce GTX Ti graphics card.
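For readers unfamiliar with the MCA, the first step of a Marching-Cubes-style method can be sketched in plain C as follows. This is a minimal illustration in the spirit of the scheme described in Ref. [2], not the toolkit's example code and not part of VESTA. The function name `cube_index`, the corner ordering, and the "below the isovalue" convention are assumptions made for this sketch.

```c
#include <stddef.h>

/* Classify one voxel cell for a Marching-Cubes-style table lookup.
 * corner[i] holds the scalar value at cell corner i (0..7).
 * Bit i of the result is set when corner i lies below the isovalue,
 * so the result (0..255) can index a precomputed table of cases.
 * Corner ordering is an assumption of this sketch. */
unsigned int cube_index(const float corner[8], float isovalue)
{
    unsigned int index = 0u;
    size_t i;

    for (i = 0; i < 8; ++i) {
        if (corner[i] < isovalue) {
            index |= 1u << i;
        }
    }
    return index;
}
```

An index of 0 or 255 means that the cell lies entirely on one side of the isosurface and contributes no triangles; every other value selects one of the surface configurations.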

In particular, the times that we have measured (cf. Table 1) are each averages over repeated runs. The measurements start after the data sets have been loaded into texture memory, and they stop after all point coordinates and triplets of point IDs (i.e., triangles) have been computed on the GPU.
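The averaging described above can be sketched on the host side as follows. This is only an illustration in standard C, not the measurement code used for Table 1. The names `work_fn`, `average_time_ms`, and `count_calls` are hypothetical, and `clock()` reports processor time on the host. For GPU work, the timed function would also have to wait for the device to finish before returning, since kernel launches are asynchronous.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical signature of the work to be timed, e.g. one complete
 * surface extraction on data that has already been loaded. */
typedef void (*work_fn)(void *context);

/* Trivial stand-in for real work: counts how often it was called. */
void count_calls(void *context)
{
    ++*(int *)context;
}

/* Run `work` `runs` times and return the mean time per run in ms.
 * Setup such as loading the data belongs outside this function, so
 * that only the computation itself is measured. */
double average_time_ms(work_fn work, void *context, size_t runs)
{
    double total_ms = 0.0;
    size_t i;

    if (runs == 0) {
        return 0.0;
    }
    for (i = 0; i < runs; ++i) {
        clock_t start = clock();
        work(context);
        total_ms += 1000.0 * (double)(clock() - start)
                    / (double)CLOCKS_PER_SEC;
    }
    return total_ms / (double)runs;
}
```

Calling `average_time_ms(count_calls, &calls, 10)` with an `int calls = 0` runs the stand-in ten times. A real benchmark would pass a function that performs the extraction and blocks until the results are available.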

Technique    Extended MCA    Marching VESTA    Marching VESTA
Mode         DCED/L          DCED/L            Mixed/H
Rows, for each of the data sets (a) to (e): Points; Time (ms)
Table 1: Benchmarks for various processed tomographic data sets: for (a) to (c), cf. Ref. [4] and references therein; (d) the Bucky.raw data set is part of Ref. [1]; and (e) is the Happy Buddha VRI file [5]. For the selected isovalues, cf. Fig. 1.
Figure 1: VESTA high-resolution “mixed” mode (Mixed/H) isosurface renderings of the data sets (a) to (e), each at its selected isovalue.

2 Results

For the data sets considered here [1, 4, 5], the extended MCA is slower than the marching variant of VESTA [4] for each of the data sets (a) to (e), when the latter is executed in its low-resolution “disconnect” mode (DCED/L). Furthermore, VESTA remains faster even when higher-resolution isosurfaces are computed (cf. Fig. 1), which have about twice the number of triangles (cf. Table 1).
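A comparison of this kind can be expressed as a relative slowdown. The sketch below assumes that "slower" is read as the relative difference of two run times in percent. The function name `slowdown_percent` and the numbers in the usage note are illustrative and are not measurements from Table 1.

```c
#include <assert.h>

/* Relative slowdown of a baseline with respect to a reference run
 * time, in percent: 100 * (t_baseline - t_reference) / t_reference.
 * A positive result means that the baseline takes longer. */
double slowdown_percent(double baseline_ms, double reference_ms)
{
    assert(reference_ms > 0.0);
    return 100.0 * (baseline_ms - reference_ms) / reference_ms;
}
```

For example, a hypothetical baseline time of 15 ms against a reference time of 10 ms gives `slowdown_percent(15.0, 10.0)`, which evaluates to 50.0, i.e., the baseline is 50% slower.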

Note that the current code implementation of VESTA does not yet use parallel streaming, nor does it call device kernels from within kernels. As a consequence, further GPU-based code optimisations may result in an even faster VESTA code.


  • [1] NVIDIA® CUDA® Toolkit 6.5; for more detail, cf., https://
  • [2] P. Bourke, “Polygonising a scalar field”, May 1994; for more detail, cf.,
  • [3] W. E. Lorensen and H. E. Cline, “Marching Cubes: A High Resolution 3D Surface Construction Algorithm”, Comput. Graph. 21 (1987), p. 163.
  • [4] B. R. Schlei, “Volume-Enclosing Surface Extraction”, Computers & Graphics 36 (2012), p. 111, doi: 10.1016/j.cag.2011.12.008.
  • [5]