1. Introduction
Real-time rendering of highly complex scenes has been a major topic in computer graphics for a considerable time. In recent years, interest dropped as one might consider the problem solved, given the vast computational power of modern graphics hardware and the rich set of rendering techniques such as culling and approximation.
A new challenge arises from the newest generation of head-mounted displays (HMDs) like the Oculus Rift or its competitors: high framerates are no longer a nice-to-have feature but become a hard constraint on what content can actually be shown to the user. To avoid nausea it is necessary to maintain a high framerate while rendering in stereo at a high resolution. Although the lower framerate limit can be relaxed by using techniques such as time warping (van Waveren, 2016), it is still challenging to display highly complex 3d scenes in stereo at the required resolutions. Furthermore, due to the distortion by the HMD lenses, it is usually required to render to an oversized frame buffer that exceeds the HMD's resolution to achieve the required image quality at the focal area. However, this also means that pixels at the border get skewed and we would render unnecessarily high detail there. Modern game engines have adapted to these requirements and allow the rendering of virtual scenes with impressive visual quality on head-mounted displays, but only if the scenes are specially designed for visualization purposes or have a low complexity.
Another challenge is posed by the rendering of complex CAD data. It can be a requirement to visualize multiple machines in a virtual machine hall for interactive design reviews for, e.g., planning purposes. These virtual prototypes are often visualized in a big cave-like system with multiple projection screens to provide a multi-user virtual reality view on the 3d scene to analyze and discuss potential problems and improvements of a technical system. However, the real-time requirements for such interactive virtual design reviews are usually much lower than for, e.g., games, and regarding image quality, visualizing functionality is usually more important than realism. For virtual design reviews of actual machines, factories, or buildings, the underlying data is not created for the visualization itself but is based on potentially highly complex 3d CAD data. Converting such CAD data into a suitable virtual scene typically requires expert knowledge, manual work and a substantial amount of time.
We present a new approach for rendering very large and complex scenes, supporting a wide range of input scenes including CAD data, fast enough for displaying on HMDs. Our approach is fast, robust, easy to implement, requires only minimal user interaction for preparing an input scene, and offers good visual quality, which automatically adapts to the required framerate and available GPU computation power. Because our method does not impose any requirements on the type of scene (e.g., landscape, architecture, machines), it can render any scene equally well. No time-consuming manual work for converting CAD data is required. This is achieved by combining and extending ideas from image-based and point-based rendering, visibility algorithms and approximation algorithms.
1.1. Outline
The basic idea of our technique is to approximate complex parts of the scene that have a small projected size by a set of points with much lower complexity. In contrast to other point-based approximation algorithms, the points are not placed on all surfaces, but only on surfaces that are visible if the approximated part is seen from the outside (external visibility). To minimize the number of points needed to cover a surface without visible holes, the placement algorithm aims at maximizing the minimal distances between neighboring points (and thereby at creating a blue noise distribution). Unlike other techniques distributing points evenly on a three-dimensional surface, our algorithm creates a particular ordering of the distributed points: Each prefix of the complete set of points of an approximated part maximizes the closest distances between neighboring points. Choosing a larger prefix results in a smaller distance between points and in a denser coverage of the surface. This makes it possible to dynamically choose as few points for a part of the scene as are necessary to cover each of its projected pixels with high probability. The sorted surface points are created in a preprocessing step and are stored in ordinary vertex buffers. During runtime, a traversal algorithm determines which parts of the scene are rendered using the original geometry and which ones are approximated using a subset of points. The number of rendered points is determined by the available projected size of the rendered part and the overall rendering budget for that frame. A huge benefit of this arrangement is that rendering one array of points requires only a single draw call to the GPU with a variable length parameter. To our knowledge, other current progressive simplification methods still need to perform complex operations for the dynamic simplification or refinement on the CPU or GPU, or require a certain structure of the simplified object.
Our technique consists of two steps: In a preprocessing step, the surfels are generated based on the scene's original geometry. We describe the generation of the initial surfel set, the sampling process and the surfel generation for hierarchical scenes in Section 3. The second step is to render the precalculated surfel approximations during real-time rendering. We present the rendering in Section 4. In Section 5 we present experimental results evaluating the preprocessing overhead, the rendering time and the visual quality of the rendering for a highly complex virtual scene. In Section 6 we discuss limitations and possible extensions of the technique.
2. Related work
A lot of research has gone into the area of rendering highly complex 3d scenes in real time, and a vast number of techniques has been developed over the years. The usual approach is to reduce the amount of data that has to be processed by the graphics hardware by culling invisible parts of a 3d scene and by reducing the complexity of some objects, replacing them with a simplification (level-of-detail) where feasible.
For level-of-detail (LOD) based algorithms the 3d scene is usually partitioned into a hierarchical data structure where each level provides an approximation of the underlying subtree (HLOD (Erikson et al., 2001)). The approximations can consist of a discrete set of geometric models with varying complexity (Luebke et al., 2003), image-based simplifications (Aliaga et al., 1999; Oliveira et al., 2000; Süß et al., 2010), point-based simplifications (Gross, 2009; Kobbelt and Botsch, 2004; Alexa et al., 2004), or progressive level-of-detail approximations (Hoppe, 1996; Yoon et al., 2004; Derzapf and Guthe, 2012).
Progressive LODs have the advantage that the degree of abstraction can be chosen dynamically at run time depending on the observer's position and viewport; there is thus a continuous transition between different detail levels without the popping artifacts that occur when switching between discrete models. Progressive Meshes were introduced by Hoppe (Hoppe, 1996). A mesh is progressively refined or simplified by performing a sequence of split or collapse operations on the vertices and edges of a mesh. This idea was later combined with the idea of HLOD to allow for a hierarchy of progressive simplifications (Yoon et al., 2004). A problem of progressive meshes is that they require a certain structure of the mesh (i.e., 2-manifold geometry) and do not translate well to the GPU because of the dependencies between operations and vertices.
Although there are a few attempts at progressive meshes on the GPU (Hu et al., 2009; Derzapf and Guthe, 2012), progressive point-based approximations are often better suited, since they usually do not require neighborhood dependencies between points. Dachsbacher et al. (Dachsbacher et al., 2003) proposed a progressive point-based LOD technique that allows adaptive rendering of point clouds completely on the GPU. They transfer points effectively to the GPU by transforming them into a sequential point tree which can be traversed directly on the GPU by sequential vertex processing. An approach to progressive point rendering similar to the progressive meshes of Hoppe (Hoppe, 1996) was proposed by Wu et al. (Wu et al., 2005). From an initial point set they arrange all possible splat merge operations in a priority queue according to some error metric. The operations can then be iteratively performed to achieve the desired detail level.
To our knowledge, all available progressive LOD techniques require some sort of complex operations on the CPU or GPU to refine or simplify the geometry or point cloud. We propose an approach that does not require any refinement or simplification operations. We order the points in a vertex buffer such that each prefix represents a good approximation, and the detail level can be chosen by simply specifying the number of points to be rendered. This can be achieved by ordering the points by an approximate greedy permutation or farthest-first traversal (Eppstein et al., 2015), a well-known technique from image sampling (Eldar et al., 1997). Using a farthest point strategy, Eldar et al. (Eldar et al., 1997) showed that such a sampling scheme possesses good blue noise characteristics. Having blue noise quality for a set of point samples (either 2d or 3d) is a desirable property: It guarantees high spatial uniformity and low regularity of the distribution, which avoids aliasing artifacts when displaying the points (Yan et al., 2015). Moenning and Dodgson (Moenning and Dodgson, 2003) used a farthest point strategy (FastFPS) for the simplification of point clouds and also hinted at the usefulness of this strategy for progressive point rendering. However, their algorithm requires a dense point cloud or an implicit surface as input.
For the rendering of point clouds, the most common practice is the use of surfels or splats as introduced by Pfister et al. (Pfister et al., 2000) and Rusinkiewicz et al. (Rusinkiewicz and Levoy, 2000). A surfel is an n-tuple which encodes a 3d position together with an orientation and shading attributes that locally approximate an object's surface. The point cloud can then be rendered using hardware-accelerated point sprites (Coconu and Hege, 2002; Botsch and Kobbelt, 2003) or screen space splatting (Zwicker et al., 2001; Guennebaud et al., 2004; Preiner et al., 2012). Multiple techniques have been developed over the years to improve the visual quality of the rendered surfels, and this is still an active research area.
One major difference between available point-based rendering techniques and our method is how we acquire the initial point set which then gets refined. We only sample the externally visible surfaces of an object by rendering it from multiple directions and using the rendered pixels as the basis for the surfel computations. A similar approach for sampling visibility was proposed by Eikel et al. (Eikel et al., 2013). They use the rasterization hardware to compute the direction-dependent visibility of objects in a bounding sphere hierarchy. This allows for efficient rendering of nested object hierarchies without the need for time-consuming occlusion culling or high memory consumption for precomputed visibility.
3. Generation of Surfel Approximations
In the following, we describe the generation of surfel approximations (LODs) for complex scenes. A surfel approximation is stored in a single contiguous vertex buffer object where each vertex entry represents a surfel, consisting of a 3d position, a normal vector and material properties (e.g., color). The order of the surfels in a vertex buffer gives an approximation of the underlying object that can be progressively refined by simply adjusting the number of rendered surfels. This allows for an efficient, cache-friendly rendering of a surfel approximation independent of the desired detail level, by simply rendering a prefix of a single vertex buffer object.
We assume the scene to be represented by a hierarchically organized scene graph, preferably representing the spatial structure of the scene. Scenes originating from CAD data often already provide a suitable structure (object hierarchies, assembly groups). If no such structure is available, commonly used spatial data structures can be applied, e.g., a loose octree (Ulrich, 2000). We assume that the scene's geometry is stored in the leaf nodes of the scene graph and that the geometry is represented by polygonal data, although any renderable opaque surface representation can be used.
We begin by describing the generation of a surfel approximation for a single object, i.e., a single node in the scene graph. First, we generate an initial set of surfel candidates, which is described in Subsection 3.1. Then, we progressively sample the initial set of candidates to achieve the desired surfel approximation of the object (Subsection 3.2). Finally, we describe the hierarchical generation of surfel approximations for an entire scene graph (Subsection 3.4).
3.1. Creating the initial set of surfels
The first step in creating the surfel approximation for a single node in a scene graph is to determine an initial set of possible surfels from which the resulting surfels for the node's approximation will be drawn. We generate these initial samples using the rasterization hardware by rendering the node's subtree from multiple directions into a set of G-buffers. This allows us to capture the visible surface of a subtree as seen from outside of its bounding volume. In practice, rendering from the eight corners of the node's bounding box, using an orthographic projection directed at the center of the node, has proven to give a sufficiently good approximation of the visible surface for most applications. We use multi-target rendering to render the node into multiple output buffers in a single rendering pass, each output buffer encoding a different surface property. One buffer contains the pixels' 3d positions relative to the node's local coordinate system, another buffer encodes the surface normals in the same coordinate system. At least one buffer is used to encode the pixels' material properties. In the simplest version, the ambient, diffuse and specular color values are combined into a single color value. To allow for more complex lighting of the approximation during rendering, further properties can be captured in additional buffers, like PBR material parameters. As this step uses the standard rendering pipeline, even complex shaders for surface modification and generation can be used if they provide the desired output and only produce opaque surfaces. To ensure a sufficiently dense coverage of the surface with surfels, the resolution for rendering the buffers for one direction should at least match the intended resolution used at runtime. A much higher resolution unnecessarily increases the preprocessing time and requires a larger sample size during the following adaptive sampling phase. An example of the created buffers for a single object is shown in Figure 2.
After the buffers have been filled, a surfel entry (a record storing a 3d position, normal, color, etc.) is added to the initial set of surfels for each covered pixel.
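To make the capture setup concrete, the eight view directions can be derived from the node's bounding box. A minimal sketch (the function name and the plain-tuple vector representation are our own, not part of the described framework):

```python
import itertools
import math

def corner_directions(bb_min, bb_max):
    """Unit view directions from the eight bounding-box corners toward
    the box center (one orthographic G-buffer pass per direction)."""
    center = [(lo + hi) / 2.0 for lo, hi in zip(bb_min, bb_max)]
    dirs = []
    # itertools.product over (min, max) per axis enumerates all 8 corners.
    for corner in itertools.product(*zip(bb_min, bb_max)):
        d = [c - p for p, c in zip(corner, center)]
        n = math.sqrt(sum(x * x for x in d))
        dirs.append(tuple(x / n for x in d))
    return dirs
```

Each returned direction would serve as the viewing axis of one orthographic render pass into the G-buffers.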
3.2. Progressive Sampling
After creating the initial surfel set, we select and sort a subset of the surfels. The goal is to achieve a sequence of surfels in which, for each prefix, the minimal closest distance between neighboring surfels is maximized (greedy permutation), i.e., the first surfels are evenly spread over different parts of the object, while further surfels start covering the complete surface until small details are refined at the end of the sequence. In order to approximate such a sequence, we apply a random sampling technique: We start by selecting and removing a random starting surfel from the input set (the initial set of surfels) and add it to the output sequence. Now, we select a uniformly chosen random subset of fixed size from the remaining input set (in practice, 200 samples per round yield a reasonably good quality result). We determine the candidate with the largest distance to all surfels chosen before (e.g., using an octree data structure over the chosen surfels), append it to the result sequence, and remove it from the input set. Samples in close vicinity to the chosen samples can also be removed from the input set. The other samples remain in the input set for the next round. The sampling step is repeated until the desired number of chosen surfels is reached or the input set becomes empty. The number of created surfels influences the point size that can be chosen during rendering of the node and therefore the quality of the approximation.
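The sampling loop described above can be sketched as follows. This is a simplified stand-in: brute-force distance queries replace the octree, and surfels are plain 3-tuples of positions; the sample size of 200 follows the value reported for our experiments in Section 5.2:

```python
import random

def progressive_sample(candidates, count, sample_size=200, seed=1):
    """Approximate greedy permutation: repeatedly pick, from a random
    subset of the remaining candidates, the point farthest from all
    points chosen so far."""
    rng = random.Random(seed)
    remaining = list(candidates)
    # Random starting surfel.
    chosen = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(chosen) < count:
        subset = rng.sample(range(len(remaining)),
                            min(sample_size, len(remaining)))

        def dist2_to_chosen(p):
            # Squared distance to the nearest already-chosen surfel
            # (an octree over `chosen` would accelerate this query).
            return min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                       + (p[2] - q[2]) ** 2 for q in chosen)

        best = max(subset, key=lambda i: dist2_to_chosen(remaining[i]))
        chosen.append(remaining.pop(best))
    return chosen
```

Each prefix of the returned sequence approximates an evenly spread subset of the input, which is exactly the property exploited at render time.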
In order to speed up the sampling process, we apply a heuristic: After a fixed number of iterations, the number of candidates chosen from the random sample set is increased by one. This increases the preprocessing speed while reducing the quality of the distribution only slightly, as shown in Subsection 3.3.

3.3. Quality of Sampling Distribution
Figure 3. Surfel distributions for an exact greedy permutation, our Blue Surfels sampling, and random sampling at prefix sizes of 1k, 5k and 10k surfels, together with the minimum relative point distances for each method.
Since the goal is to cover the entire surface of an object with as few surfels as possible to get the best possible image quality when rendering a surfel approximation, a uniform distribution of the points is very important. A greedy permutation (a sequence of points where every point is as far away as possible from all previous points) has the property that the points of every prefix are uniformly distributed with blue noise characteristics, which reduces aliasing artifacts when displaying the points and therefore gives a good approximation of the underlying surface. We use a simple randomized sampling algorithm, which runs independently of the size of the input data, to quickly get an approximate greedy permutation. Although we might lose some of the desired properties, our experiments show that our method still gives a reasonably good approximation of a greedy permutation while performing much faster than an exact computation (for 10k points of the bunny model shown in Figure 3, our method was considerably faster than computing an exact greedy permutation).
Figure 3 shows surfel prefixes of different sizes for the Stanford bunny model (http://graphics.stanford.edu/data/3Dscanrep/) in comparison to surfels chosen uniformly at random and using an exact greedy permutation (based on the initial surfel set). Although our method lacks the good uniformity of the greedy permutation, it still provides a significant improvement over the random solution. This can especially be seen at smaller prefix lengths. While the random solution forms many clusters of surfels as well as holes, our solution is only slightly different from the exact solution. The difference between our solution and the exact solution becomes more visible at larger prefix sizes. However, the points are still roughly uniformly distributed on the surface.
The bottom row of Figure 3 shows a combined violin and box plot of the minimum relative point distances between surfels at the given prefix sizes and for each of the three sampling methods. Since the greedy permutation maximizes the minimum distance between points for each prefix, one can see a clear minimum cap at some distance, while our method has more outliers that fall below this value. However, the median distance of our method is still close to the minimum distance of the greedy permutation approach while being significantly better than the random approach. This is an indicator that our method yields a good surface coverage for a fixed prefix length and therefore allows for fast rendering with a good image quality in comparison to the other two solutions. The image quality is further examined in Section 5.

3.4. Hierarchical generation
For the rendering of complex scenes, we hierarchically generate surfel approximations for multiple subtrees of the existing scene graph structure of a scene. This can be done by traversing the scene graph in a top-down or bottom-up order and generating a surfel approximation for each node that exceeds a certain complexity (e.g., for each subtree that consists of more than a certain number of triangles). When generating the surfel approximations bottom-up, one can use the already computed surfels of the child nodes instead of the original geometry to speed up the rendering step for the initial surfel sampling (see Subsection 3.1). Otherwise, any existing approximation or culling technique can be used to generate the images for the initial sampling process. This also allows for easy out-of-core generation of the surfel sets.
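The bottom-up traversal might look as follows. The dict-based scene-graph representation and the thresholds are illustrative stand-ins; the rule of capping the surfel count at half the subtree's triangle count mirrors the bound used for our benchmark scene in Section 5.2:

```python
def subtree_triangles(node):
    """Total triangle count of a subtree (node = {'triangles', 'children'})."""
    return node.get('triangles', 0) + sum(
        subtree_triangles(c) for c in node.get('children', []))

def generate_lods(node, triangle_threshold=1000, surfel_budget=100_000):
    """Bottom-up traversal: process children first, then attach a surfel
    budget to every subtree exceeding the complexity threshold.
    Returns the list of nodes that received an approximation."""
    approximated = []
    for child in node.get('children', []):
        approximated += generate_lods(child, triangle_threshold, surfel_budget)
    total = subtree_triangles(node)
    if total > triangle_threshold:
        # Cap at half the subtree complexity (cf. Section 5.2).
        node['surfels'] = min(surfel_budget, total // 2)
        approximated.append(node)
    return approximated
```

In a real implementation the budget assignment would trigger the G-buffer rendering and sampling steps of Subsections 3.1 and 3.2 for that subtree.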
For animated or moving objects, one should generate the surfel approximations separately from the static scene parts, since already computed approximations cannot easily be modified without breaking the desired distribution qualities. Unfortunately, complex deforming animations cannot easily be handled by our method, but it should not be too difficult to incorporate bone weights for skeletal animation in the vertices of the surfel buffer.
4. Rendering Progressive Blue Surfels
In this section, we describe the rendering of Progressive Blue Surfels during an interactive walkthrough of a complex scene. The goal is to replace entire subtrees of a scene graph with their corresponding surfel approximation (LOD) as long as the visible surface of the original geometry can be covered by the oriented discs defined by the surfels and as long as the image quality (and run time) suffices for the intended application. Given a fixed surfel size, we can easily compute the required prefix of the surfel approximation, dependent on the distance of the approximated object to the observer, to cover all pixels of the object in screen space (see Subsection 4.1). An example of different prefix lengths (100, 1k, 10k, 100k) of a blue surfel approximation of the UNC power plant model (The Walkthru Group, 2001) can be seen in Figure 4, compared to the model without any approximation (e). It also shows a zoomed-in view (upper left boxes of Figures 4a–4d) for each of the surfel prefixes as they would actually be rendered with the corresponding prefix length.
We render the surfels as oriented discs by using OpenGL point primitives together with a fragment shader as described in Subsection 4.2. When a surfel approximation for a scene node can no longer sufficiently cover the visible geometry of the subtree (i.e., when getting too close to the object), we blend between the node's approximation and its children's approximations or original geometry by gradually decreasing the number of rendered surfels for the node while increasing the number of rendered surfels of the child nodes or rendering the original geometry. Finally, in Subsection 4.3, we describe a simple extension to our rendering algorithm that adaptively tries to keep a desired framerate while maximizing the possible image quality, and in Subsection 4.4, we describe how our method can be used for simple fixed foveated rendering for head-mounted displays.
4.1. Rendering a surfel prefix
A surfel approximation for a single object is stored in a contiguous vertex buffer on the GPU. We can choose the quality of the approximation by simply adjusting the number of rendered point primitives from this buffer. Ideally, we choose the rendered prefix of the buffer such that the entire surface of the object is covered by surfels without holes, i.e., we choose the prefix length in such a way that every pixel of the rendered object is covered by at least one surfel. That means, for a given surfel radius r, we want to find a minimal prefix length n s.t. every other surfel in the entire surfel set (which approximates the surface) is covered by a surfel of radius r in the prefix. To find this value n, we use the close relation of greedy permutations to ε-nets. An ε-net of a point set S is a subset N of S s.t. no two points in N are within a distance of ε of each other and every point in S is within a distance of ε to a point in N. Now, each prefix of a greedy permutation is an ε-net for ε equal to the minimum distance between points in this prefix (Eppstein et al., 2015). Using this, and the fact that for Euclidean metrics in R^d, every ball of radius ε can be covered by 2^d balls of radius ε/2 (Gupta et al., 2003), we can easily estimate the prefix length n for a given radius r:

n(r) = m * 4^ceil(log2(d_m / r))

Here, d_m is the precomputed median minimum distance for a fixed prefix length m, which can easily be computed during the preprocessing phase (see Subsection 3.2). We use the median of the minimum distances between points to compensate for our heuristic. We furthermore use the simplified assumption that the generated surfels lie on a 2-manifold surface (which does not have to be the case), i.e., every surfel disk covers 4 surfels with half the radius (in R^3 this value would be 8, but 4 gives a good enough estimation).
To get a covering of all pixels in screen space, we choose r proportional to the projected distance between two pixels relative to the approximated object's local coordinate system. For this, we take two neighboring pixels in screen space, project them onto the view plane going through a point at the object (e.g., the point of the object's bounding box closest to the camera position) and compute the distance d between these two points (in the object's local coordinate system). Now, we compute r as r = s * d, where s is the desired surfel size in pixels (which corresponds to gl_PointSize in OpenGL).
4.2. Drawing oriented discs
We render each surfel of a surfel prefix as a point primitive with a fixed size, i.e., a square in screen space with fixed pixel dimensions. To draw the points as oriented discs (using the stored normals of the surfels), we use a fragment shader. For each rendered fragment of a point primitive, we project the fragment in screen space back onto the plane defined by the surfel's position and normal. The fragment is discarded if the distance to the center of the surfel is larger than the size of the surfel in object space. This results in opaque elliptic discs as can be seen in Figure 4.
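The discard test can be illustrated in plain code. This simplified sketch projects the fragment orthogonally along the surfel normal rather than along the view ray, and all names are illustrative:

```python
def fragment_visible(frag_pos, surfel_center, surfel_normal, radius):
    """Back-project a fragment onto the surfel's tangent plane and keep
    it only if it lands within the disc radius (the fragment-shader
    discard test in plain Python). Vectors are 3-tuples in object
    space; surfel_normal is assumed to be unit length."""
    d = [f - c for f, c in zip(frag_pos, surfel_center)]
    # Component of the offset along the normal ...
    along_n = sum(a * b for a, b in zip(d, surfel_normal))
    # ... removed to obtain the offset within the disc plane.
    on_plane = [a - along_n * n for a, n in zip(d, surfel_normal)]
    dist2 = sum(a * a for a in on_plane)
    return dist2 <= radius * radius
```

In the actual shader the same test runs per fragment, with `discard` executed when the function above would return false.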
In this step, it is also possible to use extended filtering methods for surfels, e.g., EWA filtering as described by Botsch et al. (Botsch et al., 2005), to blend between the colors of neighboring surfels. However, when rendering massive scenes using our method, such filtering methods can quickly become too slow, and for our purposes it was enough to render elliptical splats without further filtering. As shown in Section 5, we still achieve a reasonably good image quality.
Table 1. Breakdown of the benchmark scene: for each scene part (Terrain, Pompeii, Car Factory, Bakery, Power Plants) and in total, the number of objects (unique), LODs (unique), triangles (unique) and surfels (unique), as well as the triangle, surfel and total memory consumption.
4.3. Adaptive rendering
Due to the progressive nature of our method, it is easy to extend our algorithm for rendering complex scenes with an adaptive level-of-detail mechanism that tries to keep a desired framerate while maximizing the possible image quality. The image quality of a surfel approximation depends mainly on the size of the rendered surfels (smaller is better), while the framerate depends on the number of rendered surfels and the polygon count of the original geometry. For our method, the number of rendered surfels can be directly derived from the desired surfel size (or vice versa) to cover the visible surface of the original geometry (see Subsection 4.1). Therefore, we can easily reduce the frametime by increasing the size of the rendered surfels, thereby reducing the image quality. This allows for a simple reactive algorithm that increases or decreases the surfel size based on the frametime of the last rendered frame. We do this by calculating the moving average of the surfel sizes of the last 3 frames and the current surfel size weighted by the deviation factor of the last frametime to the target frametime. To avoid flickering, we only modify this value when the deviation factor exceeds a certain threshold, e.g., when it falls outside a small interval around 1. We also clamp the value between a minimum size and a small maximum size to avoid surfels that are too big.
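A possible shape of such a reactive controller; the 3-frame moving average follows the description above, while the dead band and clamp values are illustrative assumptions:

```python
from collections import deque

class AdaptiveSurfelSize:
    """Reactive controller: scale the surfel size by the deviation of
    the last frametime from the target, smoothed over the last 3
    frames and only applied outside a dead band around 1."""

    def __init__(self, target_ms, size=2.0, min_size=1.0, max_size=8.0,
                 dead_band=(0.9, 1.1)):
        self.target_ms = target_ms
        self.size = size
        self.min_size, self.max_size = min_size, max_size
        self.dead_band = dead_band
        self.history = deque([size] * 3, maxlen=3)

    def update(self, frame_ms):
        f = frame_ms / self.target_ms      # deviation factor
        lo, hi = self.dead_band
        if lo <= f <= hi:                  # inside dead band: no change
            return self.size
        weighted = self.size * f           # bigger surfels if too slow
        avg = (sum(self.history) + weighted) / (len(self.history) + 1)
        self.size = min(self.max_size, max(self.min_size, avg))
        self.history.append(self.size)
        return self.size
```

The returned size feeds directly into the prefix-length computation of Subsection 4.1, so a single scalar controls the quality/performance trade-off.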
4.4. Foveated rendering
Foveated rendering is a great method to significantly improve performance on current head-mounted displays (HMDs), even without eye-tracking capabilities. The basic idea is to decrease rendering complexity and quality in the periphery of the viewport while maintaining high fidelity in the focal area (fovea). This is possible due to the distortion by the HMD lenses, which skews pixels at the border of the frame buffers for each eye.
With our method it is easily possible to implement a simple foveated rendering technique. For this, we define different fovea zones in screen space for each eye with different quality settings for the surfel sizes. We then simply interpolate the size of the surfels between zones on a per-object basis to achieve a gradual increase in surfel size towards the periphery. This allows for a smooth change in quality, which is very important for rendering on HMDs since popping artifacts are especially noticeable in the peripheral view.
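A minimal sketch of the per-object size interpolation between two fovea zones; the zone radii and size values are illustrative, not values from our implementation:

```python
def foveated_surfel_size(screen_dist, fovea_radius, periphery_radius,
                         inner_size, outer_size):
    """Interpolate the surfel size between a high-quality focal zone
    and a coarse periphery, based on an object's distance from the
    fovea center in (normalized) screen space."""
    if screen_dist <= fovea_radius:
        return inner_size                  # full quality in the fovea
    if screen_dist >= periphery_radius:
        return outer_size                  # coarsest quality outside
    # Linear blend between the two zones for a smooth transition.
    t = (screen_dist - fovea_radius) / (periphery_radius - fovea_radius)
    return inner_size + t * (outer_size - inner_size)
```

Because the size changes continuously with screen distance, objects moving toward the periphery coarsen gradually instead of popping between discrete quality levels.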
5. Results
In this section we describe the experimental results of our proposed rendering method. In Subsection 5.1 we describe the hardware configuration and the test scene used for evaluating the performance and visual quality of our method. In Subsection 5.2 we examine the preprocessing time of our method. Subsequently, the running time and visual quality during rendering are discussed (Subsection 5.3).
5.1. Benchmark
We implemented Progressive Blue Surfels in our experimental rendering framework. All measurements of the subsequent evaluations were performed on a workstation PC (Intel Core i7-6700, 32 GB RAM, NVIDIA GeForce GTX 1060). For experiments involving head-mounted displays we used the Oculus Rift CV1 with stereo rendering into an oversized framebuffer. An example image rendered using our method on the Oculus Rift can be seen in Figure 6.
For our tests, we created a large heterogeneous scene consisting of various parts (see Figure 5). The basic scene is a terrain chunk with roads and a high number of trees. On this terrain we placed a set of smaller scenes that each highlight different strengths of our proposed rendering algorithm:
- Terrain: The terrain consists of square tiles containing triangles in a hexagonal grid. On the terrain we placed trees (randomly selected from 6 unique tree models). The other scene parts are connected by roads consisting of simple road segments.
- Pompeii: A highly detailed model of Pompeii generated using CityEngine (http://www.esri.com/software/cityengine/industries/proceduralpompeii) (Müller et al., 2006) (Figure 5 view 1). It consists of a large number of small objects with various materials.
- Car Factory: A large car factory created during a student project, consisting of multiple factory halls with moderately complex machinery and car parts (Figure 5 view 2&3).
- Bakery: A smaller factory hall with 5 highly detailed triangulated CAD models of donut production lines provided by WP Kemper GmbH (https://www.wpkemper.de) (Figure 5 view 4&5). Each production line is a highly complex model in its own right.
- Power Plants: Multiple instances of the UNC power plant model (The Walkthru Group, 2001).

In total, the scene combines these parts into one large scene graph containing many individual objects that share a much smaller set of unique triangle meshes. A more detailed breakdown of the scene geometry and memory consumption of each part, as well as of the generated surfel approximations, can be gathered from Table 1.
5.2. Preprocessing
Surfel count                                 | 10k    | 50k    | 100k
---------------------------------------------|--------|--------|-------
Render to texture                            | 12 ms  | 12 ms  | 12 ms
Creating initial surfel set (1.75M entries)  | 82 ms  | 82 ms  | 82 ms
Sampling (sample size 200)                   | 105 ms | 311 ms | 592 ms
Total                                        | 199 ms | 405 ms | 686 ms
In this section we examine the preprocessing time of our method. Table 2 shows the preprocessing times for generating surfel approximations of various sizes for a single object (the UNC power plant (The Walkthru Group, 2001)). The initial samples were generated at a resolution of using 8 directions. For the progressive sampling, a sample size of 200 samples per round was chosen. The only part that depends on the approximated scene geometry is the first step, rendering the object from multiple directions to create the initial sample set. However, this is usually only a small fraction of the generation time and can be sped up by using pre-existing culling or approximation techniques, or by reusing previously computed surfel approximations (when approximating larger subtrees of a scene graph). The time for creating the initial surfel set is dominated by transferring the data from the GPU to main memory and depends on the resolution used for creating the samples and (to some degree) on the shape of the approximated object. The major portion of the computation is the sampling step, which (due to our randomized sampling technique) depends only on the intended target size of the surfel approximation. This step can easily be multithreaded when generating LODs for an entire scene.
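The randomized sampling step described above can be sketched as follows. This is a simplified, single-threaded illustration, not the implementation from our framework: each round draws a fixed number of random candidates from the initial surfel set and keeps the one farthest from the surfels selected so far, producing a progressive (blue-noise-like) ordering. All names and the NumPy-based structure are ours.

```python
import numpy as np

def progressive_sample(points, target_count, sample_size=200, rng=None):
    """Approximate greedy farthest-point selection via random candidate rounds.

    `points` is an (N, 3) array of initial surfel positions. Each round draws
    `sample_size` random candidates and keeps the one farthest from the
    surfels selected so far, so early prefixes of the returned ordering
    cover the surface evenly.
    """
    rng = rng or np.random.default_rng(0)
    # Start from a random seed point.
    order = [int(rng.integers(len(points)))]
    # Track each point's distance to the nearest selected surfel.
    dist = np.linalg.norm(points - points[order[0]], axis=1)
    while len(order) < target_count:
        cand = rng.integers(0, len(points), size=sample_size)
        best = int(cand[np.argmax(dist[cand])])  # farthest candidate this round
        order.append(best)
        # Update nearest-selected distances with the newly chosen surfel.
        d_new = np.linalg.norm(points - points[best], axis=1)
        np.minimum(dist, d_new, out=dist)
    return np.asarray(order)
```

Because each round only touches a constant number of candidates, the cost of this step depends on the target size rather than on the scene complexity, matching the timings in Table 2.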
Our benchmark scene contains a total of 3,248 unique surfel approximations of varying sizes ( surfels). We bounded the surfel count for an object by the minimum of surfels and half the complexity of the approximated object (its number of triangles). Objects with a complexity below 1000 triangles were only approximated in groups at a higher hierarchy level. The total preprocessing time for the scene was only minutes on a single thread.
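The budget rule above can be expressed as a small helper. This is a sketch: the paper's upper bound on the surfel count is left as a parameter (`max_surfels`), since the concrete value is not reproduced here.

```python
def surfel_budget(triangle_count, max_surfels, min_triangles=1000):
    """Bound the surfel count for an object: at most `max_surfels`, and at
    most half the object's triangle count. Objects below `min_triangles`
    get no individual approximation (return 0); they are only covered by
    grouped approximations at a higher level of the scene-graph hierarchy.
    """
    if triangle_count < min_triangles:
        return 0
    return min(max_surfels, triangle_count // 2)
```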
5.3. Rendering Performance & Image Quality
Table 3. For each camera view (1 Overview, 2 Pompeii, 3 Car Factory, 4 Bakery, 5 Power Plants): number of draw calls, rendered triangles, rendered surfels, FPS, and SSIM quality at each resolution (No LOD, 1k, 2.5k, 4k, HMD), together with the distribution of frame times.
In this section we examine how our proposed method performs in terms of real-time rendering performance and image quality. The image quality was measured by comparing the approximated image with the image rendered without LOD using the hierarchical Structural SIMilarity (SSIM) index method proposed by Wang et al. (Wang et al., 2004). An example of this method is shown in Figure 7. The upper image shows a camera view rendered with our method, while the middle image shows the same view without any approximations. The bottom image shows the SSIM image, which highlights the differences between both images and from which the SSIM index (0.81) is computed. Especially noticeable are the differences at the trees and at the crane of the power plant. The trees and the crane rendered with our method appear more volumetric, since our method cannot handle thin or finely detailed structures very well. However, this effect can also be utilized to achieve some degree of anti-aliasing.
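For illustration, the following computes a simplified global SSIM index over a pair of grayscale images, using the standard constants from Wang et al. (Wang et al., 2004). Note that our evaluation uses the hierarchical variant over windows; this single-window sketch only conveys the underlying formula.

```python
import numpy as np

def ssim_index(x, y, L=255.0, k1=0.01, k2=0.03):
    """Global SSIM between two grayscale images (Wang et al. 2004),
    computed over the whole image rather than a sliding window."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2   # stabilization constants
    mx, my = x.mean(), y.mean()             # mean luminances
    vx, vy = x.var(), y.var()               # variances (contrast)
    cov = ((x - mx) * (y - my)).mean()      # covariance (structure)
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images yield an index of 1.0; structural differences (such as the trees and the crane in Figure 7) pull the index below 1.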
For the evaluation of our method, we placed cameras at various representative positions in our benchmark scene to show different aspects of our algorithm. For each camera position we measured the average frame time for rendering a single image at different resolutions (1k, 2.5k, 4k) as well as stereo rendering for head-mounted displays (). We computed the SSIM index for these positions at each resolution, except for HMD rendering, since there the value might give a misleading impression due to the distortion of the HMD lenses.
Table 3 shows the measured camera positions with the resulting statistics for the number of draw calls, rendering time, and image quality of each specific view at each of the different resolutions. The last column additionally shows a combined box & violin plot of the distribution of frame times measured at multiple uniformly distributed positions in the region of the view, for each of the cardinal directions. For the overview, we measured positions at a height of 200 m above ground with a slight downward tilt for each camera view. For the other views, we measured positions ( for the bakery) at ground level (2 m).
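The frame-time distributions behind the box plots can be summarized as in the following sketch, using Python's standard `statistics` module (the function name and the returned fields are illustrative, not part of our framework):

```python
import statistics

def frame_time_stats(frame_times_ms):
    """Summarize measured frame times (in ms) for a box plot: quartiles,
    median, and the frames-per-second equivalent of the median."""
    q1, med, q3 = statistics.quantiles(frame_times_ms, n=4)
    return {"q1": q1, "median": med, "q3": q3,
            "median_fps": 1000.0 / med}
```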
In general, we achieved high frame rates of at least 30 fps for every camera position and resolution, while achieving a relatively good SSIM index of at least 0.68 for the overview (which is mainly due to the large number of trees) and at least 0.9 for camera positions closer to ground level. For a resolution of we even achieve frame rates of at least 60 fps throughout the entire scene. The lowest frame rates can be observed at Pompeii and the Bakery, the most complex parts of our benchmark scene. Here, a high number of complex scene objects lies close to the observer, which can no longer be effectively approximated by our algorithm alone (see, e.g., Figure 8). Still, the median frame times in these cases are around 15 ms () for a 4k resolution and 13 ms () for HMD stereo rendering, as can be seen in the box plots in the rightmost column of Table 3.
Although we did not achieve the desired 90 fps for stereo rendering on an HMD everywhere, we were still able to keep the frame rates (mostly) in the range of 45-90 fps, which is the range in which time warping is effective at reducing nausea (van Waveren, 2016). These frame rates can certainly be improved using further specialized techniques for HMD rendering, such as fixed foveated rendering or multi-resolution framebuffers.
6. Conclusion and future work
We have presented an efficient point-based algorithm for generating and rendering continuous approximations of highly complex scenes in real time, even for VR applications using head-mounted displays on standard consumer hardware. Our method can handle a large variety of scene types, including complex CAD data. It can robustly create approximations of almost any surface, requiring only little user interaction and parameterization. The image quality is reasonably good (with room for improvement) and, due to the continuous nature of the approximations, there are almost no visible popping artifacts when navigating a scene. By combining our method with other culling and level-of-detail techniques, we are confident that we can achieve even better performance, and with better point-based filtering to improve the visual quality it might also become applicable in the context of games.
However, there are some limitations. Since our method is mostly intended for objects with a small projected size on the screen, very complex geometry covering a large portion of the screen cannot be effectively approximated. Also, since we distribute points uniformly on the visible surface of an object, we might draw unnecessarily many points for elongated objects where one part is close to the observer while other parts are farther away. Currently, the only way to circumvent these problems is to cut such objects into smaller parts and approximate each part separately, which results in more draw calls and possibly higher memory consumption for the surfel approximations.
Another type of object that cannot be handled well are objects with thin surfaces or walls. Since our sampling method currently does not incorporate the normal of a sampling point, we may incorrectly distribute the surfels on the surface of such objects, which results in holes during rendering. In future work, we want to include (approximate) geodetic distances in our generation of greedy permutations to better capture the surface of an object.
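One simple surface-aware distance would combine the Euclidean distance with a normal-dissimilarity penalty, so that points on opposite sides of a thin wall are no longer treated as close. This particular form and its weight are an assumption for illustration; it is not the approximate geodetic distance mentioned above.

```python
import numpy as np

def surfel_distance(p, n_p, q, n_q, w=1.0):
    """Distance between two surfels (positions p, q and unit normals
    n_p, n_q) that penalizes dissimilar normals: the penalty is 0 for
    aligned normals and 2 for opposite ones, so two sides of a thin
    wall are pushed apart even when spatially coincident."""
    euclid = np.linalg.norm(p - q)
    normal_penalty = 1.0 - float(np.dot(n_p, n_q))
    return euclid * (1.0 + w * normal_penalty)
```

Plugging such a metric into the greedy permutation would distribute surfels per surface side rather than per spatial region.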
Furthermore, we want to include better filtering of the surfels' color values, since our current sampling process does not take colors into account. It can happen that samples are taken during the preprocessing step from pixels with colors of minor importance, as can be seen in Figure 4 (the blue disc in the first few pictures). This could be circumvented, e.g., by averaging the color values of early prefixes in an additional post-processing step. Another idea would be to generate the samples from different mip levels of the rendered images.
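The prefix-averaging idea could be sketched as follows. This is a brute-force illustration; the function name and the neighbourhood size are hypothetical.

```python
import numpy as np

def smooth_prefix_colors(positions, colors, prefix_len, k=8):
    """Post-processing pass: replace the color of each surfel in the early
    prefix by the average over its k nearest neighbours in the full surfel
    set, suppressing outlier colors picked up during sampling.

    positions: (N, 3) surfel positions, colors: (N, 3) RGB values,
    prefix_len: length of the early prefix to smooth."""
    out = colors.astype(np.float64).copy()
    for i in range(prefix_len):
        d = np.linalg.norm(positions - positions[i], axis=1)
        nn = np.argsort(d)[:k]             # k nearest surfels (incl. itself)
        out[i] = colors[nn].mean(axis=0)   # averaged color
    return out
```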
Due to its progressive nature, our method is an ideal basis for streaming applications in out-of-core systems and mobile rendering. In future work, we also want to use our method for on-the-fly generation of approximations at run time, e.g. for very large, procedurally generated worlds.
References
 Alexa et al. (2004) Marc Alexa, Markus Gross, Mark Pauly, Hanspeter Pfister, Marc Stamminger, and Matthias Zwicker. 2004. Point-based Computer Graphics. In ACM SIGGRAPH 2004 Course Notes (SIGGRAPH ’04). ACM, New York, NY, USA. https://doi.org/10.1145/1103900.1103907
 Aliaga et al. (1999) Daniel Aliaga, Jon Cohen, Andrew Wilson, Eric Baker, Hansong Zhang, Carl Erikson, Kenny Hoff, Tom Hudson, Wolfgang Stuerzlinger, Rui Bastos, Mary Whitton, Fred Brooks, and Dinesh Manocha. 1999. MMR: an interactive massive model rendering system using geometric and image-based acceleration. In Proceedings of the 1999 symposium on Interactive 3D graphics (I3D ’99). ACM, New York, NY, USA, 199–206. https://doi.org/10.1145/300523.300554
 Botsch et al. (2005) Mario Botsch, Alexander Hornung, Matthias Zwicker, and Leif Kobbelt. 2005. High-Quality Surface Splatting on Today’s GPUs. In Symposium on Point Based Graphics, Stony Brook, NY, USA, 2005. Proceedings, Marc Alexa, Szymon Rusinkiewicz, Mark Pauly, and Matthias Zwicker (Eds.). Eurographics Association, 17–24. https://doi.org/10.2312/SPBG/SPBG05/017024
 Botsch and Kobbelt (2003) M. Botsch and L. Kobbelt. 2003. High-quality point-based rendering on modern GPUs. In 11th Pacific Conference on Computer Graphics and Applications, 2003. Proceedings. 335–343. https://doi.org/10.1109/PCCGA.2003.1238275
 Coconu and Hege (2002) Liviu Coconu and Hans-Christian Hege. 2002. Hardware-Accelerated Point-Based Rendering of Complex Scenes. In Proceedings of the 13^{th} Eurographics workshop on Rendering (EGWR ’02), P. Debevec and S. Gibson (Eds.). Eurographics Association, 43–52. http://www.eg.org/EG/DL/WS/egwr02/papers/5
 Dachsbacher et al. (2003) Carsten Dachsbacher, Christian Vogelgsang, and Marc Stamminger. 2003. Sequential Point Trees. ACM Trans. Graph. 22, 3 (July 2003), 657–662. https://doi.org/10.1145/882262.882321
 Derzapf and Guthe (2012) Evgenij Derzapf and Michael Guthe. 2012. Dependency-Free Parallel Progressive Meshes. Comput. Graph. Forum 31, 8 (2012), 2288–2302. https://doi.org/10.1111/j.14678659.2012.03154.x
 Eikel et al. (2013) Benjamin Eikel, Claudius Jähn, Matthias Fischer, and Friedhelm Meyer auf der Heide. 2013. Spherical Visibility Sampling. Comput. Graph. Forum 32, 4 (2013), 49–58. https://doi.org/10.1111/cgf.12150
 Eldar et al. (1997) Y. Eldar, M. Lindenbaum, M. Porat, and Y. Y. Zeevi. 1997. The farthest point strategy for progressive image sampling. IEEE Transactions on Image Processing 6, 9 (Sep 1997), 1305–1315. https://doi.org/10.1109/83.623193
 Eppstein et al. (2015) David Eppstein, Sariel HarPeled, and Anastasios Sidiropoulos. 2015. Approximate Greedy Clustering and Distance Selection for Graph Metrics. CoRR abs/1507.01555 (2015). http://arxiv.org/abs/1507.01555
 Erikson et al. (2001) Carl Erikson, Dinesh Manocha, and William V. Baxter III. 2001. HLODs for faster display of large static and dynamic environments. In Proceedings of the 2001 Symposium on Interactive 3D Graphics, SI3D 2001, Chapel Hill, NC, USA, March 26-29, 2001, John F. Hughes and Carlo H. Séquin (Eds.). ACM, 111–120. http://portal.acm.org/citation.cfm?id=364338.364376
 Gross (2009) Markus Gross. 2009. Point based graphics. In ACM SIGGRAPH 2009 Courses on  SIGGRAPH '09. ACM Press. https://doi.org/10.1145/1667239.1667257
 Guennebaud et al. (2004) Gaël Guennebaud, Loïc Barthe, and Mathias Paulin. 2004. Dynamic surfel set refinement for highquality rendering. Computers & Graphics 28, 6 (2004), 827 – 838. https://doi.org/10.1016/j.cag.2004.08.011
 Gupta et al. (2003) Anupam Gupta, Robert Krauthgamer, and James R. Lee. 2003. Bounded Geometries, Fractals, and Low-Distortion Embeddings. In Proceedings of the 44th Symposium on Foundations of Computer Science (FOCS 2003). IEEE Computer Society, Cambridge, MA, USA, 534–543. https://doi.org/10.1109/SFCS.2003.1238226
 Hoppe (1996) Hugues Hoppe. 1996. Progressive meshes. In Proceedings of the 23^{rd} annual conference on Computer graphics and interactive techniques (SIGGRAPH ’96). ACM, New York, NY, USA, 99–108. https://doi.org/10.1145/237170.237216
 Hu et al. (2009) Liang Hu, Pedro V. Sander, and Hugues Hoppe. 2009. Parallel view-dependent refinement of progressive meshes. In Proceedings of the 2009 Symposium on Interactive 3D Graphics, SI3D 2009, February 27 - March 1, 2009, Boston, Massachusetts, USA, Eric Haines, Morgan McGuire, Daniel G. Aliaga, Manuel M. Oliveira, and Stephen N. Spencer (Eds.). ACM, 169–176. https://doi.org/10.1145/1507149.1507177
 Kobbelt and Botsch (2004) Leif Kobbelt and Mario Botsch. 2004. A survey of point-based techniques in computer graphics. Computers & Graphics 28, 6 (2004), 801 – 814. https://doi.org/10.1016/j.cag.2004.08.009
 Luebke et al. (2003) David Luebke, Martin Reddy, Jonathan d. Cohen, Amitabh Varshney, Benjamin Watson, and Robert Huebner. 2003. Level of Detail for 3D Graphics. Morgan Kaufmann Publishers, San Francisco, USA.
 Moenning and Dodgson (2003) Carsten Moenning and Neil A. Dodgson. 2003. Fast Marching farthest point sampling. In Eurographics 2003  Posters. Eurographics Association. https://doi.org/10.2312/egp.20031024
 Müller et al. (2006) Pascal Müller, Tijl Vereenooghe, Peter Wonka, Iken Paap, and Luc J. Van Gool. 2006. Procedural 3D Reconstruction of Puuc Buildings in Xkipché. In VAST 2006: The 7^{th} International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage, Nicosia, Cyprus, 2006. Proceedings, Marinos Ioannides, David B. Arnold, Franco Niccolucci, and Katerina Mania (Eds.). Eurographics Association, 139–146. https://doi.org/10.2312/VAST/VAST06/139146
 Oliveira et al. (2000) Manuel M. Oliveira, Gary Bishop, and David McAllister. 2000. Relief texture mapping. In Proceedings of the 27^{th} annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 359–368. https://doi.org/10.1145/344779.344947
 Pfister et al. (2000) Hanspeter Pfister, Matthias Zwicker, Jeroen van Baar, and Markus H. Gross. 2000. Surfels: surface elements as rendering primitives. In Proceedings of the 27^{th} Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, New Orleans, LA, USA, July 23-28, 2000, Judith R. Brown and Kurt Akeley (Eds.). ACM, 335–342. https://doi.org/10.1145/344779.344936
 Preiner et al. (2012) Reinhold Preiner, Stefan Jeschke, and Michael Wimmer. 2012. Auto Splats: Dynamic Point Cloud Visualization on the GPU. In Eurographics Symposium on Parallel Graphics and Visualization, EGPGV 2012, Cagliari, Italy, May 13-14, 2012. Proceedings, Hank Childs, Torsten Kuhlen, and Fabio Marton (Eds.). Eurographics Association, 139–148. https://doi.org/10.2312/EGPGV/EGPGV12/139148
 Rusinkiewicz and Levoy (2000) Szymon Rusinkiewicz and Marc Levoy. 2000. QSplat: a multiresolution point rendering system for large meshes. In Proceedings of the 27^{th} Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, New Orleans, LA, USA, July 23-28, 2000, Judith R. Brown and Kurt Akeley (Eds.). ACM, 343–352. https://doi.org/10.1145/344779.344940
 Süß et al. (2010) Tim Süß, Claudius Jähn, and Matthias Fischer. 2010. Asynchronous Parallel Reliefboard Computation for Scene Object Approximation. In Proceedings of the 10^{th} Eurographics Symposium on Parallel Graphics and Visualization (EGPGV ’10). Eurographics Association, Eurographics Association, Norrköping, Sweden, 43–51.
 The Walkthru Group (2001) The Walkthru Group. 2001. Power Plant Model. Internet page. (March 2001). http://gamma.cs.unc.edu/POWERPLANT/ University of North Carolina at Chapel Hill.
 Ulrich (2000) Thatcher Ulrich. 2000. Loose Octrees. In Game Programming Gems, Mark DeLoura (Ed.). Charles River Media, Boston, MA, USA, Chapter 4.11, 444–453.
 van Waveren (2016) J. M. P. van Waveren. 2016. The asynchronous time warp for virtual reality on consumer hardware. In Proceedings of the 22^{nd} ACM Conference on Virtual Reality Software and Technology, VRST 2016, Munich, Germany, 2-4 November, 2016, Dieter Kranzlmüller and Gudrun Klinker (Eds.). ACM, 37–46. https://doi.org/10.1145/2993369.2993375
 Wang et al. (2004) Zhou Wang, Alan Conrad Bovik, Hamid Rahim Sheikh, and Eero P. Simoncelli. 2004. Image quality assessment: from error visibility to structural similarity. Image Processing, IEEE Transactions on 13, 4 (April 2004), 600–612. https://doi.org/10.1109/TIP.2003.819861
 Wu et al. (2005) Jianhua Wu, Zhuo Zhang, and Leif Kobbelt. 2005. Progressive Splatting. In Symposium on Point Based Graphics, Stony Brook, NY, USA, 2005. Proceedings, Marc Alexa, Szymon Rusinkiewicz, Mark Pauly, and Matthias Zwicker (Eds.). Eurographics Association, 25–32. https://doi.org/10.2312/SPBG/SPBG05/025032
 Yan et al. (2015) Dong-Ming Yan, Jianwei Guo, Bin Wang, Xiaopeng Zhang, and Peter Wonka. 2015. A Survey of Blue-Noise Sampling and Its Applications. J. Comput. Sci. Technol. 30, 3 (2015), 439–452. https://doi.org/10.1007/s1139001515350
 Yoon et al. (2004) Sung-Eui Yoon, Brian Salomon, Russell Gayle, and Dinesh Manocha. 2004. Quick-VDR: interactive view-dependent rendering of massive models. In 31. International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2004, Los Angeles, California, USA, August 8-12, 2004, Sketches, Ronen Barzel (Ed.). ACM, 22. https://doi.org/10.1145/1186223.1186251
 Zwicker et al. (2001) Matthias Zwicker, Hanspeter Pfister, Jeroen van Baar, and Markus Gross. 2001. Surface Splatting. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’01). ACM, New York, NY, USA, 371–378. https://doi.org/10.1145/383259.383300