1. Introduction
Ray tracing is the method of choice for high-fidelity image generation. However, it is too computationally expensive for real-time applications, where only a few milliseconds of rendering time are available per frame. Precomputed Radiance Transfer (PRT) offloads the expensive computations of ray tracing to a pre-computation step, after which the stored data can be used for real-time photorealistic rendering. PRT uses Spherical Harmonic (SH) lighting [Ramamoorthi, 2009] to efficiently store and render complex effects in both video games and offline rendering for movie production [Pantaleoni et al., 2010]. Recently, PRT has been extended to efficiently handle area light sources [Belcour et al., 2018; Wang and Ramamoorthi, 2018; Wu et al., 2020], further increasing its utility.
The PRT framework proposed by [Sloan et al., 2002] pre-computes the transfer function and stores it in the SH basis at the vertices of the scene, possibly with compression [Sloan et al., 2003]. For an $l$-band SH projection with $n = l^2$ coefficients, this amounts to storing an $n$-dimensional vector for diffuse and an $n \times n$-dimensional matrix for glossy materials at each vertex. Using the triple product formulation [Ng et al., 2004], it is possible to still store an $n$-dimensional vector per vertex for glossy materials by additionally storing a global three-dimensional ($n \times n \times n$) tripling coefficient matrix. A further modification of triple products fixes the light [Sloan, 2008] and requires the storage of a two-dimensional ($n \times n$) global matrix. The advantage of this method is that it retains a compute complexity of $O(n^2)$, as opposed to the $O(n^{5/2})$ of traditional triple products. Recently, the work of [Xin et al., 2021] reduces the time complexity of triple products further, although a real benefit is gained only for large $n$.
The above PRT approaches store the transfer vectors/matrices at the vertices of the mesh. While rendering, color is evaluated at each vertex and interpolated for interior points. This necessitates a reasonably dense mesh tessellation (high mesh resolution) for high-quality renders. A few methods, such as textured hierarchical PRT [McKenzie Chapter, 2010] and the PRT of the D3D9 framework [Microsoft, 2003], leverage the continuous texture space to store transfer. The work of [Iwanicki and Sloan, 2009] also uses texture space, similar to [McKenzie Chapter, 2010], with a specific focus on shadows. These methods focus only on diffuse reflection and do not demonstrate glossy reflection and inter-reflections with textures. Because they use the original PRT formulation of [Sloan et al., 2002], directly extending them to glossy renders requires $n^2$ coefficients per texel (for $n$ SH coefficients), resulting in heavy texture storage. These extensions are thus infeasible in real-time scenarios (Table 1). Can methods like Textured-PRT be extended to handle glossy materials, inter-reflections, etc., at real-time rendering rates?
In this paper, we present precomputed radiance transfer textures (transfer textures) to efficiently store the transfer values using $n$ coefficients per texel, where $n$ is the number of SH coefficients, for both diffuse and glossy reflections. Our method uses the triple product formulation of PRT [Ng et al., 2004]. Transfer textures store more finely sampled transfer values and can evaluate color for each fragment in a fragment shader. This improves render quality even for coarsely tessellated meshes (Fig. 1). They can also support high-quality local effects like glossy renders, inter-reflections, and normal maps. Higher texture resolution results in greater precomputation effort but higher render quality at fixed render times. This provides a way to trade precomputation effort for rendering quality. Texture space techniques like mip-mapping and texture sets can be used for greater efficiency. We describe methods to correctly and efficiently compute transfer textures and show real-time framerates and superior render quality at low mesh tessellations for several scenes. We further formulate and demonstrate inter-reflections using transfer textures by completely fixing the light. We also show that transfer textures facilitate local effects like normal mapping. Finally, we compare and analyze run-times and storage against vertex-based approaches with the triple product method.
2. Related Work
Precomputed Radiance Transfer (PRT)
PRT was first proposed by [Sloan et al., 2002], who projected environment lighting and light transport to the SH basis for dynamic lighting of diffuse and glossy materials. Since then, PRT and SH have received a lot of attention, with work on efficiently computing the SH basis [Snyder, 2006], efficiently rotating SH [Nowrouzezahrai et al., 2012], compressing the SH basis [Sloan et al., 2003], handling microfacet BRDFs [Lehtinen and Kautz, 2003] and extending PRT to dynamic scenes [Zhou et al., 2005]. More recently, [Wang and Ramamoorthi, 2018] extended PRT to support area lights, achieving real-time frame rates for a few light sources. This work was later extended to support a large number of area lights while maintaining real-time frame rates [Wu et al., 2020]. All of these methods are either orthogonal to the core PRT framework or extend the core framework to support additional scenarios. In contrast, we improve the core PRT framework to compute and store transfer on a texture instead of at vertices.
Triple Products.
Triple products naturally arise in computer graphics in the rendering equation. Triple products in the wavelet basis and in spherical harmonics have been studied in depth [Ng et al., 2004]. Today, the state of the art in PRT uses triple products for dynamic relighting of diffuse and glossy scenes. By itself, the triple product method has a compute complexity of $O(n^{5/2})$ for $n$ SH coefficients. This method can be made more computationally efficient by fixing the lighting [Sloan, 2008]. Specifically, triple products with fixed lighting in SH-based PRT achieve a compute complexity of $O(n^2)$ and a per-vertex storage of an $n$-dimensional vector. We augment the triple product method with transfer textures and demonstrate superior rendering quality and real-time framerates.
Transfer stored on Textures.
Storage of transfer on a texture was first suggested by [Sloan et al., 2002]. [McKenzie Chapter, 2010; Iwanicki and Sloan, 2009] formally demonstrated transfer storage on textures. These methods mainly focused on diffuse materials without inter-reflections. In this work, we focus primarily on glossy materials and also demonstrate inter-reflections with transfer textures.
3. Background on PRT with Triple Products
To make our document self-contained, we first review Precomputed Radiance Transfer with triple products, which is the current state of the art in PRT. The motivation for choosing the triple product formulation is discussed in Sect. 6.3. The rendering equation for direct lighting at a point $x$ is given by:
$$B(x, \omega_o) = \int_{\Omega} L(\omega_i)\, V(x, \omega_i)\, \rho(x, \omega_i, \omega_o)\, \max(\mathbf{n} \cdot \omega_i, 0)\, d\omega_i \tag{1}$$
where $\omega_o$ is the direction towards the viewer from $x$, $\omega_i$ is the incoming direction on the unit hemisphere $\Omega$ and $\mathbf{n}$ is the surface normal at $x$. $B$ is the reflected radiance in direction $\omega_o$, $L$ is the incoming environment light from $\omega_i$, $V$ is the binary visibility function and $\rho$ is the Bi-directional Reflectance Distribution Function (BRDF). Eq. 1 is decomposed into the lighting and transfer, which are then projected to the SH basis with coefficients $L_j$ and $T_k$ respectively. Triple products [Ng et al., 2004] formulate the transferred radiance as:
$$B_i = \sum_{j} \sum_{k} C_{ijk}\, L_j\, T_k \tag{2}$$
where $C_{ijk} = \int Y_i(\omega)\, Y_j(\omega)\, Y_k(\omega)\, d\omega$ is the triple product tensor (tripling coefficient matrix) of the SH basis functions $Y$. The final color is calculated by convolution of $B_i$ with the BRDF coefficients and evaluation at the reflection direction. With $n$ SH coefficients, the above formulation for PRT results in an $n$-vector for transfer stored at each point and $O(n^{5/2})$ compute [Ng et al., 2004] required for rendering. The compute efficiency can be further improved to $O(n^2)$ by fixing the light and precomputing a global product matrix: $M_{ik} = \sum_{j} C_{ijk} L_j$ [Sloan, 2008].
4. Transfer Textures
In this section, we begin with a description of computing and storing an $l$-band SH projection of transfer on a texture (Sec 4.1). Next, we show how inter-reflections can be pre-computed and incorporated in our framework (Sec 4.2). Our implementation is described in Sec 5. Our approach achieves real-time frame rates and better render quality, especially on low-tessellation meshes, as shown in Sec 6.
4.1. Pre-computing Transfer Textures
The computation of transfer involves shooting multiple rays from a point $x$ in the scene and then evaluating and projecting the transfer to the SH basis. For transfer textures, there are scene points $x$ corresponding to each pixel $p$ in the texture. The mapping between $x$ and $p$ is defined by the UV coordinates. To efficiently compute the transfer texture, we leverage G-buffers (Alg. 2) to interpolate vertex positions and normals based on their corresponding UV coordinates (Alg. 1, line 2). Next, we read the G-buffer and the scene geometry and evaluate the transfer function for each pixel in the buffer (Alg. 1, lines 4-6). The transfer obtained is then projected to the SH basis and stored at the same pixel location in an initially empty texture $\mathcal{T}$ (Alg. 1, lines 7-8). Finally, $\mathcal{T}$ is dilated to ensure that all points inside a triangle receive a transfer value. At run-time, we fetch the transfer $T_k$ from $\mathcal{T}$ and use it with the triple product formulation to obtain the transferred radiance $B_i$.
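The per-texel projection step can be illustrated with a minimal Monte Carlo sketch. It is not the authors' Alg. 1: it assumes the transfer at a texel is visibility times the clamped cosine, uses only the first two SH bands (4 coefficients) to stay short, and takes a hypothetical `visible(position, dirs)` callback in place of the Embree ray queries used in the paper.

```python
import numpy as np

def real_sh_2band(d):
    """Real SH basis for bands 0 and 1 (4 coefficients) at unit directions d, shape (N, 3)."""
    x, y, z = d[:, 0], d[:, 1], d[:, 2]
    return np.stack([
        np.full_like(x, 0.282095),  # Y_0^0
        0.488603 * y,               # Y_1^-1
        0.488603 * z,               # Y_1^0
        0.488603 * x,               # Y_1^1
    ], axis=1)

def uniform_sphere(num, rng):
    """Uniformly distributed unit directions on the sphere."""
    v = rng.normal(size=(num, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def project_transfer(position, normal, visible, num_samples=4096, seed=0):
    """Monte Carlo SH projection of V(x, w) * max(n . w, 0) for one texel.

    `visible` is a hypothetical callback returning a boolean array
    (True = unoccluded); a real implementation would trace rays instead.
    """
    rng = np.random.default_rng(seed)
    dirs = uniform_sphere(num_samples, rng)
    cosine = np.clip(dirs @ normal, 0.0, None)
    vis = visible(position, dirs).astype(np.float64)
    basis = real_sh_2band(dirs)
    # The uniform sphere pdf is 1 / (4 pi), hence the 4 pi / N weight.
    return (4.0 * np.pi / num_samples) * (basis * (vis * cosine)[:, None]).sum(axis=0)

# Usage: an unoccluded texel whose normal points along +z.
always_visible = lambda position, dirs: np.ones(len(dirs), dtype=bool)
coeffs = project_transfer(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                          always_visible, num_samples=200000)
```

For the unoccluded case the result can be checked against the closed-form projection of the clamped cosine, whose band-0 and band-1 zonal coefficients are approximately 0.886 and 1.023.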
4.2. Pre-computing transfer textures for inter-reflections
Inter-reflected radiance at a point $x$ can be modeled as:
$$B^1(x, \omega_o) = \int_{\Omega} L_q(x, \omega_i)\, \big(1 - V(x, \omega_i)\big)\, \rho(x, \omega_i, \omega_o)\, \max(\mathbf{n} \cdot \omega_i, 0)\, d\omega_i \tag{3}$$
where $L_q$ is the radiance from a secondary hit-point $q$ towards $x$, $B^1$ is the inter-reflected radiance [Sloan et al., 2002], and $V$, $\rho$ and $\mathbf{n}$ are the visibility, BRDF and surface normal of Eq. 1. First, we factor out $(1 - V)$ by only integrating over rays which hit some geometry. For a scene point $x$ and a secondary hit $q$, the radiance $L_q$ can easily be precomputed given a zero-bounce transfer texture $\mathcal{T}^0$ from Alg. 1 (see Fig. 2). The radiance from $q$ towards $x$ is obtained using the triple product formulation by fetching $\mathcal{T}^0$ to obtain the transfer at $q$. This is done for all hit-points from $x$. This radiance now forms an indirect environment map for the point $x$, which is then projected to the SH basis, resulting in an $n$-vector that is stored in a separate one-bounce inter-reflection texture $\mathcal{T}^1$. At run-time, the inter-reflected radiance $B^1$ is obtained by convolving the vector fetched from $\mathcal{T}^1$ with the BRDF SH and evaluating at the reflection direction. The final color is given as $B = B^0 + B^1$, where $B^0$ is the zero-bounce (direct) radiance. Alg. 1 can be easily extended to compute the second-bounce texture and so on. The number of textures required is linear in the number of bounces in this setting, and the final color is just their summation.
4.3. Handling dense UV-packing
In the previous section we described methods for efficient computation of transfer textures. Using these textures requires UV coordinates to be defined for each vertex. To obtain a UV unwrapping of the scene geometry, we used Smart UV-Unwrap or Light Map Pack from Blender 3D [Blender, 2021]. One caveat with UV unwrapping is that dense packing of UV islands may cause overlaps, which manifest as rendering artefacts. Smart UV-Unwrap does not guarantee non-overlapping islands, while Light Map Pack leads to texture wastage and tiny pixel coverage for some parts of the geometry. In such scenarios, texture sets are beneficial.

Consider an example scene as shown in Fig. 3. This scene contains 441K triangles, all of which are packed into a single texture (Fig. 3, left). As shown in the insets, this leads to artefacts. A better approach is to use texture-sets, which means assigning individual textures to each object in the scene (Fig. 3, right). In this case, each UV island can occupy the entire space of the texture thus eliminating artefacts.
5. Implementation Details
We implement Alg. 2 in Python using the ModernGL [Dombi, 2020] framework. We generate and store the resulting G-buffers for each scene in a pre-process step. Alg. 1 is implemented in Python and uses Embree [Wald et al., 2014] for efficient ray intersection tests. We project to 5-band (25 coefficients) real spherical harmonics. As mentioned in Sect. 4.1, dilation is required to ensure that all points in the scene receive a transfer value. Experimentally, we found a dilation of three to be sufficient, although this may need adjustment depending on the scene complexity. The time taken to generate transfer textures for a scene like the one in Fig. 1 is approximately three hours.
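As an illustration of the dilation step, the following NumPy sketch fills texels that received no transfer value from their valid neighbours. It is only one plausible reading: the function name `dilate`, the 4-neighbour averaging, and the interpretation of "a dilation of three" as three one-texel passes are assumptions, not details taken from the authors' implementation.

```python
import numpy as np

def dilate(texture, valid, passes=3):
    """Fill texels without a transfer value from valid 4-neighbours.

    texture: (H, W, n) float array of SH transfer coefficients.
    valid:   (H, W) bool mask, True where a transfer value was computed.
    Each pass grows the valid region by one texel (assumed meaning of
    "a dilation of three" when passes=3).
    """
    tex = texture.astype(np.float64).copy()
    mask = valid.copy()
    h, w = mask.shape
    for _ in range(passes):
        # Zero-pad so that shifted views never wrap around the border.
        padded_tex = np.pad(tex * mask[..., None], ((1, 1), (1, 1), (0, 0)))
        padded_mask = np.pad(mask, 1).astype(np.int64)
        acc = np.zeros_like(tex)
        cnt = np.zeros((h, w), dtype=np.int64)
        for dy, dx in ((0, 1), (2, 1), (1, 0), (1, 2)):  # up, down, left, right
            acc += padded_tex[dy:dy + h, dx:dx + w]
            cnt += padded_mask[dy:dy + h, dx:dx + w]
        fill = ~mask & (cnt > 0)
        # Average of the valid neighbours for every newly filled texel.
        tex[fill] = acc[fill] / cnt[fill][:, None]
        mask = mask | fill
    return tex, mask
```

Texels that are already valid are never overwritten, so the dilation only affects the gutter around UV islands that bilinear filtering may sample.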
Our real-time renderer is also implemented in the ModernGL framework. We implement the triple product (TP) and triple product with fixed light (TPFL) methods augmented with our transfer textures. Rendering is done in the fragment shader using the generated transfer textures for the respective scene. We render all scenes with glossy materials with spatially varying roughness on a workstation with an NVIDIA RTX 3090 at a resolution of 1920×1080. An important detail is that we use an early depth pass to prune fragments that are not visible, thus avoiding unnecessary computations. We use a texture resolution of 1024×1024, as we have found it to be the best trade-off between memory and quality for our scenes.
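The difference between the two evaluations can be sketched in NumPy rather than shader code. The names `C` (tripling tensor), `L` (lighting coefficients), `T` (transfer vector fetched for one fragment) and `M` (fixed-light matrix) are chosen here for illustration, and the tensor is filled with random placeholder values instead of true SH tripling coefficients, so only the algebra is meaningful.

```python
import numpy as np

n = 25                                  # 5-band SH, as used in the text
rng = np.random.default_rng(0)

# Placeholder data (assumption): real tripling coefficients would come from
# integrating products of three SH basis functions over the sphere.
C = rng.normal(size=(n, n, n))
L = rng.normal(size=n)                  # SH coefficients of the environment light
T = rng.normal(size=n)                  # SH transfer vector fetched for one fragment

def triple_product(C, L, T):
    """TP: B_i = sum_jk C_ijk L_j T_k, evaluated for every fragment."""
    return np.einsum("ijk,j,k->i", C, L, T)

def fixed_light_matrix(C, L):
    """TPFL precompute: M_ik = sum_j C_ijk L_j, done once per lighting."""
    return np.einsum("ijk,j->ik", C, L)

M = fixed_light_matrix(C, L)            # global n x n matrix
B_tp = triple_product(C, L, T)          # full triple product per fragment
B_tpfl = M @ T                          # one matrix-vector product per fragment
```

Both paths give the same radiance coefficients; TPFL simply moves the contraction with the lighting out of the per-fragment work, at the cost of recomputing `M` whenever the lighting changes.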
Scene | # tris. | Vert. (Trad.) TP [FPS] | Vert. (Trad.) TPFL [FPS] | Frag. (Ours) TP [FPS] | Frag. (Ours) TPFL [FPS]
---|---|---|---|---|---
Dragon (Fig 4) | 1.3M | 3.62 | 41.2 | 5.2 | 151.2
TRM (Fig 4) | 441K | 10.2 | 116.2 | 15.2 | 202.9
Room (Fig 1) | 21K | 352.3 | 2432.7 | 83.6 | 568.2
Plants (Fig 4) | 18K | 363.2 | 2597.6 | 6.7 | 168.3
6. Results & Evaluation

In this section, we present glossy rendering results, including inter-reflections, using transfer textures in the fragment shader. We compare the renderings with traditional vertex-shader-based approaches. We also discuss and demonstrate the use of normal maps with transfer textures, which is not possible with traditional vertex-based PRT. Finally, we analyze the memory requirements and give a lower bound on the FPS for transfer texture usage in a fragment shader. Rendering results are demonstrated on four scenes whose statistics and performance comparisons are given in Table 1.
6.1. Glossy rendering & Inter-reflections
Fig. 4 shows the renders for three scenes: Plants, Dragon and TRM (two Roza, one Monkey). All scenes have a ground plane, which is minimally tessellated, as shown in the wireframe insets. The TP and TPFL methods in the vertex shader are unable to capture proper shadows on the ground plane due to sparse sampling of the transfer function. In contrast, the TP method in the fragment shader using our transfer textures properly reproduces shadows on the plane, albeit at a very low FPS. The TPFL method with transfer textures achieves a similar render quality at a higher FPS. The TP/TPFL methods in the vertex shader approach the render quality of our transfer textures only with a highly tessellated ground plane, as shown in the high-tessellation renderings, which requires the addition of redundant vertices. Such situations frequently arise in production, for example with walls in a room or any large surface with minimal curvature (Fig 1). In such cases, all previous PRT methods on vertex shaders require the addition of otherwise avoidable vertices on which to store the transfer, leading to a drop in performance, as opposed to our transfer textures method. Additional renders with different Phong exponents and environment maps for four different scenes are shown in Fig. 7 & 8.

Next, we demonstrate inter-reflections using transfer textures with the method described in Sec 4.2. The zero-bounce and one-bounce renders with their corresponding FPS are shown in Fig. 5 for two scenes: Monkey and Roza. Because of the extra texture fetch, convolution and evaluation operations, the FPS with inter-reflections is slightly lower, although still real-time. As described in Sect. 4.2, additional bounces can be added with additional pre-computed textures.

6.2. Normal Maps
Transfer textures make it possible to use normal maps during precomputation. This translates to fewer vertices during rendering, as finer detail can instead be embedded in the normal map. Consider precomputation in traditional vertex-based PRT. In this case, if a normal map is applied, it only ever affects the transfer at the vertices, thus losing detail within each face when using a high-frequency normal texture. With transfer textures, we can output the shading normal from the normal map instead of the geometric normal in the G-buffer during precomputation. In Alg. 1 line 6, the transfer will then be computed at the shading normal instead. Since this texture is used to fetch transfer during rendering, all normal map details are preserved. We show renderings with normal maps in Fig. 6. The detail on the floor is due to the normal map without any additional vertices, as can be seen in the wireframe insets.
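A minimal sketch of how a G-buffer pass might produce the shading normal is given below. It assumes a tangent-space normal map encoded as RGB in [0, 1] and a per-texel tangent vector; neither detail is specified in the text, and `shading_normal` is a hypothetical helper rather than part of the authors' code.

```python
import numpy as np

def shading_normal(geometric_normal, tangent, normal_map_rgb):
    """World-space shading normal from one tangent-space normal-map texel.

    Assumes the common RGB encoding where (0.5, 0.5, 1.0) means
    "no perturbation" of the geometric normal.
    """
    n = geometric_normal / np.linalg.norm(geometric_normal)
    # Gram-Schmidt: make the tangent orthogonal to the normal.
    t = tangent - np.dot(tangent, n) * n
    t = t / np.linalg.norm(t)
    b = np.cross(n, t)                                  # bitangent
    local = 2.0 * np.asarray(normal_map_rgb, dtype=np.float64) - 1.0
    world = local[0] * t + local[1] * b + local[2] * n
    return world / np.linalg.norm(world)

# Usage: a flat texel leaves the geometric normal unchanged.
flat = shading_normal(np.array([0.0, 0.0, 1.0]),
                      np.array([1.0, 0.0, 0.0]),
                      (0.5, 0.5, 1.0))
```

Writing this normal into the G-buffer in place of the interpolated geometric normal means the transfer is precomputed against the perturbed normal for every texel.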
6.3. Memory Requirements
McKenzie demonstrated textures with diffuse PRT using Sloan’s formulation. Consider directly implementing Sloan’s glossy formulation with textures instead. This amounts to storing an $n \times n$ matrix per texel (for $n$ SH coefficients), which quickly becomes intractable, even for reasonably small textures. Thus, combining the triple product formulation with transfer textures is a clear choice. The memory requirements for vertex-based as well as texture-based (fragment) approaches are shown in Table 2. The former’s memory requirements depend on the scene complexity, whereas they are constant for textures. Furthermore, a direct extension of Sloan’s method to textures is infeasible, as shown in the fifth column (2.5 GB per texture).
Scene | # tris. | Vert. Mem. (Sloan) | Vert. Mem. ([Ng et al.]) | Tex. Mem. (McKenzie) | Tex. Mem. (Ours)
---|---|---|---|---|---
Room | 21K | 64MB | 2.5MB | 2.5GB | 100MB
Dragon | 1.3M | 5.2GB | 215.8MB | 2.5GB | 100MB
TRM | 441K | 2.3GB | 139.3MB | 2.5GB | 100MB
Plants | 18K | 64MB | 2.5MB | 2.5GB | 100MB
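The two texture memory figures can be reproduced with a short calculation. It assumes one 4-byte float per SH coefficient and a single 1024×1024 texture, which is an inference from the reported numbers rather than a stated detail; `texture_bytes` is a helper introduced here for illustration.

```python
def texture_bytes(resolution, coeffs_per_texel, bytes_per_coeff=4):
    """Storage for a square texture holding SH coefficients at every texel."""
    return resolution * resolution * coeffs_per_texel * bytes_per_coeff

MIB = 1024 * 1024
n = 25                                             # 5-band SH projection

ours_mib = texture_bytes(1024, n) / MIB            # n-vector per texel: 100.0
matrix_mib = texture_bytes(1024, n * n) / MIB      # n x n matrix per texel: 2500.0
```

Under these assumptions the triple product formulation needs 100 MiB per texture, while storing a full transfer matrix per texel needs 2500 MiB (about 2.5 GB), a factor of $n = 25$ more.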
6.4. Lower Bound on FPS
Since transfer textures are used in fragment shaders with an early depth pass, we obtain a lower bound on the FPS. The computation is roughly the same for each fragment, and the worst case is when all fragments contain some geometry to be processed and rendered. This is in contrast to vertex-based approaches, where the run-time depends on the number of vertices in the scene. We demonstrate this in Fig. 4 with the TRM (441K triangles) and Dragon (1.3M triangles) scenes. Here, the FPS is lower for the vertex-based approach than for the fragment-based approach with transfer textures, for both TP and TPFL.


7. Conclusions, Limitations & Future work
In this paper, we presented precomputed radiance transfer textures for decoupling mesh tessellation from transfer sampling and storage for glossy rendering. We described methods to efficiently and correctly compute these textures and also demonstrated the incorporation of inter-reflections using additional precomputed textures. We compared our renderings with traditional vertex-based PRT approaches and analyzed the memory requirements of transfer textures. We demonstrated real-time framerates for rendering with transfer textures in the fragment shader and superior render quality for minimally tessellated meshes. Additionally, we gave a lower bound on the FPS, which will be useful for performance analysis in production. Our approach inherits the advantages of texture-based optimizations like texture sets, mip-maps and level of detail, which can be easily incorporated. Although we demonstrate our method at a fixed texture resolution, the resolution can be tailored to the hardware constraints and the rendering quality needed. This is in contrast to vertex-based methods, which provide vertex count as the only control knob and little control over level of detail.
A limitation of transfer textures is that inter-reflections essentially bake the lighting and BRDF, i.e., they cannot be changed without re-computation. We note that the work of [Sloan et al., 2002] also bakes the BRDF (including albedo) into the transfer matrices for inter-reflections. We would like to address this issue in future extensions of this work.
References
- [Belcour et al., 2018] Integrating clipped spherical harmonics expansions. ACM Trans. Graph. 37 (2). ISSN 0730-0301.
- [Blender, 2021] Blender - a 3D modelling and rendering package. Blender Foundation, Blender Institute, Amsterdam.
- [Dombi, 2020] Note: https://github.com/moderngl/moderngl
- [Iwanicki and Sloan, 2009] Normal mapping with low-frequency precomputed visibility. In SIGGRAPH 2009: Talks, SIGGRAPH ’09, New York, NY, USA. ISBN 9781605588346.
- [Lehtinen and Kautz, 2003] Matrix radiance transfer. In Proceedings of the 2003 Symposium on Interactive 3D Graphics, I3D ’03, New York, NY, USA, pp. 59–64. ISBN 1581136455.
- [McKenzie Chapter, 2010] Textured hierarchical precomputed radiance transfer.
- [Microsoft, 2003] Microsoft DirectX 9 programmable graphics pipeline. Microsoft Press, USA. ISBN 0735616531.
- [Ng et al., 2004] Triple product wavelet integrals for all-frequency relighting. ACM Trans. Graph. 23 (3), pp. 477–487. ISSN 0730-0301.
- [Nowrouzezahrai et al., 2012] Sparse zonal harmonic factorization for efficient SH rotation. ACM Trans. Graph. 31 (3). ISSN 0730-0301.
- [Pantaleoni et al., 2010] PantaRay: fast ray-traced occlusion caching of massive scenes. ACM Trans. Graph. 29 (4).
- [Ramamoorthi, 2009] Precomputation-based rendering. NOW Publishers Inc.
- [Sloan et al., 2003] Clustered principal components for precomputed radiance transfer. ACM SIGGRAPH 2003 Papers.
- [Sloan et al., 2002] Precomputed radiance transfer for real-time rendering in dynamic, low-frequency lighting environments. ACM Trans. Graph. 21 (3), pp. 527–536. ISSN 0730-0301.
- [Sloan, 2008] Stupid spherical harmonics (SH) tricks. In Game Developers Conference, Vol. 9, pp. 42.
- [Snyder, 2006] Code generation and factoring for fast evaluation of low-order spherical harmonic products and squares. Technical Report MSR-TR-2006-53.
- [Wald et al., 2014] Embree: a kernel framework for efficient CPU ray tracing. ACM Trans. Graph. 33 (4). ISSN 0730-0301.
- [Wang and Ramamoorthi, 2018] Analytic spherical harmonic coefficients for polygonal area lights. ACM Trans. Graph. 37 (4). ISSN 0730-0301.
- [Wu et al., 2020] Analytic spherical harmonic gradients for real-time rendering with many polygonal area lights. ACM Trans. Graph. 39 (4). ISSN 0730-0301.
- [Xin et al., 2021] Fast and accurate spherical harmonics products. ACM Trans. Graph. 40 (6). ISSN 0730-0301.
- [Zhou et al., 2005] Precomputed shadow fields for dynamic scenes. ACM Trans. Graph. 24 (3), pp. 1196–1201. ISSN 0730-0301.