In this chapter we will discuss some of the results we have obtained. We will compare the render times achieved with the methods explained in chapter 4 against a renderer commonly used by the computer graphics research community: the Mitsuba renderer by . Moreover, we will discuss some correlations between the scenes and the best performing method.

Results

In table 6.1 we can see the first comparison of the Mitsuba renderer with the scenes presented in figure 4.6. Mitsuba offers multiple rendering algorithms; the one that works best for this type of scene is simple volumetric path tracing (called volpath_simple). Unlike our methods, this algorithm also uses shadow rays, so its total number of traced rays is obtained by adding the shadow rays to the normal rays. The results show that the number of rays traced per second by the Mitsuba volumetric path tracer is much lower than that of any other algorithm. In the scenes used for this comparison the density is constant, and the best performing algorithm is regenerationSK with single-thread regeneration. The reason is that when the density is constant, the standard deviation of the lengths of the paths traversing the medium is also small. The rays therefore start and terminate roughly together, and there is no need for extra care in synchronizing the threads or compacting them. The other conclusion we can draw is that the GPU implementations perform much better than the CPU algorithm, even with the naive single-kernel approach.
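
To make the single-thread regeneration idea concrete, the following CUDA sketch shows one possible structure of such a kernel. It is only an illustration of the concept discussed above, not the implementation used in this work: Ray, samplePixel and extendPath are hypothetical placeholders standing in for the renderer's own types and routines.

```cuda
#include <cuda_runtime.h>

// Hypothetical, minimal path state; a real renderer carries more data
// (throughput, pixel index, RNG state, ...).
struct Ray { float3 o, d; };

// Placeholders standing in for the renderer's own routines.
__device__ Ray  samplePixel(int path);   // generate the camera ray of a path
__device__ bool extendPath(Ray& r);      // one scattering step; false = terminated

// Single-thread regeneration: every thread pulls a new path from a global
// counter the moment its current path terminates, so the GPU stays busy
// without any compaction or warp-level coordination.
__global__ void regenerationSK_thread(int totalPaths, int* nextPath)
{
    bool active = false;
    Ray  ray;

    while (true) {
        if (!active) {
            int path = atomicAdd(nextPath, 1);   // claim the next unprocessed path
            if (path >= totalPaths) return;      // nothing left to trace
            ray    = samplePixel(path);
            active = true;
        }
        active = extendPath(ray);                // advance one event in the medium
    }
}
```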

method                      cgg-logo (Mrays/sec)   ad (Mrays/sec)
naiveSK                     3.88                   2.13
regenerationSK (thread)     81.62                  42.37
regenerationSK (warp)       4.37                   7.02
streamingSK (compaction)    52.69                  37.81
streamingSK (sorting)       20.41                  17.61
sortingSK                   20.90                  17.83
mitsuba (volpath_simple)    0.003                  17.83
comparison with the Mitsuba renderer. This table compares the algorithms presented in this thesis with a CPU volumetric path tracing implementation provided by . The table shows that the CPU algorithm performs worse in all cases. The two measures provided are the millions of rays traced per second and the total time for rendering a 400x400 image of the scenes.

Let us now consider a scene with varying density but constant albedo, like the one in figure 6.1. The results for this scene are presented in table 6.2. The table shows the millions of rays per second and the total time to render the smoke scene with a density scale factor of 800, which is the factor applied to the 0-to-1 density volume. The results show that in this case the best performing algorithm is streamingSK. The explanation lies in the fact that some rays stay inside the medium for a long time while others terminate immediately. In this case, the streaming method is able to group the long-lived rays together and decrease warp divergence, which makes the streaming algorithm more efficient when the scene has a highly varying density. However, the complexity of the density function is not the only factor to take into consideration.
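
As an illustration of how compaction keeps the long-lived rays packed together, the following host-side sketch uses Thrust stream compaction to remove terminated paths after each extension step. PathState, IsTerminated and extendPaths are hypothetical placeholders, and the real streamingSK kernel organizes the work differently; the compaction step shown here is only the essential idea.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/remove.h>

// Hypothetical per-path state; a real renderer stores more (throughput,
// RNG state, pixel index, ...).
struct PathState {
    float3 o, d;
    bool   terminated;
};

// Predicate used by the compaction step.
struct IsTerminated {
    __host__ __device__ bool operator()(const PathState& p) const {
        return p.terminated;
    }
};

// Placeholder kernel: advances every in-flight path by one scattering event
// and sets `terminated` when a path is absorbed or leaves the volume.
__global__ void extendPaths(PathState* paths, int n);

// Host-side outline of the streaming loop: extend all paths, compact the
// terminated ones away so survivors stay packed into contiguous warps, and
// (not shown) refill the freed slots with newly generated paths.
void streamingLoop(thrust::device_vector<PathState>& pool)
{
    while (!pool.empty()) {
        int n = static_cast<int>(pool.size());
        extendPaths<<<(n + 255) / 256, 256>>>(thrust::raw_pointer_cast(pool.data()), n);

        // Stream compaction: drop terminated paths, keep survivors contiguous.
        auto newEnd = thrust::remove_if(pool.begin(), pool.end(), IsTerminated());
        pool.erase(newEnd, pool.end());

        // ... regenerate new paths into the freed slots while any remain ...
    }
}
```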

smoke scene: heterogeneous volume representing smoke. The file, which has a grid resolution of (128, 128, 50), can be found on the website of the Mitsuba renderer . The density scale used in this scene is 800.
method                      Mrays/sec   time (sec)
regenerationSK (thread)     2.52        131
regenerationSK (warp)       7.35        127
streamingSK (compaction)    17.41       53.68
sortingSK                   14.64       63.83
mitsuba (volpath_simple)    0.58        1076
mitsuba (volpath)           0.54        1195
comparison with the Mitsuba renderer on the smoke scene. This table shows the behavior of the algorithms presented in this work when the scene has a highly varying density and a high grid resolution.

The scene in figure 6.2 presents a highly varying density but a small texture resolution. In this case the methods that use compaction perform worse than the regenerationSK method with single-thread regeneration, as shown in table 6.3. A possible reason for this behavior can be attributed to the ratio between the number of traced paths and the grid resolution: in this type of scene, if the number of rays is high enough, it becomes more probable that groups of rays access the same texture data, because the texture resolution is much lower than in the previous cases. If threads that are close together access the same texture values, they behave similarly to the constant-density case, and exactly as in that case a simple method such as regenerationSK, which fully utilizes the GPU, gives the best performance.
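
As a rough, purely illustrative estimate (assuming, for instance, the same 400x400 image resolution used in the first comparison, which is not stated for this test), 400 x 400 pixels traced over 300 iterations generate 48 million primary paths, while the bucky grid contains only 32^3 = 32,768 voxels; on average each voxel would then be sampled by more than a thousand paths, which makes it very likely that neighboring threads read the same texture values.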

bucky: heterogeneous volume with a grid resolution of (32, 32, 32). The file is commonly used for testing volumetric rendering because it presents a highly varying density. In this test the density values vary between 0 and 40. For the color, a static transfer function is applied which maps low density values to green, medium density values to red, and high density values to blue. The number of iterations used for the Monte Carlo estimation is 300.
method                      Mrays/sec   time (sec)
regenerationSK (thread)     10.96       39.71
streamingSK (compaction)    5.75        52.83
sortingSK                   5.69        53.30
naiveSK                     4.99        60.86
comparison of the GPU rendering algorithms detailed in chapter 4 on the highly varying density scene of figure 6.2, which has many density holes and a small grid resolution.

Which is the Best?

Looking at the results we have presented, we can say that the best algorithm for GPU volumetric path tracing depends strongly on the type of scene we want to render. We have seen that for complex scenes with highly varying density and high-resolution textures, the more sophisticated algorithms, which use compaction and smart regeneration, perform better. On the other hand, if the scene is simple, methods that require less synchronization between threads and cause less branch divergence obtain better results.
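
Purely as an illustration, the observations of this chapter could be summarized by a scene-dependent selection heuristic such as the following sketch; the function name, parameters and thresholds are arbitrary placeholders and are not part of the renderer.

```cuda
// Illustrative only: restates the observations of this chapter as a simple
// scene-dependent heuristic. Thresholds are arbitrary placeholders.
enum class Method { RegenerationSK_Thread, StreamingSK_Compaction };

Method pickMethod(bool constantDensity, int gridResolution)
{
    // Simple scenes (constant density or low-resolution grids): the
    // lightweight per-thread regeneration wins because it needs almost no
    // synchronization and causes little branch divergence.
    if (constantDensity || gridResolution <= 32)
        return Method::RegenerationSK_Thread;

    // Highly varying, high-resolution media: compaction pays off because it
    // keeps the long-lived paths packed together and limits warp divergence.
    return Method::StreamingSK_Compaction;
}
```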