CS 6620 Fall 2014 - Project 2

Rendered Image

Hardware Used and Render Times

The plot below shows render times at one sample per pixel, averaged over 10 runs, measured from a build compiled with -O3 -mfpmath=sse -march=native -flto. Measurements were taken using std::chrono::high_resolution_clock and include only the time to render, i.e. the time to load the scene and write the images to disk is excluded. Now that the renderer has a bit more work to do, we can start to see thread contention hurting the render time when running on significantly more threads than there are hardware threads.

CPU: Intel i5-2500K @ 4.0GHz, 4 hardware threads
RAM: 8GB 1600MHz DDR3
Compiler: gcc 4.8.0 (MinGW on Windows)

Chart made using C3.js
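
For reference, the timing procedure described in this section could be sketched roughly as follows. This is not the project's actual measurement code: the helper name average_render_ms and the render callable are hypothetical stand-ins, and the only detail taken from the write-up is that std::chrono::high_resolution_clock wraps the render call alone and the result is averaged over a number of runs.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not from the project): calls render() `runs` times and
// returns the mean wall-clock time per call in milliseconds. Scene loading and
// image writing are assumed to happen outside of render(), so they are not timed.
template <typename F>
double average_render_ms(F &&render, std::size_t runs) {
    using clock = std::chrono::high_resolution_clock;
    double total_ms = 0.0;
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = clock::now();
        render();
        const auto end = clock::now();
        total_ms += std::chrono::duration<double, std::milli>(end - start).count();
    }
    // Guard against division by zero when no runs were requested.
    return runs > 0 ? total_ms / runs : 0.0;
}
```

Calling it as average_render_ms([&] { /* render one frame */ }, 10) would correspond to the 10-run average used for the plot.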

Multithreading Tweaks

I also tweaked my multithreading implementation based on what was mentioned in class, since the original implementation was pretty braindead: split the image into num_threads blocks, hand them off and relax. The new method chops the image up into a specified number of blocks, shuffles them, and then hands them to the threads on demand as they render. This does a somewhat better job of distributing the workload over the threads and is also more fun to watch. Below is a recording of the rendering, slowed down significantly by inserting some short sleeps into the worker threads.

8 Threads 128 Blocks
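
One way the shuffled-block scheme described above might be implemented is sketched below. It is an assumption-laden illustration rather than the project's code: the Block struct, make_blocks, render_blocks, the grid-based split, and the use of an atomic counter as the work queue are all hypothetical choices. The write-up only states that the image is cut into a chosen number of blocks, shuffled, and handed to threads as they render.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Hypothetical block type: a rectangle of pixels, [x0, x1) by [y0, y1).
struct Block { int x0, y0, x1, y1; };

// Split a width x height image into a bx-by-by grid of blocks, then shuffle
// the order in which they will be handed out.
std::vector<Block> make_blocks(int width, int height, int bx, int by, unsigned seed) {
    std::vector<Block> blocks;
    for (int j = 0; j < by; ++j) {
        for (int i = 0; i < bx; ++i) {
            // Integer division spreads any remainder pixels across the grid.
            blocks.push_back({i * width / bx, j * height / by,
                              (i + 1) * width / bx, (j + 1) * height / by});
        }
    }
    std::mt19937 rng(seed);
    std::shuffle(blocks.begin(), blocks.end(), rng);
    return blocks;
}

// Each worker repeatedly claims the next unrendered block through a shared
// atomic counter until none remain. render_block must be safe to call
// concurrently on different blocks.
template <typename F>
void render_blocks(const std::vector<Block> &blocks, unsigned num_threads, F render_block) {
    std::atomic<std::size_t> next{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= blocks.size()) break;
                render_block(blocks[i]);
            }
        });
    }
    for (auto &w : workers) w.join();
}
```

With this sketch, the configuration in the recording (8 threads, 128 blocks) would look like render_blocks(make_blocks(width, height, 16, 8, seed), 8, render_one_block), where the 16 by 8 grid is just one possible way to arrive at 128 blocks. Because threads pull blocks as they finish, a thread that lands on cheap blocks simply claims more of them, which is where the better load balance comes from.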