Multithreading
The graph below shows the CPU activity on my dual-core machine while running 10000 iterations of the simulation. Clearly there is room for improvement.
The most obvious approach is a row-wise data decomposition of the arrays, as is typical of parallel image processing operations. The basic process is the same as that described in this article by Larry O’Brien. The code listing is simplified and shows only how it differs from the single-threaded version. In summary, we make use of the following threading primitives:
ManualResetEvent provides a means of cross-thread communication. It is used here to indicate that a simulation step has completed and is ready for rendering.
Interlocked provides a means of updating and reading a counter in a thread-safe manner, which we use to keep track of the number of blocks remaining to be updated.
ThreadPool is a pool of worker threads managed by the Silverlight runtime.
WaitCallback is the signature for a function that can be executed on a thread from the ThreadPool.
// the ManualResetEvent is used to signal the main thread
// when a simulation step is done
ManualResetEvent done = new ManualResetEvent(false);
// starts at 1 so the count cannot reach zero while blocks are still being queued
int blocksPlusOne = 1;
WaitCallback rdStepBlock = delegate(object state)
{
    int[] range = (int[])state;
    for (int j = range[0]; j < range[1]; j++)
    {
        // process row j of the block as per single threaded code
    }
    // if this is the last block, unblock the waiting thread
    int isDone = Interlocked.Decrement(ref blocksPlusOne);
    if (isDone == 0) { done.Set(); }
};
for (int b = 0; b < NumBlocks; b++)
{
    int min = rowsPerBlock * b;
    int max = min + rowsPerBlock;
    Interlocked.Increment(ref blocksPlusOne);
    ThreadPool.QueueUserWorkItem(rdStepBlock, new int[] { min, max });
}
// release the extra count held by the queuing thread
int isDoneAlt = Interlocked.Decrement(ref blocksPlusOne);
if (isDoneAlt == 0) { done.Set(); }
// block until done
done.WaitOne();
The graph below shows the CPU activity for this method, indicating that both CPU cores are saturated. This reduces the time for ten iterations to about 220 ms. Note that cross-thread communication reduces the amount of code that can actually execute in parallel, so one of the goals of parallel programming is to minimize the need for such primitives. However, at this point the simulation is fast enough for our purposes.
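The listing computes each block as rowsPerBlock consecutive rows, which covers the whole array only when the row count is an exact multiple of NumBlocks (how rowsPerBlock is derived is not shown). As a minimal sketch of one way to handle the general case, the helper below spreads any leftover rows over the first few blocks. The names RowBlocks, Split and totalRows are hypothetical and not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class RowBlocks
{
    // Splits rows [0, totalRows) into numBlocks contiguous half-open
    // ranges {min, max}. The first (totalRows % numBlocks) blocks get
    // one extra row, so every row is covered exactly once.
    public static List<int[]> Split(int totalRows, int numBlocks)
    {
        if (totalRows < 0) throw new ArgumentOutOfRangeException("totalRows");
        if (numBlocks <= 0) throw new ArgumentOutOfRangeException("numBlocks");

        var ranges = new List<int[]>(numBlocks);
        int baseRows = totalRows / numBlocks; // rows every block receives
        int extra = totalRows % numBlocks;    // leftover rows to distribute
        int min = 0;
        for (int b = 0; b < numBlocks; b++)
        {
            int max = min + baseRows + (b < extra ? 1 : 0);
            ranges.Add(new int[] { min, max });
            min = max;
        }
        return ranges;
    }
}
```

Each returned int[] has the same { min, max } shape as the state object passed to QueueUserWorkItem in the listing, so the ranges could be queued directly in place of the min and max computed in the loop.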
Image Generation
One useful feature is the ability to generate a high-resolution PNG that can be saved to the user’s computer. To do this, we generate a base64 encoding of the PNG stream and embed it inline in a new web page using the data URI scheme. Unfortunately, only Firefox currently appears to support data URIs of the required length.
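The encoding step can be sketched as follows, assuming the PNG has already been written to a byte array. The class and method names (DataUri, FromPng) are hypothetical, and the PNG encoder itself and the code that opens the new page are not shown.

```csharp
using System;

public static class DataUri
{
    // Builds a data URI of the form data:image/png;base64,<payload>
    // from the raw bytes of an already encoded PNG.
    public static string FromPng(byte[] pngBytes)
    {
        if (pngBytes == null) throw new ArgumentNullException("pngBytes");
        return "data:image/png;base64," + Convert.ToBase64String(pngBytes);
    }
}
```

Base64 expands the payload by roughly a third, which is part of why a high resolution image quickly runs into browser limits on URI length.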
