Faster Hydrax!


13-10-2010 13:14:29


Hydrax is a wonderful library but a little slow, so I spent a good day optimizing Hydrax 0.5.1. The good news: FPS went up from 66 to 83-84, an increase of roughly 25%. In the spirit of the LGPL and open source in general, I am releasing the patch. Here it is: ... ects=0&d=1.

Download the latest Hydrax (0.5.1) from the link above and apply this patch. Note that you should also copy the patched include files to the Hydrax Demo, because they live in different directories. The patch also lets Hydrax compile against Cthugha (so for Shoggoth you will have to revert a few changes...)

Well, about the patch. There is nothing really fancy in it, i.e. I am not improving the algorithm, because I don't think I understand all of it. Basically I ran my profiler (Intel VTune), looked at the functions consuming the most CPU cycles, and optimized those:

:arrow: precalculate some variables that were noticeably being recalculated on every loop iteration

:arrow: inline the bottleneck functions

:arrow: there is one virtual function, Noise::update, and it is called lots of times. To make it faster, I removed the virtual keyword. The effect is that every Noise instance has to be held by its concrete type, i.e. not Noise* mNoise but Perlin* mNoise, because update is now called directly.

:arrow: probably the only algorithm I changed is in the depth, refraction and reflection listeners. Currently Hydrax iterates over all entities, and for each one over its subentities, and switches their materials to the corresponding material (e.g. the depth material). You can find the code in RttManager.cpp, specifically in 'preRenderTargetUpdate' and 'postRenderTargetUpdate'. To me, this is going to be very slow in a large scene with many entities, because pre/postRenderTargetUpdate has to loop over all of them. So my fix is this:
1) First, determine which entities will be in the water; in the demo they are all palm trees. Only those entities need to be rendered into the depth texture, so a lot of FPS can be saved in a big scene with, say, 1000 entities of which only 10 are in the water.
2) Now that you know which entities are in the water, just call mHydrax->addEntityInWater(ent) for each of them... done!
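
To make the idea concrete, here is a minimal sketch of the bookkeeping, not the code from the patch. Entity and InWaterRegistry are made-up stand-ins (they are not Ogre or Hydrax classes), and the materials are plain strings; only the name addEntityInWater comes from the patch. The point is that the pre/post callbacks loop over the registered entities only, never over the whole scene.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an Ogre entity: a name and its current material.
struct Entity {
    std::string name;
    std::string material;
};

// Hypothetical registry: remembers which entities are in the water so the
// render-target callbacks only have to touch those.
class InWaterRegistry {
public:
    void addEntityInWater(Entity* ent) {
        assert(ent != nullptr);
        for (Entity* e : mEntities)
            if (e == ent) return;          // already registered
        mEntities.push_back(ent);
    }

    // Before the depth texture is rendered: save each registered entity's
    // material and switch it to the depth material.
    void preRenderTargetUpdate(const std::string& depthMaterial) {
        mSaved.clear();
        for (Entity* e : mEntities) {
            mSaved.push_back(e->material);
            e->material = depthMaterial;
        }
    }

    // After the depth texture is rendered: restore the saved materials.
    void postRenderTargetUpdate() {
        for (std::size_t i = 0; i < mSaved.size(); ++i)
            mEntities[i]->material = mSaved[i];
        mSaved.clear();
    }

    std::size_t size() const { return mEntities.size(); }

private:
    std::vector<Entity*> mEntities;   // entities registered as "in water"
    std::vector<std::string> mSaved;  // their materials during the RTT pass
};
```

With 1000 entities in the scene and 10 registered, each callback does 10 material switches instead of 1000; entities that were never registered are simply left alone.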

Having said that, I have not fully tested this patch but I *think* it works...

And there are many other optimizations that can be done, e.g. multithreading using JobSwarm, but this will take more time. I tried it previously and the FPS counter just got worse. I still believe it is the correct approach; it just needs fine-tuning, i.e. balancing the number of jobs spawned against the workload of each job (give a job too little or too much work and the FPS suffers). If this gets done correctly, I think the performance can go up by at least 50%... drool.. hehe
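
The jobs-versus-workload trade-off can be sketched without any threading library. The helper below is hypothetical (splitIntoJobs, maxJobs and minGrain are names made up for this example, and nothing here is JobSwarm API): it splits a range of work items into at most maxJobs chunks, but never makes a chunk smaller than minGrain items unless there is too little work for even one such chunk. Tuning then comes down to measuring with different minGrain values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns half-open [begin, end) ranges covering 0..total.
// maxJobs : upper bound on the number of jobs (e.g. the core count).
// minGrain: smallest workload worth the overhead of spawning a job.
std::vector<std::pair<std::size_t, std::size_t>>
splitIntoJobs(std::size_t total, std::size_t maxJobs, std::size_t minGrain)
{
    std::vector<std::pair<std::size_t, std::size_t>> jobs;
    if (total == 0 || maxJobs == 0) return jobs;
    if (minGrain == 0) minGrain = 1;

    // Do not create more jobs than there are minGrain-sized pieces of work.
    const std::size_t byGrain = std::max<std::size_t>(1, total / minGrain);
    const std::size_t count = std::min(maxJobs, byGrain);

    // Spread the remainder over the first few jobs so sizes differ by at most 1.
    const std::size_t base = total / count;
    const std::size_t extra = total % count;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        jobs.emplace_back(begin, begin + len);
        begin += len;
    }
    assert(begin == total);  // the ranges cover all the work exactly once
    return jobs;
}
```

For example, 100 items with maxJobs = 8 and minGrain = 30 give only 3 jobs, because 8 jobs of about 12 items each would fall below the grain size.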

The Ogre library shows up in the profiler as consuming ~20% of the time, and half of that is spent in... Vector3::normalisedCopy and Vector3::dotProduct. So again, a SIMD class like in here: can be inserted for another huge FPS gain!
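
As a rough illustration of what SIMD buys for dot products, here is a sketch that computes four dot products at once with SSE intrinsics. It is not Ogre code and not the SIMD class mentioned above: dot4 and the structure-of-arrays layout (separate x, y and z arrays) are assumptions made for this example, and a scalar fallback is used when SSE is not available. Whether it actually helps Hydrax would have to be measured.

```cpp
#include <cassert>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define SKETCH_USE_SSE 1
#else
#  define SKETCH_USE_SSE 0
#endif

// Computes out[i] = a[i] . b[i] for i = 0..3, where vector i of "a" is
// (ax[i], ay[i], az[i]) and likewise for "b". Every pointer must refer to
// at least 4 floats; no alignment is required.
inline void dot4(const float* ax, const float* ay, const float* az,
                 const float* bx, const float* by, const float* bz,
                 float* out)
{
    assert(ax && ay && az && bx && by && bz && out);
#if SKETCH_USE_SSE
    // One multiply and one add per component, each handling 4 vectors.
    __m128 r = _mm_mul_ps(_mm_loadu_ps(ax), _mm_loadu_ps(bx));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(ay), _mm_loadu_ps(by)));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(az), _mm_loadu_ps(bz)));
    _mm_storeu_ps(out, r);
#else
    // Scalar fallback with the same order of operations.
    for (int i = 0; i < 4; ++i)
        out[i] = ax[i] * bx[i] + ay[i] * by[i] + az[i] * bz[i];
#endif
}
```

Normalisation follows the same pattern: compute the four squared lengths with dot4(a, a), take the square roots, and divide each component, taking care of zero-length vectors.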

Also, this patch is meant for VC++; I am using VC 2005. It contains non-standard keywords like __forceinline, so for non-VC programmers I am not sure how it will work out.
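
One common way around this, sketched here and not part of the patch, is to hide the keyword behind a macro. HYDRAX_SKETCH_FORCEINLINE and lerpSketch are names made up for this example; __forceinline is the MSVC keyword and __attribute__((always_inline)) is the GCC/Clang counterpart.

```cpp
#include <cassert>

// Pick the strongest "please inline this" spelling the compiler understands.
#if defined(_MSC_VER)
#  define HYDRAX_SKETCH_FORCEINLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define HYDRAX_SKETCH_FORCEINLINE inline __attribute__((always_inline))
#else
#  define HYDRAX_SKETCH_FORCEINLINE inline
#endif

// Example of a small, hot helper that is worth force-inlining.
HYDRAX_SKETCH_FORCEINLINE float lerpSketch(float a, float b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return a + (b - a) * t;
}
```

Replacing each __forceinline in the patched sources with such a macro should let the same code build on other compilers, although the inlining decisions and the resulting speedup may differ.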


26-11-2010 07:33:41

Would you please release a version of your Hydrax with binaries and source,
instead of a patch that has to be applied?