While working with NvPerfHUD I noticed that the Caelum stars and sky are drawn first. This is not very clever, because everything rendered afterwards will
overwrite some of those pixels (maybe all of them, e.g. if you look at the ground in the demo). The pixel shader work spent on the overwritten sky pixels is wasted overhead and should be avoided.
Maybe it is possible to ensure that Caelum is drawn at the end of the queue, before transparent objects are drawn...
I am not sure how this would work. AFAIK Ogre sorts the transparent objects inside each render queue. That means there's no way to render something "before the transparent objects"; each render queue has its own transparent objects.
Implementing this requires first enabling depth testing on Caelum's materials and then moving them to later queue groups. But if you have transparent objects drawn against the sky (like smoke), you'd be on your own.
One reasonable solution here is to let the user change the render queues from the current default values. This can be done by adding a member function to every existing Caelum component, similar to how the user can already set custom visibility and query flags.
If you know what you're doing, you can draw the opaque objects first, then the sky, and then transparent objects like smoke.
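To make the setter idea concrete, here is a rough sketch of what such a member function could look like. This is not Caelum's actual code: `SkyComponent`, `attach` and the `Movable` stand-in are names I made up so the snippet compiles on its own. The only thing borrowed from Ogre is the shape of `MovableObject::setRenderQueueGroup`, and the numeric queue ids are placeholders.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for Ogre::MovableObject (assumption: only the one call we need is modelled).
struct Movable {
    std::uint8_t queueGroup = 50;
    void setRenderQueueGroup(std::uint8_t id) { queueGroup = id; }
};

// Hypothetical Caelum-style component that owns a few renderable objects.
class SkyComponent {
public:
    explicit SkyComponent(std::uint8_t defaultGroup) : mGroup(defaultGroup) {}

    // Register an object and put it in the component's current queue group.
    void attach(Movable* m) {
        m->setRenderQueueGroup(mGroup);
        mObjects.push_back(m);
    }

    // The proposed setter: remember the group and forward it to every object.
    void setRenderQueueGroup(std::uint8_t id) {
        mGroup = id;
        for (Movable* m : mObjects) {
            m->setRenderQueueGroup(id);
        }
    }

    std::uint8_t getRenderQueueGroup() const { return mGroup; }

private:
    std::uint8_t mGroup;
    std::vector<Movable*> mObjects;
};
```

A user who wants the sky after the opaque scene would call something like `sky.setRenderQueueGroup(60)` on each component. The materials would still need depth checking enabled for this to save any fill rate.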
I have to admit I don't know much about Ogre's render queues. Is there any way to force laying down the z-buffer first, as so many people recommend?
If I recall correctly, each render queue is "opaque first, transparent later". The point of using several render queues is that you might want to break that order for some reason (like drawing a translucent object on top of others, e.g. a HUD overlaid on a scene with particles). That's why it's so tricky to alter the rendering order of sky vs. scene. Indeed, there are two render queues put in Ogre specifically for this purpose: RENDER_QUEUE_SKIES_EARLY and RENDER_QUEUE_SKIES_LATE.
I'm afraid nowadays nobody will use the latter together with Caelum. The chances that you're not using any translucency in your scene are really low. It might make sense on embedded systems that can't handle translucency well and have a painfully low fill rate.
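For anyone trying to picture the ordering, here is a toy model of "queues in ascending order, and inside each queue opaque first, transparent later". It is only an illustration under that assumption, not Ogre's implementation (the real thing also sorts transparents by depth, among other things). `Renderable` and `drawOrder` are made-up names and the queue ids are placeholders.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Toy description of something to draw (hypothetical, not an Ogre type).
struct Renderable {
    std::string name;
    std::uint8_t queueGroup;  // lower groups are drawn earlier
    bool transparent;
};

// Returns the names in the order the toy model would draw them:
// by queue group first, then opaque before transparent inside a group.
std::vector<std::string> drawOrder(std::vector<Renderable> items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const Renderable& a, const Renderable& b) {
                         if (a.queueGroup != b.queueGroup) {
                             return a.queueGroup < b.queueGroup;
                         }
                         return !a.transparent && b.transparent;
                     });
    std::vector<std::string> names;
    for (const Renderable& r : items) {
        names.push_back(r.name);
    }
    return names;
}
```

With terrain and smoke in group 50 and the sky in a late group such as 95, the model gives terrain, smoke, sky, so the smoke is blended before the sky behind it exists. Moving the smoke to an even later group gives terrain, sky, smoke, which is the "opaque, then sky, then transparent" order mentioned earlier.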