Ogre-Next 2.3.0 Daedalus Released and Merry Christmas!

First of all, Merry Christmas to all those who celebrate it on behalf of the OGRE Team! (and if you don’t, have a nice day too!)

Second, after a bit more than a year in development, Ogre-Next 2.3.0 is released!

Magnificent work on Device Lost handling by Eugene Golushkov!

Most games don’t care too much about device lost because games can assume they own almost the entire computer while they’re running, and nothing else will be happening. A device lost is considered a critical failure and very uncommon, typically because of a Hardware or Software malfunction. Or a Windows Update in the middle of a gaming session, in which case the gaming experience is already interrupted anyway.

However this is not true for non-gaming apps: device lost can happen because of multiple reasons, but the two most common are:

  • The graphics driver is upgraded
  • Switching from power saving mode to performance or vice versa (mostly on laptops or other mobile devices)

For these two reasons, device lost is almost a certainty for long-running applications, which may see the graphics driver upgraded underneath them, and for mobile/laptop-oriented applications, where power mode switches can be very frequent.

Recovering from device lost can range from very easy to very difficult, depending on the complexity of the application and what it was doing at the time the device was lost.

Eugene’s work goes to great lengths to try to gracefully recover from a Device Lost.

Switch importV1 to createByImportingV1

In 2.2.2 and earlier we had a function called Mesh::importV1 which would populate a v2 mesh by filling it with data from a v1 mesh, effectively importing it.

In 2.2.3 users should use MeshManager::createByImportingV1 instead. This function ‘remembers’ which meshes have been created through a conversion process, which allows device lost handling to repeat this import process and recreate the resources.

Aside from this little difference, there are no major functionality changes and the function arguments are the same.
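To illustrate why ‘remembering’ the import matters, here is a minimal, self-contained sketch of the idea. It does not use Ogre at all: GpuMesh, ImportRegistry and restoreAfterDeviceLost are hypothetical names made up for this example, and the real MeshManager is certainly more involved.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a GPU-side mesh; this is not an Ogre class.
struct GpuMesh
{
    std::string sourceName;
    int generation = 0;  // bumped every time the data is (re)uploaded
};

// Hypothetical registry that remembers how each mesh was produced.
class ImportRegistry
{
public:
    using Recipe = std::function<void( GpuMesh & )>;

    // Runs the import once and keeps the recipe so it can be replayed later.
    GpuMesh &createByImporting( const std::string &name, Recipe recipe )
    {
        assert( recipe && "an import recipe is required" );
        Entry &entry = mEntries[name];
        entry.recipe = std::move( recipe );
        entry.recipe( entry.mesh );
        return entry.mesh;
    }

    // Called once the device has been recreated: replay every stored import.
    void restoreAfterDeviceLost()
    {
        for( auto &kv : mEntries )
            kv.second.recipe( kv.second.mesh );
    }

private:
    struct Entry
    {
        GpuMesh mesh;
        Recipe recipe;
    };
    std::map<std::string, Entry> mEntries;
};
```

A mesh filled in by hand, outside such a registry, leaves nothing to replay, which is roughly the limitation the old Mesh::importV1 path had.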

Shadow’s Normal Offset Bias

We’ve had a couple of complaints, but it wasn’t until user SolarPortal did more exhaustive research that we realized we were not using state-of-the-art shadow mapping techniques.

We were relying on hlmsManager->setShadowMappingUseBackFaces( true ) to hide most self-occlusion errors, but this caused other visual errors.

Normal Offset Bias is a technique from 2011 (yes, it’s old!) which drastically reduces self-occlusion and shadow acne while improving overall shadow quality, and is much more robust than using inverted culling during the caster pass.

Therefore this technique replaced the old one and the function HlmsManager::setShadowMappingUseBackFaces() has been removed.

Users can globally control normal-offset and constant biases per cascade by tweaking ShadowTextureDefinition::normalOffsetBias and ShadowTextureDefinition::constantBiasScale respectively.

You can also control them via compositor scripts in the shadow node declaration, using the new keywords constant_bias_scale and normal_offset_bias.

Users porting from 2.2.x may notice their shadows are a bit different (for the better!), but may encounter some self-shadowing artifacts, in which case they should adjust these two biases.
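For readers unfamiliar with the technique, the following is a rough sketch of one common formulation of normal offset bias, written as plain C++ rather than shader code. It is an assumption-laden illustration, not a copy of what Ogre-Next’s shaders do: Vec3, applyNormalOffset and texelWorldSize are names invented here.

```cpp
#include <cassert>
#include <cmath>

struct Vec3
{
    float x, y, z;
};

inline float dot( const Vec3 &a, const Vec3 &b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Moves a receiver position along its surface normal before the shadow map
// lookup. `normal` and `dirToLight` must be unit length. `texelWorldSize` is
// the world-space size covered by one shadow map texel (hypothetical input)
// and `normalOffsetBias` is the user-tunable scale.
inline Vec3 applyNormalOffset( const Vec3 &worldPos, const Vec3 &normal,
                               const Vec3 &dirToLight, float texelWorldSize,
                               float normalOffsetBias )
{
    assert( texelWorldSize >= 0.0f );
    float cosAngle = dot( normal, dirToLight );
    if( cosAngle < 0.0f )
        cosAngle = 0.0f;
    if( cosAngle > 1.0f )
        cosAngle = 1.0f;
    // Surfaces at grazing angles to the light need the largest offset.
    const float sinAngle = std::sqrt( 1.0f - cosAngle * cosAngle );
    const float offset = normalOffsetBias * texelWorldSize * sinAngle;
    return Vec3{ worldPos.x + normal.x * offset, worldPos.y + normal.y * offset,
                 worldPos.z + normal.z * offset };
}
```

The point of the sketch is that the offset grows with both the texel size and the angle to the light, which is why it can be tuned per cascade: farther cascades cover more world space per texel.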

Unlit vertex and pixel shaders unified

The Unlit shaders were still duplicating their code 3 times (once for each RenderSystem). All of the vertex & pixel shader code has now been unified into a single .any file.

Although this shouldn’t impact you at all, users porting from 2.2.x need to make sure old Hlms shader templates from Unlit don’t linger and get mixed with the new files.

Pay special attention that the files from Samples/Media/Hlms/Unlit match 1:1 the ones in your project and that there aren’t stray .glsl/.hlsl/.metal files from an older version.

If you have customized the Unlit implementation, you may find your customizations broken, but they’re easy to fix. For reference, look at Colibri’s two commits which ported its Unlit customizations from 2.2.x to 2.3.0.

Added HlmsMacroblock::mDepthClamp

It is now possible to toggle Depth Clamp on/off. Check whether it’s supported via RSC_DEPTH_CLAMP. All desktop GPUs should support it unless you’re using extremely old OpenGL drivers.
iOS supports it since the A11 chip (iPhone 8 or newer).

Users upgrading from older Ogre versions should be careful their libraries and headers don’t get out of sync. A full rebuild is recommended.

The reason is that HlmsMacroblock (which is used almost everywhere in Ogre) gained a new member variable. If a DLL or header gets out of sync, it likely won’t crash but the artifacts will be very funny (most likely the depth buffer will be disabled).

Added shadow pancaking

With the addition of depth clamp, we are now able to push the near plane of directional shadow maps in PSSM (non-stable variant). This greatly enhances depth buffer precision and reduces self-occlusion and acne bugs.

This improvement may make it possible for users to try using PFG_D16_UNORM instead of PFG_D32_FLOAT for shadow mapping, halving memory consumption.

Shadow pancaking is automatically disabled when depth clamp is not supported.
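As a back-of-the-envelope illustration of why a tighter near plane makes 16-bit depth more viable, the sketch below computes the depth step of a 16-bit UNORM buffer over a linear (orthographic) depth range, plus the memory cost of a shadow map. The helper names are made up for this example and the numbers are only meant to show the trend.

```cpp
#include <cassert>
#include <cstdint>

// Smallest depth difference (in world units) a 16-bit UNORM depth buffer can
// resolve for an orthographic light projection covering [nearPlane, farPlane].
// Orthographic depth is linear, so the step is the range divided by 65535.
inline double d16DepthStep( double nearPlane, double farPlane )
{
    assert( farPlane > nearPlane );
    return ( farPlane - nearPlane ) / 65535.0;
}

// Bytes needed by a square shadow map with the given bytes per texel
// (4 for a 32-bit depth format, 2 for a 16-bit one).
inline std::uint64_t shadowMapBytes( std::uint32_t resolution,
                                     std::uint32_t bytesPerTexel )
{
    return std::uint64_t( resolution ) * resolution * bytesPerTexel;
}
```

Pushing the near plane forward shrinks the range passed to d16DepthStep, so every representable step covers less distance. A 2048x2048 map drops from 16 MiB to 8 MiB when going from 4 to 2 bytes per texel.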

Vulkan is ready!

In Ogre-Next 2.3, Vulkan is considered stable. If you find a bug, please report it.

The most notable known issue is that there appear to be problems when integrating with Qt, which we haven’t looked into yet.

PluginOptional

Old timers may remember that Ogre could crash if the latest DirectX runtimes were not installed, despite having an OpenGL backend as a fallback.

This was especially true during the Win 9x and Win XP eras, when machines might not have DirectX 9.0c support. It stopped being an issue in the last decade since… well, everyone has it now.

This problem came back with the Vulkan plugin, as laptops having very old drivers (e.g. from 2014) with GPUs that were perfectly capable of running Vulkan would crash due to missing system DLLs.

Furthermore, if the GPU cannot do Vulkan, Ogre would also crash.

We added the keyword PluginOptional to the Plugins.cfg file. With this, Ogre will try to load OpenGL, D3D11, Metal and/or Vulkan; and if these plugins fail to load, they will be ignored.

Make sure to update your Plugins.cfg to use this feature to provide a good experience to all of your users, even if they’ve got old HW or SW.
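The difference between a mandatory and an optional entry can be sketched as follows. This is a simplified model, not Ogre’s actual loader: loadPlugins and the tryLoad callback are hypothetical, and the sketch assumes entries are written as Plugin=Name or PluginOptional=Name, one per line.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Parses Plugins.cfg-style text. `tryLoad` is a hypothetical callback that
// returns false when a plugin cannot be loaded (missing system DLL,
// unsupported GPU...). Returns the names of the plugins that loaded and
// throws if a mandatory one fails.
inline std::vector<std::string> loadPlugins(
    const std::string &cfgText,
    const std::function<bool( const std::string & )> &tryLoad )
{
    std::vector<std::string> loaded;
    std::istringstream stream( cfgText );
    std::string line;
    while( std::getline( stream, line ) )
    {
        const std::string::size_type eq = line.find( '=' );
        if( line.empty() || line[0] == '#' || eq == std::string::npos )
            continue;  // blank line, comment or malformed entry
        const std::string key = line.substr( 0, eq );
        const std::string name = line.substr( eq + 1 );
        if( key != "Plugin" && key != "PluginOptional" )
            continue;  // other settings, e.g. PluginFolder
        if( tryLoad( name ) )
            loaded.push_back( name );
        else if( key == "Plugin" )
            throw std::runtime_error( "Could not load mandatory plugin " + name );
        // A failed PluginOptional entry is simply skipped.
    }
    return loaded;
}
```

With the RenderSystems listed as PluginOptional, a machine that cannot load Vulkan still ends up with whatever other RenderSystems did load, instead of failing at startup.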

Other relevant information when porting

See What’s New in Ogre 2.3 from the manual for detailed info.

Also see Root Layouts section if you are customizing Hlms implementations and want to support Vulkan.

The future: Ogre 2.4

We already have a ticket tracking 2.4 roadmap.

Rather than rendering features, Ogre 2.4 will focus on making its code base more robust. There is a lot of code debt which needs to be addressed.

Most notably:

  • We will change the project from “Ogre” to “Ogre-Next”. The PR is already on its way and has been sitting on the back burner because we didn’t want to risk such a potentially breaking change so close to 2.3’s release. This change will allow installing Ogre 1.x and Ogre-Next side by side
  • Move to C++11 and up
    • Users may remember my stance on C++11 adoption. Since then, while sadly the bloat is still there (literally compiling with C++98 is just faster because std headers bring in a lot of unnecessary baggage), HW has become faster, compilers did make some marginal improvements on build speeds, and most importantly we’re seeing more trouble maintaining C++98/03 support than just moving to C++11.
    • Additionally, we’ve long been wanting to use some of the C++11 (and up) built in features such as override keyword which help improve code quality.
  • Remove dead and deprecated code
  • Remove Boost (all Boost functionality we depended on can be found on the STL in C++11)
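As a small illustration of how the override keyword from the list above helps code quality, consider this sketch. Listener and MyListener are hypothetical types invented for the example, not Ogre classes.

```cpp
#include <cassert>

struct Listener
{
    virtual ~Listener() {}
    virtual int frameStarted( float timeSinceLast )
    {
        (void)timeSinceLast;
        return 0;
    }
};

struct MyListener : Listener
{
    // With `override`, a signature mismatch (say, taking a double instead of
    // a float) becomes a compile error. In C++98 the same mistake silently
    // declares a new, unrelated function that is never called via the base.
    int frameStarted( float timeSinceLast ) override
    {
        return timeSinceLast > 0.0f ? 1 : 0;
    }
};
```

Calling frameStarted through a Listener reference dispatches to MyListener, and the compiler now guarantees that this stays true if the base class signature ever changes.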

As for features, we will work on those needed by CIVCT:

  • Metal will start using Root Layouts, just like Vulkan. This will allow us to support a lot more textures and UAVs per shader.
  • Hlms implementations have a lot of duplicate Samplers for per-pass resources. We must merge them because on D3D11 CIVCT runs out of the limit of 16 samplers.

About the 2.3.0 release

For a full list of changes see the Github release

Source and SDK are in the download page.

Discussion in forum thread.

Thanks to Open Source Robotics Corporation for sponsoring CIVCT feature for their Ignition Project

Ogre 13.2 released

Ogre 13.2.0 was just released. This “holiday release” contains mostly bugfixes, however there are also some notable additions.

Vulkan RenderSystem

The elephant in the Room is likely the addition of the Vulkan Rendersystem – as was announced earlier. Contrary to my expectation, progress was quite smooth though. This means that all basic features are already in place and the RTSS and Terrain Components support Vulkan too. Therefore, the Vulkan RenderSystem is now tagged [BETA] instead of [EXPERIMENTAL]. Still, some more advanced features are currently missing.

Fresnel Sample Running on Vulkan

Depending on your usage, you might be able to already port your application – at least you can already start familiarizing yourself with it. There are two caveats though.

Buffer updates

Currently Ogre does not try to hide the asynchronicity of Vulkan from the user and rather lets you run into rendering glitches.
The general idea of Vulkan is that you have multiple images in flight to keep the GPU busy. This means that we submit the work for the next frame without waiting for the current frame to finish.
This part hits you as soon as you try to update vertex data. If the GPU is not yet done processing it, you will get rendering glitches. In particular, your rendering will be broken if you update the data each frame.
The solution here is to either implement triple-buffering yourself or discard the buffer contents on update, which will give you new memory on Vulkan. The Ogre internals have been updated accordingly and ideally also improve performance on all other rendersystems.
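The “implement triple-buffering yourself” option can be sketched like this. It is a CPU-only model with a made-up RingBufferedVertices class, where each std::vector stands in for one GPU buffer, and it assumes the GPU is never more than NumFrames - 1 frames behind the CPU.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side model of N copies of one dynamic vertex buffer.
template <std::size_t NumFrames>
class RingBufferedVertices
{
public:
    // Advances to the next copy and returns it. That copy was last used
    // NumFrames frames ago, so under the assumption above the GPU is done
    // with it and it is safe to overwrite.
    std::vector<float> &beginFrame()
    {
        mCurrent = ( mCurrent + 1u ) % NumFrames;
        return mCopies[mCurrent];
    }

    std::size_t currentIndex() const { return mCurrent; }

private:
    std::array<std::vector<float>, NumFrames> mCopies;
    std::size_t mCurrent = NumFrames - 1u;  // the first beginFrame() yields copy 0
};
```

The discard route achieves a similar effect with less bookkeeping: in Ogre 1.x that typically means locking with HardwareBuffer::HBL_DISCARD, which tells the RenderSystem the old contents are no longer needed.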

Rendering interruption

Closely related to the above is rendering interruption. This means that after the first Renderable was submitted for the current frame (i.e. RenderSystem::_render has been called), you decide to load another Texture or update a buffer.

As we don’t know whether the update affects the current frame, we would need to interrupt the rendering, do the upload and continue where we left off. While certainly possible, we just throw an exception right now. Typically, it is much easier to just schedule your buffer updates before rendering kicks off than to order things mid-flight. And this is faster too.
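One way to “schedule your buffer updates before rendering kicks off” is to queue them and flush the queue at the start of each frame. The sketch below shows the idea with a hypothetical DeferredUpdates class. Where exactly flush() gets called (for example from a frame-started callback) is left to the application.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical queue of texture loads / buffer updates.
class DeferredUpdates
{
public:
    // Can be called at any time, including in the middle of a frame.
    void schedule( std::function<void()> update )
    {
        mPending.push_back( std::move( update ) );
    }

    // Call once per frame, before the first Renderable is submitted.
    // Returns how many updates were executed.
    std::size_t flush()
    {
        std::vector<std::function<void()>> pending;
        pending.swap( mPending );  // updates scheduled while flushing run next frame
        for( auto &update : pending )
            update();
        return pending.size();
    }

private:
    std::vector<std::function<void()>> mPending;
};
```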

Using GLSLang with GL3+

As the RTSS was extended to generate SPIRV compatible GLSL for Vulkan, it was natural to enable this path for GL3+ as well. If the gl_spirv profile is supported, you can now call

mShaderGenerator->setTargetLanguage("glslang");

to use the glslang reference compiler instead of whatever your GL driver would do.

HiDPi support in Overlays

Some dangling threads in Overlays were fixed and you can now call

Ogre::OverlayManager::getSingleton().setPixelRatio(appContext.getDisplayDPI()/96);

which will scale up the UI appropriately and generate higher resolution Fonts. The magic 96 means 96 DPI which is the common setting of all Monitors up to FullHD.
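To make the arithmetic explicit, here is a tiny sketch with a hypothetical helper, pixelRatioFromDpi. It assumes the DPI is available as a float, so the division does not truncate.

```cpp
#include <cassert>

// 96 DPI maps to a pixel ratio of 1.0, i.e. no scaling.
inline float pixelRatioFromDpi( float displayDpi )
{
    assert( displayDpi > 0.0f );
    return displayDpi / 96.0f;
}
```

A 144 DPI display gives a ratio of 1.5 and a 192 DPI display a ratio of 2. If the DPI were stored in an integer, 144 / 96 would truncate to 1, so the value should be converted to float before dividing.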

Depth of Field Sample

I have updated the dormant DoF compositor code we had in Ogre to actually do something.

The sample builds upon the code of DWORD flying around the forums and implements the following technique by Thorsten Scheuermann.

The Depth of Field compositor in action

Vulkan RenderSystem in Ogre 13

The Vulkan RenderSystem backport from Ogre-Next has now landed in the master branch and will be available with Ogre 13.2. See the screenshot below for the SampleBrowser running on Vulkan.

The code was simplified during backporting, which shows by the size reduction from ~33k loc in Ogre-Next to ~9k loc that are now in Ogre.

The current implementation pretends to have Fixed Function capabilities, which allows operating with one default shader – similarly to what I did for Metal. This shader only supports using a single 2D texture without lighting. E.g. vertex color is not supported. This is why the text is white instead of black in the screenshot above.
Nevertheless, it already runs on Linux, Windows and Android.

Proper lighting and texturing support will require some adaptations to the GLSL writer in the RTSS, as Vulkan GLSL is slightly different from OpenGL GLSL. This and the other currently missing features will hopefully come together during the 13.x development cycle. If you are particularly keen on using Vulkan, consider giving a hand.
Right now, the main goal is to get Vulkan feature-complete first, so don’t expect it to outperform any of the other RenderSystems. Due to being incomplete, the Vulkan RenderSystem is tagged EXPERIMENTAL.

Vulkan and Android support added to Ogre 2.3!

Some of you who follow me on Twitter or its Ogre thread may be aware of it.

But if you don’t: We added Vulkan support! And with it, Android support came along!

The vast majority of features and samples are already working. There are some missing pieces (see Github ticket), but overall it is much more stable and robust than I’d hoped it would be at this stage.

The last time we spoke about this was in November 2019 with our Vulkan Progress Report post. We’ve come a long way since then!

Shout-out to user Hotshot5000

This work was possible because user Hotshot5000 took my branch, forked it, and advanced it further.
The Vulkan port was a daunting, overwhelming task and his contributions greatly helped me figure out the way to make it work.

It also saved me a lot of time. Even though around 40% of his code couldn’t make it into the final version, it was still very important as a proof of concept or as a reference implementation to base from, or as a way to compare new non-working code against a working reference.

Moving forward

Documentation is still being updated. Docs on how to compile for Android are already up.

Existing applications may need to perform additional work to get Vulkan running (e.g. port shaders to Vulkan). While this isn’t difficult, there is no guide written yet.

The 2.3 preparations ticket has a list of things that have changed that may require a dev’s attention when porting from 2.2 to 2.3

This list is updated at irregular intervals; and once 2.3 is out this page is probably going to be moved somewhere else (in fact it is a draft for the News post whenever we release 2.3). But for the time being that ticket is our hub for checking 2.2 -> 2.3 changes.

Users wanting to learn how Vulkan works in Ogre may be very interested in reading the new RootLayout class documentation

That’s all for now! We’re very excited about what comes out of this.

Further discussion in forum thread.

Vulkan Progress Report

If you follow my twitter you may have seen I tweeted about it.

Or if you follow our Ogre repo, you may have seen some commits.

Yes, we’re working on Vulkan support.

So far we only got to a clear screen, so this is all you’re gonna get thus far:

It is working with 3 different drivers: AMDVLK, AMD RADV, and Intel Mesa, so that’s nice.
Only X11 (via xcb library) works for now, but more Windowing systems are planned for later.

A very low level library

Vulkan is very low level, and setting this up hasn’t been easy. The motto is that all commands are submitted in order, but they are not guaranteed to end in order unless they’re properly guarded.

Want to present on screen? You better setup a semaphore so the present command waits for the GPU to finish rendering to the backbuffer.

Submitted twice to the GPU? You better sync these two submissions or else they may be reordered.

On the plus side, a modern rendering library could take advantage of this to start rendering the next frame while e.g. compute postprocessing is happening on a separate queue on the current frame.

A lot of misinformation

There’s a lot of samples out there. But many of them are wrong or incomplete.

For example, LunarG’s official samples are wrong because they acquire the backbuffer from the GPU using the same semaphore every time instead of using one semaphore per frame.

In many of the samples this is not a problem because they perform a full stall for demo purposes, but some of the more ‘real world’ samples do not.
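The “one semaphore per frame” idea can be sketched without the Vulkan SDK by modelling the handles as plain integers. Everything here (SemaphoreHandle, FrameSync, FrameSyncPool) is a stand-in invented for illustration. In real code the handles would be VkSemaphore objects created with vkCreateSemaphore, and the acquire semaphore would be the one passed to vkAcquireNextImageKHR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for VkSemaphore so the sketch compiles without Vulkan headers.
using SemaphoreHandle = std::uint32_t;

struct FrameSync
{
    SemaphoreHandle imageAcquired;   // signalled when the swapchain image is ready
    SemaphoreHandle renderFinished;  // signalled when rendering to it is done
};

template <std::size_t FramesInFlight>
class FrameSyncPool
{
public:
    FrameSyncPool()
    {
        SemaphoreHandle next = 1u;
        for( FrameSync &sync : mFrames )
        {
            sync.imageAcquired = next++;
            sync.renderFinished = next++;
        }
    }

    // Sync objects to use for the given absolute frame number. Consecutive
    // frames never share a semaphore; a set is only reused after
    // FramesInFlight frames have passed.
    const FrameSync &forFrame( std::uint64_t frameNumber ) const
    {
        return mFrames[frameNumber % FramesInFlight];
    }

private:
    std::array<FrameSync, FramesInFlight> mFrames;
};
```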

They also do not teach how to deal with GPU systems where the present queue and the graphics rendering queue are different (I don’t know which systems have this setup, but I suspect it has to do with Optimus laptops and similar setups where GPU doing rendering is not the one hooked to the monitor).

Google’s samples are much better, but they still miss some stuff, such as inserting a barrier dependency on VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT so that the graphics queue doesn’t start rendering to a backbuffer before it has been fully acquired and no longer used for presentation.

This bug is hard to catch because often the race condition will never happen due to the nature of double and triple buffering, and in the worst case scenario this could result in tearing or similar artifacts (even if vsync is enabled).

Though there’s the possibility that failing to insert this barrier can result in severe artifacts on AMD GPUs, due to DCC compression on render targets being dirty while rendering to them. Godot’s renderer had encountered this problem.

This is covered at the end of Synchronization Examples’ Swapchain Image Acquire and Present.

Last week, Khronos released a new set of official samples. So far these seem to perform all correct practices.

A VERY good resource on Vulkan Synchronization I found is Yet another blog explaining Vulkan synchronization. It is really, really good.

If I were to summarize Vulkan, it reminds me of Javascript async/promises development: Everything is asynchronous and has to be coded with promises.

Once you get into the async mindset, Vulkan makes sense.

Where to next?

There’s a lot that needs to be done: resizing the swapchain is not yet coded, separate Graphics and Present queues are not handled, and there’s zero buffer management, no textures, no shaders.

The next task I’ll be focusing on is shaders; because they are useful to show stuff on screen and see if they’re working. Even if there are no vertex buffers yet, we can use gl_VertexID tricks to render triangles on screen.
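For those who haven’t seen the trick: the vertex shader derives a position purely from the vertex index, so a triangle covering the whole screen can be drawn with no vertex buffer bound. Below is a C++ port of the usual bit-twiddling so the resulting values can be checked. The function name is made up, and in Vulkan-flavoured GLSL the built-in is called gl_VertexIndex rather than gl_VertexID.

```cpp
#include <cassert>

struct Vec2
{
    float x, y;
};

// Derives a clip-space position from the vertex index alone.
// The three vertices form one oversized triangle that covers the
// whole [-1, 1] x [-1, 1] viewport once it has been clipped.
inline Vec2 fullscreenTrianglePos( unsigned vertexId )
{
    assert( vertexId < 3u );
    const float u = float( ( vertexId << 1u ) & 2u );  // 0, 2, 0
    const float v = float( vertexId & 2u );            // 0, 0, 2
    return Vec2{ u * 2.0f - 1.0f, v * 2.0f - 1.0f };
}
```

Vertices 0, 1 and 2 land at (-1, -1), (3, -1) and (-1, 3) respectively.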

And once shaders are working, we can then test if vertex buffers work once they’re ready, and if textures work, etc.

So that’s all for now. Until next time!