News

Ogre ecosystem roundup #4

Following the last post about what is going on around Ogre, here is another update.

Ogre 1.12.6 point release

The 1.12.6 point release kept its focus on integration. Notably, it ships the Qt OgreBites implementation that was discussed in the previous post. The other notable changes are:

  • The OSX & iOS support was vastly improved. If you had any issues with using Ogre in these environments, please try again with Ogre 1.12.6. The improvements include scaling the input events along with the window content scaling and corrected library resolution from inside an .app bundle. I also started backporting the Metal RenderSystem to the 1.x series. See the metal-wip branch. Help is welcome.
  • Ogre 1.12 is now also available in Debian sid & unstable. Notably, this also covers the Python bindings and imgui, so prototyping a 3D application is only an apt install python3-ogre-1.12 away. This is a major step forward from the previous Ogre 1.9 package and allows 3rd party apps like ROS to rely on it. These packages are also available in the latest Ubuntu 20.04 release, which is based off Debian unstable.
  • Next, the MSVC SDK was refined: it now properly packages and compiles the C# bindings. You no longer have to compile any Ogre C# sources yourself. As for C++, the pdb files are finally included, which allows you to use the SDK for debugging (note that the build type is still RelWithDebInfo and not Debug).
  • The refined “Fixed Pipeline Enabled” RenderSystem option allows you to bring the legacy GL & D3D9 RenderSystems into a programmable-pipeline-only mode. Setting it to false is the first step if you consider porting to GL3+ & D3D11, as it will enable the RTSS shader generation while keeping everything else the same. It also allows you to toggle back and forth between shader based and fixed pipeline rendering to compare the results, instead of having to pull the plug once and for all.
  • For the full list of changes and bugfixes, as always, consult the github release.

OGRECave ecosystem

Probably the biggest strength of Ogre 1.x is the legacy of content creation tools and addons. Here, I resurrected the following tools beneath the OGRECave umbrella:

  • The meshmagick CLI .mesh optimization and manipulation tool. It allows you to conveniently change the coordinate system or merge all sub-meshes into one buffer.
  • The shiny shader meta-compiler & management library. In the mid-term we want to extend the RTSS with similar features, and we generally recommend preferring the RTSS. However, there are some existing projects relying on this library, and this allows them to upgrade to recent Ogre releases.
  • The ogre-gpgpu toolkit. This is a collection of many Computer Vision and Augmented Reality related tools built around Ogre. So far, I only brought back the Ogre-CUDA bridge.
    When it is ready, this probably will be moved into the core repository. Therefore, this project can be considered as a staging area.

Qt Ogre3D integration now available in master

While there have been snippets to provide Ogre integration with Qt for a long time, there is now an officially provided version in master and scheduled for Ogre 1.12.6.

This integration requires Qt5 and builds upon the ApplicationContext abstraction living in OgreBites which already handles SDL2 windowing and Activities on Android.
In contrast to previous attempts this means that it does not follow the “QtOgreWidget approach”. This might sound less convenient, but is necessary to properly handle multiple Ogre Windows or Ogre Views. Also it should be familiar for everybody who is using the QApplication API.

The implementation lives in a separate libOgreBitesQt.so library which is only created when Qt is detected when building – so if you do not use it, you do not have to care about Qt dependencies.

The API is designed to be a drop-in replacement for ApplicationContext. This means that you can just take the setup tutorial, but use the ApplicationContextQt instead and your app will be Qt5 based.
Also, because of the Input event abstraction we did for Ogre 1.11.0, the CameraMan and Trays code will continue working – just like the Event forwarding to ImGui.

Furthermore, I have ensured that the API also fits when the Qt Event loop is used and adapts to existing projects. For this, I have ported ogitor and spacescape to the new API.
Notably, with spacescape the Ogre view is now only redrawn on-demand when things change (e.g. settings, window resize).

The exposed API is QWindow based making it lightweight as only the QtGui module is required. Also this should allow extending it for QtQuick in the future, which is also QWindow based.

For details on integration see the docs.

Ogre 1.12 User Survey Results

During the period of Feb 29. – March 31. we received 47 replies. In the same period, the Ogre 1.12.5 Windows SDK alone was downloaded 437 times. So while the results are significant, they are probably not representative.

The most interesting result is probably this

When considering the boosted votes of the patreon supporters, the enterprise and enthusiasts parts increase. Still, the enterprise fraction remains dominant.

But, as statistics are lies, better take a look at the actual numbers yourself.

Specific replies

Following the #MeanTweets idea I also wrote some short replies to the criticism, that you can read below:

read more…

Ogre 1.12 User Survey 2020

Those of you who have been around Ogre for some time might remember that back in 2018, we conducted a survey about our user base, the results of which can be found here.

For the 1.13 development cycle, we would like to find out which features are used most, so that we can focus development accordingly.

So for the next four weeks until the 29th of March, you have the chance to participate and help us to get an impression about our user base, how Ogre is used and share some wishes for the future. Simply follow the link and make your way through the 13 questions. It should not take up much time since most of the questions are simple checkbox or radio button questions.

Link to survey

We want to thank you all upfront for helping us to develop Ogre further and getting some valuable insight information about the people using the engine!

PS: We would be glad if you could spread the word about the survey via all available channels to all potential Ogre users, because: The more participants, the more accurate are the results of course.

RTSS: Scriptable Render Pipeline the OGRE way

OGRE scripts offer a way to define materials at an abstraction level similar to D3D Effects and CgFX where you can define alternative techniques, each consisting of one or multiple passes.
Here, each pass defines a render pipeline state by defining blend modes and referencing shaders. The main difference in OGRE is that you cannot write inline shaders in the script file as we support different render systems with different shading languages.

Traditionally, OGRE allows you to use fixed function pipeline (FFP) functionality where you do not have to write any shaders, as long as Phong shading and a fixed set of texture operations are enough for your use case.

However, modern Render Systems like D3D11 or GL3 no longer include FFP parts to reflect that modern hardware does not either and is rather based on unified programmable SIMT pipelines.

To abstract from this difference, OGRE therefore offers the Real Time Shader System (RTSS) component, which generates shaders that seamlessly replace the absent FFP functionality. In most cases OGRE is able to produce pixel-perfect results.

However, as the RTSS generates shaders internally, you can customize the rendering in much more detail than was possible with the FFP. Here, you do not have to write your own shaders but can keep the high abstraction that OGRE scripts offer and just use the rtshader_system section to declare the features you want. Still, this gives you a large amount of control over how things are rendered.

The simplest thing to do is to enable per-pixel lighting (which is the default in 1.12 anyway) or to make the shading respect the physical energy conservation rule as described here.

However, the RTSS also enables you to create complex custom render pipelines via OGRE scripts as it offers the following features (the emphasized parts require OGRE 1.12.5)

  • Hardware skinning
  • Instancing
  • GBuffers
  • Depth texture shadows
  • Lighting models
  • Triplanar Texturing
  • Offset mapping

Below are some examples of what this might look like:

The first screenshot shows the instancing sample, where the RTSS extended the vertex shader to read from the instance buffer as well as the fragment shader to apply depth based shadows. If you switch to the PF_DEPTH format for the depth texture, it will automatically use hardware PCF as it does not incur any performance penalty.

The second screenshot shows integrated offset mapping with multiple lights. As this is handled by the RTSS as well, it can be combined with hardware skinning and instancing – all you need is to add a single line in your material. There is no need to touch any shader code, and the result stays compatible with all supported render systems.

See the respective Samples on how to integrate this in your own projects.

Deprecation of the HLMS backport

If you are familiar with OGRE, you probably also know that there is the High Level Material System Component in OGRE1. Actually, this Component is a backport of the respective core element of OGRE-next (2.1+), where it handles shader variations and thus has a goal similar to that of the RTSS.

However, it got only little love in OGRE1 after the initial backport, so even as of today there is no way to use it from OGRE scripts. Also, I am not aware of any users, as there was not a single bug report regarding the HLMS.
To reflect that the RTSS is in all cases the preferred alternative, the HLMS is therefore deprecated in OGRE1 and will be removed with the next release, if nobody steps up to object.

Modern OpenGL with OGRE1

Everybody is starting into a new year with good resolutions, so you can now take advantage of modern OpenGL3+ concepts with OGRE.

Shader storage blocks

The first one is “Uniform Buffer Objects” (UBO) and “Shader Storage Buffer Objects” (SSBO) or simply “shared GPU Parameters” in OGRE speak.
The shared GPU parameters have gained a backing HardwareBuffer which is used for communicating with the GPU.
With OpenGL this can be either a UBO or SSBO which is detected from your shader code and automatically bound if possible.
In case of a SSBO it is possible to read-back the data from the GPU, which is triggered by the new GpuSharedParameters::download() method.
This gives you an easy way to retrieve results from the GPU without rendering to a Texture or Vertex Buffer.

Separate Shader Objects and SPIRV

Traditionally OGRE internally uses monolithic programs for GLSL that explicitly glue vertex and fragment shaders together. Notably, this means the GpuProgramParameters are only valid per combination and not per individual shader.

However, from the API perspective Ogre always exposed the DirectX model without explicit grouping – e.g. in material scripts. This approach is commonly referred to as “mix and match”.

Ogre tries hard to hide this difference for you. For instance one can only retrieve the active uniforms from GLSL once it is linked. This happens on the first render call in OpenGL – instead of at material parsing time when OGRE would need it.
Therefore OGRE goes ahead and parses the GLSL source code itself to figure out the available uniforms. Needless to say, this is quite error prone and does not support more advanced constructs e.g. struct uniforms.

Fortunately, OpenGL provides the DirectX like behaviour via “Separate Shader Objects” (SSO) that allow linking individual shaders and bundling them to pipelines later.
Now we finally take advantage of them and, while at it, also use ARB_program_interface_query to query the uniforms in a standard way that covers all corner cases.
Notably, this allows us to reference uniforms by location only – like in the good old assembly days:

vertex_program Ogre/Compositor/StdQuad_Tex2a_vp spirv
{
    source StdQuad_Tex2a_vp.vert.spv
    default_params
    {
        param_indexed_auto 0 worldviewproj_matrix
    }
}

The corresponding GLSL code for that uniform would be

layout(location = 0) uniform mat4 worldViewProj;

You might wonder why you should care. If you look closely, the material snippet above is using SPIRV binary shaders, where the uniform names are stripped away and only the locations are available.

Therefore this is necessary for support of pre-compiled SPIRV shaders, which is now complete.

And while we are at it – why are SPIRV shaders cool? Well, you can compile HLSL to SPIRV and then use it with OpenGL 😉
Also, this is a prerequisite for the Vulkan back-end, and this way you can prepare your shader authoring accordingly.

If you would like to learn more about using SPIRV in OGRE, see here.

Currently (in master) you have to manually enable SSO support via the “Separate Shader Objects” RenderSystem option.
Using OpenGL for shader parsing has some side effects; notably you will now get errors when trying to set an unused (and thus optimized away) uniform. Typically this means your shader is broken – but we traditionally keep your code working during a release support period.

Ogre 1.12.4 released

I just published the Ogre 1.12.4 holiday release. Besides wishing you all a merry Christmas, there are new features that deserve an in-depth description.

OGRE_NODELESS_POSITIONING

Using Cameras and Lights without having them attached to a SceneNode was already deprecated with the 1.10 release, and you have been getting compiler warnings if you attempted to do so since then.

With the OGRE_NODELESS_POSITIONING=OFF build option, we now allow actually taking advantage of having the positioning code in the Nodes.

With this option all positioning code in Cameras and Lights will be disabled, which results in faster updates and notably smaller memory footprint, which is

  • 12% less for Lights
  • 7% less for Cameras

As the node-less positioning API will be gone as well, you should make sure that you trigger no warnings in this regard.

For the most part the porting should look like

// before
mLight->setDirection(...);

// after
mSceneMgr->getRootSceneNode()->createChildSceneNode()->attachObject(mLight);
mLight->getParentSceneNode()->setDirection(..., Node::TS_WORLD);

Also refer to the notes on the deprecation page. Some additional caveats to look out for are:

  • SceneNodes do not use a fixed yaw axis, while Cameras do
  • SceneNode::setDirection uses TS_LOCAL by default while Cameras and Lights behaved like TS_WORLD

Background shader compilation

The GpuProgram code got refactored and now properly respects the prepare and load states. This means that the shaders can be loaded and pre-processed in a background thread.

With D3D this additionally allows compiling the shaders in the background, which is quite handy given that HLSL compilation times are in the order of seconds.

You are probably thinking “This is great and all, but how to do background resource loading in Ogre?”. Given that this was a common question for years, there is finally a corresponding tutorial.

Other notable changes

  • Continued effort of porting Samples away from Cg
  • Documented the Matrix conversion behavior between OGRE (row-major) and GLSL (column-major)
  • Compilation with OGRE_CONFIG_DOUBLE=TRUE works again
  • The built-in shadow Renderer now correctly handles multiple shadow casting lights with the RTSS
  • The RTSS now fully supports linear skinning and Dual Quaternion skinning with shearing (GLSL, GLSLES and HLSL). The manual was updated for the available options.
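
To illustrate the row-major versus column-major point from the list above, here is a minimal, self-contained sketch that does not use any OGRE types. The Mat4 alias and the atRowMajor and swapStorageOrder helpers are made up for this example. It only shows what the storage difference means for a flat array of 16 floats, not how OGRE performs the conversion internally.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// 4x4 matrix stored as 16 contiguous floats (hypothetical type for this sketch).
using Mat4 = std::array<float, 16>;

// Read element (row, col) from a matrix kept in row-major storage.
float atRowMajor(const Mat4& m, std::size_t row, std::size_t col)
{
    assert(row < 4 && col < 4);
    return m[row * 4 + col];
}

// Switch between row-major and column-major storage.
// Element (r, c) lives at index r*4 + c in row-major storage and at
// index c*4 + r in column-major storage, so the conversion is a transpose
// of the flat array and applying it twice gives back the input.
Mat4 swapStorageOrder(const Mat4& m)
{
    Mat4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[c * 4 + r] = m[r * 4 + c];
    return out;
}
```

For a translation matrix, the translation sits in the last column. In row-major storage that is indices 3, 7 and 11, while after swapStorageOrder the same values end up at indices 12, 13 and 14, which is where a column-major consumer expects them.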

Ogre 1.12.3 released

Ogre 1.12.3 was just released. Typically we do not write a specific announcement for minor updates; however, this one contains some major new features that warrant one.

Of course there is the usual slew of bug-fixes as well, which are listed here.

New Features

  • Reversed-depth buffer support for D3D11 and OpenGL3+. See the accompanying tutorial for details.
  • Full Unicode Path support, including ZIP archives, on Windows (on by default)
  • The Real Time Shader System now uses ShaderModel4 style texture sampling, which fixes multiple samples (mainly depth and 1D texture related)
  • Overlays now properly support content scaling, which is needed for HiDPI screens.
  • Native ImGui support through the Overlay component
The new ImGui Sample

Vulkan Progress Report

If you follow my twitter you may have seen I tweeted about it.

Or if you follow our Ogre repo, you may have seen some commits.

Yes, we’re working on Vulkan support.

So far we only got to a clear screen, so this is all you’re gonna get thus far:

It is working with 3 different drivers: AMDVLK, AMD RADV, and Intel Mesa, so that’s nice.
Only X11 (via xcb library) works for now, but more Windowing systems are planned for later.

A very low level library

Vulkan is very low level, and setting this up hasn’t been easy. The motto is that all commands are submitted in order, but they are not guaranteed to end in order unless they’re properly guarded.

Want to present on screen? You better set up a semaphore so the present command waits for the GPU to finish rendering to the backbuffer.

Submitted twice to the GPU? You better sync these two submissions or else they may be reordered.

On the plus side, a modern rendering library could take advantage of this to start rendering the next frame while e.g. compute postprocessing is happening on a separate queue on the current frame.

A lot of misinformation

There’s a lot of samples out there. But many of them are wrong or incomplete.

For example, LunarG’s official samples are wrong because they acquire the backbuffer from the GPU using the same semaphore every time instead of using one semaphore per frame.

In many of the samples this is not a problem because they perform a full stall for demo purposes, but some of the more ‘real world’ samples do not.
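
The “one semaphore per frame” idea can be sketched without any Vulkan headers. The snippet below is only a model: SemaphoreSlot and FrameSyncRing are hypothetical names, and the integer ids stand in for real VkSemaphore handles that a renderer would create with vkCreateSemaphore. It is not taken from the Ogre Vulkan code. It just shows the bookkeeping of cycling through one pair of semaphores per frame in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a pair of VkSemaphore handles (hypothetical, for illustration).
struct SemaphoreSlot
{
    std::uint32_t imageAcquired;   // signalled once the backbuffer is acquired
    std::uint32_t renderFinished;  // signalled once rendering is done; present waits on it
};

// Hands out one pair of semaphores per frame in flight, instead of
// reusing a single semaphore for every acquire.
class FrameSyncRing
{
public:
    explicit FrameSyncRing(std::size_t framesInFlight)
    {
        assert(framesInFlight > 0);
        for (std::size_t i = 0; i < framesInFlight; ++i)
            mSlots.push_back({std::uint32_t(2 * i), std::uint32_t(2 * i + 1)});
    }

    // Slot to use for the frame that is about to be recorded.
    const SemaphoreSlot& current() const { return mSlots[mFrame % mSlots.size()]; }

    // Call once per frame, after submitting and presenting.
    void advance() { ++mFrame; }

private:
    std::vector<SemaphoreSlot> mSlots;
    std::size_t mFrame = 0;
};
```

With two frames in flight, consecutive frames get different semaphores and the third frame reuses the first pair. A real renderer would additionally have to make sure the GPU is done with a slot (typically via a fence) before reusing it.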

They also do not teach how to deal with GPU systems where the present queue and the graphics rendering queue are different (I don’t know which systems have this setup, but I suspect it has to do with Optimus laptops and similar setups where the GPU doing the rendering is not the one hooked to the monitor).

Google’s samples are much better, but they still miss some stuff, such as inserting a barrier dependency on VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT so that the graphics queue doesn’t start rendering to a backbuffer before it has been fully acquired and is no longer used for presentation.

This bug is hard to catch because often the race condition will never happen due to the nature of double and triple buffering, and in the worst case scenario this could result in tearing or similar artifacts (even if vsync is enabled).

There is, however, the possibility that failing to insert this barrier can result in severe artifacts on AMD GPUs due to DCC compression on render targets being dirty while rendering to them. Godot’s renderer had encountered this problem.

This is covered at the end of Synchronization Examples’ Swapchain Image Acquire and Present.

Last week, Khronos released a new set of official samples. So far these seem to perform all correct practices.

A VERY good resource on Vulkan Synchronization I found is Yet another blog explaining Vulkan synchronization. It is really, really good.

If I were to summarize Vulkan, it reminds me of Javascript async/promises development: Everything is asynchronous and has to be coded with promises.

Once you get into the async mindset, Vulkan makes sense.

Where to next?

There’s a lot that needs to be done: resizing the swapchain is not yet coded, separate Graphics and Present queues are not handled, there’s zero buffer management, no textures, no shaders.

The next task I’ll be focusing on is shaders; because they are useful to show stuff on screen and see if they’re working. Even if there are no vertex buffers yet, we can use gl_VertexID tricks to render triangles on screen.
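
For readers who have not seen such gl_VertexID tricks, here is one common variant emulated on the CPU: a single oversized triangle that covers the whole screen, with positions derived purely from the vertex index. This is an assumption about which trick might be used, not a description of the actual Ogre code. Vec2 and fullscreenTriangleVertex are names made up for this sketch.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// CPU-side emulation of a widely used vertex shader idiom:
//   vec2 uv  = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);
//   vec2 pos = uv * 2.0 - 1.0;
// Drawing 3 vertices without any vertex buffer bound yields a triangle
// that fully contains the [-1, 1] x [-1, 1] clip-space square.
Vec2 fullscreenTriangleVertex(int vertexId)
{
    assert(vertexId >= 0 && vertexId < 3);
    const float u = float((vertexId << 1) & 2);  // 0, 2, 0
    const float v = float(vertexId & 2);         // 0, 0, 2
    return {u * 2.0f - 1.0f, v * 2.0f - 1.0f};
}
```

The three vertices come out as (-1, -1), (3, -1) and (-1, 3). The parts of the triangle outside the viewport are clipped, so every pixel gets shaded exactly once.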

And once shaders are working, we can then test if vertex buffers work once they’re ready, and if textures work, etc.

So that’s all for now. Until next time!