Whoa! Last time I (Matias) posted, we had a different website 🙂
What I’ve been working on:
I’ve started working on “Compute Shaders” (CS). They have been blocking future progress for far too long. They are required for modern techniques such as Forward+ and come in handy for things like tiled deferred rendering. Originally I started on compute shaders because I wanted them for a new terrain system that could generate the shadow maps in real time (yeah, there’s a new terrain system coming). CS are the only way to do it efficiently.
This has been a lot of work, and it currently lives in the unstable branch “2.1-pso-compute”. Compute Shaders themselves are already working. What we still need to support is UAV buffers, which are, so to speak, malloc’ed arrays on the GPU that shaders can both read and write. We’ve already added support for UAV textures, but UAV buffers were missing and they are more flexible.
UAV buffers need to be treated with care though, because you’ll likely want to create them from C++ (since they’re almost like a GPU malloc), but the compositor needs to be aware of them.
Why, you ask? Because the compositor is in charge of placing memory barriers and resource transitions. In other words: it needs to prevent race conditions. You see, if you run Compute Shader A, then Compute Shader B, and then render geometry, you have no guarantee that A will be completed before B starts. In fact, rendering geometry may even finish before A does!
This is great for GPU parallelism (colloquially known as Async Compute), but sucks if there are data dependencies (e.g. B depends on A, or rendering geometry depends on A or B). Memory barriers/resource transitions ensure shaders that must be run in order are executed in order while, hopefully, shader executions that are independent can run in parallel without being stalled.
Note: D3D11 inserts implicit memory barriers between compute shader executions, OpenGL only offers coarse memory barriers, and only Vulkan & D3D12 offer fine-grained memory barriers.
The Compositor is in the best spot for this kind of work because it analyzes dependencies once (during workspace initialization) and can see all input, outputs and data dependencies.
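To make the dependency analysis more concrete, here is a minimal sketch of how barriers could be planned from a list of passes. It is not Ogre's compositor code: the `Pass` struct, `hasHazard` and `planBarriers` are hypothetical names invented for this illustration, and it assumes a coarse barrier model where one barrier synchronizes everything submitted before it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one pass: the resources it reads and writes.
struct Pass
{
    std::string name;
    std::set<std::string> reads;
    std::set<std::string> writes;
};

// Returns true if 'later' must wait for 'earlier'
// (read-after-write, write-after-write or write-after-read).
bool hasHazard( const Pass &earlier, const Pass &later )
{
    for( const std::string &res : earlier.writes )
    {
        if( later.reads.count( res ) || later.writes.count( res ) )
            return true;
    }
    for( const std::string &res : earlier.reads )
    {
        if( later.writes.count( res ) )
            return true;
    }
    return false;
}

// For every pass, reports whether a barrier has to be placed before it.
// Assumption: a barrier synchronizes all passes submitted before it, so only
// passes issued since the last barrier need to be checked.
std::vector<bool> planBarriers( const std::vector<Pass> &passes )
{
    std::vector<bool> barrierBefore( passes.size(), false );
    std::size_t firstUnsynced = 0;
    for( std::size_t i = 0; i < passes.size(); ++i )
    {
        for( std::size_t j = firstUnsynced; j < i; ++j )
        {
            if( hasHazard( passes[j], passes[i] ) )
            {
                barrierBefore[i] = true;
                firstUnsynced = i; // everything before pass i is now synchronized
                break;
            }
        }
    }
    return barrierBefore;
}
```

With three passes where B reads a buffer that A writes and a third pass touches unrelated resources, this sketch places a barrier only before B; the third pass stays free to overlap with the others. Because the analysis only needs each pass's inputs and outputs, it can run once up front, which is why the compositor needs to know about every UAV.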
That means that while UAV buffers can be created and managed from C++, some control must be handed over to the compositor, or the compositor must at least be informed, to ensure proper behavior (whether via scripts or via code). This means I need to be extremely careful with the design to avoid a clueless programmer innocently setting a UAV as input/output of a compute shader directly without the compositor noticing.
To make things worse, D3D11 & OpenGL differ quite greatly in how UAV buffers should be handled.
All in all, progress is steady. Compute Shaders are coming.
Another thing that has increased in importance is multiple RenderTarget inputs to compositor workspaces. Right now we only allow defining one “final target”, which is treated as the final output (i.e. the RenderWindow), although it doesn’t necessarily have to be. The intention is to make more than one external RenderTarget available to a Compositor Workspace, which helps a lot in chaining multiple workspaces together (and is also very relevant for calculating memory barriers correctly).
Compute Shaders are defined via JSON and have access to the Hlms
This is something I’ve been wanting to do for a while. Instead of using the low level material syntax or interface (which was half ill-suited, half well-suited for the job), a new special Hlms is in charge of Compute Shaders. We already have a working example (see all possible settings), although beware it’s subject to change. Auto params work, but not all of them, since some don’t make sense because they were meant for rendering (and for the moment, attempting to use the unsuitable ones will likely result in a crash).
Why am I excited about the Hlms access? Because of the preprocessor of course!
For example, you can use the Hlms to unroll a loop based on the width of a texture, thus reducing loop overhead during execution. Unrolling a loop can be critical in fully utilizing all bandwidth when performing certain tasks (such as parallel reduction), though it can hurt performance in other cases (particularly if the resulting shader has too many instructions, or it results in high register pressure).
You can also adapt your code based on the number of threads per group or automatically modify the shader depending on whether the bound texture is MSAA or not (which would normally require defining multiple shader programs and manually selecting the correct one).
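The loop unrolling idea can be sketched with a toy text expander written in C++. This is not the Hlms preprocessor and does not use its syntax: `unrollLoop` and the `$i` placeholder are made up for this illustration. It only shows the principle of generating shader source whose repetition count is known before compilation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy expander: emits 'body' 'count' times, replacing every "$i" with the
// iteration index. The "$i" placeholder is a convention invented for this sketch.
std::string unrollLoop( const std::string &body, unsigned count )
{
    const std::string placeholder = "$i";
    std::string result;
    for( unsigned i = 0; i < count; ++i )
    {
        std::string line = body;
        const std::string index = std::to_string( i );
        std::size_t pos = 0;
        while( ( pos = line.find( placeholder, pos ) ) != std::string::npos )
        {
            line.replace( pos, placeholder.size(), index );
            pos += index.size(); // skip past the text we just inserted
        }
        result += line;
    }
    return result;
}
```

Calling `unrollLoop( "sum += texel[$i];\n", 4 )` produces four statements with no loop counter or branch left in the generated source. In a real setup the count would come from something known at shader generation time, such as the texture width or the number of threads per group.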
Once Compute Shaders are done, I can finish what little remains of this terrain system and release it. Afterwards I may end up resuming work on DERGO (an in-Blender live Ogre material editor) or continuing support for GLES3.
I really want to work more on DERGO, but GLES3 working again would mean three platforms being compatible again (OS X, Android, iOS) and once we have that in the bag, we might start talking about an official 2.1 SDK release date. Can’t say that isn’t tempting…
What else was accomplished in the past 2 months:
- User al2950 contributed PlaneBoundedVolumeSceneQuery to Ogre 2.x. This feature has been requested by many. Thanks!
- spookyboo deserves a special mention for reporting a lot of JSON bugs for our PBS materials. He has been severely stress testing our system as he’s been working on a Hlms material editor.
- A major bug involving reading of normal maps was fixed. Thanks to user GlowingPotato for noticing!
- Several fixes affecting the accuracy of our PBS implementation.
- Other minor bug fixes.
Thanks to all community users who have been reporting issues and helping make Ogre 2.1 more robust every day! Our PBS implementation has been under a lot of scrutiny lately and I like it. It hurts my ego of course, but it results in improved quality. This of course means our users can focus on the important parts and not on the technical details.
As many have already noticed, we have had a new Ogre team member for a few weeks now, so it is finally time to officially announce this great news: Eugene Golushkov (forum account: Eugene) has joined the ranks of the core team and is currently focusing on the DX11 implementation efforts.
Eugene is actively working on a commercial application using Ogre and therefore is able to provide a lot of crucial insight into the implementation’s state and is able to directly tackle issues as they get uncovered by his day-to-day work. He already contributed a lot of fixes and additions to the DX11 render system and is overall doing great work.
More details on the updated team page.
In a few hours the deadline for organizations to apply for the Google Summer of Code 2016 will end. And of course we have submitted our application to participate again in this great project.
One part of the application is an ideas list that proposes some interesting topics to potential students. The development core team compiled such a list of project ideas that are deemed very relevant. But of course this list can be extended by ideas from the community. In order to gather and discuss them, we created a thread in the forums and would encourage everyone to chime in and provide feedback either for already listed ideas or new ones.
Looking forward to your ideas!
Getting actual sales numbers for game titles is often difficult and unfortunately, we haven’t found a magic divining rod to get those numbers (yet), but with services such as SteamSpy it is at least possible to get a rough estimate for the leading game sales platform.
Our community member “bronzebeard” took it upon himself to compile a list of known Ogre-based Steam titles and their estimated sales numbers. He also promised to update the list approx. every month.
Check it out: Ogre3D Steam Game Sales List
PS: If you are aware of any missing Ogre-based applications and their sales numbers (either from Steam or some other source), let us know in this forum thread. Thanks!
Merry Christmas and Happy New Year!
If you don’t celebrate any of those two, then don’t worry. Best wishes to you too!
We’re not dead. Just been busy, and very busy.
First of all, I need to clarify that Ogre 2.1 is very stable. Several users in our forums have been under the impression that 2.1 is unstable (in terms of either crashes or a constantly changing codebase) and that is not true. Several teams are actually using 2.1 in production. We’re still some way from an official release because we don’t run on Android, iOS and OS X yet, which for some can be a deal breaker. But if you work on Windows or Linux (or support for these other platforms can come later), then you can clone the repo and start working on 2.1.
Beware most of the CMake option configurations haven’t been checked. Stick to defaults at first, and once you get the samples compiling and running, start experimenting with the other CMake options.
Also bear in mind the wiki and most plugins/addons are for 1.x; your starting point would be the samples (select OGRE_BUILD_SAMPLES2 in CMake) and the porting manual. (Recommended to view in OpenOffice or LibreOffice, then export to PDF. MS Word can open it, but it tends to screw up the formatting.)
Second, a community user, miki3d, has suggested a new logo/rebranding. What do you think? Don’t forget to stop by.
So… what’s new?
1. Added TagPoints to the new Skeleton system! This has been a sort of unfinished business for me. I’m glad it’s finally done!