This is dark_sylinc writing again…and: Oh boy! We’ve been busy!
1. Light list generation for forward lighting was threaded. Turns out, we were spending a lot of time building the light list when there are tons of objects on screen. Frame time was reduced from 10 ms to 4.5 ms on my Intel i5-4460 for 50k draws (AMD Radeon HD 7770).
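To give an idea of what threading the light list means, here is a minimal sketch. It is not Ogre's actual code: `Sphere`, `intersects`, `buildLightListRange` and `buildLightLists` are hypothetical names, and objects and lights are assumed to be simple bounding spheres. The objects are split into contiguous ranges and each worker thread fills in the list of lights touching the objects in its own range.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical bounding volume used for both objects and lights.
struct Sphere { float x, y, z, radius; };

// True when the two bounding spheres overlap.
inline bool intersects(const Sphere &a, const Sphere &b)
{
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const float r = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

// Builds the light list for the objects in [begin, end).
// Each worker writes only to its own slice of outLists, so no locking is needed.
void buildLightListRange(const std::vector<Sphere> &objects,
                         const std::vector<Sphere> &lights,
                         std::vector<std::vector<std::size_t>> &outLists,
                         std::size_t begin, std::size_t end)
{
    for (std::size_t i = begin; i < end; ++i)
    {
        outLists[i].clear();
        for (std::size_t l = 0; l < lights.size(); ++l)
            if (intersects(objects[i], lights[l]))
                outLists[i].push_back(l);
    }
}

// Splits the objects into one contiguous chunk per thread and joins at the end.
std::vector<std::vector<std::size_t>> buildLightLists(
    const std::vector<Sphere> &objects, const std::vector<Sphere> &lights,
    unsigned numThreads)
{
    std::vector<std::vector<std::size_t>> lists(objects.size());
    numThreads = std::max(1u, numThreads);
    const std::size_t chunk = (objects.size() + numThreads - 1) / numThreads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t)
    {
        const std::size_t begin = std::min(objects.size(), t * chunk);
        const std::size_t end = std::min(objects.size(), begin + chunk);
        if (begin == end)
            break; // nothing left to hand out
        workers.emplace_back(buildLightListRange, std::cref(objects),
                             std::cref(lights), std::ref(lists), begin, end);
    }
    for (std::thread &w : workers)
        w.join();
    return lists;
}
```

Since every object's list is independent of the others, the work splits cleanly and the result is the same regardless of the thread count. A real engine would presumably reuse a worker pool instead of spawning threads every frame.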
2. Ported DX11! It’s not as thoroughly tested as the GL3+ RenderSystem, so I’d stick with GL3+ if you want stability. But it’s booting up, it can take advantage of most of the AZDO enhancements, and all the samples are running. Performance benchmarks against GL3+ are inconsistent: results depend heavily on the driver (different cards, different bench results). Some samples run better on GL3+ and others on D3D11, but often only by a slight margin.
Since the samples are GPU bottlenecked, my theory is that it depends on how well the driver compiles and optimizes the GLSL shader versus how well the driver optimizes and reinterprets the HLSL IL that the D3D runtime throws at the driver.
Cards that are supposed to be supported but were not due to driver issues (i.e. Radeon HD 2000 through Radeon HD 4000) are now supported! Intel cards weren’t tested, but in theory they should be supported too. Feedback is appreciated in this area (Windows vs. Linux, and D3D vs. OpenGL).
3. We’ve been working on an experimental branch with a new technique called “Forward3D”. Sounds exciting, but it’s not really groundbreaking.
I don’t want to use deferred shading as default because it causes a lot of problems (transparency, antialiasing, multiple BRDFs). Besides, it uses a lot of bandwidth. Forward+/Forward2.5 is great, but requires DX11 HW (needs UAV) and a Z-prepass. This Z-prepass is often a turn-off for many (even though in some cases it may improve performance, i.e. if you’re heavily pixel shader or ROP [= raster operation] bound).
I came up with an original idea for a new algorithm I call “Forward3D”. It’s not superior on all accounts, but it can work on DX10 hardware and doesn’t require a Z-prepass. The light list is currently generated on the CPU. However, I think it should be able to run on Compute Shaders on DX10 hardware just fine (though I don’t know yet if generating the light list is expensive enough; it may not even be worth doing on CS, or perhaps it will).
The result is a nice generic algorithm that can run on a lot of hardware and can handle a lot of lights. Whether it performs better or worse than Deferred or Forward+ depends on the scene.
These are early screenshots. The algorithm has actually improved since then (particularly for bigger lights, it can handle a lot more lights now):
4. The community seems to be eager to compare how Ogre 2.1 fares against commercial engines. Remember that Ogre is a rendering engine while most of these engines are game engines (which means they provide much more than graphics, like physics, sounds, logic, scripting and level editors). Nonetheless Ogre seems to be doing very well!
We highly appreciate the faith you put in us!
5. Reported two Linux driver bugs to AMD. AMD has already confirmed that they will be including a fix for one of their bugs in the next Linux release. Their engineers are still working on the second bug, which has been much harder to isolate.
6. Merged all changes from:
- 1.9 → 1.10
- 1.9 & 1.10 → 2.0
- 1.9 & 1.10 & 2.0 → 2.1
Now, all enhancements that were made to 1.10 (particularly to RenderSystems) are available in 2.0 as well. We still recommend that on 2.0 you stick to D3D9 though, since it’s the fastest and most stable one. On 2.1 we recommend GL3+, but you’re now encouraged to try out the D3D11 RS as well.
7. Fixed tons of bugs as they were reported or found.
Well. There’s a lot of work that remains to be done. Ogre3D is alive and well! I’m /signing off for now.