[2.1] How to replace StaticGeometry?

JuBe
Gnoblar
Posts: 24
Joined: Wed Jan 10, 2007 2:45 pm
x 1

Post by JuBe »

Hi,

Our application has multiple objects that use the same material, but their meshes are unique.
With Ogre 1.9 we used StaticGeometry to reduce API draw calls, but now that we have ported our application to 2.1 we can't use that anymore.

Is there currently some way to reduce API draw calls in Ogre 2.1?

I read that the old StaticGeometry code will be replaced/ported, but is there a timetable for that?
dark_sylinc
OGRE Team Member
Posts: 5298
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1279

Re: [2.1] How to replace StaticGeometry?

Post by dark_sylinc »

There is no StaticGeometry replacement in Ogre 2.1, and there likely won't be. Use SCENE_STATIC for the nodes and objects, though. There are plans for specialized versions, however (e.g. grass rendering is a particularly interesting subject, and sometimes tree rendering as well).
In certain cases we'd still want to batch objects into the same mesh, as the goal is to keep a reasonable Items / vertex_count ratio (more Items means more CPU work and usually less GPU work due to better culling, while fewer Items with merged meshes means the opposite: less CPU work and usually more GPU work). For such cases we're planning (no ETA) to provide a simple mesh merger, though technically you could do that yourself; we would only provide it out of the box for convenience.

Draw call overhead in Ogre 2.1 is very low, and performance is affected in very different ways compared to 1.x and 2.0.

Is there something in particular you were rendering with StaticGeometry? How large were your scenes? (object count, vertex count, number of materials, etc)

Re: [2.1] How to replace StaticGeometry?

Post by JuBe »

We were rendering logs with StaticGeometry. It was perfect for them, because they shared the same material but each of them had its own geometry.
The geometry of one log is pretty basic: ~50 vertices per log.

I did some profiling using Nsight and, if I understand the results correctly, the main bottleneck is the number of draw calls (over 5000 of them).
I also did the 2x2 textures, Null scissor rectangle and Minimum geometry tests, and they had no effect on the drawing performance.
http://http.developer.nvidia.com/Parall ... ntrols.htm

Here is the Nsight API Call Summary from a single frame.

Code:

API Call Summary
Call Type, Call Count
Draws:,5414
Dispatches:,0
Clears:,6
Blits:,0
Presents:,1
Command List Executes,0
Non-API:,1
Other:,11061
Total:,16483

API Call Details
API Call,Count,Avg CPU µs,Σ CPU µs,Avg GPU µs,Σ GPU µs
ID3D11DeviceContext1::DrawIndexedInstanced(),5412,<1,468,21,114968
ID3D11DeviceContext1::VSSetShaderResources(),2697,<1,524,0,0
ID3D11DeviceContext1::PSSetShaderResources(),2686,<1,426,0,0
ID3D11DeviceContext1::VSSetSamplers(),2664,<1,277,0,0
ID3D11DeviceContext1::PSSetSamplers(),2664,<1,238,0,0
ID3D11DeviceContext1::PSSetConstantBuffers(),32,<1,9,0,0
ID3D11DeviceContext1::Map(),28,3,72,0,0
ID3D11DeviceContext1::Unmap(),28,1,41,0,0
ID3D11DeviceContext1::VSSetConstantBuffers(),25,<1,11,0,0
ID3D11DeviceContext1::IASetVertexBuffers(),22,<1,7,0,0
ID3D11DeviceContext1::VSSetShader(),17,<1,8,0,0
ID3D11DeviceContext1::GSSetShader(),17,<1,5,0,0
ID3D11DeviceContext1::HSSetShader(),17,<1,4,0,0
ID3D11DeviceContext1::DSSetShader(),17,<1,2,0,0
ID3D11DeviceContext1::PSSetShader(),17,<1,6,0,0
ID3D11DeviceContext1::IASetInputLayout(),17,<1,8,0,0
ID3D11DeviceContext1::IASetPrimitiveTopology(),17,<1,3,0,0
ID3D11DeviceContext1::IASetIndexBuffer(),16,<1,3,0,0
ID3D11DeviceContext1::HSSetShaderResources(),9,1,6,0,0
ID3D11DeviceContext1::DSSetShaderResources(),9,<1,4,0,0
ID3D11DeviceContext1::GSSetShaderResources(),9,<1,3,0,0
ID3D11DeviceContext1::CSSetShaderResources(),9,<1,4,0,0
ID3D11DeviceContext1::OMSetRenderTargets(),9,1,12,0,0
ID3D11DeviceContext1::RSSetViewports(),5,1,3,0,0
ID3D11DeviceContext1::RSSetScissorRects(),5,<1,2,0,0
ID3D11DeviceContext1::RSSetState(),5,1,4,0,0
ID3D11DeviceContext1::OMSetBlendState(),5,<1,2,0,0
IDXGISwapChain3::GetDesc(),4,2,8,0,0
ID3D11DeviceContext1::ClearRenderTargetView(),3,<1,1,0,0
ID3D11DeviceContext1::ClearDepthStencilView(),3,<1,1,0,0
ID3D11Device1::CreateDepthStencilState(),2,9,17,0,0
ID3D11DepthStencilState::Release(),2,<1,1,0,0
ID3D11DeviceContext1::OMSetDepthStencilState(),2,<1,1,0,0
ID3D11DeviceContext1::GetData(),1,8,8,0,0
ID3D11Query::Release(),1,<1,<1,0,0
ID3D11DeviceContext1::SOGetTargets(),1,<1,<1,0,0
ID3D11DeviceContext1::Draw(),1,<1,<1,39,39
ID3D11DeviceContext1::DrawInstanced(),1,1,1,28,28
ID3D11Device1::CreateQuery(),1,17,17,0,0
ID3D11DeviceContext1::End(),1,5,5,0,0
IDXGISwapChain3::Present(),1,426,426,0,0

I did find one problem that we can solve easily (each log caused 3 draw calls because the geometry was created poorly in code).
But even after that fix the number of draw calls can be really high, because we can have thousands of logs on screen.

So are the only solutions to this problem to manually merge objects together or to limit the object count on screen?

Re: [2.1] How to replace StaticGeometry?

Post by dark_sylinc »

Hi!

5000 draws is a walk in the park for Ogre 2.1.
CPU side, you can expect around 50,000-200,000 draws before you begin to drop below 60 fps (this depends on the CPU, the number of cores, and other misc factors like material settings, number of materials and presence of skeletally animated objects, and assumes you don't hit a GPU bottleneck first). For comparison, Ogre 1.x would start getting really bad after >2000 draws.

Your own Nsight stats say the combined CPU time is 2639.5µs (2.64ms = 379 fps) and the GPU's is 115035µs (115ms = 8.69 fps), clearly indicating a serious GPU bottleneck.
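
The conversion from the summed times to frame rates can be checked with a short sketch. The function name is hypothetical; the inputs are just the CPU and GPU totals from the Nsight capture above.

```cpp
#include <cassert>

// Converts a per-frame time in microseconds into the frame rate it would allow
// if that time were the only cost of the frame.
// Hypothetical helper for illustration; not part of Ogre or Nsight.
double framesPerSecond(double frameTimeMicroseconds)
{
    assert(frameTimeMicroseconds > 0.0);
    return 1000000.0 / frameTimeMicroseconds;
}
```

Calling framesPerSecond(2639.5) gives roughly 379 and framesPerSecond(115035.0) gives roughly 8.69, so the slower of the two (the GPU) is what limits the frame rate.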

However, you said something key:
"The geometry of one log is pretty basic ~50 vertices per log."
That's the problem. The GPU will have bubbles, as you should be processing at least 192 vertices per draw (192 is not just any magic number :) : 64 triangles * 3 vertices per triangle = 192). Ideally aim for >1000 vertices per draw if you can afford it.
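
As a rough illustration of what that means for ~50-vertex logs, the number of objects to merge per draw can be estimated with a ceiling division. This is only a sketch: the function name and the idea of a fixed per-draw vertex target are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Returns how many objects of a given size must be merged into one draw
// so that the draw contains at least targetVerticesPerDraw vertices.
// Hypothetical helper for illustration; not an Ogre API.
std::size_t objectsPerBatch(std::size_t verticesPerObject,
                            std::size_t targetVerticesPerDraw)
{
    assert(verticesPerObject > 0);
    // Integer ceiling division: round up so the target is always reached.
    return (targetVerticesPerDraw + verticesPerObject - 1) / verticesPerObject;
}
```

With 50 vertices per log, objectsPerBatch(50, 192) is 4 and objectsPerBatch(50, 1000) is 20, so merging around 20 logs per mesh would bring each draw to about 1000 vertices.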

Your application was GPU bound, but the solution is still indeed batching things together.

So yes... the solution would be to merge several logs into one mesh (basically StaticGeometry was doing that for you), and this is one of the use cases I was thinking of when I said "there are plans for specialized versions". I'm afraid we're not providing a utility just yet to help you with that. :?

Cheers

PS: Just for the sake of explaining things, 2x2 textures & Null scissor rectangle didn't improve anything because they affect pixel shader bandwidth and rasterizer bottlenecks, whereas your bottleneck was in either the vertex fetcher (input assembly stage) or the vertex shader.
"Minimum geometry" didn't reveal the problem either because for the vertex fetcher & vertex shader, processing 50 vertices per draw or processing 3 is essentially the same. Minimum geometry only reveals vertex bottlenecks when you have one draw rendering millions of vertices, and suddenly you only have 3; or you have thousands of tiny sub-pixel triangles (which eat pixel shader resources a lot) and suddenly there's just one triangle.

Re: [2.1] How to replace StaticGeometry?

Post by JuBe »

Hi

Thanks for the reply, it was really helpful.

One thing that I didn't quite get was the 192 vertex limit per draw. I created 4913 unique boxes with 36 vertices per box and then raised the vertex count per box gradually until I was over the 192 vertex count, and the performance was basically the same. So what does the limit mean?

What would be the best way to merge the separate objects together?
Just create a new mesh with the merged objects' data (with the necessary transformations applied)?
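
The merge described in the question can be sketched roughly as follows. This is a minimal, engine-independent illustration, not Ogre's API: the SimpleMesh, Instance and Affine3x4 types and the mergeMeshes function are all hypothetical, and only positions and indices are handled. Each object's vertices are moved into a common space with its transform, and its indices are offset by the number of vertices already written.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

// Row-major 3x4 affine transform: 3x3 rotation/scale plus translation in column 3.
using Affine3x4 = std::array<std::array<float, 4>, 3>;

// Hypothetical CPU-side mesh: positions only, indexed triangle list.
struct SimpleMesh
{
    std::vector<Vec3> positions;
    std::vector<std::uint32_t> indices;
};

// One object to merge: its source mesh and its placement in the scene.
struct Instance
{
    const SimpleMesh *mesh;
    Affine3x4 worldTransform;
};

Vec3 transformPoint(const Affine3x4 &m, const Vec3 &p)
{
    return { m[0][0] * p.x + m[0][1] * p.y + m[0][2] * p.z + m[0][3],
             m[1][0] * p.x + m[1][1] * p.y + m[1][2] * p.z + m[1][3],
             m[2][0] * p.x + m[2][1] * p.y + m[2][2] * p.z + m[2][3] };
}

// Concatenates all instances into one mesh with pre-transformed vertices.
SimpleMesh mergeMeshes(const std::vector<Instance> &instances)
{
    SimpleMesh merged;
    for (const Instance &inst : instances)
    {
        assert(inst.mesh != nullptr);
        // Indices of this instance must point past the vertices already merged.
        const std::size_t baseVertex = merged.positions.size();
        assert(baseVertex + inst.mesh->positions.size() <=
               std::numeric_limits<std::uint32_t>::max());

        for (const Vec3 &p : inst.mesh->positions)
            merged.positions.push_back(transformPoint(inst.worldTransform, p));

        for (std::uint32_t index : inst.mesh->indices)
            merged.indices.push_back(static_cast<std::uint32_t>(baseVertex) + index);
    }
    return merged;
}
```

A real merger would also have to carry normals (transformed without the translation, and with the inverse transpose if the scale is non-uniform), UVs and any other vertex attributes, and the merged data would then be uploaded as a single mesh so that many logs are drawn with one call.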