BSP on modern GPU...why it isn't efficient?

Postby EpS1L0n » Sun Jan 08, 2006 1:04 pm

Hi,
this page (http://www.ogre3d.org/wiki/index.php/SceneManagersFAQ) says that "Quake 3-style .bsp format may not be that efficient with modern GPU's in any case".

Can anyone give me a reason?
EpS1L0n
Newcomer
 
Posts: 3
Joined: Sun Jan 08, 2006 1:00 pm

Postby Chris Jones » Sun Jan 08, 2006 1:11 pm

I don't know too much about it, but it has something to do with the fact that old graphics cards liked small numbers of triangles in lots of batches, whereas modern cards like large numbers of triangles in as few batches as possible. So BSP cuts the scene up too much, or something to that effect.
Chris Jones
Veteran
 
Posts: 1707
Joined: Tue Apr 05, 2005 1:11 pm
Location: Gosport, South England

Postby sinbad » Sun Jan 08, 2006 1:22 pm

Old cards were transform limited, or possibly didn't even have hardware transform at all (the Voodoo1/2 cards didn't for example). Therefore it was important to send as few triangles to the pipeline as you could. This also applied to software renderers which often performed per-triangle culling.

The Q3A BSP structure subscribes to this approach, culling in very small patches of triangles and dynamically determining every frame which small groups are visible so as to only pass the smallest set it needs to the renderer.

There are basically 2 ways to handle this. Either you keep all the data on the card and make lots of small calls to pull in all the fragments you need to render for this frame, or you build up a combined buffer of fragments to render every frame (adjusting offsets etc. on the fly on the CPU) and upload that to the card as often as you need to for rendering. Q3A uses the latter approach; our BSP renderer uses the former. (It used to work the same way as Q3A, but as an experiment we tried it the other way, and on cards with hardware vertex buffers it is faster.)
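As a rough illustration of the two approaches, here is a minimal C++ sketch that only counts what each one costs per frame. It does not call OGRE or any graphics API. The names `Fragment`, `FrameCost`, `submitPerFragment` and `submitCombined` are hypothetical, and it assumes (as a simplification) that the combined buffer holds index data only, which may not match what either renderer actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: one small visible piece of the level, described as a
// range of indices inside a level-wide index buffer.
struct Fragment {
    std::size_t firstIndex;
    std::size_t indexCount;
};

// Illustrative per-frame counters, not measured timings.
struct FrameCost {
    std::size_t drawCalls;
    std::size_t bytesUploaded;
};

// Approach 1: geometry stays on the card; one draw call per visible fragment.
FrameCost submitPerFragment(const std::vector<Fragment>& visible) {
    FrameCost cost{0, 0};
    for (const Fragment& f : visible) {
        if (f.indexCount == 0) continue;  // nothing to draw
        ++cost.drawCalls;                 // per-call overhead is paid here
    }
    return cost;
}

// Approach 2: rebuild one combined index buffer on the CPU every frame,
// upload it, then draw it with a single call.
FrameCost submitCombined(const std::vector<std::uint32_t>& levelIndices,
                         const std::vector<Fragment>& visible,
                         std::vector<std::uint32_t>& scratch) {
    scratch.clear();
    for (const Fragment& f : visible) {
        assert(f.firstIndex + f.indexCount <= levelIndices.size());
        auto first = levelIndices.begin() + static_cast<std::ptrdiff_t>(f.firstIndex);
        auto last = first + static_cast<std::ptrdiff_t>(f.indexCount);
        scratch.insert(scratch.end(), first, last);
    }
    FrameCost cost{0, 0};
    if (!scratch.empty()) {
        cost.drawCalls = 1;
        cost.bytesUploaded = scratch.size() * sizeof(std::uint32_t);  // bus traffic
    }
    return cost;
}
```

With three visible fragments of three indices each, the first function reports three draw calls and no upload, while the second reports one draw call and an upload of nine indices.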

Modern GPUs hate both approaches: the first spends far too much time on the overhead of each rendering call, and the second spends too much time transferring data over the bus. Which one is faster depends on the card, bus and CPU you have, but neither is optimal.

Modern level structures are designed to use much bigger chunks of data in one go, and are optimised for submitting large chunks of geometry at once rather than picking out small fragments. So, even though the Q3A format is very popular, it's very outdated and generally unsuitable for modern projects.
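To make the "bigger chunks" idea concrete, here is a minimal C++ sketch of coarse culling: the level is split into a few large chunks, each chunk is accepted or rejected as a whole, and every accepted chunk becomes one large draw range. Everything in it is an assumption for illustration: `Vec3`, `Chunk`, `DrawRange` and `cullChunks` are hypothetical names, and the simple distance test stands in for whatever visibility test (frustum, portals, occlusion) a real engine would use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical large chunk of static level geometry with a bounding sphere.
struct Chunk {
    Vec3 centre;
    float radius;
    std::size_t firstIndex;  // where this chunk starts in the static index buffer
    std::size_t indexCount;  // typically thousands of indices, not a handful
};

// One draw call's worth of geometry.
struct DrawRange {
    std::size_t firstIndex;
    std::size_t indexCount;
};

// Keep or reject each chunk as a whole. The distance check is only a
// stand-in for a real visibility test.
std::vector<DrawRange> cullChunks(const std::vector<Chunk>& chunks,
                                  const Vec3& eye, float viewDistance) {
    std::vector<DrawRange> ranges;
    for (const Chunk& c : chunks) {
        const float dx = c.centre.x - eye.x;
        const float dy = c.centre.y - eye.y;
        const float dz = c.centre.z - eye.z;
        const float reach = viewDistance + c.radius;
        if (dx * dx + dy * dy + dz * dz <= reach * reach) {
            // The whole chunk is submitted, even if part of it is hidden.
            ranges.push_back(DrawRange{c.firstIndex, c.indexCount});
        }
    }
    return ranges;
}
```

The trade-off is that some triangles which are not actually visible still get sent to the card, in exchange for far fewer draw calls and no per-frame buffer rebuilding.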
sinbad
OGRE Project Lead
 
Posts: 24892
Joined: Sun Oct 06, 2002 11:19 pm
Location: Guernsey, Channel Islands

Postby EpS1L0n » Sun Jan 08, 2006 1:59 pm

Thank you, very clear answers.
Can you give me references about a modern way to design a BSP on the GPU?
EpS1L0n
Newcomer
 
Posts: 3
Joined: Sun Jan 08, 2006 1:00 pm
