Inconsistency in floating-point modes can cause rare crashes

tommie
Gnoblar
Posts: 10
Joined: Fri Dec 09, 2005 12:07 pm
Location: Amsterdam, The Netherlands

Post by tommie »

I started having this issue after setting the 'floating-point mode' to 'consistent' (one of the Direct3D config options).

I would get occasional (yet consistent and reproducible) crashes that typically occurred when two identical objects (in my case planes) were at the same position and the camera was at a specific position as well (somewhere perpendicular to the planes in my case). The crash always showed up in the comparison operator of 'DepthSortDescendingLess', which was called by 'std::stable_sort()' from 'QueuedRenderableCollection::sort()'.

In the end, the crash disappeared after I recompiled the Ogre library using compile option /fp:precise (Visual Studio 2005).

On a side note: I found out that initializing the Direct3D renderer with the floating point mode on 'fast' will affect ALL of the floating point operations in your application, even if it was compiled with /fp:precise!
sinbad
OGRE Retired Team Member
Posts: 19269
Joined: Sun Oct 06, 2002 11:19 pm
Location: Guernsey, Channel Islands

Post by sinbad »

Interesting, we have had a few people reporting random crashes in this area and I've never been able to recreate it. http://www.ogre3d.org/phpBB2/viewtopic. ... 691#245691

This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
ryming
Gnoblar
Posts: 9
Joined: Mon Aug 06, 2007 3:50 am

Post by ryming »

sinbad wrote: This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
Aha, I will try this and check my program again.

Post by tommie »

sinbad wrote:This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
Yes, very strange indeed... All I can add at this moment is that I did NOT get the crash when I let Direct3D set the floating-point mode to 'fast' (both when Ogre was compiled using '/fp:precise' and '/fp:fast'). But I still had the crash if I compiled both Ogre and my main application using '/fp:fast'.

So, is it an option to use '/fp:precise' as a default floating-point mode for Ogre? If this would cause a significant decrease in performance, one should be able to override this using the Direct3D setting. But that would not work with OpenGL.

Post by sinbad »

Ideally I'd like to find out what's going on here first before just avoiding it with /fp:precise. I could really do with a small repeatable example if you can provide one.

Post by tommie »

Yes, I understand. My app is too big (and proprietary), but I can try to create an example app from scratch after my next deadline. Or perhaps someone from the aforementioned thread has something available already?

Post by sinbad »

Someone sent me a repro, although even that happened in different cases to what he had. Please try this:

Code: Select all

Index: OgreMain/include/OgreRenderQueueSortingGrouping.h
===================================================================
RCS file: /cvsroot/ogre/ogrenew/OgreMain/include/OgreRenderQueueSortingGrouping.h,v
retrieving revision 1.41
diff -u -r1.41 OgreRenderQueueSortingGrouping.h
--- OgreMain/include/OgreRenderQueueSortingGrouping.h	23 Aug 2006 08:18:35 -0000	1.41
+++ OgreMain/include/OgreRenderQueueSortingGrouping.h	30 Aug 2007 15:48:22 -0000
@@ -170,7 +170,7 @@
                     // Different renderables, sort by depth
                     Real adepth = a.renderable->getSquaredViewDepth(camera);
                     Real bdepth = b.renderable->getSquaredViewDepth(camera);
-				    if (adepth == bdepth)
+					if (Math::RealEqual(adepth, bdepth))
 				    {
                         // Must return deterministic result, doesn't matter what
                         return a.pass < b.pass;
I think it's because the == is not succeeding / failing symmetrically in non-precise mode. This gives it a little more elbow room, at the expense of less precision at very, very low values. It fixes it for my case, but I'd like to know whether it works in others too.
pricorde
Greenskin
Posts: 114
Joined: Thu Aug 11, 2005 9:28 pm
Location: France

Post by pricorde »

I applied this patch and tried to reproduce the specific situation in which the game crashed, and it no longer crashes.
So it seems that this patch solved this difficult issue.
Thanks a lot Sinbad.

As a developer, you often learn the hard way that using == on floats is rarely a good idea, but this asymmetry of == in non-precise mode is new to me!
xavier
OGRE Retired Moderator
Posts: 9481
Joined: Fri Feb 18, 2005 2:03 am
Location: Dublin, CA, US

Post by xavier »

Definitely -- '==' with floating-point values is nearly useless, hence the high incidence of usage of inequality operators and epsilons instead. Even values such as "0.0" and "1.0" do not always turn out to be what you'd expect.

Post by sinbad »

xavier wrote:Definitely -- '==' with floating-point values is nearly useless, hence the high incidence of usage of inequality operators and epsilons instead. Even values such as "0.0" and "1.0" do not always turn out to be what you'd expect.
Yeah I know, but in this case equality doesn't matter in fact, only consistency. I wouldn't care if the '==' was never true, as is likely, so long as the '<' returns consistent results - that's all that's needed, the equality check was supposed to be a rare boundary condition check to maintain deterministic ordering if the floating point values ever did come out exactly equal.

It appears though that under /fp:fast even '<' is not reliable when values are close - it's possible for the comparison to return the same value when the arguments are reversed if they are close. That's news to me, and that's what this patch addresses.
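
To make the failure mode concrete, here is a minimal C++ sketch of a debugging helper that checks the property std::stable_sort relies on: the comparator must never claim both "a before b" and "b before a", and never "a before a". The names isAsymmetric and BrokenLessEqual are made up for this illustration and are not part of Ogre; the helper assumes the values being compared can be collected into a std::vector first.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical debugging helper (not part of Ogre). Returns true if comp never
// reports both "a before b" and "b before a" for any pair in items, and never
// reports an element as sorting before itself. Cost is O(n^2).
template <typename T, typename Compare>
bool isAsymmetric(const std::vector<T>& items, Compare comp)
{
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        // Irreflexivity: an element must never sort before itself.
        if (comp(items[i], items[i]))
            return false;

        for (std::size_t j = i + 1; j < items.size(); ++j)
        {
            // Asymmetry: both directions being true is a contradiction.
            if (comp(items[i], items[j]) && comp(items[j], items[i]))
                return false;
        }
    }
    return true;
}

// A deliberately broken comparator, used only to show what the checker catches.
struct BrokenLessEqual
{
    bool operator()(float a, float b) const { return a <= b; }
};
```

In a debug build one could call assert(isAsymmetric(values, comparator)) just before the sort. It is quadratic, so it is only meant for tracking down a repro, not for shipping.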
Jabberwocky
OGRE Moderator
Posts: 2819
Joined: Mon Mar 05, 2007 11:17 pm
Location: Canada

Post by Jabberwocky »

I just stumbled across a crash related to this thread.
The problem is similar - the DepthSortDescendingLess comparator is not deterministic, which causes stable_sort to crash.
I'll copy the comparator operation here, so you can see what I mean:

from file: ogre_src_v1-7-1\ogremain\include\ogrerenderqueuesortinggrouping.h

Code: Select all

		struct DepthSortDescendingLess
        {
            const Camera* camera;

            DepthSortDescendingLess(const Camera* cam)
                : camera(cam)
            {
            }

            bool _OgreExport operator()(const RenderablePass& a, const RenderablePass& b) const
            {
                if (a.renderable == b.renderable)
                {
                    // Same renderable, sort by pass hash
                    return a.pass->getHash() < b.pass->getHash();
                }
                else
                {
                    // Different renderables, sort by depth
                    Real adepth = a.renderable->getSquaredViewDepth(camera);
                    Real bdepth = b.renderable->getSquaredViewDepth(camera);
					if (Math::RealEqual(adepth, bdepth))
				    {
                        // Must return deterministic result, doesn't matter what
                        return a.pass < b.pass;
				    }
				    else
				    {
				        // Sort DESCENDING by depth (i.e. far objects first)
					    return (adepth > bdepth);
				    }
                }

            }
        };
The important part is this:

Code: Select all

					if (Math::RealEqual(adepth, bdepth))
				    {
                        // Must return deterministic result, doesn't matter what
                        return a.pass < b.pass;
				    }
In my case, Math::RealEqual is returning true, so we fall into the a.pass < b.pass check. Unfortunately, both my renderables are using the same material, and so a.pass equals b.pass. So, comparing pass pointers via less than is not consistent. And this leads to the stable_sort crash.

A better check would be to compare the RenderablePass address pointers, since it is guaranteed these will never be equal.

Code: Select all

					if (Math::RealEqual(adepth, bdepth))
				    {
                        // Must return deterministic result, doesn't matter what
                        return &a < &b;
				    }
Summary of the fix:
file: ogrerenderqueuesortinggrouping.h
function: DepthSortDescendingLess::operator()
line number: 181 (in Ogre 1.7.1)

Code: Select all

-                         return a.pass < b.pass;
+                        return &a < &b;
I have submitted a bug referencing this post: http://www.ogre3d.org/mantis/view.php?id=457
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina

Post by dark_sylinc »

Unfortunately, strictly speaking your solution is not optimum either. Comparing pointers isn't guaranteed to be deterministic due to how virtual memory addressing works. This is very rare voodoo which usually happens when you start managing virtual memory yourself (aka. not calling malloc), but I must point out it may happen.

And some architectures other than x86 may get whinier about it (I'm unaware of any though).
sparkprime
Ogre Magi
Posts: 1137
Joined: Mon May 07, 2007 3:43 am
Location: Ossining, New York

Post by sparkprime »

You can use std::less to compare pointers. Otherwise std::map<Fish*> would not work.

edit: I may mean std::compare, can't remember

Post by dark_sylinc »

I've been thinking about it, and there shouldn't be a problem because if in the rare case two different memory regions are assigned to the same virtual address (which means "&a < &b" will fail) then pretty much everything else will also fail, and there's a bigger bug to worry about in the developer's code.

On a different angle, comparing pointers breaks determinism. Meaning the same scene may be rendered differently on two different runs, because the pointer addresses assigned to 'a' & 'b' were different.
This may affect automated tests, and potentially games relying on determinism (though, if the results of the sort are only used internally, then games using Ogre should be unaffected; as they never see or depend in any way on the sort results).

In other words, it's deterministic within a run, but not multiple re-runs of the application.

Post by Jabberwocky »

dark_sylinc wrote:On a different angle, comparing pointers breaks determinism.
That's a good point. My proposed change does preserve determinism within a particular sort, so it is still a valid fix for the crash. But you're right, it potentially breaks determinism between separate runs (and maybe even separate frames?).

This fix is still far preferable to a crash, of course. We are talking about an edge case of an edge case (equal depth, same material) that quite likely won't affect the vast majority of users or test cases - if it did, we'd be hearing about this crash far more often. Still, if we can come up with a better solution that preserves determinism across different runs, fantastic. If not, I would still advocate the change going in, again because the crash is obviously the far greater issue.

Maybe having some kind of auto-incrementing ID integer built into the RenderablePass which could be used in the comparison would fix the determinism problem. I don't know if that's overkill or not for such an unlikely problem.
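
A rough C++ sketch of that ID idea, under stated assumptions: QueuedItem, its sequence field and makeQueue are hypothetical stand-ins invented for this example (nothing in this thread shows RenderablePass having such a field), and the depth is assumed to be computed once and stored rather than recomputed inside the comparator.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for RenderablePass with an extra sequence number.
struct QueuedItem
{
    float squaredDepth;     // depth value, computed once and stored
    unsigned int sequence;  // assigned in insertion order, unique per item
};

// Far objects first; ties are broken by insertion order rather than by
// address, so the result is the same on every run.
struct DepthThenSequence
{
    bool operator()(const QueuedItem& a, const QueuedItem& b) const
    {
        if (a.squaredDepth != b.squaredDepth)
            return a.squaredDepth > b.squaredDepth;
        return a.sequence < b.sequence;
    }
};

// Builds a queue from raw depths, numbering the items as they are added.
std::vector<QueuedItem> makeQueue(const std::vector<float>& depths)
{
    std::vector<QueuedItem> queue;
    unsigned int nextSequence = 0;
    for (float d : depths)
        queue.push_back(QueuedItem{d, nextSequence++});
    return queue;
}
```

Because the tie-break depends only on the order in which items were queued, two runs that queue the same items in the same order would sort them identically, which is the cross-run determinism the pointer comparison cannot give.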
CABAListic
OGRE Retired Team Member
Posts: 2903
Joined: Thu Jan 18, 2007 2:48 pm

Post by CABAListic »

What if you just return false? It doesn't get any more deterministic than that, and in terms of ordering it means that the objects are considered equal (a is not smaller than b and b is not smaller than a). If stable_sort is the sorting algorithm used, then that should ensure deterministic ordering.

Post by Jabberwocky »

That's the exact circumstance which leads to the stable_sort crash. It confuses the hell out of stable_sort if a<b and b<a.

Post by CABAListic »

Yes, but returning false means that a !< b and b !< a. stable_sort can definitely deal with that.
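
A small C++ sketch of that point, using made-up names (Entry, DepthDescendingOrEquivalent, sortedLabels) and depths stored as plain floats: when the comparator returns false in both directions, std::stable_sort treats the two elements as equivalent and keeps them in their original relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<float, std::string> Entry; // (stored depth, label)

// Descending by depth. For equal depths it returns false whichever way round
// the arguments are, which marks the two entries as equivalent.
struct DepthDescendingOrEquivalent
{
    bool operator()(const Entry& a, const Entry& b) const
    {
        if (a.first == b.first)
            return false;           // neither sorts before the other
        return a.first > b.first;   // far objects first
    }
};

// Sorts a copy of the entries and returns the labels in their final order.
std::vector<std::string> sortedLabels(std::vector<Entry> entries)
{
    std::stable_sort(entries.begin(), entries.end(), DepthDescendingOrEquivalent());
    std::vector<std::string> labels;
    for (const Entry& e : entries)
        labels.push_back(e.second);
    return labels;
}
```

With two entries at the same depth, the one that was queued first stays first after the sort.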

Post by Jabberwocky »

Ok yeah, you're right.
Taking that into consideration, I now believe I misdiagnosed the problem earlier.
The circumstance leading up to my crash also should have always returned false:

Code: Select all

                        return a.pass < b.pass;
since both pointers were equal (same material).
But that shouldn't actually have caused a crash.
Damn.

Here's what I know for sure.
1. I had a reproducible crash in stable_sort
2. The stack was similar to the one linked to above.
3. I could only reproduce it in release, not debug, so the stack was a bit mangled due to release optimization.
4. The crash went away when I made the one line "fix" that I explained in my previous post.
5. I am using DirectX and Floating-point mode=Consistent

Right now I'm thinking my "fix" just covered up the real crash somehow, and it's not actually fixed. I'll have to look into it. Thanks everyone for your input.

Post by sparkprime »

dark_sylinc wrote:I've been thinking about it, and there shouldn't be a problem because if in the rare case two different memory regions are assigned to the same virtual address (which means "&a < &b" will fail) then pretty much everything else will also fail, and there's a bigger bug to worry about in the developer's code.

On a different angle, comparing pointers breaks determinism. Meaning the same render may be rendered different on two different runs, because the pointer addresses assigned to 'a' & 'b' were different.
This may affect automated tests, and potentially games relying on determinism (though, if the results of the sort are only used internally, then games using Ogre should be unaffected; as they never see or depend in any way on the sort results).

In other words, it's deterministic within a run, but not multiple re-runs of the application.
I realise the thread has moved on, but for the record:

ptr < ptr2 is only allowed (according to the standard) if they are pointers into the same allocated chunk, e.g. &arr[0] < &arr[10]

This is because memory is not required by the standard to be totally ordered.

The correct way is to use the std::less thing. std::map uses std::less.

However all of the systems we care about will do 'what you want' with ptr < ptr2 regardless of whether they are part of the same chunk.

For such systems, std::less is no slower than a raw < so even with that in mind, there is no reason not to use std::less.
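
For reference, a minimal C++ sketch of what that looks like. Item and addressBefore are hypothetical names used only for this example; std::less itself is the standard facility from <functional>.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for RenderablePass; only its address matters here.
struct Item
{
    int payload;
};

// Orders two items by address. std::less is specified to give a strict total
// order over pointers, even for pointers into unrelated objects where the
// result of the built-in < is not pinned down by the standard.
inline bool addressBefore(const Item& a, const Item& b)
{
    return std::less<const Item*>()(&a, &b);
}
```

For two distinct objects exactly one of addressBefore(x, y) and addressBefore(y, x) is true, and addressBefore(x, x) is always false, which is what a sort comparator needs from a tie-break.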

Post by Jabberwocky »

(btw- thanks sparkprime. Interesting).

I have a theory about this crash.
It is because of the epsilon in Math::RealEqual

Again, the relevant sort code, from ogrerenderqueuesortinggrouping.h

Code: Select all

if (Math::RealEqual(adepth, bdepth))
{
   // Must return deterministic result, doesn't matter what
   return a.pass < b.pass;
}
else
{
   // Sort DESCENDING by depth (i.e. far objects first)
   return (adepth > bdepth);
}
So, we compare in this order:
1. compare depth
2. compare pass pointer (if depth is equal)
For the sake of argument, let's say the epsilon in Math::RealEqual is 0.05.

Now, imagine we're sorting 3 RenderablePass, a, b, and c:
RenderablePass a: depth=1.01 pass=0x3
RenderablePass b: depth=1.04 pass=0x2
RenderablePass c: depth=1.07 pass=0x1

I will show you that these values lead to two incompatible conclusions, namely:
b < a < c
and
b > c

Details:
Step 1: compare a to b
a.depth == b.depth (within epsilon of 0.05)
-> so compare passes
a.pass > b.pass
conclusion: a > b

Step 2: compare a to c
a.depth < c.depth
conclusion: a < c

*** Based on step 1 and step 2, we can conclude: b < a < c
Everything is good so far.

Step 3: compare b to c
b.depth == c.depth (within epsilon of 0.05)
-> so compare passes
b.pass > c.pass
*** conclusion: b > c

The two lines marked above with *** are inconsistent. It cannot be true that:
b > c
and
b < a < c

This seems like it could possibly crash the sorting routine.
As you can see, the epsilon of Math::RealEqual can cause a series of RenderablePasses which have a very close, but non-equal depth to be sorted by depth or by pass in an inconsistent manner. Unfortunately, this Math::RealEqual check was explicitly added as a fix to an earlier crash bug (see earlier in this thread), so simply removing it is not an option.
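
The same kind of cycle can be reproduced with a few lines of C++. This is only a sketch: Item, before and formsCycle are invented names, the pass pointer is modelled as a plain integer, and the tolerance is passed in explicitly. One caveat: because the comparator here sorts DESCENDING by depth, as the real one does, the cycle shows up when the pass values increase with depth (for example depth=1.01 pass=1, depth=1.04 pass=2, depth=1.07 pass=3), so those are the values to try with it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical miniature of RenderablePass: a depth plus an integer standing
// in for the pass pointer value.
struct Item
{
    double depth;
    unsigned long pass;
};

// Same structure as the comparator discussed above, with the tolerance
// passed in explicitly (0.05 in the worked example).
inline bool before(const Item& a, const Item& b, double tolerance)
{
    if (std::fabs(b.depth - a.depth) <= tolerance)
        return a.pass < b.pass;   // "equal" depth: fall back to the pass
    return a.depth > b.depth;     // otherwise sort descending by depth
}

// True if x sorts before y, y before z, and z before x: a cycle that no
// consistent ordering can satisfy.
inline bool formsCycle(const Item& x, const Item& y, const Item& z, double tolerance)
{
    return before(x, y, tolerance)
        && before(y, z, tolerance)
        && before(z, x, tolerance);
}
```

With a tolerance of 0.05 the three items form a cycle; with a tolerance of 0 the comparison falls back to depth alone and the cycle disappears.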

I don't have a solution yet, but just wanted to post what I'd found so far.
Thanks for the support on this issue.

Post by CABAListic »

Sounds reasonable. Unfortunately, I don't see an easy way out. If the < operator really isn't reliable, then what else is there to do?

I would, however, like to reexamine if the behaviour of < really isn't predictable (with fp:fast). That claim seems really odd to me, as it would break *any* sorting of floating point values, even a std::set<float> could potentially break. That just doesn't make sense to me.

Edit: Actually, even the original version before the epsilon comparison had a check on equality. Perhaps it was that check on equality that was inconsistent rather than the comparison operator? Either way, as long as the < operator is reliable, no equality check is needed at all, so maybe just try to remove that entirely?
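
One way to try this out, sketched in C++ with hypothetical names (DepthKey, StoredDepthDescending, farToNearOrder): compute each depth once, store it, and sort on the stored values with a plain comparison and no equality test. This rests on an assumption the thread has not confirmed, namely that the inconsistency comes from the depth being recomputed inside the comparator and coming out slightly different from one call to the next.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (stored depth, original index). The depth is computed once per element and
// written to memory before the sort starts.
typedef std::pair<float, std::size_t> DepthKey;

// Plain descending comparison on the stored values; no equality test at all.
struct StoredDepthDescending
{
    bool operator()(const DepthKey& a, const DepthKey& b) const
    {
        return a.first > b.first;
    }
};

// Returns the indices of depths ordered far-to-near; ties keep input order.
std::vector<std::size_t> farToNearOrder(const std::vector<float>& depths)
{
    std::vector<DepthKey> keys;
    keys.reserve(depths.size());
    for (std::size_t i = 0; i < depths.size(); ++i)
        keys.push_back(DepthKey(depths[i], i));

    std::stable_sort(keys.begin(), keys.end(), StoredDepthDescending());

    std::vector<std::size_t> order;
    order.reserve(keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i)
        order.push_back(keys[i].second);
    return order;
}
```

The comparator never calls back into the renderable during the sort, so every comparison of a given pair sees exactly the same two numbers.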

Post by Jabberwocky »

I tracked down the bug.
It is not the issue I identified in my last post - although I believe that could still potentially be a problem.
I have a fix for both problems - included at the bottom of this post.

To understand my fix, make sure you read this thread from the top, and remind yourself why the Math::RealEqual check was initially added.
So it's still the same issue with float imprecision, except that the Math::RealEqual check doesn't solve the problem. Here's why.

From OgreMath.h:

Code: Select all

        static bool RealEqual(Real a, Real b,
            Real tolerance = std::numeric_limits<Real>::epsilon());
From OgreMath.cpp

Code: Select all

    bool Math::RealEqual( Real a, Real b, Real tolerance )
    {
        if (fabs(b-a) <= tolerance)
            return true;
        else
            return false;
    }
As you can see, the default value for the tolerance variable is std::numeric_limits<Real>::epsilon().
This shows up in my debugger as: 9.9999997e-005

However, we're using this to see if some pretty gigantic floats are equal. The values of adepth and bdepth in my crash case are:
adepth=2555065.25
bdepth=2555065.25

Remember, these are squared distances. The actual (non-squared) distances are about 1600, so nothing abnormally large.

So the tolerance value is something like 10^10 times smaller than the depth values being passed in to Math::RealEqual. But we only get about 6 or 7 digits of precision on a float. So basically that tolerance value is useless, and we're still affected by the strange behaviour where approximately equal floats sometimes compare one way and sometimes another, as quoted by Sinbad above:
This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
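
The precision argument can be checked directly. The C++ sketch below assumes Real is a 32-bit float (the function names spacingAbove and toleranceIsIneffective are invented for this example). It measures the gap between a float and the next representable value above it; any absolute tolerance smaller than that gap cannot match two different floats of that magnitude, so the tolerant comparison behaves like an exact ==.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Distance from x to the next representable float above it.
inline float spacingAbove(float x)
{
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// True if an absolute tolerance is too small to ever match two *different*
// floats near x, i.e. the tolerant comparison degenerates to exact equality.
inline bool toleranceIsIneffective(float x, float tolerance)
{
    return tolerance < spacingAbove(x);
}
```

For a float of about 2.5 million, the neighbouring representable values are 0.25 apart, far more than the tolerance shown in the debugger.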
In fact, I verified that this exact problem was happening.
I added logging to the sort function, as follows:
(This is an unaltered version of the existing sort function, except for the logging)

Code: Select all

        /// Comparator to order objects by descending camera distance
		struct DepthSortDescendingLess
        {
            const Camera* camera;

            DepthSortDescendingLess(const Camera* cam)
                : camera(cam)
            {
            }

            bool _OgreExport operator()(const RenderablePass& a, const RenderablePass& b) const
            {
               std::stringstream ssLog;
               ssLog << std::endl
                  << "sorting a (0x" << &a << ") vs b (0x" << &b << ")" << std::endl
                  << "  a.renderable=" << a.renderable << " a.pass=" << a.pass << std::endl
                  << "  b.renderable=" << b.renderable << " b.pass=" << b.pass;

               LogManager::getSingleton().logMessage( ssLog.str() );

               if (a.renderable == b.renderable)
               {
                  // Same renderable, sort by pass hash
                  return a.pass->getHash() < b.pass->getHash();
               }
               else
               {
                  // Different renderables, sort by depth
                  Real adepth = a.renderable->getSquaredViewDepth(camera);
                  Real bdepth = b.renderable->getSquaredViewDepth(camera);

                  if (Math::RealEqual(adepth, bdepth))
                  {
                     std::stringstream ssLog;
                     ssLog 
                        << "adepth=" << std::setprecision(20) << adepth << " bdepth=" << bdepth << std::endl
                        << "Depth equal, sort by pass.  a.pass=" << a.pass << " b.pass=" << b.pass;
                     LogManager::getSingleton().logMessage( ssLog.str() );
                     // Must return deterministic result, doesn't matter what
                     return a.pass < b.pass;
                  }
                  else
                  {
                     std::stringstream ssLog;
                     ssLog 
                        << "adepth=" << std::setprecision(20) << adepth << " bdepth=" << bdepth << std::endl
                        << "Depth not equal, sorting by depth.";
                     LogManager::getSingleton().logMessage( ssLog.str() );

                     // Sort DESCENDING by depth (i.e. far objects first)
                     return (adepth > bdepth);
                  }
               }
            }
      };
I also framed the call to std::stable_sort with a "begin" and an "end" log message, in QueuedRenderableCollection::sort:

Code: Select all

            LogManager::getSingleton().logMessage( "begin std::stable_sort" );
				std::stable_sort(
					mSortedDescending.begin(), mSortedDescending.end(), 
					DepthSortDescendingLess(cam));
            LogManager::getSingleton().logMessage( "end std::stable_sort" );
Here's the log I'm getting, leading up to the crash:
03:11:39: begin std::stable_sort
03:11:39: end std::stable_sort
03:11:39: begin std::stable_sort
03:11:39: end std::stable_sort
03:11:39: begin std::stable_sort
03:11:39: end std::stable_sort
03:11:39: begin std::stable_sort
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth not equal, sorting by depth.
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth equal, sort by pass. a.pass=02BEDE70 b.pass=02BFE6F0
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B08)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=5368E8E0 b.pass=0000001B
Look, it's comparing the same 2 RenderablePasses twice, once finding the depth not equal:

(this is a chunk from the log above):
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth not equal, sorting by depth.
And the 2nd time finding the depth equal

(this is a chunk from the log above):
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth equal, sort by pass. a.pass=02BEDE70 b.pass=02BFE6F0
I'm not quite sure why stable_sort is comparing the same 2 RenderablePasses twice, but for now I'm just assuming that's normal behaviour. Of course, the real issue is the inconsistent result of Math::RealEqual for the large depth values.

Fix:

Code: Select all

            bool _OgreExport operator()(const RenderablePass& a, const RenderablePass& b) const
            {
               if (a.renderable == b.renderable)
               {
                  // Same renderable, sort by pass hash
                  return a.pass->getHash() < b.pass->getHash();
               }
               else
               {
                  // Different renderables, sort by depth
                  Real adepth = a.renderable->getSquaredViewDepth(camera);
                  Real bdepth = b.renderable->getSquaredViewDepth(camera);

                  // Floats only have about 6 or 7 digits of precision, so provide a tolerance within this limit.   
                  float tolerance = std::max(adepth, bdepth) / 1000000.f; 
                  if (Math::RealEqual(adepth, bdepth, tolerance))
                  {
                     // Must return deterministic result, doesn't matter what
                     return false;
                  }
                  else
                  {
                     // Sort DESCENDING by depth (i.e. far objects first)
                     return (adepth > bdepth);
                  }
               }
            }
      };
To summarize, the 2 changes are:
1. providing a useful tolerance to Math::RealEqual.
2. returning false in the case the depths are found to be equal. (I haven't gone into the details about why this fixes the potential problem I identified last post, but I'm pretty sure it does. We can discuss that further if needed.)

With these changes, the crash no longer occurred.
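
For anyone who wants to experiment with the tolerance outside of Ogre, here is a standalone C++ sketch of the same relative-tolerance idea. nearlyEqualRelative is a hypothetical helper, not an Ogre function, and unlike the fix above it takes absolute values before scaling so that it also behaves for negative inputs (squared depths are never negative, so the fix does not need that).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Relative-tolerance equality: the allowed difference scales with the larger
// magnitude of the two inputs. The default of one part in a million matches
// the divisor used in the fix above.
inline bool nearlyEqualRelative(float a, float b, float relativeTolerance = 1.0e-6f)
{
    const float scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(b - a) <= scale * relativeTolerance;
}
```

With depths around 2.5 million this accepts differences of up to roughly 2.5, which covers several neighbouring float values instead of none.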

Post by CABAListic »

Have you tried what happens if you remove the equality test altogether?

Post by Jabberwocky »

Checking that now.