Inconsistancy in floating-point modes can cause rare crashes
-
- Gnoblar
- Posts: 10
- Joined: Fri Dec 09, 2005 12:07 pm
- Location: Amsterdam, The Netherlands
This is an issue I started having after I set the 'floating-point mode' to 'consistent' (as part of the Direct3D config options).
I would get occasional (yet consistent and reproducible) crashes that typically occurred when two identical objects (in my case planes) were at the same position, and the camera was at a specific position as well (at some position perpendicular to the planes in my case). The crash would always show up in the comparison operator of 'DepthSortDescendingLess', which was called by the 'std::stable_sort()' call in 'QueuedRenderableCollection::sort()'.
In the end, the crash disappeared after I recompiled the Ogre library using compile option /fp:precise (Visual Studio 2005).
On a side note: I found out that initializing the Direct3D renderer with the floating point mode on 'fast' will affect ALL of the floating point operations in your application, even if it was compiled with /fp:precise!
- sinbad
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
- Contact:
Interesting, we have had a few people reporting random crashes in this area and I've never been able to recreate it. http://www.ogre3d.org/phpBB2/viewtopic. ... 691#245691
This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
-
- Gnoblar
- Posts: 9
- Joined: Mon Aug 06, 2007 3:50 am
-
- Gnoblar
- Posts: 10
- Joined: Fri Dec 09, 2005 12:07 pm
- Location: Amsterdam, The Netherlands
sinbad wrote: This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
Yes, very strange indeed... All I can add to this at this moment is that I did NOT get the crash when I let Direct3D set the floating-point mode to 'fast' (both when Ogre was compiled using '/fp:precise' and '/fp:fast'). But I would still get the crash if I compiled both Ogre and my main application using '/fp:fast'.
So, is it an option to use '/fp:precise' as a default floating-point mode for Ogre? If this would cause a significant decrease in performance, one should be able to override this using the Direct3D setting. But that would not work with OpenGL.
- sinbad
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
- Contact:
-
- Gnoblar
- Posts: 10
- Joined: Fri Dec 09, 2005 12:07 pm
- Location: Amsterdam, The Netherlands
- sinbad
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
- Contact:
Someone sent me a repro, although even that happened in different cases to what he had.
I think it's because the == is not succeeding / failing symmetrically in non-precise mode. The patch below gives it a little more elbow room at the expense of less precision at very, very low values. It fixes it for my case, but I'd like to know it works in others too. Please try this:
Code: Select all
Index: OgreMain/include/OgreRenderQueueSortingGrouping.h
===================================================================
RCS file: /cvsroot/ogre/ogrenew/OgreMain/include/OgreRenderQueueSortingGrouping.h,v
retrieving revision 1.41
diff -u -r1.41 OgreRenderQueueSortingGrouping.h
--- OgreMain/include/OgreRenderQueueSortingGrouping.h 23 Aug 2006 08:18:35 -0000 1.41
+++ OgreMain/include/OgreRenderQueueSortingGrouping.h 30 Aug 2007 15:48:22 -0000
@@ -170,7 +170,7 @@
// Different renderables, sort by depth
Real adepth = a.renderable->getSquaredViewDepth(camera);
Real bdepth = b.renderable->getSquaredViewDepth(camera);
- if (adepth == bdepth)
+ if (Math::RealEqual(adepth, bdepth))
{
// Must return deterministic result, doesn't matter what
return a.pass < b.pass;
- pricorde
- Greenskin
- Posts: 114
- Joined: Thu Aug 11, 2005 9:28 pm
- Location: France
- Contact:
I applied this patch and tried to reproduce the specific situation in which the game crashed, and it no longer crashes.
So it seems that this patch solved this difficult issue.
Thanks a lot Sinbad.
As a developer, you often learn the hard way that using == on floats is rarely a good idea, but this asymmetry of == in non-precise mode is new to me!
Rigs of Rods: http://rigsofrods.blogspot.com/
- xavier
- OGRE Retired Moderator
- Posts: 9481
- Joined: Fri Feb 18, 2005 2:03 am
- Location: Dublin, CA, US
- x 22
- sinbad
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
- Contact:
xavier wrote: Definitely -- '==' with floating-point values is nearly useless, hence the high incidence of usage of inequality operators and epsilons instead. Even values such as "0.0" and "1.0" do not always turn out to be what you'd expect.
Yeah I know, but in this case equality doesn't in fact matter, only consistency. I wouldn't care if the '==' was never true, as is likely, so long as the '<' returns consistent results - that's all that's needed. The equality check was supposed to be a rare boundary condition check to maintain deterministic ordering if the floating point values ever did come out exactly equal.
It appears though that under /fp:fast even '<' is not reliable when values are close - it's possible for the comparison to return the same value when the arguments are reversed if they are close. That's news to me, and that's what this patch addresses.
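One plausible mechanism for this asymmetry (an assumption, not something confirmed anywhere in this thread) is excess precision: one operand is still held at higher precision in an FPU register while the other has been rounded to float on its way through memory. The sketch below only simulates that effect, with a double standing in for the register copy and a float for the stored copy. DepthView, observe and depthGreater are made-up names, not Ogre code.

```cpp
#include <cassert>

// Hypothetical model of one squared depth seen two ways:
// at full precision (as if still in an FPU register) and
// rounded to float (as if stored to memory and reloaded).
struct DepthView {
    double inRegister;
    float  inMemory;
};

DepthView observe(double squaredDepth) {
    return DepthView{squaredDepth, static_cast<float>(squaredDepth)};
}

// Stand-in for "adepth > bdepth" when the left operand is still in a
// register and the right operand has been reloaded from memory.
bool depthGreater(const DepthView& lhs, const DepthView& rhs) {
    return lhs.inRegister > static_cast<double>(rhs.inMemory);
}
```

With observe(2555065.30) and observe(2555065.28), both values round to the same float, yet depthGreater(a, b) and depthGreater(b, a) both come out true, which is exactly the kind of contradiction a sort cannot cope with.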
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
I just stumbled across a crash related to this thread.
The problem is similar - the DepthSortDescendingLess comparator is not deterministic, which causes stable_sort to crash.
I'll copy the comparator operation here, so you can see what I mean:
from file: ogre_src_v1-7-1\ogremain\include\ogrerenderqueuesortinggrouping.h
Code: Select all
struct DepthSortDescendingLess
{
    const Camera* camera;
    DepthSortDescendingLess(const Camera* cam)
        : camera(cam)
    {
    }
    bool _OgreExport operator()(const RenderablePass& a, const RenderablePass& b) const
    {
        if (a.renderable == b.renderable)
        {
            // Same renderable, sort by pass hash
            return a.pass->getHash() < b.pass->getHash();
        }
        else
        {
            // Different renderables, sort by depth
            Real adepth = a.renderable->getSquaredViewDepth(camera);
            Real bdepth = b.renderable->getSquaredViewDepth(camera);
            if (Math::RealEqual(adepth, bdepth))
            {
                // Must return deterministic result, doesn't matter what
                return a.pass < b.pass;
            }
            else
            {
                // Sort DESCENDING by depth (i.e. far objects first)
                return (adepth > bdepth);
            }
        }
    }
};
The important part is this:
Code: Select all
if (Math::RealEqual(adepth, bdepth))
{
    // Must return deterministic result, doesn't matter what
    return a.pass < b.pass;
}
In my case, Math::RealEqual is returning true, so we fall into the a.pass < b.pass check. Unfortunately, both my renderables are using the same material, and so a.pass equals b.pass. So, comparing pass pointers via less than is not consistent. And this leads to the stable_sort crash.
A better check would be to compare the RenderablePass address pointers, since it is guaranteed these will never be equal.
Code: Select all
if (Math::RealEqual(adepth, bdepth))
{
    // Must return deterministic result, doesn't matter what
    return &a < &b;
}
Summary of the fix:
file: ogrerenderqueuesortinggrouping.h
function: DepthSortDescendingLess::operator()
line number: 181 (in Ogre 1.7.1)
Code: Select all
- return a.pass < b.pass;
+ return &a < &b;
I have submitted a bug referencing this post: http://www.ogre3d.org/mantis/view.php?id=457
- dark_sylinc
- OGRE Team Member
- Posts: 5296
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1278
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
Unfortunately, strictly speaking your solution is not optimum either. Comparing pointers isn't guaranteed to be deterministic due to how virtual memory addressing works. This is very rare voodoo which usually happens when you start managing virtual memory yourself (aka. not calling malloc), but I must point out it may happen.
And some architectures other than x86 may get whinier about it (I'm unaware of any though).
- sparkprime
- Ogre Magi
- Posts: 1137
- Joined: Mon May 07, 2007 3:43 am
- Location: Ossining, New York
- x 13
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
You can use std::less to compare pointers. Otherwise std::map<Fish*> would not work.
edit: I may mean std::compare, can't remember
- dark_sylinc
- OGRE Team Member
- Posts: 5296
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1278
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
I've been thinking about it, and there shouldn't be a problem because if in the rare case two different memory regions are assigned to the same virtual address (which means "&a < &b" will fail) then pretty much everything else will also fail, and there's a bigger bug to worry about in the developer's code.
On a different angle, comparing pointers breaks determinism. Meaning the same scene may be rendered differently on two different runs, because the pointer addresses assigned to 'a' & 'b' were different.
This may affect automated tests, and potentially games relying on determinism (though, if the results of the sort are only used internally, then games using Ogre should be unaffected; as they never see or depend in any way on the sort results).
In other words, it's deterministic within a run, but not multiple re-runs of the application.
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
dark_sylinc wrote: On a different angle, comparing pointers breaks determinism.
That's a good point. My proposed change does preserve determinism within a particular sort, so it is still a valid fix for the crash. But you're right, it potentially breaks determinism between separate runs (and maybe even separate frames?).
This fix is still far preferable to a crash of course. We are talking about an edge case of an edge case (equal depth, same material) that quite likely won't affect the vast majority of users or test cases - if it did, we'd be hearing about this crash far more often. Still, if we can come up with a better solution that preserves determinism across different runs, fantastic. If not, I would still advocate the change going in, again because the crash is obviously the far greater issue.
Maybe having some kind of auto-incrementing ID integer built into the RenderablePass which could be used in the comparison would fix the determinism problem. I don't know if that's overkill or not for such an unlikely problem.
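A rough sketch of what such an ID-based tie-break could look like, for illustration only. QueuedEntry, sequenceId and DepthThenSequenceLess are hypothetical names that do not exist in Ogre, and the sketch assumes the depth has already been computed once and stored, so that the exact != comparison always sees the same two values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RenderablePass with a stored depth and a
// sequence number assigned when the entry is queued.
struct QueuedEntry {
    float squaredDepth;
    std::size_t sequenceId;
};

// Far objects first; exact depth ties fall back to the queueing order,
// which is the same on every run, unlike a pointer value.
struct DepthThenSequenceLess {
    bool operator()(const QueuedEntry& a, const QueuedEntry& b) const {
        if (a.squaredDepth != b.squaredDepth)
            return a.squaredDepth > b.squaredDepth;
        return a.sequenceId < b.sequenceId;
    }
};

std::vector<QueuedEntry> sortQueue(std::vector<QueuedEntry> entries) {
    std::sort(entries.begin(), entries.end(), DepthThenSequenceLess());
    return entries;
}
```

Because the IDs are unique, no two entries ever compare as equivalent, so even a non-stable sort gives the same order on every run.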
-
- OGRE Retired Team Member
- Posts: 2903
- Joined: Thu Jan 18, 2007 2:48 pm
- x 58
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
What if you just return false? It doesn't get any more deterministic than that, and in terms of ordering it means that the objects are considered equal (a is not smaller than b and b is not smaller than a). If stable_sort is the sorting algorithm used, then that should ensure deterministic ordering.
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
That's the exact circumstance which leads to the stable_sort crash. It confuses the hell out of stable_sort if a<b and b<a.
-
- OGRE Retired Team Member
- Posts: 2903
- Joined: Thu Jan 18, 2007 2:48 pm
- x 58
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
Yes, but returning false means that a !< b and b !< a. stable_sort can definitely deal with that.
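To make that concrete, here is a small sketch with made-up types (Item, depthDescending, equivalent and sortedNames are illustrative names only). When the comparator returns false in both directions the two elements are simply equivalent, and std::stable_sort keeps equivalent elements in their original relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical element type for illustration.
struct Item {
    int depth;
    std::string name;
};

// Far first; equal depths yield false in both directions.
bool depthDescending(const Item& a, const Item& b) {
    return a.depth > b.depth;
}

// Two elements are equivalent when neither orders before the other.
bool equivalent(const Item& a, const Item& b) {
    return !depthDescending(a, b) && !depthDescending(b, a);
}

// Sorts a copy with std::stable_sort and returns the names in sorted order.
std::vector<std::string> sortedNames(std::vector<Item> items) {
    std::stable_sort(items.begin(), items.end(), depthDescending);
    std::vector<std::string> names;
    for (const Item& item : items) names.push_back(item.name);
    return names;
}
```

Two planes at the same depth stay in the order they were queued, which is deterministic as long as the queueing order is.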
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
Ok yeah, you're right.
Taking that into consideration, I now believe I misdiagnosed the problem earlier.
The circumstance leading up to my crash also should have always returned false:
Code: Select all
return a.pass < b.pass;
since both pointers were equal (same material).
But that shouldn't actually have caused a crash.
Damn.
Here's what I know for sure.
1. I had a reproducible crash in stable_sort
2. The stack was similar to the one linked to above.
3. I could only reproduce it in release, not debug, so the stack was a bit mangled due to release optimization.
4. The crash went away when I made the one line "fix" that I explained in my previous post.
5. I am using DirectX and Floating-point mode=Consistent
Right now I'm thinking my "fix" just covered up the real crash somehow, and it's not actually fixed. I'll have to look into it. Thanks everyone for your input.
- sparkprime
- Ogre Magi
- Posts: 1137
- Joined: Mon May 07, 2007 3:43 am
- Location: Ossining, New York
- x 13
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
dark_sylinc wrote: I've been thinking about it, and there shouldn't be a problem because if in the rare case two different memory regions are assigned to the same virtual address (which means "&a < &b" will fail) then pretty much everything else will also fail, and there's a bigger bug to worry about in the developer's code.
On a different angle, comparing pointers breaks determinism. Meaning the same render may be rendered different on two different runs, because the pointer addresses assigned to 'a' & 'b' were different.
This may affect automated tests, and potentially games relying on determinism (though, if the results of the sort are only used internally, then games using Ogre should be unaffected; as they never see or depend in any way on the sort results).
In other words, it's deterministic within a run, but not multiple re-runs of the application.
I realise the thread has moved on, but for the record:
ptr < ptr2 is only allowed (according to the standard) if they are pointers into the same allocated chunk, e.g. &arr[0] < &arr[10]
This is because memory is not required by the standard to be totally ordered.
The correct way is to use the std::less thing. std::map uses std::less.
However all of the systems we care about will do 'what you want' with ptr < ptr2 regardless of whether they are part of the same chunk.
For such systems, std::less is no slower than a raw < so even with that in mind, there is no reason not to use std::less.
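For reference, a minimal sketch of the std::less form. Pass here is an empty placeholder type and pointerBefore is a made-up helper name, neither is Ogre code.

```cpp
#include <cassert>
#include <functional>

// Hypothetical placeholder type, standing in for whatever is pointed to.
struct Pass {};

// Orders two unrelated pointers using std::less, which the standard
// library guarantees to be a strict total order for pointer types,
// even where the built-in < gives an unspecified result.
bool pointerBefore(const Pass* a, const Pass* b) {
    return std::less<const Pass*>()(a, b);
}
```

For two distinct objects exactly one of pointerBefore(&x, &y) and pointerBefore(&y, &x) is true, and pointerBefore(&x, &x) is always false, which is what a sort comparator needs.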
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
(btw- thanks sparkprime. Interesting).
I have a theory about this crash.
It is because of the epsilon in Math::RealEqual
Again, the relevant sort code, from ogrerenderqueuesortinggrouping.h
Code: Select all
if (Math::RealEqual(adepth, bdepth))
{
    // Must return deterministic result, doesn't matter what
    return a.pass < b.pass;
}
else
{
    // Sort DESCENDING by depth (i.e. far objects first)
    return (adepth > bdepth);
}
So, we compare in this order:
1. compare depth
2. compare pass pointer (if depth is equal)
For the sake of argument, let's say the epsilon in Math::RealEqual is 0.05.
Now, imagine we're sorting 3 RenderablePass, a, b, and c:
RenderablePass a: depth=1.01 pass=0x3
RenderablePass b: depth=1.04 pass=0x2
RenderablePass c: depth=1.07 pass=0x1
I will show you that these values lead to two incompatible conclusions, namely:
b < a < c
and
b > c
Details:
Step 1: compare a to b
a.depth == b.depth (within epsilon of 0.05)
-> so compare passes
a.pass > b.pass
conclusion: a > b
Step 2: compare a to c
a.depth < c.depth
conclusion: a < c
*** Based on step 1 and step 2, we can conclude: b < a < c
Everything is good so far.
Step 3: compare b to c
b.depth == c.depth (within epsilon of 0.05)
-> so compare passes
b.pass > c.pass
*** conclusion: b > c
The two lines marked above with *** are inconsistent. It cannot be true that:
b > c
and
b < a < c
This seems like it could possibly crash the sorting routine.
As you can see, the epsilon of Math::RealEqual can cause a series of RenderablePasses which have a very close, but non-equal depth to be sorted by depth or by pass in an inconsistent manner. Unfortunately, this Math::RealEqual check was explicitly added as a fix to an earlier crash bug (see earlier in this thread), so simply removing it is not an option.
I don't have a solution yet, but just wanted to post what I'd found so far.
Thanks for the support on this issue.
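The contradiction can be checked mechanically. The sketch below is only an illustration: Entry, kEpsilon and formsCycle are made-up names, and the comparator copies the shape of the quoted code rather than calling Ogre. One caveat: because the quoted code sorts DESCENDING by depth, the cycle appears when the pass addresses increase with depth, so the sketch uses passes 0x1, 0x2, 0x3 for a, b, c instead of the values listed above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for RenderablePass: a depth plus the pass
// address represented as an integer.
struct Entry {
    float depth;
    std::uintptr_t pass;
};

const float kEpsilon = 0.05f;  // exaggerated tolerance from the example

// Same shape as the quoted comparator: tie-break on pass when the depths
// are within epsilon, otherwise sort descending by depth.
bool depthSortDescendingLess(const Entry& a, const Entry& b) {
    if (std::fabs(b.depth - a.depth) <= kEpsilon)
        return a.pass < b.pass;
    return a.depth > b.depth;
}

// True when x sorts before y, y before z, and z before x:
// a cycle that no valid ordering can contain.
bool formsCycle(const Entry& x, const Entry& y, const Entry& z) {
    return depthSortDescendingLess(x, y) &&
           depthSortDescendingLess(y, z) &&
           depthSortDescendingLess(z, x);
}
```

With a = {1.01, 0x1}, b = {1.04, 0x2} and c = {1.07, 0x3}, formsCycle(a, b, c) is true: a and b tie on depth and a wins on pass, b and c tie and b wins on pass, but c and a are far enough apart that c wins on depth.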
-
- OGRE Retired Team Member
- Posts: 2903
- Joined: Thu Jan 18, 2007 2:48 pm
- x 58
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
Sounds reasonable. Unfortunately, I don't see an easy way out. If the < operator really isn't reliable, then what else is there to do?
I would, however, like to reexamine if the behaviour of < really isn't predictable (with fp:fast). That claim seems really odd to me, as it would break *any* sorting of floating point values, even a std::set<float> could potentially break. That just doesn't make sense to me.
Edit: Actually, even the original version before the epsilon comparison had a check on equality. Perhaps it was that check on equality that was inconsistent rather than the comparison operator? Either way, as long as the < operator is reliable, no equality check is needed at all, so maybe just try to remove that entirely?
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
I tracked down the bug.
It is not the issue I identified in my last post - although I believe that could still potentially be a problem.
I have a fix for both problems - included at the bottom of this post.
To understand my fix, make sure you read this thread from the top, and remind yourself why the Math::RealEqual check was initially added.
So it's still the same issue with float imprecision, except that the Math::RealEqual check doesn't solve the problem. Here's why.
From OgreMath.h:
Code: Select all
static bool RealEqual(Real a, Real b,
    Real tolerance = std::numeric_limits<Real>::epsilon());
From OgreMath.cpp:
Code: Select all
bool Math::RealEqual( Real a, Real b, Real tolerance )
{
    if (fabs(b-a) <= tolerance)
        return true;
    else
        return false;
}
As you can see, the default value for the tolerance variable is std::numeric_limits<Real>::epsilon().
This shows up in my debugger as: 9.9999997e-005
However, we're using this to see if some pretty gigantic floats are equal. The values of adepth and bdepth in my crash case are:
adepth=2555065.25
bdepth=2555065.25
Remember, these are squared distances. The actual (non-squared) distances are about 1600, so nothing abnormally large.
So the tolerance value is like 10^12 times smaller than the depth values being passed in to Math::RealEqual. But we only get about 6 or 7 digits of precision on a float. So basically that tolerance value is useless, and we're still affected by the strange behaviour of approximately equal floats to sometimes compare one way, and sometimes another way, as quoted by Sinbad above:
sinbad wrote: This suggests that this imprecision is returning opposite results when comparing 2 objects against each other. I didn't even know that was possible - even if floating point is imprecise it should still be deterministic, surely?
In fact, I verified that this exact problem was happening.
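The "useless tolerance" point can be shown with numbers. A 32-bit float has a 24-bit significand, so between 2^21 and 2^22 (which is where 2555065.25 sits) neighbouring floats are 0.25 apart. Any absolute tolerance below 0.25, whether the standard float epsilon of roughly 1.19e-7 or the 9.9999997e-005 seen in the debugger, therefore behaves like an exact == at this magnitude. The sketch assumes Real is a 32-bit float, and spacingAbove, absoluteEqual and relativeEqual are illustrative names only.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Distance from x to the next representable float above it.
float spacingAbove(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// Absolute-tolerance check with the same shape as Math::RealEqual.
bool absoluteEqual(float a, float b, float tolerance) {
    return std::fabs(b - a) <= tolerance;
}

// Tolerance that scales with the operands (about 6 significant digits),
// in the spirit of the fix at the bottom of this post.
bool relativeEqual(float a, float b) {
    float tolerance = std::fmax(std::fabs(a), std::fabs(b)) / 1000000.0f;
    return std::fabs(b - a) <= tolerance;
}
```

For depth = 2555065.25f and its nearest neighbour above, spacingAbove(depth) is 0.25, absoluteEqual rejects the pair with either small tolerance, and relativeEqual accepts it.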
I added logging to the sort function, as follows:
(This is an unaltered version of the existing sort function, except for the logging)
Code: Select all
/// Comparator to order objects by descending camera distance
struct DepthSortDescendingLess
{
    const Camera* camera;
    DepthSortDescendingLess(const Camera* cam)
        : camera(cam)
    {
    }
    bool _OgreExport operator()(const RenderablePass& a, const RenderablePass& b) const
    {
        std::stringstream ssLog;
        ssLog << std::endl
            << "sorting a (0x" << &a << ") vs b (0x" << &b << ")" << std::endl
            << " a.renderable=" << a.renderable << " a.pass=" << a.pass << std::endl
            << " b.renderable=" << b.renderable << " b.pass=" << b.pass;
        LogManager::getSingleton().logMessage( ssLog.str() );
        if (a.renderable == b.renderable)
        {
            // Same renderable, sort by pass hash
            return a.pass->getHash() < b.pass->getHash();
        }
        else
        {
            // Different renderables, sort by depth
            Real adepth = a.renderable->getSquaredViewDepth(camera);
            Real bdepth = b.renderable->getSquaredViewDepth(camera);
            if (Math::RealEqual(adepth, bdepth))
            {
                std::stringstream ssLog;
                ssLog
                    << "adepth=" << std::setprecision(20) << adepth << " bdepth=" << bdepth << std::endl
                    << "Depth equal, sort by pass. a.pass=" << a.pass << " b.pass=" << b.pass;
                LogManager::getSingleton().logMessage( ssLog.str() );
                // Must return deterministic result, doesn't matter what
                return a.pass < b.pass;
            }
            else
            {
                std::stringstream ssLog;
                ssLog
                    << "adepth=" << std::setprecision(20) << adepth << " bdepth=" << bdepth << std::endl
                    << "Depth not equal, sorting by depth.";
                LogManager::getSingleton().logMessage( ssLog.str() );
                // Sort DESCENDING by depth (i.e. far objects first)
                return (adepth > bdepth);
            }
        }
    }
};
And also, I frame the call to std::stable_sort with a "begin" and "end" log message, in QueuedRenderableCollection::sort:
Code: Select all
LogManager::getSingleton().logMessage( "begin std::stable_sort" );
std::stable_sort(
    mSortedDescending.begin(), mSortedDescending.end(),
    DepthSortDescendingLess(cam));
LogManager::getSingleton().logMessage( "end std::stable_sort" );
Here's the log I'm getting, leading up to the crash:
03:11:39: begin std::stable_sort
03:11:39: end std::stable_sort
03:11:39: begin std::stable_sort
03:11:39: end std::stable_sort
03:11:39: begin std::stable_sort
03:11:39: end std::stable_sort
03:11:39: begin std::stable_sort
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth not equal, sorting by depth.
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth equal, sort by pass. a.pass=02BEDE70 b.pass=02BFE6F0
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B08)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=5368E8E0 b.pass=0000001B
Look, it's comparing the same 2 RenderablePasses twice, once finding the depth not equal:
(this is a chunk from the log above):
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth not equal, sorting by depth.
And the 2nd time finding the depth equal:
(this is a chunk from the log above):
03:11:39:
sorting a (0x04F45B18) vs b (0x04F45B10)
a.renderable=0D1BCDD8 a.pass=02BEDE70
b.renderable=0D1BFC80 b.pass=02BFE6F0
03:11:39: adepth=2555065.25 bdepth=2555065.25
Depth equal, sort by pass. a.pass=02BEDE70 b.pass=02BFE6F0
I'm not quite sure why stable_sort is comparing the same 2 RenderablePasses twice, but for now I'm just assuming that's normal behaviour. Of course, the real issue is the inconsistent result of Math::RealEqual for the large depth values.
Fix:
Code: Select all
    bool _OgreExport operator()(const RenderablePass& a, const RenderablePass& b) const
    {
        if (a.renderable == b.renderable)
        {
            // Same renderable, sort by pass hash
            return a.pass->getHash() < b.pass->getHash();
        }
        else
        {
            // Different renderables, sort by depth
            Real adepth = a.renderable->getSquaredViewDepth(camera);
            Real bdepth = b.renderable->getSquaredViewDepth(camera);
            // Floats only have about 6 or 7 digits of precision, so provide a tolerance within this limit.
            float tolerance = std::max(adepth, bdepth) / 1000000.f;
            if (Math::RealEqual(adepth, bdepth, tolerance))
            {
                // Must return deterministic result, doesn't matter what
                return false;
            }
            else
            {
                // Sort DESCENDING by depth (i.e. far objects first)
                return (adepth > bdepth);
            }
        }
    }
};
To summarize, the 2 changes are:
1. providing a useful tolerance to Math::RealEqual.
2. returning false in the case the depths are found to be equal. (I haven't gone into the details about why this fixes the potential problem I identified last post, but I'm pretty sure it does. We can discuss that further if needed.)
With these changes, the crash no longer occurred.
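A different way to sidestep the problem, which nobody in this thread proposed or tested, would be to compute each depth exactly once before sorting, store it, and sort on the stored values with a plain > and no equality test. Once the value lives in memory as a float, every comparison should see the same number, and std::stable_sort keeps ties in queue order. This is only a sketch under those assumptions. DepthKey, fartherFirst and sortByStoredDepth are hypothetical names, not part of Ogre.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pairing of an item index with a depth computed exactly once.
struct DepthKey {
    float squaredDepth;   // stored value; never recomputed during the sort
    std::size_t index;    // position of the item in the original queue
};

// Sort DESCENDING by stored depth (far objects first).
bool fartherFirst(const DepthKey& a, const DepthKey& b) {
    return a.squaredDepth > b.squaredDepth;
}

// Returns the original indices ordered far-to-near; ties keep queue order.
std::vector<std::size_t> sortByStoredDepth(const std::vector<float>& depths) {
    std::vector<DepthKey> keys;
    keys.reserve(depths.size());
    for (std::size_t i = 0; i < depths.size(); ++i)
        keys.push_back(DepthKey{depths[i], i});
    std::stable_sort(keys.begin(), keys.end(), fartherFirst);

    std::vector<std::size_t> order;
    order.reserve(keys.size());
    for (const DepthKey& key : keys) order.push_back(key.index);
    return order;
}
```

The cost is one extra array of keys per sort; whether that is acceptable for the render queue is a separate question.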
-
- OGRE Retired Team Member
- Posts: 2903
- Joined: Thu Jan 18, 2007 2:48 pm
- x 58
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
Have you tried what happens if you remove the equality test altogether?
- Jabberwocky
- OGRE Moderator
- Posts: 2819
- Joined: Mon Mar 05, 2007 11:17 pm
- Location: Canada
- x 218
- Contact:
Re: Inconsistancy in floating-point modes can cause rare cra
Checking that now.