So you were ahead of us all with this Lockless Queue stuff!xavier wrote:In this case, the element count *is* your gatekeeper,
Threaded Game Engine
- CaseyB
- OGRE Contributor
- Posts: 1335
- Joined: Sun Nov 20, 2005 2:42 pm
- Location: Columbus, Ohio
- x 3
- Contact:
- xavier
- OGRE Retired Moderator
- Posts: 9481
- Joined: Fri Feb 18, 2005 2:03 am
- Location: Dublin, CA, US
- x 22
Looks like PortMidi was ahead of all of us. The only difference between the algorithm I was describing and what LDK does, is they check head against tail for the empty case, which works just as well since the queue will have an element to read once the pointers are not the same. It also has the same restriction on one-reader-one-writer, which is fine in this case because the only thread that should be writing to the rendering queue is the main logic thread (which should get its updates from the other threads or subprocessing, such as physics and input). In fact, I would design the engine in such a way that you only ever have one reader and one writer for any given message queue.
The main logic thread would also write updates to an audio rendering thread (not so sure that an audio manager needs to be in its own thread since libraries such as OpenAL and fmod will already have threads internally, much like PhysX runs its world in a separate thread for you). So typically you would have at least the following threads in a game engine:
- rendering
- main loop (can contain physics and audio processing for the reasons stated above)
- queue management thread
- input
I put input in a separate thread because I would rather input events keep being processed regardless of the speed of the main loop -- YMMV on this, the separate input is my preference (and it doesn't hurt anything, it just passes messages to the main loop using the same queue mechanism).
That queue implementation looks fine to me for a message queue, Zeal. Good find Frenetic!
[edit: forgot the queue management thread, which removes the "what if the OS puts a thread to sleep" issue raised in Frenetic's lockfree queue link below]
The main logic thread would also write updates to an audio rendering thread (not so sure that an audio manager needs to be in its own thread since libraries such as OpenAL and fmod will already have threads internally, much like PhysX runs its world in a separate thread for you). So typically you would have at least the following threads in a game engine:
- rendering
- main loop (can contain physics and audio processing for the reasons stated above)
- queue management thread
- input
I put input in a separate thread because I would rather input events keep being processed regardless of the speed of the main loop -- YMMV on this, the separate input is my preference (and it doesn't hurt anything, it just passes messages to the main loop using the same queue mechanism).
That queue implementation looks fine to me for a message queue, Zeal. Good find Frenetic!
[edit: forgot the queue management thread, which removes the "what if the OS puts a thread to sleep" issue raised in Frenetic's lockfree queue link below]
Last edited by xavier on Fri Dec 01, 2006 7:20 am, edited 2 times in total.
- Frenetic
- Bugbear
- Posts: 806
- Joined: Fri Feb 03, 2006 7:08 am
That would probably be optimal. Shouldn't be hard to change.CaseyB wrote:Couldn't you just return false instead to let the user know that the method failed.
More interesting links on this subject:
http://www.boyet.com/Articles/LockfreeQueue.html
http://www.boyet.com/Articles/LockfreeFreeList.html
http://www.audiomulch.com/~rossb/code/lockfree/
My Google-fu is strong.
- CaseyB
- OGRE Contributor
- Posts: 1335
- Joined: Sun Nov 20, 2005 2:42 pm
- Location: Columbus, Ohio
- x 3
- Contact:
In the first example of a Lockless Queue, there is this codeand overflow was set hereMy question is why does it return after simply clearing the overflow? shouldn't it actually remove something from the queue?
Code: Select all
inline void dequeue(T& msg)
{
if(mQueue->mOverflow)
{
mQueue->mOverflow = false;
return;
}
...
}
Code: Select all
inline void enqueue(const T& msg)
{
size_t tail = mQueue->mTail;
memcpy((*mQueue)[tail], &msg, sizeof(T));
tail++;
if(tail == mSize)
tail = 0;
if(tail == mQueue->mHead)
{
mQueue->mOverflow = true;
throw LDK_LENGTH_ERROR("LocklessQueue overflow");
}
mQueue->mTail = tail;
}
- Frenetic
- Bugbear
- Posts: 806
- Joined: Fri Feb 03, 2006 7:08 am
- Frenetic
- Bugbear
- Posts: 806
- Joined: Fri Feb 03, 2006 7:08 am
Note that when the class overflows, it throws an "error" exception, which should indicate to the user that the state of the class is now "undefined". Still, the code does appear rather inconsistent.
It also has this gem in it:
I hate it when people do that. Everyone should know that 'delete NULL' is harmless. And testing with 'if' won't save you if the pointer's value is garbage.
I'm thinking this is a rather inconsistent class wrapping a higher-quality lockless queue implementation.
Still, it shouldn't be that hard to clean up.
It also has this gem in it:
Code: Select all
if(mQueue)
delete mQueue;
I'm thinking this is a rather inconsistent class wrapping a higher-quality lockless queue implementation.
Still, it shouldn't be that hard to clean up.
- CaseyB
- OGRE Contributor
- Posts: 1335
- Joined: Sun Nov 20, 2005 2:42 pm
- Location: Columbus, Ohio
- x 3
- Contact:
No, not at all! It's a really great starting point! I also don't understand theFrenetic wrote:Still, it shouldn't be that hard to clean up.
Code: Select all
inline ~QType()
{
if(mBuffer)
::free(mBuffer);
}
- Frenetic
- Bugbear
- Posts: 806
- Joined: Fri Feb 03, 2006 7:08 am
The :: operator with nothing in front of it means you are explicitly using the global namespace. In this case, the free() function is being used.CaseyB wrote:What is ::free all about?Code: Select all
inline ~QType() { if(mBuffer) ::free(mBuffer); }
You know, free(), as in that heathenous C function that should never, ever be used in C++ code?
[edit] Actually, it appears that code is just a hacky way of preventing the object's destructor from being called...
- CaseyB
- OGRE Contributor
- Posts: 1335
- Joined: Sun Nov 20, 2005 2:42 pm
- Location: Columbus, Ohio
- x 3
- Contact:
Ok, there is something that I am missing. I have taken the code that Frenetic linked ot and changed it up a bit and written a test for it. The message goes into the queue fine, but comes out trashed. Here's the code:
LocklessQueue.h
main.cpp
LocklessQueue.h
Code: Select all
#ifndef _LOCKLESSQUEUE_H_
#define _LOCKLESSQUEUE_H_
#include "OgreException.h"
namespace Blah {
template <class T>
class LocklessQueue
{
struct QType
{
size_t mHead;
size_t mTail;
T** mBuffer;
bool mOverflow;
inline QType(size_t size) : mHead(0), mTail(0),
mBuffer(NULL), mOverflow(false)
{
mBuffer = new T*[size];
if(!mBuffer)
throw Ogre::Exception(Ogre::Exception(1, "Could not allocate lockless queue.",
Ogre::String(__FUNCTION__)));
}
inline ~QType()
{
delete mBuffer[];
}
/*inline const T* operator[](size_t idx) const
{
return mBuffer[idx];
}*/
inline T* operator[](size_t idx)
{
return mBuffer[idx];
}
};
QType* mQueue;
const size_t mSize;
public:
inline LocklessQueue(size_t size) : mQueue(NULL),
mSize(size+1)
{
mQueue = new QType(mSize);
}
inline ~LocklessQueue()
{
delete mQueue;
}
inline bool push(T msg)
{
size_t tail = mQueue->mTail;
mQueue->mBuffer[tail] = &msg;
tail++;
if(tail == mSize)
tail = 0;
if(tail == mQueue->mHead)
{
mQueue->mOverflow = true;
return false;
}
mQueue->mTail = tail;
return true;
}
inline bool pop(T &msg)
{
if(mQueue->mOverflow)
mQueue->mOverflow = false;
size_t head = mQueue->mHead; //make sure this is written after access
if(head == mQueue->mTail)
return false;
&msg = mQueue->mBuffer[head];
head++;
if(head == mSize)
head = 0;
mQueue->mHead = head;
return true;
}
inline bool pop()
{
if(mQueue->mOverflow)
mQueue->mOverflow = false;
size_t head = mQueue->mHead; //make sure this is written after access
if(head == mQueue->mTail)
return false;
head++;
if(head == mSize)
head = 0;
mQueue->mHead = head;
return true;
}
inline bool full() const
{
size_t tail = mQueue->mTail + 1;
if(tail == mSize)
tail = 0;
return (tail == mQueue->mHead);
}
inline bool empty() const
{
return (mQueue->mHead == mQueue->mTail);
}
inline T* peek() const
{
size_t head = mQueue->mHead;
if(head == mQueue->mTail)
return NULL;
return (*mQueue)[head];
}
inline void clear()
{
mQueue->mTail = 0;
mQueue->mHead = 0;
mQueue->mOverflow = 0;
}
};
}//namespace Blah
#endif
Code: Select all
#include <iostream>
#include "boost/thread/thread.hpp"
#include "boost/thread/xtime.hpp"
#include "LocklessQueue.h"
struct message
{
std::string text;
int number;
};
Blah::LocklessQueue<message> *messageQueue;
void putter()
{
for(int i = 0; i < 1; i++)
{
message a;
a.text = "Hello";
a.number = i;
std::cout << "Putting " << a.text << " " << a.number << std::endl;
messageQueue->push(a);
boost::xtime sleepTime;
boost::xtime_get(&sleepTime, boost::TIME_UTC);
sleepTime.sec += 60;
boost::thread::sleep(sleepTime);
}
}
void getter()
{
int j = 0;
while(j < 1)
{
if(!messageQueue->empty())
{
message *b = messageQueue->peek();
messageQueue->pop();
std::cout << "Getting " << b->text << " " << b->number << std::endl;
j = b->number;
}
}
}
void main()
{
messageQueue = new Blah::LocklessQueue<message>(100);
boost::thread_group threads;
threads.create_thread(putter);
threads.create_thread(getter);
threads.join_all();
}
-
- Gnoblar
- Posts: 20
- Joined: Thu Feb 09, 2006 12:06 am
- Game_Ender
- Ogre Magi
- Posts: 1269
- Joined: Wed May 25, 2005 2:31 am
- Location: Rockville, MD, USA
Well since no one seems to of followed my Orocos link... there implementation of a lock free queue allows multiple readers and multi writers, and only depends on the CAS instruction. From everything I have read most lock free algorithms that don't use something like a CAS instruction have plenty of corner cases where they fail. Orocos is currently Unix only and is GPL, but hey at least its proof it can be done.
They have data objects that are lock free, with a one writer and multiple readers. This is at the cost of having to store something like (1 + 2 * X) copies of the object where X = number of readers.
They have data objects that are lock free, with a one writer and multiple readers. This is at the cost of having to store something like (1 + 2 * X) copies of the object where X = number of readers.
Last edited by Game_Ender on Fri Dec 01, 2006 4:46 pm, edited 1 time in total.
- Game_Ender
- Ogre Magi
- Posts: 1269
- Joined: Wed May 25, 2005 2:31 am
- Location: Rockville, MD, USA
-
- Gnoblar
- Posts: 20
- Joined: Thu Feb 09, 2006 12:06 am
I just checked it, but I can't see it is any different from slist - I believe there are many implementations out there (no need to use a GPL one).Well since no one seems to of followed my Orocos link...
Check this out:
http://www.codeproject.com/threads/fast_ipc.asp
It is for interprocess communication, but I found it explains everything very nicely.
EDIT:
Why wouldn't CAS work on multi-core processors?
-
- Gnoblar
- Posts: 20
- Joined: Thu Feb 09, 2006 12:06 am
- Game_Ender
- Ogre Magi
- Posts: 1269
- Joined: Wed May 25, 2005 2:31 am
- Location: Rockville, MD, USA
The copies are only needed for shared data objects (a seperate thing from the queue). I guess it comes down to which is slower, copying, or locking? Since locking is a syscall, I am inclineded to think in cases where the data isn't super large the copying will be faster. Of course the only way to do this is test.martin.enge wrote:Wouldn't it mean a lot of unnecessary copying (which is very expensive) to keep all this x*2 copies of the data?
- Zeal
- Ogre Magi
- Posts: 1260
- Joined: Mon Aug 07, 2006 6:16 am
- Location: Colorado Springs, CO USA
Wait a minute.. who cares about multiple readers/writers? We can get by with one reader/writer per thread. Thus, that LDK lockless queue SHOULD work, but casey's example doesnt... why?
Perhaps its something to do with cout? In that example I posted, a mutex was used (I thought), since two threads couldnt cout at the same time? Or is it something else....
*heres the quote from that tut
I have the entire 1.33.1 boost library installed. I use boost array all the time...
*crap I need to build the boost libraries dont I. Didnt have to do that with boost array..
Perhaps its something to do with cout? In that example I posted, a mutex was used (I thought), since two threads couldnt cout at the same time? Or is it something else....
*heres the quote from that tut
*im trying to test now, but everytime I include boost/thread I get the following compiler errorTwo new threads are created, which loop 10 times, writing out an id and the current loop count to std::cout, while the main thread waits for both to complete. The std::cout object is a shared resource, so each thread uses a global mutex to ensure that only one thread at a time attempts to write to it.
Code: Select all
LINK : fatal error LNK1104: cannot open file 'libboost_thread-vc71-mt-sgd-1_33_1.lib'
*crap I need to build the boost libraries dont I. Didnt have to do that with boost array..
- Praetor
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
- Contact:
Much of boost is templates, so it is header files only. There are a few exceptions that I try to avoid (and sometimes fail avoid). Boost's filesystem utilities and boost threading are two examples of projects that must be built. Boost.python too I believe. It's only marginally painful, because I found bjam just wouldn't work. If I wanted to use boost with a VC8 project, the library had to be built with vc8, so I needed to hand-make projects for those libraries. Not quite as laborious as I make it sound.
- Game_Ender
- Ogre Magi
- Posts: 1269
- Joined: Wed May 25, 2005 2:31 am
- Location: Rockville, MD, USA
You can think of any shared resource which multiple objects don't want to access? I think physics is the biggest one, the AI, sound, and graphics systems all want to know the state of physics objects. So they would all be requesting data from it.zeal wrote:...who cares about multiple readers/writers...
Offtopic:
Windows users have very little excuse not to at least try boost. They have an installer now.
I am going to say you were doing something wrong though Praetor, because bjam is used to automatically build and test boost for regression testing and I am sure they test VS compiler so it must be possible. Did you try there mailing list?
- xavier
- OGRE Retired Moderator
- Posts: 9481
- Joined: Fri Feb 18, 2005 2:03 am
- Location: Dublin, CA, US
- x 22
Threads, not objects. And not directly from the queue, which is what we are talking about in this case.Game_Ender wrote:You can think of any shared resource which multiple objects don't want to access? I think physics is the biggest one, the AI, sound, and graphics systems all want to know the state of physics objects. So they would all be requesting data from it.zeal wrote:...who cares about multiple readers/writers...
- Zeal
- Ogre Magi
- Posts: 1260
- Joined: Mon Aug 07, 2006 6:16 am
- Location: Colorado Springs, CO USA
Did you ever get your code to work Casey? I just got boost working (stupid libs), and im not sure its working for me either. I removed the cout from the putter, so hopefully with only one thread calling it there should be no problem.
However, all I see is "Getting _"
As if the putter thread isnt writing anything to the queue...
*btw the fact that im using an AMD Athlon shouldnt be a problem right? I mean even with a single core and no 'hyper threading' support, multiple boost threads should 'work', they just wont work in parallel, correct?
*well the tuts at...
http://www.ddj.com/dept/cpp/184401518
...seem to be working, so I guess that answers my question. Hooray for threads!
However, all I see is "Getting _"
As if the putter thread isnt writing anything to the queue...
*btw the fact that im using an AMD Athlon shouldnt be a problem right? I mean even with a single core and no 'hyper threading' support, multiple boost threads should 'work', they just wont work in parallel, correct?
*well the tuts at...
http://www.ddj.com/dept/cpp/184401518
...seem to be working, so I guess that answers my question. Hooray for threads!
- CaseyB
- OGRE Contributor
- Posts: 1335
- Joined: Sun Nov 20, 2005 2:42 pm
- Location: Columbus, Ohio
- x 3
- Contact:
- Zeal
- Ogre Magi
- Posts: 1260
- Joined: Mon Aug 07, 2006 6:16 am
- Location: Colorado Springs, CO USA
Heres another lockless queue... This one makes the other seem quite bloated. It also contains a pretty good explanation incase anyone is still a little fuzzy.
Midway down...
http://www.ddj.com/dept/cpp/193004231
Midway down...
http://www.ddj.com/dept/cpp/193004231
- Frenetic
- Bugbear
- Posts: 806
- Joined: Fri Feb 03, 2006 7:08 am
I notice this explicitly says that the queue should contain pointers, not actual data. This means a thread creates the object it wants to queue on the heap, and then just makes sure to not use the pointer once it is pushed into the lockless FIFO.Heres another lockless queue... http://www.ddj.com/dept/cpp/193004231
Basically, instead of transferring the data itself to the thread reading the queue, you are just transferfing ownership.
If someone tries to make a working LocklessQueue class, it might be better to use a pointer of the template type instead of dicking around with allocating shallow copies.
This also means arbitrarily large objects can be "sent" from thread to thread without worrying about performance.
- Zeal
- Ogre Magi
- Posts: 1260
- Joined: Mon Aug 07, 2006 6:16 am
- Location: Colorado Springs, CO USA
- CaseyB
- OGRE Contributor
- Posts: 1335
- Joined: Sun Nov 20, 2005 2:42 pm
- Location: Columbus, Ohio
- x 3
- Contact:
That's how mine works... well is almost working! By the way there's a logic error on that page. He increments _head BEFORE putting anything in the queue and then increments _tail AFTER taking something out of the queue. That means that the first read is on undefined memory. Anyway, I almost have it working, should be able to put it up in just a few minutes.