Threaded Game Engine

Discussion area about developing or extending OGRE, adding plugins for it or building applications on it. No newbie questions please, use the Help forum for that.
Post Reply
User avatar
CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:

Post by CaseyB »

xavier wrote:In this case, the element count *is* your gatekeeper,
So you were ahead of us all with this Lockless Queue stuff! :oops:
Image
Image
User avatar
xavier
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 9481
Joined: Fri Feb 18, 2005 2:03 am
Location: Dublin, CA, US
x 22

Post by xavier »

Looks like PortMidi was ahead of all of us. ;) The only difference between the algorithm I was describing and what LDK does, is they check head against tail for the empty case, which works just as well since the queue will have an element to read once the pointers are not the same. It also has the same restriction on one-reader-one-writer, which is fine in this case because the only thread that should be writing to the rendering queue is the main logic thread (which should get its updates from the other threads or subprocessing, such as physics and input). In fact, I would design the engine in such a way that you only ever have one reader and one writer for any given message queue.

The main logic thread would also write updates to an audio rendering thread (not so sure that an audio manager needs to be in its own thread since libraries such as OpenAL and fmod will already have threads internally, much like PhysX runs its world in a separate thread for you). So typically you would have at least the following threads in a game engine:

- rendering
- main loop (can contain physics and audio processing for the reasons stated above)
- queue management thread
- input

I put input in a separate thread because I would rather input events keep being processed regardless of the speed of the main loop -- YMMV on this, the separate input is my preference (and it doesn't hurt anything, it just passes messages to the main loop using the same queue mechanism).

That queue implementation looks fine to me for a message queue, Zeal. Good find Frenetic!

[edit: forgot the queue management thread, which removes the "what if the OS puts a thread to sleep" issue raised in Frenetic's lockfree queue link below]
Last edited by xavier on Fri Dec 01, 2006 7:20 am, edited 2 times in total.
Do you need help? What have you tried?

Image

Angels can fly because they take themselves lightly.
User avatar
Frenetic
Bugbear
Posts: 806
Joined: Fri Feb 03, 2006 7:08 am

Post by Frenetic »

CaseyB wrote:Couldn't you just return false instead to let the user know that the method failed.
That would probably be optimal. Shouldn't be hard to change.


More interesting links on this subject:

http://www.boyet.com/Articles/LockfreeQueue.html
http://www.boyet.com/Articles/LockfreeFreeList.html
http://www.audiomulch.com/~rossb/code/lockfree/

My Google-fu is strong. Image
User avatar
CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:

Post by CaseyB »

In the first example of a Lockless Queue, there is this code

Code: Select all

inline void dequeue(T& msg)
{
     if(mQueue->mOverflow)
     {
          mQueue->mOverflow = false;
          return;
     }

     ...
}
and overflow was set here

Code: Select all

inline void enqueue(const T& msg)
{
     size_t tail = mQueue->mTail;
     memcpy((*mQueue)[tail], &msg, sizeof(T));
     tail++;
     if(tail == mSize) 
          tail = 0;
     if(tail == mQueue->mHead)
     {
          mQueue->mOverflow = true;
          throw LDK_LENGTH_ERROR("LocklessQueue overflow");
     }
     mQueue->mTail = tail;
}
My question is why does it return after simply clearing the overflow? shouldn't it actually remove something from the queue?
Image
Image
User avatar
Frenetic
Bugbear
Posts: 806
Joined: Fri Feb 03, 2006 7:08 am

Post by Frenetic »

CaseyB wrote:In the first example of a Lockless Queue, there is this code
*snip*
My question is why does it return after simply clearing the overflow? shouldn't it actually remove something from the queue?
That, my good sir, is an excellent question. It certainly looks like an error, doesn't it?
User avatar
Frenetic
Bugbear
Posts: 806
Joined: Fri Feb 03, 2006 7:08 am

Post by Frenetic »

Note that when the class overflows, it throws an "error" exception, which should indicate to the user that the state of the class is now "undefined". Still, the code does appear rather inconsistent.

It also has this gem in it:

Code: Select all

    if(mQueue)
        delete mQueue;
I hate it when people do that. Everyone should know that 'delete NULL' is harmless. And testing with 'if' won't save you if the pointer's value is garbage.

I'm thinking this is a rather inconsistent class wrapping a higher-quality lockless queue implementation.

Still, it shouldn't be that hard to clean up.
User avatar
CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:

Post by CaseyB »

Frenetic wrote:Still, it shouldn't be that hard to clean up.
No, not at all! It's a really great starting point! I also don't understand the

Code: Select all

inline ~QType()
{
     if(mBuffer)
          ::free(mBuffer);
}
What is ::free all about?
Image
Image
User avatar
Frenetic
Bugbear
Posts: 806
Joined: Fri Feb 03, 2006 7:08 am

Post by Frenetic »

CaseyB wrote:

Code: Select all

inline ~QType()
{
     if(mBuffer)
          ::free(mBuffer);
}
What is ::free all about?
The :: operator with nothing in front of it means you are explicitly using the global namespace. In this case, the free() function is being used.

You know, free(), as in that heathenous C function that should never, ever be used in C++ code? :evil:

[edit] Actually, it appears that code is just a hacky way of preventing the object's destructor from being called...
User avatar
CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:

Post by CaseyB »

Ok, there is something that I am missing. I have taken the code that Frenetic linked ot and changed it up a bit and written a test for it. The message goes into the queue fine, but comes out trashed. Here's the code:

LocklessQueue.h

Code: Select all

#ifndef _LOCKLESSQUEUE_H_
#define _LOCKLESSQUEUE_H_

#include "OgreException.h"

namespace Blah {
	template <class T>
	class LocklessQueue
	{
		struct QType
		{
			size_t mHead;
			size_t mTail;
			T** mBuffer;
			bool mOverflow;

			inline QType(size_t size) : mHead(0), mTail(0),
				mBuffer(NULL), mOverflow(false)
			{
				mBuffer = new T*[size];
				if(!mBuffer)
					throw Ogre::Exception(Ogre::Exception(1, "Could not allocate lockless queue.",
						Ogre::String(__FUNCTION__)));
			}

			inline ~QType()
			{
				delete mBuffer[];
			}
			
			/*inline const T* operator[](size_t idx) const
			{
				return mBuffer[idx];
			}*/
			
			inline T* operator[](size_t idx)
			{
				return mBuffer[idx];
			}
		};

		QType* mQueue;
		const size_t mSize;

	public:
		inline LocklessQueue(size_t size) : mQueue(NULL),
			mSize(size+1)
		{
			mQueue = new QType(mSize);
		}
		
		inline ~LocklessQueue()
		{
			delete mQueue;
		}
				
		inline bool push(T msg)
		{
			size_t tail = mQueue->mTail;
			mQueue->mBuffer[tail] = &msg;
			tail++;
			if(tail == mSize) 
				tail = 0;
			if(tail == mQueue->mHead)
			{
				mQueue->mOverflow = true;
				return false;
			}
			mQueue->mTail = tail;
			return true;
		}
		
		inline bool pop(T &msg)
		{
			if(mQueue->mOverflow)
				mQueue->mOverflow = false;

			size_t head = mQueue->mHead; //make sure this is written after access
			
			if(head == mQueue->mTail)
				return false;

			&msg = mQueue->mBuffer[head];
			head++;
			if(head == mSize)
				head = 0;

			mQueue->mHead = head;
			return true;
		}

		inline bool pop()
		{
			if(mQueue->mOverflow)
				mQueue->mOverflow = false;

			size_t head = mQueue->mHead; //make sure this is written after access
			
			if(head == mQueue->mTail)
				return false;

			head++;
			if(head == mSize)
				head = 0;

			mQueue->mHead = head;
			return true;
		}
		
		inline bool full() const
		{
			size_t tail = mQueue->mTail + 1;
			if(tail == mSize)
				tail = 0;

			return (tail == mQueue->mHead);
		}
		
		inline bool empty() const
		{
			return (mQueue->mHead == mQueue->mTail);
		}
		
		inline T* peek() const
		{
			size_t head = mQueue->mHead;
			if(head == mQueue->mTail)
				return NULL;

			return (*mQueue)[head];
		}
		
		inline void clear()
		{
			mQueue->mTail = 0;
			mQueue->mHead = 0;
			mQueue->mOverflow = 0;
		}
	};
}//namespace Blah
#endif
main.cpp

Code: Select all

#include <iostream>

#include "boost/thread/thread.hpp"
#include "boost/thread/xtime.hpp"
#include "LocklessQueue.h"

struct message
{
	std::string text;
	int number;
};

Blah::LocklessQueue<message> *messageQueue;

void putter()
{
	for(int i = 0; i < 1; i++)
	{
		message a;
		a.text = "Hello";
		a.number = i;
		std::cout << "Putting " << a.text << " " << a.number << std::endl;
		messageQueue->push(a);

		boost::xtime sleepTime;
		boost::xtime_get(&sleepTime, boost::TIME_UTC);
		sleepTime.sec += 60;
		boost::thread::sleep(sleepTime);
	}
}

void getter()
{
	int j = 0;
	while(j < 1)
	{
		if(!messageQueue->empty())
		{
			message *b = messageQueue->peek();
			messageQueue->pop();
			std::cout << "Getting " << b->text << " " << b->number << std::endl;
			j = b->number;
		}
	}
}

void main()
{
	messageQueue = new Blah::LocklessQueue<message>(100);

	boost::thread_group threads;
	threads.create_thread(putter);
	threads.create_thread(getter);
	threads.join_all();
}
Image
Image
martin.enge
Gnoblar
Posts: 20
Joined: Thu Feb 09, 2006 12:06 am

Post by martin.enge »

About this lockless queue... would >1 readers/writers break thread-safety? Naively I would think it would...
User avatar
Game_Ender
Ogre Magi
Posts: 1269
Joined: Wed May 25, 2005 2:31 am
Location: Rockville, MD, USA

Post by Game_Ender »

Well since no one seems to of followed my Orocos link... there implementation of a lock free queue allows multiple readers and multi writers, and only depends on the CAS instruction. From everything I have read most lock free algorithms that don't use something like a CAS instruction have plenty of corner cases where they fail. Orocos is currently Unix only and is GPL, but hey at least its proof it can be done.

They have data objects that are lock free, with a one writer and multiple readers. This is at the cost of having to store something like (1 + 2 * X) copies of the object where X = number of readers.
Last edited by Game_Ender on Fri Dec 01, 2006 4:46 pm, edited 1 time in total.
User avatar
Game_Ender
Ogre Magi
Posts: 1269
Joined: Wed May 25, 2005 2:31 am
Location: Rockville, MD, USA

Post by Game_Ender »

(double post)
martin.enge
Gnoblar
Posts: 20
Joined: Thu Feb 09, 2006 12:06 am

Post by martin.enge »

Well since no one seems to of followed my Orocos link...
I just checked it, but I can't see it is any different from slist - I believe there are many implementations out there (no need to use a GPL one).

Check this out:

http://www.codeproject.com/threads/fast_ipc.asp

It is for interprocess communication, but I found it explains everything very nicely.

EDIT:
Why wouldn't CAS work on multi-core processors?
martin.enge
Gnoblar
Posts: 20
Joined: Thu Feb 09, 2006 12:06 am

Post by martin.enge »

Seems I didn't look hard enough in the AtomicQueue code before posting... there is definitely something else there. But I don't understand what, really.

Wouldn't it mean a lot of unnecessary copying (which is very expensive) to keep all this x*2 copies of the data?
User avatar
Game_Ender
Ogre Magi
Posts: 1269
Joined: Wed May 25, 2005 2:31 am
Location: Rockville, MD, USA

Post by Game_Ender »

martin.enge wrote:Wouldn't it mean a lot of unnecessary copying (which is very expensive) to keep all this x*2 copies of the data?
The copies are only needed for shared data objects (a seperate thing from the queue). I guess it comes down to which is slower, copying, or locking? Since locking is a syscall, I am inclineded to think in cases where the data isn't super large the copying will be faster. Of course the only way to do this is test.
User avatar
Zeal
Ogre Magi
Posts: 1260
Joined: Mon Aug 07, 2006 6:16 am
Location: Colorado Springs, CO USA

Post by Zeal »

Wait a minute.. who cares about multiple readers/writers? We can get by with one reader/writer per thread. Thus, that LDK lockless queue SHOULD work, but casey's example doesnt... why?

Perhaps its something to do with cout? In that example I posted, a mutex was used (I thought), since two threads couldnt cout at the same time? Or is it something else....

*heres the quote from that tut
Two new threads are created, which loop 10 times, writing out an id and the current loop count to std::cout, while the main thread waits for both to complete. The std::cout object is a shared resource, so each thread uses a global mutex to ensure that only one thread at a time attempts to write to it.
*im trying to test now, but everytime I include boost/thread I get the following compiler error

Code: Select all

LINK : fatal error LNK1104: cannot open file 'libboost_thread-vc71-mt-sgd-1_33_1.lib'
I have the entire 1.33.1 boost library installed. I use boost array all the time...

*crap I need to build the boost libraries dont I. Didnt have to do that with boost array..
User avatar
Praetor
OGRE Retired Team Member
OGRE Retired Team Member
Posts: 3335
Joined: Tue Jun 21, 2005 8:26 pm
Location: Rochester, New York, US
x 3
Contact:

Post by Praetor »

Much of boost is templates, so it is header files only. There are a few exceptions that I try to avoid (and sometimes fail avoid). Boost's filesystem utilities and boost threading are two examples of projects that must be built. Boost.python too I believe. It's only marginally painful, because I found bjam just wouldn't work. If I wanted to use boost with a VC8 project, the library had to be built with vc8, so I needed to hand-make projects for those libraries. Not quite as laborious as I make it sound.
User avatar
Game_Ender
Ogre Magi
Posts: 1269
Joined: Wed May 25, 2005 2:31 am
Location: Rockville, MD, USA

Post by Game_Ender »

zeal wrote:...who cares about multiple readers/writers...
You can think of any shared resource which multiple objects don't want to access? I think physics is the biggest one, the AI, sound, and graphics systems all want to know the state of physics objects. So they would all be requesting data from it.

Offtopic:
Windows users have very little excuse not to at least try boost. They have an installer now.

I am going to say you were doing something wrong though Praetor, because bjam is used to automatically build and test boost for regression testing and I am sure they test VS compiler so it must be possible. Did you try there mailing list?
User avatar
xavier
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 9481
Joined: Fri Feb 18, 2005 2:03 am
Location: Dublin, CA, US
x 22

Post by xavier »

Game_Ender wrote:
zeal wrote:...who cares about multiple readers/writers...
You can think of any shared resource which multiple objects don't want to access? I think physics is the biggest one, the AI, sound, and graphics systems all want to know the state of physics objects. So they would all be requesting data from it.
Threads, not objects. And not directly from the queue, which is what we are talking about in this case.
Do you need help? What have you tried?

Image

Angels can fly because they take themselves lightly.
User avatar
Zeal
Ogre Magi
Posts: 1260
Joined: Mon Aug 07, 2006 6:16 am
Location: Colorado Springs, CO USA

Post by Zeal »

Did you ever get your code to work Casey? I just got boost working (stupid libs), and im not sure its working for me either. I removed the cout from the putter, so hopefully with only one thread calling it there should be no problem.

However, all I see is "Getting _"

As if the putter thread isnt writing anything to the queue...

*btw the fact that im using an AMD Athlon shouldnt be a problem right? I mean even with a single core and no 'hyper threading' support, multiple boost threads should 'work', they just wont work in parallel, correct?

*well the tuts at...

http://www.ddj.com/dept/cpp/184401518

...seem to be working, so I guess that answers my question. Hooray for threads!
User avatar
CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:

Post by CaseyB »

Nope, I never was able to get it working, so I am thinking of trying to write my own based on what I've been reading. If I get it working I'll post it here.
Image
Image
User avatar
Zeal
Ogre Magi
Posts: 1260
Joined: Mon Aug 07, 2006 6:16 am
Location: Colorado Springs, CO USA

Post by Zeal »

Heres another lockless queue... This one makes the other seem quite bloated. It also contains a pretty good explanation incase anyone is still a little fuzzy.

Midway down...

http://www.ddj.com/dept/cpp/193004231
User avatar
Frenetic
Bugbear
Posts: 806
Joined: Fri Feb 03, 2006 7:08 am

Post by Frenetic »

Heres another lockless queue... http://www.ddj.com/dept/cpp/193004231
I notice this explicitly says that the queue should contain pointers, not actual data. This means a thread creates the object it wants to queue on the heap, and then just makes sure to not use the pointer once it is pushed into the lockless FIFO.

Basically, instead of transferring the data itself to the thread reading the queue, you are just transferfing ownership.

If someone tries to make a working LocklessQueue class, it might be better to use a pointer of the template type instead of dicking around with allocating shallow copies.

This also means arbitrarily large objects can be "sent" from thread to thread without worrying about performance.
User avatar
Zeal
Ogre Magi
Posts: 1260
Joined: Mon Aug 07, 2006 6:16 am
Location: Colorado Springs, CO USA

Post by Zeal »

This also means arbitrarily large objects can be "sent" from thread to thread without worrying about performance.
I would agree, speed is key here (we are making a game afterall).

However, didnt xavier warn about passing pointers between threads? Or did I miss his point..
User avatar
CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:

Post by CaseyB »

That's how mine works... well is almost working! By the way there's a logic error on that page. He increments _head BEFORE putting anything in the queue and then increments _tail AFTER taking something out of the queue. That means that the first read is on undefined memory. Anyway, I almost have it working, should be able to put it up in just a few minutes.
Image
Image
Post Reply