Threaded Game Engine

CaseyB
OGRE Contributor
OGRE Contributor
Posts: 1335
Joined: Sun Nov 20, 2005 2:42 pm
Location: Columbus, Ohio
x 3
Contact:


Post by CaseyB »

I am going to start work on a threaded game engine and I am looking for advice. I have read a lot of articles and I want to make sure I understand what I've read. If you guys wouldn't mind having a look at this blog entry and commenting, I would be very grateful!
klauss
Hobgoblin
Posts: 559
Joined: Wed Oct 19, 2005 4:57 pm
Location: LS87, Buenos Aires, República Argentina.

Post by klauss »

Perhaps you'd like to take a look at the deeplayer engine (also at planning-only stage).

The idea there is also to make full use of multicore. I find some of the discussion there a bit naive (it shows some inexperience with multithreaded coding), but there are some neat ideas. Chuck (deeplayer's lead) and I have been talking through those naive solutions privately over mail, and we came to some interesting conclusions.

First, for instance, it makes a huge difference whether you want to write your own physics or not. If you do, you can do interesting stuff that would otherwise be prohibitive, like intermixing graphics with physics a bit. Our idea is that things should be as modular as possible, but at some point graphics and physics have a serious problem with resource dependencies and interlocks, and mixing them a bit allows elegant solutions to that problem.

Two design parameters may make the work on deeplayer not that suitable for you. One is the desire to write a physics engine almost from scratch (maybe only reusing core algorithms from other physics packages, like trimesh collisions and stuff like that). The other is that the engine is space-oriented. Space has specific physics needs - i.e., you really don't need stable stacking, and that's one hard problem less to handle ;)

Still... you might want to check it out. Or share ideas. Or whatever.
Oíd mortales, el grito sagrado...
Hey! What is it with that that?
Wing Commander Universe

Post by CaseyB »

Yeah, I'm not hard-core enough to make a physics engine from scratch! Mostly because I want it to be as versatile as possible, so I am going to integrate PhysX + NxOgre. But I'll have a look at your ideas and make comments. I am pretty new to threaded applications as well, so I probably won't be of much help! :lol: I do know that when Valve rewrote their Source Engine they used a threading standard called OpenMP. It's cross-platform and seems pretty easy to use.
Game_Ender
Ogre Magi
Posts: 1269
Joined: Wed May 25, 2005 2:31 am
Location: Rockville, MD, USA

Post by Game_Ender »

Where is the article stating that they used OpenMP? The Anandtech article I read said they used lock-free algorithms as the heart of their threading primitives, to implement lock-free message queues, shared data, etc. It did mention OpenMP, but I thought they listed it as an option, not as something they used. OpenMP does not support task-level parallelism, so it is limited in what it can do for you. It does sound like it would be very useful for doing data-level parallelism within a physics engine automagically.
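
For anyone wondering what a lock-free message queue between two threads can look like, here is a minimal sketch. It is not Valve's implementation (the article doesn't show one). It assumes C++11 atomics, which are newer than this thread, and exactly one producer thread and one consumer thread. The name SpscQueue is made up for illustration.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical single-producer/single-consumer ring buffer.
// Safe only when one thread calls push() and one other thread calls pop().
template <typename T, std::size_t Capacity>
class SpscQueue {
public:
    // Producer side: returns false if the queue is full.
    bool push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == head_.load(std::memory_order_acquire))
            return false;                              // full
        buffer_[tail] = value;
        tail_.store(next, std::memory_order_release);  // publish the item
        return true;
    }

    // Consumer side: returns false if the queue is empty.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire))
            return false;                              // empty
        out = buffer_[head];
        head_.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    T buffer_[Capacity];                // one slot is always left unused
    std::atomic<std::size_t> head_{0};  // next slot to read
    std::atomic<std::size_t> tail_{0};  // next slot to write
};
```

Neither side ever blocks: a full or empty queue is reported through the return value, and the caller decides whether to retry or do other work.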

Post by CaseyB »

:oops: You're right! Sorry, I've been reading so much that it all kind of runs together! I did read about someone using it though, I'll see if I can find that article.

-=EDIT=-
Ok, it was Tim Sweeney talking about Unreal Engine 3.
Chris Jones
Lich
Posts: 1742
Joined: Tue Apr 05, 2005 1:11 pm
Location: Gosport, South England
x 1

Post by Chris Jones »

you can always check out OGE.

it seems to be working ok for us currently, but it was the biggest thing that slowed us down and took many months to get right

good luck

Post by CaseyB »

I've just checked OGE out of CVS, thanks!

Post by CaseyB »

@Chris Jones:
You are using Boost to do the threading. I was just wondering what led to that decision?

It seems that Boost Threads implements only the basic needs of a multithreaded program. And OpenMP doesn't support functional multi-tasking. Is there another library that anyone would suggest looking into?

Post by Chris Jones »

we chose boost because IIRC (steven knows more on this subject than i do) it's cross-platform. i think a lot/all of the others are for specific platforms. i'm not 100% sure on the details, but boost has been fine for us, i don't think there are any current issues with boost::thread.

also, i don't think it's really added a huge dependency to OGE
steven
Gnoll
Posts: 657
Joined: Mon Feb 28, 2005 1:53 pm
Location: Australia - Canberra (ex - Switzerland - Geneva)
Contact:

Post by steven »

CaseyB wrote:@Chris Jones:
You are using Boost to do the threading. I was just wondering what led to that decision?
There are several other thread libs (pthreads, zthreads, etc).
IMO they each have issues:
* not fully cross-platform (works on one but not completely on another, works with VC but not mingw, etc).
* bugs: a simple test creates deadlocks.
* features that have limitations
* stalled development
* and so on.

This is my opinion. Some of those libraries have existed for years and are used perfectly well.
You just need to see what you need.
CaseyB wrote:It seems that Boost Threads has implemented only the basic needs of multithreaded program.
Yes.
But it works on all platforms and compilers (except exotic ones, but the other libs would not work there anyway) and this is our main criterion.
CaseyB wrote:And OpenMP doesn't support functional multi-tasking, is there another library that anyone would suggest looking into?
OpenMP is completely different. It parallelises sequential code.
You could use boost::thread AND OpenMP to parallelise different aspects of your engine.
For example, in OGE we use boost to put each main manager in a different thread. But we could later use OpenMP to subdivide the work of a manager to take advantage of cores that are not used - that is, when there are more than 6 cores/CPUs.
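
A rough sketch of that combination, for illustration only. This is not OGE code: the two "manager" functions are invented, and std::thread (C++11, newer than this thread) stands in for boost::thread, whose basic interface is similar. If the compiler is not invoked with OpenMP support, the pragma is ignored and the loop simply runs serially.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical "physics manager" step: the loop iterations are independent,
// so OpenMP can split them across whatever cores are free.
void integrate_positions(std::vector<float>& pos,
                         const std::vector<float>& vel, float dt) {
    const long n = static_cast<long>(pos.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        pos[i] += vel[i] * dt;
}

// Hypothetical "audio manager" step, standing in for any other subsystem.
void mix_audio(std::vector<float>& samples, float gain) {
    for (std::size_t i = 0; i < samples.size(); ++i)
        samples[i] *= gain;
}

// Coarse-grained level: one thread per manager.
// Fine-grained level: OpenMP inside a manager.
// Each manager touches only its own data here, so no locks are needed.
void run_one_frame(std::vector<float>& pos, const std::vector<float>& vel,
                   std::vector<float>& samples) {
    std::thread physics(integrate_positions, std::ref(pos), std::cref(vel), 0.5f);
    std::thread audio(mix_audio, std::ref(samples), 2.0f);
    physics.join();
    audio.join();
}
```

A real engine would keep the manager threads alive across frames instead of creating them every frame; the sketch only shows how the two levels of parallelism fit together.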

Post by CaseyB »

steven wrote:OpenMP is completely different. It parallelises sequential code.
You could use boost::thread AND OpenMP to parallelise different aspects of your engine.
For example, in OGE we use boost to put each main manager in a different thread. But we could later use OpenMP to subdivide the work of a manager to take advantage of cores that are not used - that is, when there are more than 6 cores/CPUs.
That is exactly what I want to do! For some reason using the two together never occurred to me! I'll give that a shot! Thanks!

Post by klauss »

I don't understand all this talk about multiple libraries and what they support and what they don't.

AFAIK, all you need is a way to start a new thread (a process sharing the same virtual address space as the parent process) and a few kinds of locks: simple mutexes, semaphores, critical sections (faster mutexes with spinlocks) and perhaps (because they're hard to implement) multiple-read-single-write locks. Most libraries, if not all, support those. The rest is just clever usage of the tools you're given.
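
As a small illustration of building one tool out of another, here is a sketch of a counting semaphore made from just a mutex and a condition variable. It assumes the C++11 standard library (newer than this thread); the class name and interface are made up for the example and are not taken from any of the libraries mentioned.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore built from a mutex and a condition variable.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    // Blocks until a permit is available, then takes it.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Takes a permit only if one is available right now.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

    // Returns a permit and wakes one waiting thread, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};
```

A semaphore created with two permits, for example, lets at most two threads into a section at once; a third caller of acquire() waits until someone calls release().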
Praetor
OGRE Retired Team Member
OGRE Retired Team Member
Posts: 3335
Joined: Tue Jun 21, 2005 8:26 pm
Location: Rochester, New York, US
x 3
Contact:

Post by Praetor »

Well there is a little more to it. Threads are just so platform-specific. So, if you were looking to stay cross-platform you have all sorts of problems. I've only dabbled with boost threads, but so far I'm very pleased with them. OpenMP is useful because for some types of code you can add a few directives and generate multi-threaded code automagically.
Falagard
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 2060
Joined: Thu Feb 26, 2004 12:11 am
Location: Toronto, Canada
x 3
Contact:

Post by Falagard »

klauss, if I understand correctly, OpenMP makes it simple to parallelize discrete pieces of functionality that take a lot of CPU time by splitting them across multiple cores.

An example might be some algorithm that takes a few seconds (or even minutes) to run. Using special directives in your code, you can take nested loops within a single function and parallelize them with OpenMP, and as long as the iterations don't depend on each other it'll spread the work across multiple threads without any extra programming work on your part to deal with things like mutexes, etc. Good for things like working on large pieces of data, decompression algorithms, processing images or sound files, things of that nature.

OpenMP is for fine-grained multi-threading at the micro level. boost::thread would be for multi-threading at the macro level.
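
A minimal sketch of that kind of loop, assuming a grayscale image stored row by row in a flat array. The function name and the image layout are made up for the example. Without OpenMP support enabled in the compiler, the pragma is ignored and the code runs on one thread with the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: brighten a grayscale image stored row by row.
// Every pixel is independent, so the outer loop can be split across threads.
void brighten(std::vector<unsigned char>& pixels, int width, int height,
              int amount) {
    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            const int v = pixels[idx] + amount;
            // Clamp to the largest value an 8-bit pixel can hold.
            pixels[idx] = static_cast<unsigned char>(v > 255 ? 255 : v);
        }
    }
}
```

Each thread gets a set of rows and nobody writes to the same pixel, which is why no mutex appears anywhere.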
Last edited by Falagard on Wed Nov 29, 2006 3:01 am, edited 1 time in total.

Post by CaseyB »

I am looking through the OGE code and the mutexes and scoped_locks are pretty straightforward, and I see the run methods, but I am having trouble finding where you fork off the new threads.
Zeal
Ogre Magi
Posts: 1260
Joined: Mon Aug 07, 2006 6:16 am
Location: Colorado Springs, CO USA

Post by Zeal »

I'm gonna try and get a Core 2 Duo here soon, so this topic is very interesting to me. I have also been looking at boost::thread, but it seems a tad over my head. I read the Valve article, and the way they presented it seemed very clear (almost too obvious).

Does anyone have any super basic tuts that would help explain...

1.) How to create two threads (say a simple while loop that just goes round and round), and run them each on a separate CPU core?

2.) It seems the next step would be resource 'sharing'. I'm curious as to your options when it comes to reading/writing a piece of data that may exist 'between' threads. Obviously this type of 'shared' data is to be used sparingly (due to the overhead of locking/unlocking), but you have to 'connect' the threads somehow, right?

3.) So if I could only comprehend how to do 1 & 2, that could take care of your 'coarse-grained multithreading' (physics on one core, rendering on another, etc.). Then the last thing would be to have each thread check the other cores to see if they go idle, and if they do, offload some of your 'fine-grained' calculations. Any info on this?

Again, I'm thinking about all this in very basic noob-caliber terms. It seems like you could get some good results with the above steps, can anyone educate me on what I'm missing?

Post by CaseyB »

Zeal wrote:Again, I'm thinking about all this in very basic noob-caliber terms.
Multi-core technology is new enough that everyone is pretty new to it all. :wink: Your best bet is to read as much as you can about the theory so that you get that down pretty well, because everything gets messy in practice! As for simple examples, I just found out that boost comes with some! But here's one that I found online:

Code:

#include <boost/thread/mutex.hpp>
#include <boost/thread/thread.hpp>
#include <iostream>

int count = 0;        // shared by all threads
boost::mutex mutex;   // guards count and std::cout

void increment_count()
{
   // The lock is released automatically when 'lock' goes out of scope.
   boost::mutex::scoped_lock lock(mutex);
   std::cout << "count = " << ++count << std::endl;
}

int main(int argc, char* argv[])
{
   boost::thread_group threads;
   for (int i = 0; i < 10; ++i)
      threads.create_thread(&increment_count);   // start 10 threads
   threads.join_all();                           // wait for all of them
   return 0;
}
It forks off 10 threads that run the increment_count() function. The line in this function that actually touches the shared memory is protected by a mutex, which means that only one thread can touch it at a time. Using OpenMP it would look like this:

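Code:

#include <omp.h>
#include <iostream>

int main()
{
	int i, count = 0;

	#pragma omp parallel shared(count) private(i)
	{
		#pragma omp for schedule(dynamic)
		for (i = 0; i < 10; i++)
		{
			// shared(count) only means every thread sees the same variable;
			// the critical section stops two threads updating it at once.
			#pragma omp critical
			std::cout << "count = " << ++count << std::endl;
		}
	}  /* end of parallel section */
	return 0;
}
As you can see, it looks a lot more like the serial code you would write now. Marking count as shared only tells OpenMP that every thread uses the same variable; it is the critical directive that actually protects the update. As for anything much more complex than this right now, I have no idea! :D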

Post by Zeal »

ooooohh.. That's cool. I like boost a lot, so I'd be happy to use it for this too.. I'll have to do more research on boost::thread and mutex when I get my new CPU..

Although I guess there is nothing stopping me from running tests on a single core.. it would just be slower than running a single thread, I'll bet :p

Post by CaseyB »

Zeal wrote:Although I guess there is nothing stopping me from running tests on a single core.. it would just be slower than running a single thread, I'll bet :p
With boost, yes. With OpenMP, the runtime by default normally picks a thread count to match the cores you have, so on a single core it should end up running on one thread.

Post by klauss »

CaseyB wrote:
Zeal wrote:Again, I'm thinking about all this in very basic noob-caliber terms.
Multi-core technology is new enough that everyone is pretty new to it all.
Not at all - servers have been running on multiple cores (i.e. SMP) for a long while. Not to mention supercomputers.
There's a lot of theory and practice about it - it's just not common knowledge for the bulk of the programming community, but a search should turn up plenty of info about best practices.

Cool stuff about that OpenMP - I had read a bit about it. I don't like it, it hides too much for my taste, but I see how people could fall in love with it ;)

Threading is a complex thing to master, and I don't think there's any kind of tut for noobs. If you don't go the distance(*), you end up with problems. You have to study it all, and pray, since you'll only learn by practicing. Most important of all, to get code that is at least safe, learn the locking patterns that avoid deadlocks very well - you don't want deadlocks in your code, they're extremely hard to debug.
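
One such pattern is to never hold one lock while waiting for a second one in an order another thread might reverse. A minimal sketch, assuming the C++11 standard library (newer than this thread); Inventory and give_gold are made-up names for the example.

```cpp
#include <mutex>

// Hypothetical shared resource guarded by its own mutex.
struct Inventory {
    std::mutex mutex;
    int gold;
    explicit Inventory(int g) : gold(g) {}
};

// Moves gold between two inventories. std::lock acquires both mutexes using
// a deadlock-avoidance algorithm, so give_gold(a, b) and give_gold(b, a)
// running at the same time cannot each grab one lock and wait on the other.
void give_gold(Inventory& from, Inventory& to, int amount) {
    if (&from == &to) return;  // locking the same mutex twice would be an error
    std::lock(from.mutex, to.mutex);
    // adopt_lock: the guards take over the already-held locks and release
    // them when the function returns.
    std::lock_guard<std::mutex> hold_from(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.mutex, std::adopt_lock);
    from.gold -= amount;
    to.gold += amount;
}
```

The naive version, locking from.mutex and then to.mutex by hand, works fine in testing and then deadlocks the first time two threads trade in opposite directions at the same moment.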

Then, you'll want to know the costs of creating threads - you don't want to create lots of threads just for fun; each thread incurs a certain overhead, even on multicore. Just to show, there's a tiny piece of code that will bring down pretty much any Linux distribution that doesn't limit processes per user (and probably any other OS as well):

Code:

while (1) fork();   /* fork bomb: do NOT run this on a machine you care about */
It creates processes (kinda like threads) exponentially, stealing CPU time until no other process can even add two numbers. In the end, the computer will be lost in task switching and no progress will be made on any thread. That's the empirical way of confirming that threads and processes do have an overhead.

Then, you'll have to learn about processor affinity. That's a complex issue, and I don't know a portable way of handling it - but the thing is, you sometimes need to know about such things. Take Windows and its performance counters: if your thread switches processors, the performance counter reading can change (it depends on the processor the thread is running on), so if your game's timing uses the performance counter, you'll have to make sure you query it on the same processor every time. Tricky stuff.
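
One way to avoid dealing with this by hand, assuming a C++11 standard library (newer than this thread), is to time frames with std::chrono::steady_clock, which is specified to be monotonic. Whether that fully hides the per-processor counter issue depends on the platform's implementation, so treat this as a sketch; FrameTimer is a made-up name.

```cpp
#include <chrono>

// Hypothetical frame timer based on std::chrono::steady_clock, a monotonic
// clock: its readings never go backwards.
class FrameTimer {
public:
    FrameTimer() : last_(std::chrono::steady_clock::now()) {}

    // Returns the seconds elapsed since the previous call (or construction).
    double tick() {
        const std::chrono::steady_clock::time_point now =
            std::chrono::steady_clock::now();
        const std::chrono::duration<double> elapsed = now - last_;
        last_ = now;
        return elapsed.count();
    }

private:
    std::chrono::steady_clock::time_point last_;
};
```

Calling tick() once per frame gives the delta time to feed into the simulation.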

(*) I hope that's good usage ;)

Post by Chris Jones »

CaseyB wrote:I am looking through the OGE code and the mutexes and scoped_locks are pretty straightforward, and I see the run methods, but I am having trouble finding where you fork off the new threads
take a look inside ObjectManager::_run()

that function starts all the other managers' threads (although in 0.2 we are changing this slightly to allow you to choose which managers to start up)

it calls startThread on each manager, which in turn calls that manager's _run(), and that's the main loop for each manager
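
To make the shape of that concrete, here is a hypothetical sketch of a manager with a startThread() that spawns a thread running _run(). It only borrows the two method names from the description above; it is not OGE's actual code, and it uses std::thread and std::atomic from C++11 (newer than this thread) instead of boost.

```cpp
#include <atomic>
#include <thread>

// Hypothetical manager: the names mirror the description, not OGE's source.
class Manager {
public:
    Manager() : running_(false), ticks_(0) {}
    ~Manager() { stop(); }

    // Spawns the manager's own thread, which enters the main loop.
    // Meant to be called once per manager.
    void startThread() {
        running_ = true;
        thread_ = std::thread(&Manager::_run, this);
    }

    // Asks the loop to exit and waits for the thread to finish.
    void stop() {
        running_ = false;
        if (thread_.joinable()) thread_.join();
    }

    int ticks() const { return ticks_; }

private:
    // Main loop: one iteration per "tick" until stop() is called.
    void _run() {
        while (running_) {
            ++ticks_;                    // stand-in for the manager's real work
            std::this_thread::yield();
        }
    }

    std::atomic<bool> running_;
    std::atomic<int> ticks_;
    std::thread thread_;
};
```

An owning object would create one Manager per subsystem and call startThread() on each, which is roughly the role described for ObjectManager::_run() above.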

Post by steven »

CaseyB wrote:I am looking through the OGE code and the mutexes and scoped_locks are pretty straightforward, and I see the run methods, but I am having trouble finding where you fork off the new threads.
Chris is right to point to ObjectManager::_run()

If you want to have a better understanding of our design take a look at:
http://oge.dayark.com/wiki/index.php?ti ... on_Summary

But this isn't OGRE-related, so you should ask us on our OGE forum.
Bloodypriest
Goblin
Posts: 223
Joined: Thu Aug 18, 2005 2:54 pm

Post by Bloodypriest »

Game_Ender wrote:OpenMP does not support task level parallelism, so it is limited in what it can do for you.
Not true. OpenMP can do task-level parallelism, but it is damn clumsy.

Here's an example:

Code:

#pragma omp parallel sections num_threads(3)
{
    #pragma omp section
    {
        // task 1 code here
    }
    #pragma omp section
    {
        // task 2 code here
    }
    #pragma omp section
    {
        // task 3 code here
    }
}
And there is always the chance that OpenMP will not fully parallelize the code (for instance, if the maximum number of threads isn't high enough), so if you have tasks waiting for each other to complete, you might end up with an infinite wait when the sections end up running sequentially.

All in all, you should avoid "#pragma omp parallel sections" at all costs.
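
For contrast, a sketch of the one case where sections are harmless: tasks that never wait on each other. The struct and function names are made up for the example. Because each section writes only to its own field, the result is the same whether OpenMP runs the sections on three threads, on one thread, or (with OpenMP disabled) ignores the pragmas entirely.

```cpp
// Hypothetical per-frame results, one field per independent task.
struct FrameResult {
    int physics_steps;
    int ai_decisions;
    int sounds_mixed;
};

FrameResult run_sections() {
    FrameResult r = {0, 0, 0};
    #pragma omp parallel sections num_threads(3)
    {
        #pragma omp section
        {
            // task 1: touches only physics_steps
            for (int i = 0; i < 100; ++i) ++r.physics_steps;
        }
        #pragma omp section
        {
            // task 2: touches only ai_decisions
            for (int i = 0; i < 50; ++i) ++r.ai_decisions;
        }
        #pragma omp section
        {
            // task 3: touches only sounds_mixed
            for (int i = 0; i < 25; ++i) ++r.sounds_mixed;
        }
    }
    return r;
}
```

As soon as one section has to wait for a flag set by another, the sequential fallback described above can hang, which is the real argument against using sections for whole subsystems.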

Post by CaseyB »

Bloodypriest wrote:Not true. OpenMP can do task-level parallelism, but it is damn clumsy.
And also you can't branch into or out of those sections, so it's not fit to run an entire subsystem.

Post by CaseyB »

Interestingly, Microsoft mentions OpenMP as a supported option for Xbox 360 development on this MSDN page that discusses the DirectX SDK.