After another round of threading mishaps, I've been forced once again to reconsider our whole approach to threads. While I still like the concept of the elaborate messaging system in AppCore, it belongs in a far higher-level language than C++. It simply adds too much overhead to be useful in performance-intensive applications (and let's face it, why else does anyone use C++ in a post-Python world?). After reading a number of articles on the matter (the important ones are listed at the bottom), I have a few thoughts I'd love to get some feedback on.
Let's start with the original goals of threading. People often embrace multithreading only grudgingly, as a method of improving their application's performance. But I don't think that needs to be the only excuse. Performance is certainly one objective, but threading also has other benefits under certain circumstances. At a conceptual level, it makes a lot of sense to represent the logic, graphics, networking, and other systems independently. While they occasionally query each other for information, a game's logic loop shouldn't be forced to proceed in a fixed one-to-one correspondence with frame rendering. In fact, that approach introduces all sorts of problems when networking between computers of different calibers.
In his article The Free Lunch is Over (1), Herb Sutter argues that the concurrency revolution (a term he is also credited with originating) is the biggest shift in software development since the acceptance of Object-Oriented Programming. The difference is that object orientation, once one understands the concepts, is exceedingly intuitive. Multithreading, though intuitive in concept, is far from it in practice. Unlike objects, threads do not intuitively suggest relationships with one another, and this quickly becomes the primary complication of multithreaded development.
I forget where I saw the analogy of the two bakers, but it nicely illustrates the intuitive advantages of multithreading techniques. Let's say you need to make two cakes. Your first alternative is to repeat the one-cake procedure twice. This is the original single-threaded approach. The second is to mix a double batch of batter and then bake the two cakes separately. This approach is more akin to data parallelism (4), because you perform the labor-intensive work (making the batter) in parallel. The third method is to hire two bakers and tell them each to make a cake at the same time. While this represents a task-parallel (5) approach, the problem itself does not illustrate the full utility of a task-parallel solution because the actions to be performed are identical.
A situation where one must prepare both a roast and a cake is a far more appropriate candidate for task parallelism. In our new example, the programmer would describe both tasks, that of making a roast and that of making a cake, and then assign one chef (or thread) to each. This is quite intuitive, until you begin worrying about sharing the oven, staying out of each other's way, and dealing with unpredictable alterations to the workspace. For instance, in a single-threaded situation you could reasonably expect that, having shut the refrigerator door, you could safely open it again the next time you needed eggs. This is not the case in a multithreaded (read: multichef) environment. You might even crash your program by opening the door and attempting to retrieve a box of eggs, only to find the door had been shut in your face by another thread/chef.
There's plenty of room for both data-parallelism and task-parallelism in many resource-intensive applications. Libraries like TBB make most expensive loops fairly simple to (data) parallelize, which is very useful in optimizing for modern multicore systems. I don't think, however, that any of these innovations reduce the allure of task-parallelism. Though task parallelism does offer performance benefits as well, I (naively?) still see a huge potential for it to simplify matters rather than complicating them.
Object Orientation is a brilliant way of describing code in natively intuitive form. That's what has made it so wildly popular, one might even say indispensable, for large and complicated software projects. Intuitively, we would like to be able to think of threads as objects too, tangible "things" that we can manipulate and interact with, and describe relationships between. Instead, we can really only control the paths they follow. Multithreaded applications are like a spider-web of objects, each thread a spider scurrying along the web oblivious to its surroundings. That's fine, until the spiders collide. So we "thread-safe" our objects by imposing virtual traffic lights on our code-webs, preventing the most blatant types of collisions. Still, it's difficult to predict the more insidious possibilities, like weird traffic jams (deadlocks), and it takes additional overhead from both programmer and processor to maintain all these virtual stop-lights (semaphores, mutexes, condition variables).
These are my thoughts on what it would take to "objectify" a thread.
There are essentially three cases in which threads will overlap. The first is when they are querying an object under the jurisdiction of another thread/system, or an object that "belongs" to multiple systems. The second is when they are directly manipulating some aspect of said object. The third, on which I will focus, is when one thread must "trigger" a reaction of another thread/system.
The traditional approach is for the triggering thread to wade deep into the territory of the triggeree's system. But there are a few reasons you would rather "queue" such a call to be run later by the native thread.
First, from an intuitive standpoint, queueing is much more akin to the interaction of our chefs from the first example. Rather than one momentarily taking control of the other's limbs to perform some action (the traditional approach), queueing is the moral equivalent of communication: the first chef simply tells the second chef to perform a specific action. In the second example, each spider (thread) traverses its own subnet (system) and resists any large intrusion into its neighbor's subnet.
Second, queueing avoids costly blocks where the intruding thread must acquire heavily used locks before proceeding. Depending on the circumstances, my uneducated guess is that this will easily make up for the overhead of maintaining the queue in most ordinary situations. This discussion of crazy mutex webs leads into the next point, which I feel is the most important benefit yet.
Third, this technique avoids the debugging hell one enters when such intrusions cause unpredictable and deep-rooted deadlocks. While superficial calls can and should be handled in the originating thread, we can eliminate much of the complexity of multithreaded design by ensuring that the inner workings of a system are only manipulated by one thread at a time. This basically comes back to the idea that no matter what you do, there is no fool-proof method of thread-safing complex objects like an STL map or vector in a complicated multithreaded context. Queueing eliminates the type of bi-directional intrusions that cause deadlocks in the first place.
Ideally, different systems depend on each other in only one direction. From a design perspective, we often use the listener pattern to achieve this desired agnosticism. For example, the graphics system must query game objects to update the corresponding nodes' positions, but the game logic must also inform the graphics system should a new object be created. This is usually handled by the graphics system registering itself as a listener with the game logic, so that the game code need not explicitly know about the graphics code. With the traditional approach, however, the game thread will in fact stray far into the graphics system to make this callback, even while the game code remains blissfully ignorant of its inner workings.
The first step in applying this approach to a project is to determine the dependency directions between systems. To do so, it is helpful to look at the shared objects on the boundary layer between multiple systems. The "consumer" system or systems will access these objects only to inquire about their states, never to modify them. The single "producer" system has the lone ability to modify the object. The producer system may also need to notify all consumers should a particular event occur. In this case, it would still directly call the particular function for each listener (to keep it agnostic of inner workings, including the name of the thread running the system). The difference, however, is that this callback, should it need to perform any action demanding a lock, would serve as a wrapper which simply queues a call to another function. Because we don't need any locks for this, it is the only type of interaction that can go in both directions.
Here's how I would do this for a small subsection of our own game. To simplify matters, we will consider only three systems: the game logic, graphics, and networking. The game logic must notify both the graphics system and the networking system when certain events occur, including unit creation, movement, and shooting. Because we don't want the game logic to know or care about either of these systems, we register both as listeners for the relevant events. The game code has a number of 'unit' and 'projectile' objects which it must periodically update, making it the producer for those objects. Likewise, the network and graphics code must periodically check the state of those objects in order to update themselves, making them consumers of those objects. Finally, the network system must occasionally inform the game logic of an incoming transmission. To make all this work, we would first provide both of the shared objects, the units and projectiles, with locking mechanisms in their getter/setter functions. Then, we would make the game callbacks in the graphics system and the network callbacks in the game system non-locking, and instead queue calls to perform any involved work.
class Runnable;

/// Provides methods to manipulate the flow of a hardware thread
/// @note The static functions will probably be preferable in most cases.
class Thread
{
public:
    /// Return the name of the calling thread
    static std::string getName ();

    /// Return the calling thread's Thread object
    static Thread* getThread ();

    /// Check for stop signals and execute call queue
    static void breakpoint ();

    /// Queue a function to be called at the given thread's next breakpoint
    /// @note Thread will take ownership of the Runnable pointer.
    static void queueInThread (std::string thread, Runnable* call);

    /// Set the name of the calling thread
    static void setName (std::string name);

    /// Standard constructor
    Thread (std::string name, Runnable* entryPoint);

    /// Standard destructor
    virtual ~Thread ();

    /// See static breakpoint()
    void breakpoint ();

    /// See static getName()
    std::string getName ();

    /// See static queueInThread()
    void queue (Runnable* call);

    /// Start the thread if not already running
    virtual void run ();

    /// See static setName()
    void setName (std::string name);

    /// Cleanly exit this thread at the next breakpoint
    void stop ();
};

/// A simple interface for thread entry point or queued calls
class Runnable
{
public:
    /// Virtual destructor, since Thread deletes queued calls through this type
    virtual ~Runnable () {}

    /// Override this function with whatever action is to be performed
    virtual void run () = 0;
};
I found the following articles quite interesting and provocative:
2. http://www.theserverside.com/news/13651 ... t-Programs