[SoC 2007] Custom Memory Heaps and Object Allocators

Threads related to Google Summer of Code
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

[SoC 2007] Custom Memory Heaps and Object Allocators

Post by c4llidus »

Hello All,

This post is a bit overdue ;) It serves as my introduction to the Ogre community and outlines the plans I have for implementing custom object allocators and heaps for Ogre.

Here is what ill be working on, copied from my GSoC application.....

Thread safe object allocators and memory pool management.

The need for thread safe custom allocators and memory management has been identified by the Ogre 3D project. I intend to respond to and extend the identified requirements for a custom memory management model and propose the following project.

The memory system would be created from two key sections, first a global heap with a pluggable memory allocator that can support memory auditing and profiling. And secondly a custom base class that can be extended via inheritance to allow specific families of objects to utilise a unique heap and allocation model. Both the global heap and specific family heaps should be optional at compile time, permitting the disabling of custom memory systems and a reversion to stock C++ allocation model.

The most immediate and obvious advantage to implementing a custom heap framework is debugging and profiling. By utilising a "Debug Heap" an application developer would be able to access and log detailed information about memory usage, including size a frequency of allocations, most active periods of memory usage and possible corruption or leaking of memory.

Building this functionality directly into the engine has a number of advantages over using external memory checking tools. Firstly the user is alerted to a problem the moment is arises, when recent code, the likely cause of the issue, is fresh in their mind. Secondly a custom debug heap can modify the internal specifics of the memory allocation model to integrate with existing debug information paths, resulting in a unified debug information system.

As well as an invaluable debugging and profiling tool, custom memory heaps allow for tuning of memory footprint boundaries. Thus an end user would be able to restrict an applications memory footprint to a maximum threshold. Usefull for situations where the installed memory of the development system is larger than the intended system specifications.

Additionally, per object family heaps can be introduced along side global heaps or native C++ memory schemes, permitting optimisations and improvements in responding to specific allocation patterns. An example of this could be a "String Heap" for handling small highly dynamic object allocations, as typically seen in list, tree, graph and string data structures.

Specific heaps are easily hooked into object family hierarchies by the use of inheritable per class operator new and delete overrides. Permitting participating class instances to be created using custom, pluggable memory schemes. This should have the effect of reducing load times and stabaliseing or lowing the engines memory footprint.

Once the debugging faze is passed the developer will of course be intent on getting the best performance possible from the engine. The implementation of a fast and flexible heap to meet this demand is readily attainable, with excellent examples of fast memory allocators such as Doug Lee's dlmalloc[1] code available. Coupled with a flexible C++ class system the "Release Heap" can easily be plugged in to replace the debugging heap.

Integration of the new memory scheme should be almost transparent to the existing code. Implemented simply as base classes that can bolster existing code constructions, augmented with a control API.

................

Well thats an overview, I hope people will comment on this, feedback and ideas are welcomed :) Work has already started on some simple proof of concept code for allocators and how to plug them into the basic framework, the results of this and more specific details will be posted here and on the wiki.
User avatar
tuan kuranes
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 2653
Joined: Wed Sep 24, 2003 8:07 am
Location: Haute Garonne, France
x 4
Contact:

Post by tuan kuranes »

Nice. Can't wait for detailed specifications.

What memory statistics is planned in details ? (per frame new/delete ? per second ? min/max/average new/delete ? min/max/average new/delete at once ? Allocation time per object ? )

If I want to use your allocators for my GameEntities, what specification my class should follow ? just inherit from your base allocator class ?
User avatar
Project5
Goblin
Posts: 245
Joined: Mon Nov 22, 2004 11:56 pm
Location: New York, NY, USA

Post by Project5 »

Are you planning on using a mutex for thread safe access to these allocators or having a separate heap per thread?

--Ben
User avatar
Game_Ender
Ogre Magi
Posts: 1269
Joined: Wed May 25, 2005 2:31 am
Location: Rockville, MD, USA

Post by Game_Ender »

What about using this method to easily hook in more advanced allocators like Hoard?
User avatar
Wolfmanfx
OGRE Team Member
OGRE Team Member
Posts: 1525
Joined: Fri Feb 03, 2006 10:37 pm
Location: Austria - Leoben
x 99
Contact:

Post by Wolfmanfx »

HOARD is GPL or you can obtain a Commercial license i think the GPL is a no go.
User avatar
sinbad
OGRE Retired Team Member
OGRE Retired Team Member
Posts: 19269
Joined: Sun Oct 06, 2002 11:19 pm
Location: Guernsey, Channel Islands
x 66
Contact:

Post by sinbad »

Well, the whole idea is that the allocators be pluggable (at compile time), so if you chose to plug Hoarde in you should be able to with the appropriate bridge / adaptor code.
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

tuan kuranes : You will just need to inherit from the base class, unless you already have an overloaded version of new/delete.

Project5 : The thready safety mechanism will be built into the specific heap/allocator and not the main framework, a per-thread heap is a possibility and would have many advantages (particularly on SMP systems). A mutex-esk protected heap could optionally be plugged into it as well.

As sinbad has already stated, HOARD could be integrated into the system with a suitable wrapper (adapter) that would allow it to plug in as with any other allocator scheme, this is probably beyond the scope of this project but once the specifics are available it wont be too tricky to do.
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Image

Hi All,
Well here is some more information about the framework that will hold specific allocators for user objects. I have chosen to use policy based programming methods to build a flexible system as described below. For more information on policy programming see the book “Modern C++ Designâ€
beaugard
OGRE Contributor
OGRE Contributor
Posts: 265
Joined: Sun Mar 25, 2007 1:48 pm
x 2

Post by beaugard »

This is almost too sweet! This was already the most important gsoc project, and now we get policies! I had almost lost hope after Sinbad said he preferred controlling compile-time behaviour with #defines (ok, maybe it was more a question of what is worth refactoring than preference).

Yay!
User avatar
sinbad
OGRE Retired Team Member
OGRE Retired Team Member
Posts: 19269
Joined: Sun Oct 06, 2002 11:19 pm
Location: Guernsey, Channel Islands
x 66
Contact:

Post by sinbad »

(some discussion duplicated from private mails for completeness)

A very good start. I totally agree with the policy-based approach, as used in Loki (written by Andrei Alexandrescu of Modern C++ Design). Yes beaugard, using policies or not depends on the granularity of choice a needed over a particular aspect, and this is clearly one area where choice is needed. It's worth noting that there will be typedefs to encapsulate common policies, which themselves may be influenced by #defines, in order to simplify general / default use - although clearly the underlying policy system will be capable of more than that. For example, our objects will have default assigned allocators, which will have to be influenced by whether threading is enabled or not. The implementation will be via policies, but top-level control of default template instantiations may well be still via #defines. They are not mutually exclusive by any means.

I know at the moment you're concentrating on compile time allocator changes, but it would be interesting to see if a runtime allocator policy might be an option, albeit less efficient. Perhaps it would be possible to do this just via another allocator policy which could delegate to a runtime-plugged derived allocator - not as fast as a compiled in version but could be useful in some contexts - in particular I'm thinking wrappers like Mogre and python-ogre.
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Thanks for the enthusiastic post beaugard :)
Policies are a great way to configure a system at compile time and make use of static binding, but as sinbad says they do not exclude the use of #defines.

#defines will still be needed to allow a uniform way of selecting a set of default policies to fit a specific configuration, in a sense caching the choices made when deciding on how to construct the allocator.

This of course wont limit the choices available in any way, if a user wants to configure a specific allocator for their own needs then they can still wield the full range of policies (or build new ones) to do so. However, the high degree of flexibility can sometimes be more than a user wants. A selection of pre-built typedefs with #define choices will be very handy here, for example a "DefaultAllocatorType" could map back to "DebugAllocatorType" or "ReleaseAllocatorType" depending on the build configuration.

On Runtime Choices.
It should be simple enough build a policy that can hold a pointer to an allocator provided at runtime, although complications arise when considering how to enforce sanity. How to control when, how and if the user can change or specify an allocator and that the allocator is valid, can its maximum allocation size accommodate the objects for example. Is sharing of a particular heap allowed ? as the user could try to provide the same allocator pointer to several runtime policies.

Any runtime policy would also have to enforce and maintain thread safety mechanisms explicitly, its probably not a good idea to rely on the runtime supplied allocator for this.

Humm, food for thought. It should be more than possible to make it work but all the extra checking will have a performance hit. The runtime policy can extend the host class API (the Allocator) with required methods for runtime setup and these would need to be accessed via the AllocWrapper.

-Tim
User avatar
Project5
Goblin
Posts: 245
Joined: Mon Nov 22, 2004 11:56 pm
Location: New York, NY, USA

Post by Project5 »

That sounds great, I'm a huge fan of policies :-)

On the idea of profiling, will there also be hooks to catch function/file/line information to allow for a debug memory leak checker policy?

--Ben
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Project5,
Sorry to take so long over answering your question (Connection issues over a bank holiday suck). Humm, i dont see anything to prevent that level of allocation tracking, Ill be looking into profiling specifics once the basic framework is ready to receive them.

-Tim
User avatar
Project5
Goblin
Posts: 245
Joined: Mon Nov 22, 2004 11:56 pm
Location: New York, NY, USA

Post by Project5 »

@mycatboys: That question sounds a little familiar. Scroll up a little and you'll see it answered too :-P

From how c4llidus has described the project, this is not allocator in and of itself, it's a method for assigning allocators to Ogre. Presumably it'll have a few defaults to choose from, but that's not the focus as understand it.

Think of it more as allowing Ogre to use any allocator you have around. For instance, when this project's done, I'm going to see about hooking tcmalloc to it.

--Ben
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Hi all,
Just to keep people informed, the first sections of the allocator framework have been committed to the soc07-memory CVS branch.

So far its just the basic skeleton structure for the framework along with a testing "standard" allocator, one that just wraps new and delete with nothing fancy going on. So far so good.

Project5, the aim of this project is to create both an extensible framework to allow people to "wire in" any allocator type they would like and to build some allocators native to ogre. As far as hooking in tcmalloc, sounds good to me :) when the mainstay of the frame is in place ill be happy to answer any questions people have about trying to hook in different allocators.

As well as improvements to the code committed thus far, the next step will be building the described profiling and information parts. If any one has a burning desire for a particular feature in that area over what has been discussed so far, now would be a good time to suggest it ;)

- Tim
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Yet another over-due post, first an apology to all for being silent over these last few weeks, a power surge zapped my PC :cry:

Anyway, committed some new code and updated the wiki so things are getting back on track.

I have just added the profiling sections to the framework and hooked them into ogre's root object so it all starts up along with everything else. Information is collected by the MemProfileManager singleton class and dumped into a log file called OgreMemoryReport.log. I have built the profile manager in such away as to allow information to be logged in logical sections. I think this is the most useful way of looking at the memory report as its easy to see a clear breakdown in the runtime of an OGRE app (initialisation, loading, running, unloading, loading again... and so on)

To dump an section memory report, access the singleton class and call the flush("some identifying string") method. A report of memory activities since the last call the flush will be logged with the provided string attached.

When the memory profiling system shuts down, (this happens when the root object goes down) a final global report of all memory activity during the entire runtime is also logged, detailing any outstanding (leaked) allocations.

The actual information logged by the profiler is sparse at the moment, but this will increase as work goes on. An example currently looks like this:

15:07:42: Section Flush: Test
---------------------------------------------------------
Memory Profile Over 300 updates
Num Allocations : 300
Num Bytes Allocated : 12000
Num Deallocations : 0
Num Bytes Deallocated : 0
Average Allocations per Update: 1
Average Bytes per Allocation : 40
Outstanding Allocations : 300 ( 12000 bytes )
---------------------------------------------------------
15:07:42: Global Stats at Shutdown:
---------------------------------------------------------
Memory Profile Over 300 updates
Num Allocations : 300
Num Bytes Allocated : 12000
Num Deallocations : 0
Num Bytes Deallocated : 0
Average Allocations per Update: 1
Average Bytes per Allocation : 40
***LEAKED MEMORY*** :
12000 bytes in 300 allocations !
---------------------------------------------------------

The profiling code has also been made thread safe, each profiling policy maintaining a mutex so a request to read the collected information can't coincide with a request to update the information resulting from an allocation.
User avatar
sinbad
OGRE Retired Team Member
OGRE Retired Team Member
Posts: 19269
Joined: Sun Oct 06, 2002 11:19 pm
Location: Guernsey, Channel Islands
x 66
Contact:

Post by sinbad »

Looks good - perfect for a real-time display of categorised memory usage, which is precisely what we want out of this element of the project :)

Tracking the source file & line number of the leaks would of course be lovely (like the old memory manager does), although this does get messier and requires use of at least 'new' macros. However, if it helped add more context I would not object to needing to use OGRE_NEW instead of 'new' to pack more information in at source.
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Quick update,
I have added per-allocator information to the output in addition to the section over-view. So for each allocator object in existence a line is added to the section overview detailing how many (de)allocations belong to that specific allocator.

Each allocator is given a unique numeric ID and all logged info concerning that allocator will be tagged with it. This will be very handy for debugging (not least for debugging the memory heap policies them selves)

The new section info will look something like this :-

Section Flush: Test
---------------------------------------------------------
Memory Profile Over 300 updates
Num Allocations : 300
Num Bytes Allocated : 12000
Num Deallocations : 0
Num Bytes Deallocated : 0
Average Allocations per Update: 1
Average Bytes per Allocation : 40
Outstanding Allocations : 300 ( 12000 bytes )
---------------------------------------------------------
Per allocator stats :-
Allocator 0 Allocs 300 ( 12000 ) Deallocs 0 ( 0 )
---------------------------------------------------------

As for memory leak info, I do intend to add line/file/function for each leaked allocation. Some degree of macro usage will most likely be employed to this end.
User avatar
jacmoe
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark
x 179
Contact:

Post by jacmoe »

Really looking forward to using it! :)
/* Less noise. More signal. */
Ogitor Scenebuilder - powered by Ogre, presented by Qt, fueled by Passion.
OgreAddons - the Ogre code suppository.
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Thanks jacmoe :)

Just an update on the info reporting mechanism, added sanity checking for when a memory profile is removed (i.e. when the allocator is destroyed) that checks for any outstanding allocations. These allocations would be lost and the memory leaked if this is the case.

The log entry looks something like this:-


14:20:50: Removing Memory Profile 0
---------------------------------------------------------
***MEMORY ERROR DETECTED***
removed allocator has outstanding allocations!
880 allocations ( 41963 bytes )
---------------------------------------------------------


This should make spotting this class of memory leak easy. Ill be adding per allocation info to this report as well, for now I'm working towards hooking the general allocation path into OGRE to replace the old scheme. Once the generic usage is catered for work can start on building a mechanism to select specific allocators for specific jobs from a global set. This would work alongside the per-family allocators as indicated in the first post.
- Tim
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

An interesting problem has emerged while working on the framework, the C++ standard says operator delete can be overloaded as

void operator delete(void*, std::size_t)

the second parameter being the size of the memory to be deallocated, this is calculated by the compiler and passed to the function. This all sounds great, but it seems not to work :| sample code built to demo the principle works fine but when this overload is added into ogre (first turning off all of the old memory system) the compiler refuses to call it and instead calls the single parameter version providing no size info. This is odd to say the least so I'm posting it here in case anyone has any input on this.

That aside, the framework is now done (well as done as any early beta can be) the profile info is still very sparse but I have decided to turn my attention to the allocators themselves for now, revisiting the profile info once they are in place. This strategy should clearly show what info gathering methods are available. The frame is hooked into ogre using operator new/delete overloads for the most general path, once more specific allocators are created they will be made available along side the general one.
- Tim
btmorex
Gremlin
Posts: 156
Joined: Thu May 17, 2007 10:56 pm

Post by btmorex »

Normal delete always calls the single parameter version of delete.

If you want to call a customized delete (with 2 parameters for example), you have to call the destructor and that special version of delete explicitly.

That said, you should be able to overload the normal version of delete (one parameter) just fine.
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

Thanks for the input btmorex,
My understanding of the mechanism of delete is that the version with size_t is implicitly called if it is defined, this is certainly the case with the exaple code I have created. It sounds like your referring to the no-throw version of delete defined as :-

void operator delete(void*, const nothrow_t&) throw();

This indeed needs to be called explicitly as you describe. The overload of single parameter delete works as expected and is employed to direct deallocations to the global allocator presently. The lack of size info is going to be a pain for some allocator types though.
- Tim
btmorex
Gremlin
Posts: 156
Joined: Thu May 17, 2007 10:56 pm

Post by btmorex »

c4llidus wrote:Thanks for the input btmorex,
My understanding of the mechanism of delete is that the version with size_t is implicitly called if it is defined, this is certainly the case with the exaple code I have created. It sounds like your referring to the no-throw version of delete defined as :-

void operator delete(void*, const nothrow_t&) throw();

This indeed needs to be called explicitly as you describe. The overload of single parameter delete works as expected and is employed to direct deallocations to the global allocator presently. The lack of size info is going to be a pain for some allocator types though.
I'm pretty sure you can overload the nothrow versions too. afaik, you can't overload placement versions. For example, this is from the <new> header on my system:

Code: Select all

/** These are replaceable signatures:
 *  - normal single new and delete (no arguments, throw @c bad_alloc on error)
 *  - normal array new and delete (same)
 *  - @c nothrow single new and delete (take a @c nothrow argument, return
 *    @c NULL on error)
 *  - @c nothrow array new and delete (same)
 *
 *  Placement new and delete signatures (take a memory address argument,
 *  does nothing) may not be replaced by a user's program.
*/
void* operator new(std::size_t) throw (std::bad_alloc);
void* operator new[](std::size_t) throw (std::bad_alloc);
void operator delete(void*) throw();
void operator delete[](void*) throw();
void* operator new(std::size_t, const std::nothrow_t&) throw();
void* operator new[](std::size_t, const std::nothrow_t&) throw();
void operator delete(void*, const std::nothrow_t&) throw();
void operator delete[](void*, const std::nothrow_t&) throw();

// Default placement versions of operator new.
inline void* operator new(std::size_t, void* __p) throw() { return __p; }
inline void* operator new[](std::size_t, void* __p) throw() { return __p; }

// Default placement versions of operator delete.
inline void  operator delete  (void*, void*) throw() { }
inline void  operator delete[](void*, void*) throw() { }
Of particular note: "Placement new and delete signatures (take a memory address argument, does nothing) may not be replaced by a user's program."
c4llidus
Gnoblar
Posts: 16
Joined: Fri Apr 13, 2007 9:02 am
Location: Aberystwyth, Wales, UK

Post by c4llidus »

[ duplicate post killed ]
Last edited by c4llidus on Thu Jul 12, 2007 9:17 pm, edited 1 time in total.
- Tim
Post Reply