Any reason to overload global new and delete?

We overload the global new and delete operators where I work for many reasons:

  • pooling all small allocations -- decreases overhead, decreases fragmentation, can increase performance for small-alloc-heavy apps
  • framing allocations with a known lifetime -- ignore all the frees until the very end of this period, then free all of them together (admittedly we do this more with local operator overloads than global)
  • alignment adjustment -- to cacheline boundaries, etc.
  • alloc fill -- helping to expose usage of uninitialized variables
  • free fill -- helping to expose usage of previously deleted memory
  • delayed free -- increasing the effectiveness of free fill, occasionally increasing performance
  • sentinels or fenceposts -- helping to expose buffer overruns, underruns, and the occasional wild pointer (a combined fill-and-fencepost sketch follows this list)
  • redirecting allocations -- to account for NUMA, special memory areas, or even to keep separate systems separate in memory (for e.g. embedded scripting languages or DSLs)
  • garbage collection or cleanup -- again useful for those embedded scripting languages
  • heap verification -- you can walk through the heap data structure every N allocs/frees to make sure everything looks ok
  • accounting, including leak tracking and usage snapshots/statistics (stacks, allocation ages, etc)
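
To make the alloc-fill, free-fill, and fencepost items concrete, here is a rough sketch of the kind of overload I mean. It is illustrative only: the array, nothrow, and aligned forms are omitted, and the Header and kFence names are invented for the example.

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <new>

    namespace {
        constexpr std::uint32_t kFence     = 0xFDFDFDFD;  // fencepost pattern
        constexpr unsigned char kAllocFill = 0xCD;        // fresh-allocation fill
        constexpr unsigned char kFreeFill  = 0xDD;        // freed-memory fill

        // Stamped in front of every allocation; padded so the user pointer
        // stays aligned for std::max_align_t.
        struct alignas(std::max_align_t) Header {
            std::size_t   size;   // user-requested size
            std::uint32_t fence;  // front fencepost
        };
    }

    void* operator new(std::size_t size) {
        // Layout: [Header][user bytes...][rear fence]
        unsigned char* raw = static_cast<unsigned char*>(
            std::malloc(sizeof(Header) + size + sizeof(kFence)));
        if (!raw) throw std::bad_alloc();

        new (raw) Header{size, kFence};
        unsigned char* user = raw + sizeof(Header);
        std::memset(user, kAllocFill, size);                // expose uninitialized reads
        std::memcpy(user + size, &kFence, sizeof(kFence));  // rear fencepost
        return user;
    }

    void operator delete(void* ptr) noexcept {
        if (!ptr) return;
        unsigned char* user = static_cast<unsigned char*>(ptr);
        Header* hdr = reinterpret_cast<Header*>(user - sizeof(Header));

        std::uint32_t rear;
        std::memcpy(&rear, user + hdr->size, sizeof(rear));
        if (hdr->fence != kFence || rear != kFence)
            std::fprintf(stderr, "heap corruption detected at %p\n", ptr);

        std::memset(user, kFreeFill, hdr->size);            // expose use-after-free
        std::free(hdr);
    }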

The idea of new/delete accounting is really flexible and powerful: you can, for example, record the entire callstack for the active thread whenever an alloc occurs, and aggregate statistics about that. You could ship the stack info over the network if you don't have space to keep it locally for whatever reason. The types of info you can gather here are only limited by your imagination (and performance, of course).
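
As a purely hypothetical starting point, that hook can be as simple as a few atomic counters; a real system would capture the active thread's callstack at the marked spot (e.g. backtrace() from <execinfo.h> on glibc/macOS), taking care that the capture itself never re-enters operator new:

    #include <atomic>
    #include <cstdio>
    #include <cstdlib>
    #include <new>

    namespace {
        // Hypothetical counters; static storage guarantees they start at zero.
        std::atomic<std::size_t> gAllocCount{0}, gFreeCount{0}, gBytesRequested{0};
        std::atomic<std::size_t> gBySize[4]{};  // buckets: <=64, <=1024, <=8192, larger
    }

    void* operator new(std::size_t size) {
        ++gAllocCount;
        gBytesRequested += size;
        ++gBySize[size <= 64 ? 0 : size <= 1024 ? 1 : size <= 8192 ? 2 : 3];
        // <-- a full-featured tracker would record the callstack here
        if (void* p = std::malloc(size)) return p;
        throw std::bad_alloc();
    }

    void operator delete(void* p) noexcept {
        if (p) { ++gFreeCount; std::free(p); }
    }

    // Call from wherever a usage snapshot is handy (end of frame, a console command, ...).
    void DumpAllocStats() {
        std::fprintf(stderr,
            "allocs=%zu frees=%zu live=%zu bytes=%zu [<=64:%zu <=1K:%zu <=8K:%zu big:%zu]\n",
            gAllocCount.load(), gFreeCount.load(),
            gAllocCount.load() - gFreeCount.load(), gBytesRequested.load(),
            gBySize[0].load(), gBySize[1].load(), gBySize[2].load(), gBySize[3].load());
    }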

We use global overloads because it's convenient to hang lots of common debugging functionality there, as well as make sweeping improvements across the entire app, based on the statistics we gather from those same overloads.

We still do use custom allocators for individual types too; in many cases the speedup or extra capabilities you can get by providing a custom allocator for, e.g., a single point-of-use of an STL data structure far exceed the general speedup you can get from the global overloads.
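
To illustrate the point-of-use case, here is a hedged sketch of a tiny arena ("bump") allocator plugged into a single std::vector. The Arena and ArenaAllocator names are made up, and a real version would need to decide about growth, error handling, and thread safety:

    #include <cstddef>
    #include <cstdlib>
    #include <new>
    #include <vector>

    // Hypothetical bump allocator: hands out memory from one fixed buffer and
    // never frees individually -- the whole arena is released at once.
    struct Arena {
        char*       base;
        std::size_t capacity;
        std::size_t used = 0;
        explicit Arena(std::size_t bytes)
            : base(static_cast<char*>(std::malloc(bytes))), capacity(bytes) {}
        ~Arena() { std::free(base); }
    };

    template <class T>
    struct ArenaAllocator {
        using value_type = T;
        Arena* arena;

        explicit ArenaAllocator(Arena* a) : arena(a) {}
        template <class U>
        ArenaAllocator(const ArenaAllocator<U>& other) : arena(other.arena) {}

        T* allocate(std::size_t n) {
            // Bump the cursor up to T's alignment, then claim n*sizeof(T) bytes.
            std::size_t aligned = (arena->used + alignof(T) - 1) & ~(alignof(T) - 1);
            std::size_t bytes   = n * sizeof(T);
            if (aligned + bytes > arena->capacity) throw std::bad_alloc();
            arena->used = aligned + bytes;
            return reinterpret_cast<T*>(arena->base + aligned);
        }
        void deallocate(T*, std::size_t) noexcept {}  // freed en masse with the arena
    };

    template <class T, class U>
    bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) { return a.arena == b.arena; }
    template <class T, class U>
    bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) { return !(a == b); }

    int main() {
        Arena frame(1 << 20);  // 1 MB scratch space for this "frame"
        std::vector<int, ArenaAllocator<int>> v{ArenaAllocator<int>(&frame)};
        v.reserve(1000);       // every reallocation stays inside the arena
        for (int i = 0; i < 1000; ++i) v.push_back(i);
    }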

Take a look at some of the allocators and debugging systems that are out there for C/C++ and you'll rapidly come up with these and other ideas:

  • valgrind
  • electricfence
  • dmalloc
  • dlmalloc
  • Application Verifier
  • Insure++
  • BoundsChecker
  • ...and many others... (the gamedev industry is a great place to look)

(One old but seminal book is Writing Solid Code, which discusses many of the reasons you might want to provide custom allocators in C, most of which are still very relevant.)

Obviously if you can use any of these fine tools you will want to do so rather than rolling your own.

There are situations, though, where rolling your own is faster, easier, less of a business/legal hassle, or simply more instructive -- or where nothing is available for your platform yet. In those cases, dig in and write a global overload.


The most common reasons to overload new and delete are simply to check for memory leaks and to gather memory usage stats. Note that "memory leak" here usually generalizes to memory errors: you can also check for things such as double deletes and buffer overruns.
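
A purely illustrative version of that leak check is nothing more than a counter and a report at shutdown; real trackers also record sizes and callsites so the leak can actually be located:

    #include <atomic>
    #include <cstdio>
    #include <cstdlib>
    #include <new>

    namespace {
        std::atomic<long> gOutstanding{0};

        // Report at static-destruction time; late frees from other static
        // destructors can show up as false positives in this naive version.
        struct LeakReport {
            ~LeakReport() {
                if (long n = gOutstanding.load())
                    std::fprintf(stderr, "leak check: %ld allocation(s) never freed\n", n);
            }
        } gLeakReport;
    }

    void* operator new(std::size_t size) {
        if (void* p = std::malloc(size)) { ++gOutstanding; return p; }
        throw std::bad_alloc();
    }

    void operator delete(void* p) noexcept {
        if (p) { --gOutstanding; std::free(p); }
    }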

After that, the most common uses are memory-allocation schemes, such as garbage collection and pooling.

All other cases are just specific things mentioned in other answers (logging to disk, kernel use).


In addition to the other important uses mentioned here, like memory tagging, it's also the only way to force all allocations in your app to go through fixed-block allocation, which has enormous implications for performance and fragmentation.

For example, you may have a series of memory pools with fixed block sizes. Overriding global new lets you direct all 61-byte allocations to, say, the pool with 64-byte blocks, all 768-1024 byte allocs to the 1024-byte-block pool, everything above that to the 2048-byte-block pool, and anything larger than 8 KB to the general ragged heap.

Because fixed-block allocators are much faster and less prone to fragmentation than allocating willy-nilly from the heap, this lets you force even crappy third-party code to allocate from your pools and not poop all over the address space.
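
A hedged sketch of that routing is below. The pool sizes and block counts are invented, and a real implementation would also have to deal with thread safety and static-initialization order, both ignored here:

    #include <cstddef>
    #include <cstdlib>
    #include <new>

    // Hypothetical fixed-block pool: carves one slab into equal-size blocks
    // and threads a freelist through the free blocks themselves.
    class FixedPool {
    public:
        FixedPool(std::size_t blockSize, std::size_t blockCount)
            : block_(blockSize),
              slab_(static_cast<char*>(std::malloc(blockSize * blockCount))),
              end_(slab_ + blockSize * blockCount) {
            for (std::size_t i = 0; i < blockCount; ++i)   // push every block onto the freelist
                push(slab_ + i * blockSize);
        }
        std::size_t blockSize() const { return block_; }
        bool owns(void* p) const {
            const char* c = static_cast<const char*>(p);
            return c >= slab_ && c < end_;
        }
        void* allocate() {                 // pop the freelist (null if exhausted)
            void* p = head_;
            if (p) head_ = *static_cast<void**>(p);
            return p;
        }
        void release(void* p) { push(p); }
    private:
        void push(void* p) { *static_cast<void**>(p) = head_; head_ = p; }
        std::size_t block_;
        char*       slab_;
        char*       end_;
        void*       head_ = nullptr;
    };

    namespace {
        // Bucket sizes and counts are made up for the example.
        FixedPool gPools[] = { {64, 4096}, {1024, 1024}, {2048, 512}, {8192, 256} };
    }

    void* operator new(std::size_t size) {
        for (FixedPool& pool : gPools)
            if (size <= pool.blockSize())
                if (void* p = pool.allocate()) return p;   // full pool? try the next size up
        if (void* p = std::malloc(size)) return p;         // oversized (or pools exhausted): general heap
        throw std::bad_alloc();
    }

    void operator delete(void* p) noexcept {
        if (!p) return;
        for (FixedPool& pool : gPools)
            if (pool.owns(p)) { pool.release(p); return; }
        std::free(p);
    }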

This is often done in systems that are time- and space-critical, such as games. 280Z28, Meeh, and Dan Olson have described why.


UnrealEngine3 overloads global new and delete as part of its core memory management system. There are multiple allocators that provide different features (profiling, performance, etc.), and the engine needs all allocations to go through that system.

Edit: For my own code, I would only ever do it as a last resort. And by that I mean I would almost certainly never use it. But my personal projects are obviously much smaller and have very different requirements.