Super high performance C/C++ hash map (table, dictionary) [closed]

I need to map primitive keys (int, maybe long) to struct values in a high-performance hash map data structure.

My program will have a few hundred of these maps, and each map will generally have at most a few thousand entries. However, the maps will be "refreshing" or "churning" constantly; imagine processing millions of add and delete messages a second.

What libraries in C or C++ have a data structure that fits this use case? Or, how would you recommend building your own? Thanks!


Solution 1:

I would recommend trying Google SparseHash (or its C++11 version, Google SparseHash-c11) to see if it suits your needs. It provides a memory-efficient implementation (sparse_hash_map) as well as one optimized for speed (dense_hash_map). When I benchmarked it a long time ago, it was the fastest hash table implementation available, though it comes with drawbacks.

Solution 2:

> What libraries in C or C++ have a data structure that fits this use case? Or, how would you recommend building your own? Thanks!

Check out the LGPL'd Judy arrays. I have never used them myself, but they have been recommended to me on a few occasions.

You can also try benchmarking the standard-library containers (std::unordered_map, or the older non-standard hash_map extension, etc.). Depending on the platform, the implementation, and how you tune your source code (preallocate as much as you can, because dynamic memory management is expensive), they could be fast enough.
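To illustrate the preallocation advice, here is a minimal sketch using std::unordered_map. The Value struct, the helper names, and the 0.5 load factor are placeholders of mine, not anything from the question. Note that reserve() only sizes the bucket array up front; std::unordered_map still allocates one node per insert, so a custom allocator may be needed to remove that cost as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical value type standing in for the asker's struct.
struct Value {
    double price;
    std::int32_t quantity;
};

using Map = std::unordered_map<std::int64_t, Value>;

// Size the bucket array once so that up to `expected` entries never rehash.
Map make_preallocated_map(std::size_t expected) {
    Map m;
    m.max_load_factor(0.5f);  // fewer collisions at the cost of more buckets
    m.reserve(expected);      // buckets for `expected` entries at this load factor
    return m;
}

// Insert `n` keys, erase every other one, and return how many remain.
std::size_t churn(Map& m, std::int64_t n) {
    const std::size_t buckets_before = m.bucket_count();
    for (std::int64_t k = 0; k < n; ++k) m[k] = Value{static_cast<double>(k), 1};
    for (std::int64_t k = 0; k < n; k += 2) m.erase(k);
    // Holds as long as n does not exceed the reserved size.
    assert(m.bucket_count() == buckets_before);
    return m.size();
}
```

Calling make_preallocated_map(4096) once and then reusing the map across churn() calls keeps the bucket array stable for the "few thousand entries" case described in the question.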

Also, if the performance of the final solution matters more than its cost, you can buy a system with enough RAM to put everything into plain arrays. Access by index is unbeatable for performance.
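A rough sketch of the plain-array idea, assuming the keys fall in a known range [0, key_range) that is small enough to allocate in full. The DirectTable class and the Value struct are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical value type standing in for the asker's struct.
struct Value {
    double price;
    std::int32_t quantity;
};

// Direct-address table: the key itself is the array index, so there is
// no hashing, no probing, and no allocation after construction.
class DirectTable {
public:
    explicit DirectTable(std::size_t key_range)
        : slots_(key_range), used_(key_range, 0) {}

    void put(std::size_t key, const Value& v) {
        assert(key < slots_.size());
        slots_[key] = v;
        used_[key] = 1;
    }

    // Returns true if the key was present before the call.
    bool erase(std::size_t key) {
        assert(key < slots_.size());
        const bool was_present = used_[key] != 0;
        used_[key] = 0;
        return was_present;
    }

    // Returns nullptr when the key is absent or out of range.
    const Value* find(std::size_t key) const {
        return (key < slots_.size() && used_[key]) ? &slots_[key] : nullptr;
    }

private:
    std::vector<Value> slots_;
    std::vector<std::uint8_t> used_;  // 1 if the slot holds a live entry
};
```

The memory cost is proportional to the key range rather than to the number of live entries, which is why this only makes sense when RAM is plentiful or the keys are dense.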

> The add/delete operations are much (100x) more frequent than the get operation.

That hints that you might want to concentrate on improving algorithms first. If data are only written, not read, then why write them at all?

Solution 3:

Just use boost::unordered_map (or std::tr1::unordered_map, etc.) by default. Then profile your code and see whether the map is actually the bottleneck. Only then would I suggest analyzing your requirements precisely to find a faster substitute.
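As a starting point for measuring, a small timing harness like the one below can give a first number before reaching for a real profiler. It is only a sketch: the function name, the Value struct, and the sequential-key workload are my assumptions, and sequential keys are close to a best case, so real message traffic should replace them. It is templated on the map type, so the same code can time boost::unordered_map, std::unordered_map, or a candidate substitute with the same operator[] and erase interface.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <unordered_map>

// Hypothetical value type standing in for the asker's struct.
struct Value {
    double price;
    std::int32_t quantity;
};

// Run `rounds` passes of inserting and then erasing `n` keys, and return
// the average wall-clock time per add or delete in nanoseconds.
template <class Map>
double ns_per_op(Map& m, std::int64_t n, int rounds) {
    assert(n > 0 && rounds > 0);
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (std::int64_t k = 0; k < n; ++k) m[k] = Value{static_cast<double>(k), 1};
        for (std::int64_t k = 0; k < n; ++k) m.erase(k);
    }
    const std::chrono::duration<double, std::nano> elapsed = Clock::now() - start;
    return elapsed.count() / (2.0 * static_cast<double>(n) * rounds);
}
```

For example, ns_per_op(m, 4000, 1000) on a std::unordered_map<std::int64_t, Value> m approximates a few thousand entries churning repeatedly. Build with optimizations enabled, and treat the result as a rough indication rather than a substitute for profiling the whole program.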