Nginx: Optimal map_hash_max_size and map_hash_bucket_size for 1M map?

I have 1M static rewrite rules and am using this map configuration. How do I determine the optimal values for map_hash_max_size and map_hash_bucket_size? I want to optimize for memory consumption. The documentation says very little about this.
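For illustration, the setup looks roughly like this (the map file path, variable names, and the placeholder sizes below are made up for the example, not taken from my actual config):

    # Hypothetical sketch of the setup in question.
    http {
        map_hash_max_size 2097152;   # value under tuning
        map_hash_bucket_size 128;    # value under tuning

        # rewrites.map: ~1M lines of "old_uri new_uri;" pairs
        map $uri $new_uri {
            include /etc/nginx/rewrites.map;
        }

        server {
            listen 80;
            if ($new_uri) {
                return 301 $new_uri;
            }
        }
    }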

Somebody else asked this on the Nginx forum, but got no response.


I analyzed the source code behind server_names_hash_bucket_size and server_names_hash_max_size; as far as I can tell, map uses the same hash implementation.

Here is a generalized copy of my answer:

  • The general recommendation would be to keep both values as small as possible.
  • If nginx complains, increase max_size first and keep increasing it as long as it complains. If the number exceeds some big value (32769, for instance), increase bucket_size to a multiple of the default value on your platform instead, again for as long as nginx complains. Once it stops complaining, decrease max_size back as far as it goes without complaints. Now you have the best setup for your set of keys (each set of keys may need a different setup); a scripted version of this procedure is sketched after this list.
  • A bigger max_size means more memory consumed (once per worker or per server; please comment if you know which).
  • Bigger bucket_size means more CPU cycles (for every key lookup) and more transfers from main memory to cache.
  • max_size is not directly related to the number of keys. If the number of keys doubles, you may need to increase max_size ten times or even more to avoid collisions; if you cannot avoid them, you have to increase bucket_size.
  • bucket_size is said to be rounded up to the next power of two; from the source code I would judge it is enough to make it a multiple of the default value, which keeps transfers to cache optimal.
  • The appropriate bucket_size depends on the length of your keys. If the average key takes 32 bytes (including the hash array overhead), increasing bucket_size to 512 bytes means a single bucket can accommodate 16 keys with a colliding hash value. That is not something you want: when a collision happens, the bucket is searched linearly, so you want as few collisions as possible.
  • If max_size is less than 10000 and bucket_size is small, you can run into long configuration load times, because nginx would try to find the optimal hash size in a loop.
  • If max_size is greater than 10000, "only" 1000 loops are performed before it complains.
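Here is a minimal sketch of that trial loop, assuming nginx is on PATH and that `nginx -t` can run under your user (you may need to adjust paths or privileges); the map file path is hypothetical. It finds the smallest max_size that nginx accepts for a fixed bucket_size, using a binary search like the answer below describes:

    #!/usr/bin/env python3
    # Sketch only: probes `nginx -t` for the smallest map_hash_max_size
    # that a given map file loads with, at a fixed map_hash_bucket_size.
    import subprocess
    import tempfile

    CONF_TEMPLATE = """
    error_log stderr;
    events {{}}
    http {{
        map_hash_bucket_size {bucket};
        map_hash_max_size {max_size};
        map $uri $new_uri {{
            include {map_file};
        }}
    }}
    """

    def config_ok(bucket: int, max_size: int, map_file: str) -> bool:
        """True if `nginx -t` accepts a config with these hash sizes."""
        with tempfile.NamedTemporaryFile("w", suffix=".conf") as conf:
            conf.write(CONF_TEMPLATE.format(bucket=bucket, max_size=max_size,
                                            map_file=map_file))
            conf.flush()
            return subprocess.run(["nginx", "-t", "-c", conf.name],
                                  capture_output=True).returncode == 0

    def min_max_size(bucket: int, map_file: str, hi: int = 1 << 22) -> int:
        """Binary-search the smallest working max_size for this bucket_size
        (the 'decrease max_size back' step from the list above)."""
        if not config_ok(bucket, hi, map_file):
            raise RuntimeError("fails even at hi; increase bucket_size or hi")
        lo = 1
        while lo < hi:
            mid = (lo + hi) // 2
            if config_ok(bucket, mid, map_file):
                hi = mid
            else:
                lo = mid + 1
        return lo

    if __name__ == "__main__":
        # Hypothetical map file; adjust the path and bucket size to taste.
        print(min_max_size(bucket=128, map_file="/etc/nginx/rewrites.map"))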

The nginx documentation on the hash and bucket size is horribly vague. Are those numbers expressed in bytes? Entries?

I have a 128,592-byte map file with 1351 entries. The minimum values that worked for this case are:

    map_hash_bucket_size 128;
    map_hash_max_size 45948;

I don't know what the relationship between these numbers is. I arrived at them by increasing the bucket size to 128, then doing a binary search for the max size.
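A back-of-the-envelope check on these numbers (my reading, not verified against the source): 128,592 bytes / 1351 entries ≈ 95 bytes per entry, so a 128-byte bucket has room for roughly one average-sized entry plus the per-element overhead, which would explain why 128 was the first bucket size that worked here.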