Nginx: Optimal map_hash_max_size and map_hash_bucket_size for 1M map?
I have 1M static rewrite rules and am using this map configuration. How do I determine the optimal values for map_hash_max_size and map_hash_bucket_size? I want to optimize for memory consumption. The documentation is very minimal on this point.
Somebody else asked this on the Nginx forum, but got no response.
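For reference, here is a minimal sketch of the kind of setup in question; the map file path, the variable names, and the starting sizes are all hypothetical:

# in the http block; both values are hypothetical starting points
map_hash_max_size 2097152;
map_hash_bucket_size 128;

map $uri $redirect_target {
    default "";
    include /etc/nginx/rewrites.map;   # hypothetical file with ~1M "key value;" lines
}

A server block could then do return 301 $redirect_target; whenever $redirect_target is non-empty.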
I analyzed the source code for server_names_hash_bucket_size and server_names_hash_max_size; I believe the map uses the same hash implementation. Here is a generalized copy of my answer:
- The general recommendation is to keep both values as small as possible.
- If nginx complains, increase max_size first, for as long as it keeps complaining. If the number exceeds some big number (32769, for instance), increase bucket_size to a multiple of the default value on your platform instead, for as long as it keeps complaining. Once it stops complaining, decrease max_size back down for as long as it does not complain. Now you have the best setup for your set of keys (each set of keys may need a different setup); see the sketch after this list.
- A bigger max_size means more memory consumed (once per worker or per server; please comment if you know which).
- A bigger bucket_size means more CPU cycles for every key lookup and more transfers from main memory to the cache.
- max_size is not directly related to the number of keys. If the number of keys doubles, you may need to increase max_size 10 times or even more to avoid collisions. If you cannot avoid them, you have to increase bucket_size.
- bucket_size is said to be rounded up to the next power of two; judging from the source code, it should be enough to make it a multiple of the default value, which keeps transfers to the cache optimal.
- How much bucket_size helps depends on the length of your keys. If the average key size is 32 bytes (including the hash array overhead), increasing bucket_size to 512 bytes means a bucket can accommodate 16 keys with a colliding hash. That is not something you want: when a collision happens, nginx searches linearly, so you want as few collisions as possible.
- If max_size is less than 10000 and bucket_size is small, you can run into long loading times, because nginx tries to find the optimal hash size in a loop.
- If max_size is bigger than 10000, there will be "only" 1000 loops performed before it complains.
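To make the procedure concrete, here is a hypothetical end state on a platform with a 64-byte default bucket size; the exact numbers, and the exact wording of the startup error, will differ on your system:

# step 1: max_size was doubled while nginx kept complaining at startup
#         (an [emerg] message asking to increase map_hash_max_size)
# step 2: past ~32769 it still complained, so bucket_size was raised to
#         a multiple of the 64-byte platform default instead
map_hash_bucket_size 128;
# step 3: max_size was then decreased back down until just before nginx
#         complained again; the last working value is kept
map_hash_max_size 65536;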
The nginx documentation on the hash and bucket size is horribly vague. Are those numbers expressed in bytes? Entries?
I have a 128,592-byte map file with 1351 entries. The minimum values that worked for this case are:
map_hash_bucket_size 128;
map_hash_max_size 45948;
I don't know what the relationship between these numbers is. I arrived at them by increasing the bucket size to 128, then doing a binary search for the max size.
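My own hedged reading of the hash builder in src/core/ngx_hash.c is that bucket_size is a size in bytes while max_size caps the number of hash slots the builder is allowed to try, which would explain why the two numbers look unrelated:

map_hash_bucket_size 128;     # bytes per bucket; each entry costs its key length plus a small per-entry overhead
map_hash_max_size 45948;      # upper bound on the number of slots the hash builder may try, not a byte count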