Higher rmem_max value leading to more packet loss

Solution 1:

More buffer doesn't necessarily mean more speed; more buffer simply means more buffer. Below a certain size you'll see overflows, because the application can't always drain received data quickly enough. That's bad, but once there's enough buffer for the app to keep up at a reasonable rate, even through the occasional traffic spike, anything beyond that is likely wasted.
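
You can check whether overflows are actually happening rather than guessing. Here's a minimal sketch (Linux-specific, assuming `/proc/net/snmp` is readable) that reads the kernel's UDP counters; a rising `RcvbufErrors` count means datagrams were dropped because a socket's receive buffer was full:

```python
#!/usr/bin/env python3
"""Read system-wide UDP counters from /proc/net/snmp (Linux)."""

def udp_counters(path="/proc/net/snmp"):
    with open(path) as f:
        lines = [line.split() for line in f if line.startswith("Udp:")]
    # First "Udp:" line holds field names, second holds the values.
    names, values = lines[0][1:], lines[1][1:]
    return dict(zip(names, map(int, values)))

if __name__ == "__main__":
    stats = udp_counters()
    print("InDatagrams :", stats["InDatagrams"])
    print("InErrors    :", stats["InErrors"])
    print("RcvbufErrors:", stats.get("RcvbufErrors", 0))
```

Sample the counters before and after a test run; if `RcvbufErrors` doesn't move, a bigger buffer won't buy you anything.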

If you go -too- large, you place a much larger burden on the kernel to find and allocate memory, which, ironically, can itself lead to packet loss. My hunch is that this is what you're seeing, but other metrics would be needed to confirm it.
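
One of those metrics is per-socket drops. As a rough sketch (again Linux-only, and the address decoding assumes a little-endian host), the last column of each row in `/proc/net/udp` is that socket's drop counter, so you can watch which sockets overflow as you vary rmem_max:

```python
#!/usr/bin/env python3
"""Report per-socket UDP drop counts from /proc/net/udp (Linux)."""
import socket
import struct

def hex_addr(s):
    """Decode the kernel's hex 'ADDR:PORT' into 'a.b.c.d:port'."""
    addr, port = s.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(addr, 16)))
    return f"{ip}:{int(port, 16)}"

def udp_drops(path="/proc/net/udp"):
    with open(path) as f:
        next(f)  # skip the header row
        for line in f:
            fields = line.split()
            yield hex_addr(fields[1]), int(fields[-1])

if __name__ == "__main__":
    for local, drops in udp_drops():
        if drops:
            print(f"{local}: {drops} dropped")
```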

The 2.5M figure likely comes from recommendations for setting rmem and wmem values for TCP, where the relationship between window sizing and buffer settings can have significant effects under certain circumstances. That said, TCP != UDP, but some folks assume that what helps TCP will also help UDP. You've got the right empirical data. If I were you, I'd stick with the 256K value and call it even.
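
If you pin the buffer per socket rather than system-wide, here's a small sketch of how that looks. On Linux, `setsockopt(SO_RCVBUF)` is silently capped at `net.core.rmem_max`, and `getsockopt` reports double the effective value (the kernel reserves the extra half for its own bookkeeping), so reading the value back tells you what you actually got:

```python
#!/usr/bin/env python3
"""Request a 256K receive buffer on a UDP socket and verify the grant."""
import socket

REQUESTED = 256 * 1024  # the 256K value that tested well empirically

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED)
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print(f"requested: {REQUESTED}")
print(f"granted:   {granted} (kernel reports 2x the usable size)")
sock.close()
```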

Solution 2:

The problem is that there are usually several switches in the path between the two endpoints (i.e., the servers). With rmem you can increase the buffer sizes at the endpoints, but that does nothing for the buffers in the switches, which are quite limited. So you can still lose packets to overflows in the switch buffers.