How conservative should I be when considering current cache use on a system?

Periodically, as with any large online service, we evaluate the current load on our hardware and attempt some "right-sizing" so that we're not paying for severely underutilized hardware.

How concerned should I be when factoring in cache memory that's in use? My understanding is that cache is largely an optimization, but I also seem to recall reading that cached data can stay held well after it's actually needed -- a sort of waste that comes from having excess resources available. Here's an example of one of our current systems:

(htop screenshot)

So my question is: how "safe" should I play it when estimating how much load this host can (or should) carry? Should I treat everything -- used, buffers, and cache -- as 100% required for optimal operation? Or can I be more lenient with cache and assume the system will simply evict cache entries more frequently, but not to the point where it actually hurts application performance?
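For reference, the kernel itself publishes an estimate of how much of that memory is actually reclaimable: on kernels 3.14 and later, /proc/meminfo includes a MemAvailable field, roughly "memory a new workload could use without pushing the system into swap", which already discounts the cache that can be dropped. A quick sketch of pulling out the relevant fields (Python used here only as an example; the "available" column of `free -h` reports the same figure):

```python
#!/usr/bin/env python3
"""Compare truly free memory with the kernel's reclaimable estimate.

Reads /proc/meminfo (Linux only); MemAvailable needs kernel 3.14+.
"""

def read_meminfo():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = int(value.split()[0])  # values are reported in kiB
    return info

if __name__ == "__main__":
    m = read_meminfo()
    total = m["MemTotal"]
    free = m["MemFree"]
    cache = m.get("Buffers", 0) + m.get("Cached", 0)
    avail = m.get("MemAvailable", free)  # fall back on very old kernels

    gib = 1024 ** 2  # kiB -> GiB
    print(f"MemTotal:      {total / gib:7.1f} GiB")
    print(f"MemFree:       {free / gib:7.1f} GiB (truly unused)")
    print(f"Buffers+Cache: {cache / gib:7.1f} GiB (mostly reclaimable)")
    print(f"MemAvailable:  {avail / gib:7.1f} GiB "
          f"({avail / total:.0%} of RAM usable without swapping)")
```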


Solution 1:

Caching is all about reducing access to the underlying storage (HDD or SSD), so it all depends on your I/O workload. Decreasing the available cache can increase your iowait and your disks' %util stats, meaning the physical disks come under increased stress.
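To make that check concrete, watch the share of CPU time spent in iowait while you experiment; `iostat -x` and `sar` give you the same numbers, but here is a minimal sketch that samples /proc/stat directly (Linux only; the 5-second interval is arbitrary):

```python
#!/usr/bin/env python3
"""Print the percentage of CPU time spent waiting on I/O every 5 seconds.

Samples the aggregate "cpu" line of /proc/stat (Linux only). Compare
readings before and after shrinking memory to see whether the page
cache was doing real work for you.
"""
import time

def cpu_counters():
    with open("/proc/stat") as f:
        fields = f.readline().split()  # "cpu user nice system idle iowait ..."
    values = list(map(int, fields[1:]))
    return sum(values), values[4]      # total jiffies, iowait jiffies

if __name__ == "__main__":
    prev_total, prev_iowait = cpu_counters()
    while True:
        time.sleep(5)
        total, iowait = cpu_counters()
        dt, dio = total - prev_total, iowait - prev_iowait
        pct = 100.0 * dio / dt if dt else 0.0
        print(f"iowait over last 5s: {pct:5.1f}%")
        prev_total, prev_iowait = total, iowait
```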

I suggest decreasing your available memory in relatively small steps (e.g. 4 GB at a time) and checking whether server performance degrades. If it doesn't, you can reduce memory again (never below your application's minimum requirements, obviously).
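If you want to rehearse a step like that without actually changing the hardware, one option is to pin a chunk of RAM away from the page cache and watch what happens (booting with a smaller mem= kernel parameter or applying a cgroup memory limit gets you much the same effect). A rough sketch, assuming a Linux host and enough privilege to mlock() the block:

```python
#!/usr/bin/env python3
"""Temporarily take a chunk of RAM out of play by locking it.

Rough test harness for trying a reduction step (e.g. 4 GB) before
resizing the host for real. Linux only; needs root or a raised
RLIMIT_MEMLOCK (`ulimit -l`). Kill the process to release the memory.
"""
import ctypes
import time

GIGABYTES = 4  # size of the reduction step you want to simulate

libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

size = GIGABYTES * 1024 ** 3
buf = ctypes.create_string_buffer(size)
# mlock() faults every page in and pins it, so the kernel cannot reclaim
# or swap this range -- the page cache has to shrink around it instead.
if libc.mlock(ctypes.addressof(buf), size) != 0:
    raise OSError(ctypes.get_errno(), "mlock failed (check privileges / ulimit -l)")

print(f"{GIGABYTES} GiB pinned; watch iowait and app latency, Ctrl+C to release.")
try:
    while True:
        time.sleep(60)
except KeyboardInterrupt:
    pass
```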