We're looking at using Microsoft's CSV cache feature in our new Hyper-V 2012 R2 clusters:

  • It's enabled by default, with 512MB allocated to each CSV.
  • The maximum memory that can be assigned is 80% of the host's physical RAM.
  • The recommended maximum allocation is 64GB.

Our nodes will have memory headroom reserved (for node failure, etc.), so for the majority of the time our hosts will have plenty of free memory.

What I would like to find out is what would happen if we allocated, for example, the full 64GB, and a node failure occurred such that the remaining nodes needed to reclaim memory. Would the host be able to reclaim the memory from the cache? Is it possible to see the cache's memory usage by inspecting the system processes?
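
For reference, the only knob I've found for this is the cluster's BlockCacheSize property, so something along these lines is what we'd be running (a quick sketch in PowerShell on a cluster node; the value is in MB):

    # Check the current CSV block cache allocation (value is in MB)
    Import-Module FailoverClusters
    (Get-Cluster).BlockCacheSize

    # Example: allocate the full 64GB we're considering (65536 MB)
    (Get-Cluster).BlockCacheSize = 65536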


You asked a very specific question -- what does the CSV cache do under memory load?

The answer is that memory is allocated to the CSV cache statically and never released. So if you experience a failover, that memory is not available for anything else, such as picking up the VMs that now need to run on the surviving nodes.

I suspect that, beyond a relatively small allocation (the default 1/2 GB), the marginal value of adding more memory to the CSV cache is low for you.
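
If you want to check whether the cache is earning its keep before giving it more memory, the CSV cache exposes performance counters you can sample. A rough sketch follows; the counter-set name is typically "Cluster CSV Volume Cache" on 2012 R2, but verify it on your build, hence the wildcard:

    # List the CSV cache-related counter paths available on this node.
    # Sample the read / cache-read counters it reports to estimate the hit rate.
    Get-Counter -ListSet "Cluster CSV*" | Select-Object -ExpandProperty Paths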


What kind of shared storage are your CSVs hosted on?

If it’s a software solution (software-defined storage), I would recommend turning the native CSV cache off and using the solution’s own cache instead. Software-defined storage such as StarWind, for example, has its own DRAM cache, which can work in write-back mode (the CSV cache is read-only), is not limited in size (the CSV cache is capped at 64GB) and can optionally be deduplicated (the CSV cache cannot): https://www.starwindsoftware.com/starwind-virtual-san
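
If you go that route, switching the native CSV cache off is just a matter of zeroing the block cache allocation (a quick sketch, assuming the 2012 R2 BlockCacheSize cluster property):

    # Disable the built-in CSV block cache so it does not double-cache
    # underneath the storage solution's own DRAM cache (value is in MB).
    (Get-Cluster).BlockCacheSize = 0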


The changes you have highlighted were introduced in Windows Server 2012 R2, specifically for Scale-Out File Server (SoFS).

SoFS is very read-intensive and would greatly benefit from higher cache settings (the 80% maximum / 64GB recommended maximum).

For Hyper-V the recommendation is to be more conservative with the memory allocation, for exactly the reason you highlight: the CSV cache will compete for memory with the virtual machines on the node, especially in failure situations where VMs are being moved onto the surviving nodes.

That said, you can simply treat the CSV cache as a requirement in your hardware design. You say you have "plenty" of memory; within that "plenty" you need to calculate the maximum cache size you can use without limiting your ability to bring virtual machines online after a node failure. A rough sketch of that calculation is below.
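
As an illustration of that calculation (all numbers below are placeholders, not recommendations):

    # Hypothetical sizing example: RAM left over for the CSV cache per node
    # once we reserve memory for the parent partition and for the worst-case
    # VM load after losing one node. Figures in GB, purely illustrative.
    $physicalRamGB  = 256   # RAM per node
    $hostReserveGB  = 8     # parent partition / management overhead
    $vmRamPerNodeGB = 160   # VM memory normally running on each node
    $nodes          = 4
    $survivingNodes = $nodes - 1

    # Worst case: one node's VMs get spread across the survivors.
    $vmRamAfterFailover = $vmRamPerNodeGB + ($vmRamPerNodeGB / $survivingNodes)

    $maxCsvCacheGB = $physicalRamGB - $hostReserveGB - $vmRamAfterFailover
    "Maximum safe CSV cache per node: {0:N0} GB" -f $maxCsvCacheGB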

HTH.