How to set up an NFS server that caches a network share?
User data is stored on two fairly large (>1 PB) OpenStack Swift storage clusters. Let them be Cluster A and Cluster B.
In addition, there are several PoPs that need to interact with that data. Servers in these PoPs are effectively diskless, meaning no user data is stored on them or ever downloaded to them. PoPs can be grouped into general world regions (e.g. North America, South Africa, Central Europe, etc.).
Some PoPs are quite a long distance away from the Swift endpoints of either cluster, which introduces an undesirable latency. To mitigate this somewhat, I want to set up a caching gateway server in each region, which will cache r/w requests to the nearest cluster.
Currently, clients in any of the PoPs access user data through a permanently mounted Swift Virtual File System (svfs), a FUSE module that mounts Swift object storage more or less like a block device. However, svfs isn't all that stable in the first place, and in the future clients should access the cache servers via NFS.
This is a diagram of one branch of the desired architecture:
+------------------+      +------------------+    NFS     +------------------+
|    Cluster A     | SVFS |  Region 1 Cache  +----------->| R1 PoP a Client  |
|                  +----->|                  |            |                  |
|Persistent Storage|      |Ephemeral Storage +----+       |Generates R/W Load|
+-----------------++      +------------------+    |       +------------------+
                  |                               |
                  |       +------------------+    |  NFS  +------------------+
                  | SVFS  |  Region 2 Cache  |    +------>| R1 PoP b Client  |
                  +------>|                  |            |                  |
                          |Ephemeral Storage |            |Generates R/W Load|
                          +------------------+            +------------------+
I am familiar with the basics of setting up NFS and svfs.
The question is: How can I set up the caching server to use all available resources (a designated cache partition, RAM) to cache as aggressively and as much data as possible before writing to the svfs mount point? Basically it comes down to: How can I cache a directory in Linux?
If possible, reads and writes should be consolidated, and block sizes in FUSE requests should be at least 128k to maximize throughput and minimize latency when the cache needs to write to the cluster.
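
For illustration, here is roughly what I have in mind for the export on a cache server and the mount on a PoP client; the subnet, hostnames and paths are placeholders, and 131072 is the 128k transfer size mentioned above:

# On the region cache server: export the local cache directory to the PoP subnet.
# "async" lets the server acknowledge writes once they are in the page cache (RAM),
# at the cost of losing not-yet-flushed data if the server crashes.
echo '/srv/cache 10.1.0.0/24(rw,async,no_subtree_check)' >> /etc/exports
exportfs -ra

# On a PoP client: request 128k read/write transfer sizes.
mount -t nfs4 -o rw,rsize=131072,wsize=131072 \
    cache1.region1.example:/srv/cache /mnt/userdata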
Addendum 1: I've switched the cluster mount module from svfs to S3QL on a few of the servers. S3QL's caching has improved performance a bit. I will try to get some performance data for completeness.
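For reference, the S3QL mount on those servers looks roughly like this (the Swift URL, paths and cache size are placeholders, and option names may differ between S3QL versions):

# One-time: create the S3QL file system inside a Swift container
# (credentials go into ~/.s3ql/authinfo2).
mkfs.s3ql swift://swift.cluster-a.example/userdata

# Mount with a large local cache on the designated cache partition.
# --cachesize is in KiB, so 524288000 is about 500 GiB;
# --allow-other lets other local users/services (e.g. the NFS server) read the mount.
mount.s3ql --cachedir /srv/s3ql-cache \
           --cachesize 524288000 \
           --compress none \
           --allow-other \
           swift://swift.cluster-a.example/userdata /mnt/cluster-a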
If the inherent Linux mechanisms (like FS-Cache, aka cachefilesd) don't work AND you have budget, you may look into WAFS (Wide Area File Services). These are devices designed for aggressive caching of NFS (and CIFS), to try and hide the latencies usually involved in WAN links.
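
In case it's useful, a minimal sketch of the FS-Cache route, assuming a dedicated cache partition mounted at /var/cache/fscache (the culling thresholds below are just the stock defaults). Keep in mind FS-Cache only caches reads on the NFS client side; it doesn't buffer writes:

# Install the cache back-end daemon (the package is called cachefilesd
# on both the Debian and RHEL families).
apt-get install cachefilesd

# Point it at the dedicated cache partition and set the free-space
# percentages at which culling starts and stops.
cat > /etc/cachefilesd.conf <<'EOF'
dir /var/cache/fscache
tag nfscache
brun  10%
bcull  7%
bstop  3%
EOF

# Enable the daemon (Debian may additionally need RUN=yes in /etc/default/cachefilesd).
systemctl enable --now cachefilesd

# On the NFS client, the "fsc" mount option enables FS-Cache for that mount.
mount -t nfs4 -o fsc,rsize=131072,wsize=131072 \
    cache1.region1.example:/srv/cache /mnt/userdata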
I'm really no expert in this area (but it sure is interesting!).
What I've been looking at lately is mainly dm-cache for LVM, with SSDs for the caching part of it. Here is an example text from Red Hat which has a good overview, but it's not tied to RH: https://www.redhat.com/en/blog/improving-read-performance-dm-cache
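
As a rough sketch, attaching an SSD as a dm-cache to an existing logical volume looks something like this (volume group, LV and device names are placeholders). Note that dm-cache works at the block-device level, so it accelerates a local disk/partition rather than a network or FUSE mount:

# Add the SSD to the volume group that already holds the slow data LV.
pvcreate /dev/nvme0n1
vgextend vg_data /dev/nvme0n1

# Create the cache data and cache metadata LVs on the SSD.
lvcreate -L 200G -n lv_cache      vg_data /dev/nvme0n1
lvcreate -L 2G   -n lv_cache_meta vg_data /dev/nvme0n1

# Combine them into a cache pool and attach it to the existing slow LV.
lvconvert --type cache-pool --poolmetadata vg_data/lv_cache_meta vg_data/lv_cache
lvconvert --type cache --cachepool vg_data/lv_cache vg_data/lv_slow

# "writethrough" (the default) is safer; "writeback" also caches writes on the SSD.
lvchange --cachemode writeback vg_data/lv_slow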