Slurm cluster in Google Cloud: how do I attach a Filestore instance?

If one clones the slurm-gcp project and deploys the stock cluster defined in there, things work well.

What I would like to do is use a GCP Filestore instance to provide more persistent storage for the cluster.

Part of the cluster deployment is to create a VPC network (and a subnetwork) called g1-network.

If I create a Filestore instance after the cluster has deployed successfully, I can select g1-network for it, and I can see that suitable routes are created. But I can't ping the Filestore instance from any of the cluster machines.


Solution 1:

I have found a solution.

First, Filestore doesn't reply to ping, which threw me for a loop. A failed ping therefore says nothing about whether the instance is actually reachable.
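Since ping is no help here, one way to check reachability from a cluster machine is to try opening a TCP connection to the NFS port instead. This is only a minimal sketch: it assumes the instance serves NFS on the standard port 2049, and the function name tcp_port_open is my own, not part of slurm-gcp or any Google tooling.

```python
import socket


def tcp_port_open(host, port=2049, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 2049 is the standard NFS port; that Filestore answers on it
    is an assumption of this sketch.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Calling tcp_port_open with the IP address shown on the Filestore instance's details page, from a login or compute node, returns True if the port can be reached and False otherwise. A False result can still mean a firewall rule rather than a routing problem.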

Second, the simplest approach is to use the "default" VPC network and subnetwork for both the Filestore instance and the cluster:

  • When creating the Filestore instance, select "default" for the VPC network.
  • In the YAML file with the cluster definition, set vpc_net and vpc_subnet to "default" for the cluster, and set vpc_subnet to "default" for each partition.
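
To make the second bullet concrete, here is a rough sketch of that change applied to a cluster definition that has already been loaded into a Python dict. The keys vpc_net and vpc_subnet are the ones named above; the "partitions" key, the flat layout and the function name use_default_network are assumptions for illustration, so check them against your own YAML file.

```python
import copy


def use_default_network(cluster, network="default"):
    """Return a copy of the cluster definition pointed at one VPC network.

    Assumes vpc_net and vpc_subnet sit at the top level and that
    partitions are listed under a "partitions" key (hypothetical layout).
    """
    updated = copy.deepcopy(cluster)  # leave the caller's dict untouched
    updated["vpc_net"] = network
    updated["vpc_subnet"] = network
    for partition in updated.get("partitions", []):
        # Each partition carries its own vpc_subnet setting.
        partition["vpc_subnet"] = network
    return updated
```

In practice it is just as easy to edit the three values by hand; the point is that the cluster-level and the per-partition settings all have to name the same network as the Filestore instance.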