Load balancing a Windows File Share using HAProxy

After pulling my hair out over DFS, I had a weird and potentially dangerous idea: just possibly, I might be able to use HAProxy to load balance a file share between servers.

I've done some rudimentary packet traces, and it does appear that TCP port 445 is the only thing involved in using Windows file sharing. For many years I've thought that UDP 139, 135 etc. were also involved, at least in establishing the connection, but apparently not!

So I set up a basic test:

listen SMBTest *:445
  mode tcp
  server Smb1 172.16.61.201:445
  server Smb2 172.16.61.202:445

And you'll never guess what... it works??? (!)

Now, obviously, there is the whole concern about synchronisation between the file servers. That could easily be taken care of with a small Robocopy script.

And considering I only need an HA read-only file share, there wouldn't be any issues with regard to file locking etc.

  • Can anyone tell me if what I'm playing with here is fire? I really didn't think it would work at all and now I'm a little shocked.
  • What would be the downsides?
  • Could this be relied upon for a production environment?

Solution 1:

File replication is a much more difficult problem than you might first envision.

File replication typically does not scale well. You'll start to see problems once the number of files you're handling reaches half a million or more: the copy can take longer than the time allotted to the sync, so you'll either need to keep sessions sticky for a longer period and reduce the intervals between copies, or copy fewer files.
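
To make the stickiness part concrete, here is a minimal sketch of how the listener from the question might pin each client to one server for a fixed period. It is untested with SMB, and the 30m expiry, the table size and the timeouts are placeholder assumptions to tune against how long your copy actually takes. It also uses a separate bind line, which newer HAProxy versions expect instead of an address on the listen line:

listen SMBTest
  bind *:445
  mode tcp
  # Assumed timeouts: SMB sessions tend to be long-lived and often idle
  timeout connect 5s
  timeout client 1h
  timeout server 1h
  # Remember which server each client IP was sent to (size and expiry are assumptions)
  stick-table type ip size 200k expire 30m
  stick on src
  server Smb1 172.16.61.201:445 check
  server Smb2 172.16.61.202:445 check

The check keyword adds a plain TCP connect health check, so a server that stops answering on port 445 drops out of rotation. It says nothing about whether the data on that server is current.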

From the little I know about your specific workload, this might still be OK for you. You said the file share is read-only, which leads me to believe you update the data in large batches. Robocopy might be slow under these circumstances, yet since the interval between changes is so long, this might be an acceptable risk.
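
If the data really does change only in occasional batches, another option is to skip balancing altogether and treat the second server as a standby, so clients only ever see one copy unless the first server fails. A rough sketch, assuming the same two servers as in the question and an HAProxy version that uses a bind line (untested with SMB):

listen SMBTest
  bind *:445
  mode tcp
  server Smb1 172.16.61.201:445 check
  # Only receives traffic while Smb1 is failing its health check
  server Smb2 172.16.61.202:445 check backup

You give up the load spreading, but a stale copy on Smb2 then only matters during a failover.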

Seeing as HAProxy offers intelligence comparable to a layer 4 load balancer in this setup, it might be more beneficial to use a layer 4 load balancer instead, as they typically handle more throughput with less latency under high load. That might not apply to your problem, but it's food for thought.

If you require more features and performance (like read/write shares that need to be closely synced), then this won't work. If you think you'll need that for this dataset in the future, consider your solution carefully: your dataset might be terabytes in size by then, and you wouldn't want to be in a situation where you're having to scrap it and re-upload everything to a new solution.