Avoiding SPOFs with GlusterFS and Windows
Solution 1:
I like GlusterFS. Actually, I adore GlusterFS. As long as you can give it some dedicated bandwidth, everything's fine.
One of the best things about GlusterFS is using it over NFS, and one of the surprising things I've been working with lately is NFS on Windows 7 and Windows Server 2008 R2.
Here's what I'd do.
- Set up 2 GlusterFS servers that can export NFS.
- Set up a heartbeat link between them.
- Deploy a cluster manager on them, something like Heartbeat or Pacemaker perhaps.
- Set up a virtual IP (VIP) shared between your Gluster nodes.
- Map the network drives on the Windows boxes using the IP address of the VIP, not the address of either node.
- Test everything you can possibly imagine.
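For the testing step, here is a minimal sketch of a reachability check you could run from a client while you pull cables and stop services. The addresses are hypothetical placeholders, and the ports (111 for the portmapper, 2049 for NFS) are assumptions you should check against what your Gluster version actually listens on. It only proves that a TCP connection succeeds, not that a mount works.

```python
import socket

# Hypothetical addresses: replace with your own VIP and node IPs.
TARGETS = {
    "vip": "192.0.2.10",
    "gluster1": "192.0.2.11",
    "gluster2": "192.0.2.12",
}
# Assumed ports: 111 (portmapper) and 2049 (NFS). Verify these for your setup.
PORTS = (111, 2049)


def port_open(host, port, timeout=2.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    conn.close()
    return True


def check_all(targets=TARGETS, ports=PORTS, probe=port_open):
    """Map each (name, port) pair to True/False reachability."""
    return {
        (name, port): probe(host, port)
        for name, host in targets.items()
        for port in ports
    }
```

Calling check_all() before and after you kill a node gives a quick picture of whether the VIP kept answering while one of the real addresses went away.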
Clustering Samba sounds scary, and even if you do it, Samba still doesn't behave reliably in some Windows networks (all that NT4 domain compatibility; it never seems to get past that).
I think that because each Gluster node is in distributed-replicated mode, you should theoretically be able to connect to either one and let Gluster worry about moving your data around. As a result, Heartbeat should be the thing that does the redirection and controls which node you're talking to.
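To make the redirection idea concrete, here is a toy model of the active/passive decision. This is not how Heartbeat or Pacemaker are implemented; the function name and node names are made up for illustration, and it only shows the rule "keep the VIP where it is while that node is healthy, otherwise move it to the next healthy node".

```python
def choose_vip_holder(nodes, healthy, current=None):
    """Pick the node that should hold the VIP.

    nodes:   node names in order of preference.
    healthy: set of node names that currently pass their health check.
    current: node holding the VIP now, if any.
    """
    # Stay put while the current holder is healthy, to avoid needless moves.
    if current in nodes and current in healthy:
        return current
    # Otherwise fail over to the first healthy node in preference order.
    for node in nodes:
        if node in healthy:
            return node
    # No healthy node: the VIP cannot be served anywhere.
    return None
```

For example, choose_vip_holder(["gluster1", "gluster2"], {"gluster2"}, current="gluster1") returns "gluster2", which is the moment the Windows clients get silently pointed at the surviving node.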
As for your point:
- File-counts can get into the 10's of millions.
I suggest that you investigate using XFS as the underlying file system, as it copes well with big filesystems and is supported under GlusterFS.
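With file counts that high, it may be worth keeping an eye on inode usage on the bricks whatever file system you pick. A rough sketch, assuming a Unix-like server where os.statvfs is available; the brick path in the comment is a hypothetical example. Some file systems report zero total inodes, so the percentage is left as None in that case.

```python
import os


def inode_usage(path):
    """Report inode totals for the file system that contains path."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes the file system reports
    free = st.f_ffree    # free inodes
    used = total - free
    # Avoid dividing by zero on file systems that report no inode total.
    percent = (100.0 * used / total) if total else None
    return {"total": total, "used": used, "free": free, "percent_used": percent}


# Example call (hypothetical brick path):
# print(inode_usage("/export/brick1"))
```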
Solution 2:
Maybe you can think of it as an HA problem: use LDAP for authentication (it can be replicated across as many LDAP servers as you want) and assign a dedicated IP address for the SMB services to listen on.
This IP floats. It normally sits on the main server, and when that server goes down, Heartbeat can bring it up and start the services on the second server.
Both servers have a mount point for GlusterFS, so all the data is there whichever one is active.
It's one possible solution, and it's easy to manage.
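One thing worth guarding against in this setup is starting the SMB services on a server where the GlusterFS mount is missing, since the share would then expose an empty directory. A minimal sketch of such a check, assuming a Linux server with a FUSE-mounted Gluster volume (which shows up as type fuse.glusterfs in /proc/mounts); the mount point used in the comment is a hypothetical example.

```python
def glusterfs_mounts(mounts_text):
    """Return mount points whose file system type is fuse.glusterfs."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "fuse.glusterfs":
            found.append(fields[1])
    return found


def is_gluster_mounted(mount_point, mounts_path="/proc/mounts"):
    """True if mount_point is currently a GlusterFS FUSE mount."""
    with open(mounts_path) as fh:
        return mount_point in glusterfs_mounts(fh.read())


# Example call (hypothetical mount point):
# is_gluster_mounted("/mnt/gluster")
```

A failover script could call is_gluster_mounted() first and refuse to start Samba if it returns False.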