GlusterFS failing to mount at boot with Ubuntu 14.04
I managed to make this work through a combination of answers in this thread and this one: GlusterFS is failing to mount on boot
As per @Dan Pisarski, edit /etc/init/mounting-glusterfs.conf to read:
exec start wait-for-state WAIT_FOR=networking WAITER=mounting-glusterfs-$MOUNTPOINT
As per @dialt0ne, change /etc/fstab to read:
[serverip]:[vol] [mountpoint] glusterfs defaults,nobootwait,_netdev,backupvolfile-server=[backupserverip],direct-io-mode=disable 0 0
Works For Me(tm) on Ubuntu 14.04.2 LTS
I ran into the same problem on AWS with Ubuntu 12.04. Here are some things that worked for me:
- add more fetch-attempts in your fstab
  This lets the mount retry the volfile server while the network is unavailable.
- add a backup volfile server in your fstab
  This lets you mount the filesystem from another gluster server member if the primary is down for some reason.
- add nobootwait in your fstab
  This allows the instance to continue booting while this filesystem isn't mounted.
A sample entry from my current fstab is:
10.20.30.40:/fs1 /example glusterfs defaults,nobootwait,_netdev,backupvolfile-server=10.20.30.41,fetch-attempts=10 0 2
I have not tested this on 14.04, but it works ok for my 12.04 instances.
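To double-check an fstab against the three suggestions above, something like the following could be used. It is only a sketch: check_gluster_fstab is a hypothetical helper name (not part of GlusterFS or Ubuntu), and the list of options it looks for is simply the set discussed in this answer.

```shell
#!/bin/sh
# Hypothetical helper: for every glusterfs line in an fstab file, report
# which of the options discussed above are missing.
# Usage: check_gluster_fstab [FSTAB_PATH]   (defaults to /etc/fstab)
check_gluster_fstab() {
    fstab="${1:-/etc/fstab}"
    # fstab fields: 1=device 2=mountpoint 3=type 4=comma-separated options
    awk -v want="_netdev nobootwait backupvolfile-server fetch-attempts" '
        $1 !~ /^#/ && $3 == "glusterfs" {
            missing = ""
            n = split(want, w, " ")
            for (i = 1; i <= n; i++) {
                # Match the option as a whole word, with or without "=value".
                if (("," $4 ",") !~ ("," w[i] "(=[^,]*)?,"))
                    missing = missing " " w[i]
            }
            print $2 ":" (missing == "" ? " ok" : " missing" missing)
        }' "$fstab"
}
```

Running check_gluster_fstab with no argument reads /etc/fstab and prints one line per glusterfs mountpoint. It only inspects the text of the file; it does not try the mount.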
It's a bug
This is really a bug (static-network-up is not a job, it's an event signal).
Moreover, using the networking job as suggested in other answers is not the most correct solution.
So I created this bug report and submitted a patch for this problem.
As a workaround, you can apply my proposed solution (at the end of this answer) and use the _netdev option in your fstab.
A more detailed explanation follows, but you can skip it if you want.
Explanation
This is a bug in mounting-glusterfs.conf. It can add an unnecessary 30 seconds to the boot of an Ubuntu Server, or even hang the boot process.
Because of this bug, the mountall process thinks that the mount failed (you'll see "Mount failed" errors in /var/log/boot.log). So, when the nobootwait/nofail flags are not used in /etc/fstab, the bug can hang the mount process (and the boot process too). When the nobootwait/nofail flags are used, the bug increases the boot time by about 30 seconds.
The bug is caused by the following errors:
- There is no need to wait for the network to be up. Ubuntu itself has the _netdev mount flag, which retries the mount each time an interface comes up;
  - However, it's necessary to wait for the GlusterFS Server daemon (for mounts using localhost);
  - This was implemented in an old commit in the GlusterFS upstream project. However, this commit was overwritten;
- It's wrong to use the wait-for-state upstart task to wait for a signal. It's used to wait for a job. static-network-up is an event signal, and not a job;
  - This is why "Unknown job: static-network-up" is logged;
- It's wrong, when waiting for a job to be started, not to pass the WAIT_STATE=running env var, because it's not the default in wait-for-state.
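The job-versus-event distinction can be checked roughly from the shell. This is a minimal sketch that assumes system jobs are defined by files named /etc/init/NAME.conf; is_upstart_job is a hypothetical helper name, and it ignores anything defined elsewhere (for example, user session jobs).

```shell
#!/bin/sh
# Hypothetical helper: treat NAME as an Upstart job only if a job file
# NAME.conf exists in the job directory (default: /etc/init).
# Events such as static-network-up have no such file.
is_upstart_job() {
    name="$1"
    init_dir="${2:-/etc/init}"
    [ -f "$init_dir/$name.conf" ]
}
```

For example, `is_upstart_job static-network-up || echo "not a job"` would be expected to print the message on a stock system, which matches the "Unknown job" error in the log.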
Solution
/etc/init/mounting-glusterfs.conf:
author "Louis Zuckerman <[email protected]>"
description "Block the mounting event for glusterfs filesystems until the glusterfs-server is running"
instance $MOUNTPOINT
start on mounting TYPE=glusterfs
task
script
    if status glusterfs-server; then
        start wait-for-state WAIT_FOR=glusterfs-server WAIT_STATE=running \
            WAITER=mounting-glusterfs-$MOUNTPOINT
    fi
end script
PS: Also use the _netdev option in your fstab.
I ran into this as well. I want to preface this answer by saying that I am not an expert in this area, so it's possible there is a better solution!
But the issue seems to be that static-network-up is an event, not the name of an upstart job. However, the wait-for-state script expects a job name to be passed in as the WAIT_FOR value. Hence the "Unknown job" error you discovered above.
To resolve the issue I changed /etc/init/mounting-glusterfs.conf, changing:
exec start wait-for-state WAIT_FOR=static-network-up WAITER=mounting-glusterfs-$MOUNTPOINT
into:
exec start wait-for-state WAIT_FOR=networking WAITER=mounting-glusterfs-$MOUNTPOINT
networking is the name of an actual job (/etc/init/networking.conf) and I believe the job that typically emits static-network-up.
This change worked for me on Ubuntu 14.04.
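If you would rather script the edit than make it by hand, a sketch along these lines should do it. use_networking_job is a hypothetical helper name, and the .bak backup file is my own addition, not something the package provides.

```shell
#!/bin/sh
# Hypothetical helper: replace the static-network-up event name with the
# networking job name in the given Upstart job file, keeping a .bak copy.
# Usage: use_networking_job [CONF_PATH]
use_networking_job() {
    conf="${1:-/etc/init/mounting-glusterfs.conf}"
    cp "$conf" "$conf.bak" || return 1
    # Read from the backup and write the edited text back to the original path.
    sed 's/WAIT_FOR=static-network-up/WAIT_FOR=networking/' "$conf.bak" > "$conf"
}
```

On a real system this would need to be run as root, and the change only takes effect on the next boot.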
Thanks for the detailed explanation; I think I understand a lot more than before. The latest solution almost works. The remaining problems (actually one, since the first implies the second):
- local shares (127.0.0.1:/share) are still not mounted
- mounted TYPE=glusterfs is never satisfied, so the services that depend on the mounted TYPE=glusterfs state never start
/etc/fstab:
127.0.0.1:/control-share /mnt/glu-control-share glusterfs defaults,_netdev 0 0
/etc/init/mounting-glusterfs.conf: copied from above
/etc/init/salt-master.conf:
description "Salt Master"
start on (mounted TYPE=glusterfs and runlevel [2345])
stop on runlevel [!2345]
limit nofile 100000 100000
...
The local share must be mounted by hand or by some automated mechanism, and salt-master must be started by hand after every reboot.
Noticed later: the above WAIT script in mounting-glusterfs... blocks the whole boot procedure; it seems the glusterfs-server state never reaches running.
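To see whether the job ever reports running, the output of `status glusterfs-server` can be checked after boot. A minimal sketch, assuming the usual Upstart output format "JOB GOAL/STATE[, process PID]"; job_is_running is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a line of `status JOB` output
# reports goal "start" and state "running".
# $1 is the output line, e.g. the result of: status glusterfs-server
job_is_running() {
    case "$1" in
        *" start/running"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: `job_is_running "$(status glusterfs-server)" && echo running`. If this never prints anything, the wait-for-state call with WAIT_STATE=running has nothing to wait for, which would be consistent with the hang described above.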