Corrupted XFS and no way to xfs_repair

One of my hosted servers has some XFS problems. After the last crash, some of my RRD folders got corrupted.

example (translated from French):

# rm *
rm: cannot remove 'create_rrd.sh': Structure needs cleaning
rm: cannot remove 'old': Is a directory
rm: cannot remove 'tcgraph.log': Structure needs cleaning
rm: cannot remove 'tcgraph.rrd': Structure needs cleaning

The normal move would be to restart the system in single-user mode, or from a live CD, and run xfs_repair on /dev/sda. Unfortunately (it would be too easy), the hosting company's option to restart on a live CD doesn't work. And visiting the datacenter is not an option.
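For reference, had the live CD worked, the repair itself would be something like this (assuming, as stated above, that the XFS filesystem sits directly on /dev/sda; xfs_repair refuses to run on a mounted filesystem):

umount /dev/sda        # the filesystem must be unmounted first
xfs_repair /dev/sda    # walk the metadata and repair what it can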

It seems that I can't touch the inodes at all; I get that "Structure needs cleaning" message every time.

So, the question is: does anyone know a way to repair/fix that XFS filesystem by hand? Any low-level XFS manipulation tool that could help?


The short answer is no.

The longer answer is that it would be very, very fascinating to try copying everything into a tmpfs ramdisk, switching / over to that, then unmounting the hard drive's filesystem and running entirely from RAM while you repair it.

I'm not the first one to have thought of this: I found this thread, but no information about whether the scheme actually worked. Since their purpose was to wipe the drive, they didn't bother unmounting it.

To do this, create a directory and mount -t tmpfs none /some/directory, then start filling it with a copy of the important bits of your system: sshd, mount, umount, the XFS tools, a shell, all the libraries needed to run them, probably all of /etc to be safe, and finally init. Chroot-jail creation scripts would help here, as would having about 4GB of RAM. Mount a copy of proc in it, and if you're using devfs, mount a copy of devfs as well. Using the copy of /etc/ssh/sshd_config, set up sshd to start on a different port. Chroot into your ramdisk and make sure everything works and no libraries are missing, then fire up sshd in the ramdisk (on the alternate port) so that it's chrooted there. Check that you can ssh into it (this might require copying your /home directory in as well, and opening the port on any firewall you might have).
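A rough sketch of that setup; the directory list, the size=2g figure, and port 2222 are assumptions to adapt to your distribution's layout and your RAM:

mkdir /mnt/tmpfs
mount -t tmpfs -o size=2g none /mnt/tmpfs           # default tmpfs size is half of RAM

cp -a /bin /sbin /lib /usr /etc /root /mnt/tmpfs/   # essential binaries, libraries, config
mkdir /mnt/tmpfs/proc /mnt/tmpfs/dev /mnt/tmpfs/tmp
mount -t proc proc /mnt/tmpfs/proc                  # the chroot needs its own /proc
cp -a /dev/* /mnt/tmpfs/dev/                        # or mount devfs here if that's what you use

chroot /mnt/tmpfs /bin/sh                           # sanity check: shell runs, no missing libraries
/usr/sbin/sshd -p 2222                              # still inside the chroot: sshd on the alternate port

Then ssh -p 2222 to the box from outside to confirm the chrooted sshd really works.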

Now, the magic begins: instead of chrooting to that tmpfs, you need to find a utility called pivot_root. Its sole purpose in life is to call pivot_root(); it's in util-linux on Debian. The purpose of pivot_root() is essentially to chroot every process at once. It was originally used for moving / from an initrd ramdisk image to the actual drive; if this works, you'll be moving / from the actual drive to a ramdisk image. So, say you did mkdir /mnt/tmpfs; mount -t tmpfs none /mnt/tmpfs as above. After copying everything you need in there, the next step is mkdir /mnt/tmpfs/oldroot; pivot_root /mnt/tmpfs /mnt/tmpfs/oldroot (the second argument, put_old, has to live underneath the new root; after the pivot, the old / shows up as /oldroot). If you ran out of space copying stuff, unmount /mnt/tmpfs and mount it again, this time with -o size=..., since the default is to only allow half of your memory.
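Putting that together (the size=4g figure is an assumption, pick whatever your RAM allows):

umount /mnt/tmpfs                              # only if the first copy ran out of space...
mount -t tmpfs -o size=4g none /mnt/tmpfs      # ...remount bigger and repopulate it

mkdir /mnt/tmpfs/oldroot
pivot_root /mnt/tmpfs /mnt/tmpfs/oldroot       # / is now the ramdisk for every process;
                                               # the old root is reachable at /oldroot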

The last step is getting /oldroot unmounted. You'll need to unmount /oldroot/proc, /oldroot/sys, and so on (check /proc/mounts). If you still get "filesystem is busy", you can try forcing it, or at least track down everything that still has files open there with ls -l /proc/*/cwd /proc/*/fd/ | grep /oldroot/ and kill whatever is still referring to it. That probably includes the ssh server that wasn't chrooted, so make sure you're logged in through the alternate sshd before you start killing things. Obviously, don't kill process 1 (init). If you can't force the umount with init running, use chroot /oldroot /sbin/telinit u to get init to "upgrade" by re-executing itself (use u, not a runlevel number: switching runlevels would make init kill everything, and then you reboot and start over). You need the chroot to /oldroot in order to get telinit to use /oldroot/dev/initctl (the ramdisk's /dev/initctl isn't the same FIFO); telinit talks through it to the existing init process, and since that process was pivot_rooted to the ramdisk, the init it starts will be the one on the ramdisk, not /oldroot.
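The teardown, then, looks roughly like this (fuser comes from psmisc; that's an assumption, any way of finding the offending PIDs works):

cat /proc/mounts                                 # what is still mounted under /oldroot?
umount /oldroot/proc /oldroot/sys                # kernel filesystems first
ls -l /proc/*/cwd /proc/*/fd/ | grep /oldroot/   # who still has files open there?
fuser -m /oldroot                                # same question; kill those PIDs, except PID 1
chroot /oldroot /sbin/telinit u                  # let init re-exec from the ramdisk
umount /oldroot                                  # the disk is finally free for xfs_repair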

I'm not about to try that on a production server here. Maybe I'll try it at home this weekend.