What's the best way to merge two directories on the same filesystem in Linux?
I have two directories that need to be merged together. The files in these two directories are all large (>= 500MB).
What I want to achieve: for each file in the source directory, if it doesn't exist in the destination directory, mv it to the destination directory (which is fast since we are basically creating a new hard link and unlinking the source file); if it exists in the destination directory, copy the source file there and remove the source file.
The most common way to merge directories on a Linux system is to use rsync with the --remove-source-files option. But this is slow because it does a copy operation even when the destination file doesn't exist.
Any better ideas? Thank you.
Solution 1:
Basically, what you described is moving the files and overwriting the destination if it exists. So just move them.
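For a flat directory (no subdirectory that exists on both sides), that could look like the sketch below. The paths under a temporary directory are made up for illustration; -f tells mv to overwrite same-named destination files without prompting.

```shell
#!/bin/bash
# Illustration only: build a throwaway src/ and dest/ under a temp dir.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
echo new > "$work/src/a"    # exists only in src
echo new > "$work/src/b"    # exists on both sides; the src version should win
echo old > "$work/dest/b"

# Move everything from src into dest, overwriting same-named files.
# On the same filesystem each file is renamed rather than copied.
mv -f -- "$work"/src/* "$work/dest/"
```

Note that src/* does not match hidden files unless dotglob is set in bash.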
Solution 2:
There's a case where mv fails. Here's some example data:
mkdir -p src/d dest/d
touch src/d/f1 dest/d/f2
See how mv fails:
$ mv src/* dest/
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -f src/* dest/
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -fv src/* dest/
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -fvi src/* dest/
mv: overwrite 'dest/d'? y
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -fvi -t dest/ src/*
mv: overwrite 'dest/d'? y
mv: cannot move 'src/d' to 'dest/d': Directory not empty
So make a script file:
vim supermove
This example does no error checking (DISCLAIMER: works for me, but please test that it works for you, maybe with echo before mv), and it will overwrite files with the same path. It uses find with \; which is terribly inefficient, but + doesn't work right with "$dest" prepended. Older versions of find will make some dirs without the path prepended, and newer versions will say:
find: In '-exec ... {} +' the '{}' must appear by itself, but you specified 'dest/{}'
You could probably find a way to fix that with xargs though. (It took a few minutes for the 64k files, 8TB in total, that I was moving.) Add this content:
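#!/bin/bash
# usage: supermove SRC DEST
# resolve both arguments to absolute paths so they stay valid after cd
src=$(readlink -f "$1")
dest=$(readlink -f "$2")
# stop here if src cannot be entered, otherwise find would run in the wrong directory
cd "$src" || exit 1
# make * match hidden files too
shopt -s dotglob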
# make dirs (missing old permission,acl,xattr data), and then mv the files
time find * -type d -exec mkdir -p "$dest"/{} \;
time find * -type f -exec mv {} "$dest"/{} \;
# also copy permissions, acls, xattrs
rsync -aAX "$src"/ "$dest"/
And make it executable:
chmod +rx supermove
And run it:
./supermove src/ dest/
And the result... before:
$ find src dest
src/
src/d
src/d/f1
dest/
dest/d
dest/d/f2
After:
$ find src dest
src
src/d
dest
dest/d
dest/d/f1
dest/d/f2
Now src/ should contain only empty dirs. If so, you can rm -r src to clean up.
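The per-file cost of find with \; mentioned above could probably be reduced by batching the files of each directory into one mv call, which keeps {} by itself in front of +. The following is only a sketch under assumptions, not a tested replacement for the script: the function name merge_move is made up here, it relies on GNU mv (the -t option) and GNU find (-maxdepth, -print0), it creates the destination if it is missing, it overwrites files with the same path, and it leaves out the rsync -aAX step for permissions, ACLs and xattrs.

```shell
#!/bin/bash
# Hypothetical helper (name made up for this sketch): merge SRC into DEST.
# The body runs in a subshell so the cd does not affect the caller.
merge_move() (
    src=$(readlink -f -- "$1") || exit 1
    mkdir -p -- "$2" || exit 1
    dest=$(readlink -f -- "$2") || exit 1
    cd -- "$src" || exit 1

    # Collect every source directory first (NUL-delimited, so odd names survive),
    # so that nothing is moved while the tree is still being listed.
    dirs=()
    while IFS= read -r -d '' d; do
        dirs+=("$d")
    done < <(find . -type d -print0)

    for d in "${dirs[@]}"; do
        mkdir -p -- "$dest/$d" || exit 1
        # Move the regular files directly inside $d, many per mv invocation.
        find "$d" -maxdepth 1 -type f -exec mv -f -t "$dest/$d" -- {} + || exit 1
    done
)

# Usage on the example data from above, recreated in a temp dir.
work=$(mktemp -d)
mkdir -p "$work/src/d" "$work/dest/d"
touch "$work/src/d/f1" "$work/dest/d/f2"
merge_move "$work/src" "$work/dest"
```

As with the script, the source tree is left behind as empty directories.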
Solution 3:
mv options are all about conflict resolution:
Pick one:
-f force (always overwrite)
-i interactive (ask whether to overwrite)
-n no clobber (no overwrite)
And this is good too:
-v verbose
Otherwise, data can get lost and/or it won't be clear what exactly happened.
mv is also superior on the same filesystem because it only updates directory entries; the file data itself isn't touched. The other thing is that the larger the operation, the greater the chance of something going wrong, such as soft errors.
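One way to check that a same-filesystem mv leaves the file data alone is to compare the inode number before and after the move. This is a small sketch that assumes GNU stat (the -c %i format); the file names under the temporary directory are made up for illustration.

```shell
#!/bin/bash
# Illustration only: one file moved between two dirs on the same filesystem.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo data > "$work/src/big"

before=$(stat -c %i "$work/src/big")    # inode number before the move
mv -v -- "$work/src/big" "$work/dest/big"
after=$(stat -c %i "$work/dest/big")    # inode number after the move

# Equal numbers mean the file was renamed in place, not copied.
echo "before=$before after=$after"
```

If the two directories were on different filesystems, mv would fall back to copy and delete, and the numbers would generally differ.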