Converting UTF-8 NFD filenames to UTF-8 NFC, in either rsync or afpd
I have a home file server running FreeNAS 8. A few days ago I used rsync to upload my entire iTunes library from my Mac so that I could load my library over the network instead of off a slow USB drive. This mostly worked, and iTunes runs much better now, but I'm running into issues accessing any songs that have non-ASCII characters in their names (I first noticed the problem when loading Queensrÿche tracks). The files show up in the Finder, but any attempt to access them makes them vanish until I reconnect to the server.
After some research I found out this is because OS X and Linux represent accented characters in filenames differently. OS X filesystems use Unicode Normalization Form D (NFD), whereas Linux typically uses Form C (NFC). Rsync doesn't convert between these forms when it copies from my Mac to the server, so when iTunes tries to access a file with a special character over the network, the filename on the server is in the wrong normalization form and afpd reports that the file doesn't exist.
What is the best way to address this problem? Is it possible to make rsync perform the Unicode conversion while uploading the base library to the server? Can I configure afpd to transmit/receive filenames in NFD format? Is there an easy way to change the filenames on the server? I found some stuff about a program named convmv, but I don't know if I can run that on FreeNAS.
Solution 1:
Note: If you are using version 3.0.0 or newer of rsync, the --iconv option as mentioned in the other answers is clearly the superior solution.
Something that should work is running rsync between the source directory and the mounted remote file system (SMB, NFS, AFP), which rsync will just treat as a local file system.
However, I do not know how well this works in practice, and you have to work around several issues. For example, the delta-transfer algorithm won't be used by default, since source and destination are both "local" (maybe --no-whole-file will work?), and you have to check, e.g., that SMB actually preserves modification times.
Solution 2:
You can use rsync's --iconv option to convert between UTF-8 NFC and NFD, at least if you're on a Mac. There is a special utf-8-mac character set that stands for UTF-8 NFD. So to copy files from your Mac to your NAS, you'd need to run something like:
rsync -a --iconv=utf-8-mac,utf-8 localdir/ mynas:remotedir/
This will convert all the local filenames from UTF-8 NFD to UTF-8 NFC on the remote server. The files' contents won't be affected.
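To make the difference concrete, here is a minimal Python sketch of what that conversion does to a single filename. It is only an illustration, not what rsync runs internally, and it assumes that standard NFD/NFC normalization from Python's unicodedata module is a close enough stand-in for the utf-8-mac to utf-8 conversion.

```python
import unicodedata

# "Queensrÿche" in decomposed form (NFD), as a Mac would typically store it:
# a plain 'y' followed by the combining diaeresis U+0308.
nfd_name = "Queensry\u0308che"

# The same name in composed form (NFC): a single code point U+00FF for 'ÿ'.
nfc_name = unicodedata.normalize("NFC", nfd_name)

# The two strings look identical on screen but are different sequences,
# so a byte-for-byte filename lookup with one form will not find the other.
print(nfd_name == nfc_name)          # False
print(len(nfd_name), len(nfc_name))  # 12 11
print(nfd_name.encode("utf-8"))      # b'Queensry\xcc\x88che'
print(nfc_name.encode("utf-8"))      # b'Queensr\xc3\xbfche'
```

Only the name changes; nothing here touches file contents, which matches what --iconv does.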
Solution 3:
Currently I'm using rsync --iconv like this:
Copying files from Linux server to OS X machine
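You should execute this command from the OS X machine (--iconv takes the local character set first and the remote one second, so the argument is the same in both directions):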
rsync -a --delete --iconv=UTF-8-MAC,UTF-8 '[email protected]:/home/username/path/on/server/' /Users/username/path/on/machine/
Copying files from OS X machine to Linux server
You should execute this command from the OS X machine:
rsync -a --delete --iconv=UTF-8-MAC,UTF-8 /Users/username/path/on/machine/ '[email protected]:/home/username/path/on/server/'
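For files that were already copied without --iconv, the names could in principle be fixed in place on the server, which is what convmv is meant for. The following is only a rough Python sketch of that idea, not a tested replacement for convmv. It assumes a Python 3 interpreter is available on the server (I have not checked this for FreeNAS), and the function names nfc_rename_plan and rename_tree_to_nfc are made up for this example.

```python
import os
import unicodedata


def nfc_rename_plan(names):
    """Return (old, new) pairs for the names that are not already in NFC."""
    plan = []
    for name in names:
        composed = unicodedata.normalize("NFC", name)
        if composed != name:
            plan.append((name, composed))
    return plan


def rename_tree_to_nfc(root, dry_run=True):
    """Walk root bottom-up and rename entries whose names are not in NFC.

    With dry_run=True (the default) nothing is renamed; the function only
    returns the (source, destination) paths it would have renamed.
    """
    renamed = []
    # topdown=False visits a directory's children before the directory itself,
    # so renaming a directory never invalidates paths still to be visited.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for old, new in nfc_rename_plan(dirnames + filenames):
            src = os.path.join(dirpath, old)
            dst = os.path.join(dirpath, new)
            if os.path.exists(dst):
                # Skip rather than overwrite an existing NFC-named entry.
                continue
            if not dry_run:
                os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Calling rename_tree_to_nfc("/path/to/library") first, with the default dry run, lists what would change; passing dry_run=False performs the renames. The path is a placeholder, and it would be wise to try this on a copy before touching the real library.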
Solution 4:
Don't use rsync to copy the files to your NAS. When you use rsync to copy the files, the filenames are stored on your NAS in UTF-8 NFD format (i.e. the OS X format), but the Samba server running on your NAS only understands UTF-8 NFC filenames. Use the CIFS/SMB (Samba) interface to copy the files and everything will be fine.