Git and the Umlaut problem on Mac OS X

Enable core.precomposeunicode on the Mac:

git config --global core.precomposeunicode true

For this to work, you need to have at least Git 1.8.2.

Mountain Lion ships with 1.7.5. To get a newer Git, use either git-osx-installer or Homebrew (requires Xcode).

That's it.


The cause is that file systems differ in how they store file names.

In Unicode, Ü can be represented in two ways: as the single precomposed character Ü (U+00DC), or as U followed by the "combining umlaut character" (U+0308, combining diaeresis). A Unicode string can contain either form, but because having both is confusing, the file system normalizes file names so that every umlauted U is stored either as Ü or as U + "combining umlaut character".

Linux uses the former form, Normalization Form C (NFC, composed), and Mac OS X uses the latter, Normalization Form D (NFD, decomposed).
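
A quick way to see the two forms is to dump their bytes. The sketch below builds both spellings of Ü with printf octal escapes and prints their UTF-8 bytes in hex. The variable and function names are only illustrative.

```shell
# Precomposed form (NFC): U+00DC, two bytes in UTF-8.
nfc=$(printf '\303\234')
# Decomposed form (NFD): "U" followed by U+0308 COMBINING DIAERESIS, three bytes.
nfd=$(printf 'U\314\210')

# Hex-dump helper (hypothetical name); od's spacing differs between
# platforms, so all whitespace is stripped from its output.
to_hex() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

nfc_hex=$(to_hex "$nfc")
nfd_hex=$(to_hex "$nfd")

echo "NFC: $nfc_hex"   # c39c
echo "NFD: $nfd_hex"   # 55cc88
```

Both strings are displayed as the same character, but they have different byte sequences and lengths.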

Apparently Git doesn't take this into account and simply compares the byte sequence of the file name, so the same name in NFC and in NFD looks like two different files. That leads to the problem you're having.
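
The effect of a byte-wise comparison can be sketched with plain shell string equality, which also compares the strings without any Unicode normalization. This is only an illustration with made-up variable names, not Git's actual code path.

```shell
# The same visible file name in both normalization forms.
composed=$(printf '\303\234berschrift.txt')      # NFC
decomposed=$(printf 'U\314\210berschrift.txt')   # NFD

# String equality in [ ] does no Unicode normalization.
if [ "$composed" = "$decomposed" ]; then
    result="same"
else
    result="different"
fi

echo "byte-wise comparison: $result"   # different
```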

The mailing list thread "Git, Mac OS X and German special characters" has a patch in it so that Git compares the file names after normalization.


Putting the following in ~/.gitconfig works for me on 10.12.1 Sierra for UTF-8 names:

[core]
    precomposeunicode = true
    quotepath = false

The first option makes Git convert the decomposed file names reported by the file system back to their precomposed form, and the second one stops Git from escaping non-ASCII characters in the paths it prints.
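
To see what quotepath changes, the sketch below lists one untracked file with a non-ASCII name under both settings. It works in a throwaway repository created with mktemp and passes the option per command with git -c, so no real configuration is touched. The directory and variable names are assumptions made for the example.

```shell
# Work in a scratch repository so nothing real is modified.
demo=$(mktemp -d)
(
    cd "$demo" || exit 1
    git init -q .
    touch "$(printf '\303\234berschrift.txt')"
)

# List the untracked file once with escaping enabled and once without.
quoted=$(cd "$demo" && git -c core.quotepath=true ls-files --others)
raw=$(cd "$demo" && git -c core.quotepath=false ls-files --others)
rm -rf "$demo"

echo "quotepath=true:  $quoted"   # octal escapes, wrapped in double quotes
echo "quotepath=false: $raw"      # the bytes of the name, unescaped
```

The exact escaped bytes depend on whether the file system handed the name back composed or decomposed.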


To make git add file work with umlauts in file names on Mac OS X, you may convert file path strings from composed into canonically decomposed UTF-8 using iconv.

# test case

mkdir testproject
cd testproject

git --version    # git version 1.7.6.1
locale charmap   # UTF-8

git init
file=$'\303\234berschrift.txt'    # composed UTF-8 (Linux-compatible)
touch "$file"
echo 'Hello, world!' > "$file"

# convert composed into canonically decomposed UTF-8
# cf. http://codesnippets.joyent.com/posts/show/12251
# to inspect the converted bytes:
#   printf '%s' "$file" | iconv -f utf-8 -t utf-8-mac | LC_ALL=C vis -fotc
# instead of: git add "$file"
git add "$(printf '%s' "$file" | iconv -f utf-8 -t utf-8-mac)"

git commit -a -m 'This is my commit message!'
git show
git status
git ls-files '*'
git ls-files -z '*' | tr '\0' '\n'

touch $'caf\303\251 1' $'caf\303\251 2' $'caf\303\251 3'
git ls-files --other '*'
git ls-files -z --other '*' | tr '\0' '\n'

Change the repository's OS X-specific core.precomposeunicode flag to true:

git config core.precomposeunicode true

To make sure new repositories get that flag, also run:

git config --global core.precomposeunicode true
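
To check which value is in effect, the setting can be read back with git config --get. The sketch below does this in a scratch repository created with mktemp, an assumption made only to keep the example self-contained. In a real repository only the two git config lines matter.

```shell
scratch=$(mktemp -d)
effective=$(
    cd "$scratch" || exit 1
    git init -q .
    git config core.precomposeunicode true     # repository-level setting
    git config --get core.precomposeunicode    # print the value Git will use
)
rm -rf "$scratch"

echo "core.precomposeunicode = $effective"
```

A repository-level value takes precedence over the one set with --global.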

Here is the relevant snippet from the manpage:

This option is only used by Mac OS implementation of Git. When core.precomposeunicode=true, Git reverts the unicode decomposition of filenames done by Mac OS. This is useful when sharing a repository between Mac OS and Linux or Windows. (Git for Windows 1.7.10 or higher is needed, or Git under cygwin 1.7). When false, file names are handled fully transparent by Git, which is backward compatible with older versions of Git.