Untracked files with unicode names
Using Unicode file names (for example, names containing the German umlaut "Ü") in Git can cause problems unless the correct configuration is in place before you start working with the repository.
Certain characters can be represented in two different forms in Unicode. For example, Ü can be represented as the single character Ü (known as “precomposed form” or NFC) or as the letter U followed by the combining diaeresis ¨ (known as “decomposed form” or NFD).
You can read more about this topic on Wikipedia: Unicode equivalence.
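The difference between the two forms is easiest to see at the byte level. The following is a minimal sketch, assuming a POSIX shell and UTF-8 encoding; it builds both spellings of Ü from octal escapes and compares them. The variable names are illustrative only.

```shell
# Ü in precomposed form (NFC): one code point, U+00DC, two bytes in UTF-8.
nfc=$(printf '\303\234')
# Ü in decomposed form (NFD): "U" followed by U+0308 COMBINING DIAERESIS, three bytes.
nfd=$(printf 'U\314\210')

# Count the bytes of each spelling (tr strips the padding some wc versions add).
nfc_len=$(printf '%s' "$nfc" | wc -c | tr -d ' ')
nfd_len=$(printf '%s' "$nfd" | wc -c | tr -d ' ')
echo "NFC bytes: $nfc_len"
echo "NFD bytes: $nfd_len"

# Compare the two spellings as plain strings.
if [ "$nfc" = "$nfd" ]; then same=yes; else same=no; fi
echo "byte-identical: $same"
```

Both strings usually render the same in a terminal, yet they are different byte sequences, so a tool that compares file names byte by byte sees two different names.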
Depending on the operating system and file system, Unicode file names might get converted to either form. Mac OS X (with HFS+) decomposes file names before storing them (thus using NFD), whereas Linux and Windows store file names as they are given, which in practice usually means NFC.
The Git config setting core.precomposeunicode converts between NFD filenames on Mac OS X and NFC filenames in Git:
core.precomposeunicode This option is only used by Mac OS implementation of Git. When core.precomposeunicode=true, Git reverts the unicode decomposition of filenames done by Mac OS. This is useful when sharing a repository between Mac OS and Linux or Windows. (Git for Windows 1.7.10 or higher is needed, or Git under cygwin 1.7). When false, file names are handled fully transparent by Git, which is backward compatible with older versions of Git.
The default for this config setting on OS X is true, and there should be no reason to override it. Note that the setting also matters when sharing repositories between Macs.
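To check which value is actually in effect, and which config file supplies it, something like the following can be used. This is a sketch that works in a scratch repository (the `repo` and `value` variables are illustrative) so that it does not touch your real configuration. `--show-origin` needs Git 2.8 or newer.

```shell
# Scratch repository so the sketch leaves your own configuration alone.
repo=$(mktemp -d)
git init -q "$repo"

# Set the value explicitly at repository level.
git -C "$repo" config core.precomposeunicode true

# Effective value, as Git resolves it across system, global and local config.
value=$(git -C "$repo" config --get core.precomposeunicode)
echo "effective value: $value"

# Show which file the value comes from (Git 2.8 or newer).
git -C "$repo" config --show-origin --get core.precomposeunicode
```

Running the last two commands inside a real repository shows whether a local or global entry overrides the expected value.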
The description does not go into great detail, but the setting affects various Git commands, most importantly git add: if the setting is false when a Unicode file name is added to Git on OS X, Git registers the decomposed file name. This leads to the following problems:
- Users on Mac OS X with core.precomposeunicode set to true will see the file as untracked in Git status
- Users on Linux or Windows will see the file as untracked in Git status (as core.precomposeunicode is not used on those platforms)
This can easily be reproduced with the following test repository on Mac OS X:
    git init .

    touch decomposed-filename-with-ü
    git -c core.precomposeunicode=false add decomposed-*
    git commit -m "Add file with decomposed filename on Mac OS X (core.precomposeunicode=false)"

    touch precomposed-filename-with-ü
    git -c core.precomposeunicode=true add precomposed-*
    git commit -m "Add file with precomposed filename on Mac OS X (core.precomposeunicode=true)"

    git -c core.precomposeunicode=false status --porcelain
    => ?? precomposed-filename-with-ü

    git -c core.precomposeunicode=true status --porcelain
    => ?? decomposed-filename-with-ü
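To find out which form the index actually stores, the default path quoting of git ls-files can help: with core.quotepath enabled, non-ASCII bytes are printed as octal escapes, so the two spellings become visibly different. The sketch below is an illustration under assumptions, not part of the reproduction above. It writes two hypothetical entries (file-ü in both forms) straight into the index of a scratch repository with git update-index --cacheinfo, bypassing the file system and its normalization, and its grep only looks for the combining diaeresis, not for every combining mark.

```shell
# Scratch repository; nothing here touches an existing project.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# An empty blob that both index entries can point to.
blob=$(git hash-object -w --stdin </dev/null)

nfc_name=$(printf 'file-\303\274')    # ü as the single code point U+00FC
nfd_name=$(printf 'file-u\314\210')   # u followed by U+0308

# Write the index entries directly, so no file system gets a chance to normalize the names.
git -c core.precomposeunicode=false update-index --add --cacheinfo 100644 "$blob" "$nfc_name"
git -c core.precomposeunicode=false update-index --add --cacheinfo 100644 "$blob" "$nfd_name"

# With core.quotepath=true, non-ASCII bytes are shown as octal escapes.
listing=$(git -c core.precomposeunicode=false -c core.quotepath=true ls-files)
printf '%s\n' "$listing"

# Count entries whose quoted name contains the escaped combining diaeresis.
decomposed=$(printf '%s\n' "$listing" | grep -c -F '\314\210')
echo "entries containing a combining diaeresis: $decomposed"
```

In the quoted listing, a precomposed ü appears as \303\274, while a decomposed one appears as a plain u followed by \314\210. This makes it possible to spot decomposed entries in a real repository even when the terminal renders both names identically.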
Once a file name has been added to a Git repository in decomposed form, the only way to solve the problem is to remove the affected files from Git and re-add them, either on Mac OS X with core.precomposeunicode set to true, or on Linux or Windows.
To recap, if you have problems with Unicode file names showing up as untracked:
- Make sure core.precomposeunicode is globally set to true on OS X:

    $ git config --global core.precomposeunicode
    => true

- All files still shown as untracked need to be removed from and re-added to Git.