What is the purpose of `text=auto` in `.gitattributes` file?
Most `.gitattributes` files contain `* text=auto`. What is the purpose of `text=auto` in that file?
Solution 1:
From the docs:
Each line in `.gitattributes` (or `.git/info/attributes`) file is of form:
pattern attr1 attr2 ...
So here, the pattern is `*`, which means all files, and the attribute is `text=auto`.
What does `text=auto` do? From the documentation:
When text is set to "auto", the path is marked for automatic end-of-line normalization. If Git decides that the content is text, its line endings are normalized to LF on checkin.
What's the default behaviour if it's not enabled?
Unspecified
If the text attribute is unspecified, Git uses the core.autocrlf configuration variable to determine if the file should be converted.
What does `core.autocrlf` do? From the docs:
core.autocrlf
Setting this variable to "true" is almost the same as setting the text attribute to "auto" on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed.
If you think this is all as clear as mud, you're not alone.
Here's what `* text=auto` does in my words: when someone commits a file, Git guesses whether that file is a text file or not, and if it is, it commits a version of the file in which every CR + LF byte pair is replaced with a single LF byte. It doesn't directly affect what files look like in the working tree; other settings control whether LF bytes are converted back to CR + LF bytes when a file is checked out.
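To make that concrete, here is a minimal Python sketch of the check-in step. It is not Git's actual code: `looks_like_text` and `normalize_on_checkin` are hypothetical names, and the "no NUL byte near the start" test is an assumption standing in for Git's real, more involved text detection.

```python
def looks_like_text(data: bytes) -> bool:
    """Crude stand-in for Git's text guess (assumption): no NUL byte near the start."""
    return b"\x00" not in data[:8000]


def normalize_on_checkin(data: bytes) -> bytes:
    """Return the bytes that would be stored: CRLF -> LF for text, unchanged otherwise."""
    if looks_like_text(data):
        return data.replace(b"\r\n", b"\n")
    return data


if __name__ == "__main__":
    print(normalize_on_checkin(b"first line\r\nsecond line\r\n"))
    print(normalize_on_checkin(b"\x00\x01\r\n\x02"))
```

The first call returns the content with LF-only line endings; the second is left alone because the sketch classifies it as binary.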
Recommendation:
I would not recommend putting `* text=auto` in the `.gitattributes` file. Instead, I would recommend something like this:
*.txt text
*.html text
*.css text
*.js text
This explicitly designates which files are text files, which get CRLF converted to LF in the object database (but not necessarily in the working tree). We had a repo with `* text=auto`, and Git wrongly guessed that an image file was a text file, corrupting it by replacing CR + LF bytes with LF bytes in the object database. That was not a fun one to debug.
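A small Python illustration of why that kind of misclassification is destructive. The byte string is a made-up payload, not the actual image from our repo; it only needs to contain the byte pair 0x0D 0x0A somewhere.

```python
# Hypothetical binary payload that happens to contain the byte pair 0x0D 0x0A twice.
payload = bytes([0x42, 0x4D, 0x0D, 0x0A, 0x7F, 0x0D, 0x0A, 0x01])

# What CRLF -> LF normalization would store if the file were treated as text.
stored = payload.replace(b"\r\n", b"\n")

# The stored copy is shorter, so every offset after the first pair has shifted.
print(len(payload), len(stored))
print(stored == payload)
```

Once the 0x0D bytes are dropped there is no way to tell which LF bytes originally had a CR in front of them, so the original file cannot be reconstructed from the stored copy.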
If you must use `* text=auto`, put it as the first line in `.gitattributes`, so that later lines can override it. This seems to be becoming an increasingly popular practice.
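The "later lines override earlier ones" rule can be sketched in Python as below. This is a simplified model, not how Git is implemented: the `RULES` list is a hypothetical `.gitattributes`, `resolve` is an illustrative helper, and `fnmatch` only approximates Git's pattern matching.

```python
from fnmatch import fnmatch
from typing import Optional

# Hypothetical .gitattributes contents: catch-all first, specific overrides after.
RULES = [
    ("*", "text=auto"),
    ("*.png", "-text"),
    ("*.txt", "text"),
]


def resolve(filename: str) -> Optional[str]:
    """Return the text setting from the last rule whose pattern matches."""
    result = None
    for pattern, attr in RULES:
        if fnmatch(filename, pattern):
            result = attr  # a later matching line overrides an earlier one
    return result


if __name__ == "__main__":
    for name in ("logo.png", "notes.txt", "Makefile"):
        print(name, resolve(name))
```

With the catch-all on the first line, `logo.png` ends up with `-text` and is never guessed at, while files with no specific rule, such as `Makefile`, still fall back to `text=auto`. If `* text=auto` came last instead, it would win for every file.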
Solution 2:
It ensures line endings are normalized. Source: Kernel.org
When text is set to "auto", the path is marked for automatic end-of-line normalization. If git decides that the content is text, its line endings are normalized to LF on checkin.
If you want to interoperate with a source code management system that enforces end-of-line normalization, or you simply want all text files in your repository to be normalized, you should instead set the text attribute to "auto" for all files.
This ensures that all files that git considers to be text will have normalized (LF) line endings in the repository.
Solution 3:
That configuration governs how line endings are handled. When enabled, all line endings in files Git considers text are converted to LF in the repository. There are other flags to deal with how line endings are converted in your working directory. Full info on the issue is here: https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html