
What's the strategy for handling CRLF (carriage return, line feed) with Git?

I tried committing files with CRLF-ending lines, but it failed.

I spent a whole work day on my Windows computer trying different strategies, and I was close to giving up on Git and trying Mercurial instead.

How do I properly handle CRLF line endings?


randers

Almost four years after asking this question, I have finally found an answer that completely satisfies me!

See the details in GitHub Help's guide, Dealing with line endings.

Git allows you to set the line ending properties for a repo directly using the text attribute in the .gitattributes file. This file is committed into the repo and overrides the core.autocrlf setting, allowing you to ensure consistent behaviour for all users regardless of their git settings.

And thus

The advantage of this is that your end of line configuration now travels with your repository and you don't need to worry about whether or not collaborators have the proper global settings.

Here's an example of a .gitattributes file

# Auto detect text files and perform LF normalization
*        text=auto

*.cs     text diff=csharp
*.java   text diff=java
*.html   text diff=html
*.css    text
*.js     text
*.sql    text

*.csproj text merge=union
*.sln    text merge=union eol=crlf

*.docx   diff=astextplain
*.DOCX   diff=astextplain

# absolute paths are ok, as are globs
/**/postinst* text eol=lf

# paths that don't start with / are treated relative to the .gitattributes folder
relative/path/*.txt text eol=lf
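One way to check how patterns like these resolve for a particular path is git check-attr. A minimal sketch, assuming it is run inside the repository's work tree (show_eol_attrs is a hypothetical helper name, not something from the answer):

```shell
# Sketch: report the text and eol attributes Git resolves for each given path.
# show_eol_attrs is a hypothetical helper name; the path does not need to exist.
show_eol_attrs() {
    git check-attr text eol -- "$@"
}
```

With a rule such as *.sh text eol=lf in place, show_eol_attrs build.sh should report text as set and eol as lf for that path.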

There is a convenient collection of ready-to-use .gitattributes files for the most popular programming languages. It's useful to get you started.

Once you've created or adjusted your .gitattributes, you should perform a once-and-for-all line endings re-normalization.
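A minimal sketch of that re-normalization, assuming Git 2.16 or later (where git add accepts --renormalize) and a clean work tree with all pending work committed first. The helper name renormalize_repo is hypothetical:

```shell
# Sketch: re-apply the current .gitattributes rules to every tracked file
# and show which files were restaged as a result.
# renormalize_repo is a hypothetical helper name; --renormalize needs Git 2.16+.
renormalize_repo() {
    git add --renormalize . &&
    git status --short
}
```

Review the listed files before committing the result, for example with git commit -m "Normalize line endings".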

Note that the GitHub Desktop app can suggest and create a .gitattributes file after you open your project's Git repo in the app. To try that, click the gear icon (in the upper right corner) > Repository settings ... > Line endings and attributes. You will be asked to add the recommended .gitattributes and if you agree, the app will also perform a normalization of all the files in your repository.

Finally, the Mind the End of Your Line article provides more background and explains how Git has evolved on the matters at hand. I consider this required reading.

You've probably got users in your team who use EGit or JGit (tools like Eclipse and TeamCity use them) to commit their changes. Then you're out of luck, as @gatinueta explained in this answer's comments:

This setting will not satisfy you completely if you have people working with Egit or JGit in your team, since those tools will just ignore .gitattributes and happily check in CRLF files https://bugs.eclipse.org/bugs/show_bug.cgi?id=342372

One trick might be to have them commit their changes in another client, say SourceTree. Our team back then preferred that tool to Eclipse's EGit for many use cases.

Who said software is easy? :-/


Care to share the Windows .gitattributes?
This setting will not satisfy you completely if you have people working with Egit in your team, since egit will just ignore .gitattributes and happily check in CRLF files bugs.eclipse.org/bugs/show_bug.cgi?id=342372
For Windows I'm usually inclined to set the global core.autocrlf = false - I prefer LF everywhere, but some of the Windows tools like Visual Studio insist on CRLF endings in certain files (and even mix them in a few..) ; not munging line endings is the safest option. If you know what you are doing, I'd probably use core.autocrlf = input and make exceptions for projects on Windows that you know are sensitive to line endings. As others point out, every decent text editor supports LF endings now. I actually think core.autocrlf = true can probably cause more trouble than it prevents.
@gatinueta To be more specific, it's a JGit issue. Meaning TeamCity, which also uses JGit, straight up ignores .gitattributes.
I also recommend using *.sh text eol=lf
John Millikin

Don't convert line endings. It's not the VCS's job to interpret data -- just store and version it. Every modern text editor can read both kinds of line endings anyway.


Seconded. If you have problems with inconsistent line endings, the best solution is to shout at whoever's using the wrong editor settings until they fix it.
Disagree. Native linefeeds on all platforms is a convenience.
Visual Studio is a PITA when it comes to anything other than CRLF.
Git has an option not to convert line endings: autocrlf=false. Unless you are doing cross-platform development (say, with Mono), it is best left at false when running under Windows, and set to true if you will be developing open source for Mono.
The problem with line ending is computing correct diffs. So the answer is wrong and misleading.
Cory

You almost always want autocrlf=input unless you really know what you are doing.

Some additional context below:

It should be either core.autocrlf=true if you like DOS endings or core.autocrlf=input if you prefer Unix newlines. In both cases, your Git repository will have only LF, which is the Right Thing. The only argument for core.autocrlf=false was that the automatic heuristic may incorrectly detect some binary as text, and then your file will be corrupted. So, the core.safecrlf option was introduced to warn a user if an irreversible change happens. In fact, there are two possibilities of irreversible changes: mixed line endings in a text file, in which case normalization is desirable, so this warning can be ignored, or (very unlikely) that Git incorrectly detected your binary file as text. Then you need to use attributes to tell Git that this file is binary.

The above paragraph was originally pulled from a thread on gmane.org, but it has since gone down.


Why it is a "Right Thing"?
core.autocrlf=true is a terrible idea. I've had nothing but trouble with that option, plus you have to remember to set it whenever you clone the repository.
Do NOT use autocrlf=true unless you know what you are doing. If you develop in DOS/Win then autocrlf=false will keep the endings the same between remote and local repos and is the best option in almost every situation.
@Chris - What if your developers have windows and multi-platform projects where some of the multi-platform developers work on OSX or Linux? Shouldn't the best option then be autocrlf=true?
Upvoted, with reservations. The introductory paragraph is unhelpful. core.autocrlf=input is the canonical answer. For most use cases, core.autocrlf=true and core.autocrlf=false are overly zealous (...in opposite but equally terrible ways, of course) and hence intrinsically destructive. "Git for Windows" should really have shipped with "Checkout as-is, commit Unix-style line endings" (i.e., core.autocrlf=input) as its default newline strategy. It didn't. So here we are, in frickin' 2015, still endlessly debating this.
Michael

Two alternative strategies to get consistent about line-endings in mixed environments (Microsoft + Linux + Mac):

A. Global All Repositories Setup

Convert all to one format:

find . -type f -not -path "./.git/*" -exec dos2unix {} \;
git commit -a -m 'dos2unix conversion'

Set core.autocrlf to input on Linux/UNIX or true on MS Windows (repository or global):

git config --global core.autocrlf input

Optionally, set core.safecrlf to true (to stop) or warn (to sing:) to add an extra guard that checks whether the reversed newline transformation would result in the same file:

git config --global core.safecrlf true

B. Or per Repository Setup

Convert all to one format:

find . -type f -not -path "./.git/*" -exec dos2unix {} \;
git commit -a -m 'dos2unix conversion'

Add a .gitattributes file to your repository:

echo "* text=auto" > .gitattributes
git add .gitattributes
git commit -m 'adding .gitattributes for unified line-ending'

Don't worry about your binary files; Git should be smart enough about them.

More about safecrlf/autocrlf variables


global approach == set and forget for all repos vs. per repo == does not require others to change their global configuration.
dos2unix is a command-line tool that, depending on your system, you might have to install separately.
They're not exclusive; you can use both approaches at the same time. Also, be very careful when using dos2unix: there is a risk of corrupting .git/index, and we don't need to apply it to every file. It's better to use something like find ./ -name "*.html" and specify which files you want to apply it to.
WARNING: before running the find lines, be aware: the dos2unix that comes shipped with Git for Windows has a peculiar (IMO idiotic and dangerous) behaviour with no arguments: instead of changing to UNIX, it toggles the newline format (DOS <-> UNIX).
And another warning: do not DOS2UNIX your .git folder. Just saying.
Marinos An

--- UPDATE 3 --- (does not conflict with UPDATE 2)

Consider the case where Windows users prefer working with CRLF and Linux/Mac users prefer working with LF in text files. This answer is written from the perspective of a repository maintainer:

For me the best strategy (fewer problems to solve) is: keep all text files with LF inside the Git repo, even if you are working on a Windows-only project. Then give clients the freedom to work with the line-ending style of their preference, provided that they pick a core.autocrlf value that respects your strategy (LF in the repo) while staging files for commit.

Staging is what many people get confused about when trying to understand how newline strategies work. It is essential to understand the following points before picking the correct value for the core.autocrlf property:

Adding a text file for commit (staging it) is like copying the file to another place inside .git/ sub-directory with converted line-endings (depending on core.autocrlf value on your client config). All this is done locally.

Setting core.autocrlf is like answering the following question (the exact same question on all OSes): "Should the Git client:

a. convert LF-to-CRLF when checking out (pulling) the repo changes from the remote?

b. convert CRLF-to-LF when adding a file for commit?"

and the possible answers (values) are:

false: "do none of the above",

input: "do only b"

true: "do a and b"

Note that there is NO "do only a".
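The three values can be pictured with a toy model that does not call Git at all. This is only a sketch of the a/b question above: the function names to_lf, to_crlf, on_add and on_checkout are made up for illustration, and real Git additionally decides per file whether the content is text.

```shell
# Toy model of core.autocrlf. It never calls Git; it only rewrites line endings.
# All function names here are illustrative assumptions.
to_lf()   { awk '{ sub(/\r$/, ""); printf "%s\n", $0 }'; }
to_crlf() { awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'; }

# Question b: what happens to content when it is staged (git add)?
on_add() {
    case "$1" in
        true|input) to_lf ;;   # CRLF is converted to LF
        *)          cat ;;     # false: content is stored untouched
    esac
}

# Question a: what happens to content when it is checked out?
on_checkout() {
    case "$1" in
        true) to_crlf ;;       # LF is converted to CRLF
        *)    cat ;;           # input and false: content is written as stored
    esac
}
```

For example, printf 'a\r\nb\r\n' | on_add input yields LF-only content, while the same input through on_add false keeps its CRLF endings.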

Fortunately

git client defaults (windows: core.autocrlf: true, linux/mac: core.autocrlf: false) will be compatible with LF-only-repo strategy. Meaning: windows clients will by default convert to CRLF when checking-out the repository and convert to LF when adding for commit. And linux clients will by default not do any conversions. This theoretically keeps your repo lf-only.

Unfortunately:

There might be GUI clients that do not respect the git core.autocrlf value

There might be people that don't use a value to respect your lf-repo strategy. E.g. they use core.autocrlf=false and add a file with CRLF for commit.

To detect ASAP non-lf text files committed by the above clients you can follow what is described on --- update 2 ---: (git grep -I --files-with-matches --perl-regexp '\r' HEAD, on a client compiled using: --with-libpcre flag)

And here is the catch: I, as a repo maintainer, keep core.autocrlf=input so that I can fix any wrongly committed files just by adding them again for commit. And I provide a commit message: "Fixing wrongly committed files".

As far as .gitattributes is concerned, I do not count on it, because there are more UI clients that do not understand it. I only use it to provide hints for text and binary files, and maybe to flag some exceptional files that should keep the same line endings everywhere:

# Don't do auto-detection. Treat as text (don't set any eol rule; use the client's)
*.java          text !eol
# Don't do auto-detection. Treat as binary
*.jpg           -text
# Don't do auto-detection. Treat as text. Check out and add with eol=lf
*.sh            text eol=lf
# Treat as text. Check out and add with eol=crlf
*.bat           text eol=crlf

Question: But why are we interested at all in newline handling strategy?

Answer: To keep a single-letter change from appearing as a 5000-line change just because the client that made the change auto-converted the whole file from CRLF to LF (or the opposite) before adding it for commit. This can be rather painful when conflict resolution is involved, and in some cases it can be the cause of unreasonable conflicts.

--- UPDATE 2 ---

The defaults of the Git client will work in most cases, whether you have Windows-only clients, Linux-only clients, or both. These are:

windows: core.autocrlf=true means convert lines to CRLF on checkout and convert lines to LF when adding files.

linux: core.autocrlf=input means don't convert lines on checkout (no need to, since files are expected to be committed with LF) and convert lines to LF (if needed) when adding files. (-- update3 -- : It seems that the default is actually false, but that is fine as well.)

The property can be set in different scopes. I would suggest explicitly setting in the --global scope, to avoid some IDE issues described at the end.

git config core.autocrlf
git config --global core.autocrlf
git config --system core.autocrlf
git config --local core.autocrlf
git config --show-origin core.autocrlf

Also, I would strongly discourage using git config --global core.autocrlf false on Windows (in case you have Windows-only clients), contrary to what the Git documentation proposes. Setting it to false will commit files with CRLF into the repo, and there is really no reason to. You never know whether you will need to share the project with Linux users. Plus, it's one extra step for each client that joins the project instead of using defaults.

Now, for some special cases of files (e.g. *.bat, *.sh) that you want to be checked out with LF or with CRLF, you can use .gitattributes.

To sum-up for me the best practice is:

Make sure that every non-binary file is committed with LF on git repo (default behaviour).

Use this command to make sure that no files are committed with CRLF: git grep -I --files-with-matches --perl-regexp '\r' HEAD (Note: on windows clients works only through git-bash and on linux clients only if compiled using --with-libpcre in ./configure).

If you find any such files by executing the above command, correct them. This involves (at least on Linux):

set core.autocrlf=input (--- update 3 --)

change the file

revert the change(file is still shown as changed)

commit it

Use only the bare minimum .gitattributes

Instruct the users to set the core.autocrlf described above to its default values.

Do not count 100% on the presence of .gitattributes. Git clients of IDEs may ignore them or treat them differently.
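If the git grep check above is unavailable because the client was built without PCRE, a possible alternative is git ls-files --eol (Git 2.8 or later), which reports the line-ending style of each tracked file. Note that this sketch inspects the index rather than HEAD, so it is not an exact equivalent, and find_crlf_in_index is a hypothetical helper name:

```shell
# Sketch: list tracked files whose staged content has CRLF or mixed line endings.
# find_crlf_in_index is a hypothetical helper name; needs Git 2.8+ for --eol.
find_crlf_in_index() {
    git ls-files --eol |
        awk '$1 == "i/crlf" || $1 == "i/mixed" {
            sub(/^[^\t]*\t/, "")   # drop the info columns, keep the path
            print
        }'
}
```

An empty result suggests that every staged text file is already LF-only.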

As said some things can be added in git attributes:

# Always checkout with LF
*.sh            text eol=lf
# Always checkout with CRLF
*.bat           text eol=crlf

I think some other safe options for .gitattributes instead of using auto-detection for binary files:

-text (e.g for *.zip or *.jpg files: Will not be treated as text. Thus no line-ending conversions will be attempted. Diff might be possible through conversion programs)

text !eol (e.g. for *.java,*.html: Treated as text, but eol style preference is not set. So client setting is used.)

-text -diff -merge (e.g for *.hugefile: Not treated as text. No diff/merge possible)

--- PREVIOUS UPDATE ---

One painful example of a client that will commit files wrongly:

NetBeans 8.2 (on Windows) will wrongly commit all text files with CRLFs, unless you have explicitly set core.autocrlf as global. This contradicts the standard Git client behaviour and causes lots of problems later, while updating/merging. This is what makes some files appear different (although they are not) even when you revert.
The same behaviour in netbeans happens even if you have added correct .gitattributes to your project.

Using the following command after a commit, will at least help you detect early whether your git repo has line ending issues: git grep -I --files-with-matches --perl-regexp '\r' HEAD

I have spent hours to come up with the best possible use of .gitattributes, to finally realize, that I cannot count on it.
Unfortunately, as long as JGit-based editors exist (which cannot handle .gitattributes correctly), the safe solution is to force LF everywhere even on editor-level.

Use the following anti-CRLF disinfectants.

windows/linux clients: core.autocrlf=input

committed .gitattributes: * text=auto eol=lf

committed .editorconfig (http://editorconfig.org/), which is a kind of standardized format, combined with editor plugins:

https://github.com/editorconfig/

https://github.com/welovecoding/editorconfig-netbeans/


I agree with you that this is the best approach; nobody should be using editors without LF support. But be careful with your .gitattributes line: it has unintended consequences in Git < 2.10, see stackoverflow.com/a/29508751/2261442
Darn it... I have tons of answers of mine advocating for git config --global core.autocrlf false, and recommending to deal with eol in .gitattributes directives only.
Peter Mortensen

Using core.autocrlf=false stopped all the files from being marked updated as soon as I checked them out in my Visual Studio 2010 project. The other two members of the development team are also using Windows systems so a mixed environment didn't come into play, yet the default settings that came with the repository always marked all files as updated immediately after cloning.

I guess the bottom line is to find what CRLF setting works for your environment. Especially since in many other repositories on our Linux boxes setting autocrlf = true produces better results.

20+ years later and we're still dealing with line ending disparities between OSes... sad.


@orange80, the disparity is unfortunate, but there's no reason to call it Windows's fault. LF-only makes sense from a minimalist standpoint, perhaps; but CRLF makes more sense based on what CR and LF mean. "Carriage return" means to return to the beginning of the line; "line feed" means to move straight down to the next line, rather than to the beginning of the next line. From a semantic standpoint, Windows is more correct in having both: move back to the beginning (CR) and then down one line (LF).
@Kyralessa "more correct" in still pretending that a computer is a typewriter, which it's not, btw. Maintaining the typewriter analogy doesn't make any sense considering this is not something end-users will ever deal with, and that two characters instead of one is pointless.
Late to this party by a few years, but you ignored the fact that CR and LF are cursor positioning tools. "CR" may as well be "Cursor Return" at this point in history. If I wanted the cursor returned to the beginning of the line, I'd tell the application to do that. Otherwise, it needs to stay where I put it.
Also, if CRLF is "more correct" because a textfile newline really is both a "move one row down" and "move to beginning of line", it would follow that just a CR would cause the text editor to overwrite a line with the following line. I know of no editors which actually support this, meaning that the need to express both CRLF and CR as different things, doesn't really exist.
@avl_sweden It was very common behavior before DOS, and since Microsoft thinks compatibility is important, it has stayed that way ever since. It was also the standard way in the US (as per ASA); ISO allowed both CR+LF and LF (so again, DOS was standards compliant); in both cases, since the sixties. Multics (Unix precursor) supported CR for bold/strike. Many applications nowadays (including .NET's "split by lines" features) look for any of the three (lone CR, lone LF, CRLF), and treat each of them as end-line. Many applications are still confused by mixed line-endings in a file, though.
Greg Hewgill

Try setting the core.autocrlf configuration option to true. Also have a look at the core.safecrlf option.

Actually it sounds like core.safecrlf might already be set in your repository, because (emphasis mine):

If this is not the case for the current setting of core.autocrlf, git will reject the file.

If this is the case, then you might want to check that your text editor is configured to use line endings consistently. You will likely run into problems if a text file contains a mixture of LF and CRLF line endings.
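To see whether a particular file mixes the two styles, the lines can simply be counted. A minimal sketch using only grep (count_line_endings is a hypothetical helper name):

```shell
# Sketch: count CRLF-terminated and LF-only lines in a file.
# count_line_endings is a hypothetical helper name.
count_line_endings() {
    cr=$(printf '\r')
    total=$(grep -c '' "$1" || true)          # every line in the file
    crlf=$(grep -c "${cr}\$" "$1" || true)    # lines that end with a CR
    echo "crlf=$crlf lf=$((total - crlf))"
}
```

If both counts are non-zero, the file has mixed line endings, which is the situation the core.safecrlf check is likely to complain about.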

Finally, I feel that the recommendation to simply "use what you're given" and use LF terminated lines on Windows will cause more problems than it solves. Git has the above options to try to handle line endings in a sensible way, so it makes sense to use them.


Wouldn't it be better to use repository-wide settings via a .gitattributes file? I was just wondering: it's inconvenient to force every user to take care of the line ending settings on their machine... Or are there other drawbacks?
kiewic

These are the two options for Windows and Visual Studio users that share code with Mac or Linux users. For an extended explanation, read the gitattributes manual.

* text=auto

In your repo's .gitattributes file add:

*   text=auto

This will normalize all the files with LF line endings in the repo.

And depending on your operating system (core.eol setting), files in the working tree will be normalized to LF for Unix based systems or CRLF for Windows systems.

This is the configuration that Microsoft .NET repos use.

Example:

Hello\r\nWorld

Will be normalized in the repo always as:

Hello\nWorld

On checkout, the working tree in Windows will be converted to:

Hello\r\nWorld

On checkout, the working tree in Mac will be left as:

Hello\nWorld

Note: If your repo already contains files not normalized, git status will show these files as completely modified the next time you make any change on them, and it could be a pain for other users to merge their changes later. See refreshing a repository after changing line endings for more information.

core.autocrlf = true

If text is unspecified in the .gitattributes file, Git uses the core.autocrlf configuration variable to determine if the file should be converted.

For Windows users, git config --global core.autocrlf true is a great option because:

Files are normalized to LF line endings only when added to the repo. If there are files not normalized in the repo, this setting will not touch them.

All text files are converted to CRLF line endings in the working directory.

The problem with this approach is that:

If you are a Windows user with autocrlf = input, you will see a bunch of files with LF line endings. Not a hazard for the rest of the team, because your commits will still be normalized with LF line endings.

If you are a Windows user with core.autocrlf = false, you will see a bunch of files with LF line endings and you may introduce files with CRLF line endings into the repo.

Most Mac users use autocrlf = input and may get files with CRLF file endings, probably from Windows users with core.autocrlf = false.


Your command for windows users says git config --global core.autocrl true. You mean git config --global core.autocrlf true.
Michael

This is just a workaround solution:

In normal cases, use the solutions that are shipped with git. These work great in most cases. Force to LF if you share the development on Windows and Unix based systems by setting .gitattributes.

In my case there were >10 programmers developing a project in Windows. This project was checked in with CRLF and there was no option to force to LF.

Some settings were internally written on my machine without any influence on the LF format; thus some files were globally changed to LF on each small file change.

My solution:

Windows-Machines: Leave everything as it is. Care about nothing, since you are a default Windows 'lone wolf' developer and you have to think like this: "There is no other system in the wide world, is there?"

Unix-Machines

Add the following lines to a config's [alias] section. This command lists all changed (i.e. modified/new) files:

lc = "!f() { git status --porcelain \
     | egrep -r \"^(\?| ).\*\\(.[a-zA-Z])*\" \
     | cut -c 4- ; }; f "

Convert all those changed files into DOS format:

unix2dos $(git lc)

Optionally ...

Create a git hook for this action to automate this process

Use params and include it and modify the grep function to match only particular filenames, e.g.:

... | egrep -r "^(\?| ).*\.(txt|conf)" | ...

Feel free to make it even more convenient by using an additional shortcut:

c2dos = "!f() { unix2dos $(git lc) ; }; f "

... and fire the converted stuff by typing git c2dos