I have encountered a weird problem with git (particularly GitHub) while working for a company with an existing repository. It occurs every time I merge a hotfix/patch branch into develop, or develop into main: instead of only generating diffs for the portions of files that I changed, the merge generates diffs for entire files, without exception. (This doesn't happen when making changes to files within the same branch, so I don't even think adding a conversion/check step before pushing will make a difference.) This is a problem I haven't encountered before, but there's a first time for everything, right? Well, this is it.
From what I have read in multiple places, git's object database stores text files with lf line endings whenever line-ending normalisation applies (via the text attribute or core.autocrlf), and that stored form cannot be switched to crlf when checking in or pushing to the remote repository. (Believe me, I've tried and it makes no difference.) Files that git does not normalise are stored byte for byte, crlf included. It is only on checkout that line endings can change to suit a particular OS. However, since most modern code editors/IDEs can handle lf endings regardless of the OS on which they're running (yes, even Notepad, finally), it makes sense to me to have everything use lf line endings. Converting back and forth seems like a waste of CPU cycles and, obviously, a source of errors. Besides, think of all the space saved by using one byte instead of two per line, multiplied by however many lines are stored in however many files in however many branches in however many repos and forks across the entirety of GitHub and GitLab. That's got to be TBs (possibly even PBs) of code, never mind binary files.
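Rather than guessing what is stored, it's possible to ask git directly. The sketch below assumes Git 2.8 or later (which, as far as I know, is when `git ls-files --eol` appeared) and works in a throwaway repo with made-up file names, so it can't touch real work:

```shell
set -e

# Scratch repository; the path and file names are purely illustrative.
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config core.autocrlf false   # no conversion: bytes are stored as committed

# One file with CRLF endings, one with LF endings.
printf 'one\r\ntwo\r\n' > crlf.txt
printf 'one\ntwo\n' > lf.txt
git add crlf.txt lf.txt

# Columns: endings in the index (i/...), in the working tree (w/...),
# the attributes in effect (attr/...), then the path.
eol_report=$(git ls-files --eol)
echo "$eol_report"
```

Run in a real repo (just the `git ls-files --eol` line), any `i/crlf` or `i/mixed` entries are files whose stored copy is not lf, which would be my first suspect for whole-file diffs on merge.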
Anyway, from what I've read (see the resource links), here's how to get a repo in the right state (preferably before there's any code in it):
Configuration Files
Although git can/does have a global configuration, the caveat is that using it relies on developers to set it correctly on every single one of their machines before working with git. Of course, this often doesn't happen. Instead, a couple of files can be added to repositories, for use by all developers working with a repo.
At any rate, you should set the following directives:
git config --global core.autocrlf input # on *NIX systems. Use true on Windows
git config --global merge.renormalize true # Normalise files on branch merge
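The same two directives can also be scoped to a single repo by leaving out `--global`. This is only a sketch in a scratch repo (the temporary path is an assumption for illustration); note that repo-local config lives in .git/config and is not shared with other developers either:

```shell
set -e

# Scratch repository so nothing real is reconfigured.
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Same directives as above, stored in this repo's .git/config only.
git config core.autocrlf input
git config merge.renormalize true

# Read the values back from the repo-local config.
autocrlf=$(git config --local --get core.autocrlf)
renormalize=$(git config --local --get merge.renormalize)
echo "core.autocrlf=$autocrlf merge.renormalize=$renormalize"

# For a one-off merge, the same behaviour can be requested on the command line:
#   git merge -X renormalize <branch>
```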
.gitattributes
This file, at the root of a repo, tells git how to treat line endings in various files, based on their extensions (change or ignore):
# Custom, per-project settings for line endings: force all to LF, since that's how git stores them.
# Set the default behaviour, in case people don't have this set globally
* text=auto eol=lf
# Explicitly treat these as text: normalised to LF in the repo and, because of eol=lf above, checked out with LF on every OS
*.php text
*.html text
*.htm text
*.css text
*.js text
*.njs text
*.sh text
*.md text
*.mmd text
*.page text
*.xml text
# Declare files that will always have CRLF line endings on checkout
# Declare files that should not be changed on checkout
# Declare files that are truly binary and should not be changed
*.gif binary
*.jpg binary
*.jpeg binary
*.otf binary
*.phar binary
*.png binary
*.svg binary
*.ttf binary
*.webp binary
*.woff binary
*.zip binary
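To check how git actually resolves these rules for a given path, `git check-attr` can be asked directly. The sketch below uses a scratch repo with a cut-down copy of the file above; index.php and logo.png are hypothetical paths, and they don't even need to exist:

```shell
set -e

scratch=$(mktemp -d)
cd "$scratch"
git init -q

# A three-line excerpt of the .gitattributes shown above.
cat > .gitattributes <<'EOF'
* text=auto eol=lf
*.php text
*.png binary
EOF

# Ask which attributes apply to two hypothetical paths.
php_attrs=$(git check-attr text eol -- index.php)
png_attrs=$(git check-attr text diff -- logo.png)
echo "$php_attrs"
echo "$png_attrs"
```

The `binary` macro shows up as `text: unset` and `diff: unset`, which is what stops git from touching or diffing those files as text.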
.editorconfig
This file, also at the root of a repo, tells editors what line endings to use for files.
root = true
[*]
end_of_line = lf
Fix/Normalise line endings in files
Finally, in case the configuration isn't respected, here's a bash script to get git to normalise line endings (convert them all to lf) in text files, after the fact:
#!/usr/bin/env bash
#Stage modifications to tracked files, so the commit below actually captures them.
git add -u
git commit -m "Saving files before refreshing line endings"
#Remove the index and force Git to rescan the working directory.
rm .git/index
#Rewrite the Git index to pick up all the new line endings.
git reset
#Show the rewritten, normalized files.
git status
#Add all your changed files back, and prepare them for a commit. This is your chance to inspect which files, if any, were unchanged.
git add -u
# It is perfectly safe to see a lot of messages here that read
# "warning: CRLF will be replaced by LF in file."
#Stage the .gitattributes file.
git add .gitattributes
#Commit the changes to your repository.
git commit -m "Normalize all the line endings"
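Newer versions of git (2.16 or later, as far as I can tell) have `git add --renormalize`, which should do the same job without deleting the index. Here's a minimal sketch in a scratch repo; the file name, commit identity and temporary path are all made up for the demonstration:

```shell
set -e

scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.invalid"
git config core.autocrlf false

# Commit a file with CRLF endings, as an un-normalised repo would have.
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt
git commit -q -m "Add file with CRLF endings"
before=$(git ls-files --eol notes.txt)

# Add the attributes, then re-stage every tracked file through them.
printf '* text=auto eol=lf\n' > .gitattributes
git add --renormalize .
git add .gitattributes
git commit -q -m "Normalize all the line endings"
after=$(git ls-files --eol notes.txt)

echo "before: $before"
echo "after:  $after"
```

The index column should go from `i/crlf` to `i/lf`. The working-tree copy keeps its old endings until the file is checked out again.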
If, after all this, the problem persists (and isn't actually anything to do with differences in line endings), then I can only conclude that I fundamentally do not understand the documentation/inner workings of git, and that's unlikely to be resolved by me alone. (Even the folks who work on projects like React/Redux seem to have issues with fixing/removing errant line endings in the code in their repos.) I'm by no means proficient with git and there's certainly a lot I could learn about it. However, taking a deep dive into Pro Git for this issue might be overkill. Mostly, git is a useful tool that stays out of my way and lets me get on with the business of writing and committing code to a distributed version control system.
If anyone can spot what I don't understand and/or am doing wrong, please drop me a comment. After two days of failing to resolve an issue that shouldn't even be one, I'm rather peeved and will be highly appreciative of being set straight. Hell, maybe I need to run dos2unix, Swiss File Knife (sfk) and/or an awk or sed script on every file I change before merging branches (since it doesn't seem to be a problem with pushing). What a pain (and not simply because I don't know how to automate/script awk and sed or set up git hooks or CI)...
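For what it's worth, here's roughly what that manual conversion could look like without dos2unix, using only `git ls-files --eol`, awk and tr. Treat it as an untested-in-anger sketch: the scratch repo and file names are invented, and `tr -d '\r'` deletes every carriage return, so it should only ever be pointed at text files.

```shell
set -e

scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config core.autocrlf false

printf 'one\r\ntwo\r\n' > crlf.txt
printf 'one\ntwo\n' > lf.txt
git add crlf.txt lf.txt

# ls-files --eol separates the info columns from the path with a tab,
# so split on tabs and pick the paths whose working-tree copy is CRLF.
git ls-files --eol | awk -F'\t' '$1 ~ /w\/crlf/ { print $2 }' |
while IFS= read -r path; do
    # Strip carriage returns via a temporary file, then swap it into place.
    tr -d '\r' < "$path" > "$path.tmp" && mv "$path.tmp" "$path"
done

# Count the tracked files that still have CRLF in the working tree.
remaining=$(git ls-files --eol | awk -F'\t' '$1 ~ /w\/crlf/' | wc -l | tr -d ' ')
echo "files still using CRLF: $remaining"
```

This only changes the working tree; the converted files still have to be staged and committed before a merge would see them.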