
Is it possible to move/rename files in Git and maintain their history?

I would like to rename/move a project subtree in Git moving it from

/project/xyz

to

/components/xyz

If I use a plain git mv project components, then all the commit history for the xyz project gets lost. Is there a way to move this such that the history is maintained?

I just want to note that I tested moving files via the filesystem, and after committing (via IntelliJ) I can see the whole history (including history from when the file was at a different location) when viewing the history (again in IntelliJ). I'm assuming IntelliJ isn't doing anything particularly special to do that, so it's nice to know that at the very least the history can be traced.
For the rules followed by Git when detecting a directory rename, see my answer below
I wrote an answer here. I hope it works. stackoverflow.com/questions/10828267/…
Git subtrees have "fake" histories anyway. When you break up a repository using git-subtree, Git gives the resulting subtree a fabricated history that is not the same as that of the project from which it broke away. I believe that Git tries to determine all commits that involved any of the files in the subtree, and it uses them to stitch together a history. Also, these histories are rewritten every time you recombine and resplit the subtrees. Submodules, however, each have their own history separate from the parent project.

Cristian Ciupitu

Git detects renames rather than persisting the operation with the commit, so whether you use git mv or mv doesn't matter.

The log command takes a --follow argument that continues history before a rename operation, i.e., it searches for similar content using heuristics.

To look up the full history, use the following command:

git log --follow ./path/to/file
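A minimal sketch of the difference --follow makes, assuming a throwaway repository created with mktemp. The directory names mirror the question, while the file name, identity, and commit messages are placeholders invented for illustration:

```shell
#!/bin/sh
# Sketch: compare "git log" with and without --follow after a move.
set -eu
repo=$(mktemp -d)                        # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false          # keep the demo independent of signing setup

mkdir project
printf 'one\ntwo\nthree\n' > project/file.txt
git add project/file.txt
git commit -q -m "add project/file.txt"

mkdir components
git mv project/file.txt components/file.txt
git commit -q -m "move file to components/"

# Count the commits reported for the new path, without and with --follow.
plain=$(git log --oneline -- components/file.txt | wc -l | tr -d ' ')
followed=$(git log --oneline --follow -- components/file.txt | wc -l | tr -d ' ')
echo "without --follow: $plain commit(s); with --follow: $followed commit(s)"
```

Without --follow, the log stops at the commit that introduced the new path. With --follow, it continues into the commits made under the old path.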

I suspect this is a performance consideration. If you don't need the full history, scanning the content for renames would only cost extra time. The easiest way is to set up an alias with git config alias.logf "log --follow" and just write git logf ./path/to/file.
@TroelsThomsen this e-mail by Linus Torvalds, linked from this answer, indicates that it's an intentional design choice of Git since it's allegedly much more powerful than tracking renames etc.
This answer is a bit misleading. Git does "detect renames," but very late in the game; the question is asking how you ensure Git tracks renames, and someone reading this can easily infer that Git detects them automatically for you and makes note of it. It does not. Git has no real handling of renames, and instead there are merge/log tools that attempt to figure out what happened - and rarely get it right. Linus has a mistaken but vehement argument as to why git should never just do it the right way and track renames explicitly. So, we're stuck here.
Important: if you rename a directory, for example during renaming of a Java package, be sure to execute two commits, first for the 'git mv {old} {new}' command, second for the updates of all Java files that reference the changed package directory. Otherwise git can't track the individual files even with the --follow parameter.
Though Linus probably makes very few mistakes, this does appear to be one. Simply renaming a folder causes a massive delta to be uploaded to GitHub. Which makes me cautious about renaming my folders...but that's a pretty big straitjacket for a programmer. Occasionally, I HAVE TO re-define the meaning of something, or change how things are categorized. Linus: "In other words, I'm right. I'm always right, but sometimes I'm more right than other times. And dammit, when I say 'files don't matter', I'm really really Right(tm)." ...I have my doubts about that one.
Cristian Ciupitu

No.

The short answer is NO. It is not possible to rename a file in Git and remember the history. And it is a pain.

Rumor has it that git log --follow --find-copies-harder will work, but it does not work for me, even if there are zero changes to the file contents, and the moves have been made with git mv.

(Initially I used Eclipse to rename and update packages in one operation, which may have confused Git. But that is a very common thing to do. --follow does seem to work if only a mv is performed and then a commit and the mv is not too far.)

Linus says that you are supposed to understand the entire contents of a software project holistically, not needing to track individual files. Well, sadly, my small brain cannot do that.

It is really annoying that so many people have mindlessly repeated the statement that Git automatically tracks moves. They have wasted my time. Git does no such thing. By design(!) Git does not track moves at all.

My solution is to rename the files back to their original locations. Change the software to fit the source control. With Git you just seem to need to "git" it right the first time.

Unfortunately, that breaks Eclipse, which seems to use --follow. git log --follow sometimes does not show the full history of files with complicated rename histories even though git log does. (I do not know why.)

(There are some too clever hacks that go back and recommit old work, but they are rather frightening. See GitHub-Gist: emiller/git-mv-with-history.)

In short: if Subversion doing this is wrong, then Git doing this is also wrong - doing this isn't some (mis!)feature, it's a mistake.


I believe you are correct. I was just trying to use php-cs-fixer to reformat the source for my Laravel 5 project but it insists on changing the capitalization of the namespace clauses to match the lowercase value of the app folder. But namespaces (or composer autoloads) only work with CamelCase. I need to change the capitalization of the folder to App but this causes my changes to be lost. This is the most trivial of examples, but shows how the git heuristic is not able to follow even the simplest of name changes (--follow and --find-copies-harder should be the rule, not the exception).
git -1, subversion +1
Is this still true ? That's more reason for me to stay with tfs for now, keeping the history of a moved/renamed file is a must in a large project.
The short answer is yes. The current version of Git supports "git log --follow" as well, and I agree with @MohammadDehghan.
git log --follow works for me, but only if the git mv moves the file to something untracked. If you try to do rm a.txt && git mv b.txt a.txt then b.txt's history will be destroyed. You have to first git rm a.txt then commit, then git mv b.txt a.txt if you want git log --follow to work.
royhowie

It is possible to rename a file and keep the history intact, although it causes the file to be renamed throughout the entire history of the repository. This is probably only for the obsessive git-log-lovers, and has some serious implications, including these:

You could be rewriting a shared history, which is the most important DON'T while using Git. If someone else has cloned the repository, you'll break it doing this. They will have to re-clone to avoid headaches. This might be OK if the rename is important enough, but you'll need to consider this carefully -- you might end up upsetting an entire opensource community!

If you've referenced the file using its old name earlier in the repository history, you're effectively breaking earlier versions. To remedy this, you'll have to do a bit more hoop jumping. It's not impossible, just tedious and possibly not worth it.

Now, since you're still with me, you're probably a solo developer renaming a completely isolated file. Let's move a file using git filter-branch with a tree filter!

Assume you're going to move a file old into a folder dir and give it the name new

This could be done with git mv old dir/new && git add -u dir/new, but that breaks history.

Instead:

git filter-branch --tree-filter 'if [ -f old ]; then mkdir -p dir && mv old dir/new; fi' HEAD

will redo every commit in the branch, executing the command in the ticks for each iteration. Plenty of stuff can go wrong when you do this. I normally test to see if the file is present (otherwise it's not there yet to move) and then perform the necessary steps to shoehorn the tree to my liking. Here you might sed through files to alter references to the file and so on. Knock yourself out! :)

When completed, the file is moved and the log is intact. You feel like a ninja pirate.

Also; The mkdir dir is only necessary if you move the file to a new folder, of course. The if will avoid the creation of this folder earlier in history than your file exists.
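The rewrite described above can be tried safely on a scratch repository first. This is only a sketch: the two-commit history, the identity, and the temporary directory are assumptions for illustration. FILTER_BRANCH_SQUELCH_WARNING is set only to silence the advisory that recent Git versions print before filter-branch starts.

```shell
#!/bin/sh
# Sketch: rewrite history so that "old" has always lived at "dir/new".
set -eu
repo=$(mktemp -d)                        # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false

echo "v1" > old
git add old
git commit -q -m "add old"
echo "v2" >> old
git commit -q -am "edit old"

# Recent Git versions print an advisory and pause before filter-branch runs.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch -f --tree-filter \
    'if [ -f old ]; then mkdir -p dir && mv old dir/new; fi' HEAD >/dev/null

# After the rewrite, every commit refers to the new path only.
new_commits=$(git log --oneline -- dir/new | wc -l | tr -d ' ')
old_commits=$(git log --oneline -- old | wc -l | tr -d ' ')
echo "commits touching dir/new: $new_commits; commits touching old: $old_commits"
```

Every commit hash changes in the process, which is why the warnings about shared history apply.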


As an obsessive git-log-lover, I wouldn't go for this. The files weren't named that at those points in time, hence history reflects a never-existent situation. Who knows what tests might break in the past! The risk of breaking earlier versions is in pretty much every case not worth it.
@Vincent You're absolutely right, and I tried to be as clear as I could about the unlikeliness of this solution being appropriate. I also think we're talking about two meanings of the word "history" in this case, I appreciate both.
I find there are situations where one might need this. Say I developed something in my own personal branch, which I now want to merge upstream. But I discover that the filename isn't appropriate, so I change it for my whole personal branch. That way I can keep a clean, proper history and have the correct name from the beginning.
@user2291758 that's my use case. These more powerful git commands are dangerous but that doesn't mean they don't have very compelling use cases if you know what you're doing!
@MattiJokipii: The mv command is used to move the file before each commit throughout the repository's history, so a normal Unix mv is the correct one to use. I'm not even sure what would happen if you used git mv. If you're using Windows, you should use the move command.
Erik Hesselink
git log --follow [file]

will show you the history through renames.


It appears that this requires you to commit just the rename before you start modifying the file. If you move the file (in the shell) and then change it, all bets are off.
@yoyo: that's because git doesn't track renames, it detects them. A git mv basically does a git rm && git add. There are options like -M90 / --find-renames=90 to consider a file to be renamed when it's 90% identical.
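To illustrate how the similarity threshold changes what Git reports, here is a small sketch. The file names, the 50-line content, and the thresholds (-M50% and -M100%) are chosen purely for the demonstration:

```shell
#!/bin/sh
# Sketch: the same commit reported as a rename or as delete + add,
# depending on the similarity threshold passed to -M.
set -eu
repo=$(mktemp -d)                        # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false

i=1
while [ "$i" -le 50 ]; do echo "line $i"; i=$((i + 1)); done > old.txt
git add old.txt
git commit -q -m "add old.txt"

# Rename and make a small edit in the same commit.
git mv old.txt new.txt
echo "one extra line" >> new.txt
git commit -q -am "rename and edit in one commit"

# Lenient threshold: mostly identical content still counts as a rename.
lenient=$(git diff -M50% --name-status HEAD~1 HEAD)
# -M100% restricts detection to exact (byte-identical) renames.
strict=$(git diff -M100% --name-status HEAD~1 HEAD)
printf 'lenient:\n%s\nstrict:\n%s\n' "$lenient" "$strict"
```

The commit itself is identical in both cases; only the interpretation at diff time differs, which is the sense in which Git detects renames rather than tracking them.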
Thomas Bormans

I do:

git mv {old} {new}
git add -u {new}

The -u doesn't seem to do anything for me. Is it supposed to update the history?
Perhaps you want the behavior of -A instead? Again, see here: git-scm.com/docs/git-add
It does add the files, however it doesn't update the history so that 'git log file name' shows the full history. It only shows the full history if you use the --follow option still.
I did a complicated refactor that moved an include directory (using mv, not git mv) and then changed lots of #include paths within the renamed files. git could not find enough similarity to track the history. But git add -u was just the thing I needed. git status now indicates "renamed" where before it showed "deleted" and "new file".
There are lots of questions on SO that address the purpose of git add -u. The Git docs tend to be unhelpful, and are the last place I would want to look. Here's one post showing git add -u in action: stackoverflow.com/a/2117202.
VonC

I would like to rename/move a project subtree in Git moving it from /project/xyz to /components/xyz If I use a plain git mv project components, then all the commit history for the xyz project gets lost.

No (8 years later, Git 2.19, Q3 2018), because Git will detect the directory rename, and this is now better documented.

See commit b00bf1c, commit 1634688, commit 0661e49, commit 4d34dff, commit 983f464, commit c840e1a, commit 9929430 (27 Jun 2018), and commit d4e8062, commit 5dacd4a (25 Jun 2018) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 0ce5a69, 24 Jul 2018)

That is now explained in Documentation/technical/directory-rename-detection.txt:

Example:

When all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is likely that x/d added in the meantime would also want to move to z/d by taking the hint that the entire directory 'x' moved to 'z'.

But there are many other cases, like:

one side of history renames x -> z, and the other renames some file to x/e, causing the need for the merge to do a transitive rename.

To simplify directory rename detection, those rules are enforced by Git:

a couple basic rules limit when directory rename detection applies:

If a given directory still exists on both sides of a merge, we do not consider it to have been renamed.

If a subset of to-be-renamed files have a file or directory in the way (or would be in the way of each other), "turn off" the directory rename for those specific sub-paths and report the conflict to the user.

If the other side of history did a directory rename to a path that your side of history renamed away, then ignore that particular rename from the other side of history for any implicit directory renames (but warn the user).

You can see a lot of tests in t/t6043-merge-rename-directories.sh, which also point out that:

a) If renames split a directory into two or more others, the directory with the most renames "wins".

b) Avoid directory-rename-detection for a path, if that path is the source of a rename on either side of a merge.

c) Only apply implicit directory renames to directories if the other side of history is the one doing the renaming.


oHo

Yes

You convert the commit history of files into email patches using git log --pretty=email.

You reorganize these files in new directories and rename them.

You convert these files (emails) back to Git commits to keep the history using git am.

Limitation

Tags and branches are not kept

History is cut on path file rename (directory rename)

Step by step explanation with examples

1. Extract history in email format

Example: Extract history of file3, file4 and file5

my_repo
├── dirA
│   ├── file1
│   └── file2
├── dirB            ^
│   ├── subdir      | To be moved
│   │   ├── file3   | with history
│   │   └── file4   | 
│   └── file5       v
└── dirC
    ├── file6
    └── file7

Set/clean the destination

export historydir=/tmp/mail/dir       # Absolute path
rm -rf "$historydir"    # Caution when cleaning the folder

Extract history of each file in email format

cd my_repo/dirB
find -name .git -prune -o -type d -o -exec bash -c 'mkdir -p "$historydir/${0%/*}" && git log --pretty=email -p --stat --reverse --full-index --binary -- "$0" > "$historydir/$0"' {} ';'

Unfortunately option --follow or --find-copies-harder cannot be combined with --reverse. This is why history is cut when file is renamed (or when a parent directory is renamed).

Temporary history in email format:

/tmp/mail/dir
    ├── subdir
    │   ├── file3
    │   └── file4
    └── file5

Dan Bonachea suggests inverting the loops of the git log generation command in this first step: rather than running git log once per file, run it exactly once with a list of files on the command line and generate a single unified log. This way commits that modify multiple files remain a single commit in the result, and all the new commits maintain their original relative order. Note that this also requires changes in the second step below when rewriting filenames in the (now unified) log.
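A rough sketch of that unified-log variant, under several assumptions: both repositories are scratch directories, the file layout is a reduced version of the example tree, and the file name unified.mbox is invented here. The path rewriting from step 2 is deliberately left out, so the files keep their original paths in the destination.

```shell
#!/bin/sh
# Sketch: export the history of several files as ONE mbox, then replay it.
set -eu
src=$(mktemp -d); dst=$(mktemp -d); historydir=$(mktemp -d)

cd "$src"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false
mkdir subdir
echo a > subdir/file3
echo b > file5
git add .
git commit -q -m "add file3 and file5 together"
echo c > subdir/file4
git add .
git commit -q -m "add file4"

# A single git log call for all paths: a commit touching several of the
# files stays one patch, and the patches keep their original order.
git log --pretty=email -p --stat --reverse --full-index --binary \
    -- subdir/file3 subdir/file4 file5 > "$historydir/unified.mbox"
patches=$(grep -c '^From [0-9a-f]* Mon Sep 17 00:00:00 2001' "$historydir/unified.mbox")

cd "$dst"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"
git am -q --committer-date-is-author-date "$historydir/unified.mbox"
applied=$(git rev-list --count HEAD)
echo "patches exported: $patches; commits in destination: $applied"
```

The per-file approach would have produced three patches here, one per file. The unified log produces two, matching the two source commits.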

2. Reorganize file tree and update filenames

Suppose you want to move these three files into this other repo (it can be the same repo).

my_other_repo
├── dirF
│   ├── file55
│   └── file56
├── dirB              # New tree
│   ├── dirB1         # from subdir
│   │   ├── file33    # from file3
│   │   └── file44    # from file4
│   └── dirB2         # new dir
│        └── file5    # from file5
└── dirH
    └── file77

Therefore reorganize your files:

cd /tmp/mail/dir
mkdir -p dirB/dirB1
mv subdir/file3 dirB/dirB1/file33
mv subdir/file4 dirB/dirB1/file44
mkdir -p dirB/dirB2
mv file5 dirB/dirB2

Your temporary history is now:

/tmp/mail/dir
    └── dirB
        ├── dirB1
        │   ├── file33
        │   └── file44
        └── dirB2
             └── file5

Change also filenames within the history:

cd "$historydir"
find * -type f -exec bash -c 'sed "/^diff --git a\|^--- a\|^+++ b/s:\( [ab]\)/[^ ]*:\1/$0:g" -i "$0"' {} ';'

3. Apply new history

Your other repo is:

my_other_repo
├── dirF
│   ├── file55
│   └── file56
└── dirH
    └── file77

Apply commits from temporary history files:

cd my_other_repo
find "$historydir" -type f -exec cat {} + | git am --committer-date-is-author-date

--committer-date-is-author-date preserves the original commit time-stamps (Dan Bonachea's comment).

Your other repo is now:

my_other_repo
├── dirF
│   ├── file55
│   └── file56
├── dirB
│   ├── dirB1
│   │   ├── file33
│   │   └── file44
│   └── dirB2
│        └── file5
└── dirH
    └── file77

Use git status to see the number of commits ready to be pushed :-)

Extra trick: Check renamed/moved files within your repo

To list the files having been renamed:

find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow {} ';' | grep '=>'

More customizations: You can extend the git log command with the options --find-copies-harder or --reverse. You can also remove the first two columns using cut -f3- and grep for the complete pattern '{.* => .*}'.

find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow --find-copies-harder --reverse {} ';' | cut -f3- | grep '{.* => .*}'

BEWARE: This technique splits commits that change 2 or more files into separate fragmented-commits, and furthermore scrambles their order by sorting on filename (so the fragments of one original commit do not appear adjacent in linear history). The resulting history is therefore only "correct" on a file-by-file basis. If you are moving more than one file, then NONE of the new commits in the resulting history represent a consistent snapshot of the moved files that ever existed in the history of the original repo.
Hi @DanBonachea. Thank you for your interesting feedback. I have successfully migrated some repos containing several files using this technique (even with renamed files and files moved across directories). What do you suggest changing in this answer? Do you think we should add a WARNING banner at the top of this answer explaining the limitations of this technique? Cheers
I adapted this technique to avoid the problem by inverting the loops of the git log generation command in step 1. Ie. rather than running git log once per file, run it exactly once with a list of files on the command line and generate a single unified log. This way commits that modify 2 or more files remain a single commit in the result, and all the new commits maintain their original relative order. Note this also requires changes in step 2 when rewriting filenames in the (now unified) log. I also used git am --committer-date-is-author-date to preserve the original commit timestamps.
Thank you for your experimentation and sharing. I have updated a bit the answer for other readers. However I have took time to test your processing. Please feel free to edit this answer if you want to provide examples of command lines. Cheers ;)
aboger

I have faced the issue "Renaming the folder without losing history". To fix it, run:

$ git mv oldfolder temp && git mv temp newfolder
$ git commit
$ git push

This should be marked the correct answer. Totally worked for me to move a file from one folder to another within the same repo. I didn't even have to do the 'temp' thing. git mv olddir/file newdir/file worked for me.
And all history is saved.
Why is this better than git mv oldfolder newfolder?
Peter Mortensen

To rename a directory or file (I don't know much about complex cases, so there might be some caveats):

git filter-repo --path-rename OLD_NAME:NEW_NAME

To rename a directory in files that mention it (it's possible to use callbacks, but I don't know how):

git filter-repo --replace-text expressions.txt

expressions.txt is a file filled with lines like literal:OLD_NAME==>NEW_NAME (it's possible to use Python's RE with regex: or glob with glob:).

To rename a directory in messages of commits:

git-filter-repo --message-callback 'return message.replace(b"OLD_NAME", b"NEW_NAME")'

Python's regular expressions are also supported, but they must be written in Python, manually.

If the repository is not a fresh clone (for example, an original repository without a remote), you will have to add --force to force a rewrite. (You may want to create a backup of your repository before doing this.)

If you do not want to preserve refs (they will be displayed in the branch history of Git GUI), you will have to add --replace-refs delete-no-add.


git: 'filter-repo' is not a git command. See 'git --help'
@alper This command works! But filter-repo is not a standard command in Git. You need to install it before you can use it. You can find instructions how to download and how to install here github.com/newren/git-filter-repo
Peter Mortensen

I followed this multi-step process to move code to the parent directory and retained history.

Step 0: Created a branch 'history' from 'master' for safekeeping

Step 1: Used git-filter-repo tool to rewrite history. This command below moved folder 'FolderwithContentOfInterest' to one level up and modified the relevant commit history

git filter-repo --path-rename ParentFolder/FolderwithContentOfInterest/:FolderwithContentOfInterest/ --force

Step 2: By this time the local repository had lost its remote repository path (git filter-repo removes the origin remote). Added the remote reference again

git remote add origin git@github.com:MyCompany/MyRepo.git

Step 3: Pull information on repository

git pull

Step 4: Connect the local lost branch with the origin branch

git branch --set-upstream-to=origin/history history

Step 5: Address merge conflict for the folder structure if prompted

Step 6: Push!!

git push

Note: The modified history and moved folder appear to already be committed.

Done. Code moves to the parent / desired directory keeping history intact!


This should be way higher in the list of answers. As of 2020, filter-repo is the way to go for this kind of operation.
Peter Mortensen

Simply move the file and stage with:

git add .

Before commit you can check the status:

git status

That will show:

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        renamed:    old-folder/file.txt -> new-folder/file.txt

I tested with Git version 2.26.1.
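The same check can be scripted. This sketch assumes a scratch repository and uses git status --porcelain so the result is easy to inspect. The folder and file names follow the output shown above, and the file content is a placeholder:

```shell
#!/bin/sh
# Sketch: a plain filesystem move is reported as a rename once staged.
set -eu
repo=$(mktemp -d)                        # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false

mkdir old-folder
printf 'hello\nworld\n' > old-folder/file.txt
git add .
git commit -q -m "add file"

mkdir new-folder
mv old-folder/file.txt new-folder/file.txt   # ordinary mv, not git mv
git add .                                    # stages the removal and the new path

# Short-format status; a staged rename is marked with "R".
status=$(git status --porcelain)
echo "$status"
```

The rename shown here is computed by git status when it compares the index with HEAD. Nothing about the move is recorded beyond the deletion and the addition.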

Extracted from GitHub Help Page.


Peter Mortensen

While the core of Git, the Git plumbing, doesn't keep track of renames, the history you display with the Git log "porcelain" can detect them if you like.

For a given git log use the -M option:

git log -p -M

With a current version of Git.

This works for other commands like git diff as well.

There are options to make the comparisons more or less rigorous. If you rename a file without making significant changes to the file at the same time it makes it easier for Git log and friends to detect the rename. For this reason some people rename files in one commit and change them in another.

There's a cost in CPU use whenever you ask Git to find where files have been renamed, so whether you use it or not, and when, is up to you.

If you would like to always have your history reported with rename detection in a particular repository you can use:

git config diff.renames 1
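A small sketch of what this setting changes, assuming a scratch repository. It uses git -c to override diff.renames for a single command, so both behaviours can be compared without touching the stored configuration. The file content is a placeholder:

```shell
#!/bin/sh
# Sketch: the same commit shown with rename detection off and on.
set -eu
repo=$(mktemp -d)                        # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false

printf 'def power(x):\n    return x * x\n' > power.py
git add power.py
git commit -q -m "add power.py"

mkdir yyy
git mv power.py yyy/power.py
git commit -q -m "rename test"

# One-off overrides of diff.renames for the last commit only.
off=$(git -c diff.renames=false log -1 --format= --name-status)
on=$(git -c diff.renames=true log -1 --format= --name-status)
printf 'diff.renames=false:\n%s\ndiff.renames=true:\n%s\n' "$off" "$on"
```

With detection off, the commit appears as one deleted path and one added path. With it on, the same commit is reported as a rename.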

Files moving from one directory to another are detected. Here's an example:

commit c3ee8dfb01e357eba1ab18003be1490a46325992
Author: John S. Gruber <JohnSGruber@gmail.com>
Date:   Wed Feb 22 22:20:19 2017 -0500

    test rename again

diff --git a/yyy/power.py b/zzz/power.py
similarity index 100%
rename from yyy/power.py
rename to zzz/power.py

commit ae181377154eca800832087500c258a20c95d1c3
Author: John S. Gruber <JohnSGruber@gmail.com>
Date:   Wed Feb 22 22:19:17 2017 -0500

    rename test

diff --git a/power.py b/yyy/power.py
similarity index 100%
rename from power.py
rename to yyy/power.py

Please note that this works whenever you are using diff, not just with git log. For example:

$ git diff HEAD c3ee8df
diff --git a/power.py b/zzz/power.py
similarity index 100%
rename from power.py
rename to zzz/power.py

As a trial I made a small change in one file in a feature branch and committed it and then in the master branch I renamed the file, committed, and then made a small change in another part of the file and committed that. When I went to feature branch and merged from master the merge renamed the file and merged the changes. Here's the output from the merge:

 $ git merge -v master
 Auto-merging single
 Merge made by the 'recursive' strategy.
  one => single | 4 ++++
  1 file changed, 4 insertions(+)
  rename one => single (67%)

The result was a working directory with the file renamed and both text changes made. So it's possible for Git to do the right thing despite the fact that it doesn't explicitly track renames.
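The trial can be reproduced roughly as follows. This is a sketch under assumptions: the 20-line content and the wording of the two edits are invented, and the initial branch name is read from the repository instead of assuming master:

```shell
#!/bin/sh
# Sketch: a merge that carries an edit across a rename made on the other branch.
set -eu
repo=$(mktemp -d)                        # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false

i=1
while [ "$i" -le 20 ]; do echo "line $i"; i=$((i + 1)); done > one
git add one
git commit -q -m "add one"
main=$(git symbolic-ref --short HEAD)    # whatever the default branch is called

# Feature branch: small edit at the end of the file.
git checkout -q -b feature
echo "feature change at the end" >> one
git commit -q -am "feature: edit one"

# Main branch: rename, then a small edit at the top of the file.
git checkout -q "$main"
git mv one single
git commit -q -m "rename one to single"
sed '1s/.*/main change at the top/' single > single.tmp && mv single.tmp single
git commit -q -am "main: edit single"

# Merge main into feature; the feature edit should land in the renamed file.
git checkout -q feature
git merge -q --no-edit "$main"
ls
```

After the merge, the working directory should contain only single, holding both edits.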

This is a late answer to an old question, so the other answers may have been correct for the Git version at the time.


Jakub Pawlowski

First create a standalone commit with just a rename.

Then put any subsequent changes to the file content in a separate commit.
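A sketch of why the separation helps, assuming a scratch repository and two invented files whose contents are replaced completely. One file is renamed and rewritten in a single commit, the other is renamed first and rewritten afterwards:

```shell
#!/bin/sh
# Sketch: --follow after a combined rename+rewrite vs. a rename-only commit.
set -eu
repo=$(mktemp -d)                        # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"
git config commit.gpgsign false

# Helper (hypothetical): print 20 lines that all start with the given word.
gen() { i=1; while [ "$i" -le 20 ]; do echo "$1 $i"; i=$((i + 1)); done; }

gen alpha > a_old.txt
gen beta > b_old.txt
git add .
git commit -q -m "add both files"

# Case 1: rename and full rewrite in the same commit.
git mv a_old.txt a_new.txt
gen omega > a_new.txt
git commit -q -am "rename and rewrite a"

# Case 2: rename in its own commit, rewrite in a later one.
git mv b_old.txt b_new.txt
git commit -q -m "rename b"
gen gamma > b_new.txt
git commit -q -am "rewrite b"

combined=$(git log --oneline --follow -- a_new.txt | wc -l | tr -d ' ')
separate=$(git log --oneline --follow -- b_new.txt | wc -l | tr -d ' ')
echo "combined commit: $combined commit(s) followed; separate commits: $separate"
```

In the combined case the new file shares no content with the old one, so the history stops at the rename. In the separate case the rename commit is byte-identical and the history continues back to the original file.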