anatolyborodin / git-filter-branch-bug

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Usage

Clone the repository and run ./run.sh.

Explanation

This repository is a test case to illustrate a bug related to git replace, git filter-branch, and git index race condition avoidance mechanism.

The idea is to replace a huge binary file with a small blob in a repository containing many files. The original file aa/bb.dat is changed (added) only in the very first commit:

# git log master -p --stat -- aa/bb.dat

commit a119016e951d320c08017fe5a4f298f46fcba555
Author: Anatoly Borodin <anatoly.borodin@gmail.com>
Date:   2 hours ago

    Add 2 huge binary files
---
 aa/bb.dat | Bin 0 -> 41943040 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/aa/bb.dat b/aa/bb.dat
new file mode 100644
index 0000000..88a0f09
Binary files /dev/null and b/aa/bb.dat differ

After git replace and git filter-branch the file aa/bb.dat should be replaced with a short text in all commits. But due to the bug, some original blobs in some commits could stay untouched:

# ./run.sh

Rewrite 5805be0ecc4521dc30d111cb93105b3400b52c2b (5/5) (10 seconds passed, remaining 0 predicted)
Ref 'refs/heads/test' was rewritten
Deleted replace ref '88a0f09b9b2e4ccf2faec89ab37d416fba4ee79d'
commit 9a3507b4a7c9561f7e92ef4396d80cf2ccf94e7e
Author: Anatoly Borodin <anatoly.borodin@gmail.com>
Date:   77 minutes ago

    Add 8000 small files
---
 aa/bb.dat | Bin 47 -> 41943040 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/aa/bb.dat b/aa/bb.dat
index 16e0939..88a0f09 100644
Binary files a/aa/bb.dat and b/aa/bb.dat differ

commit 95ef828e3a007705162e275f689a9f9bb2f992ae
Author: Anatoly Borodin <anatoly.borodin@gmail.com>
Date:   2 hours ago

    Add 2 huge binary files
---
 aa/bb.dat | 1 +
 1 file changed, 1 insertion(+)

diff --git a/aa/bb.dat b/aa/bb.dat
new file mode 100644
index 0000000..16e0939
--- /dev/null
+++ b/aa/bb.dat
@@ -0,0 +1 @@
+This file was to big, and it has been removed.

As we can see, the rewritten commit 5805be0 -> 9a3507b still contains the original blob.

PS Because the bug depend on timestamps, ./run.sh can behave in a nondeterministic way: it's not always the Add 8000 small files commit with the huge blob after the git filter-branch run, and the script can even run correct sometimes. But most of the time I could reproduce the bug in the 2.7.0 version on an Ubuntu workstation and on a FreeBSD laptop.

About


Languages

Language:Shell 100.0%