stephenh / mirror

A tool for real-time, two-way sync for remote (e.g. desktop/laptop) development

Syncing .git folder

Bessonov opened this issue · comments

Hey @stephenh, your project looks very promising for use with my development docker image, to sync files between the local and remote environments.

I've read the Git Usage Note, but I think it isn't entirely true. AFAIK git doesn't keep any repo state outside of the .git folder. It seems to work great on a small test git repo. Everything is in sync: files, stash, .git/config, current branch, etc. But on a slightly larger repository it gets out of sync. I'm not familiar with inotify and mirror, but I think there is a bug when multiple files are exchanged the way git does it. I don't think this is a git-only issue. Maybe some events are missing? What do you think?

After some experiments I can reproduce it on a small repo. It misses some sort of delete-folder event:

Start server in server folder:

docker run --rm --init -it -u $(id -u):$(id -g) -v $(pwd):/data -p 49172:49172 quay.io/stephenh/mirror:1.3.3 server

Start client in client folder:

docker run --rm --init -it --network host -u $(id -u):$(id -g) -v $(pwd):/data quay.io/stephenh/mirror:1.3.3 client --debug-all --include '**' --local-root /data --remote-root /data --host localhost

Go to the client folder and run:

mkdir missing
cd missing/
git init
touch file
git add file
git commit -m 'init'
git checkout -b test
mkdir folder
touch removes folder/stays
git add removes folder/stays
git commit -m 'add file'

Everything is fine on remote and local system:

$ tree
.
├── file
├── folder
│   └── stays
└── removes

Now run git checkout master on local or remote. The removes file gets removed, but folder/stays stays on the other side:

$ tree
.
└── file

0 directories, 1 file

and

$ tree
.
├── file
└── folder
    └── stays

1 directory, 2 files

Client log:

2020-01-04 22:56:40 INFO  Queueing: path: "missing/.git/HEAD" modTime: 1578178600387 local: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/.git/logs/HEAD" modTime: 1578178600387 local: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/.git/index" modTime: 1578178600359 local: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/.git" modTime: 1578178600387 local: true directory: true executable: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/.git/index.lock" delete: true local: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/folder" delete: true local: true directory: true executable: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing" modTime: 1578178600359 local: true directory: true executable: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/folder/stays" delete: true local: true
2020-01-04 22:56:40 INFO  Queueing: path: "missing/removes" delete: true local: true
2020-01-04 22:56:40 INFO  missing/.git/HEAD isLocalNewer
2020-01-04 22:56:40 INFO    l: modTime: 1578178600387 local: true
2020-01-04 22:56:40 INFO    r: modTime: 1578178583227 local: true
2020-01-04 22:56:40 INFO  Sending missing/.../HEAD
2020-01-04 22:56:42 INFO  missing/removes isLocalNewer
2020-01-04 22:56:42 INFO    l: modTime: 1578178584227 delete: true local: true
2020-01-04 22:56:42 INFO    r: modTime: 1578178583227 local: true
2020-01-04 22:56:42 INFO  missing/.git/index isLocalNewer
2020-01-04 22:56:42 INFO    l: modTime: 1578178600359 local: true
2020-01-04 22:56:42 INFO    r: modTime: 1578178583227 local: true
2020-01-04 22:56:42 INFO  missing/.git/logs/HEAD isLocalNewer
2020-01-04 22:56:42 INFO    l: modTime: 1578178600387 local: true
2020-01-04 22:56:42 INFO    r: modTime: 1578178583227 local: true
2020-01-04 22:56:42 INFO  Sending (delete) missing/removes
2020-01-04 22:56:42 INFO  Sending missing/.../index
2020-01-04 22:56:42 INFO  Sending missing/.../HEAD

Server log:

2020-01-04 22:56:40 INFO  Remote update missing/.../HEAD
2020-01-04 22:56:42 INFO  Remote delete missing/removes
2020-01-04 22:56:42 INFO  Remote update missing/.../index
2020-01-04 22:56:42 INFO  Remote update missing/.../HEAD

As you can see, the deletion of stays and folder is queued but not sent. Sometimes, though, I even see:

2020-01-04 23:00:23 INFO  Sending (delete) missing/removes
2020-01-04 23:00:23 INFO  Sending (delete) missing/folder

And sometimes it even works.

I discovered that it works once if you are on the test branch and then start the server and then the client. If you switch to master, everything is fine. But if you switch back to test and then to master again, it doesn't work.

Log of the first switch:

Client:

2020-01-04 23:13:25 INFO  Queueing: path: "missing/.git/HEAD" modTime: 1578179605199 local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/.git/logs/HEAD" modTime: 1578179605199 local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/.git/HEAD.lock" delete: true local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/.git/index" modTime: 1578179605167 local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/.git" modTime: 1578179605199 local: true directory: true executable: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/.git/index.lock" delete: true local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/removes" delete: true local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/folder" delete: true local: true directory: true executable: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing/folder/stays" delete: true local: true
2020-01-04 23:13:25 INFO  Queueing: path: "missing" modTime: 1578179605167 local: true directory: true executable: true
2020-01-04 23:13:25 INFO  missing/.git/HEAD isLocalNewer
2020-01-04 23:13:25 INFO    l: modTime: 1578179605199 local: true
2020-01-04 23:13:25 INFO    r: modTime: 1578178871683 data: "initialSyncMarker" local: true
2020-01-04 23:13:25 INFO  Sending missing/.../HEAD
2020-01-04 23:13:27 INFO  missing/folder isLocalNewer
2020-01-04 23:13:27 INFO    l: modTime: 1578178872655 delete: true local: true directory: true executable: true
2020-01-04 23:13:27 INFO    r: modTime: 1578178809171 data: "initialSyncMarker" local: true directory: true executable: true
2020-01-04 23:13:27 INFO  missing/removes isLocalNewer
2020-01-04 23:13:27 INFO    l: modTime: 1578178872655 delete: true local: true
2020-01-04 23:13:27 INFO    r: modTime: 1578178871655 data: "initialSyncMarker" local: true
2020-01-04 23:13:27 INFO  missing/.git/index isLocalNewer
2020-01-04 23:13:27 INFO    l: modTime: 1578179605167 local: true
2020-01-04 23:13:27 INFO    r: modTime: 1578178871655 data: "initialSyncMarker" local: true
2020-01-04 23:13:27 INFO  missing/.git/logs/HEAD isLocalNewer
2020-01-04 23:13:27 INFO    l: modTime: 1578179605199 local: true
2020-01-04 23:13:27 INFO    r: modTime: 1578178871683 data: "initialSyncMarker" local: true
2020-01-04 23:13:27 INFO  Sending (delete) missing/folder
2020-01-04 23:13:27 INFO  Sending (delete) missing/removes
2020-01-04 23:13:27 INFO  Sending missing/.../index
2020-01-04 23:13:27 INFO  Sending missing/.../HEAD

Server:

2020-01-04 23:13:25 INFO  Remote update missing/.../HEAD
2020-01-04 23:13:27 INFO  Remote delete missing/folder
2020-01-04 23:13:27 INFO  Remote delete missing/removes
2020-01-04 23:13:27 INFO  Remote update missing/.../index
2020-01-04 23:13:27 INFO  Remote update missing/.../HEAD

Hm, there is some trouble with directories.

Go to the client and just create a folder, delete it, and then create it again with the same name:

mkdir newfolder

$ tree ~/client/ ~/remote/
/home/user/client/
└── newfolder
/home/user/remote/
└── newfolder

2 directories, 0 files

rmdir newfolder/

$ tree ~/client/ ~/remote/
/home/user/client/
/home/user/remote/

0 directories, 0 files

mkdir newfolder

$ tree ~/client/ ~/remote/
/home/user/client/
└── newfolder
/home/user/remote/

1 directory, 0 files

touch newfolder/test

$ tree ~/client/ ~/remote/
/home/user/client/
└── newfolder
    └── test
/home/user/remote/
└── newfolder
    └── test

2 directories, 2 files

rm -rf newfolder/

$ tree ~/client/ ~/remote/
/home/user/client/
/home/user/remote/
└── newfolder
    └── test

1 directory, 1 file

Good find on the directories-not-recreated bug. I probably hadn't noticed before b/c, similar to git, I doubt I'd noticed directories not showing up before they had files in them anyway.

I pushed out a fix in 1.3.4.

Let me know if you have similar easy/great repros as your mkdir/rmdir/mkdir example.

For your larger point around syncing the .git directory, yes, you are right that, contrary to the FAQ entry, I think it could work / probably does work in some (most?) scenarios.

My concern with recommending it (and using it myself) is I just don't trust different versions of git, and even more so different versions across different platforms (i.e. git on mac and git linux), to guarantee to use exactly the same backwards/forwards compatible format for every file in the .git directory.

That said, knowing a little bit about git's approach to text files and overall .git organization, I'm not surprised it works. I just didn't want to take responsibility for a) verifying which combinations of git versions x OS platforms did/did not work and b) having people get annoyed when it magically didn't work and somehow corrupted their .git directory on either the client side or remote side.

All that said, I'm happy to update the readme/disclaimer/etc. to note that syncing the .git dir can actually work if you're able to prove out the workflow (and now I'm pretty tempted to try it myself, especially since I use the exact same linux/git/etc. on both sides of my client/remote :-) ).

Let me know how it goes with the 1.3.4 release.

@stephenh great, thank you for your fast response and fix! It would be nice if you could release a new docker image.

The issue with recreating a directory is indeed fixed. But the branch-switch example still works only partially. If I just switch between the test and master branches, it works every odd time and fails every even time :)

Same behavior with just

mkdir test && touch test/test

and

rm -rf test/

Well, jeez, I've been using mirror almost daily for years and never noticed this. Thanks for another great repro. I fiddled with how deletions are handled (they're a little different b/c they don't have mod times).

Pushed out 1.3.6. I'm not necessarily confident that your .git workflow will work, but I think we should be closer. See how it goes?

I'm working on the docker build... a contributor set up the quay build, so I'm kind of learning/re-learning how that works. I think I'd accidentally revoked its github access, but now it's back, and failing on some misc things...

Unfortunately, the directory isn't deleted now:

tree ~/client/ ~/remote/ && rmdir test/ && sleep 2 && tree ~/client/ ~/remote/

/home/user/client/
├── missing
│   └── file
└── test
/home/user/remote/
├── missing
│   └── file
└── test

4 directories, 2 files
/home/user/client/
├── missing
│   └── file
└── test
/home/user/remote/
├── missing
│   └── file
└── test

4 directories, 2 files

It seems fine for me?

$ mkdir remote/test && sleep 2 && tree && rmdir remote/test && sleep 2 && tree
.
├── client
│   └── test
└── remote
    └── test

4 directories, 0 files
.
├── client
└── remote

2 directories, 0 files

Also using mkdir -p and nested dirs seems to work:

$ mkdir -p remote/test/test && sleep 2 && tree && rm -fr remote/test && sleep 2 && tree
.
├── client
│   └── test
│       └── test
└── remote
    └── test
        └── test

6 directories, 0 files
.
├── client
└── remote

2 directories, 0 files

(Nice idea on the mkdir/sleep/tree-based repros.)
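The mkdir/sleep/tree-style repros in this thread can also be scripted so that the "works every odd time" pattern shows up without eyeballing tree output. The following is a minimal sketch, not part of mirror: the helper names (snapshot, in_sync, repro_rounds) and the settle delay are assumptions made for illustration, and it only compares path names, not file contents. It expects a mirror client and server to already be running against the two roots.

```python
import os
import shutil
import time


def snapshot(root):
    """Return the sorted relative paths of every file and directory under root."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(paths)


def in_sync(client_root, remote_root):
    """True when both trees contain exactly the same relative paths."""
    return snapshot(client_root) == snapshot(remote_root)


def repro_rounds(client_root, remote_root, rounds=10, settle=2.0):
    """Repeat the create/delete repro on the client side and check the remote.

    Each round mirrors `mkdir test && touch test/test` followed by
    `rm -rf test/`, waiting `settle` seconds after each step (an assumed
    delay, like the `sleep 2` used in the thread). Returns the
    (round, step) pairs where the two trees disagreed.
    """
    failures = []
    target = os.path.join(client_root, "test")
    for i in range(1, rounds + 1):
        os.makedirs(target, exist_ok=True)
        open(os.path.join(target, "test"), "w").close()  # touch test/test
        time.sleep(settle)
        if not in_sync(client_root, remote_root):
            failures.append((i, "create"))
        shutil.rmtree(target)  # rm -rf test/
        time.sleep(settle)
        if not in_sync(client_root, remote_root):
            failures.append((i, "delete"))
    return failures
```

Calling repro_rounds with the two synced roots (for example the ~/client and ~/remote directories used earlier, expanded with os.path.expanduser) returns an empty list when every round stayed in sync. A result with failures only on alternating rounds would match the odd/even behavior described above.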

Strange, I can reproduce it locally. I'll try with the original image.

BTW, the appropriate image tag is missing.

For some reason quay wasn't picking up new tags, but I just added a new 1.3.8 version tag, and now it's building that ... so not sure what I missed before. Thanks for mentioning it; once that build is finished AFAICT both new-release-tag and latest label docker builds will be working.

Strange, I can reproduce it locally

Locally as in without docker, or with docker? I haven't been using docker so far to reproduce/fix the issues you've found. I see you gave the full server/client docker commands you had above, so I'll try that later today if I get a chance.

once that build is finished AFAICT both new-release-tag and latest label docker builds will be working.

Thanks!

Locally as in without docker, or with docker?

With docker, with client and server on the same host. Maybe I hadn't updated something, but with latest I was not able to reproduce the issue 👍

But twice I ended up in a desynchronized state, and I can't find the steps to reproduce it, even with some stress tests (while true: steps).

Since the original issue is resolved, I'm closing this ticket.

Thank you very much for the great support!

Np, thanks for following up!

@Bessonov fwiw you've tempted me into trying this workflow out.

I always figured "there be dragons" with the sort of high-rate-of-churn stuff git would do in the .git/ directory, even from a simple git pull or git fetch command (i.e. mirror's approach is admittedly timestamp / semi-heuristically based and not "we're keeping a change data capture log of both sides' file system events and merging them with some sort of proven-to-be-always-right operational transform / CRDT data structure").
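To make the "timestamp based" idea concrete, here is a minimal sketch of a newest-wins comparison. It is not mirror's actual code: the Entry class, the resolve function, and the tie-breaking rule (equal timestamps keep the remote copy) are assumptions for illustration. The field names are loosely modeled on the modTime and delete values printed next to isLocalNewer in the client logs earlier in this thread, and the first two examples reuse timestamps from those logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One side's view of a path (hypothetical model, not mirror's data type)."""
    path: str
    mod_time: int            # milliseconds since the epoch
    delete: bool = False
    directory: bool = False


def is_local_newer(local: Entry, remote: Entry) -> bool:
    """Newest-wins rule: the local change counts only if its mod time is later."""
    return local.mod_time > remote.mod_time


def resolve(local: Entry, remote: Entry) -> str:
    """Describe what a newest-wins sync would do for one path."""
    if is_local_newer(local, remote):
        return "send delete" if local.delete else "send update"
    return "keep remote"


# Values taken from the client log above for missing/removes and missing/.git/HEAD.
print(resolve(Entry("missing/removes", 1578178584227, delete=True),
              Entry("missing/removes", 1578178583227)))
print(resolve(Entry("missing/.git/HEAD", 1578178600387),
              Entry("missing/.git/HEAD", 1578178583227)))

# Hypothetical values: if the remote timestamp is later, the local delete loses.
print(resolve(Entry("missing/folder", 1000, delete=True, directory=True),
              Entry("missing/folder", 2000, directory=True)))
```

The design trade-off is the one described in the comment above: a rule like this needs no history of events, but it is only as reliable as the timestamps it compares.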

That said, it does seem to be working. :-) I'll at least be more likely to flush out bugs by using it personally.

Also, I did observe a bug that happened often enough that it should be reproducible, where deleting a directory with N levels of files would not fully delete the directory, because deleting files on the remote side causes the parent directory modtimes to get bumped, so the remote side thinks its "bumped-b/c-I-just-did-a-delete" directories are newer than the local side's timestamps. It's not a huge deal because it shouldn't affect files, just directories, but it's still annoying. I'll file a bug and poke at it at some point.
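The filesystem behavior behind this bug is easy to observe on its own, independent of mirror. The sketch below (the function name is made up for illustration) backdates a directory's mtime, deletes a file inside it, and reads the mtime again. On POSIX filesystems the unlink updates the parent directory's modification time, which is the "bump" described above.

```python
import os
import tempfile


def delete_bumps_parent_mtime():
    """Return (before, after) mtimes of a directory around deleting a file in it."""
    with tempfile.TemporaryDirectory() as root:
        parent = os.path.join(root, "folder")
        os.mkdir(parent)
        child = os.path.join(parent, "stays")
        open(child, "w").close()

        # Backdate the directory so the bump is visible even on filesystems
        # with coarse timestamp resolution.
        old = 1_000_000_000  # an arbitrary time in the past (September 2001)
        os.utime(parent, (old, old))
        before = os.stat(parent).st_mtime

        os.remove(child)  # removing an entry modifies the directory itself
        after = os.stat(parent).st_mtime
        return before, after


before, after = delete_bumps_parent_mtime()
print("directory mtime before delete:", before)
print("directory mtime after delete: ", after)
print("bumped:", after > before)
```

With a purely timestamp-based comparison, a directory whose mtime was bumped this way can look newer than the delete event that caused the bump, which matches the symptom of directories surviving a recursive delete.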

@stephenh thanks for letting me know!

Also, I did observe a bug that happened often enough that it should be reproducible, where deleting a directory with N levels of files would not fully delete the directory, because deleting files on the remote side causes the parent directory modtimes to get bumped, so the remote side thinks its "bumped-b/c-I-just-did-a-delete" directories are newer than the local side's timestamps.

Well, that sounds like the bug I described above but was not able to reproduce.