Files are repeatedly transferred
fezzzza opened this issue
Every time a "mirror client" command is run, it seems that a large number of files, presumably all files "visible" to mirror (see #22), are transferred one way or another from server to client, updating the ctime on the files.
For example, on the server:
root@www-1:/var/www/html/edge/i/flags# ls -al Finland.png
-rw-r--r-- 1 root root 3036 Oct 27 21:38 Finland.png
root@www-1:/var/www/html/edge/i/flags# stat Finland.png
File: Finland.png
Size: 3036 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 308285 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-10-27 22:44:48.945791967 +0100
Modify: 2018-10-27 21:38:38.372526215 +0100
Change: 2018-10-27 21:38:38.372526215 +0100
Birth: -
whereas on the client:
me@devBox:/var/www/html/edge/i/flags$ ls -al Finland.png
-rw-r--r-- 1 ferenc ferenc 3036 Oct 27 21:38 Finland.png
me@devBox:/var/www/html/edge/i/flags$ stat Finland.png
File: Finland.png
Size: 3036 Blocks: 8 IO Block: 4096 regular file
Device: 805h/2053d Inode: 1835299 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ ferenc) Gid: ( 1000/ ferenc)
Access: 2018-10-27 21:38:38.372000000 +0100
Modify: 2018-10-27 21:38:38.372000000 +0100
Change: 2018-10-28 16:48:06.759671332 +0000
Birth: -
I don't know what the basis is for your file comparison, but surely this can be avoided by using mtime as a point of comparison and writing a synchronised mtime at the same time as writing the file, or even just updating the mtime on the file after the write?
I rely on the mtime to see the latest files I am working on, and don't really want the mtime interfered with by any external process.
I am running Linux Mint 19 (~Ubuntu 18 bionic) without watchman installed (see #20).
mtime is what is used for comparison, and if mtime is different, the latest mtime wins, and then the opposite side has the file written + mtime set to match, exactly as you're reasoning about:
https://github.com/stephenh/mirror/blob/master/src/main/java/mirror/SaveToLocal.java#L103
https://github.com/stephenh/mirror/blob/master/src/main/java/mirror/NativeFileAccessUtils.java#L21
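A minimal sketch of the "latest mtime wins, then write the file and set mtime to match" idea, using only standard java.nio calls. This is not the code behind the links above; the class and method names (MtimeSyncSketch, isRemoteNewer, saveRemote) are made up for illustration, and the sketch assumes modTime is carried as epoch milliseconds, as in the logs in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class MtimeSyncSketch {
  /** Hypothetical helper: the remote copy wins only if its mtime is strictly later. */
  static boolean isRemoteNewer(long localMillis, long remoteMillis) {
    return remoteMillis > localMillis;
  }

  /** Hypothetical helper: write the remote bytes, then stamp the file with the remote mtime. */
  static void saveRemote(Path target, byte[] data, long remoteMillis) throws IOException {
    Files.write(target, data);
    // Set mtime after the write; the write itself would otherwise leave "now" as the mtime.
    Files.setLastModifiedTime(target, FileTime.fromMillis(remoteMillis));
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("mirror-sketch", ".bin");
    long remoteMillis = 1540672718372L; // modTime value taken from the logs in this thread
    saveRemote(tmp, new byte[] {1, 2, 3}, remoteMillis);
    long localMillis = Files.getLastModifiedTime(tmp).toMillis();
    // On a filesystem that stores at least millisecond precision, the two values now
    // match, so a second comparison should not trigger another transfer.
    System.out.println("local=" + localMillis + " remoteNewer=" + isRemoteNewer(localMillis, remoteMillis));
    Files.delete(tmp);
  }
}
```

Note that setting mtime this way still changes the file's ctime, since ctime tracks metadata changes as well as content changes.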
Modify: 2018-10-27 21:38:38.372526215 +0100
Modify: 2018-10-27 21:38:38.372000000 +0100
I did have a bug where Java (or maybe even POSIX, I forget) could only set mtime with a resolution of milliseconds or what not. So the two sides would always look out of sync, because the "set mtime to X" was always nanoseconds/whatever off.
Here is the commit that fixed that:
I wonder if you're seeing some variation of that.
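To illustrate the kind of precision mismatch described here, the following sketch builds the two Modify timestamps quoted from the stat output (…38.372526215 on the server, …38.372000000 on the client) and compares them at full precision and at millisecond precision. The class and method names (MtimePrecisionSketch, sameAtMillis) are hypothetical, and this is not the fix from the commit mentioned above, only a demonstration of why comparing at mismatched resolutions can make identical files look out of sync.

```java
import java.nio.file.attribute.FileTime;
import java.time.Instant;

public class MtimePrecisionSketch {
  /** Hypothetical helper: treat two mtimes as equal if they agree to the millisecond. */
  static boolean sameAtMillis(FileTime a, FileTime b) {
    return a.toMillis() == b.toMillis();
  }

  public static void main(String[] args) {
    // 2018-10-27 21:38:38 +0100 is epoch second 1540672718.
    FileTime server = FileTime.from(Instant.ofEpochSecond(1540672718L, 372526215L));
    FileTime client = FileTime.from(Instant.ofEpochSecond(1540672718L, 372000000L));

    // Full-precision comparison sees the sub-millisecond difference.
    System.out.println(server.equals(client)); // false
    // Truncated to milliseconds, both are 1540672718372.
    System.out.println(sameAtMillis(server, client)); // true
  }
}
```

Both timestamps truncate to 1540672718372, which is the modTime reported in the logs further down in this thread.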
mirror client has a --debug-prefixes option: if you include one of the file paths that is doing this, it should output some hopefully useful information about why it decided the file needs to be synced.
Here's the relevant portion of the client log just for that file:
2018-10-29 22:46:37 INFO edge/i/flags/Finland.png gitIgnored=false, extraIncluded=false, extraExcluded=false
2018-10-29 22:46:37 INFO edge/i/flags/Finland.png isRemoteNewer
2018-10-29 22:46:37 INFO l: null
2018-10-29 22:46:37 INFO r: modTime: 1540672718372 data: "\357\277\275PNG\r\n\032\n\000\000\000\rIHDR\000\000\000\024\000\000\000\f\b\002\000\000\000\357\277\275n\n\357\277\275\000\000\000\tpHYs\000\000-\357\277\275\000\000..."
2018-10-29 22:46:37 INFO Remote update edge/.../Finland.png
and on the server:
2018-10-29 22:46:37 INFO edge/i/flags/Finland.png gitIgnored=false, extraIncluded=false, extraExcluded=false
2018-10-29 22:46:37 INFO edge/i/flags/Finland.png isLocalNewer
2018-10-29 22:46:37 INFO l: modTime: 1540672718372 local: true
2018-10-29 22:46:37 INFO r: null
...
2018-10-29 22:46:37 INFO Sending edge/.../Finland.png
Then I installed watchman on the client and updated the client to v1.2.1:
2018-10-29 23:13:26 INFO Queueing: path: "edge/i/flags/Finland.png" modTime: 1540672718372 local: true
--
2018-10-29 23:13:29 INFO edge/i/flags/Finland.png gitIgnored=false, extraIncluded=false, extraExcluded=false
--
2018-10-29 23:13:51 INFO Queueing: path: "edge/i/flags/Finland.png" modTime: 1540672718372 local: true
--
2018-10-29 23:13:53 INFO edge/i/flags/Finland.png gitIgnored=false, extraIncluded=false, extraExcluded=false
--
2018-10-29 23:14:01 INFO edge/i/flags/Finland.png isLocalNewer
2018-10-29 23:14:01 INFO l: modTime: 1540672718372 local: true
2018-10-29 23:14:01 INFO r: null
--
2018-10-29 23:14:28 INFO Sending edge/.../Finland.png
--
Still the file is transferred.
Then, after installing watchman and updating the executable on the server to v1.2.1:
server side with 1.2.1 + watchman:
--
2018-10-30 00:51:41 INFO Queueing: path: "edge/i/flags/Finland.png" modTime: 1540672718372 local: true
--
2018-10-30 00:51:43 INFO edge/i/flags/Finland.png gitIgnored=false, extraIncluded=false, extraExcluded=false
client side:
--
2018-10-30 00:51:11 INFO Queueing: path: "edge/i/flags/Finland.png" modTime: 1540672718372 local: true
--
2018-10-30 00:51:13 INFO edge/i/flags/Finland.png gitIgnored=false, extraIncluded=false, extraExcluded=false
--
So now nothing seems to be transferred.
I think that's a pretty compelling argument for installing watchman!
Oddly, one file bounced quickly back-and-forth between the servers when I changed the file contents and saved it:
client side:
2018-10-30 01:24:58 INFO Client has 2314 paths
2018-10-30 01:25:15 INFO Server has 2315 paths
2018-10-30 01:25:15 INFO Tree populated
2018-10-30 01:26:33 INFO Queueing: path: "dev/panic_fw.php" modTime: 1540862793272 local: true
2018-10-30 01:26:33 INFO dev/panic_framework.php isLocalNewer
2018-10-30 01:26:33 INFO l: modTime: 1540862793272 local: true
2018-10-30 01:26:33 INFO r: modTime: 1540862407222 data: "initialSyncMarker" local: true
2018-10-30 01:26:33 INFO Sending dev/panic_fw.php
2018-10-30 01:26:36 INFO dev/panic_framework.php isRemoteNewer
2018-10-30 01:26:36 INFO l: modTime: 1540862793272 local: true
2018-10-30 01:26:36 INFO r: modTime: 1540862795427 data: "<?php\n/**\n ... * \n..."
2018-10-30 01:26:36 INFO Remote update dev/panic_fw.php
2018-10-30 01:26:36 INFO Queueing: path: "dev/panic_fw.php" modTime: 1540862795427 local: true
server side:
2018-10-30 01:25:14 INFO Server has 2315 paths
2018-10-30 01:25:14 INFO Client has 2314 paths
2018-10-30 01:25:14 INFO Tree populated
2018-10-30 01:25:15 INFO Sending dev/scripts
2018-10-30 01:26:34 INFO Remote update dev/panic_fw.php
2018-10-30 01:26:36 INFO Sending dev/panic_fw.php
2018-10-30 01:26:38 INFO Remote update dev/panic_fw.php
...though this has only happened once and it has not recurred during my very short test. Much seems to be working as it should.
2018-10-29 23:14:01 INFO r: null
Those r: nulls suggest the client (l) thought the server (r) didn't have the file, so that is why it sent it...
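As a rough illustration of why a null remote entry leads to a send, here is a sketch of a comparison that treats a missing side as "older than anything". It is an assumption about the general shape of such a check, not mirror's actual code; NullRemoteSketch and this isLocalNewer signature are hypothetical, with modTime again assumed to be epoch milliseconds.

```java
public class NullRemoteSketch {
  /**
   * Hypothetical check: null means "this side has no record of the file".
   * A local file with no known remote counterpart counts as newer and gets sent.
   */
  static boolean isLocalNewer(Long localModTime, Long remoteModTime) {
    if (localModTime == null) {
      return false; // nothing local to send
    }
    if (remoteModTime == null) {
      return true; // remote unknown, so local wins regardless of its mtime
    }
    return localModTime > remoteModTime;
  }

  public static void main(String[] args) {
    // Mirrors the logged case: l has modTime 1540672718372, r is null.
    System.out.println(isLocalNewer(1540672718372L, null)); // true
    // Once both sides know the same mtime, nothing needs sending.
    System.out.println(isLocalNewer(1540672718372L, 1540672718372L)); // false
  }
}
```

Under this assumption, a file that one side fails to discover would be resent on every run even when its mtime is identical on both sides.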
Maybe this ties in with your other bug where, for some reason, the Java/non-watchman codepath is just blithely ignoring/not finding files, so they either don't sync or resync or what not.
Sounds like I should just hard-delete the Java watcher. I've only used the watchman-based codepaths for ~years now so really don't know how well the Java-only one does/does not work...
Pre-emptively closing with the plan to "fix" via #25.