glandium / git-cinnabar

git remote helper to interact with mercurial repositories

git cinnabar fsck fails on searchfox indexer's gecko-dev checkout with `Sha1 mismatch for file browser/config/version.txt`

asutherland opened this issue

Basically reposting the details of https://bugzilla.mozilla.org/show_bug.cgi?id=1716167#c4 cc @staktrace

Here's the cinnabar version dump:

0.5.8a
module-hash: ce3b0259e8dc6f943528e175597a3152f5cf1a24
helper-hash: b8923f9ad6cd68864e7cf0eec5e88a00aafcda84

Here's the log excerpt; the fast-import crash report is uploaded to Bugzilla at https://bugzilla.mozilla.org/attachment.cgi?id=9227980

Checking 235 changeset heads
Loading 651889 manifests
Checking 365 manifest heads
Checking 22346 filesfatal: Missing data
fast-import: dumping crash report to git/.git/fast_import_crash_14440
Checking 22447 filesTraceback (most recent call last):
  File "/home/ubuntu/git-cinnabar/cinnabar/util.py", line 999, in run
    retcode = func(args)
  File "/home/ubuntu/git-cinnabar/cinnabar/cmd/fsck.py", line 376, in fsck
    return fsck_quick(args.force)
  File "/home/ubuntu/git-cinnabar/cinnabar/cmd/fsck.py", line 302, in fsck_quick
    if not GitHgHelper.check_file(hg_file, *hg_fileparents):
  File "/home/ubuntu/git-cinnabar/cinnabar/helper.py", line 311, in check_file
    with self.query(b'check-file', hg_sha1, *parents) as stdout:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/ubuntu/git-cinnabar/cinnabar/helper.py", line 198, in query
    wrapper.write(b'%s %s\n' % (name, b' '.join(args)))
IOError: [Errno 32] Broken pipe
Sha1 mismatch for file browser/config/version.txt
  revision eb289cb40f790c0a612b2db711ce3c336f97a840
  with parent 00705a34f53c9c6808691de3686bdd1c71c5b6f3

Can you create a git bundle of refs/cinnabar/metadata and put it somewhere I can download it?
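For reference, a bundle containing only that ref can be produced with `git bundle create`. The following is a minimal sketch, not anything from git-cinnabar itself: the helper names and the output file name `cinnabar-metadata.bundle` are hypothetical, and the repository path is whatever your checkout uses.

```python
import subprocess


def bundle_command(repo_dir, out_file, ref="refs/cinnabar/metadata"):
    """Build the argv that bundles a single ref from the repo at repo_dir.

    Because of `git -C`, a relative out_file is resolved inside repo_dir.
    """
    return ["git", "-C", repo_dir, "bundle", "create", out_file, ref]


def create_bundle(repo_dir, out_file="cinnabar-metadata.bundle"):
    # Hypothetical wrapper: runs git and raises CalledProcessError on failure,
    # for example when the ref does not exist in the repository.
    subprocess.run(bundle_command(repo_dir, out_file), check=True)
```

The resulting file can be uploaded anywhere and fetched from on the other side like a regular remote.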

I've also invited you to the people.mozilla.org searchfox LDAP group so that you can work directly on one of the VMs if you want. The instructions are at https://github.com/mozsearch/mozsearch/blob/master/docs/aws.md#setting-up-aws-locally. mozilla-central should always be checked out on web-servers with tags like 'cfile: config1.json', 'channel: release1'. Please don't use the most recently started web-server, to avoid impacting the site; the older ones are just backups. The git repo can be found at ~/index/mozilla-central/git on those machines. The caveat is that the web-servers are not very powerful and are backed by S3, so I/O isn't great.

The indexers do use SSDs, though, and can be triggered manually (ideally on the "dev" channel). If you see one alive, you can just poke around on it too: the data lives in /mnt/index-scratch/mozilla-central/ until the run completes, at which point it moves onto /index (S3), with symlinks created as part of the migration.

OK, so this is a side effect of how grafting worked in the past, combined with #249 and the fact that gecko-dev is missing the esr10 branch. Adding esr10 to gecko-dev would fix the problem. Alternatively, you can add a remote for esr10 to your repo, as you have for the other esr branches, then run `git remote update esr10`, and fsck will work after that. Now that gecko-dev uses git-cinnabar, you won't end up with different git sha1s even if gecko-dev adds esr10 later.
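The local workaround can be sketched as a short sequence of git commands. This is only an illustration: the remote URL below is an assumption (the usual hg.mozilla.org releases path with git-cinnabar's `hg::` prefix) and should be checked against how the other esr remotes are configured; the helper names are hypothetical.

```python
import subprocess

# Assumption: esr10 follows the same URL pattern as the other release repos.
ESR10_URL = "hg::https://hg.mozilla.org/releases/mozilla-esr10"


def esr10_fix_commands(repo_dir, url=ESR10_URL):
    """Return the git invocations for the workaround, in order."""
    git = ["git", "-C", repo_dir]
    return [
        git + ["remote", "add", "esr10", url],   # add the missing branch's repo
        git + ["remote", "update", "esr10"],      # fetch it through git-cinnabar
        git + ["cinnabar", "fsck"],               # re-run the check
    ]


def apply_esr10_fix(repo_dir):
    # Stops at the first failing command (CalledProcessError).
    for argv in esr10_fix_commands(repo_dir):
        subprocess.run(argv, check=True)
```

Building the command list separately from running it makes it easy to print and review the commands before touching a multi-gigabyte repository.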

https://s3.us-west-2.amazonaws.com/searchfox.repositories/gecko.tar (8.3G) is the full "git" directory that searchfox downloads in each config1 indexer run, updates, and then re-uploads.

Note that you could save a lot in the size of that tar if you didn't include the checkout, which you obviously can do manually after downloading the contents of the .git directory.
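As an illustration of that suggestion, here is a minimal sketch that archives only the `.git` directory of a checkout using Python's standard `tarfile` module. It is not the searchfox upload script; the function name and paths are hypothetical.

```python
import os
import tarfile


def tar_git_dir_only(repo_dir, out_path):
    """Write a tar containing only repo_dir/.git, skipping the working tree."""
    git_dir = os.path.join(repo_dir, ".git")
    # Keep the top-level directory name so the archive unpacks to <name>/.git
    top = os.path.basename(os.path.normpath(repo_dir))
    with tarfile.open(out_path, "w") as tar:
        tar.add(git_dir, arcname=os.path.join(top, ".git"))
    return out_path
```

After unpacking such an archive, the working tree can be recreated from the repository data, for example with `git reset --hard` inside the unpacked directory.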

Thanks very much for the analysis and solution!

> Note that you could save a lot in the size of that tar if you didn't include the checkout, which you obviously can do manually after downloading the contents of the .git directory.

Yeah, that's an inefficiency we could optimize in https://github.com/mozsearch/mozsearch-mozilla/blob/master/mozilla-central/upload