thblt / write-yourself-a-git

Learn Git by reimplementing it from scratch

Home Page:https://wyag.thb.lt

ls-tree - Throw exception when the hash starts with "0"

cuong-nguyenduy opened this issue

I have a Git repository with the following tree:

$ git ls-tree -r -t HEAD
100644 blob 493ea4464a7d6a859a41fbe715e9b94b985c6821    README
100644 blob 35a5f58b93bbbed9d8a500194f07b3a7db257661    Timestamp
040000 tree d49390b6e763b7a1ecdc9760ff2ca56e70623321    build
100644 blob b45a037cb8ff4a41e7175423eca4b46878b610ab    build/build.c
040000 tree 04294b6fcc42fa918540bdebf3c39a1c101261df    build/dist
100644 blob e965047ad7c57865823c7d992b1d046ea66edf78    build/dist/out
100644 blob db004deb6868fe44abbb9b2dc1aea9b57a6be5b8    feature01
100644 blob 16f9586f2a7e7af95e0fc70da82cae9e30967258    feature02
040000 tree 0159bb70fe4d813027076d25b7747a1e506ca620    src
040000 tree 3f0375b79b9e445c86be23cef6941170194a364f    src/libs
100644 blob 0d76a16e5cd889bd64ddd006a5e4cb75b8b81c54    src/libs/lib1.c
100644 blob d751e9dbb2a921d46c14dfd474e5f924dd0fc2ed    src/src.c

ls-tree worked fine with d49390b6e763b7a1ecdc9760ff2ca56e70623321

$ wyag.py ls-tree d49390b6e763b7a1ecdc9760ff2ca56e70623321
100644 blob b45a037cb8ff4a41e7175423eca4b46878b610ab    build.c
040000 tree 04294b6fcc42fa918540bdebf3c39a1c101261df    dist

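However, ls-tree cannot read d49390b6e763b7a1ecdc9760ff2ca56e70623321 properly, because the hash of its second entry (04294b6fcc42fa918540bdebf3c39a1c101261df) starts with "0". The leading zero is dropped, so the object is looked up under objects/42/ instead of objects/04/: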

$ wyag.py ls-tree d49390b6e763b7a1ecdc9760ff2ca56e70623321
100644 blob b45a037cb8ff4a41e7175423eca4b46878b610ab    build.c
Traceback (most recent call last):
  File "/home/dcnguyen/projects/wyag/wyag.py", line 4, in <module>
    libwyag.main()
  File "/home/dcnguyen/projects/wyag/libwyag.py", line 101, in main
    elif args.command == "ls-tree"     : cmd_ls_tree(args)
  File "/home/dcnguyen/projects/wyag/libwyag.py", line 578, in cmd_ls_tree
    object_read(repo, item.sha).obj_type.decode("ascii"),
  File "/home/dcnguyen/projects/wyag/libwyag.py", line 273, in object_read
    with open(path, "rb") as file:
FileNotFoundError: [Errno 2] No such file or directory: '/home/dcnguyen/projects/test/.git/objects/42/94b6fcc42fa918540bdebf3c39a1c101261df'
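
The truncated path in the traceback can be reproduced in isolation. The sketch below applies the same hex(int.from_bytes(...)) conversion that tree_parse_one uses to the hash of build/dist from the listing above. The object_path line is only an illustrative stand-in for how the path is derived (first two hex digits as the directory), not wyag's actual code.

```python
# The 20 raw bytes of the build/dist hash, as they would be stored in a tree object
raw_sha = bytes.fromhex("04294b6fcc42fa918540bdebf3c39a1c101261df")

# Same conversion as in tree_parse_one: hex() knows nothing about the
# expected width, so leading zero digits are silently lost
sha = hex(int.from_bytes(raw_sha, "big"))[2:]

# Illustrative stand-in (assumption): directory = first two hex digits
object_path = f"objects/{sha[:2]}/{sha[2:]}"

print(len(sha))      # 39 instead of 40
print(object_path)   # objects/42/94b6fcc42fa918540bdebf3c39a1c101261df
```

The result has 39 characters, and every digit shifts left by one position, which is why the lookup lands in objects/42/.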

I fixed it by left-padding the sha with "0" when it is shorter than 40 characters:

def tree_parse_one(raw, start=0):
    """Parse one tree entry from raw at offset start; return (next offset, GitTreeLeaf)."""

    # Find the space terminator of the mode
    x = raw.find(b' ', start)
    assert (x - start == 5 or x - start == 6)

    # Read the mode
    mode = raw[start:x]

    # Find the NULL terminator of the path
    y = raw.find(b'\x00', x)
    # and read the path
    path = raw[x + 1:y]

    # Read the SHA and convert to a hex string
    sha = hex(
        int.from_bytes(raw[y + 1: y + 21], "big")
    )[2:]  # hex() adds '0x' in front - we don't want that

    # Padding 0 if needed
    if len(sha) < 40:
        for _ in range(40 - len(sha)):
            sha = "0" + sha

    return y + 21, GitTreeLeaf(mode, path, sha)

Thanks a lot! I think a cleaner solution would be to use

format(int_val, "040x")

instead of hex()

I'll fix this when I find the time.
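
A minimal sketch of the format() suggestion, for reference. The helper name sha_to_hex is hypothetical and not part of wyag; it only isolates the conversion step. The "040x" format spec means lowercase hex, zero-padded to a width of 40.

```python
def sha_to_hex(raw_sha: bytes) -> str:
    """Hypothetical helper (not in wyag): 20 raw SHA-1 bytes -> 40 hex digits."""
    int_val = int.from_bytes(raw_sha, "big")
    # "040x": lowercase hexadecimal, left-padded with zeros to 40 characters
    return format(int_val, "040x")


# Hash of build/dist from the issue, which starts with "0"
raw_sha = bytes.fromhex("04294b6fcc42fa918540bdebf3c39a1c101261df")
padded = sha_to_hex(raw_sha)

print(padded)       # 04294b6fcc42fa918540bdebf3c39a1c101261df
print(len(padded))  # 40
```

As an aside not raised in the thread, bytes.hex() gives the same string directly from the raw bytes, without going through an integer.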