skvadrik / re2c

Lexer generator for C, C++, Go and Rust.

Home Page: https://re2c.org

Use full hashes in changelog

SuperSandro2000 opened this issue

The latest release notes on https://re2c.org/releases/changelog/changelog.html#id1 contain a lot of short hashes. This is problematic because GitHub shares commit hashes between forks, so a fork could trivially brute-force commits with the same short prefix and thereby break the links. If GitHub cannot identify which commit a short hash belongs to, it just returns a 404.

For example, NixOS/nixpkgs@27250f7 works, but NixOS/nixpkgs@27250f already does not.

not on

what do you mean, is this a typo?

I didn't use the full hashes because they clutter the changelog too much. On the website it is possible to make them look short but still use the full hash in the link, so that's not a problem.
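A minimal sketch of that idea, assuming the changelog links are written in reStructuredText-style markup (an assumption; the actual website source may differ). The helper name `changelog_link`, its parameters, and the sample hash are hypothetical. Only the GitHub URL shape `https://github.com/<owner>/<repo>/commit/<sha>` is taken as given.

```python
def changelog_link(full_hash: str, repo: str = "skvadrik/re2c", digits: int = 7) -> str:
    """Return a link whose visible text is a short hash but whose target
    uses the full hash, so the link cannot become ambiguous later.

    The reStructuredText output format is an assumption for illustration.
    """
    hex_digits = "0123456789abcdef"
    if len(full_hash) != 40 or any(c not in hex_digits for c in full_hash):
        raise ValueError("expected a full 40-character lowercase hex SHA-1")
    url = f"https://github.com/{repo}/commit/{full_hash}"
    # Short prefix for readability, full hash in the target URL.
    return f"`{full_hash[:digits]} <{url}>`_"


# Made-up hash, not a real re2c commit.
sample = "0123456789abcdef0123456789abcdef01234567"
print(changelog_link(sample))
```

The text version of the changelog would still need its own choice of abbreviation length, since it has no link target to fall back on.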

The Linux kernel did a bit of back-of-the-napkin math in 2010 on a reasonable abbreviation length for reasonably sized repos: https://lkml.org/lkml/2010/10/28/287
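One common way to do this kind of estimate is the birthday bound. The sketch below is not necessarily the calculation from the linked post. It assumes hashes behave like uniformly random hex strings and only models accidental collisions, not a fork deliberately brute-forcing a prefix. The function names and the object count of 100,000 are hypothetical.

```python
import math


def expected_colliding_pairs(num_objects: int, digits: int) -> float:
    """Expected number of object pairs sharing the same hex prefix,
    assuming uniformly random hashes (an idealisation)."""
    pairs = num_objects * (num_objects - 1) / 2
    return pairs / 16 ** digits


def collision_probability(num_objects: int, digits: int) -> float:
    """Poisson approximation of the chance that at least one pair collides."""
    return 1.0 - math.exp(-expected_colliding_pairs(num_objects, digits))


# Hypothetical repository size, chosen only for illustration.
for digits in (7, 9, 12):
    p = collision_probability(100_000, digits)
    print(f"{digits} digits: P(collision) ~ {p:.3g}")
```

Each extra hex digit divides the expected number of colliding pairs by 16, which is why a few more digits buy a lot of headroom.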

commented

not on

*notes on

I didn't use the full hashes because they clutter the changelog too much. On the website it is possible to make them look short but still use the full hash in the link, so that's not a problem.

that would totally work.

Just as an example of how easy it is to generate commits with certain prefixes: NixOS/nixpkgs@222222b or NixOS/nixpkgs@ddddcff

Just as an example of how easy it is to generate commits with certain prefixes

I like the idea of 12-digit prefixes as the kernel does (suggested by @trofi above). Note that re2c is smaller than the kernel, and Torvalds' script already finds no duplicate buckets at 9 digits rather than 11: git rev-list --objects --all | cut -c1-9 | sort | uniq -dc finds nothing.
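The same check can be sketched in Python for a list of hashes that has already been collected (for instance from the output of the git rev-list command above). This is an illustrative equivalent of the cut/sort/uniq pipeline, not part of re2c. The function names and the sample hashes are made up.

```python
from collections import Counter


def duplicated_prefixes(hashes, digits):
    """Map each prefix of the given length that is shared by more than one
    distinct hash to its count, like `cut -c1-N | sort | uniq -dc`."""
    counts = Counter(h[:digits] for h in set(hashes))
    return {prefix: n for prefix, n in counts.items() if n > 1}


def shortest_unique_length(hashes, start=4, max_len=40):
    """Smallest prefix length in [start, max_len] with no duplicates,
    or None if even max_len digits are ambiguous."""
    for digits in range(start, max_len + 1):
        if not duplicated_prefixes(hashes, digits):
            return digits
    return None


# Made-up hashes: the first two share the prefix "abcd12".
sample = [
    "abcd1234" + "0" * 32,
    "abcd1299" + "0" * 32,
    "ffff0000" + "0" * 32,
]
print(duplicated_prefixes(sample, 6))
print(shortest_unique_length(sample))
```

A result a few digits below 12 would support keeping 12-digit prefixes with some margin for future growth.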

By "easy to generate", do you mean that some evil person could fork re2c and deliberately add commits until they have the desired duplicate checksum? That seems like a tedious and useless process, given that the outcome is a mere 404 page on GitHub.

I think it makes sense to have 12-digit prefixes to keep the text version of CHANGELOG more readable, which seems more important than guarding against the unlikely possibility that someone will generate a duplicate prefix.