calculation question
jdevoo opened this issue
Hello Valentin,
I am trying to understand a minor discrepancy in the result of HoC.
I tried the following command against two repos with badges...
git log --ignore-space-change --ignore-all-space --ignore-submodules --find-copies-harder --diff-filter=ACDM --find-renames --numstat --pretty=tformat:'' | awk 'BEGIN{n=0}{n+=($1+$2)}END{print n}'
For https://github.com/yegor256/rultor
I get 305745 instead of 304252
For https://github.com/yegor256/tacit/
I get 4126 instead of 4123
I am using git 2.17.1
TIA
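For anyone who wants to reproduce the manual count at a pinned commit, the one-liner above can be wrapped in two small shell functions. This is only a sketch under stated assumptions, not the code used by either hoc implementation: `sum_numstat` and `repo_hoc` are hypothetical names, and the git flags are copied unchanged from the command above. Binary files, which `--numstat` reports as `-` in both count columns, contribute nothing to the sum.

```shell
# Sum added + deleted lines from `git log --numstat` output read on stdin.
# Rows for binary files ("-<TAB>-<TAB>path") and blank lines are skipped.
sum_numstat() {
    awk -F '\t' '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { n += $1 + $2 }
        END { printf "%d\n", n }
    '
}

# Hypothetical helper: count for the repository in the current directory,
# using the same git flags as the command quoted in this issue.
# Extra arguments (for example a commit hash) are passed through to git log.
repo_hoc() {
    git log --ignore-space-change --ignore-all-space --ignore-submodules \
        --find-copies-harder --diff-filter=ACDM --find-renames \
        --numstat --pretty=tformat:'' "$@" | sum_numstat
}
```

Running `repo_hoc 61d6d38` inside a clone of tacit would restrict the count to history reachable from that commit, which makes it easier to compare tools on exactly the same revision.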
Good catch, thank you very much!
I was testing with this repository and there the numbers match. I'll investigate where the problem lies and will try to write integration tests to prevent this from happening again.
OK, thank you! I just wanted to be sure, because I am planning to use this metric for a study. I wasn't sure whether I had to take extra measures, such as looking into branches, since I noticed the use of rev-list and the reference to refs/heads.
I've been using @yegor256's implementation of hoc as a reference: https://github.com/yegor256/hoc/blob/master/lib/hoc/git.rb#L40
For tacit (on yegor256/tacit@61d6d38), both hoc and my service return the same result, 4124, while your command returns 4126.
For rultor (on yegor256/rultor@3b2e996), all three return different values:
| Tool | Result |
|---|---|
| yegor256/hoc | 304266 |
| vbrandl/hoc | ~~304252~~ 304266 |
| manual | 305761 |
So there is still a problem.
If you are interested, here is how I calculate the hoc count: https://github.com/vbrandl/hoc/blob/master/src/main.rs#L85. There is some caching but I just deleted the cache and the results were the same.
I'll investigate further and keep you updated. If you need anything for your study (e.g. better machine readable output) just hit me up and I'll see what I can do.
Edit: I just deleted the cache again and now the webservice returns the same number as the command line tool. So there is something wrong with the caching logic (force pushing to a repository is a known problem). I will at least implement some functionality to delete the cache.
Edit2: I just released v0.12.0, which allows removing the repository and cache from the server so that both are rebuilt. On the overview page there is now a button for this operation. If you want to do this from code, send a POST request to /<service>/<user>/<repo>/delete
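A rough sketch of that POST call with curl follows. The base URL, its default value, and the helper names `hoc_delete_path` and `hoc_delete` are assumptions made for illustration; only the `/<service>/<user>/<repo>/delete` path shape comes from the note above.

```shell
# Assumed base URL of the hoc instance; override it via the environment.
HOC_BASE_URL="${HOC_BASE_URL:-http://localhost:8080}"

# Build the delete path: /<service>/<user>/<repo>/delete
hoc_delete_path() {
    printf '/%s/%s/%s/delete\n' "$1" "$2" "$3"
}

# Send the POST request. -f makes curl exit non-zero on an HTTP error status,
# -sS hides the progress meter but still prints error messages.
hoc_delete() {
    curl -fsS -X POST "${HOC_BASE_URL}$(hoc_delete_path "$1" "$2" "$3")"
}
```

Assuming `github` is a valid value for `<service>`, `hoc_delete github yegor256 rultor` would ask the server to drop and rebuild its copy of that repository.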