ishepard / pydriller

Python Framework to analyse Git repositories

Home Page: http://pydriller.readthedocs.io/en/latest/

only_in_branch filter not working

Tommy-TI opened this issue

Describe the bug:
When attempting to iterate over commits in my repository using the following code:

for commit in Repository('https://github.com/Projet-de-fin-etudes/Outil-de-visualisation-evolution-des-traces-execution', since=from_date, to=to_date, only_modifications_with_file_types=FILE_TYPES, only_in_branch='origin/dev').traverse_commits():
    print(commit.branches)

Every commit prints {'main'}, which does not include the specified branch. None of the outputs contains the dev branch.

Expected behavior:
I expected the code to iterate over commits in the specified branch 'origin/dev' and provide the corresponding branch information for each commit.

Steps to reproduce:

  1. Use the provided code to iterate over commits in the repository.
  2. Check the output of commit.branches for each iteration.

Additional information:
I attempted to clone the repository and use the local path instead of the URL, but encountered the same problem in both cases.

OS Version:
Windows

It appears that this function isn't compatible with all options. Here is the code:

        if single is not None:
            rev = [single, '-n', 1]
        elif from_commit is not None or to_commit is not None:
            if from_commit is not None and to_commit is not None:
                rev.extend(from_commit)
                rev.append(to_commit)
            elif from_commit is not None:
                rev.extend(from_commit)
                rev.append('HEAD')
            else:
                rev = to_commit
        elif branch is not None:
            rev = branch
        else:
            rev = 'HEAD'

I can't repro locally, unless I'm doing something wrong.
Using your repo:

from pydriller import Repository
from datetime import datetime

fromd = datetime(2023,5,21)
tod = datetime(2023,6,21)

for commit in Repository('/tmp/test/', 
                         since=fromd, 
                         to=tod, 
                         only_modifications_with_file_types=['.py'], 
                         only_in_branch="origin/dev").traverse_commits():
    print(commit.hash)

produces:

47e82ae3eca9f2db20598dd169b58e76013e9c3f
f58e6548b81480ccbbebf2d57b03384440d5b7d4
7e560bd0bea0dad90719af242181242b0fa1636a
d0d4587378dc51b0a719560bea25b9814f357176
4615d5f46d7d1ebcaedc5f824d52049682249b78
0fa3eaa0358768602648e4ab59123411694ff987
1f244272eceb0647f3194b0ab3a9bcb29a938c8d
252c82378a91e2c535deca1601a04eace9004871
4975a8e501410b74df4373259b6d4f1cd65f0d84
179644ebe25e12fa2f9136f6fcd032bf91679659
d38ee599e6562943f08722295a2e90ae3297ef49
f3a759c2442b096d7787d71acde12289862b5836
d8009815b09c0270186ed2e241ad6884221f9843
d3ace33853cd855a924cedb273d5ab99b8ffd3a7
6f66a098f11d2616037fb4def0c2251be27a9527
9cb93d8379aaa397cb857b85e20099f3a0f30007
91cd75502a23b366c8308e502518fbd3060f3739
5a728346e17ac326cb13d58d7be5c46df37b08ac
9f1bff41fc2ca84e9a2878f03177601d9889f8f9
fed1d816c7261fd901c8dee0ae1cb797aa5beb89
76ad8e7a67af5c22e57b343d3a67228e81c0dbb0
cca052a08b312cf5556006ca357782b4c56e0b03
331736a795378bb18f0654ca6b80a85650b932f4
4a6ab0044e77cae9c488c9f874a48e5e06ba0dc8
9ee53f1dbc05a078fa2470e8b470c1785b6e13bc
32cde91e1913ca152c4400b92ea4b808fa006f65
ee9a168f269c48d479112f19bdc3af34974289d3
933939b5bb200fb84323a1434db2ed751e5caad5
d84b01760f5a52587bcbff8f8522288153ca383f
3cadf328db73070f7e134ab4ecf4e0657640e286
a4fc7e16cba09cadabc62c60e629c6509ca464b6
494e3ccf34bb5a5a87064fe125e0f89c034ba3ac
e715881c561bc0df2ef2b0524a0d222d26261f59
3de989dd8f96351e0f9445132eb3352efa395aac
eca4b53f2b3d2a6942a89b7e22e9de3dc2617753
e371649273693a3a0a25cc7dd06b140d0945d951
17327cea3c2e4ad701473ea65044dbbf2bc1fdd0
d17e2b329286a43eabdbbd06e035ffbf76d0f96b
3f33cbee9afc829ebb0010f76d55a7d7efc8f4b7
c9fa96f77544eea43d100fe0f188dbca745d47f4
55494b0bf5344027ccee765a927f6981cb207623
66bf064e5165fe24c37fa924b89a3d235c753650
5d9a5df12f9dc864f3736e57c87531622cb85507
102b1d5ea14dfa541249665590de12e1acd13d36
fea8dd536118f255f81a40ca17b57c6a52a102b6
a94040e99e845c5fc53fe8070484d12926d64675
cb4bf95b3ec6de1be1c34d75a03c8fe757487e94
3c10a3542e7b444528f25a6dbf83acfcc583bc9a
1f989b6ac089bf5efddedaa651e11a099f67d3bd
51e1a8d6ce9090c25590ac7ab6fb2138331f6349
3be8dc9fcae6e625780f0b0f20e2c6e804a18385
0065d940085f505231df19cba843fa3f6901d916
b1fd2ee3b2e25c70d8cfa38e47cd63d052ede62b
2bb6688912062c23294cb6bfb223dc8fc262bd4c
fd3b3391f83b01ea1f13eccd3c8ddba171e557c6
e179128fec66b0354235b394279db63a3d289b3d
80f344e46e52283b0d17aefbe7e8852c1be08ee6
19a415fac7158dcdab3b43e80022c752f019ee2d
d4cebb48916ec4e5eccb034ca9244f84f4b47681
af8ae057f1ab2e25a2ffefcb0d061c4f530582bc

Can you check your git version? I have git version 2.41.0.

I just updated to git version 2.41.0.windows.1 and I still have the same problem, with PyDriller version 2.5 as well. Maybe the problem is only with how the branches are displayed in this case?

            fromd = datetime(2023,5,21)
            tod = datetime(2023,6,21)

            for commit in Repository('https://github.com/Projet-de-fin-etudes/Outil-de-visualisation-evolution-des-traces-execution', 
                                    since=fromd, 
                                    to=tod, 
                                    only_modifications_with_file_types=['.py'], 
                                    only_in_branch="origin/dev").traverse_commits():
                print("hash : " + str(commit.hash) + "\tbranches" + str(commit.branches))

produces:

hash : 47e82ae3eca9f2db20598dd169b58e76013e9c3f branches{'main'}
hash : f58e6548b81480ccbbebf2d57b03384440d5b7d4 branches{'main'}
hash : 7e560bd0bea0dad90719af242181242b0fa1636a branches{'main'}
hash : d0d4587378dc51b0a719560bea25b9814f357176 branches{'main'}
hash : 4615d5f46d7d1ebcaedc5f824d52049682249b78 branches{'main'}
hash : 0fa3eaa0358768602648e4ab59123411694ff987 branches{'main'}
hash : 1f244272eceb0647f3194b0ab3a9bcb29a938c8d branches{'main'}
hash : 252c82378a91e2c535deca1601a04eace9004871 branches{'main'}
hash : 4975a8e501410b74df4373259b6d4f1cd65f0d84 branches{'main'}
hash : 179644ebe25e12fa2f9136f6fcd032bf91679659 branches{'main'}
hash : d38ee599e6562943f08722295a2e90ae3297ef49 branches{'main'}
hash : f3a759c2442b096d7787d71acde12289862b5836 branches{'main'}
hash : d8009815b09c0270186ed2e241ad6884221f9843 branches{'main'}
hash : d3ace33853cd855a924cedb273d5ab99b8ffd3a7 branches{'main'}
hash : 6f66a098f11d2616037fb4def0c2251be27a9527 branches{'main'}
hash : 9cb93d8379aaa397cb857b85e20099f3a0f30007 branches{'main'}
hash : 91cd75502a23b366c8308e502518fbd3060f3739 branches{'main'}
hash : 5a728346e17ac326cb13d58d7be5c46df37b08ac branches{'main'}
hash : 9f1bff41fc2ca84e9a2878f03177601d9889f8f9 branches{'main'}
hash : fed1d816c7261fd901c8dee0ae1cb797aa5beb89 branches{'main'}
hash : 76ad8e7a67af5c22e57b343d3a67228e81c0dbb0 branches{'main'}
hash : cca052a08b312cf5556006ca357782b4c56e0b03 branches{'main'}
hash : 331736a795378bb18f0654ca6b80a85650b932f4 branches{'main'}
hash : 4a6ab0044e77cae9c488c9f874a48e5e06ba0dc8 branches{'main'}
hash : 9ee53f1dbc05a078fa2470e8b470c1785b6e13bc branches{'main'}
hash : 32cde91e1913ca152c4400b92ea4b808fa006f65 branches{'main'}
hash : ee9a168f269c48d479112f19bdc3af34974289d3 branches{'main'}
hash : 933939b5bb200fb84323a1434db2ed751e5caad5 branches{'main'}
hash : d84b01760f5a52587bcbff8f8522288153ca383f branches{'main'}
hash : 3cadf328db73070f7e134ab4ecf4e0657640e286 branches{''}
hash : a4fc7e16cba09cadabc62c60e629c6509ca464b6 branches{''}
hash : 494e3ccf34bb5a5a87064fe125e0f89c034ba3ac branches{''}
hash : e715881c561bc0df2ef2b0524a0d222d26261f59 branches{''}
hash : 3de989dd8f96351e0f9445132eb3352efa395aac branches{''}
hash : eca4b53f2b3d2a6942a89b7e22e9de3dc2617753 branches{''}
hash : e371649273693a3a0a25cc7dd06b140d0945d951 branches{'main'}
hash : 17327cea3c2e4ad701473ea65044dbbf2bc1fdd0 branches{'main'}
hash : d17e2b329286a43eabdbbd06e035ffbf76d0f96b branches{'main'}
hash : 3f33cbee9afc829ebb0010f76d55a7d7efc8f4b7 branches{'main'}
hash : c9fa96f77544eea43d100fe0f188dbca745d47f4 branches{'main'}
hash : 55494b0bf5344027ccee765a927f6981cb207623 branches{'main'}
hash : 66bf064e5165fe24c37fa924b89a3d235c753650 branches{'main'}
hash : 5d9a5df12f9dc864f3736e57c87531622cb85507 branches{'main'}
hash : 102b1d5ea14dfa541249665590de12e1acd13d36 branches{'main'}
hash : fea8dd536118f255f81a40ca17b57c6a52a102b6 branches{'main'}
hash : a94040e99e845c5fc53fe8070484d12926d64675 branches{'main'}    
hash : cb4bf95b3ec6de1be1c34d75a03c8fe757487e94 branches{'main'}    
hash : 3c10a3542e7b444528f25a6dbf83acfcc583bc9a branches{'main'}    
hash : 1f989b6ac089bf5efddedaa651e11a099f67d3bd branches{'main'}    
hash : 51e1a8d6ce9090c25590ac7ab6fb2138331f6349 branches{'main'}    
hash : 3be8dc9fcae6e625780f0b0f20e2c6e804a18385 branches{'main'}    
hash : 0065d940085f505231df19cba843fa3f6901d916 branches{'main'}    
hash : b1fd2ee3b2e25c70d8cfa38e47cd63d052ede62b branches{'main'}    
hash : 2bb6688912062c23294cb6bfb223dc8fc262bd4c branches{''}        
hash : fd3b3391f83b01ea1f13eccd3c8ddba171e557c6 branches{''}        
hash : e179128fec66b0354235b394279db63a3d289b3d branches{''}        
hash : 80f344e46e52283b0d17aefbe7e8852c1be08ee6 branches{''}        
hash : 19a415fac7158dcdab3b43e80022c752f019ee2d branches{''}        
hash : d4cebb48916ec4e5eccb034ca9244f84f4b47681 branches{''}        
hash : af8ae057f1ab2e25a2ffefcb0d061c4f530582bc branches{''}        

Yes, that's the correct output, since origin/dev is merged into main.

You can test it by running:
git log origin/dev --since 2023-05-21 --until 2023-06-21 --reverse

It will give the same list of commits (of course you need to exclude the filter for '.py' files).
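
The same cross-check can be scripted. This is only a minimal sketch, assuming git is on the PATH and that repo_path points at a local clone; the helper names git_log_args and list_hashes are made up for illustration and are not part of PyDriller.

```python
import subprocess
from datetime import date


def git_log_args(branch, since, until):
    """Build the `git log` argument list used for the manual check."""
    return [
        "git", "log", branch,
        "--since", since.isoformat(),
        "--until", until.isoformat(),
        "--reverse",
        "--format=%H",  # print only the full commit hash
    ]


def list_hashes(repo_path, branch, since, until):
    """Run the command inside repo_path and return the hashes as a list."""
    result = subprocess.run(
        git_log_args(branch, since, until),
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


# Only builds and prints the command; call list_hashes() with a real clone.
args = git_log_args("origin/dev", date(2023, 5, 21), date(2023, 6, 21))
print(" ".join(args))
```

Comparing the list returned by list_hashes with the hashes printed by PyDriller (without the '.py' filter) should show whether both walk the same commits.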

OK, I think I understand. But why doesn't the commit af8ae057f1ab2e25a2ffefcb0d061c4f530582bc have a branch? It was made on dev and had not been merged into main at that point.

Sorry for the late reply, I missed the question.
The reason is that by default Git doesn't create local branches for everything on the remote, just for the one it checks out (namely, main).
You can test it by doing:

git branch --list

and it outputs:

* main

So according to Git, your repo only contains 1 branch. However, that is not the case, as there are many more branches.

So you can do:

git branch --list -r

which lists the remote-tracking branches, and you get:

  origin/HEAD -> origin/main
  origin/add_parser
  origin/add_test
  origin/add_test2
  origin/commitWindow
  origin/dev
  origin/gab
  origin/main
  origin/max_from_gab
  origin/new_architecture
  origin/old_arrchitecture

To list local and remote-tracking branches together, you can do:

git branch --list -a

and it outputs:

* main
  remotes/origin/HEAD -> origin/main
  remotes/origin/add_parser
  remotes/origin/add_test
  remotes/origin/add_test2
  remotes/origin/commitWindow
  remotes/origin/dev
  remotes/origin/gab
  remotes/origin/main
  remotes/origin/max_from_gab
  remotes/origin/new_architecture
  remotes/origin/old_arrchitecture
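
To make the local/remote distinction concrete, here is a rough sketch that splits a `git branch --list -a` listing into the two groups. The function name split_branches is hypothetical, and the sample is a shortened copy of the output above; it assumes the plain listing format shown in this thread.

```python
def split_branches(listing):
    """Split `git branch --list -a` output into local and remote-tracking names."""
    local, remote = [], []
    for line in listing.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("* "):
            name = name[2:]  # drop the marker of the checked-out branch
        if " -> " in name:
            continue  # skip symbolic refs such as remotes/origin/HEAD -> origin/main
        if name.startswith("remotes/"):
            remote.append(name[len("remotes/"):])
        else:
            local.append(name)
    return local, remote


sample = """\
* main
  remotes/origin/HEAD -> origin/main
  remotes/origin/dev
  remotes/origin/main
"""
local, remote = split_branches(sample)
print(local)   # ['main']
print(remote)  # ['origin/dev', 'origin/main']
```

With only main in the local list, a commit that is reachable only from origin/dev has no local branch to be reported under.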

You can find more info in the documentation of git branch: https://git-scm.com/docs/git-branch

Anyway, if you want to include these branches, there are 2 parameters you can pass to Repository: include_refs and include_remotes.
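
A possible way to pass them, as a sketch only: the helper names remote_aware_options and print_branches are invented for this example, /tmp/test/ is the local clone path used earlier in the thread, and only_in_branch is left out on purpose because how it interacts with these two flags is not covered here.

```python
from datetime import datetime


def remote_aware_options(since, to):
    """Keyword arguments for Repository that also consider refs and remotes."""
    return {
        "since": since,
        "to": to,
        "only_modifications_with_file_types": [".py"],
        "include_refs": True,
        "include_remotes": True,
    }


def print_branches(repo_path, options):
    # Imported here so the options helper works without PyDriller installed.
    from pydriller import Repository

    for commit in Repository(repo_path, **options).traverse_commits():
        print(commit.hash, commit.branches)


options = remote_aware_options(datetime(2023, 5, 21), datetime(2023, 6, 21))
# print_branches("/tmp/test/", options)  # uncomment with a local clone at this path
```

With both flags set, commit.branches would be expected to include remote branches such as origin/dev, but please verify that against your own clone.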

As this is not a problem of Pydriller, I'm gonna close the issue. Hope this helped!