Submodule.__init__ `parent_commit` conversion/validation is implied but not done
EliahKagan opened this issue · comments
The `parent_commit` parameter of `Submodule.__init__` is documented this way:
GitPython/git/objects/submodule/base.py
Lines 135 to 136 in edb8d26
`Submodule.__init__` binds this directly to the private `_parent_commit` attribute:
GitPython/git/objects/submodule/base.py
Line 147 in edb8d26
But this is at odds with the documented relationship to `Submodule.set_parent_commit`. That method's `commit` parameter corresponds to the `parent_commit` parameter of `Submodule.__init__`, in that its `commit` parameter is used to identify the commit, if any, to set as `_parent_commit`. However, `set_parent_commit` performs both conversion and validation.
This is the relevant fragment of `set_parent_commit`'s docstring:
GitPython/git/objects/submodule/base.py
Lines 1253 to 1255 in edb8d26
When `commit` is `None`, it sets `_parent_commit` to `None`. Otherwise, however, `commit` need not already be a `Commit` object, and that is okay, because a commit is looked up from it:
GitPython/git/objects/submodule/base.py
Line 1274 in edb8d26
That's the conversion. Then validation is performed, with `_parent_commit` ending up set to the commit that `commit` identified only if there is such a suitable commit:
GitPython/git/objects/submodule/base.py
Lines 1275 to 1289 in edb8d26
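The asymmetry described so far can be modelled in a few lines. This is a toy sketch, not GitPython's code: `ToyCommit`, `ToyRepo`, and `ToySubmodule` are hypothetical stand-ins, and the lookup and the `.gitmodules` check are simplified assumptions based only on the description in this issue.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyCommit:
    """Hypothetical stand-in for a commit: a name plus the paths in its tree."""

    name: str
    tree: frozenset = field(default_factory=frozenset)


class ToyRepo:
    """Hypothetical stand-in for a repository that resolves names to commits."""

    def __init__(self, commits):
        self._commits = {c.name: c for c in commits}

    def commit(self, rev):
        # Conversion: accept either a ToyCommit or a name that identifies one.
        if isinstance(rev, ToyCommit):
            return rev
        return self._commits[str(rev)]


class ToySubmodule:
    MODULES_FILE = ".gitmodules"

    def __init__(self, repo, parent_commit=None):
        self.repo = repo
        # Direct binding: no conversion and no validation.
        self._parent_commit = parent_commit

    def set_parent_commit(self, commit):
        if commit is None:
            self._parent_commit = None
            return self
        pcommit = self.repo.commit(commit)  # conversion
        if self.MODULES_FILE not in pcommit.tree:  # validation (simplified)
            raise ValueError(f"{commit!r} has no {self.MODULES_FILE} file")
        self._parent_commit = pcommit
        return self


repo = ToyRepo([ToyCommit("main", frozenset({".gitmodules", "lib"}))])

via_init = ToySubmodule(repo, parent_commit="main")
via_setter = ToySubmodule(repo).set_parent_commit("main")

print(type(via_init._parent_commit).__name__)    # str
print(type(via_setter._parent_commit).__name__)  # ToyCommit
```

In this model, the same argument ends up stored as a plain string through the constructor but as a commit object through the setter, which is the inconsistency the issue is about.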
The type annotations do not reveal the intent, as they are among those using `Commit_ish` that need to be updated with the fix for `Commit_ish`, and that I am fixing up in #1859. My immediate motivation for opening this issue is that I'm having trouble figuring out how to annotate them: because of the inconsistency between the docstring and the implementations, I don't know what is intended to be accepted.
Either the documentation should be updated, which could be part of #1859, or the code should be fixed to perform any expected validation and conversion and a test case added to check that this is working, which would be best done separately from #1859 (lest its scope expand ever further). I am not sure which. For #1859, it is likely sufficient for me to know what is intended, so full progress on this is not needed to finish #1859. It is my hope and also strong guess that this issue is not a blocker for #1859.
This should not be confused with #1866, which is about the `parent_commits` parameter of `IndexFile.commit` rather than the `parent_commit` parameter of `Submodule.__init__` (and which, unlike this, really is about annotations).
Thanks a lot for documenting the issue.
As this implementation was intended to be 'a more easily usable' version of git submodules (back then they were much rougher than they are now), I'd think it's better to let the code match the documented behaviour, assuming that wouldn't unnecessarily narrow its applicability but would instead make it safer. However, it's hard for me to be more assertive here.
I worry about the possibility that binding something other than a GitPython `Commit` (or `None`) object to the private `_parent_commit` attribute might actually be relied on. Because much interaction with repositories is through the `git` executable where GitPython marshals Python objects into command-line arguments by using their `str`, often more types than intended are possible to use successfully.
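To illustrate why stringification lets unintended types through, here is a minimal sketch. It does not use GitPython or run `git`; `ToyCommit` and `build_argv` are hypothetical names, and building arguments with a bare `str()` call is an assumed simplification of the marshaling described above.

```python
class ToyCommit:
    """Hypothetical commit-like object whose str() is its hex SHA."""

    def __init__(self, hexsha):
        self.hexsha = hexsha

    def __str__(self):
        return self.hexsha


def build_argv(command, *args):
    """Build an argument list by stringifying every argument (assumed simplification)."""
    return ["git", command, *(str(arg) for arg in args)]


sha = "0123456"  # made-up abbreviated SHA, for illustration only

from_object = build_argv("rev-parse", ToyCommit(sha))
from_string = build_argv("rev-parse", sha)

# Once stringified, the two inputs are indistinguishable on the command line.
print(from_object == from_string)  # True
```

Any code path that only ever passes the value on to the command line cannot tell the two apart, so a caller passing a string there would never see a failure.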
To be clear, I'm not suggesting GitPython forever continue to support all possible wrong uses that happen somehow to be working, only that maybe this hasn't been found before because the asymmetry is doing something valuable that should be retained. I'll look into this a bit further by seeing how `_parent_commit` is used, and how the methods that access it are themselves used.
I'll report back, at least to say whether I was able to find enough information to proceed with the related part of #1859.
I've looked into this a bit further. First, I should mention that some of the validation I showed above is not done by default in `Submodule.set_parent_commit`:
GitPython/git/objects/submodule/base.py
Lines 1276 to 1277 in edb8d26
But then also, related to that, there is this near the end, which I had not shown:
GitPython/git/objects/submodule/base.py
Lines 1291 to 1297 in edb8d26
The potentially expensive validation is not done by default--`check` defaults to `False`--so a plain reading of the `Submodule.__init__` docstring saying to see `set_parent_commit` for the `parent_commit` parameter would be that this extra validation should not be done in `__init__` either and that, moreover, initializing a `Submodule` object should set its parent commit the same way as calling `set_parent_commit`.
But it seems to me that it would not be intuitive that constructing a `Submodule` object with an explicit `binsha` and explicit `parent_commit` could result in no error, but the constructed `Submodule` object immediately having an all-zeros `binsha` instead of the `binsha` passed. For this reason, I am reluctant to forge ahead in having `Submodule.__init__` call `Submodule.set_parent_commit` (or equivalent) without further discussion.
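The surprising outcome can be sketched with a toy model. Everything here is hypothetical and simplified: `ToySubmodule` is not GitPython's class, the parent commit's tree is represented as a plain dict from path to binsha, and the `check` default follows what is described in this thread rather than being verified against the library.

```python
NULL_BIN_SHA = b"\0" * 20


class ToySubmodule:
    def __init__(self, path, binsha, parent_tree=None):
        self.path = path
        self.binsha = binsha
        if parent_tree is not None:
            # Hypothetical change under discussion: delegate to the setter.
            self.set_parent_commit(parent_tree)

    def set_parent_commit(self, parent_tree, check=False):
        # Optional validation, skipped unless explicitly requested.
        if check and self.path not in parent_tree:
            raise ValueError(f"{self.path!r} not found in parent commit")
        # Refresh binsha from the tree, falling back to all zeros.
        self.binsha = parent_tree.get(self.path, NULL_BIN_SHA)
        return self


explicit = b"\x01" * 20
sm = ToySubmodule("lib", explicit, parent_tree={})  # tree lacks "lib"

print(sm.binsha == explicit)      # False
print(sm.binsha == NULL_BIN_SHA)  # True
```

In this model the constructor succeeds, yet the explicitly passed `binsha` is silently replaced, which is the unintuitive result described above.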
Besides that, however, it does seem like the private `_parent_commit` attribute that these methods set is always expected, when not `None`, to be a `Commit` object and not, for example, a string. In particular, `Submodule` methods like `config_reader` need this. The `config_reader` method calls `_config_parser_constrained`, which calls `_config_parser`, which for unrecognized values (even when correct) ends up calling `_sio_modules`, which is incompatible with a `str` because it accesses the `tree` attribute of the parent commit object:
GitPython/git/objects/submodule/base.py
Line 277 in edb8d26
I get `AttributeError` if I set a string initially when constructing a `Submodule` object and then call `config_reader`.
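The failure mode can be reproduced in miniature without GitPython. In this sketch, `ToyCommit` and `ToySubmodule` are hypothetical, and the single `config_reader` method stands in for the whole call chain described above, reduced to the one attribute access that matters.

```python
class ToyCommit:
    """Hypothetical commit-like object exposing a tree attribute."""

    def __init__(self, tree):
        self.tree = tree


class ToySubmodule:
    def __init__(self, parent_commit=None):
        # Bound as-is, mirroring the direct binding in the constructor.
        self._parent_commit = parent_commit

    def config_reader(self):
        # Stand-in for the call chain that ends by reading .tree
        # from the parent commit object.
        return self._parent_commit.tree


ok = ToySubmodule(ToyCommit(tree={".gitmodules": "..."}))
ok.config_reader()  # works: the parent commit has a tree

broken = ToySubmodule("HEAD")  # a string slipped through the constructor
try:
    broken.config_reader()
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

The error only surfaces later, at the point of use, rather than at construction time when the wrong type was supplied.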
Thanks for the update!
By the looks of it, fixing the code is a bit tougher than anticipated and it pains me to see you spend time on the submodule implementation, even though I know how it came to be naturally and that it is necessary. After all, the submodule implementation is likely broken in many ways, so it is my hope that any fixes only go as deep as they have to be in order to help with what initially triggered the investigation into submodules in the first place.
With that said, maybe adjusting the documentation would be easier at this point, and not necessarily (much) worse?
I think for now--not as a fix for this bug, but as a way forward that somewhat mitigates it at least from a documentation perspective and also allows #1859 to be finished--that it may be sufficient just to fix the type annotations based on what actually works. This will make clearer how they can be used, and updates to the docstrings or changes to their behavior could come later. I've done this in 1f03e7f.
It looks like this issue may have been inadvertently closed automatically as a result of the wording in 1f03e7f:
Even once that is done, this will not have fixed #1869
The phrase "will not have fixed #1869" contains "fixed #1869", which GitHub took to mean that the commit fixed #1869.
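The accidental match can be approximated with a regular expression. The keyword list follows GitHub's documented closing keywords, but the pattern itself is only an illustrative guess at the matching behaviour, not GitHub's actual implementation; `CLOSING` is a hypothetical name.

```python
import re

# Closing keywords followed directly by an issue reference.
# The regex is an approximation for illustration only.
CLOSING = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)

message = "Even once that is done, this will not have fixed #1869"
print(CLOSING.findall(message))  # ['1869']

# A wording that does not put a keyword directly before the reference
# does not match this pattern.
reworded = "Even once that is done, this will not be a fix for #1869"
print(CLOSING.findall(reworded))  # []
```

The negation earlier in the sentence plays no part in the match, since only the keyword and the issue number that follows it are inspected.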
I've done a bit more in #1877, which I think is the minimal further change necessary to consider this fixed (though it might be appropriate to leave it open even after that, since greater documentation changes and/or behavioral changes may be warranted).
Thanks for catching this! I merged #1877 and would be happy with keeping this issue closed if you agree. Otherwise it's definitely OK to keep it open as well.
I think it may be reasonable to consider #1877 (together with previous changes) to have fixed this, though I'm not sure. I'm not advocating that it be reopened at this time.