gitpython-developers / GitPython

GitPython is a python library used to interact with Git repositories.

Home Page:http://gitpython.readthedocs.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Unpickled Git object has broken version_info

EliahKagan opened this issue · comments

A Git instance produced by pickle deserialization always has a version_info of None. This prevents version information from being accessed through version_info. It also causes the version_info property not to satisfy its type annotation as a tuple of ints (independently of #1830). This is the case whether or not version_info has been accessed on the original instance before it was pickled:

>>> import pickle, git
>>> g = git.Git()
>>> g.version_info
(2, 34, 1)
>>> g2 = pickle.loads(pickle.dumps(g))
>>> g2.version_info
>>>
>>> import pickle, git
>>> g = git.Git()
>>> g2 = pickle.loads(pickle.dumps(g))
>>> g2.version_info
>>>

This is due to an inconsistency in how the state of not yet having computed version_info is represented in the backing attribute _version_info:

  • Git uses LazyMixin caching for _version_info, which treats the attribute as uncomputed when absent and computed when present.
  • Git defines __getstate__ and __setstate__ for pickling and unpickling to represent state pickling does not preserve by setting attributes to None.

The _version_info attribute is set to None on a Git object created by unpickling, so when accessing the version_info property delegates to the _version_info attribute, its value of None is returned and the if attr == "_version_info" logic in the _set_cache_ method (inherited from LazyMixin and overridden) is never run.

The intention is that an unpickled Git object have version_info in the uncomputed state. This works for the other two excluded attributes, which use the same None convention as in the __getstate__ and __setstate__ implementations, and which do not use any LazyMixin functionality for their caching:

_excluded_ = ("cat_file_all", "cat_file_header", "_version_info")

GitPython/git/cmd.py

Lines 319 to 323 in afa5754

def __getstate__(self) -> Dict[str, Any]:
return slots_to_dict(self, exclude=self._excluded_)
def __setstate__(self, d: Dict[str, Any]) -> None:
dict_to_slots_and__excluded_are_none(self, d, excluded=self._excluded_)

Unlike _version_info, which is managed by _set_cache_, the cat_file_all and cat_file_header attributes are initialized to None in Git.__init__:

GitPython/git/cmd.py

Lines 786 to 788 in afa5754

# Cached command slots
self.cat_file_header: Union[None, TBD] = None
self.cat_file_all: Union[None, TBD] = None

This is natural to fix at the same time as #1829, because a fix for that should change the way caching works. LazyMixin caching supports attributes that are computed once and never become stale, so version_info/_version_info should implement caching in a different way to fix #1829, and that can fix this too if consistent with __getstate__/__setstate__.