gitpython-developers / GitPython

GitPython is a python library used to interact with Git repositories.

Home Page:http://gitpython.readthedocs.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Git.refresh GitCommandNotFound indicates "git" even for other commands

EliahKagan opened this issue · comments

The bug

Even when the Git command that Git.refresh runs and checks is not git, its command line is reported as git in the GitCommandNotFound exception, when such an exception is raised:

GitPython/git/cmd.py

Lines 483 to 485 in d28c20b

# After the first refresh (when GIT_PYTHON_GIT_EXECUTABLE is no longer
# None) we raise an exception.
raise GitCommandNotFound("git", err)

Steps to reproduce

The effect happens when the exception is raised due to a subsequent call to Git.refresh, since that is when Git.refresh raises GitCommandNotFound rather than ImportError. The command attribute is directly part of the traceback message, presented as cmdline. This can be produced by running a shell one-liner:

$ GIT_PYTHON_GIT_EXECUTABLE=git-2.43 GIT_PYTHON_REFRESH=quiet python -c 'import git; git.refresh()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ek/repos-wsl/GitPython/git/__init__.py", line 127, in refresh
    if not Git.refresh(path=path):
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ek/repos-wsl/GitPython/git/cmd.py", line 485, in refresh
    raise GitCommandNotFound("git", err)
git.exc.GitCommandNotFound: Cmd('git') not found due to: 'Bad git executable.
The git executable must be specified in one of the following ways:
    - be included in your $PATH
    - be set via $GIT_PYTHON_GIT_EXECUTABLE
    - explicitly set via git.refresh()
'
  cmdline: git

Although often a custom Git command may be a full path whose last component is exactly git (or sometimes git.exe on Windows), the above shows a realistic scenario in which even that may not be the case. I do not have a command named git-2.43 installed, so it is correct that I get an error, but the error should not state the name of the command as git.

Verification

Verbose debugging confirms that the command given in GIT_PYTHON_GIT_EXECUTABLE really is running, and that the only problem is that the hard-coded command name git is printed instead:

$ GIT_PYTHON_GIT_EXECUTABLE=git-2.43 GIT_PYTHON_REFRESH=quiet GIT_PYTHON_TRACE=full python -c 'import logging; logging.basicConfig(level=logging.DEBUG); import git; git.refresh()'
DEBUG:git.cmd:Popen(['git-2.43', 'version'], cwd=/home/ek/repos-wsl/GitPython, stdin=None, shell=False, universal_newlines=False)
DEBUG:git.cmd:Popen(['git-2.43', 'version'], cwd=/home/ek/repos-wsl/GitPython, stdin=None, shell=False, universal_newlines=False)
Traceback (most recent call last):
  ...
  cmdline: git

Capturing the subprocess.Popen audit event verifies that no other logging bug in GitPython is confusing the analysis:

$ GIT_PYTHON_GIT_EXECUTABLE=git-2.43 GIT_PYTHON_REFRESH=quiet python -c 'import sys, git; sys.addaudithook(lambda event, args: event == "subprocess.Popen" and print(args[:2])); git.refresh()'
('git-2.43', ['git-2.43', 'version'])
Traceback (most recent call last):
  ...
  cmdline: git

(For both the above runs, I replaced most of the traceback with ....)

Why this is a bug

Since it's possible, however unlikely, that someone has come to rely on the exception object's command attribute having the exact value "git" in this situation, it should arguably only be changed if it really is a bug for it to hold that value.

I believe it is a bug, for three reasons:

1. It is presented as the command line

When users see this in the GitCommandNotFound exception message, I believe they will assume it is the command that was run:

  cmdline: git

2. It differs from any other GitCommandNotFound

In other situations that raise GitCommandNotFound, it comes from Git.execute, and the exception is constructed with the actual command line as command (except that redactions are performed, such as for passwords):

GitPython/git/cmd.py

Lines 1065 to 1079 in d28c20b

try:
proc = safer_popen(
command,
env=env,
cwd=cwd,
bufsize=-1,
stdin=(istream or DEVNULL),
stderr=PIPE,
stdout=stdout_sink,
shell=shell,
universal_newlines=universal_newlines,
**subprocess_kwargs,
)
except cmd_not_found_exception as err:
raise GitCommandNotFound(redacted_command, err) from err

3. The superclass documents the command argument

GitCommandNotFound inherits its command attribute from its CommandError superclass. Although CommandError does not document the command attribute directly, it does document the command argument from which the value of that attribute is derived:

GitPython/git/exc.py

Lines 83 to 88 in d28c20b

class CommandError(GitError):
"""Base class for exceptions thrown at every stage of `Popen()` execution.
:param command:
A non-empty list of argv comprising the command-line.
"""

This does not really guarantee the specific meaning of the command attribute. (Likskov substitutability is usually a goal for objects that have already been constructed/initialized; subclass __new__ and __init__ are not expected to treat arguments the same ways as the superclass.) But it is a further reason to consider the current behavior in Git.refresh unexpected.

Possible fixes

This could be fixed by having Git.refresh construct the GitCommandNotFoundError with the same command line that its cls().version() call causes Git.execute to use, which is given by GIT_PYTHON_GIT_EXECUTABLE.

Another approach, though, would be to keep a reference to the GitCommandNotFound exception that occurred occurred due to running cls().version():

GitPython/git/cmd.py

Lines 381 to 385 in d28c20b

try:
cls().version()
has_git = True
except (GitCommandNotFound, PermissionError):
pass

Then that same exception object, with its original information, could be reraised instead of constructing and raising a new GitCommandNotFound object.

If this is to be done, then some refactoring may be in order, and a decision about whether that GitCommandNotFound object should be used as the cause of an ImportError in the case that is raised should probably also be made:

raise ImportError(err)

Thus, in addition to the other changes, that statement could be changed to the raise ... from ... form.

There may be reasons to avoid this approach, in whole or in part, that I am not aware of.

That's a great catch, thanks for reporting!

I agree that the exception should show the actual git command invoked, instead of guessing it.

As for the best way to do this, I can't tell. As always, it would be great if that wasn't a breaking change, and if the resulting code wouldn't be (much) harder to understand.