Show progressive output in verbose mode

chadrik opened this issue 5 years ago

Some hooks take an appreciable amount of time to complete, so if users could see output as it is produced it would give them an opportunity to take early action. A good example is mypy, which must run on the entire codebase to be truly useful and so often produces a trickle of errors or warnings over a long period of time. If pre-commit is being run in verbose mode, I don't see any downside to showing immediate updates of the hook process's stdout and stderr, since it will be displayed eventually anyway.
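For illustration, here is a minimal sketch of what "showing output as it is produced" could mean for a single child process. It is not pre-commit's implementation, and `stream_output` is a hypothetical helper name. It only assumes Python's standard `subprocess` module. Note that many tools switch to block buffering when their stdout is a pipe, so lines may still arrive in bursts.

```python
import subprocess
import sys


def stream_output(cmd):
    """Run cmd, echoing each output line as soon as it is read.

    Hypothetical helper, not part of pre-commit.
    Returns (returncode, captured_lines).
    """
    captured = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout to keep ordering
        text=True,
        bufsize=1,  # line-buffered on the reading side
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        sys.stdout.write(line)  # show the line immediately
        sys.stdout.flush()
        captured.append(line)  # keep a copy for a final report
    returncode = proc.wait()
    return returncode, captured
```

The captured lines are kept so that a final summary could still be printed once the process exits.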

This is similar to #872 but I'm not interested in interacting with stdin, and I'm suggesting that we do this by default if verbose is enabled, rather than introducing new config options. This works nicely with #695.

Anthony Sottile · Answer 1 · Mon Apr 22 2019 02:13:52 GMT+0800 (China Standard Time)

one of the comments in #872 also applies here:

(the output of pre-commit could not be in the order it is today with that, I have no intention to change that output)

Basically, the hook name....................{Passed,Failed,Skipped, etc.} output currently comes first and is used to delineate which hook is running. And in order to display output immediately that couldn't happen (and would confuse the user interface).

That said, I do plan to make the ...s actually a progress bar, just haven't had time to work on it. I think @chriskuehl actually had a cool demo of this working so it might be easier than I expect (and especially now that we've flipped to using concurrent.futures most of the time)

Chad Dombrova · Answer 2 · Mon Apr 22 2019 07:49:28 GMT+0800 (China Standard Time)

A simple solution that we can act on immediately is to reprint the status line:

lint.....................................................................Passed
mypy.....................................................................
hookid: mypy

python/whatever.py:57: error: "blah" does not return a value
python/whatever.py:192: error: Incompatible types in assignment (expression has type "deque[Any]", variable has type "List[Any]")

mypy.....................................................................Failed

I'm happy to make a PR for this.

Anthony Sottile · Answer 3 · Mon Apr 22 2019 07:53:07 GMT+0800 (China Standard Time)

I'd rather not, that's not pretty and is jarringly different

Chad Dombrova · Answer 4 · Mon Apr 22 2019 08:14:43 GMT+0800 (China Standard Time)

What would this look like in your ideal scenario? Use curses to ensure that the task..................................... line is always at the bottom of the task output? That's great and all for the normal pre-commit usage, but it doesn't work for non-tty terminals like those used in CI tools (jenkins, gitlab, travis). Yes, I know that pre-commit is not expressly designed for CI, but it contains the configuration for the files that should be included/excluded, so it's logical to use it in CI to avoid duplicating that configuration. So if you accept that it's valid to use pre-commit from terminals that don't support curses, then you're going to need a low-tech fallback for those cases. So what does your ideal non-curses-based solution look like?
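To make the tty versus non-tty distinction concrete, here is a rough sketch of a low-tech fallback. It assumes a simple carriage-return rewrite rather than curses, and the names `is_interactive` and `report_status` are hypothetical. On a terminal the status line is overwritten in place. On anything else (a CI log, a file, a pipe) each status change gets its own line.

```python
import io
import sys


def is_interactive(stream):
    """Return True if stream reports itself as a terminal (tty)."""
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def report_status(name, status, final, stream=None, width=79):
    """Write 'name.....status', padded with dots to the given width.

    Hypothetical helper, not part of pre-commit.
    """
    if stream is None:
        stream = sys.stdout
    dots = "." * max(width - len(name) - len(status), 0)
    line = f"{name}{dots}{status}"
    if is_interactive(stream):
        # Terminal: return to column 0 and overwrite the previous status.
        stream.write("\r" + line + ("\n" if final else ""))
    else:
        # Non-tty (CI logs, pipes): one line per status change.
        stream.write(line + "\n")
    stream.flush()


# io.StringIO is not a tty, so it behaves like a CI log here.
buf = io.StringIO()
report_status("mypy", "Started", final=False, stream=buf)
report_status("mypy", "Failed", final=True, stream=buf)
print(buf.getvalue(), end="")
```

This sketch ignores the case where hook output is printed between the two status writes, which is the harder part of the problem discussed in this thread.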

Chad Dombrova · Answer 5 · Mon Apr 22 2019 08:17:03 GMT+0800 (China Standard Time)

I should point out that pytest faces a similar issue of verbose vs non-verbose output modes, so it could be worth looking there for inspiration. However, unlike pre-commit, pytest does not support setting verbosity per test/task.

Anthony Sottile · Answer 6 · Mon Apr 22 2019 08:38:03 GMT+0800 (China Standard Time)

I don't really want to support this issue at all, it's not a common scenario (and I really didn't want to add the verbose configuration to begin with (tools should be as quiet as possible, warning noise causes individuals to ignore the entire tool)).

The most that could come out of this that I'd be ok with getting implemented is accurate progress for the .s as files are fed to the underlying tool.

I absolutely do not want to involve curses

Your example involves mypy, I really don't think that the subsecond difference between immediate response and the full response is important enough to justify this complexity.

I'm very happy with the current output aesthetics so I'm not inclined to change it unless there's a very compelling argument for it.

Chad Dombrova · Answer 7 · Mon Apr 22 2019 09:13:14 GMT+0800 (China Standard Time)

Your example involves mypy, I really don't think that the subsecond difference between immediate response and the full response is important enough to justify this complexity.

On our codebase, which is quite large, mypy can take 30 seconds or more to run, so this is not a sub-second difference (this is with the new mypyc-compiled version that was released last month; it was 3x longer before). And mypy does not benefit from being run only on the files that changed, since a change to one function could adversely affect code in other files.

The most that could come out of this that I'd be ok with getting implemented is accurate progress for the .s as files are fed to the underlying tool.

Unfortunately, that's not satisfactory for any task that takes, say, 10 seconds or more to run in total, where you could have had actionable results after less than a second, which is exactly the case that I have with mypy. At 30 seconds a run, a user might spend 10-20 minutes a day waiting for results from pre-commit which they could have had much faster. Multiply that across an entire team and it's a pretty substantial loss in productivity.

I know this is only a slight change from what I suggested before, but what if we added a new status, "Running":

lint.....................................................................Passed
mypy.....................................................................Running
hookid: mypy

python/whatever.py:57: error: "blah" does not return a value
python/whatever.py:192: error: Incompatible types in assignment (expression has type "deque[Any]", variable has type "List[Any]")

mypy.....................................................................Failed

Slightly more aesthetically and logically pleasing?

Anthony Sottile · Answer 8 · Mon Apr 22 2019 09:26:27 GMT+0800 (China Standard Time)

Unfortunately, that's not satisfactory for any task that takes, say, 10 seconds or more to run in total, where you could have had actionable results after less than a second, which is exactly the case that I have with mypy.

I'd argue you shouldn't be running slow tasks at commit time

Chad Dombrova · Answer 9 · Thu Apr 25 2019 01:03:21 GMT+0800 (China Standard Time)

I'd argue you shouldn't be running slow tasks at commit time

I would truly love for mypy to be faster! But sadly it's not, and it still seems like the type of task that should be running prior to a commit. I can cut the run time in half again (to 15s) by using the mypy daemon, but it's not super stable, and when it has to restart it takes longer than a normal run (45s vs 30s).

I feel like my proposal above is a pretty fine compromise between form and function, since it's only displayed when verbose=true, but if that's still not satisfactory, perhaps we can add a flag to enable this mode so that it's not displayed by default, and users with long running tasks can still benefit. I'm happy to make the PR if you approve.

Anthony Sottile · Answer 10 · Thu Apr 25 2019 01:10:57 GMT+0800 (China Standard Time)

I still don't want to take on complexity for a feature I don't think should exist beyond debugging (verbose) and a use case that I believe to be an antipattern (always running against all files w/ mypy and having slow pre-commit hooks)

I also think your proposal is in the right direction if I did want something like this, but it still suffers from a bunch of usability problems:

it looks like the hook is running twice (it isn't)
all output is currently headers followed by the output, this sandwiches things in between
it can't possibly work when hooks are running in parallel, you'd have jumbled output as each process tramples the others
what would it do if there was no output or it didn't fail? we can't possibly know that ahead of time either to adjust the output

Chad Dombrova · Answer 11 · Thu Apr 25 2019 01:41:15 GMT+0800 (China Standard Time)

Fair enough. I respect your position.

I'll address a few of your notes for the sake of completeness:

I still don't want to take on complexity for a feature I don't think should exist beyond debugging (verbose) and a use case that I believe to be an antipattern (always running against all files w/ mypy and having slow pre-commit hooks)

Running mypy on all files is precisely how mypy is intended to be used. It requires the complete context in order to understand how a change to a function in one file adversely affects code in other files that you haven't modified.

That said, there's a lot of room for improvement in the mypy UX, so I've written a runner script that makes it easy to use in a pre-commit setting: https://github.com/chadrik/mypy-runner. One of the features (which I haven't properly documented yet) is that it can use different filter settings for the files that are passed to it, so that e.g. it can show warnings just for changed files while showing errors across the entire code-base. I'll be adding the pre-commit config files soon. When it's ready I'd love to get it added to https://pre-commit.com/hooks.html. What's the process for that?

it looks like the hook is running twice (it isn't)

I certainly agree it's not as elegant as the current solution, but I think that proper status labels can help convey this (especially if the user has explicitly requested progressive feedback):

mypy....................................................................Started
mypy.....................................................................Failed

it can't possibly work when hooks are running in parallel, you'd have jumbled output as each process tramples the others

I'd argue that the current design suffers from a similar problem, and that my suggestion of using a separate line for each status change is actually a solution to that problem.

You currently print the name of the hook first, then the status when it completes on the same line. If you run them in parallel, you'll need to do one of the following:

A) print the name of the hook and its status only after it's finished (bad UX)
B) use something like curses to edit the original line, which you've already stated you're opposed to
C) print separate lines for each status as they occur. i.e.

lint....................................................................Started
mypy....................................................................Started
mypy.....................................................................Failed
lint.....................................................................Passed

That said, you're right that there would be no good way to progressively stream output in this scenario.

what would it do if there was no output or it didn't fail? we can't possibly know that ahead of time either to adjust the output

This is only for verbose mode, so the expectation is that we're always showing output.

For example, if we have two jobs, lint and mypy, and only mypy has requested progressive output, and there was no output generated, then it would look like this:

lint....................................................................Passed
mypy...................................................................Started
mypy....................................................................Passed

Anthony Sottile · Answer 12 · Thu Apr 25 2019 01:51:41 GMT+0800 (China Standard Time)

Running mypy on all files is precisely how mypy is intended to be used.

right, in a test setting, but at commit time that's far too costly and slow

That said, there's a lot of room for improvement in the mypy UX, so I've written a runner script that makes it easy to use in a pre-commit setting: https://github.com/chadrik/mypy-runner.

I've been using https://github.com/pre-commit/mirrors-mypy basically without issue -- it doesn't run across all files and I think I've only hit a case where I forgot to update a callsite once -- and CI caught that because it was using --all-files -- I get that there's a trade off on correctness but pre-commit hooks aren't meant to catch every possible mistake you could make and are really meant to be the first line of defense. Running mypy just against the changed files is a good first line of defense and I really don't think the time trade off for running against all files at commit time is anywhere close to worth it.

When it's ready I'd love to get it added to https://pre-commit.com/hooks.html. What's the process for that?

there's a list in https://github.com/pre-commit/pre-commit.github.io -- append to that :)

it can't possibly work when hooks are running in parallel, you'd have jumbled output as each process tramples the others

I'd argue that the current design suffers from a similar problem, and that my suggestion of using a separate line for each status change is actually a solution to that problem.

I think you're misunderstanding what I mean here. Within a single hook (take trailing-whitespace for instance), pre-commit will spawn N invocations of that hook in parallel and hand each of them a list of filenames to handle (think like xargs -P) -- then at the end these outputs are stacked and presented to the user. If we're sending the streams straight to the user these would all trample on each other.
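A rough sketch of that xargs -P style behaviour, for readers who have not seen it: the file list is split into chunks, one invocation runs per chunk in parallel, and the outputs are only joined once everything has finished. This is an illustration under assumptions, not pre-commit's actual code. `partition` and `run_hook` are hypothetical names, and the chunking strategy is a guess.

```python
import concurrent.futures
import subprocess
import sys


def partition(filenames, n):
    """Split filenames into at most n order-preserving, roughly equal chunks."""
    if not filenames:
        return []
    n = max(1, min(n, len(filenames)))
    size, extra = divmod(len(filenames), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(filenames[start:end])
        start = end
    return chunks


def run_hook(cmd, filenames, jobs=4):
    """Run cmd once per chunk in parallel; return (returncode, output).

    Hypothetical helper. Output is buffered per invocation and stacked
    in chunk order at the end, so nothing is interleaved.
    """
    def run_chunk(chunk):
        proc = subprocess.run(
            [*cmd, *chunk],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        return proc.returncode, proc.stdout

    with concurrent.futures.ThreadPoolExecutor(max_workers=jobs) as executor:
        # map() yields results in submission order, not completion order.
        results = list(executor.map(run_chunk, partition(filenames, jobs)))

    returncode = int(any(rc != 0 for rc, _ in results))
    output = "".join(out for _, out in results)
    return returncode, output
```

If each invocation wrote straight to the terminal instead of being buffered, lines from different chunks could interleave, which is the trampling described above.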

Running different hooks in parallel won't be implemented because fixers cannot possibly function correctly in that world.

This is only for verbose mode, so the expectation is that we're always showing output.

that's not the case today:

$ pre-commit run trailing-whitespace --verbose --all-files
[trailing-whitespace] Trim Trailing Whitespace...........................Passed

But in verbose mode you're missing the hookid: ... bit, but adding that looks even weirder:

$ pre-commit run trailing-whitespace --verbose --all-files
[trailing-whitespace] Trim Trailing Whitespace..........................Running
hookid: trailing-whitespace
[trailing-whitespace] Trim Trailing Whitespace...........................Passed

Chad Dombrova · Answer 13 · Thu Apr 25 2019 08:42:51 GMT+0800 (China Standard Time)

I've been using https://github.com/pre-commit/mirrors-mypy basically without issue -- it doesn't run across all files and I think I've only hit a case where I forgot to update a callsite once -- and CI caught that because it was using --all-files -- I get that there's a trade off on correctness but pre-commit hooks aren't meant to catch every possible mistake you could make and are really meant to be the first line of defense. Running mypy just against the changed files is a good first line of defense and I really don't think the time trade off for running against all files at commit time is anywhere close to worth it.

It's sound advice, and I went back to test this out, but the problem is that if follow_imports = skip then it's super fast to run on individual files, but anything that's imported is marked as Any so a lot of stuff is missed. It doesn't even flag invalid identifiers (i.e. type annotations referring to types that don't exist at the module scope, e.g. because they have not been imported), which is one of the main things that users get wrong. If I set follow_imports = silent then it takes 15s to run on a single file, so I may as well be running it on the entire codebase in daemon mode :(

Anyway, I see your point on this, so I'll go ahead and close this out.

Aleksey Gurtovoy · Answer 14 · Sat May 01 2021 02:34:41 GMT+0800 (China Standard Time)

I still don't want to take on complexity for a feature I don't think should exist beyond debugging (verbose) and a use case that I believe to be an antipattern (always running against all files w/ mypy and having slow pre-commit hooks)

@asottile It's not just about pre-commit hooks, though. For example, consider a post-push DVC hook; large Git commits take time, and Git reflects that reality by showing us progress. DVC shows us progress too, but pre-commit eats that output and doesn't indicate in any way what's happening or how long the user still has to wait, which is a big hit to usability.

Anthony Sottile · Answer 15 · Sat May 01 2021 02:43:28 GMT+0800 (China Standard Time)

please read the thread. it isn't possible to do what you want

if you're unhappy, don't use the tool, fork the tool, or use legacy hooks