Mutating const functions can cause the build to hang
zaneduffield opened this issue · comments
I understand why the standard timeout doesn't apply to cargo build
, but it's possible for a mutation to be made in a const
function which is evaluated at compile time. This mutation can cause an infinite loop that will then stall the entire pipeline.
If there were a separate option for a build timeout, then this could be prevented. Much like the test timeout, this value could be inferred from the baseline build.
There are of course workarounds I can use (--exclude-re
, #[mutants::skip]
), so this isn't a blocker.
P.S. If this feature is welcome, I'm happy to have a go at implementing it.
Thinking about this some more, I remembered the const_eval_long_running
lint that was added back in rust 1.72.
It's supposed to be deny-by-default, but I was experiencing compiles stuck for hundreds of seconds longer than normal after a const function mutation.
Hi, thanks, that makes sense to add, thanks for reporting it.
If you would like to try a PR that would be very welcome and it should not be too hard, by mirroring the existing timeout
options.
For test timeouts there's a multiplier used on the baseline value and perhaps we should use the same approach here, so you'd want a build-timeout-multiplier
applied to the baseline build, if there was one. Maybe this should be 5 by default too, especially since we'd expect the incremental builds to generally be much faster than the baseline.
- Add into
Options
- Add the CLI option in
main.rs
and wire it through toOptions
- Add a unit test that the CLI option appears in Options
- In
cargo.rs
use the option to set a timeout on the command - Document it in
book/src
- Maybe we should add a specific
ScenarioOutcome
for build timeout, distinguishing it from a test timeout; you'd need to change the code that builds these from the outcomes of individual phases - Add an integration test that tries to build the
well_tested
tree (which is relatively large) with a very short timeout like 0.01s; check it fails as expected. (This has some risk of being flaky but hopefully not too much.)
Thinking about this some more, I remembered the
const_eval_long_running
lint that was added back in rust 1.72. It's supposed to be deny-by-default, but I was experiencing compiles stuck for hundreds of seconds longer than normal after a const function mutation.
The weird thing is that when I manually apply the exact same mutation that causes cargo mutants
to hang, I trigger this lint
error: constant evaluation is taking a long time
which terminates the compile.
Could cargo-mutants
somehow be compiling with this lint disabled?
Ah, I think I may have found it
Lines 181 to 185 in 79f2f8a
Oh well done! That is probably the cause.
So it seems like there are at least three options:
- impose our own timeout
- do something else to control lints
- don't mutate compile-time code
There is possibly some value in capping compile time: perhaps the timeout can be set more precisely than rustc's heuristic, or maybe it would catch other things causing the compiler to hang. On the other hand the lint will give a better message, and letting the compiler handle it seems simpler.
cap-lints does not seem to have a way to cap most lints but to leave const_eval_long_running
enabled.
Maybe instead of --cap-lints
we should use -A
to turn off lints or lint groups like deny
that are likely to cause unwanted errors on mutated code. We should check how this interacts with annotations inside the code: ideally if someone has #[deny(unused)]
that would still get switched off. There might be some closed bugs with history on this.
https://doc.rust-lang.org/rustc/lints/levels.html#via-compiler-flag
Maybe just not mutating const functions would be a good idea?
It seems like it could be tricky to stop using --cap-lints
without introducing new bugs. I don't know how many relevant lints there are that need to be ignored (I guess it's any lint that could be triggered by a mutation - given the mutated code can choose the severity).
Unfortunately I don't think there is a lint group for deny
. There is a lint group for unused
, which is probably the most likely to be triggered by a mutation, but I can imagine other cases where a lint is triggered that wouldn't be in unused
.
Another reason why the lint-based solution might not be the best, is that a user might have configured #[allow(const_eval_long_running)]
because they know that their const function is slow. In that case the lint no longer indicates when a function has entered an infinite loop.
Not mutating const functions would probably be the simplest solution, but would be an annoying limitation as it also limits how much 'runtime' code is being mutated.
Considering all of that, I think capping compile time is the best option.
Sorry, I mean to say the lint group unused
, which is most likely to be triggered by mutations.
But, from memory, these are overridden by annotations in the source code, so just changing them there is not enough.
Another option is that cargo-mutants could insert #![allow(all)]
in the code that it modifies, but I think this would have to go at the start of the scope so it all gets a bit complicated, especially if there are any existing annotations.
Inserting #![allow(...)]
does sound tricky. As you say, it is overridden by subsequent #![deny(...)]
attributes.
What do you think? Should we try and insert something like
#![allow(unused)]
#![deny(long_running_const_eval)]
at the start of every modified file and then go delete all the #![deny]
instances in that file?
Or should we have a build timeout?
I think we should add a build timeout first for a few reasons:
- It should be a fairly simple extension of the timeout features that already exist
- I can't think of any downsides or cases where you'd regret having it
- It should be fairly watertight against anything that can cause a build to hang
That's probably better than skipping const functions, which would cause some coverage loss.
It is probably also worth separately thinking or experimenting with how to better control lints that might make mutants unviable for shallow reasons. Perhaps a good option is to just remove every deny
and forbid
attribute. There is a complication that lints can be set in the crate's top-level file (https://doc.rust-lang.org/rustc/lints/levels.html#via-an-attribute) and so we'd need to modify more than one file at a time, which is a bit more complicated than what cargo-mutants does today.
Or, maybe -Aunused
would be enough: I don't remember whether attributes override it, and https://doc.rust-lang.org/rustc/lints/levels.html does not specify. But from memory, this didn't work.