The case for dropping JUMPF and non-returning functions
gumb0 opened this issue · comments
The main use case for JUMPF was being able to call a non-returning helper without the requirement to have equal stack height at each call site and without the need to POP extra items before calling.
In the current spec:
- If such a helper is implemented inside the caller section (without CALLF/JUMPF), it is allowed to call it from different stack heights.
- If such a helper is in a separate section, we can make it work without the need for the non-returning flag and JUMPF: non-returning functions are declared with 0 outputs, and the requirement to not POP extra items is achieved with a CALLF STOP or CALLF INVALID sequence.
The difference with JUMPF would be 4 bytes of code instead of 3 bytes and an item pushed onto the call stack at run-time.
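The byte counts can be checked with a minimal sketch. It assumes a one-byte opcode followed by a 16-bit section index; the CALLF and JUMPF opcode values and the helper section index below are placeholders for illustration, not taken from this thread.

```python
# Opcode byte values for CALLF/JUMPF are assumptions for illustration;
# check them against the EOF spec revision you target.
STOP = 0x00
CALLF = 0xE3  # assumed value
JUMPF = 0xE5  # assumed value


def with_section_immediate(opcode: int, section: int) -> bytes:
    """Encode an opcode followed by a 16-bit big-endian section index."""
    return bytes([opcode]) + section.to_bytes(2, "big")


helper_section = 3  # hypothetical index of the non-returning helper

# Call site using JUMPF: opcode + 2-byte immediate.
jumpf_site = with_section_immediate(JUMPF, helper_section)

# Call site using CALLF followed by STOP: one extra byte.
callf_stop_site = with_section_immediate(CALLF, helper_section) + bytes([STOP])

print(len(jumpf_site), len(callf_stop_site))
```

The one-byte difference applies per call site, so it adds up when a contract has many revert paths.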
what is the current requirement for CALLF to pop extra items?
"what is the current requirement for CALLF to pop extra items?"
There is no such requirement for CALLF, but there is one for JUMPF to returning functions: it doesn't allow an unbalanced stack, same as RETF.
So if we replaced JUMPF with a CALLF RETF sequence, that would require popping extra items.
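A toy model of these stack-height rules, as a sketch only. The `SectionType` class and the three check functions are hypothetical names, and the exact "balanced" formula for JUMPF to a returning function is an assumption based on one reading of the spec; the thread itself only states that extra items are not allowed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionType:
    inputs: int
    outputs: int
    returning: bool = True


def callf_ok(stack_height: int, target: SectionType) -> bool:
    # CALLF only needs the callee's inputs to be present; extra items
    # below them may stay on the stack.
    return stack_height >= target.inputs


def retf_ok(stack_height: int, current: SectionType) -> bool:
    # RETF must leave exactly the declared outputs, no extra items.
    return stack_height == current.outputs


def jumpf_ok(stack_height: int, current: SectionType, target: SectionType) -> bool:
    if not target.returning:
        # Non-returning target: extra items are allowed.
        return stack_height >= target.inputs
    # Assumed "balanced" rule: once the target has consumed its inputs and
    # produced its outputs, the stack must hold exactly the outputs of the
    # section that executed JUMPF.
    return stack_height - target.inputs + target.outputs == current.outputs


current = SectionType(inputs=0, outputs=1)
helper = SectionType(inputs=1, outputs=1)
revert_helper = SectionType(inputs=1, outputs=0, returning=False)

# Two extra items on the stack (height 3, helper needs 1):
print(callf_ok(3, helper))                 # CALLF accepts extra items
print(jumpf_ok(3, current, helper))        # JUMPF to returning helper rejects them
print(jumpf_ok(3, current, revert_helper)) # JUMPF to non-returning helper accepts them
print(retf_ok(3, current))                 # RETF after CALLF would reject them
```

In this model, CALLF followed by RETF fails at the RETF unless the extra items are popped first, which is the cost the comment above refers to.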
hmm. it seems JUMPF is useful here for "stack-unwinding" type operations. just brainstorming here but maybe there could be another type (marked with a sentinel outputs value) of code section which when you CALLF it basically works like the currently spec'ed JUMPF?
Asked this in discord, but sharing it here for archives
"Hey I still dont understand the benefit of JUMPF. For non-returning function wouldn't that be just ordinary CALLF with zero output, what are we optimizing with JUMPF and how would alternative without JUMPF look like?"
One of the main arguments for JUMPF was code deduplication. Smart contract functions involve an abundance of revert conditions, each of which is potentially supposed to emit one of a small number of revert messages indicating error conditions. Previous versions of the specification did not allow deduplicating jumps to such reverts within a function, at least not without stack cleanup (which itself significantly inflates code size to a degree that should not be underestimated), due to the requirement to have equal stack heights (this seems to be mitigated in the current spec), but IIRC also required stack cleanup on termination, so even e.g. in a CALLF STOP scenario (I may be wrong about that, though).
In any case, it seems like the latest spec mitigates some of these very strong points in favour of JUMPF, but there is still a case to be made for JUMPF:
- It shouldn't be underestimated how prevalent small non-returning code paths are in smart contract code - and the pre-EOF-EVM has an important property here: by simply collapsing jump targets, code deduplication in cases like this is free in the current EVM (zero gas difference, unconditional code size gain). We need to be careful for EOF not to become a regression here that both complicates code deduplication by requiring more complex tradeoffs and by generally inflating code size. Allowing cross-function deduplication of non-returning code paths with JUMPF, but also of returning code paths (functions with a similar return value structure tend to share cleanup code that can be deduplicated using JUMPF - I'm not sure, but that may be what @charles-cooper had in mind with "stack-unwinding" type operations; deduplication here pre-EOF is also "free" without involving any tradeoff) may help towards that. In this context the expectation was also that JUMPF would in fact be cheaper than CALLF (due to not having to manipulate the return stack).
- JUMPF enables tail-call-optimizations, i.e. the unlimited continuation of a function by handing over execution to another function at minimal cost and without return stack depth limitation.
That being said, I haven't had the time yet for a full analysis of the impact of JUMPF in the latest revision of the specification; it's definitely greater than zero, though.
"but IIRC also required stack cleanup on termination, so even e.g. in a CALLF STOP scenario (I may be wrong about that, though)."
i think this is mitigated by changing the validation rules for non-returning code sections. so if you have a code section foo <non returning> <code: ... REVERT>, this can be CALLFed and behaves like the current JUMPF to non-returning function.
"I'm not sure, but that may be what @charles-cooper had in mind with "stack-unwinding" type operations"
yes -- one currently useful case for JUMPF is either shared cleanup blocks for different subroutines or exception handling. like an exception handling mechanism can be implemented by JUMPFing to a shared block which knows how to propagate any exception handling data structures or knows how to halt propagation (catch the exception).
"JUMPF enables tail-call-optimizations, i.e. the unlimited continuation of a function by handing over execution to another function at minimal cost and without return stack depth limitation."
this is one thing i'm not convinced is super useful about JUMPF. tail call optimization can be implemented in compilers using regular jumps (at least for functions which recurse into themselves -- for corecursive, i haven't analyzed it yet but i think it is the same).
"We need to be careful for EOF not to become a regression here that both complicates code deduplication by requiring more complex tradeoffs and by generally inflating code size."
i agree fully here. actually i think there is a strong case to be made for bringing back a single global code section a la EIP-2315. i have been told that EIP-2315 was infeasible or doesn't address certain use cases, but i don't fully understand what the issues are here.
"this is one thing i'm not convinced is super useful about JUMPF. tail call optimization can be implemented in compilers using regular jumps (at least for functions which recurse into themselves -- for corecursive, i haven't analyzed it yet but i think it is the same)."
Well, that restricts you to a single function frame, so it would require to inline the entire graph of any corecursion. And tail calls are not necessarily recursive - in the end it's just the more general case of "shared cleanup blocks" for cases in which more logic is shared among functions.
In any case, just to be clear: at least for us in Solidity, assuming the weakened stack validation that the current version of the spec now involves, our last implementation of a previous specification that was very similar in these aspects still resulted overall in a net win not only in gas, but also in code size, due to independent savings e.g. of jumpdests. So I don't think there is a strong case for a radical change such as moving towards a single global code section, especially since a clear split into function sections does have advantages in terms of simpler analyzability. What we're talking about here with JUMPF, as far as I'm concerned, are minor tweaks to ensure the minimal amount of concessions to code deduplication compared to the current EVM - but also to be clear: if really need be, we could work without JUMPF - I'd just argue that it does remain useful despite the more relaxed stack validation.
JUMPF imo is the only EIP that is questionable; the other EIPs have a stronger purpose, and the reason for them is obvious as they solve a problem. This seems not to be the case for JUMPF.
We of course don't want to introduce a regression, but on the other hand we don't want to include something that turns out not to be useful or that can be done in a different, maybe simpler way.
What I would like to know is what we are optimizing for; it is not that clear to me. JUMPF acts like an ordinary function, but "Knowing at validation time that a function will never return control" then that means it is forced to RETURN/REVERT/STOP (I am concluding this from "It is particularly beneficial for small error handling helpers, that end execution with REVERT").
ref this: EIP-6206
A few comments/questions:
- tail-call-optimizations are mostly useful for recursion, at least on ordinary CPUs; how much are recursions found inside solidity/vyper code?
- Isn't JUMPF just a CALLF that does RETURN? Is code/stack validation the problem here?
@ekpyron @charles-cooper thank you for discussing this
"then that means it is forced to RETURN/REVERT/STOP"
It is not, it can also RETF, but since JUMPF didn't push to the return stack, it acts as if it RETFed from the JUMPF caller section. The "It is particularly beneficial" sentence only refers to the case where we JUMPF into a non-returning section.
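The RETF-after-JUMPF behaviour can be illustrated with a toy interpreter. This is a sketch under simplifying assumptions: it models only control flow and the return stack (no operand stack, no gas), and `TRACE` is a hypothetical pseudo-instruction used to record which section is executing.

```python
def run(sections, max_steps=1000):
    """Execute toy sections starting at section 0.

    Returns the recorded trace and the maximum return-stack depth reached.
    """
    trace = []
    return_stack = []  # (section, pc) pairs pushed by CALLF only
    max_depth = 0
    section, pc = 0, 0
    for _ in range(max_steps):
        op, *args = sections[section][pc]
        if op == "TRACE":
            trace.append(args[0])
            pc += 1
        elif op == "CALLF":
            # Remember where to continue, then enter the callee.
            return_stack.append((section, pc + 1))
            max_depth = max(max_depth, len(return_stack))
            section, pc = args[0], 0
        elif op == "JUMPF":
            # Hand over execution without touching the return stack.
            section, pc = args[0], 0
        elif op == "RETF":
            # Return to whoever last executed CALLF.
            section, pc = return_stack.pop()
        elif op == "STOP":
            return trace, max_depth
        else:
            raise ValueError(f"unknown instruction {op}")
    raise RuntimeError("step limit exceeded")


sections = [
    [("CALLF", 1), ("TRACE", "back in 0"), ("STOP",)],  # section 0
    [("TRACE", "in 1"), ("JUMPF", 2)],                  # section 1
    [("TRACE", "in 2"), ("RETF",)],                     # section 2
]

trace, max_depth = run(sections)
print(trace, max_depth)
```

Section 2 returns directly to section 0, as if section 1 itself had executed RETF, and the return stack never grows beyond the single entry pushed by the CALLF in section 0.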
As an aside, I slipped an error into the megaspec's write-up of relaxed stack validation. That might have caused confusion, apologies if that was the case. See #44 fixing the error.
I thought of another argument in favor of having the non-returning flag and JUMPF: without them, code after a CALLF to a non-returning function is unreachable in practice, but will be considered reachable by the validation. In other words, the CALLF STOP sequence can be abused as CALLF <code> STOP, and then <code> has to be validated and can lead to validation errors, despite being unreachable in practice.
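A toy validator makes the quirk concrete. This is a sketch only: `validate` and `STACK_EFFECT` are hypothetical names, the instruction set is reduced to CALLF, POP and STOP, and the validator has no way to know that the callee never returns.

```python
# (pops, pushes) for the non-call instructions of this toy instruction set.
STACK_EFFECT = {"POP": (1, 0), "STOP": (0, 0)}


def validate(code, callee_inputs=0, callee_outputs=0, height=0):
    """Linear stack validation that assumes every CALLF returns.

    Returns None if the code is accepted, otherwise an error message.
    """
    for op in code:
        if op == "CALLF":
            pops, pushes = callee_inputs, callee_outputs
        else:
            pops, pushes = STACK_EFFECT[op]
        if height < pops:
            return f"stack underflow at {op}"
        height += pushes - pops
        if op == "STOP":
            return None
    return "code does not end with a terminating instruction"


# Plain CALLF STOP is accepted.
print(validate(["CALLF", "STOP"]))

# CALLF <code> STOP: the POP never runs if the callee always reverts,
# yet the validator still analyses it and rejects the section.
print(validate(["CALLF", "POP", "STOP"]))
```

With a non-returning flag, a validator could treat the call as terminating and would not need to analyse anything after it.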
In this way the non-returning flag really makes the EOF Functions spec complete. Without it, some functions that are non-returning in practice, but are not declared as such, will cause quirks like this.
"Well, that restricts you to a single function frame, so it would require to inline the entire graph of any corecursion."
I mean this is actually kind of a strong argument for a global code section, right?
"how much are recursions found inside solidity/vyper code?"
Vyper actually disallows recursion entirely. Even if it allowed recursion, I think generally getting to a recursion depth where tail call optimization is needed is a code smell.
On the EOF Implementers Call #31 we agreed to keep JUMPF and non-returning functions in the specification for now. It can easily be dropped later, should we decide so closer to deployment, but re-adding it would not be possible.
We do want to get actual measurements from the Solidity and/or Vyper implementation in the upcoming months to validate this question, and ultimately keep or remove this feature.
Given the above discussion the call felt there's a slight leaning towards JUMPF being useful, but it is to be validated via actual compiler feedback.