How to overload / specialize / dispatch based on the expression phase of the argument
chandlerc opened this issue
This is largely my memory & summary of an open discussion from a few weeks ago. It started with all of @josh11b, myself, and @zygoloid, but most of it was between just @zygoloid and myself so we didn't have notes taken live. I'm trying to capture everything I can though, as I think this gives enough context to begin really digging into this as a leads question.
Summary
We expect a common pattern where functions may be evaluated with either compile-time or run-time parameters:
```
fn CTMax(X:! i32, Y:! i32) -> i32;
fn RTMax(x: i32, y: i32) -> i32;
```
Note: there is a related question about how to mark that the function can be fully evaluated at compile time and produce a compile time result. But this issue is more focused on selecting between implementations based on different input expression phases.
Beyond an explicit pair of functions which vary in the expression phase of their arguments, it is important to also consider whatever interface system in generics is used to generically dispatch to some callable entity (a function, lambda, etc.). So far, this has been described as the `Call` interface.
After many discussions, it feels to me like there are three core directional options that emerge:

1. The `Call` interface is runtime only. If needed, explicit function overloads can overload on compile-time phase, but that uses its own mechanism and is explicitly not modeled in the generics layer.
2. Base dispatch and the writing of the `Call` interface on the type system by fully modeling compile-time vs. run-time. This would mean first changing types so that they say whether the values inhabiting them are compile-time or run-time in nature, which is also expected to obviate the need for `:!` in bindings, and then leveraging that as the basis for overloading.
3. The `Call` interface, specifically what parameters may be passed to it in a call expression, cannot be represented as a sequence of types. As a consequence, we must build a richer and more expressive way to describe this interface.
While there are a lot of important details to work out about each of these three in order to pursue it, it seems useful to understand which of these directions we're trying to pursue. At the least, choosing between these three directions seems like a reasonable request from the leads.
I'll write up some thoughts about each of these directions in subsequent comments, but I'm leaving the summary as just the high-level directions and encouraging others to edit that summary to capture things as much as possible.
Looking at option (1), it is very close to the status quo in Carbon. It is also very much the current approach of C++. So in some ways, this is the easiest direction to pursue, and C++ does seem to be coping OK with it.
However, C++ has also seen a steady and rising amount of pressure from the community to leave this position and become more expressive in precisely these ways.
Looking at (2):
One observation is that there is another direction that may seem distinct but that I think is usefully included in (2); let's call it (2b): only types are compile-time, and the values that need to be compile-time are exactly those values whose type carries the value fully -- for example, `typeof 42` in Carbon's current design for integer literals.
One concern that I had with all of the options in (2) is that I think they will end up motivating a bifurcation of the type system and vocabulary types. For each container or vocabulary wrapper, we'll need a runtime and a compile-time flavor.
One possible way to make this more amenable is to have a qualifier like `comptime` or `!` that can be applied to a type name in order to easily name both flavors of a type. However, that raises the question of how to handle cases where the definitions of the two flavors need to be substantially different.
Looking at (3):
Richard and I talked pretty extensively about this option. Specifically, we explored what it would look like to deeply embrace this direction and try to get something powerful.
What we came up with looked something like the following impl:
```
impl ... as Call(fn_type(a: i32, var b: i32, T:! type, x: T)) where .Result = i32;
```
But maybe with a `call` keyword (or some other keyword) to specially name the call interface, and have it be directly followed by a pattern just like a function declaration would be:

```
impl ... as call(a: i32, var b: i32, T:! type, x: T) -> i32;
```
The end result is that a function declaration:

```
fn F(a: i32, var b: i32, T:! type, x: T) -> i32;
```

is just syntactic sugar for something lower level:

```
class F_type {};
impl F_type as call(a: i32, var b: i32, T:! type, x: T) -> i32;
let F:! F_type = {};
```
Some observations about this direction:

- Signatures become first-class entities in the language.
- `impl` selection delegates to custom rules when selecting for a signature:
  - This is where we have whatever rules we want for overloading -- maybe you can overload on `:!`, but not on `template` or `var`.
  - Can model other signature relationships like virtual dispatch.
- Requires a way to reify a generic signature into a function declaration & definition:
  - Maybe a path to replace `like` with tools built on signatures and `call`.
- Need to build a coherence model for signatures:
  - Likely by generalizing the binary `like` rule -- for an N-ary call, N-1 arguments may implicitly convert and there must be one remaining argument that anchors for coherence.
  - Overloading takes the place of "specialization".
- Allows us to model even more exactly what C++ does with ADL, but in a principled way in Carbon on top of generics:
  - These signatures and the coherence rules are pretty much an exact model for how to have ADL without ODR and with clear rules for what is allowed.
We also saw some possibilities to further iterate the syntax to try to get to the old goal of having dedicated support for single-function interfaces with a single name, and the result just being part of the structure. But that's not essential.
(I think I've recovered all of this, but please let me know if I missed something. Images of the whiteboard notes which contain a few more examples can be found here: https://docs.google.com/document/d/1gnJBTfY81fZYvI_QXjwKk1uQHYBNHGqRLI2BS_cYYNQ/edit?resourcekey=0-ql1Q1WvTcDvhycf8LbA9DQ#bookmark=id.6zetfzq4wsme)
Lots of fresh discussion from today's open discussion session; decent notes (huge thanks to all!) here: https://docs.google.com/document/d/1gnJBTfY81fZYvI_QXjwKk1uQHYBNHGqRLI2BS_cYYNQ/edit?resourcekey=0-ql1Q1WvTcDvhycf8LbA9DQ#heading=h.x7xa7wz11td3
I'll try to provide my summarized take-aways here, happy for others to add theirs if I've missed or put too much of my feeling into this...
Super high level: everyone was pretty excited to pursue (3). There are some challenges there, but seems worth trying to address them. And seems better to do this now than wait. We also came to some agreement that (2) had too many problems to pursue at this point / in this form.
Read-on for more details...
Reiterated that there are three useful categories of functions to consider, which for now we often name using C++ terminology, though we may want our own terminology:

- Runtime-only functions: typically just "functions" in C++
- Runtime-or-compile-time functions: `constexpr` in C++
- Compile-time-only functions: `consteval` in C++
We talked about (2) quite a bit. One interesting point is that we already have some aspects of this emerging as useful things:
- Types are expected to largely have no runtime operations or usage -- you'll have to do something explicit to get even runtime metadata, much less a runtime-erased type.
- Some things are runtime-specific like the runtime heap.
Generally, it seemed like there was some motivation here, but when we dug into it, there were some really interesting issues.
- A `constexpr`-style function evaluated at compile time and taking an `i32` is very different from a function taking a compile-time `i32`. Both of them need a compile-time argument, but how they internally use that compile-time `i32` varies greatly -- the `constexpr` function wouldn't use it as part of a type, for example, but the other might reasonably do so.
- Having compile-time-ness as a binary property of the type breaks compositional type checking:
  - Consider `Vector(i32!)` (where `i32!` is the spelling of a compile-time-only `i32`) -- you expect to be able to add compile-time integers to this container and extract them from it. But what you get back can't be a compile-time integer!
  - Even something like `Vector(typeof 3)`, where `typeof 3` is a compile-time type with only one value, `3`, inhabiting it, doesn't work. You still can't extract a `typeof 3` from it, because the extraction might fail.
- The simple ways of being generic over all types aren't actually sufficient to be generic across runtime and compile time. We would need more constraints to model this, and that might be confusing for users who don't expect to have to work with them.
Adding these interesting insights to some of the prior ones (the bifurcation of vocabulary types), everyone seemed really happy saying "no" to (2).
We also talked some about (3). General excitement about the direction.
One issue is whether this is syntactic sugar or a primitive. To a certain extent, it may not matter too much, but there were arguments in several directions.
One concern raised is that we may find a tension where we want selecting an `impl` for a signature to use a really different model than what we have desired for types. There was general agreement that this could cause a bit of a schism, and that would be bad. But the goal is to avoid that, and to try to keep the rules as unified and integrated as possible.
One thing we need to figure out is how to make these constructs actually work in all the relevant ways. For example, how do you match a constraint and an impl?
```
interface I(X:! Signature) {
}

class C {};
impl C as I(callable (a: i32));

// Want a call that can match C given an argument list, not a signature.
fn F(a: i32, T:! I(call(a))) { ... }
```
This idea resulted in us thinking about needing to articulate both an abstract "call" argument list, and an abstract "callable" pattern list that would match those calls.
So when we have `F(a, b)`, we turn that into `F as call(a, b)`, and look for an `F` with an `impl callable` whose pattern matches `a` and `b`, like `callable(x: i32, y: i32)`.
Just a comment that while I think we do want to pursue option (3), I don't think that should block #2875, which currently implements option (1), other than to note that there is more work to be done to incorporate pattern information beyond type in the `Call` interface or overload resolution (see https://github.com/carbon-language/carbon-lang/pull/2875/files#r1310931830 ).