representation for initializing expressions and putting function return values in registers

Question

representation for initializing expressions and putting function return values in registers

zygoloid opened this issue 9 months ago · comments

We currently unconditionally permit returned vars in functions, and the implication seems to be that the assert in this testcase holds for any type T:

fn F[T:! type](v: T, p: T**) -> T {
  returned var ret: T = v;
  *p = &ret;
  return var;
}

fn G[T:! type](x: T) {
  var p: T*;
  var v: T = F(x, &p);
  Assert(&v == p);
}

But that's problematic: it means that a function like F that uses a returned var (or, transitively, initializes its return object from a function that uses a returned var) cannot return in registers, which means that our calling convention, at least for functions whose bodies we can't see, can't return in registers, even for types like i32 or ().

I think:

A type author should be able to choose whether their type is guaranteed to be returned in-place or not (and in the latter case might be returned in registers).
A function author should be able to choose whether they guarantee to return in-place or not.

As a starting point, how about this:

Any type T whose value representation is const T* defaults to being returned in-place; any type whose value representation is const T defaults to being returned by copy.
- Possibly this default can be overridden in some way, separately from customizing the value representation, or possibly not.
- For the in-place case, we could choose to both pass in a return slot and return a pointer to where the result actually is, putting the burden of performing an optional copy on the caller, or we could choose to require that the function actually initializes into the provided address.
- For a custom value representation, we could provide some mechanism to let the class author choose, or we could always use in-place initialization, or we could pass in a return slot and return a value representation and a bool indicating whether the return slot got initialized.
A function that uses a -> T return uses the default for the type; a function that uses a -> var T return guarantees to return in-place.
- returned var notation is only permitted in a function that is guaranteed to return in-place, either because the function uses -> var or (optionally) because the type requires an in-place return.

Richard Smith · Answer 1 · Wed Aug 23 2023 03:35:13 GMT+0800 (China Standard Time)

Another consideration: should the expression category of a function call always be "initializing", or should it follow the "guaranteed in-place" nature of the call? We may be able to avoid materializing some temporaries if we say that the expression category depends on the type:

fn F() -> i32;
fn G(n: i32);
// Does this materialize an i32 temporary, or does it just pass an i32
// value returned from F() directly into G()?
fn H() { G(F()); }

Also of note: there are definitely types for which it's reasonable for the value representation to be const T but for which the initializing representation should not be a simple value copy. For a type with a (non-trivial) destructor (eg, UniquePtr(U)), it's fine to form a const T value binding, pass it around, and use it while the original object is still alive. But it's not OK in general to return it by value from a function and have it outlive the original object, which will likely have been destroyed.

This suggests a refinement of the above rule: a type defaults to being returned in-place if either its value representation is not const T or it has a non-trivial destructor.

josh11b · Answer 2 · Wed Aug 23 2023 05:38:05 GMT+0800 (China Standard Time)

Isn't the relevant criteria whether the move constructor is a bit-copy?

UniquePtr(U) has a non-trivial destructor, but moving it is a bit-copy, so it is fine to return in a register. Since you can move by copying + destroying, types with trivial copy constructors and destructors also have trivial move, but it doesn't go the other direction.

Richard Smith · Answer 3 · Wed Aug 23 2023 06:06:15 GMT+0800 (China Standard Time)

Yes, I think instead of checking for a trivial destructor, we could check for a trivial destructive move. Either check seems OK to me, and it's plausible that we could encounter a type that has a trivial destructor but not trivial destructive move, so we could consider checking both properties.

josh11b · Answer 4 · Wed Aug 23 2023 06:08:12 GMT+0800 (China Standard Time)

Why check both though? UniquePtr(U) has trivial destructive move, non-trivial destructor, and should be allowed to be returned in a register.

Richard Smith · Answer 5 · Wed Aug 23 2023 06:11:33 GMT+0800 (China Standard Time)

Because there can exist types that have trivial destruction but not a trivial destructive move. Eg, if a type logs every time an instance is created, but has a trivial destructor, we can still return it in a register, because we're returning a value, not an object.

Edit: To be clear, we can return in registers if either condition holds.

josh11b · Answer 6 · Wed Aug 23 2023 06:35:46 GMT+0800 (China Standard Time)

I don't think so: consider this C++ class:

class CircularBuffer {
 public:
  CircularBuffer() : first(&data[0]), last(&data[0]) { }
  ~CircularBuffer() { }
  // ...
 private:
  char data[16];
  // points into data
  char* first;
  char* last;
}

Trivial destructor, but can't be returned in registers (since non-trivial destructive move / "not trivially relocatable").

I think the key issue is when the address of the object is captured, which isn't really related to whether the destructor is trivial.

Richard Smith · Answer 7 · Wed Aug 23 2023 07:27:51 GMT+0800 (China Standard Time)

Hm, right. The relevant constraint in C++ is "trivial copy or move + trivial destructor", and I was hoping we could get away with dropping the first half since we're not actually constructing the new object as part of the return (that could happen in the caller depending on how they use the resulting value), but yeah, there's still a lifetime issue if the object holds pointers into itself.

So perhaps "trivial destructive move" is the best we can do, and it's certainly logical and a good match to the C++ rule.

Richard Smith · Answer 8 · Thu Aug 24 2023 04:01:53 GMT+0800 (China Standard Time)

I think there are two different non-in-place optimizations here, and they have different constraints:

If, for a type T, a value created by a value binding can outlive the object that it is bound to, then a function can return a value representation for type T in registers. This seems like it should generally be possible if (1) the destructor for T is trivial (and so cannot invalidate any invariants of the information that the value representation describes), and (2) the value representation doesn't hold any handles to state whose lifetime is ending when the function returns (for example, pointers to the same object or to other objects that might be on the stack). The move operation is not relevant for this optimization.
If, for a type T, an object can be moved to a different address in memory, then a function can return an object representation for type T in registers, by pulling it out of memory in the callee and putting it back in memory in the caller. This should generally possible if the type supports a "move from one location in memory to another" operation that is either just moving the bits around, or if for example it supports some "prepare to be relocated" and "clean up after relocation" operations.

I suspect optimization (1) is more interesting, because it avoids going through memory unnecessarily, but it probably happens a lot less often because, for most interesting user-defined types, the value representation will be a pointer, which means it always contains a handle to state whose lifetime is ending when the function returns. In the case where the type's object and value representation are the same, these two optimizations seem like they might collapse to the same thing, but I think they actually don't, because optimization (2) is materializing an object in the caller (including, for example, a requirement to call a destructor) and optimization (1) is not.

josh11b · Answer 9 · Sun Sep 24 2023 11:49:07 GMT+0800 (China Standard Time)

If, for a type T, a value created by a value binding can outlive the object that it is bound to, then a function can return a value representation for type T in registers. This seems like it should generally be possible if (1) the destructor for T is trivial (and so cannot invalidate any invariants of the information that the value representation describes), and (2) the value representation doesn't hold any handles to state whose lifetime is ending when the function returns (for example, pointers to the same object or to other objects that might be on the stack). The move operation is not relevant for this optimization.

@zygoloid could you explain (1), maybe with an example? It also isn't obvious that you can construct an initializing expression from a value representation. It would be particularly useful if you give an example that can be returned due to (1) but not (2). I feel like criteria for (2) captures "nothing cares about the address of this object", which is what we really care about, but maybe there is another reason a type won't have a trivial/bit-copy move? My assumption was that "having trivial/bit-copy move" did not mean that we would materialized a representation in memory in order to then copy the bits -- I thought that criteria was sufficient to be able to do everything in registers.

Richard Smith · Answer 10 · Tue Sep 26 2023 07:17:50 GMT+0800 (China Standard Time)

It's a little hard to get to an example where optimization (1) is possible but (2) is not, because I think it requires the type to have a custom value representation, which isn't something I have a lot of experience with and examples of.

Here's a contrived example: suppose we have a type NumberAsText whose object representation is an (i32, String), where the string holds a cache of the formatted string representation of the number, and the value representation is just an i32.

Optimization (2) isn't possible for this type. It's not trivially relocatable, because the object representation contains a String, which is not trivially relocatable (assuming an SSO string representation).

But optimization (1) is possible. A value binding for this type can outlive the object that it binds to, because an i32 value doesn't refer to the storage from which it was created, so we can return this type in registers as an i32. (It's not completely clear whether the optimization is a good thing here; we're losing the cached string in the process, but that's something the type author opted into by excluding it from the value representation.)

github-actions · Answer 11 · Mon Dec 25 2023 09:48:30 GMT+0800 (China Standard Time)

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time.

This issue is labeled inactive because the last activity was over 90 days ago.