Is `.base` permitted as a struct member? Is it the same as `.r#base`?
zygoloid opened this issue · comments
Summary of issue:
We support initializing the base class of a derived class using `.base`:
base class A { var a: i32; }
class B {
extend base: A;
var b: i32;
}
var b: B = {.base = {.a = 1}, .b = 2};
But what does it mean to use the keyword `base` as a field name in a struct?
Details:
Some options:
1. `.base =` is special syntax that can only be used in a struct literal that is converted to a class.
// Error, struct value cannot have field named `base`.
var a: auto = {.base = 5};
This means that we can't forward such values into a function that will construct an instance of a class:
fn F[U:! type](T:! type where U is ImplicitAs(T), x: U) -> T { return x as T; }
// Error!
var b: B = F(B, {.base = {.a = 1}, .b = 2});
2. `.base` means the same thing as `.r#base` -- `base` is treated as a non-keyword in this context, despite being a keyword in other contexts. This would be problematic for interop with C++ classes with a `base` member:
var it: Cpp.std.reverse_iterator<It>;
// Ambiguous: is this the base class of the iterator, or is it the `base` member function?
var a = it.base;
3. `.base` is a different name from `.r#base`, and structs can have a field with that special name. That field is not special in a struct, it's just another field name.
// OK, two fields with different names.
var s: {.base: i32, .r#base: f64} = {.base = 5, .r#base = 7.0};
// Error, field notakeyword specified multiple times.
var t: auto = {.notakeyword = 1, .r#notakeyword = 2};
4. `.base` is a different name from `.r#base`, and structs can have a field with that special name. A struct implicitly extends its `base` field, if present, just like a class explicitly extends it.
var xy: {.x: i32, .y: i32} = {.x = 1, .y = 2};
// OK, xyz is of type {.base: {.x: i32, .y: i32}, .z: i32}
var xyz: auto = {.base = xy, .z = 3};
// xyz is {.base = {.x = 1, .y = 2}, .z = 3}
// OK, found in base.
var x = xyz.x;
This makes the struct behave a bit more like the class that it converts into.
5. We can extend (4) to get a struct update syntax:
// We can allow flattening if all field names match, when the destination is a struct with no `.base`.
var xyz: {.x: i32, .y: i32, .z: i32} = {.base = xy, .z = 3};
// xyz is {.x = 1, .y = 2, .z = 3}
// If we allow there to be unused fields in the source (perhaps only if they're shadowed),
// we can then perform struct updates via `.base` too.
var new_xyz: {.x: i32, .y: i32, .z: i32} = {.base = xyz, .y = 4};
// new_xyz is {.x = 1, .y = 4, .z = 3}
However, it's not clear to me how this would work if the destination also has a `.base`. One possible approach would be to initialize the base-most destination from the base-most source, and so on until we reach a level where only the source is still a base, and then flatten. (And reject if the destination has deeper inheritance than the source.) If we want this kind of flattening, perhaps a different syntax would be preferable:
// (Placeholder syntax.)
var xyz: auto = xy with {.z = 3};
// xyz is {.x = 1, .y = 2, .z = 3}
var new_xyz: auto = xyz with {.y = 4};
// new_xyz is {.x = 1, .y = 4, .z = 3}
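To make the interop concern in option 2 concrete: `std::reverse_iterator` does expose a member function named `base()`, which returns the underlying iterator. The C++ sketch below shows that member in use. The helper name `BaseOffset` is hypothetical and only there for illustration; this is the name that `it.base` would collide with if `base` were treated as an ordinary identifier in member access.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (not from the issue): returns how far the iterator
// underlying `rit` sits from the start of `v`.
std::ptrdiff_t BaseOffset(const std::vector<int>& v,
                          std::vector<int>::const_reverse_iterator rit) {
  // `base()` is a member function of std::reverse_iterator; it yields the
  // wrapped forward iterator, which points one past the element `rit` refers to.
  return rit.base() - v.begin();
}
```

So a C++ class can legitimately have a member called `base` that is unrelated to any base class, which is the ambiguity the `it.base` example points at.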
Options 4 and 5 are the most tempting to me.
Option 4 seems good.
Regarding option 5, a few things I'd suggest considering before adopting the noted implicit flattening/nesting conversions via struct update syntax:
- For background, considering C++, I think this is a change in behavior which may lead to different results. `.base` gives a syntax to update the parent without confusion about whether the field is on the child, which is not in C++ and would likely be helpful; still, there's a distinction between base and child structs. Maybe there's some way to make struct initialization work, but the naive way doesn't:
struct A { int x; };
struct B : public A { int y; };
// Error: field designator 'x' does not refer to any field in type 'B'
B c = {.x = 0, .y = 1};
struct D : public A { int x; int y; };
// Valid C++ code.
D e = {.x = 0, .y = 1};
- In most cases of name ambiguity, I would expect ambiguities to be resolved by adding qualifiers. In this case though, ambiguities are resolved by specifying more things: the original designated name remains. This feels a little unusual to me.
// An error due to ambiguity in the `.x` initialization?
var a: {.base: {.x: i32}, .x: i32} = {.x = 0};
// Now unambiguous.
var b: {.base: {.x: i32}, .x: i32} = {.base = {.x = 0}, .x = 1};
- What about classes and tuples? In classes, making `base` optional seems like it would be similar, although it may raise concerns about type safety. In tuples, I think it'd already been discussed and the decision was to treat nesting as requiring explicit flattening, not an implicit conversion. For example:
// The struct literal example:
var a_struct: {.base: {.x: i32, .y: i32}, .z: i32} = {.x = 1, .y = 2, .z = 3};
var b_struct: {.x: i32, .y: i32, .z: i32} = a_struct;
// Similar in classes:
class BaseT { var x: i32; var y: i32; }
class ChildT { extend base: BaseT; var z: i32; }
var a_class: ChildT = {.x = 1, .y = 2, .z = 3};
var b_class: {.x: i32, .y: i32, .z: i32} = a_class;
class FlatT { var x: i32; var y: i32; var z: i32; }
var c_class: FlatT = a_class;
// Similar in tuples:
var a_tuple: ((i32, i32), i32) = (1, 2, 3);
var b_tuple: (i32, i32, i32) = a_tuple;
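For comparison with the first two bullets, here is a minimal C++20 sketch of how C++ handles these cases today: a base class subobject is initialized positionally with its own nested brace list rather than by a designator, inherited members are found by ordinary member lookup, and a shadowed base member is reached by qualifying the name. The type names mirror the C++ snippet in the first bullet; `MakeB` and `MakeD` are hypothetical helpers added only for illustration.

```cpp
struct A { int x; };
struct B : public A { int y; };
struct D : public A { int x; int y; };

// The base subobject is the first aggregate element, so it gets its own
// nested braces; a designator cannot name a base class.
constexpr B MakeB() { return B{{.x = 0}, 1}; }

// `D::x` shadows `A::x`; the base member is reached through a qualified name.
constexpr D MakeD() {
  D d{};       // value-initializes A::x, D::x, and D::y to 0
  d.A::x = 7;  // base member, selected by qualification
  d.x = 1;     // derived member
  d.y = 2;
  return d;
}

static_assert(MakeB().x == 0 && MakeB().y == 1);  // `.x` is found in the base
static_assert(MakeD().A::x == 7 && MakeD().x == 1);
```

In C++ the disambiguation is done by adding a qualifier (`A::x`), which is the pattern the second bullet says it would normally expect.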
Regarding the `with` keyword in `var xyz: auto = xy with {.z = 3};`, I think that is more explicit, and so doesn't have the same concerns as implicit casts, although name ambiguities may still be odd to resolve [e.g. `var a: {.base: {.x: i32}, .x: i32} = val with {.base = {}, .x = 3};`].
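As a rough analogy for what the placeholder `with` syntax is meant to express, the same results can be spelled out by hand in C++20: widening copies each source field and supplies the new one, and updating copies the whole value and overrides one field. The type names `XY` and `XYZ` and the helpers `Widen` and `WithY` are hypothetical; nothing here reflects how Carbon would implement the feature.

```cpp
struct XY { int x; int y; };
struct XYZ { int x; int y; int z; };

// Analogue of `xy with {.z = z}`: copy each source field, then add `z`.
constexpr XYZ Widen(const XY& xy, int z) {
  return XYZ{.x = xy.x, .y = xy.y, .z = z};
}

// Analogue of `xyz with {.y = y}`: copy everything, then override `y`.
constexpr XYZ WithY(XYZ xyz, int y) {
  xyz.y = y;
  return xyz;
}
```

With `xy` equal to `{1, 2}`, `Widen(xy, 3)` gives the fields 1, 2, 3, and `WithY` applied to that result with 4 gives 1, 4, 3, matching the values in the placeholder example.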
I think we are only really supporting struct -> class conversions, not the other direction.
FWIW, I also like option 4.
Options 1-3 seem likely to end up with some amount of friction, whereas option 4 seems to really elegantly express the range of things we want in struct literals and provide important functionality for initializing base classes.
We can always revisit option 5 if/when we have motivation and some ways to deal with both the issues raised in the original description and by Jon.
I think option 4 is my preference here -- it gives a consistent behavior for the name `base` in classes and structs, seems (at least a little) useful for structs independent of the class initialization use case, and fits nicely into the class initialization use case.
Option 5 seems a little too implicit and do-what-I-mean-ish to me, and a more explicit struct update syntax seems like a better fit for Carbon's design aesthetic.
Let's call this decided with option 4 -- seems we have enough consensus among leads and no strong objections. And the comments above have lots of good points of rationale, including from leads.