dart-lang / language

Design of the Dart language

Design for introspection on macro metadata / annotations

davidmorgan opened this issue · comments

Related: #3522

@johnniwinther @jakemac53 is there an issue already open for the investigation Johnni's been doing into macro metadata?

I couldn't find one, so here's a fresh one :) and I'll close #3522 in favour of this one since the discussion there does not seem super helpful.

Also related: #3728 for the move to use annotations for macro metadata.

Something similar to source_gen's TypeMatcher here would be cool.
It's likely going to be quite a common need.

I've been using a pattern for configuring macros with application-specific data: extend a base macro with a subclass that bakes in the configuration, then apply the subclass macro to the project code.

// pet_store_client/macros.dart

// OpenApiTypeBuilder is the base macro (defined elsewhere); its exact
// constructor signature is assumed here. Each application passes a JSON
// pointer to the schema definition it should build from.
macro class PetStoreApiBuilder extends OpenApiTypeBuilder {
  const PetStoreApiBuilder(
    String schemaRef, {
    bool missingKeywordWarning = false,
  }) : super(
          schemaRef,
          contents: petStoreSchemaContents,
          $vocabularies: jsonSchemaVocabularies,
          missingKeywordWarning: missingKeywordWarning,
        );
}

// schema.json contents could also be pulled from Resources API, once available
const petStoreSchemaContents = r'''{
  "$id": "https://spec.openapis.org/oas/3.1/schema/2022-10-07",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  /* ... */
}''';

Usage:

// pet_store_client/api.dart

@PetStoreApiBuilder(r'/$defs/info')
class Info { }

@PetStoreApiBuilder(r'/$defs/components')
class Components {}

@PetStoreApiBuilder(r'/$defs/server')
class Server {}

/* ... */

This pattern feels a lot nicer to me than the build.yaml builder configurations; configuring macro metadata within Dart code makes a lot of sense (which is why I'm hoping that there's some reconsideration re: declaring macros in pubspec.yaml, but I digress). Having dedicated project files/directories for metadata-related configs is something I've wanted in lieu of yaml-based configurations for builders and analyzer plugins; it may be nice to have an accepted pattern for this :)

Hi @pattobrien, configuration with yaml files is over at #3728, this issue is about introspection on macro annotations and other annotations ... sorry, I didn't really make that clear in the issue, let me try to clarify a bit.

Re: Dart vs yaml, the issue is that analyzers/compilers need to know about macro applications very early on; it's a correctness/performance headache to have them only in the code.

Hi everyone :) @jakemac53 @scheglov this is my writeup of @johnniwinther's exploratory work for this issue after discussion with him. It's intended to kick off a round of discussion on where to go next.

Macro Metadata

Macro users need to be able to tell macros what to do.

This can take the form of values "passed" to the macro application, or of values in a normal annotation that one or more macros will inspect.

@MyMacro(foo: true, bar: false, baz: [SomeOtherClass])
class SomeClass {}

@NotAMacro(foo: true, bar: false)
class SomeOtherClass {}

Because we plan to unify macro applications and annotations this reduces to the problem of introspecting on annotations; and because annotations must be const, this mostly means introspecting on consts.

Constraints

Macros are compiled independently of the program they are applied in, so a const can be of a type unknown to the macro.

  • Implying that, in the general case, macros must introspect on the const rather than receive it as a value.

Macros run before consts are evaluated; the program might be incomplete when they run, meaning evaluation is impossible; and even if evaluation is possible the value might change during macro execution.

  • Implying that in the general case what macros must introspect on is not exactly a value; in cases where a value cannot be computed we still want to pass what the macro host does know; there should be a concept of "this is not known yet"; and we must decide what to do about changes.

Macros sometimes want to reproduce "values" passed to them in their generated output.

  • Implying that sometimes what the macro cares about is not the value, but rather a way to tell the host to write code that will create an instance of the value, or some "part" of it, at runtime.

Proposal

Given the constraints, the solution appears to be to give macros access to an "enhanced" AST model for const expressions:

  • As with any AST model: it can be inspected structurally, so a macro can pull out and process the different "arguments" and, when they have structure, the "nested arguments", as well as consts referred to and their structure;
  • Enhancement: the model or any part of it can be evaluated, in cases where evaluation is possible;
  • Enhancement: there is a representation for unresolved values;
  • As with any AST model: the model or any part of it can be output by the macro as code; the "enhancement" is in making this possible even when there are unresolved values in the model, by means of the host tying them to the corresponding resolution results when available.
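
As a very rough sketch of what these enhancements might look like as Dart API; every name below is hypothetical and the surface is deliberately shrunk to a single interface, nothing like the real proposal's size:

// Hypothetical shapes only, not the proposed API; illustrative of the four
// bullets above.
abstract class Code {}

abstract class ConstExpression {
  // Structural inspection: e.g. a constructor invocation exposes its
  // positional and named arguments as further ConstExpressions.
  List<ConstExpression> get positionalArguments;
  Map<String, ConstExpression> get namedArguments;

  // True when this node, or something it refers to, has not been resolved
  // yet at the point the macro is running.
  bool get isUnresolved;

  // Evaluation on demand, where possible; null when the program is still
  // too incomplete to produce a value.
  Object? tryEvaluate();

  // Emit this expression back out as generated code; the host ties any
  // unresolved pieces to their eventual resolution results.
  Code toCode();
}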

Implications and Next Steps

This adds significantly to the surface area of the macros API, with 40+ new data types in the exploratory code.

A huge amount of complexity is added, with all the attendant costs. There is one clear upside: building a solution for const expression ASTs means there is likely to be a clear path to giving macros more access to ASTs in a future release, for example to support introspecting method bodies.

Current discussions around managing the complexity and maintainability of the macros API are focused around a query-based API and a schema for the macro API. It might make sense to combine prototyping around queries/schemas with prototyping around macro metadata, to explore whether they seem like a good fit.

Detail

The exploratory PR contains lots of examples and some discussion; these can be discussed at length later. A selection is highlighted here to give some flavour:

  • The representation of an annotation can naturally change as new values become available and resolution completes. We will likely want to class some types of changes as a compile error, for example if a name that referred to a const now refers to something else entirely. Others may be allowed: most obviously, a change from "unresolved" to a resolved value.
  • There is a question of when to resolve annotations, and the effect this has on asking "did anything change". A query-based API should allow macros to ask for resolution only when they want it, which may help.
  • Evaluated values might not match the const values exactly, for example in cases where type inference decides whether a number is a double or an int (see the example after this list). More generally, the language that is being passed to macros looks like Dart but is not fully equivalent to it.
  • There is a need to represent references, including references that have not been resolved yet. If a public symbol in another library evaluates to a reference to a private symbol, the private symbol is then something that can't be used in generated code. If a reference is to a library that was compiled earlier, meaning source is no longer available, then there may be no way to recover in a way that can generate source.
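
A small, self-contained example of the type-inference point from the third bullet; Rotate here is just an illustrative annotation class:

class Rotate {
  final double turns;
  const Rotate(this.turns);
}

// Written as the integer literal `1`, but because the context type is
// `double` the evaluated constant is the double `1.0`: a macro looking at
// the raw expression sees `1`, while an evaluated value would be `1.0`.
@Rotate(1)
class Spinner {}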

Regarding re-binding issues in particular, I do think that treating declarations coming from augmentations (or "parent"s) as not being able to shadow things coming from imports would largely resolve the issue.

It becomes an error to shadow something from an import via augmentation and also reference it, and this means we can accept the fact that macros might resolve identifiers differently based on when they run - because there will necessarily be an error in the final program on the annotation which they resolved incorrectly.

See #3862 for context.

As for greatly expanding the API surface area to expose a typed model for the entire AST that can be represented in a constant expression, I would really like to avoid that.

The ExpressionCode object is meant to be an equivalent abstraction but with a lot less surface area. We could specify more exactly how those should be constructed - maybe the original expression should be tokenized and then each token becomes a "part" of the code object, as an example.

Then you can have client-side-only APIs to parse that into whatever structure, but these don't need to be serializable types, and the compilers don't have to know about them, which should simplify things and make the API more stable (these APIs can come from some other helper package).

It's always tempting to push complexity to a shared client library, but then instead of a schema that you know you can evolve safely, you have a hybrid of schema and code that is very hard to evolve safely. You have to reason about what happens when there is version skew between the data and the code, and in practice you simply can't; you rely on test coverage, and then you don't have the tools you need to make progress.

For example:

Suppose the current schema is v3, and then we ship a language feature that adds new syntax: you can now write Dart code that cannot be described in v3, and if you try to understand it as v3 the meaning is lost.

With the hybrid approach, what do you do? You are forced to use versioning even though there is no schema: you say that client versions <X can't understand the new Dart code, and ... then what?

With a schema you can say, here is v4 that covers the new syntax. The macro can say it understands only up to v3, and the host can try to serve v3 and bail out in a reasonable way if syntax is used that needs v4. The macro author can update to support v4 at their convenience, and now the macro advertises that it supports v4 and can consume the new syntax.
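
For illustration, a minimal sketch of the kind of negotiation this implies; this is not an actual macro protocol, just the idea:

// Pick the highest schema version both sides understand; null means the
// host must bail out with a clear error, e.g. new syntax needs v4 but the
// macro only understands up to v3.
int? negotiateSchemaVersion(
    List<int> macroUnderstands, List<int> hostCanServe) {
  final shared = macroUnderstands.where(hostCanServe.contains).toList()
    ..sort();
  return shared.isEmpty ? null : shared.last;
}

void main() {
  print(negotiateSchemaVersion([2, 3], [3, 4])); // 3
  print(negotiateSchemaVersion([2, 3], [4])); // null: macro too old for v4
}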

You could say, ah, we'll just use the language version: instead of v3 -> v4 it's Dart language 3.5 -> 3.6; macros say what language versions they can support, that's the "schema version". But then you make every minor language version a "breaking" change for macros, and you don't actually tell people if it's really breaking and if so what broke.

Whereas when you publish a schema change, everyone immediately knows whether they have work to do: whether, in the context of one particular macro, the missing surface area is irrelevant, will be seen as a bug, or should be supported as a feature.

Maintaining a schema is a lot of work but it is all work that saves you from doing more work later. That's why they are so incredibly widely used even though they are painful to work with :)

The analyzer already has exactly this problem: old versions of the analyzer are generally selectable on new SDKs, but cannot parse new code without a pub upgrade. It generally works out fine 🤷.

Whereas when you publish a schema change, everyone immediately knows whether they have work to do: whether, in the context of one particular macro, the missing surface area is irrelevant, will be seen as a bug, or should be supported as a feature.

For any change to the language which is significant enough to require a parser change, there are almost certainly going to be corresponding AST changes, which some (but not all) macros ultimately have to care about.

You likely end up making breaking changes in the package for these language changes, since some macros will have to be updated to handle them (even just new AST nodes). And then every macro in the world has to be updated, regardless of whether it is broken (to expand its constraint).

If the actual AST classes are in a separate package, only the macros which actually depend on that package will have to update when it changes. They also still get an indication that they should update (they will see a new version available).

Essentially this is a tradeoff of what kind of error you get - a static error because of code that can't be parsed by a macro with some old version of the shared package, versus a pub solve error after upgrading your SDK.

The pub solve error blocks you even if you aren't actually broken at all. The static error shows you exactly in your code the expression that failed to parse. We could likely make this a well understood macro error with good tooling around it too (such as a lint/hint that knows the required version of the package in order to parse a given metadata annotation, and suggests that you upgrade to that version).

Ultimately, I think I would prefer the static error in this case. It allows you to update your SDK constraint, and without any additional changes everything should continue to work. You are actually less likely to get blocked overall, because you can use macros that haven't yet been updated, as long as you don't use them on expressions using the new syntax. You could get broken on a pub upgrade, if you have macros which parse code, aren't updated to the latest version, and you use new syntax. But, in this case you would be broken either way, and in the pub solve case you can't just avoid using the new syntax.

Thanks Jake! There are some tricky corners here, for sure.

Fortunately I think covering the AST part with a schema does not restrict our options; there are a bunch of things we can do:

  • We can split out the AST part of the schema and version it separately;
  • and/or split out the code corresponding to that part of the schema into a different package or different part of a package;
  • and/or treat that part of the serialization or corresponding code API in any other way, really.

And since we will support versioning we can make these choices differently at different versions.

Not sure what the right time to dig further is--probably we can get a lot more clarity on the choices once we have an end-to-end demo running.

My guess at this point is that we should make dart_model have no breaking code changes, i.e. use the "breaking change -> new library" model. Always being able to version solve for macro deps seems like a good thing--as long as this doesn't turn out to be too much maintenance burden.

Then we would, as you say, report missing support in a macro implementation only when it actually matters, as a compile error.

Neatly splitting out the AST part, so we have e.g. dart_model/elements_v3.dart, dart_model/ast_v3.dart, might be what fits. Because we're doing serialization we can make these pieces independent, i.e. when you reach a node in the element model that corresponds to some AST you can try to interpret it as v2 if that's the best you can do, or v3 if you have it :)

Re #3847 (comment)

Regarding re-binding issues in particular, I do think that treating declarations coming from augmentations (or "parent"s) as not being able to shadow things coming from imports would largely resolve the issue.

I don't think it'll cover everything. We can still have

const foo = 5;
class Foo {
  @foo
  bar() {}
}

and with a macro generating

augment class Foo {
  static const foo = '5';
}

which would rebind @foo from the top level variable to the class variable.

Re #3847 (comment)

The ExpressionCode object is meant to be an equivalent abstraction but with a lot less surface area. We could specify more exactly how those should be constructed - maybe the original expression should be tokenized and then each token becomes a "part" of the code object, as an example.

I don't see how the ExpressionCode object is providing an equivalent abstraction. How does it for instance provide the ability to structurally inspect and/or evaluate the annotation?

I don't see how the ExpressionCode object is providing an equivalent abstraction. How does it for instance provide the ability to structurally inspect and/or evaluate the annotation?

It does not directly - it just exposes a list of "parts" which are ultimately either strings or identifiers. Essentially a raw token stream.

You can then build a client library on top of that, to "parse" it into a more structured AST, if desired.

In CFE terms, we would basically do the "scanner" but not the "parser" - the parser part would be left up to the client. However, even the scanning in this case would be less structured than what the actual CFE scanner produces - no specific Token types, just strings.
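
A minimal sketch of that shape using plain Dart stand-ins; the real API would use ExpressionCode and Identifier, and the names below are hypothetical:

// Non-identifier text is carried as raw strings; identifiers are carried as
// objects the macro can later ask the host to resolve.
sealed class Part {}

class StringPart implements Part {
  final String text;
  StringPart(this.text);
}

class IdentifierPart implements Part {
  final String name; // resolution is delegated to the host
  IdentifierPart(this.name);
}

// `@MyMacro(foo: true, baz: [SomeOtherClass])` as a raw token stream:
final parts = <Part>[
  IdentifierPart('MyMacro'),
  StringPart('(foo: true, baz: ['),
  IdentifierPart('SomeOtherClass'),
  StringPart('])'),
];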

Another possible idea is to go all the way to a single string + scope. Any identifiers encountered when parsing would need to be constructed with that given scope, or maybe a sub-scope derived from that. This might actually work well with the dart_model approach, where identifiers don't have IDs but instead a scope and a name?

Re: ExpressionCode: Julia implements metaprogramming by representing expressions in a Lisp-like syntax. Maybe this idea can help?

(See Greenspun's tenth rule)

Jake and I chatted about this a bit; the more I think about it the more I think JSON is a natural fit.

Subtrees of JSON can reference external schemas by version; each library in the model can be tagged with its element model version and AST model version. The macro can know if it has a recent enough schema, and decide whether to proceed using the corresponding helper Dart code--or simply manipulate the AST as a JSON tree.
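
For illustration only, a guess at what that tagging could look like, written here as a Dart map literal; every key is hypothetical rather than an actual dart_model wire format:

const libraryModel = {
  'uri': 'package:pet_store_client/api.dart',
  'elementModelVersion': 3,
  'astModelVersion': 3,
  'declarations': {
    'Info': {
      'metadataAnnotations': [
        // This subtree is described by the AST schema version above; a macro
        // that only understands an older AST version can still read the rest
        // of the library model and treat this subtree as opaque JSON.
        {'kind': 'constructor-invocation' /* ... */},
      ],
    },
  },
};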

Or something like that :) we'll see.

In any case I do think that whether we have a structural model on the wire versus a more raw token stream is a bit of a distraction from the more interesting questions. If the rest of the team is comfortable with the API (and protocol) surface area expansion I am fine with being overridden.

I am more specifically interested in discussing the semantic differences compared to the existing proposal. Is this mostly about specifying the behavior for edge cases better (for example adding an error on re-binding), or is there something fundamentally different which allows the CFE to work better with this model?

Here is my high-level summary of answers to the bullets in the "Detail" section above, in terms of the current proposal:

The representation of an annotation can naturally change as new values become available and resolution completes. We will likely want to class some types of changes as a compile error, for example if a name that referred to a const now refers to something else entirely. Others may be allowed: most obviously, a change from "unresolved" to a resolved value.

The current proposal says "All identifiers in code must be defined outside of the current strongly connected component (that is, the strongly connected component which triggered the current macro expansion)."

Essentially, it adds a restriction to sidestep the issue. We could instead make re-binding an error though. I don't have a strong opinion on this. The problem is more than just re-binding, because you could also change the actual value of a constant by augmenting its initializer. The restriction in the current proposal sidesteps both issues.
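
To make the restriction concrete, a hypothetical sketch; GenerateTable here is a plain annotation class standing in for a macro:

class GenerateTable {
  final String name;
  const GenerateTable({required this.name});
}

// `defaultTableName` is declared in this same library, i.e. inside the
// strongly connected component whose compilation triggered the macro
// expansion, so under the current proposal's restriction this application
// is rejected; an identifier imported from an already-compiled package
// would be allowed.
const defaultTableName = 'users';

@GenerateTable(name: defaultTableName)
class User {}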

There is a question of when to resolve annotations, and the effect this has on asking "did anything change". A query-based API should allow macros to ask for resolution only when they want it, which may help.

The existing proposal only gives you Identifiers (via the parts), which you can ask to resolve, so it works in a similar way: resolving identifiers to declarations happens on demand, and so does evaluation.

Evaluated values might not match the const values exactly, for example in cases where type inference decides whether a number is a double or an int. More generally, the language that is being passed to macros looks like Dart but is not fully equivalent to it.

I believe this would be a potential issue in either proposal.

There is a need to represent references, including references that have not been resolved yet. If a public symbol in another library evaluates to a reference to a private symbol, the private symbol is then something that can't be used in generated code. If a reference is to a library that was compiled earlier, meaning source is no longer available, then there may be no way to recover in a way that can generate source.

I believe this would be a potential issue in either proposal.

The current proposal says "All identifiers in code must be defined outside of the current strongly connected component (that is, the strongly connected component which triggered the current macro expansion)."

Essentially, it adds a restriction to sidestep the issue. We could instead make re-binding an error though. I don't have a strong opinion on this. The problem is more than just re-binding, because you could also change the actual value of a constant by augmenting its initializer. The restriction in the current proposal sidesteps both issues.

I think the restriction is a problem: 99.999% of the time the const will not actually be affected by the import that triggers the restriction, so I think we'll want to proceed as if it's going to work and bail out only if it doesn't.

Actually, you don't really get a choice: the macro annotation itself can't be const evaluated, for sure, so you do need to build an API that works on incomplete code.

An AST-based API can naturally handle incomplete values, because you have a meaningful way to dig into the pieces that are there. With a value-based API you hit a wall when you encounter something that needs to be complete to evaluate, like a constructor call. (Including, usually, the macro annotation itself).
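
A hypothetical sketch of that wall; SerializationConfig and Serialize are stand-ins defined inline so the snippet is self-contained, whereas in reality the macro is compiled separately and has never seen these types:

class SerializationConfig {
  final String format;
  final List<String> exclude;
  const SerializationConfig({required this.format, this.exclude = const []});
}

class Serialize {
  final SerializationConfig config;
  const Serialize(this.config);
}

const sharedExclusions = ['internalId'];

// A value-based API has to evaluate the whole SerializationConfig(...) call
// before the macro can see anything, and fails if any piece (here,
// `sharedExclusions`) is not resolvable yet. An AST-based API lets the macro
// pull out just the `format:` argument and treat the rest as unresolved.
@Serialize(SerializationConfig(format: 'json', exclude: sharedExclusions))
class Order {}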

The DartObject API is a lot of API surface to bring in; I guess it ends up being at least the same order of magnitude of complexity as the AST.

I do think there is a chance we end up wanting a DartObject type of API, but in terms of what macros want, I think an AST-based API makes more sense than values anyway: it's about what you write when you apply a macro, and that's really the thing macro authors care about first.

In any case I do think that whether we have a structural model on the wire versus a more raw token stream is a bit of a distraction from the more interesting questions. If the rest of the team is comfortable with the API (and protocol) surface area expansion I am fine with being overridden.

Yes, the important part is deciding what data we need, how to get it, and how it will be exposed to the macro; once we have that sorted out we can move things around as needed, including changing our mind between versions.

Thanks :)

I chatted to Jake about this yesterday, and attempted to make progress today by digging into the analyzer's DartObject code and Johnni's example code.

My conclusion is that we probably need some worked examples to explore the details of what changes if we try to talk about values vs trying to work with the AST. I suspect an important part of the problem is cases where the analyzer/CFE cannot fully evaluate an expression and so the AST (or some approximate AST) is the best that can be provided. But it would be good to understand this with examples.

I do not have a good feel for what the examples need to be, but I tried to create one to kick things off :) in a doc, which I think might be an easier way to iterate on specific examples. How does that sound?