0xPolygonMiden / compiler

Compiler from MidenIR to Miden Assembly

Namespace of the compiled MASM module for the Rust code

greenhat opened this issue · comments

The compiler produces compilation artifacts in the form of one MASM file per module, using the module ID as the file name. The Rust code resides in a module whose ID corresponds to the crate name. If we load the MASM module without specifying a namespace, it is assigned the #anon namespace, but I cannot figure out how to call a procedure from a module in the #anon namespace.
I tried calling the procedure without importing the module, assuming the #anon namespace is the default one, but that doesn't work. I also tried the use directive with the #anon namespace, and that doesn't work either.
In the end I went with a workaround: assigning a synthetic namespace, user_ns, to the module. The developer experience is not ideal, though.

Procedures from anonymously parsed modules should be callable after 0xPolygonMiden/miden-vm#1363 is merged.

To elaborate a bit on how we should address this once 1363 is merged:

  • We need to ensure that we use absolute paths for all procedure references, and elide emitting imports (unless we have a specific reason to use relative paths and imports on a case-by-case basis). The syntax for this is the usual path syntax, but prefixed with ::, e.g. ::foo::bar, and can be used anywhere that you specify paths in MASM.
  • The assembler parser has been modified so that when an absolute path with a single component is parsed, it treats it as implicitly in the #anon namespace, e.g. exec.::foo is treated as a procedure defined in an anonymous module. More precisely, in a procedure context like that, it parses it as a ProcedurePath::Absolute, whose path field is a LibraryPath with LibraryNamespace::Anon and no additional path components, and whose name field is simply the identifier foo. If used in a module path context, it is essentially the same, except the resulting LibraryPath contains foo as the sole path component.
  • Raw identifiers can be used anywhere that a procedure name is expected, so the following is valid syntax: exec.::"miden:tx-kernel/get-inputs". It would refer to a procedure named miden:tx-kernel/get-inputs in the anonymous module.
  • When emitting a program that lives entirely in a single anonymous module, we can combine the two points above (single-component absolute paths and raw identifiers) to reference every procedure in the program absolutely.
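
To make the parsing rule above concrete, here is a minimal, self-contained sketch of how a single-component absolute path could map to the anonymous namespace. The types `Namespace` and `AbsoluteProcPath` and the function `parse_absolute` are hypothetical stand-ins that only mirror the names used in this thread (`LibraryNamespace::Anon`, `ProcedurePath::Absolute`); they are not the miden-assembly API, and the real parser may differ in its details.

```rust
// Hypothetical stand-ins that mirror the names used in this thread.
// These are NOT the real miden-assembly types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Namespace {
    /// The `#anon` namespace of anonymously parsed modules.
    Anon,
    /// A named, user-defined namespace.
    User(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AbsoluteProcPath {
    pub namespace: Namespace,
    /// Module path components after the namespace (may be empty).
    pub components: Vec<String>,
    /// Procedure name (the last path segment).
    pub name: String,
}

/// Parses the target of `exec.<target>` when it is an absolute path.
/// Returns `None` for relative paths and for malformed input.
pub fn parse_absolute(target: &str) -> Option<AbsoluteProcPath> {
    let rest = target.strip_prefix("::")?;
    let mut parts = split_segments(rest)?;
    let name = parts.pop()?;
    if parts.is_empty() {
        // A single component, e.g. `::foo`: implicitly in the anonymous namespace.
        return Some(AbsoluteProcPath { namespace: Namespace::Anon, components: Vec::new(), name });
    }
    // Assumption of this sketch: the first component names the namespace.
    let namespace = Namespace::User(parts.remove(0));
    Some(AbsoluteProcPath { namespace, components: parts, name })
}

/// Splits a path on `::`, treating a double-quoted segment as a raw identifier
/// whose contents are taken verbatim.
fn split_segments(s: &str) -> Option<Vec<String>> {
    let mut parts = Vec::new();
    let mut rest = s;
    loop {
        if let Some(quoted) = rest.strip_prefix('"') {
            let end = quoted.find('"')?;
            if end == 0 {
                return None; // empty raw identifier
            }
            parts.push(quoted[..end].to_string());
            rest = &quoted[end + 1..];
        } else {
            let end = rest.find("::").unwrap_or(rest.len());
            if end == 0 {
                return None; // empty segment
            }
            parts.push(rest[..end].to_string());
            rest = &rest[end..];
        }
        if rest.is_empty() {
            return Some(parts);
        }
        rest = rest.strip_prefix("::")?;
    }
}
```

Under these assumptions, `parse_absolute("::foo")` yields the anonymous namespace with no path components and the name `foo`, and a raw identifier such as `::"miden:tx-kernel/get-inputs"` is kept verbatim as the procedure name.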

I'm planning a subsequent expansion of the path syntax in the assembler so that Wasm Component Model identifiers are also valid Miden Assembly identifiers, e.g. exec.::miden:tx-kernel/get-inputs would be parsed as ProcedurePath::Absolute with a path given by LibraryPath::new_from_components(LibraryNamespace::User("miden"), ["tx-kernel".into()]), and a name given by Ident::new("get-inputs"). This could be combined with other options, such as raw procedure name identifiers, to allow for things like exec.::foo:bar/"baz::<T>::new".

That change I'll land after 0xPolygonMiden/miden-vm#1363, but at the moment we don't have packages, so we can get away without it since we don't need to reference items in other Wasm-compiled modules yet. For the time being, we can simply make use of what is provided in 1363 to handle referencing items much more seamlessly.
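
As a rough illustration of the planned Component Model style syntax described above, the sketch below splits a target of the form namespace:module/name into its three parts. `ComponentStylePath` and `parse_component_style` are hypothetical names invented for this example. The grammar is only inferred from the examples in this thread, and since the syntax has not landed yet, the eventual implementation may well differ.

```rust
// Hypothetical stand-in for illustration; NOT the miden-assembly API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ComponentStylePath {
    /// Text before the first `:`, e.g. `miden`.
    pub namespace: String,
    /// Module path components, separated by `::`, e.g. `["tx-kernel"]`.
    pub components: Vec<String>,
    /// Procedure name after the first `/`, with raw-identifier quotes removed.
    pub name: String,
}

/// Parses targets such as `::miden:tx-kernel/get-inputs`.
/// Assumption of this sketch: a leading `::` is optional and not recorded.
pub fn parse_component_style(target: &str) -> Option<ComponentStylePath> {
    let rest = target.strip_prefix("::").unwrap_or(target);

    // The module part never contains `/` here, so the first `/` separates
    // the module path from the procedure name.
    let (module, name) = rest.split_once('/')?;

    // The first single `:` separates the namespace from the module path.
    let (namespace, module_path) = module.split_once(':')?;
    if namespace.is_empty() {
        return None;
    }

    let components: Vec<String> = module_path.split("::").map(str::to_string).collect();
    if components.iter().any(|c| c.is_empty() || c.contains(':')) {
        return None;
    }

    // A double-quoted name is a raw identifier and is taken verbatim.
    let name = match name.strip_prefix('"') {
        Some(quoted) => quoted.strip_suffix('"')?,
        None => name,
    };
    if name.is_empty() {
        return None;
    }

    Some(ComponentStylePath {
        namespace: namespace.to_string(),
        components,
        name: name.to_string(),
    })
}
```

With this sketch, `::miden:tx-kernel/get-inputs` splits into namespace `miden`, components `["tx-kernel"]` and name `get-inputs`, matching the LibraryPath and Ident values given above, while `::foo:bar/"baz::<T>::new"` keeps the quoted text verbatim as the procedure name.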

As mentioned on the call, we could also consider making this the canonical way to define paths in MASM. So:

use.std:math::u64

begin
    exec.u64/add
end

Would be the same as:

begin
    exec.std:math::u64/add
end

And in both cases, the absolute path would be:

LibraryPath::new_from_components(LibraryNamespace::User("std"), ["math".into(), "u64".into()])
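
The equivalence between the two snippets above can be sketched as a small alias lookup: an import binds the last module component as an alias, and a procedure reference either goes through that alias or spells out the full path. `ModulePath`, `parse_module_path` and `resolve` are hypothetical helpers written for this example only. They do not exist in the assembler, and the proposed syntax itself is still under discussion.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for illustration; NOT the miden-assembly API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModulePath {
    pub namespace: String,
    pub components: Vec<String>,
}

/// Parses a module path such as `std:math::u64`.
fn parse_module_path(s: &str) -> Option<ModulePath> {
    let (namespace, rest) = s.split_once(':')?;
    let components: Vec<String> = rest.split("::").map(str::to_string).collect();
    if namespace.is_empty() || components.iter().any(|c| c.is_empty() || c.contains(':')) {
        return None;
    }
    Some(ModulePath { namespace: namespace.to_string(), components })
}

/// Resolves a target of the form `<module>/<name>` to an absolute module path
/// plus procedure name. `imports` holds the paths given to `use.` directives.
pub fn resolve(imports: &[&str], target: &str) -> Option<(ModulePath, String)> {
    // Each import is reachable through its last component, e.g. `u64`.
    let mut aliases: HashMap<String, ModulePath> = HashMap::new();
    for import in imports {
        let path = parse_module_path(import)?;
        let alias = path.components.last()?.clone();
        aliases.insert(alias, path);
    }

    let (module, name) = target.split_once('/')?;
    let path = match aliases.get(module) {
        // `exec.u64/add` with `use.std:math::u64` in scope.
        Some(imported) => imported.clone(),
        // `exec.std:math::u64/add`, written out in full.
        None => parse_module_path(module)?,
    };
    Some((path, name.to_string()))
}
```

Here `resolve(&["std:math::u64"], "u64/add")` and `resolve(&[], "std:math::u64/add")` return the same pair: namespace `std`, components `["math", "u64"]`, and procedure name `add`.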

The primary motivation for this would be to reduce the complexity in the assembler at the cost of introducing some breaking changes (which I think may be OK).


Separately, we should probably rename LibraryPath::new_from_components() to just LibraryPath::from_components(), to be consistent with how constructors are usually named in Rust.