Miden and Rust disagree on the data layout for felt/word

Question

Miden and Rust disagree on the data layout for felt/word

bitwalker opened this issue 3 months ago · comments

Paul Schoenfelder commented 3 months ago

While discussing an issue that @greenhat encountered while working on the Wasm frontend, it became apparent that we missed a hole in the design for lowering Rust code to Miden (and more generally, any language like Rust, which is designed for a byte-addressable memory model).

Specifically, we made the following design choices:

Miden VM should be viewed as a 32-bit machine from the perspective of Rust. This is for multiple reasons:
- Miden's notion of a memory address is a u32 value
- Miden's instruction set only provides efficient implementations of key operations in terms of u32 values; e.g. bit manipulation functions like shifts/logical operators, load/store ops, and two's complement arithmetic functions.
Translate byte-addressable memory models to Miden's word-addressable memory as follows: Miden's memory is "word"-addressable (where "word" is 4 field elements), but can be viewed as felt-addressable if you pair a word address with an element index. It can further be viewed as byte-addressable if you constrain each addressable felt to the u32 value range, which allows us to perform bit masking/shifting ops to extract specific bits - pairing a word address with an element index, and a byte offset allows you to reference a specific byte anywhere in Miden's memory.
Designate the f64 type as a sentinel type to indicate that a given value should be treated as a field element when lowered to Miden Assembly. There is no support for floating point values in Miden, so their presence in any Rust code we compile is treated as an error, except in specific narrow circumstances. This allows us to recognize when it is safe to translate certain instrinsics to felt-native ops on the value when lowering to Miden Assembly.

These choices were made to ensure that code compiled from Rust to Miden would be maximally efficient, considering the semantic differences between the targets Rust was designed to support, and Miden. Furthermore, it does not require us to perform unsafe transformations on rustc-generated code in order to work on Miden, a major win. Most importantly however, it allows us to take advantage of the fact that these design choices allow Miden to be viewed as a variant wasm32 target (currently, using either wasm32-unknown-unknown, or wasm32-wasi). We gain an enormous amount of tooling and support as a result.

The Problem

However, you may have already noticed the issue: the native "machine word" type in Miden is not u32, it is felt. Crucially, Miden does not distinguish between u32 and field elements in any way other than the range of the value. This causes a major issue on the Rust side of things, as the full range of a field element requires 8 bytes to represent (hence our use of f64 as a sentinel type in Rust), but by treating Miden as a 32-bit machine, it expects the machine word size to be 4 bytes, not 8. Most problematic is the effect this has on memory layout: Miden's memory model is felt-oriented if you squint a bit, so the smallest addressable unit of memory is a field element, and as a result, smaller values are automatically promoted to the same size as a felt. Rust's memory model is byte-oriented, and because we are treating Miden as a 32-bit machine, Rust necessarily distinguishes between field elements and integral types with smaller ranges, since field elements require a larger-than-machine-word integral type to represent.

So we arrive at the crux of the issue - the representation of a Miden "word" (i.e. array of four field elements) in Miden does not align with the Rust representation, when laid out in memory. This becomes apparent when you examine how a Miden word is represented in Miden's memory model, and translate that into Rust:

Miden memory is really word-addressable, and for simplicity sake, we'll say that the word we care about is aligned (all elements are stored at the same address), and is stored at address 8
In Rust, each field element of a word in Miden's memory model, is treated as a 4-byte cell. Because each felt in Rust requires 8 bytes to represent, it will require two felts in Miden's memory model to represent (since each felt in Miden's memory model provides 4 bytes of storage).
Thus, Rust assumes we need two words of Miden memory to represent a single Miden word, whereas Miden only uses one (obviously)

Solutions

This disagreement between the two memory models is not possible to resolve in a universal fashion without implementing a new LLVM backend, and using a custom build of Rust that includes that backend. Such a backend would need to implement a pretty bizarre data layout - a 32-bit target whose smallest physical representation is 64-bits. I suspect that this would actually run into quite a few issues both in LLVM and Rust itself, since I'm unaware of any target like that, and LLVM is notorious for being quite buggy in uncommon configurations due to lack of testing in those configurations.

Where I've landed on this so far, is essentially treating felts/words as second-class citizens in Rust. In essence, we would require lifting/lowering between the Rust and Miden representations at call site boundaries where those types are present, namely with specific standard library and Miden SDK functions implemented in Miden Assembly that work in terms of them. Internally in the IR, we already have two pointer types (one of which represents Miden "native" pointers as a triple, described earlier, and which has up until now not been used). We would add two new IR types that correspond to "native" Miden felt/word representations. We could then emit code in the frontend to do the lifting/lowering in terms of loads/stores using the appropriate pointer type. A load/store of a felt/word from a "normal" pointer would be handled using Rust semantics, while a load/store from a "native" pointer would be handled using Miden semantics. To "lower" a word in Rust to Miden, we would first emit a cast of the "normal" pointer to the "native" pointer type (with the pointee type also being cast from "normal" to "native"), then a load. To "lift" a word from Miden to Rust, we would do the exact same thing, but with a cast going the opposite direction. It may be necessary to emit code to load/store each individual element of the word, rather than relying on the backend to handle that based on the type information, but for the moment I'm expecting that leaving that to the backend will result in more efficient code, so I think that would be preferable.

So what are the tradeoffs with this? Users targeting Miden via Rust, are going to be first and foremost concerned with the performance of Rust code, so choosing a 32-bit target for Miden is overall going to result in much more efficient codegen that if we treated it as a 64-bit target. On the flip side, any code that is operating on raw felts/words is going to be less efficient, in some cases drastically so, though I have not done the work to identify the worst case scenarios for it yet. I think this is acceptable, albeit undesirable, as it will always to escape Rust and hand write Miden Assembly code to handle performance-sensitive aspects of a contract which is otherwise written in Rust, and then simply call that handwritten MASM from Rust, rather than relying on the compiler to handle those parts. This will make it more critical that we have good tooling for integrating Rust and handwritten MASM though, and up until now we've kind of been looking at that as not super important.

There are other secondary effects this has, particularly on our overall strategy regarding language frontends for Miden, but not in the immediate term (assuming the proposed solution, or something like it, is workable).

TL;DR

Rust and Miden do not, and can not, agree on the representation of field elements and Miden words. The result is that we cannot transparently share such values between Rust and Miden, and will need to emit conversions at call site boundaries between Rust and MASM code. It remains to be seen how major of an issue this is in practice, but it is absolutely a blocker in the near term - we need a good solution for this ASAP.

@greenhat Thoughts?

Bobbin Threadbare · Answer 1 · Wed Apr 10 2024 14:17:40 GMT+0800 (China Standard Time)

Thank you for the detailed write up! I'm still digesting this - but a quick question: would using f32 as the type to represent a filed element solve the issue? On the Rust side, f32 would require 4 bytes to represent and if we can force f32 values to be aligned on 4-byte boundaries, they would naturally fall into elements on Miden side.

Philippe Laferrière · Answer 2 · Wed Apr 10 2024 21:13:23 GMT+0800 (China Standard Time)

Would it be possible instead to "simply" make memory u64 addressable (by wrapping all values over $P=2^{64} - 2^{32} + 1$)? This is what SP1 does: they use the baby bear field ($2^{31} - 2^{27} + 1$), and their memory is u32-addressable.

Although this creates a problem for Rust, given that Rust's memory model treats all $2^{64}$ addresses as unique, whereas the VM would consider address $i$ and $P+i$ as the same. I believe this is just a front-end problem though (i.e. doesn't affect soundness): the VM's memory is defined to have $P$ addresses (reflected in the AIR), and in the VM's Process, we fail to execute/prove all programs that access any addresses larger than $P$. If someone were to use a "malicious" prover to generate a proof for a Rust program that accesses a memory larger than $P$, well it would simply be a different program being proven on the VM (such that the Rust code doesn't represent what the VM executes). Again - this is a front-end issue only.

So concretely speaking, the compiler could blindly use u64 for memory addresses, and the program would fail to execute once the user tries to run it on the VM.

This might be worth discussing in a separate issue?

Denys Zadorozhnyi · Answer 3 · Wed Apr 10 2024 22:01:08 GMT+0800 (China Standard Time)

Where I've landed on this so far, is essentially treating felts/words as second-class citizens in Rust. In essence, we would require lifting/lowering between the Rust and Miden representations at call site boundaries where those types are present, namely with specific standard library and Miden SDK functions implemented in Miden Assembly that work in terms of them. Internally in the IR, we already have two pointer types (one of which represents Miden "native" pointers as a triple, described earlier, and which has up until now not been used). We would add two new IR types that correspond to "native" Miden felt/word representations. We could then emit code in the frontend to do the lifting/lowering in terms of loads/stores using the appropriate pointer type. A load/store of a felt/word from a "normal" pointer would be handled using Rust semantics, while a load/store from a "native" pointer would be handled using Miden semantics. To "lower" a word in Rust to Miden, we would first emit a cast of the "normal" pointer to the "native" pointer type (with the pointee type also being cast from "normal" to "native"), then a load. To "lift" a word from Miden to Rust, we would do the exact same thing, but with a cast going the opposite direction. It may be necessary to emit code to load/store each individual element of the word, rather than relying on the backend to handle that based on the type information, but for the moment I'm expecting that leaving that to the backend will result in more efficient code, so I think that would be preferable.

I'm not sure if I get how it would work for the array access ops. Let's see how this would work when passing a pointer to a felt array from Rust to Miden. On the Rust side, it'd be an array for a fixed-size list or a Vec for a size only known at runtime. Let's focus on array. We pass the pointer to the Miden side, where it gets filled using the native Miden felt semantics. When we access this array on the Rust side, we need to make sure that all subsequent access ops get converted to the native Miden felt semantics. The Rust access ops look like this:

    call get_felt_array_from_masm
    local.set 1
    local.get 1
    f64.load offset=8
    local.get 1
    f64.load offset=16
    call $miden_prelude::intrinsics::felt::extern_add

for the following Rust code:

    let arr = get_felt_array_from_masm();
    arr[1] + arr[2]

So, we need to "mark" the received pointer as from the Miden side and transform all the subsequent f64.load ops accordingly. However, I'm afraid that the pointer can "escape" the mark (via slice?) so we might not be able to track all the subsequent access ops. In this case, we can transform the memory layout of the returned array to be in the Rust expected form.

EDIT: To clarify, I like the single pointer conversion above. As for the array access, as we discussed on the call, we're going to convert between Miden and Rust memory layout of the whole array memory at call sites.

Paul Schoenfelder · Answer 4 · Thu Apr 11 2024 01:58:22 GMT+0800 (China Standard Time)

...would using f32 as the type to represent a filed element solve the issue? On the Rust side, f32 would require 4 bytes to represent and if we can force f32 values to be aligned on 4-byte boundaries, they would naturally fall into elements on Miden side.

This was actually my gut reaction when I first started thinking about this, but it is actually solving the wrong problem. The issue isn't that our choice of representation for felts in Rust has incorrect size; instead, the issue is that in order to represent an arbitrary field element value, Rust requires 8 bytes of memory, double the machine word size of 32-bits. We are forced to treat Miden as a 32-bit machine in Rust in order to map Rust's memory model on to Miden's memory model, primarily for two reasons:

We can't treat Miden as a 64-bit machine, because felts are too small to represent the full 64-bit range, and we want Rust's machine word type to correspond to a type with "native" support in Miden for efficiency reasons, leaving us with felt or u32 as options.
Because Miden's memory is represented in terms of field element cells (albeit only word-addressable), emulating byte-addressable memory on top of that requires two things:
- Each memory cell must correspond to a fixed number of bytes, N, where N is equal to, or a multiple of, the machine word size in bytes. This makes it as simple and efficient as possible to compute the location of a Rust address in Miden's address space. Field elements have no specific bitwise representation, but can be represented as a two's complement encoded integer using 8 bytes; despite this, we cannot choose N = 8, as field elements cannot represent the full 64-bit range. Thus N must be less than 8.
- Bitwise operations on the machine word type, so that we can extract specific bytes from a given memory cell, and reconstruct a value from bytes obtained from two adjacent cells.

We chose u32 because Miden provides "native" u32 bitwise operations, and treating each memory cell (a felt) as a 4 byte (32-bit) chunk, allows us to:

Use the natural alignment of u32 and felt values, i.e. Rust will lay things out in memory (by default) such that values of u32 type, or larger, are minimally-aligned to an address in Rust that corresponds to the start of a memory cell in Miden. This means that most load/store ops on values of those types are aligned, and translate directly to Miden's load/store ops. Unaligned loads/stores require complex sequences of instructions to handle joining/splitting a value across multiple memory cells, so we want to avoid that as much as possible.
Up until now, we also were thinking that this would allow a Miden word and its equivalent representation in Rust, to have the exact same layout in Miden's memory, thus making operating on felts/words as efficient in Rust as in Miden.
Ensure that any representable Rust pointer value, corresponds to an actual address in Miden's address space. This is trivially true for u32, because Miden's natural pointer type is u32, and each address corresponds to 16 bytes of storage available to Rust, so we actually have more storage than we can address (from Rust).

At some point we unintentionally conflated the way Miden stores felt/u32 values, and how Rust does. I think this probably occurred because last year we got hung up on trying to figure out how to represent felts in Wasm, and/or how to convey custom type information in Wasm. This led us to choosing f64, without really considering the implications of that choice in terms of storage. The reason we chose f64 rather than f32 was because subconsciously we knew that we needed 64 bits to represent the full field, and 32 bits weren't going to cut it - but we failed to follow through all the logical implications of that.

Would it be possible instead to "simply" make memory u64 addressable (by wrapping all values over )? This is what SP1 does: they use the baby bear field (), and their memory is u32-addressable.

I ended up addressing elements of both your and @bobbinth's comments above - but the tl;dr is that there are a variety of problems with this approach that are actually worse than the issue in question. Fundamentally, we're sort of forced to view Miden as a 32-bit machine in Rust. In theory we could address this in the VM, if memory in Miden was either byte addressable, or "mixed-mode", i.e. byte addressable but with dedicated instructions that operate on the address space in word/felt-sized chunks (to allow for the more efficient constraints/proofs that were the driving motivation behind making Miden's memory model word-oriented in the first place). This would eliminate the issue, but I do not know how feasibile that is, nor what the implications would be in terms of breaking changes to existing code, etc. It would certainly be a major change to how memory works in Miden.

Setting that aside, an even more significant issue with that is that we currently rely on Rust emitting Wasm modules, which is the actual input to the compiler that we lift into Miden IR and lower to Miden Assembly. The supported Wasm targets in Rust are all wasm32 targets, i.e. 32-bit machine targets. Even if we abandoned wasm32 for a custom target of our own making in Rust, we lose a lot as a result - a ton of tooling that we've been building around, as well as the benefits of providing a Wasm frontend, which allows any language with a Wasm backend to target Miden with minimal effort.

Although this creates a problem for Rust, given that Rust's memory model treats all 2^64 addresses as unique, whereas the VM would consider address P and P + i as the same.

The problem isn't really addressing, but actually data layout. A felt requires 8 bytes of storage to represent, but because it cannot represent the full 64-bit range, we cannot use a felt to represent 64-bits of storage. I go deeper into the details of this above, but data layout/representation is really the crux of the issue.

the VM's memory is defined to have P addresses (reflected in the AIR)

I don't believe this is actually true in practice - Miden Assembly requires all memory addresses to be < 2^32, and Miden's memory is documented to be 2^32 in size (where each unique address is a word). It may be the case that constraints in AIR don't reflect these limits, which perhaps are handled by constraints on the individual load/store ops (or maybe those just expand to something like u32assert then an unconstrained load, I'd have to look). Regardless, the end result is that Miden has a 32-bit address space, not a 64-bit one (or I guess a field-sized one, if I understand your proposal correctly). As mentioned above though, the problem isn't actually the address space, but how we map a byte-addressable memory model on to Miden's word/felt-addressable memory.

So concretely speaking, the compiler could blindly use u64 for memory addresses, and the program would fail to execute once the user tries to run it on the VM.

I think it is essential that we preserve the semantics of Rust's abstract machine when translated to Miden, or we risk all kinds of things breaking in unexpected ways. Just as a trivial example, the Rust abstract machine makes promises about the bitwise representation of pointers, and trying to store a 64-bit pointer in a field element would violate some of them by implicitly truncating some bits, which may be used for tagging or some other purpose. Bottom line, from Rust's perspective, everything has a physical representation in terms of bytes, so we have to work within those constraints.

Bobbin Threadbare · Answer 5 · Thu Apr 11 2024 02:23:26 GMT+0800 (China Standard Time)

The issue isn't that our choice of representation for felts in Rust has incorrect size; instead, the issue is that in order to represent an arbitrary field element value, Rust requires 8 bytes of memory, double the machine word size of 32-bits.

But that's what I meant with f32 - Rust can treat a field element as a 4-byte value rather than an 8-byte value. Assuming that these values are aligned on 4-byte boundaries, if we write an f32 into some memory location in Rust, it would be equivalent to writing a single field element in Miden, and unless I'm missing something, things should work transparently.

Paul Schoenfelder · Answer 6 · Thu Apr 11 2024 06:59:11 GMT+0800 (China Standard Time)

For posterity, @bobbinth's proposed solution above relies on treating felts specially in the compiler, so that we implicitly preserve the "hidden bits" of a cell in Miden's memory. Because our mapping of Rust's abstract machine to Miden treats each cell as a 32-bit chunk of raw memory, when each cell is in fact able to represent the full field element range, the "hidden bits" correspond to the range of unused values of the field above 2^32 which could theoretically be used, but which are not visible to Rust.

When we translate load/store instructions for 32-bit values, we mask out bits above 2^32 when issuing those ops to ensure that we are loading/storing precisely 32 bits. The proposal here would rely on treating felts as 32 bits from Rust's perspective, but not masking load/store ops for values of felt/word type. In addition, we would need to dynamically assert that the load/store address has an alignment which is a multiple of 4, so that loads/stores of felts are always naturally aligned.

I need to explore this as an option - it has two major downsides, which is that we end up with special types in the IR which do not adhere to the normal rules that all other types do, and require their own distinct handling; and it relies on behavior which is completely implicit (the "hidden bits" which are an artifact of emulating Rust's memory model in Miden). On the flip side, it would allow us to avoid paying the cost of inefficient felt/word encodings in code translated from Rust - which could be huge. The risk is that our special casing doesn't cover all the edge cases, resulting in data corruption (a felt being silently truncated or extended). Once I do some more exploration of all the implications, I'll be able to say with more clarity if this will work or not, but if it works, it might end up being the better solution, despite the downsides.

/cc @greenhat

Denys Zadorozhnyi · Answer 7 · Thu Apr 11 2024 16:16:17 GMT+0800 (China Standard Time)

I need to explore this as an option - it has two major downsides, which is that we end up with special types in the IR which do not adhere to the normal rules that all other types do, and require their own distinct handling; and it relies on behavior which is completely implicit (the "hidden bits" which are an artifact of emulating Rust's memory model in Miden). On the flip side, it would allow us to avoid paying the cost of inefficient felt/word encodings in code translated from Rust - which could be huge. The risk is that our special casing doesn't cover all the edge cases, resulting in data corruption (a felt being silently truncated or extended). Once I do some more exploration of all the implications, I'll be able to say with more clarity if this will work or not, but if it works, it might end up being the better solution, despite the downsides.

My gut feeling says that it might work. I don't foresee any issue at Rust<->MASM boundary. In the Rust land, I'd test the handling of complex structs that hold other types besides felts and throw some arrays into the mix.

Denys Zadorozhnyi · Answer 8 · Tue Jul 09 2024 23:22:39 GMT+0800 (China Standard Time)

@bitwalker I've finished #238 and intend to start this one tomorrow. Any tips and/or other things I should take care of here?