"Polluting" the trace

Question

pgrinaway opened this issue 2 months ago · comments

For various reasons, we are interested in knowing how many columns of the trace (or at least a lower bound) would be "polluted" by a value that was used in the VM, that is, how many columns would contain it. Clearly, compiler optimizations make it difficult to tell where the value will appear. But if, for instance, I read in a private input and then commit it, it surely appears in at least one column, right? Is there any way for me to dig a little deeper into this and examine the trace?

linear · Answer 1 · Wed Mar 27 2024 06:30:42 GMT+0800 (China Standard Time)

ZKVM-394 "Polluting" the trace

Erik Kaneda · Answer 2 · Wed Mar 27 2024 13:29:11 GMT+0800 (China Standard Time)

What do you mean by "if I read in a private input, then commit it"? For me, this reads like "I read in a private input and commit it by writing the private input to the public journal" but I don't think this is what you mean @pgrinaway. Could you clarify?

Patrick Grinaway · Answer 3 · Thu Mar 28 2024 01:01:42 GMT+0800 (China Standard Time)

Sorry, I think I wrote that a little too fast. Essentially I did mean what you said--I am just trying to guarantee that the value appears in the trace somewhere, and that the verifier can check that it appears.

Erik Kaneda · Answer 4 · Thu Mar 28 2024 01:50:56 GMT+0800 (China Standard Time)

Yes, but it will appear in the trace even if you don't write it to the journal. The trace keeps track of the entire state of the RISC-V emulator, and the program inputs are accessed through system calls (using the ecall instruction).

Frank Laub · Answer 5 · Thu Mar 28 2024 02:45:54 GMT+0800 (China Standard Time)

To be clear, the verifier won't see the execution trace. If a verifier wants to check on some value, then the guest needs to write this value to the journal using env::commit. This makes the value public.
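The public/private split described here can be illustrated with a toy model in plain Rust. This is only a sketch: `ToyReceipt` and `run_toy_guest` are hypothetical names invented for illustration and are not part of the risc0 API. In a real guest, the value would be read from the host and made public with env::commit.

```rust
/// Toy stand-in for what a verifier receives: only the journal, never the trace.
/// (Hypothetical type, not the real risc0 `Receipt`.)
#[derive(Debug, PartialEq)]
struct ToyReceipt {
    journal: Vec<u32>,
}

/// Toy stand-in for a guest run. `private_input` plays the role of a value the
/// guest reads from the host; `reveal` decides whether the guest commits it.
fn run_toy_guest(private_input: u32, reveal: bool) -> ToyReceipt {
    let mut journal = Vec::new();
    // The guest always commits a derived value (here, the parity of the input).
    journal.push(private_input % 2);
    // Only an explicit commit makes the input itself visible to the verifier.
    if reveal {
        journal.push(private_input);
    }
    ToyReceipt { journal }
}
```

In this model, `run_toy_guest(42, false)` and `run_toy_guest(44, false)` produce identical journals, so a verifier holding only the journal cannot tell the two inputs apart, even though the input was used during execution.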

Patrick Grinaway · Answer 6 · Thu Mar 28 2024 03:19:20 GMT+0800 (China Standard Time)

The remaining question for me is if it is possible to easily know how many columns of the trace were 'touched' by this value. I know this is affected by many things including the compiler used to generate the ELF binary, but could one establish a lower bound higher than 1 column?

Erik Kaneda · Answer 7 · Thu Mar 28 2024 06:47:58 GMT+0800 (China Standard Time)

> The remaining question for me is if it is possible to easily know how many columns of the trace were 'touched' by this value. I know this is affected by many things including the compiler used to generate the ELF binary, but could one establish a lower bound higher than 1 column?

I don't think it's easy to know how many columns were touched by this value as it's dependent on the circuit. I think this will become clear when we publish our circuit code in the future.

Also, I'm curious why you are asking this. Is there something security-related that you're wondering about?

Frank Laub · Answer 8 · Thu Mar 28 2024 07:09:40 GMT+0800 (China Standard Time)

@pdg744 @jbruestle is this something you can respond to?

Jeremy Bruestle · Answer 9 · Thu Mar 28 2024 07:15:41 GMT+0800 (China Standard Time)

I guess I don't understand the purpose behind the question. In general, the trace is only ever held by the prover, and the resulting proof is zero knowledge with respect to the trace (i.e. the proof doesn't reveal anything in the trace), so I'm not sure why it matters how many times a value appears in a trace or even if it appears at all (since clearly optimizations may modify the actual data representation). I suppose if one really wanted to know how a given input influences a trace, you could prove multiple times with different values for the input and do a diff of the traces, but I'm still unsure why this would be of any use.
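The trace-diff idea above could be sketched roughly as follows. This assumes the two traces have already been dumped from the prover as row-major tables of `u32` values (one row per cycle, one entry per column); how to obtain such a dump is not covered in this thread, and the `Trace` type and `differing_columns` function are hypothetical names used only for illustration.

```rust
/// A trace modeled as a row-major table: one inner Vec per cycle, one entry
/// per column. This layout is an assumption made for illustration.
type Trace = Vec<Vec<u32>>;

/// Returns the indices of the columns in which the two traces differ in at
/// least one row, or None if the traces do not have the same shape.
fn differing_columns(a: &Trace, b: &Trace) -> Option<Vec<usize>> {
    if a.len() != b.len() {
        return None;
    }
    let width = a.first().map_or(0, |row| row.len());
    let mut differs = vec![false; width];
    for (row_a, row_b) in a.iter().zip(b.iter()) {
        if row_a.len() != width || row_b.len() != width {
            return None;
        }
        for (col, (x, y)) in row_a.iter().zip(row_b.iter()).enumerate() {
            if x != y {
                differs[col] = true;
            }
        }
    }
    // Collect the indices of all columns flagged as different.
    Some(
        differs
            .into_iter()
            .enumerate()
            .filter_map(|(col, d)| if d { Some(col) } else { None })
            .collect(),
    )
}
```

A diff like this is only an empirical observation for one pair of inputs. Two different inputs can happen to produce the same value in a column, and inputs that change control flow can change the trace length, in which case this sketch returns `None`.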

Patrick Grinaway · Answer 10 · Thu Apr 04 2024 07:49:05 GMT+0800 (China Standard Time)

Sorry for the delayed response, and thanks for all the answers! I will write up a few more details about the actual goal here--I don't think I have any direct security concerns about the proof itself. But I do have one more code question, if you don't mind: is there any way to get a Receipt inside the guest? That is, I'd like to parse it inside the VM to prove things about it (besides its verification and input/output).

Thanks again!

Erik Kaneda · Answer 11 · Thu Apr 04 2024 08:09:35 GMT+0800 (China Standard Time)

@pgrinaway here's a great starting place: https://www.risczero.com/blog/proof-composition

Seeing that this is a non-issue, I'm going to close this. We also welcome questions on our Discord at https://discord.com/invite/risczero so feel free to ask there as well.