privacy-scaling-explorations / zkevm-circuits

Home Page: https://privacy-scaling-explorations.github.io/zkevm-circuits/

[Integration Tests] Evm & Super circuit test (ERC20 Transfer) fail for sub_mock_prover

AronisAt79 opened this issue

What command(s) is the bug in?

No response

Describe the bug

https://github.com/privacy-scaling-explorations/zkevm-circuits/actions/runs/7113665859
failures:
    sub_mock_prover::serial_test_evm_circuit_erc20_openzeppelin_transfer_fail
    sub_mock_prover::serial_test_evm_circuit_erc20_openzeppelin_transfer_succeed
    sub_mock_prover::serial_test_evm_circuit_multiple_erc20_openzeppelin_transfers
    sub_mock_prover::serial_test_evm_circuit_multiple_transfers_0
    sub_mock_prover::serial_test_super_circuit_erc20_openzeppelin_transfer_fail
    sub_mock_prover::serial_test_super_circuit_erc20_openzeppelin_transfer_succeed
    sub_mock_prover::serial_test_super_circuit_multiple_erc20_openzeppelin_transfers
    sub_mock_prover::serial_test_super_circuit_multiple_transfers_0

thread 'sub_mock_prover::serial_test_evm_circuit_erc20_openzeppelin_transfer_fail' panicked at 'circuit fixed columns are not constant for different witnesses', /github/runner/_work/zkevm-circuits/zkevm-circuits/integration-tests/src/integration_test_circuits.rs:351:13

Concrete steps to reproduce the bug. If it's reproducible via testool, please share the test_id from the Jenkins report

Manually trigger the integration tests GH Action workflow on main with option: sub_mock_prover

Further error description:
The failure comes from fn test_variadic: for the EVM tests, different witness inputs result in unequal circuit fixed columns.
Differences in the "fixed" matrix always occur at rows 1 and 2. Cell values are shown in the attached file test_mock_variadic_EVM.log (for each row, a list is printed with elements: (column index, cell value in fixed, cell value in prev_fixed)).

for example:
ROW 1: cells fixed[1][10-16] are assigned value 0x0000000000000000000000000000000000000000000000000000000000000009 while prev_fixed cells are unassigned

ROW 2: cells fixed[2][10-14] are assigned values 0x000000000000000000000000000000000000000000000000000000000000000[1-5] while prev_fixed cells are unassigned

Attachment: test_mock_variadic_EVM.log

P.S. The error does not occur when test_variadic runs on witness inputs from the same block in sequence.
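
For reference, the check that panics has roughly this shape (a sketch reconstructed from the panic message and the log output; the function name and exact types here are assumptions — the real code is in integration-tests/src/integration_test_circuits.rs around line 351):

use halo2_proofs::{
    dev::{CellValue, MockProver},
    halo2curves::bn256::Fr,
};

// Sketch only: after running the MockProver on a new witness, the fixed
// columns must be identical to the ones recorded for a previous witness.
fn check_fixed_is_constant(
    prev_fixed: &mut Option<Vec<Vec<CellValue<Fr>>>>,
    prover: &MockProver<Fr>,
) {
    let fixed = prover.fixed().clone();
    if let Some(prev) = prev_fixed.as_ref() {
        assert_eq!(
            prev, &fixed,
            "circuit fixed columns are not constant for different witnesses"
        );
    }
    *prev_fixed = Some(fixed);
}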

I assume you swapped "Column" and "Row" in your description, so I will read it as follows:

  • Fixed columns 1 and 2 show differences at rows 10-16 and 10-14 respectively.

So first we need to figure out what columns 1 and 2 are, and then what those rows in those columns contain. A straightforward way to do that would be to use a patched version of halo2 that adds the following code to assign_fixed at https://github.com/privacy-scaling-explorations/halo2/blob/73408a140737d8336490452193b21f5a7a94e7de/halo2_proofs/src/circuit.rs#L335

if column.index() == 1 && offset == 10 {
    panic!("Assignment to [1][10]");
}

Then we run the test with the debug profile and RUST_BACKTRACE=1; the backtrace will show where the fixed cell is assigned, and from that it should be easy to figure out which column it is and what that assignment is supposed to mean.

Nevertheless I didn't go that route; instead I did the following.
Column indexes are assigned sequentially, so we can look at the configuration of the EvmCircuit and count which columns are 1 and 2. configure / configure_with_params is the method that creates the columns:

fn configure_with_params(meta: &mut ConstraintSystem<F>, params: Self::Params) -> Self::Config {
    let tx_table = TxTable::construct(meta);
    let rw_table = RwTable::construct(meta);
    let bytecode_table = BytecodeTable::construct(meta);
    let block_table = BlockTable::construct(meta);
    let q_copy_table = meta.fixed_column();
    let copy_table = CopyTable::construct(meta, q_copy_table);
    let keccak_table = KeccakTable::construct(meta);
    // ...

Now we check how many fixed columns each table has, to see which ones are columns 1 and 2.
We find the following:

  • 0 - TxTable.tag
  • 1 - BlockTable.tag
  • 2 - BlockTable.index

pub struct BlockTable {
    /// Tag
    pub tag: Column<Fixed>,
    /// Index
    pub index: Column<Fixed>,
    // ...
}

So now we know that the affected columns are BlockTable.tag and BlockTable.index.
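
Putting this together with the construction order in configure_with_params, the fixed-column indexes fall out as follows (the comment annotations are mine, implied by the findings above; advice columns are numbered separately, so only fixed-column allocations count):

let tx_table = TxTable::construct(meta);             // fixed column 0: TxTable.tag
let rw_table = RwTable::construct(meta);             // no fixed columns
let bytecode_table = BytecodeTable::construct(meta); // no fixed columns
let block_table = BlockTable::construct(meta);       // fixed columns 1 and 2: tag, index
let q_copy_table = meta.fixed_column();              // fixed column 3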

Next we check the code that assigns the values to the BlockTable:

config.block_table.load(&mut layouter, &block.context)?;


for (offset, row) in iter::once([Value::known(F::ZERO); 4])
    .chain(block.table_assignments::<F>())
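
Filling in the rest of that loop, it plausibly looks like this (a sketch under my reading of the code, not a verbatim quote; the value word goes into advice columns and is elided, since only tag and index are fixed):

for (offset, row) in iter::once([Value::known(F::ZERO); 4])
    .chain(block.table_assignments::<F>())
    .enumerate()
{
    // row = [tag, index, value_lo, value_hi]; the first two land in the
    // fixed columns identified above, so they must be witness-independent.
    region.assign_fixed(|| "block table tag", self.tag, offset, || row[0])?;
    region.assign_fixed(|| "block table index", self.index, offset, || row[1])?;
    // value_lo / value_hi assignments to advice columns elided.
}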

Row 0 has zeros, and the next ones have the output of:
pub fn table_assignments<F: Field>(&self) -> Vec<[Value<F>; 4]> {

Now we count up to row 10 (row 0 is the all-zero row, and the per-block scalar fields fill rows 1-9) and find this:
let len_history = self.history_hashes.len();
self.history_hashes
    .iter()
    .enumerate()
    .map(|(idx, hash)| {
        [
            Value::known(F::from(BlockContextFieldTag::BlockHash as u64)),
            Value::known((self.number - len_history + idx).to_scalar().unwrap()),
            Value::known(WordLoHi::from(*hash).lo()),
            Value::known(WordLoHi::from(*hash).hi()),
        ]
    })
    .collect()

And there we find the assignment, from row 10 onwards, to the tag and index columns (which are fixed columns 1 and 2).

We observe that the value assigned to tag is BlockContextFieldTag::BlockHash as u64, which is 9; that matches the test logs. And the values assigned to index are self.number - len_history + idx, which are sequential numbers; this also matches the logs.
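
As a worked example (the block number here is hypothetical, chosen to match the log): if the proved block is number 6, it has only 5 previous blocks, so len_history = 5 and self.number - len_history + idx = 6 - 5 + idx, which evaluates to 1, 2, 3, 4, 5 for idx = 0..5 — exactly the values the log reports in fixed[2][10-14]. A later block with a full 256-entry history would write a different sequence into the same fixed cells, which is what the variadic check rejects.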

We see that this assignment loops over history_hashes:

self.history_hashes
    .iter()
    .enumerate()
    .map(|(idx, hash)| {

So we have found the root of the problem: the vector history_hashes has variable length, and this leads to variable assignments to the fixed tag and index columns of the BlockTable. The integration tests triggered the issue because we start proving early blocks, for which there are not yet 256 previous blocks.

The solution is to make sure that the history_hashes of the block are always padded to the same length (256), and that the padded entries have a number field set such that self.number - len_history + idx in the assignment loop always returns the sequence 0..255.
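
A minimal sketch of that padding idea (the helper name and the zero-placeholder choice are hypothetical, assuming the repo's eth_types::U256; the actual fix may differ):

use eth_types::U256;

const N_HISTORY_HASHES: usize = 256;

/// Hypothetical helper: left-pad the history hashes with zeros so the
/// BlockTable fixed assignment covers the same rows for every witness.
fn pad_history_hashes(mut hashes: Vec<U256>) -> Vec<U256> {
    let missing = N_HISTORY_HASHES.saturating_sub(hashes.len());
    let mut padded = vec![U256::zero(); missing];
    padded.append(&mut hashes);
    padded
}

The block number used in the assignment loop would then need the matching adjustment, so that the index column contents are identical across witnesses.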

This issue is now on hold. A fix is required to rerun integration tests. See #1787