pick nonce based on atx chain instead of the latest for identity
dshulyak opened this issue · comments
when you sync a node you will see errors like this:
2024-03-14T06:59:58.749Z INFO node.sync remaining ballots are rejected in the layer {"sessionId": "f411b333", "errmsg": "getting ballots: batch failure: 206106e59f=fetch ballots: batch failure: 1327b292db=validation reject: proof contains incorrect VRF signature. beacon: 9e188385, epoch: 4, counter: 0, vrfSig: 91ea700ce79adc13e97333400fbd520b4e5517819e6c96c29f2f47c35b0a45ccfe961a4ae263288b113553e50ba0156fe238e61b87bd3094e5a674ddcf71ab62734ca4d754b33eee7fb629712c577f0a", "layer_id": 19061, "name": "sync"}
the key part here is
proof contains incorrect VRF signature
which can only happen for two reasons:
- somebody actually computed an incorrect vrf signature (very unlikely)
- we are using the wrong nonce, taken from an equivocating atx, for this validation
the 2nd root cause can be eliminated if we look up the vrf nonce based on the atx chain, and not just the latest nonce for this identity.
a reasonable way to do this would be to copy the nonce from the previous atx when a new atx is received. this will also simplify nonce lookups on startup.
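a minimal sketch of the copy-on-receive idea, in python for brevity. `Atx`, `AtxStore`, `prev_id` and `nonce_for` are hypothetical names used only for illustration, not the actual node types. it also assumes the previous atx is always stored before its successor.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Atx:
    id: str
    pubkey: str
    epoch: int
    prev_id: Optional[str]  # None for the initial atx of a chain (assumption)
    nonce: Optional[int]    # None when the atx does not declare a new nonce


class AtxStore:
    """Hypothetical in-memory store that materializes the nonce on every atx."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Atx] = {}

    def add(self, atx: Atx) -> None:
        # If the atx does not declare its own nonce, inherit it from the
        # previous atx in the same chain, so every stored atx carries one.
        if atx.nonce is None:
            if atx.prev_id is None:
                raise ValueError("initial atx must carry a nonce")
            prev = self._by_id.get(atx.prev_id)
            if prev is None:
                raise KeyError(f"previous atx {atx.prev_id} is not stored yet")
            atx.nonce = prev.nonce
        self._by_id[atx.id] = atx

    def nonce_for(self, atx_id: str) -> int:
        # Lookup is keyed by atx id, not by identity, so an equivocating
        # chain of the same identity cannot shadow the nonce.
        return self._by_id[atx_id].nonce


store = AtxStore()
store.add(Atx("a1", "id1", epoch=1, prev_id=None, nonce=111))
store.add(Atx("a2", "id1", epoch=2, prev_id="a1", nonce=None))
store.add(Atx("a3", "id1", epoch=3, prev_id="a2", nonce=None))
# equivocation: a second chain for the same identity with another nonce
store.add(Atx("b3", "id1", epoch=3, prev_id=None, nonce=222))
```

with this layout the validator asks for the nonce of the atx that the ballot references (`store.nonce_for("a3")`), and no walk over the chain is needed at lookup time.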
for reference, this is the query that we currently use to get the nonce for an identity:
select nonce from atxs
where pubkey = ?1 and epoch < ?2 and nonce is not null
order by epoch desc
limit 1;
if the identity equivocated, this may not return the correct nonce for it. in the end all of this should be prunable, but for now we need to fix this issue, as it will lead to a stuck sync if honest identities reference a ballot from an equivocating identity.
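to illustrate the failure mode, here is a small reproduction using python's built-in sqlite3. the table layout is an assumption (a reduced `atxs` table with a hypothetical `prev_id` column linking to the previous atx), and the query above is written with named placeholders instead of `?1`/`?2`. the recursive query is only one possible way to resolve the nonce along the chain, not necessarily how the fix should be implemented.

```python
import sqlite3

# The query from this issue, with named placeholders.
LATEST_NONCE = """
select nonce from atxs
where pubkey = :pubkey and epoch < :epoch and nonce is not null
order by epoch desc
limit 1;
"""

# Sketch: start from a specific atx and follow prev_id links until an
# atx that declares a nonce is found.
CHAIN_NONCE = """
with recursive chain(id, prev_id, nonce, depth) as (
    select id, prev_id, nonce, 0 from atxs where id = :atx_id
    union all
    select a.id, a.prev_id, a.nonce, c.depth + 1
    from atxs a join chain c on a.id = c.prev_id
    where c.nonce is null
)
select nonce from chain where nonce is not null order by depth limit 1;
"""


def demo_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table atxs (id text primary key, pubkey text, "
        "epoch int, prev_id text, nonce int)"
    )
    db.executemany(
        "insert into atxs values (?, ?, ?, ?, ?)",
        [
            ("a1", "id1", 1, None, 111),   # initial atx, declares the nonce
            ("a2", "id1", 2, "a1", None),  # inherits the nonce from a1
            ("a3", "id1", 3, "a2", None),  # inherits the nonce from a1
            ("b3", "id1", 3, None, 222),   # equivocating atx, different nonce
        ],
    )
    return db


def latest_nonce(db: sqlite3.Connection, pubkey: str, epoch: int):
    row = db.execute(LATEST_NONCE, {"pubkey": pubkey, "epoch": epoch}).fetchone()
    return row[0] if row else None


def chain_nonce(db: sqlite3.Connection, atx_id: str):
    row = db.execute(CHAIN_NONCE, {"atx_id": atx_id}).fetchone()
    return row[0] if row else None


db = demo_db()
by_identity = latest_nonce(db, "id1", 4)  # picks the nonce declared by b3
by_chain = chain_nonce(db, "a3")          # walks a3 -> a2 -> a1
```

in this setup a ballot in epoch 4 that references `a3` was produced with the nonce from `a1`, but the per-identity query returns the nonce from `b3`, so vrf validation fails even though the ballot is valid.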