[Bug] Malicious validator can send fake block locator and halt the network(node is syncing)
feezybabee opened this issue · comments
https://hackerone.com/reports/2481394
Summary:
Malicious validator send fake block locator and halt the network(node is syncing)
Steps To Reproduce:
- Use this branch.
git clone git@github.com:ghostant-1017/mysnarkOS.git && git checkout attack/block-locator
- Start the devnet
cd snarkos && ./devnet
with 4 validators, 0 clients - Observer the logs, we will find the
2024-04-28T05:47:13.565818Z DEBUG Skipping batch proposal (node is syncing) 2024-04-28T05:47:14.491356Z INFO @@@@@Recevied primary ping from '127.0.0.1:5000'..., height: 100
Proof-of-Concept (PoC)
- Assume
current_height = 100
, malicious validators will forge block_locators at height = 200 - Honest validators will find themselves beind, and set is_synced true
- All validators will skip proposal since they are syncing,
DEBUG Skipping batch proposal (node is syncing)
Impact
Malicious validator send fake block locator and halt the network(node is syncing)
@niklaslong can you comment on this, since I believe we discussed this previously. Shouldn't the continued random sampling of peers ensure that a validator does not get stuck on malicious block_locators for too long?
@niklaslong can you comment on this, since I believe we discussed this previously. Shouldn't the continued random sampling of peers ensure that a validator does not get stuck on malicious block_locators for too long?
It seems that the validator node will not sample peers to disconnect.
@ghostant-1017 If I'm reading this correctly, a single malicious validator is sufficient to reproduce the behaviour? I notice the height is also 0, is this a special case or is it reproducible with a non-empty chain state? Sync logic should contain redundancies (granted not up to quorum) against this type of attack already.
@niklaslong You are right, single malicious validator is sufficient.
You can use that branch in the report, it's reproducible with a non-empty chain state.
Steps:
- ./devnet with 4 validators 0 client
- stop node0 and wait for other 3 nodes produce blocks
- restart node0, the network can not produce blocks anymore.
@feezybabee this is a valid P1, especially for preparing the clear reproduction case. Though I'll note for context its not a very unique P1 because the topic has been discussed internally and there's already a // TODO (howardwu): Change this
in the code right above where the fix should be.
@feezybabee this is a valid P1, especially for preparing the clear reproduction case. Though I'll note for context its not a very unique P1 because the topic has been discussed internally and there's already a
// TODO (howardwu): Change this
in the code right above where the fix should be.
The // TODO
in the code is not the solution to solve this issue.
I'm concerned about utilizing the concept of is_synced
being set to true
at all. In reality, there's no way any validator can ever be confident that it is_synced
because at any given millisecond later, that will not be true. And, as long as validator progress is halted and is_synced
is used as a semaphore, the potential for validator starvation will always exist--unless the concept of is_synced
is hard-removed entirely.