AleoNet / snarkOS

A Decentralized Operating System for ZK Applications

Home Page:http://snarkos.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[Bug] Malicious validator can send fake block locator and halt the network(node is syncing)

feezybabee opened this issue · comments

https://hackerone.com/reports/2481394

Summary:

Malicious validator send fake block locator and halt the network(node is syncing)

Steps To Reproduce:

  1. Use this branch.git clone git@github.com:ghostant-1017/mysnarkOS.git && git checkout attack/block-locator
  2. Start the devnet cd snarkos && ./devnet with 4 validators, 0 clients
  3. Observer the logs, we will find the 2024-04-28T05:47:13.565818Z DEBUG Skipping batch proposal (node is syncing) 2024-04-28T05:47:14.491356Z INFO @@@@@Recevied primary ping from '127.0.0.1:5000'..., height: 100

Logs:
image

Proof-of-Concept (PoC)

  1. Assume current_height = 100, malicious validators will forge block_locators at height = 200
  2. Honest validators will find themselves beind, and set is_synced true
  3. All validators will skip proposal since they are syncing, DEBUG Skipping batch proposal (node is syncing)

Impact

Malicious validator send fake block locator and halt the network(node is syncing)

@niklaslong can you comment on this, since I believe we discussed this previously. Shouldn't the continued random sampling of peers ensure that a validator does not get stuck on malicious block_locators for too long?

@niklaslong can you comment on this, since I believe we discussed this previously. Shouldn't the continued random sampling of peers ensure that a validator does not get stuck on malicious block_locators for too long?

It seems that the validator node will not sample peers to disconnect.

@ghostant-1017 If I'm reading this correctly, a single malicious validator is sufficient to reproduce the behaviour? I notice the height is also 0, is this a special case or is it reproducible with a non-empty chain state? Sync logic should contain redundancies (granted not up to quorum) against this type of attack already.

@niklaslong You are right, single malicious validator is sufficient.
You can use that branch in the report, it's reproducible with a non-empty chain state.
Steps:

  1. ./devnet with 4 validators 0 client
  2. stop node0 and wait for other 3 nodes produce blocks
  3. restart node0, the network can not produce blocks anymore.

@feezybabee this is a valid P1, especially for preparing the clear reproduction case. Though I'll note for context its not a very unique P1 because the topic has been discussed internally and there's already a // TODO (howardwu): Change this in the code right above where the fix should be.

@feezybabee this is a valid P1, especially for preparing the clear reproduction case. Though I'll note for context its not a very unique P1 because the topic has been discussed internally and there's already a // TODO (howardwu): Change this in the code right above where the fix should be.

The // TODO in the code is not the solution to solve this issue.

I'm concerned about utilizing the concept of is_synced being set to true at all. In reality, there's no way any validator can ever be confident that it is_synced because at any given millisecond later, that will not be true. And, as long as validator progress is halted and is_synced is used as a semaphore, the potential for validator starvation will always exist--unless the concept of is_synced is hard-removed entirely.