Istanbul Byzantine Fault Tolerance

yutelin opened this issue 7 years ago · comments

yutelin commented 7 years ago

Change log:

Aug 8, 2017:
- Add gossip network
Jul 24, 2017:
- Add block locking mechanism.
- Performance/bug fixes.
Jun 26, 2017:
- Add extraData tools.
- Update notes and discussions on zero gas price transaction
Jun 22, 2017:
- Initial proposal of Istanbul BFT consensus protocol.

Pull request

ethereum/go-ethereum#14674

Istanbul byzantine fault tolerant consensus protocol

Note, this work is deeply inspired by Clique POA. We've tried to design as similar a mechanism as possible in the protocol layer, such as with validator voting. We've also followed its EIP style of putting the background and rationale behind the proposed consensus protocol to help developers easily find technical references. This work is also inspired by Hyperledger's SBFT, Tendermint, HydraChain, and NCCU BFT.

Terminology

Validator: Block validation participant.
Proposer: A block validation participant that is chosen to propose a block in a consensus round.
Round: Consensus round. A round starts with the proposer creating a block proposal and ends with a block commitment or round change.
Proposal: New block generation proposal which is undergoing consensus processing.
Sequence: Sequence number of a proposal. A sequence number should be greater than all previous sequence numbers. Currently each proposed block height is its associated sequence number.
Backlog: The storage to keep future consensus messages.
Round state: Consensus messages of a specific sequence and round, including pre-prepare message, prepare message, and commit message.
Consensus proof: The commitment signatures of a block that can prove the block has gone through the consensus process.
Snapshot: The validator voting state from the last epoch.

Consensus

Istanbul BFT is inspired by the Castro-Liskov 99 paper. However, the original PBFT needed quite a bit of tweaking to make it work with blockchain. First off, there is no specific "client" which sends out requests and waits for the results. Instead, all of the validators can be seen as clients. Furthermore, to keep the blockchain progressing, a proposer is selected in each round to create a block proposal for consensus. Also, for each consensus result, we expect to generate a verifiable new block rather than a bunch of read/write operations to the file system.

Istanbul BFT inherits from the original PBFT by using a 3-phase consensus: PRE-PREPARE, PREPARE, and COMMIT. The system can tolerate at most F faulty nodes in a network of N validator nodes, where N = 3F + 1. Before each round, the validators pick one of them as the proposer, by default in a round robin fashion. The proposer then proposes a new block and broadcasts it along with the PRE-PREPARE message. Upon receiving the PRE-PREPARE message from the proposer, validators enter the PRE-PREPARED state and then broadcast a PREPARE message. This step makes sure all validators are working on the same sequence and the same round. Upon receiving 2F + 1 PREPARE messages, a validator enters the PREPARED state and then broadcasts a COMMIT message. This step informs its peers that it accepts the proposed block and is going to insert the block into the chain. Lastly, validators wait for 2F + 1 COMMIT messages to enter the COMMITTED state and then insert the block into the chain.
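
The thresholds above can be sketched in a few lines of Go. This is only an illustration of the arithmetic, not code from the pull request; the function names `maxFaulty` and `quorum` are hypothetical, and for validator counts that are not exactly 3F + 1 we assume F is rounded down.

```go
package main

import "fmt"

// maxFaulty returns F, the largest number of faulty validators that a
// network of n validators can tolerate under N >= 3F + 1.
// Rounding down for other values of n is an assumption of this sketch.
func maxFaulty(n int) int {
	if n < 1 {
		return 0
	}
	return (n - 1) / 3
}

// quorum returns 2F + 1, the number of matching PREPARE or COMMIT
// messages a validator waits for before changing state.
func quorum(n int) int {
	return 2*maxFaulty(n) + 1
}

func main() {
	for _, n := range []int{4, 7, 10, 22} {
		fmt.Printf("N=%d F=%d quorum=%d\n", n, maxFaulty(n), quorum(n))
	}
}
```

With the 4 validators of the initial testnet, F is 1 and a validator needs 3 matching messages per phase.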

Blocks in the Istanbul BFT protocol are final, which means that there are no forks and any valid block must be somewhere in the main chain. To prevent a faulty node from generating a totally different chain from the main chain, each validator appends the 2F + 1 received COMMIT signatures to the extraData field in the header before inserting the block into the chain. Thus blocks are self-verifiable and light clients can be supported as well. However, the dynamic extraData would cause an issue in block hash calculation. Since the same block from different validators can have a different set of COMMIT signatures, the same block could have different block hashes as well. To solve this, we calculate the block hash by excluding the COMMIT signatures part. Therefore, we can still keep the block/block hash consistency as well as put the consensus proof in the block header.

Consensus states

Istanbul BFT is a state machine replication algorithm. Each validator maintains a state machine replica in order to reach block consensus.

States:

NEW ROUND: Proposer to send new block proposal. Validators wait for PRE-PREPARE message.
PRE-PREPARED: A validator has received PRE-PREPARE message and broadcasts PREPARE message. Then it waits for 2F + 1 of PREPARE or COMMIT messages.
PREPARED: A validator has received 2F + 1 of PREPARE messages and broadcasts COMMIT messages. Then it waits for 2F + 1 of COMMIT messages.
COMMITTED: A validator has received 2F + 1 of COMMIT messages and is able to insert the proposed block into the blockchain.
FINAL COMMITTED: A new block is successfully inserted into the blockchain and the validator is ready for the next round.
ROUND CHANGE: A validator is waiting for 2F + 1 of ROUND CHANGE messages on the same proposed round number.

State transitions:

NEW ROUND -> PRE-PREPARED:
- Proposer collects transactions from txpool.
- Proposer generates a block proposal and broadcasts it to validators. It then enters the PRE-PREPARED state.
- Each validator enters PRE-PREPARED upon receiving the PRE-PREPARE message with the following conditions:
  - Block proposal is from the valid proposer.
  - Block header is valid.
  - Block proposal's sequence and round match the validator's state.
- Validator broadcasts PREPARE message to other validators.
PRE-PREPARED -> PREPARED:
- Validator receives 2F + 1 of valid PREPARE messages to enter PREPARED state. Valid messages conform to the following conditions:
  - Matched sequence and round.
  - Matched block hash.
  - Messages are from known validators.
- Validator broadcasts COMMIT message upon entering PREPARED state.
PREPARED -> COMMITTED:
- Validator receives 2F + 1 of valid COMMIT messages to enter COMMITTED state. Valid messages conform to the following conditions:
  - Matched sequence and round.
  - Matched block hash.
  - Messages are from known validators.
COMMITTED -> FINAL COMMITTED:
- Validator appends 2F + 1 commitment signatures to extraData and tries to insert the block into the blockchain.
- Validator enters FINAL COMMITTED state when insertion succeeds.
FINAL COMMITTED -> NEW ROUND:
- Validators pick a new proposer and start a new round timer.
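
The transitions up to COMMITTED can be sketched as a small Go state machine. This is a simplified illustration under our own assumptions, not the code in the pull request: the type and method names are hypothetical, message validation (sequence, round, block hash, known validator) is assumed to have happened before these handlers are called, and FINAL COMMITTED and ROUND CHANGE are left out.

```go
package main

import "fmt"

// State is a simplified subset of the consensus states listed above.
type State int

const (
	NewRound State = iota
	Preprepared
	Prepared
	Committed
)

// roundState tracks votes for one (sequence, round) pair on one validator.
type roundState struct {
	state    State
	f        int
	prepares map[string]bool // senders of valid PREPARE messages
	commits  map[string]bool // senders of valid COMMIT messages
}

func newRoundState(f int) *roundState {
	return &roundState{f: f, prepares: map[string]bool{}, commits: map[string]bool{}}
}

func (r *roundState) quorum() int { return 2*r.f + 1 }

// OnPreprepare handles a valid PRE-PREPARE: NEW ROUND -> PRE-PREPARED.
func (r *roundState) OnPreprepare() {
	if r.state == NewRound {
		r.state = Preprepared
	}
}

// OnPrepare counts one PREPARE per sender: PRE-PREPARED -> PREPARED at 2F + 1.
func (r *roundState) OnPrepare(from string) {
	if r.state == NewRound {
		return // a real node would keep this message in its backlog
	}
	r.prepares[from] = true
	if r.state == Preprepared && len(r.prepares) >= r.quorum() {
		r.state = Prepared
	}
}

// OnCommit counts one COMMIT per sender and moves to COMMITTED at 2F + 1,
// even if the PREPARE quorum has not been seen yet.
func (r *roundState) OnCommit(from string) {
	if r.state == NewRound {
		return // a real node would keep this message in its backlog
	}
	r.commits[from] = true
	if r.state < Committed && len(r.commits) >= r.quorum() {
		r.state = Committed
	}
}

func main() {
	r := newRoundState(1) // N = 4, so 2F + 1 = 3
	r.OnPreprepare()
	for _, v := range []string{"v1", "v2", "v3"} {
		r.OnPrepare(v)
	}
	fmt.Println("prepared:", r.state == Prepared)
	for _, v := range []string{"v1", "v2", "v3"} {
		r.OnCommit(v)
	}
	fmt.Println("committed:", r.state == Committed)
}
```

Counting senders in a map rather than counting messages keeps a single validator from reaching the quorum by repeating itself.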

Round change flow

There are three conditions that would trigger ROUND CHANGE:
- Round change timer expires.
- Invalid PREPREPARE message.
- Block insertion fails.
When a validator notices that one of the above conditions applies, it broadcasts a ROUND CHANGE message along with the proposed round number and waits for ROUND CHANGE messages from other validators. The proposed round number is selected based on the following conditions:
- If the validator has received ROUND CHANGE messages from its peers, it picks the largest round number which has F + 1 of ROUND CHANGE messages.
- Otherwise, it picks 1 + current round number as the proposed round number.
Whenever a validator receives F + 1 of ROUND CHANGE messages on the same proposed round number, it compares the received one with its own. If the received is larger, the validator broadcasts ROUND CHANGE message again with the received number.
Upon receiving 2F + 1 of ROUND CHANGE messages on the same proposed round number, the validator exits the round change loop, calculates the new proposer, and then enters NEW ROUND state.
A validator also leaves the round change loop when it receives verified block(s) through peer synchronization.
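
The round number selection described above can be sketched as follows. The function name `proposedRound` and the `votes` map are hypothetical. The text does not say what happens when messages have arrived but no round has F + 1 of them yet; this sketch assumes the validator falls back to the current round plus one in that case.

```go
package main

import "fmt"

// proposedRound picks the round number to send in a ROUND CHANGE message.
// votes maps a proposed round number to the number of ROUND CHANGE
// messages received for it from distinct peers.
func proposedRound(current uint64, votes map[uint64]int, f int) uint64 {
	var best uint64
	found := false
	for round, count := range votes {
		// Only rounds backed by at least F + 1 messages qualify,
		// so at least one honest validator asked for them.
		if count >= f+1 && (!found || round > best) {
			best, found = round, true
		}
	}
	if found {
		return best
	}
	// Assumption of this sketch: with no qualifying round, use current + 1.
	return current + 1
}

func main() {
	votes := map[uint64]int{3: 1, 4: 3, 5: 2}
	fmt.Println(proposedRound(2, votes, 1))
	fmt.Println(proposedRound(2, nil, 1))
}
```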

Proposer selection

Currently we support two policies: round robin and sticky proposer.

Round robin: in a round robin setting, proposer will change in every block and round change.
Sticky proposer: in a sticky proposer setting, the proposer will change only when a round change happens.
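
A minimal sketch of the two policies, assuming the proposer is identified by its index in the sorted validator list and that `lastProposer` is the index of the proposer of the previous block. The names `Policy`, `RoundRobin`, `Sticky`, and `nextProposer` are hypothetical and the pull request may compute the index differently.

```go
package main

import "fmt"

// Policy selects how the proposer rotates.
type Policy int

const (
	RoundRobin Policy = iota
	Sticky
)

// nextProposer returns the index of the proposer for the given round of
// the next block, among n validators.
func nextProposer(policy Policy, lastProposer, round, n int) int {
	offset := round // every round change moves to the next validator
	if policy == RoundRobin {
		offset++ // round robin also moves on for every new block
	}
	return (lastProposer + offset) % n
}

func main() {
	fmt.Println(nextProposer(RoundRobin, 0, 0, 4)) // new block, no round change
	fmt.Println(nextProposer(Sticky, 0, 0, 4))     // same proposer stays
	fmt.Println(nextProposer(Sticky, 0, 1, 4))     // one round change
}
```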

Validator list voting

We use a validator voting mechanism similar to Clique's and copy most of the content from the Clique EIP. Every epoch transition resets the validator voting, meaning that if an authorization or de-authorization vote is still in progress, that voting process will be terminated.

For all transaction blocks:

Proposer can cast one vote to propose a change to the validators list.
Only the latest proposal per target beneficiary is kept from a single validator.
Votes are tallied live as the chain progresses (concurrent proposals allowed).
Proposals reaching majority consensus VALIDATOR_LIMIT come into effect immediately.
Invalid proposals are not to be penalized for client implementation simplicity.
A proposal coming into effect entails discarding all pending votes for that proposal (both for and against) and starts with a clean slate.
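
The tallying rules above can be sketched with a nested map. This is an illustration only: the `ballot` type and `cast` method are hypothetical, validators and targets are plain strings instead of addresses, and the epoch reset is not modelled.

```go
package main

import "fmt"

// ballot records votes as target -> voting validator -> authorize?
type ballot map[string]map[string]bool

// cast stores a vote, overwriting any earlier vote by the same validator
// on the same target. It reports whether the proposal has now reached
// VALIDATOR_LIMIT = floor(n/2) + 1 matching votes; if so, all pending
// votes for that target (for and against) are discarded.
func (b ballot) cast(validator, target string, authorize bool, n int) bool {
	if b[target] == nil {
		b[target] = map[string]bool{}
	}
	b[target][validator] = authorize
	count := 0
	for _, a := range b[target] {
		if a == authorize {
			count++
		}
	}
	if count >= n/2+1 {
		delete(b, target) // start with a clean slate for this target
		return true
	}
	return false
}

func main() {
	b := ballot{}
	n := 4 // VALIDATOR_LIMIT is 3
	fmt.Println(b.cast("v1", "0xabc", true, n))
	fmt.Println(b.cast("v2", "0xabc", true, n))
	fmt.Println(b.cast("v3", "0xabc", true, n))
}
```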

Future message and backlog

In an asynchronous network environment, one may receive future messages which cannot be processed in the current state. For example, a validator can receive COMMIT messages while in the NEW ROUND state. We call this kind of message a "future message." When a validator receives a future message, it puts the message into its backlog and tries to process it later whenever possible.
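
A minimal sketch of how a backlog could classify and replay messages. The types `view`, `message`, and `validator` are hypothetical. We assume a message is "future" either when its sequence and round are ahead of the validator's, or when it is for the current sequence and round but arrives before the PRE-PREPARE.

```go
package main

import "fmt"

// view identifies a consensus instance by sequence (block height) and round.
type view struct{ sequence, round uint64 }

// newer reports whether v comes after w.
func (v view) newer(w view) bool {
	return v.sequence > w.sequence || (v.sequence == w.sequence && v.round > w.round)
}

type message struct {
	kind string // "PRE-PREPARE", "PREPARE" or "COMMIT"
	view view
}

// validator holds just enough state to classify incoming messages.
type validator struct {
	current  view
	newRound bool // true while still waiting for the PRE-PREPARE
	backlog  []message
}

// isFuture reports whether m cannot be processed in the current state.
func (v *validator) isFuture(m message) bool {
	if m.view.newer(v.current) {
		return true
	}
	return m.view == v.current && v.newRound && m.kind != "PRE-PREPARE"
}

// receive returns true if m can be handled now. Future messages are
// stored in the backlog; messages for an old view are dropped.
func (v *validator) receive(m message) bool {
	if v.isFuture(m) {
		v.backlog = append(v.backlog, m)
		return false
	}
	return m.view == v.current
}

// drain returns the backlog messages that have become processable and
// keeps those that are still in the future.
func (v *validator) drain() []message {
	var ready, keep []message
	for _, m := range v.backlog {
		switch {
		case v.isFuture(m):
			keep = append(keep, m)
		case m.view == v.current:
			ready = append(ready, m)
		}
	}
	v.backlog = keep
	return ready
}

func main() {
	v := &validator{current: view{5, 0}, newRound: true}
	fmt.Println(v.receive(message{"COMMIT", view{5, 0}}))      // stored for later
	fmt.Println(v.receive(message{"PRE-PREPARE", view{5, 0}})) // handled now
	v.newRound = false
	fmt.Println(len(v.drain())) // the COMMIT is ready to be replayed
}
```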

Optimization

To speed up the consensus process, a validator that has received 2F + 1 COMMIT messages prior to receiving 2F + 1 PREPARE messages will jump to the COMMITTED state, so that it is not necessary to wait for further PREPARE messages.

Constants

We define the following constants:

EPOCH_LENGTH: Number of blocks after which to checkpoint and reset the pending votes.
- Suggested 30000 for the testnet to remain analogous to the main net ethash epoch.
REQUEST_TIMEOUT: Timeout in milliseconds for each consensus round before firing a round change.
BLOCK_PERIOD: Minimum timestamp difference in seconds between two consecutive blocks.
PROPOSER_POLICY: Proposer selection policy, defaults to round robin.
ISTANBUL_DIGEST: Fixed magic number 0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365 of mixDigest in block header for Istanbul block identification.
DEFAULT_DIFFICULTY: Default block difficulty, which is set to 0x0000000000000001 .
EXTRA_VANITY: Fixed number of extra-data prefix bytes reserved for proposer vanity.
- Suggested 32 bytes to retain the current extra-data allowance and/or use.
NONCE_AUTH: Magic nonce number 0xffffffffffffffff to vote on adding a validator.
NONCE_DROP: Magic nonce number 0x0000000000000000 to vote on removing a validator.
UNCLE_HASH: Always Keccak256(RLP([])) as uncles are meaningless outside of PoW.
PREPREPARE_MSG_CODE: Fixed number 0. Message code for PREPREPARE message.
COMMIT_MSG_CODE: Fixed number 1. Message code for COMMIT message.
ROUND_CHANGE_MSG_CODE: Fixed number 2. Message code for ROUND CHANGE message.

We also define the following per-block constants:

BLOCK_NUMBER: Block height in the chain, where the height of the genesis block is 0.
N: Number of authorized validators.
F: Number of allowed faulty validators.
VALIDATOR_INDEX: Index of the block validator in the sorted list of current authorized validators.
VALIDATOR_LIMIT: Number of validators to pass an authorization or de-authorization proposal.
- Must be floor(N / 2) + 1 to enforce majority consensus on a chain.

Block header

We didn't invent a new block header for Istanbul BFT. Instead, we follow Clique in repurposing the ethash header fields as follows:

beneficiary: Address to propose modifying the list of validators with.
- Should be filled with zeroes normally, modified only while voting.
- Arbitrary values are permitted nonetheless (even meaningless ones such as voting out non validators) to avoid extra complexity in voting mechanics implementation.
nonce: Proposer proposal regarding the account defined by the beneficiary field.
- Should be NONCE_DROP to propose deauthorizing beneficiary as an existing validator.
- Should be NONCE_AUTH to propose authorizing beneficiary as a new validator.
- Must be filled with zeroes, NONCE_DROP or NONCE_AUTH.
mixHash: Fixed magic number 0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365 for Istanbul block identification.
ommersHash: Must be UNCLE_HASH as uncles are meaningless outside of PoW.
timestamp: Must be at least the parent timestamp + BLOCK_PERIOD
difficulty: Must be filled with 0x0000000000000001.
extraData: Combined field for signer vanity and RLP encoded Istanbul extra data, where Istanbul extra data contains validator list, proposer seal, and commit seals. Istanbul extra data is defined as follows:
```
 type IstanbulExtra struct {
 	Validators    []common.Address 	//Validator addresses
 	Seal          []byte			//Proposer seal 65 bytes
 	CommittedSeal [][]byte			//Committed seal, 65 * len(Validators) bytes
 }
```
Thus the extraData would be in the form of EXTRA_VANITY | ISTANBUL_EXTRA where | represents a fixed index to separate vanity and Istanbul extra data (not an actual character for separator).
- First EXTRA_VANITY bytes (fixed) may contain arbitrary proposer vanity data.
- ISTANBUL_EXTRA bytes are the RLP encoded Istanbul extra data calculated from RLP(IstanbulExtra), where RLP() is RLP encoding function, and IstanbulExtra is the Istanbul extra data.
  - Validators: The list of validators, which must be sorted in ascending order.
  - Seal: The proposer's signature sealing of the header.
  - CommittedSeal: The list of commitment signature seals as consensus proof.
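
Splitting the field at the fixed index can be sketched as below. Only the split is shown; decoding the RLP payload into IstanbulExtra is left out because it needs an RLP library. The constant value 32 follows the suggestion for EXTRA_VANITY above, and the function name `splitExtra` is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// extraVanity is the suggested EXTRA_VANITY length in bytes.
const extraVanity = 32

// splitExtra separates the proposer vanity from the RLP encoded
// Istanbul extra data. It does not decode the RLP payload.
func splitExtra(extra []byte) (vanity, istanbulExtra []byte, err error) {
	if len(extra) < extraVanity {
		return nil, nil, errors.New("extraData is shorter than EXTRA_VANITY")
	}
	return extra[:extraVanity], extra[extraVanity:], nil
}

func main() {
	// 32 zero bytes of vanity followed by a made-up two-byte payload.
	extra := append(make([]byte, extraVanity), 0xf8, 0x9a)
	vanity, payload, err := splitExtra(extra)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("vanity: %d bytes, payload: %x\n", len(vanity), payload)
}
```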

Block hash, proposer seal, and committed seals

The Istanbul block hash calculation is different from the ethash block hash calculation due to the following reasons:

The proposer needs to put proposer seal in extraData to prove the block is signed by the chosen proposer.
The validators need to put 2F + 1 of committed seals as consensus proof in extraData to prove the block has gone through consensus.

The calculation is still similar to the ethash block hash calculation, with the exception that we need to deal with extraData. We calculate the fields as follows:

Proposer seal calculation

By the time of proposer seal calculation, the committed seals are still unknown, so we calculate the seal with those unknowns empty. The calculation is as follows:

Proposer seal: SignECDSA(Keccak256(RLP(Header)), PrivateKey)
PrivateKey: Proposer's private key.
Header: Same as ethash header only with a different extraData.
extraData: vanity | RLP(IstanbulExtra), where in the IstanbulExtra, CommittedSeal and Seal are empty arrays.

Block hash calculation

While calculating block hash, we need to exclude committed seals since that data is dynamic between different validators. Therefore, we make CommittedSeal an empty array while calculating the hash. The calculation is:

Header: Same as ethash header only with a different extraData.
extraData: vanity | RLP(IstanbulExtra), where in the IstanbulExtra, CommittedSeal is an empty array.
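
The difference between the two calculations is only which fields of IstanbulExtra are emptied before encoding. A minimal sketch, with the RLP encoding and Keccak256 hashing steps omitted and plain byte arrays standing in for common.Address; the helper names `forProposerSeal` and `forBlockHash` are hypothetical.

```go
package main

import "fmt"

// IstanbulExtra mirrors the struct defined earlier, using [20]byte
// instead of common.Address so the sketch needs no external packages.
type IstanbulExtra struct {
	Validators    [][20]byte
	Seal          []byte
	CommittedSeal [][]byte
}

// forProposerSeal returns the extra data that the proposer signs:
// both Seal and CommittedSeal are empty.
func forProposerSeal(e IstanbulExtra) IstanbulExtra {
	e.Seal = []byte{}
	e.CommittedSeal = [][]byte{}
	return e
}

// forBlockHash returns the extra data used for the block hash:
// Seal is kept, CommittedSeal is empty.
func forBlockHash(e IstanbulExtra) IstanbulExtra {
	e.CommittedSeal = [][]byte{}
	return e
}

func main() {
	e := IstanbulExtra{
		Seal:          make([]byte, 65),
		CommittedSeal: [][]byte{make([]byte, 65), make([]byte, 65), make([]byte, 65)},
	}
	p, h := forProposerSeal(e), forBlockHash(e)
	fmt.Println("proposer seal input:", len(p.Seal), len(p.CommittedSeal))
	fmt.Println("block hash input:", len(h.Seal), len(h.CommittedSeal))
	fmt.Println("stored seals:", len(e.CommittedSeal)) // the caller's copy is untouched
}
```

Because the helpers work on a copy, the committed seals collected by a validator stay in place for insertion while the hash remains the same on every validator.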

Consensus proof

Before inserting a block into the blockchain, each validator needs to collect 2F + 1 committed seals from other validators to compose a consensus proof. Once it receives enough committed seals, it fills the CommittedSeal in IstanbulExtra, recalculates the extraData, and then inserts the block into the blockchain. Note that since committed seals can differ from source to source, we exclude that part while calculating the block hash, as described in the previous section.

Committed seal calculation:

The committed seal is calculated by each validator signing, with its private key, the block hash concatenated with the COMMIT_MSG_CODE message code. The calculation is as follows:

Committed seal: SignECDSA(Keccak256(CONCAT(Hash, COMMIT_MSG_CODE)), PrivateKey).
CONCAT(Hash, COMMIT_MSG_CODE): Concatenate block hash and COMMIT_MSG_CODE bytes.
PrivateKey: Signing validator's private key.
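
The CONCAT step can be sketched as below. Only the construction of the bytes to be hashed is shown; Keccak256 and SignECDSA are left out because they are not in the Go standard library. Encoding COMMIT_MSG_CODE as a single byte is an assumption of this sketch, and `commitSealPreimage` is a hypothetical name.

```go
package main

import "fmt"

// commitSealPreimage returns CONCAT(Hash, COMMIT_MSG_CODE), the bytes
// that are hashed with Keccak256 and then signed by the validator.
// The message code is assumed to be encoded as one byte.
func commitSealPreimage(hash [32]byte, commitMsgCode byte) []byte {
	buf := make([]byte, 0, len(hash)+1)
	buf = append(buf, hash[:]...)
	return append(buf, commitMsgCode)
}

func main() {
	var hash [32]byte
	hash[0] = 0xab
	pre := commitSealPreimage(hash, 1)
	fmt.Printf("%d bytes, first %x, last %x\n", len(pre), pre[0], pre[len(pre)-1])
}
```

Appending the message code keeps a committed seal from being confused with a signature over the bare block hash, such as the proposer seal.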

Block locking mechanism

A locking mechanism is introduced to resolve safety issues. In general, when a proposer is locked at a certain height H with a block B, it can only propose B for height H. Likewise, when a validator is locked, it can only vote on B for height H.

Lock

A lock Lock(B, H) contains a block and its height, which means the validator holding it is currently locked at block B and height H. In the following, we also use + to denote more than and - to denote less than. For example, +2/3 validators denotes more than two-thirds of validators, while -1/3 validators denotes less than one-third of validators.

Lock and unlock

Lock: A validator is locked when it receives 2F + 1 PREPARE messages on a block B at height H.
Unlock: A validator is unlocked at height H and block B when it fails to insert block B into the blockchain.

Protocol (`+2/3` validators are locked with `Lock(B,H)`)

PRE-PREPARE:
- Proposer:
  - Case 1, proposer is locked: Broadcasts PRE-PREPARE on B, and enters PREPARED state.
  - Case 2, proposer is not locked: Broadcasts PRE-PREPARE on block B'.
- Validator:
  - Case 1, received PRE-PREPARE on existing block: Ignore.
    - Note: It will eventually lead to a round change, and the proposer will get the old block through synchronization.
  - Case 2, validator is locked:
    - Case 2.1, received PRE-PREPARE on B: Broadcasts PREPARE on B.
    - Case 2.2, received PRE-PREPARE on B': Broadcasts ROUND CHANGE.
  - Case 3, validator is not locked:
    - Case 3.1, received PRE-PREPARE on B: Broadcasts PREPARE on B.
    - Case 3.2, received PRE-PREPARE on B': Broadcasts PREPARE on B'.
      - Note: This consensus round will eventually end in a round change, since +2/3 validators are locked at B.
PREPARE:
- Case 1, validator is locked:
  - Case 1.1, received PREPARE on B: Broadcasts COMMIT on B, and enters PREPARED state.
    - Note: This shouldn't happen though, it should have skipped this step and entered PREPARED in PRE-PREPARE stage.
  - Case 1.2, received PREPARE on B': Ignore.
    - Note: There shouldn't be +1/3 PREPARE on B' since +2/3 are locked at B. Thus the consensus round on B' will cause a round change. The validator cannot broadcast ROUND CHANGE directly here, since this PREPARE message may come from a faulty node.
- Case 2, validator is not locked:
  - Case 2.1, received PREPARE on B: Waits for 2F + 1 PREPARE messages on B.
    - Note: Most likely it will receive 2F + 1 COMMIT messages prior to receiving 2F + 1 PREPARE messages since there are +2/3 validators being locked at B. In this case, it will jump to COMMITTED state directly.
  - Case 2.2, received PREPARE on B': Waits for 2F + 1 PREPARE message on B'.
    - Note: This consensus round will eventually end in a round change, since +2/3 validators are locked at B.
COMMIT:
- Validator must be locked:
  - Case 1, received COMMIT on B: Waits for 2F + 1 COMMIT messages.
  - Case 2, received COMMIT on B': Shouldn't happen.
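
The validator side of the PRE-PREPARE cases above can be sketched as a single decision function. The names `Action` and `onPreprepare` are hypothetical and blocks are identified by plain strings, with the empty string meaning "not locked".

```go
package main

import "fmt"

// Action is what a validator does in response to a PRE-PREPARE.
type Action string

const (
	Ignore          Action = "ignore"
	SendPrepare     Action = "broadcast PREPARE"
	SendRoundChange Action = "broadcast ROUND CHANGE"
)

// onPreprepare picks the reaction to a PRE-PREPARE for block proposed.
// locked is the block the validator is locked on, or "" if it is not
// locked. inChain tells whether the proposed block already exists.
func onPreprepare(locked, proposed string, inChain bool) Action {
	switch {
	case inChain:
		return Ignore // Case 1: proposal for an existing block
	case locked != "" && locked != proposed:
		return SendRoundChange // Case 2.2: locked on B, received B'
	default:
		return SendPrepare // Cases 2.1, 3.1 and 3.2
	}
}

func main() {
	fmt.Println(onPreprepare("B", "B", false))
	fmt.Println(onPreprepare("B", "B'", false))
	fmt.Println(onPreprepare("", "B'", false))
	fmt.Println(onPreprepare("", "B", true))
}
```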

Locking cases

Round change:
- Case 1, +2/3 are locked:
  - If proposer is locked, it'd propose B.
  - Else it'd propose B', which will lead to another round change.
  - Conclusion: eventually B will be committed by honest validators.
- Case 2, +1/3 ~ 2/3 are locked:
  - If proposer is locked, it'd propose B.
  - Else it'd propose B'. However, since +1/3 are locked at B, no validator can ever receive 2F + 1 PREPARE on B', meaning no validator can be locked at B'. Also, those +1/3 locked validators will not respond to B', which eventually leads to a round change.
  - Conclusion: eventually B will be committed by honest validators.
- Case 3, -1/3 are locked:
  - If proposer is locked, it'd propose B.
  - Else it'd propose B'. If +2/3 reach consensus on B', those locked -1/3 will get B' through synchronization and move to the next height. Otherwise, there will be another round change.
  - Conclusion: either B or another block B' may finally be committed.
Round change caused by insertion failure:
- It will fall in one of the above round change cases.
  - If the block is actually bad (cannot be inserted to blockchain), eventually +2/3 validators will unlock block B at H and try to propose a new block B'.
  - If the block is good (can be inserted to blockchain), then it would still be one of the above round change cases.
-1/3 validators insert the block successfully, but others successfully trigger round change, meaning +1/3 are still locked at Lock(B,H)
- Case 1, proposer has inserted B: Proposer will propose B' at H', but +1/3 are locked at B, so B' won't pass the consensus, which will eventually lead to round change. Other validators will either perform consensus on B or get B through synchronization.
- Case 2, proposer hasn't inserted B:
  - Case 2.1, proposer is locked: Proposer proposes B.
  - Case 2.2, proposer is not locked: Proposer will propose B' at H. The rest is the same as above case 1.
+1/3 validators insert the block successfully, -2/3 are trying to trigger round change at H.
- Case 1, proposer has inserted B: Proposer will propose B' at H', but won't pass the consensus until +1/3 get B through synchronization.
- Case 2, proposer has not inserted B:
  - Case 2.1, proposer is locked: Proposer proposes B.
  - Case 2.2, proposer is not locked: Proposer proposes B' at H. The rest is the same as above case 1.
+2/3 validators insert the block successfully, -1/3 are trying to trigger round change at H.
- Case 1, proposer has inserted B: proposer will propose B' at H', which may lead to a successful consensus. Then those -1/3 need to get B through synchronization.
- Case 2, proposer has not inserted B:
  - Case 2.1, proposer is locked: Proposer proposes B.
  - Case 2.2, proposer is not locked: Proposer proposes B' at H. Since +2/3 have B at H already, this round would cause round change.

Gossip network

Traditionally, validators need to be strongly connected in order to reach stable consensus results, which means all validators need to be connected directly to each other; however, in a practical network environment, stable and constant p2p connections are hard to achieve. To overcome this constraint, Istanbul BFT implements a gossip network. In a gossip network environment, validators only need to be weakly connected, which means any two validators are considered connected when they are either directly connected or connected through one or more validators in between. Consensus messages are relayed between validators.
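
The weak connectivity requirement can be checked with a breadth-first traversal, which also models a message being relayed once by every validator that receives it. This is an illustration only; the `reach` function and the peer map are hypothetical and say nothing about how the pull request relays messages.

```go
package main

import "fmt"

// reach returns the set of validators that a message from src reaches
// when every validator relays each new message once to its direct peers.
func reach(peers map[string][]string, src string) map[string]bool {
	seen := map[string]bool{src: true}
	queue := []string{src}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, p := range peers[cur] {
			if !seen[p] { // relay only to validators that have not seen it
				seen[p] = true
				queue = append(queue, p)
			}
		}
	}
	return seen
}

func main() {
	// A - B - C - D form a chain: weakly connected, not fully meshed.
	peers := map[string][]string{
		"A": {"B"},
		"B": {"A", "C"},
		"C": {"B", "D"},
		"D": {"C"},
	}
	fmt.Println(len(reach(peers, "A")) == len(peers))
}
```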

How to run

Running Istanbul BFT validators and nodes is similar to running the official node in a private chain. First of all, you need to initialize the data folder as:

geth  --datadir "/eth" init "/eth/genesis.json"

Then,
for validators:

geth --datadir "/eth" --mine --minerthreads 1 --syncmode "full"

for regular nodes:

geth --datadir "/eth"

Note on syncmode:
--syncmode "full" is required for the first set of validators to initialize a new network. Since we are using the fetcher to insert blocks, if we don't set it to full mode, the fetcher cannot insert the first block. Please refer to the following code in eth/handler.go.

inserter := func(blocks types.Blocks) (int, error) {
	// If fast sync is running, deny importing weird blocks
	if atomic.LoadUint32(&manager.fastSync) == 1 {
		log.Warn("Discarded bad propagated block", "number", blocks[0].Number(), "hash", blocks[0].Hash())
		return 0, nil
	}
	atomic.StoreUint32(&manager.acceptTxs, 1) // Mark initial sync done on any fetcher import
	return manager.blockchain.InsertChain(blocks)
}

The sync mode only matters when there are existing blocks, so there is no impact when initializing a new network.

Validators that join later don't need to use full mode, as they can get blocks through the downloader. After the first sync from peers, they will automatically switch to full mode.

Command line options

$geth help

ISTANBUL OPTIONS:
  --istanbul.requesttimeout value  Timeout for each Istanbul round in milliseconds (default: 10000)
  --istanbul.blockperiod value     Default minimum difference between two consecutive block's timestamps in seconds (default: 1)

Nodekey and validator

To be a validator, a node needs to meet the following conditions:

Its account (the address derived from its nodekey) needs to be listed in extraData's validators section.
Use its nodekey as its private key to sign consensus messages.

genesis.json

To run the Istanbul BFT chain, the config field is required, and the istanbul subfield must be present. For example:

{
  "config": {
    "chainId": 2016,
    "istanbul": {
      "epoch": 30000,
      "policy": 0
    }
  },
  "timestamp": "0x0",
  "parentHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
  "extraData": "0x0000000000000000000000000000000000000000000000000000000000000000f89af85494475cc98b5521ab2a1335683e7567c8048bfe79ed9407d8299de61faed3686ba4c4e6c3b9083d7e2371944fe035ce99af680d89e2c4d73aca01dbfc1bd2fd94dc421209441a754f79c4a4ecd2b49c935aad0312b8410000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000c0",
  "gasLimit": "0x47e7c4",
  "mixhash": "0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365",
  "coinbase": "0x3333333333333333333333333333333333333333",
  "nonce": "0x0",
  "difficulty": "0x0",
  "alloc": {}
}

`extraData` tools

We've created a set of extraData coding tools in the istanbul-tools repository to help developers manually generate genesis.json.

Encoding:
Before encoding, you need to create a toml file with vanity and validators fields to define the proposer vanity and the validator set. Please refer to example.toml for an example. The output is a hex string which can be put into the extraData field directly.

Command:

istanbul encode --config ./config.toml

Decoding:
Use the --extradata option to give the extraData hex string. The output shows the following, if present: vanity, validator set, seal, and committed seal.

Command:

istanbul decode --extradata <EXTRA_DATA_HEX_STRING>

Ottoman testnet

We have set up a testnet for public testing. There are initially 4 validators and no designated faulty nodes. In the future, we want to extend it to 22 validators and set up a few faulty nodes amongst them.

Run testnet node

geth --ottoman

Faulty node

We have implemented a simple faulty node that can make a validator run faulty behaviors during consensus. There are six behaviors included in this implementation:

NotBroadcast: The validator doesn't broadcast any message.
SendWrongMsg: The validator sends out messages with wrong message codes.
ModifySig: The validator modifies the message signatures.
AlwaysPropose: The validator always sends out proposals.
AlwaysRoundChange: The validator always sends ROUND CHANGE while receiving messages.
BadBlock: The validator proposes a block with a bad body.

Run the following command to enable faulty node:

geth --istanbul.faultymode <MODE>

Where <MODE> can be one of the following numbers:

0: Disable faulty behaviors.
1: Randomly run any faulty behaviors.
2: NotBroadcast.
3: SendWrongMsg.
4: ModifySig.
5: AlwaysPropose.
6: AlwaysRoundChange.
7: BadBlock.

Background

The idea of implementing a byzantine fault tolerance (BFT) consensus came from the challenges we faced while building blockchain solutions for banks. We chose ethereum as the baseline protocol mostly because of its smart contract capability. However, the built-in consensus, proof of work or ethash, is not the ideal choice when settlement finality and minimum latency are required.

Banking systems tend to form a private chain or consortium chain to run their applications. PBFT is ideal for these settings. These environments require a higher degree of manageability and higher throughput. In terms of scalability, validator scalability is not required. Many of the decentralization benefits of PoW in public chains become drawbacks in a private/consortium chain. On the other hand, designated validators in a PBFT environment map well to private/consortium chains.

Remaining Tasks

Testnet: Currently the Ottoman testnet only has 4 validators. We'd like to extend it to 22 validator nodes and set up some faulty nodes amongst them (fewer than 7 faulty nodes).
Weighted round robin: This will require a redesign of the extraData field, but should be fairly straightforward.
Remove or make block period configurable: In certain setups, it may make sense to generate as many blocks as possible. Currently, the default value is 1 second. To remove this limitation, we will also need to adjust the original worker.go code.
Benchmarking and stress testing:
- Validator scalability.
- Node scalability.
- Transactions per second.
Smarter way to detect faulty proposer: A proposer can always generate empty blocks or small blocks without acting faulty; however, this would impact the throughput of the network. We need to design better round change criteria that take those kinds of performance-related faulty behaviors into consideration.
Formal proof of safety and liveness.

Notes and discussions

Does it still make sense to use gas?

Yes. We still need gas to prevent infinite loops and any kind of EVM exhaustion.

Does it make sense to charge gas in a consortium chain?

The network would be vulnerable if every account had unlimited gas or unlimited transaction sending power. However, to allow free transactions, one can run all validators with the flag --gasprice 0 so that they accept transactions with a gas price of zero.

Put consensus proof in the next block?

Currently the extraData in our block header can vary depending on its source validator, because each validator needs to put the consensus proof in the block header. One way to resolve this is to put the proof in the next block. In the proposing stage, the proposer can then select 2F + 1 commitment signatures of the previous block and put them in the header of the block currently being proposed. However, it would require each block to have one confirmation to reach finality (not instant finality).

Proof of lock

Inspired by Tendermint. We are still considering whether to add it to this EIP. Further efficiency benefits can be realized by reusing a current proposed block in a round change situation.

Contribution

The work was initiated and open sourced by the Amis team. We're looking for developers around the world to contribute. Please feel free to contact us:

Forked repository (and original implementation branch)

https://github.com/getamis/go-ethereum/tree/feature/pbft

Clarifications and feedback

TBD

Bob Summerwill · Answer 1 · Sun Jun 25 2017 08:16:10 GMT+0800 (China Standard Time)

Fantastic work, guys!

vbuterin · Answer 2 · Mon Jun 26 2017 14:17:15 GMT+0800 (China Standard Time)

Great work!

Block insertion fails

Can you explain when block insertion might fail? I'm struggling to see why block insertion would ever fail for a valid proposal.

Return transaction fee to sender

Why not just accept zero-gasprice transactions?

We have implemented a simple faulty node that can make a validator run faulty behaviors during consensus.

Have you tried running the network with >=1/3 faulty nodes? If so, what does the result look like; what kinds of failures do you see in practice?

yutelin · Answer 3 · Mon Jun 26 2017 17:51:07 GMT+0800 (China Standard Time)

Thanks @vbuterin

Block insertion fails

Before actually inserting the block into the chain, the consensus only validates the block header. Insertion does more checks, so it can fail for other reasons.

Return transaction fee to sender

You're right. We've updated the EIP accordingly.

testing >=1/3 faulty nodes?

Yes.

If there are more than 2/3 of faulty nodes, those faulty nodes can control the consensus. They can generate faulty blocks or keep running round change.
If there are more than 1/3 and less than 2/3 of faulty nodes, it will keep running round change and no consensus can be reached.

vbuterin · Answer 4 · Mon Jun 26 2017 19:12:34 GMT+0800 (China Standard Time)

If there are more than 1/3 and less than 2/3 of faulty nodes, it will keep running round change and no consensus can be reached.

Theoretically it's also possible to finalize two conflicting blocks, if the proposer is one of the Byzantine nodes and makes two proposals and each get 2/3 prepares+commits. Though I guess that's fairly unlikely to happen in practice and so won't appear in that many random tests.

Stefano De Angelis · Answer 5 · Mon Jun 26 2017 22:42:43 GMT+0800 (China Standard Time)

Each validator enters PRE-PREPARED upon receiving the PRE-PREPARE message with the following conditions:
Block proposal is from the valid proposer.
Block header is valid.
Block proposal's sequence and round match the validator's state.

I know the meaning of block validity, but outside the PoW this is a little bit ambiguous.
When is a block considered valid or invalid without proof-of-work?

Alexander C. · Answer 6 · Tue Jun 27 2017 02:17:00 GMT+0800 (China Standard Time)

sequence number should be greater than all pervious sequence numbers.

pervious -> previous

I like the structure, but for someone not accustomed to the terminology, 2F + 1 not being defined until the Constants section makes it more difficult to understand.

yutelin · Answer 7 · Tue Jun 27 2017 20:22:39 GMT+0800 (China Standard Time)

@vbuterin

Theoretically it's also possible to finalize two conflicting blocks, if the proposer is one of the Byzantine nodes and makes two proposals and each get 2/3 prepares+commits. Though I guess that's fairly unlikely to happen in practice and so won't appear in that many random tests.

Yes, I think you are right. Suppose there are f+1 faulty nodes and f+f good nodes, and the proposer is among the faulty nodes. The proposer can send block A to the first f good nodes and block B to the second f good nodes. Then both groups can receive 2f+1 prepares+commits for blocks A and B respectively. Thus two conflicting blocks can be finalized.
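
The vote counts in this scenario can be checked with a few lines of arithmetic. This is a sketch of the counting only, assuming every faulty node votes for both blocks and each honest group sees only its own block. The function name and dictionary keys are made up for illustration.

```python
def equivocation_votes(f: int) -> dict:
    """Count votes for blocks A and B when a Byzantine proposer equivocates.

    Setup assumed from the scenario: f + 1 faulty nodes (including the
    proposer) and two honest groups of f nodes each.
    """
    faulty = f + 1
    group_a = f  # honest nodes that only see block A
    group_b = f  # honest nodes that only see block B
    return {
        "n": faulty + group_a + group_b,
        "quorum": 2 * f + 1,
        "A": group_a + faulty,  # honest group A plus all faulty nodes
        "B": group_b + faulty,  # honest group B plus all faulty nodes
    }


if __name__ == "__main__":
    print(equivocation_votes(1))
```

For any f, both blocks collect exactly 2f + 1 votes, which meets the quorum, so each honest group can finalize a different block.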

@deanstef

I know the meaning of block validity, but outside the PoW this is a little bit ambiguous. When is a block considered valid or invalid without proof-of-work?

Each validator puts 2F+1 committed seals into the extraData field in block header before inserting the block into the chain, which is seen as the consensus proof of the associated block. extraData also contains proposer seal for validators to verify the block source during consensus (same mechanism as in Clique).
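
As a rough illustration of what "consensus proof" means here, the check below counts distinct validators among the committed seals. It is a simplified sketch: signature recovery is skipped entirely, so the input is assumed to be the list of signer addresses already recovered from the seals, and the function name is hypothetical.

```python
def has_consensus_proof(seal_signers, validators) -> bool:
    """Return True if at least 2F + 1 distinct validators sealed the block.

    seal_signers: addresses assumed to be already recovered from the
                  committed seals in extraData (recovery not shown).
    validators:   the current validator set.
    """
    f = (len(validators) - 1) // 3
    # Ignore seals from non-validators and count each validator once.
    distinct = set(seal_signers) & set(validators)
    return len(distinct) >= 2 * f + 1


if __name__ == "__main__":
    vals = ["v1", "v2", "v3", "v4"]
    print(has_consensus_proof(["v1", "v2", "v3"], vals))
    print(has_consensus_proof(["v1", "v1", "v2"], vals))
```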

@ice09
Thanks, we've updated this EIP accordingly.

Stefano De Angelis · Answer 8 · Tue Jun 27 2017 23:53:38 GMT+0800 (China Standard Time)

Each validator puts 2F+1 committed seals into the extraData field in block header before inserting the block into the chain, which is seen as the consensus proof of the associated block. extraData also contains proposer seal for validators to verify the block source during consensus (same mechanism as in Clique).

Great! I was a little confused about the difference between a valid block and the consensus proof. Your response is also helpful for understanding the meaning of validation in Clique. Thank you.
Nice work, guys!

Ethan Buchman · Answer 9 · Thu Jun 29 2017 09:23:05 GMT+0800 (China Standard Time)

Round change timer expires.

Can you clarify when this timer starts? Is there one timer for the whole round, like in PBFT (well, in PBFT the timer starts once the client request is received), or is there a new timer at each phase (pre-prepared, prepared, etc.) as the figure seems to suggest?

Unless there is additional mechanism not described above (or perhaps I am just missing something), I think this protocol may have safety issues across round changes, as there does not seem to be anything stopping validators from committing a new block in a new round after others have committed in the previous round. This is what the "locking" mechanism in Tendermint addresses. In PBFT it's handled by broadcasting much more information during the round change. When you "blockchainify" PBFT, you can do away with this extra information if you're careful to introduce something like Tendermint's locking mechanism. I suspect that if you address these issues, you will end up with a protocol that is roughly identical (if not exactly identical) to Tendermint. Happy to discuss further and collaborate on this - great initiative!

yutelin · Answer 10 · Thu Jun 29 2017 18:55:25 GMT+0800 (China Standard Time)

@ebuchman

Can you clarify when this timer starts?

Yes, there is only one timer, which is reset/triggered at the beginning of every new round.

safety issues across round changes

Yes, in some extreme cases there might be safety issues. For example, say only one validator receives 2F+1 commits and all the others do not. Then that validator would insert a valid block into its chain while the others would start a new round at the same block height. Eventually that might lead to conflicting blocks. We've put a locking mechanism in the remaining tasks section. And yeah, we're looking forward to collaborating with Tendermint!

kumavis · Answer 11 · Sat Jul 01 2017 03:22:19 GMT+0800 (China Standard Time)

A sticky proposer seems like it would be able to submit empty blocks or censor transactions if it never passed through the RoundChange state. As long as it submits valid blocks, it can hold the proposer role indefinitely.

kumavis · Answer 12 · Sat Jul 01 2017 03:27:09 GMT+0800 (China Standard Time)

Blocks in Istanbul BFT protocol are final, which means that there are no forks and any valid block must be somewhere in the main chain.

Seems like a strong claim considering there is no penalty to being a faulty node (e.g. voting on multiple forks)

yutelin · Answer 13 · Sat Jul 01 2017 09:02:41 GMT+0800 (China Standard Time)

@kumavis

Faulty sticky proposer can keep generating empty valid blocks.

Yes, the sticky proposer policy can lead to this issue. We've listed "faulty propose detection" in the remaining tasks section, aiming to resolve it. One possible way is to switch to the round robin policy whenever a validator sees an empty block. However, a sticky proposer can still get around this by generating a very small block every round.

Block finality and penalty on faulty node.

Detecting faulty nodes deterministically is hard, which makes penalizing them even harder. For simplicity, this PR doesn't dive into this topic. It might be worth looking into in a follow-up EIP and further research.
Block finality is indeed a strong claim. In some rare cases, as @ebuchman pointed out, there might be safety issues. We listed this in the remaining tasks section as well, and are looking to resolve it by introducing some kind of locking mechanism.

Thomas Hu · Answer 14 · Tue Jul 04 2017 06:58:04 GMT+0800 (China Standard Time)

Awesome work! Can you give us a sense of performance benchmark in terms of throughput and latency? Thanks!

yutelin · Answer 15 · Tue Jul 04 2017 09:59:35 GMT+0800 (China Standard Time)

@epoquehu

Throughput and latency

In our preliminary testing with a 4-validator setup, consensus took around 10 ms to 100 ms, depending on the number of transactions per block. In our testing, we allowed each block to contain up to 2000 transactions.
Regarding throughput, transactions per second (TPS) ranged from 400 to 1200; however, there are still many Geth factors that significantly affect the result. We are trying to fix some of them and work around others.
More comprehensive benchmarking and stress testing is still in progress. Stay tuned!

yutelin · Answer 16 · Mon Jul 24 2017 15:11:11 GMT+0800 (China Standard Time)

Update: 68cbcf

Add block locking mechanism.
Performance/bug fixes.

Jonathan · Answer 17 · Wed Aug 02 2017 11:48:11 GMT+0800 (China Standard Time)

Is there any way to keep the nodekey (account private key) secured? Seems like it's left there unencrypted.

yutelin · Answer 18 · Tue Aug 08 2017 11:11:50 GMT+0800 (China Standard Time)

Update: 0f066fb

Add gossip network

Michael · Answer 19 · Thu Aug 10 2017 16:51:41 GMT+0800 (China Standard Time)

Great work on developing Istanbul!

One comment on "Does it still make sense to use gas?"

I've developed a testnet (using Ethermint) and modified the client to not charge gas. I wanted to bounce this idea off others to see whether it is valid...

To avoid the infinite loop problem, the validators ensure that smart contracts being published to the blockchain are sent from a small set of whitelisted accounts.

These accounts are trusted by the consortium to only publish smart contracts that have gone through a strict review process.

I suppose that in the extreme edge case where a computationally expensive contract slipped through and was published by mistake, the validators would stop and roll back to before the event.

Does this sound reasonable?

Appreciate any feedback on the faults with such an implementation.

Thanks.

Steven Roose · Answer 20 · Wed Jan 24 2018 18:20:37 GMT+0800 (China Standard Time)

The current implementation (as found in Quorum) breaks the concept of the "pending" block, used in several calls, but most notably in eth_getTransactionCount (PendingNonceAt in ethclient):

In Ethereum, the pending block means the latest confirmed block + all pending transactions the node is aware of. This means that directly after a transaction is sent to the node (through RPC), the transaction count (aka nonce) in the "pending" block is increased. A lot of tools, like abigen in this repo or any other tool where tx signing occurs at the application level instead of in geth, rely on this for making multiple transactions at once. After the first one, the result of eth_getTransactionCount will increase so that a valid second tx can be crafted.

With the current implementation of Istanbul, the definition of the "pending block" seems to be different. When submitting a transaction, the result of eth_getTransactionCount for the sender in the "pending" block does not change. When a new block is confirmed (not containing this tx), it does change, however (while the value for "latest" doesn't). Then, on the next block confirmation, "latest" also changes because the tx is in the confirmed block.

So this seems to mean that the "pending block" definition changed from "latest block + pending txs" to "the block that is currently being voted on". I consider this a bug; if this is done on purpose, it breaks with a lot of existing applications (all users of abigen, f.e.) and should be reconsidered.
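
To make the expected behaviour concrete, here is a toy model of the "latest block + pending txs" definition described above. It is not geth or Quorum code; the class and method names are invented for illustration, and only the counting logic is modelled.

```python
class ToyNode:
    """Toy model of nonce counting under 'latest block + pending txs'."""

    def __init__(self):
        self.confirmed = {}  # sender -> number of txs in confirmed blocks
        self.pool = []       # senders of txs waiting in the pool

    def send_transaction(self, sender: str) -> None:
        # A submitted tx enters the pool right away.
        self.pool.append(sender)

    def confirm_block(self) -> None:
        # Move every pooled tx into the confirmed state.
        for sender in self.pool:
            self.confirmed[sender] = self.confirmed.get(sender, 0) + 1
        self.pool = []

    def transaction_count(self, sender: str, block: str = "latest") -> int:
        count = self.confirmed.get(sender, 0)
        if block == "pending":
            # Pending view includes txs the node knows about but has not mined.
            count += self.pool.count(sender)
        return count


if __name__ == "__main__":
    node = ToyNode()
    node.send_transaction("alice")
    print(node.transaction_count("alice", "pending"))
    print(node.transaction_count("alice", "latest"))
```

Under this definition the "pending" count rises as soon as a transaction is submitted, which is what lets a client craft a second transaction with the next nonce before the first one is confirmed.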

I originally reported about this issue in the Quorum repo, but there doesn't seem to be a good place to report bugs in Istanbul other than here.

Matthew Halpern · Answer 21 · Fri Feb 02 2018 22:33:59 GMT+0800 (China Standard Time)

I'm sorry to disrupt the technical discussion here with a non-technical question: What is the intention for including this in the EIP repository? In particular I was wondering:

(1) Is this proposal seeking public protocol adoption (it seems private chain focused, really at extending quorum with the aims of also moving upstream to geth)?
(2) Does the scope of EIPs in this repository extend beyond public chain protocol improvements?

renuseabhaya · Answer 22 · Fri Apr 20 2018 16:06:29 GMT+0800 (China Standard Time)

I used the set of extraData coding tools in the istanbul-tools repository to manually generate genesis.json and defined the toml file too, but when I start the nodes, they throw the error "Failed to decode message from payload", err="unauthorized address"

Chaitanya Varma Konduru · Answer 23 · Wed May 02 2018 04:35:55 GMT+0800 (China Standard Time)

Fantastic work

Anderson de Souza · Answer 24 · Fri Jun 22 2018 22:29:11 GMT+0800 (China Standard Time)

Thank you guys very much for this great contribution.
I would like to know about the progress on it.

michaelkunzmann-sap · Answer 25 · Mon Jul 23 2018 08:05:48 GMT+0800 (China Standard Time)

@renuseabhaya I had the same issue. My problem was that with Istanbul, you do not use a "regular" account (meaning an account that you generate using geth account new) to make nodes validators. You need to use the node key and create an account from the node key.

@yutelin Can you explain what the rationale was behind using an account address, derived from the node key, to identify validators instead of using the regular enode ID that is already being used for identifying nodes?

yutelin · Answer 26 · Mon Jul 23 2018 11:34:01 GMT+0800 (China Standard Time)

@michaelkunzmann-sap enode id is from node key.

michaelkunzmann-sap · Answer 27 · Tue Jul 24 2018 01:45:39 GMT+0800 (China Standard Time)

@yutelin Yes, correct. So currently we are using

istanbul.propose("0x23971dab0b29c27fa0de9226c45bef04d9f39156", true)

Where 0x23971dab0b29c27fa0de9226c45bef04d9f39156 is the "address" of the node to be permitted. As far as I understand, this address does not represent a regular account like we create with geth account new, since it is derived from the node key:

node_address = address(pub(node_key))

Since the enode id is also derived from the private node key (in its original purpose), is it possible to use the enode id instead of the address? This would save the extra step of generating an address from node key.

istanbul.propose("6f8a80d14311c39f35f516fa664deaaaa13e85b2f7493f37f6144d86991ec012937307647bd3b9a82abe2974e1407241d54947bbb39763a4cac9f77166ad92a0", true)

drandreaskrueger · Answer 28 · Tue Aug 07 2018 16:51:08 GMT+0800 (China Standard Time)

Ottoman testnet
We have setup a testnet for public testing. There are initially 4 validators and no designated faulty nodes. In the future, we want to extend it to 22 validators and setup few faulty nodes amongst them.
Run testnet node
geth --ottoman

I have tried that with the newest geth

geth version
Version: 1.8.14-unstable
Architecture: amd64
Go Version: go1.10.3
Operating System: linux

but I get a

flag provided but not defined: -ottoman

drandreaskrueger · Answer 29 · Tue Aug 07 2018 16:57:31 GMT+0800 (China Standard Time)

So it is not yet part of vanilla geth? Only quorum?

In quorum the switch --ottoman is recognized:

geth_quorum --ottoman

WARN [08-07|09:50:51] No etherbase set and no accounts found as default 
INFO [08-07|09:50:51] Starting peer-to-peer node               instance=Geth/v1.7.2-stable-df4267a2/linux-amd64/go1.9.3
INFO [08-07|09:50:51] Allocated cache and file handles         database=~/.ethereum/ottoman/geth/chaindata cache=128 handles=1024
INFO [08-07|09:50:51] Writing custom genesis block 
INFO [08-07|09:50:51] Initialised chain configuration          config="{ChainID: 5 Homestead: 1 DAO: <nil> DAOSupport: true EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 9223372036854775807 IsQuorum: false Engine: istanbul}"
INFO [08-07|09:50:51] Initialising Ethereum protocol           versions="[63 62]" network=5
INFO [08-07|09:50:51] Loaded most recent local header          number=0 hash=22919a…075196 td=1
INFO [08-07|09:50:51] Loaded most recent local full block      number=0 hash=22919a…075196 td=1
INFO [08-07|09:50:51] Loaded most recent local fast block      number=0 hash=22919a…075196 td=1
INFO [08-07|09:50:51] Regenerated local transaction journal    transactions=0 accounts=0
INFO [08-07|09:50:51] Starting P2P networking 
INFO [08-07|09:50:53] UDP listener up                          self=enode://fe329f4395d30db66cced5d750fd4395993f66ccd08c703ea2653b78cdd364b76938e13d2ab8cc5129a295fdc0d43ecc1dfb9c408b24639bbe42dd1091333251@[::]:30303
INFO [08-07|09:50:53] RLPx listener up                         self=enode://fe329f4395d30db66cced5d750fd4395993f66ccd08c703ea2653b78cdd364b76938e13d2ab8cc5129a295fdc0d43ecc1dfb9c408b24639bbe42dd1091333251@[::]:30303
INFO [08-07|09:50:53] IPC endpoint opened: ~/.ethereum/ottoman/geth.ipc

but then it does not sync.

Please update the hardcoded IP addresses of the bootnodes, or publish a script / list of current bootnodes. Thanks.

Alejandro Loaiza · Answer 30 · Thu Aug 16 2018 02:51:22 GMT+0800 (China Standard Time)

Hi, I have some issues with block creation (mining) using IBFT. I'm testing with 7 validator nodes. When I bring 4 nodes up, wait some time (around 30 minutes), and then bring the 5th node up, no blocks are created even after another 30 minutes. However, if I bring all 5 nodes up at the same time, block creation proceeds normally. What might be the issue?

I have given more details here getamis/istanbul-tools#113

horca17 · Answer 31 · Tue Aug 21 2018 23:17:04 GMT+0800 (China Standard Time)

Is there any way to change the block creation time using the Istanbul options --istanbul.requesttimeout value and --istanbul.blockperiod value? By default the block time is 1 sec; I would like to increase it to 10 sec. Thanks.

ivica7 · Answer 32 · Mon Oct 01 2018 17:41:39 GMT+0800 (China Standard Time)

Any plans to integrate this into the official go-ethereum project?

ivica7 · Answer 33 · Mon Oct 01 2018 19:23:28 GMT+0800 (China Standard Time)

One more question: In Clique, with N = 3*f + 1 nodes, if I wait for a TX to be confirmed in 2*f + 1 blocks, would this give the same consistency property (transaction finality) as in IBFT/PBFT? Of course it would be slower, but theoretically, would the behaviour be the same?

zjsunzone · Answer 34 · Thu May 23 2019 08:46:53 GMT+0800 (China Standard Time)

Is Gossip complete?

Duc Liem Nguyen · Answer 35 · Fri Aug 09 2019 16:51:25 GMT+0800 (China Standard Time)

Hi, I have a question about a consensus fault in IBFT when the number of locks is < n/3:

Imagine we have n=7 nodes and f=2. The nodes are A, B, C, D, E, F, G.
F and G are Byzantine nodes.

First round:
A proposes p1. Only E sees B-C-D-E-F vote PREPARE for p1, so E locks on p1. The rest of the nodes time out at PREPREPARED.

Second round:
B proposes p2. Only D sees A-C-D-F-G vote PREPARE for p2, so D locks on p2. The rest of the nodes time out at PREPREPARED.

At this stage, F and G stop voting.

We have 5 nodes left; however, E and D cannot unlock from p1 or p2. A, B, and C cannot reach consensus by themselves, since at most 4 nodes would vote while at least 5 are needed.

As far as I can see, the current implementation of locks is not sufficient to handle this case.
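
The vote counting in this scenario can be sketched as follows. This is only a model of the arithmetic, assuming a locked node votes solely for the proposal it is locked on and an unlocked honest node votes for whatever is proposed. The names below are illustrative.

```python
NODES = ["A", "B", "C", "D", "E", "F", "G"]
F_MAX = 2                     # tolerated faulty nodes for n = 7
QUORUM = 2 * F_MAX + 1        # 5 votes needed
LOCKS = {"E": "p1", "D": "p2"}  # locks acquired in rounds one and two
SILENT = {"F", "G"}           # Byzantine nodes that stopped voting


def max_votes(proposal: str) -> int:
    """Upper bound on PREPARE votes a proposal can still collect."""
    votes = 0
    for node in NODES:
        if node in SILENT:
            continue
        locked_on = LOCKS.get(node)
        # Assumption: a locked node only votes for its locked proposal.
        if locked_on is None or locked_on == proposal:
            votes += 1
    return votes


if __name__ == "__main__":
    for p in ("p1", "p2", "p3"):
        print(p, max_votes(p), "of", QUORUM)
```

Under these assumptions no proposal, whether p1, p2, or a fresh one, can reach the 5 votes required.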

Alex Beregszaszi · Answer 36 · Thu Nov 07 2019 00:56:34 GMT+0800 (China Standard Time)

It would be nice turning this into an actual EIP. Especially as there appears to be an "IBFT2.0" as well (links: 1 2)

Bob Summerwill · Answer 37 · Fri Nov 08 2019 04:24:06 GMT+0800 (China Standard Time)

This still has not made it to accepted EIP status, @axic? Eeek.
Yes, so I very much agree.

With the EEA/EF Mainnet initiative, we really do need to be starting to consider EEA standards within the same EIP process, even if they do not apply to the ETH mainnet.

The EIP standards process needs to look at Ethereum-as-a-protocol, not purely the needs of $ETH.

When I raised that to @Souptacular in 2017, his response was that there was likely little appetite in the Core Devs group for taking on that extra load, considering that such proposals were not of direct benefit to ETH. Maybe the appetite is different now, especially with PegaSys people spanning both sides, @timbeiko and @shemnon being deeply involved with Core Devs, etc?

Alex Beregszaszi · Answer 38 · Fri Nov 08 2019 04:49:19 GMT+0800 (China Standard Time)

I am a bit confused, but I don't think anyone would have rejected this submitted as an EIP. As it stands today, this is only a discussion. When it gets submitted as a pull request, it can be merged as a draft and likely turned final, given it was implemented in multiple clients (and superseded already?).

Adrian Sutton · Answer 39 · Fri Nov 08 2019 05:09:14 GMT+0800 (China Standard Time)

Note that the Quorum implementation has recently changed the calculation for a quorum of validators to fix an issue. There are a bunch of details I'm not familiar with but this spec likely needs an update before it becomes final. From my memory of trying to implement IBFT1 I seem to recall some parts of this were misleading or wrong (or possibly the Quorum implementation was wrong but that's essentially become the standard for IBFT1 since it's what's in production). I should have raised them at the time (sorry) and would have to review the spec again now, though there are likely better people.

There is also ongoing work in the EEA to adopt a standard BFT consensus algorithm. I'm not sure what the status of that is. It does mean that we don't necessarily need this and other non-mainnet stuff as EIPs, the EEA spec may (or may not) be a better place for them.

Bob Summerwill · Answer 40 · Fri Nov 08 2019 08:02:11 GMT+0800 (China Standard Time)

@ajsutton My gut says that everything which can be an EIP should be an EIP, to avoid siloing between Public Ethereum and Enterprise Ethereum (which is exactly what happened with the EEA - intentionally at first, but with the intention of converging them back together in happier days - ie now).

There is nothing to say that all EIPs have to be implemented by ALL clients to be useful. There is nothing to say that all EIPs have to apply to the ETH mainnet to be accepted.

The fact that EIPs were NOT originally written for functionality like: JSON-RPCs, Swarm, Warp-Sync, Aura, Clique and more was a real problem. You were stuck with trying to be bug-for-bug compatible with Geth or with Parity.

Now we have more clients I would argue that pretty much EVERY useful feature from ETH1 clients, including EEA features, should have EIPs written for them - unless they are very experimental and new. The spec is what lets other clients adopt.

Danno Ferrin · Answer 41 · Sat Nov 09 2019 02:22:15 GMT+0800 (China Standard Time)

Clique is an EIP issue just like IBFT - #225
JSON-RPC has a doc that serves much like a spec, EEA references specific wiki edit versions - https://github.com/ethereum/wiki/wiki/JSON-RPC
Warp-Sync and Aura to my knowledge have never had specs written (I've looked) and only have parity-ethereum implementations, unlike Clique and JSON-RPC, which have multiple clients implementing them.
Swarm... I have no idea.

Samer Falah · Answer 42 · Sat Nov 09 2019 04:32:51 GMT+0800 (China Standard Time)

Note that the Quorum implementation has recently changed the calculation for a quorum of validators to fix an issue. There are a bunch of details I'm not familiar with but this spec likely needs an update before it becomes final. From my memory of trying to implement IBFT1 I seem to recall some parts of this were misleading or wrong (or possibly the Quorum implementation was wrong but that's essentially become the standard for IBFT1 since it's what's in production). I should have raised them at the time (sorry) and would have to review the spec again now, though there are likely better people.

There is also ongoing work in the EEA to adopt a standard BFT consensus algorithm. I'm not sure what the status of that is. It does mean that we don't necessarily need this and other non-mainnet stuff as EIPs, the EEA spec may (or may not) be a better place for them.

We modified the implementation to better handle dynamic validators based on a reported issue with scaling a network from 1 validator to 4. We'll continue to enhance the protocol as IBFT. We are currently working on a TLA+ spec, with a few updates so far to the described protocol. We'll make it available once it's completed, and we would be more than happy to see it as an EIP. I thought this was originally an EIP.

Alex Beregszaszi · Answer 43 · Sat Nov 09 2019 04:34:52 GMT+0800 (China Standard Time)

Clique is a EIP issue just like IBFT - #225

It is an EIP actually: https://eips.ethereum.org/EIPS/eip-225

JSON-RPC has a doc that serves much like a spec, EEA references specific wiki edit versions - https://github.com/ethereum/wiki/wiki/JSON-RPC

It has an EIP too: https://eips.ethereum.org/EIPS/eip-1474

Bob Summerwill · Answer 44 · Sat Nov 09 2019 05:03:01 GMT+0800 (China Standard Time)

The Clique EIP was written by @karalabe in an unsuccessful attempt to "unfork" the different POA approaches after Parity "went first" with Aura and then a group of companies launched the Kovan testnet without even informing the Geth team:

https://medium.com/@Digix/announcing-kovan-a-stable-ethereum-public-testnet-10ac7cb6c85f

Parity did not "play ball" and implement Clique in Parity, and also did not author an EIP of their own for Aura, or propose any alternative standard which both teams could implement.

That was finally resolved by the Gorli project (co-funded by the EF and ETC Coop) which added Clique support to Parity. Thank you @soc1c, @aidanih and @YazzyYaz. ETC Coop paid $130K on our side for that to happen, and I believe that the EF matched that funding.

https://medium.com/ethereum-classic/building-a-better-unified-testnet-3f48490cd4e1
https://goerli.net/

The JSON-RPC EIP also happened a lot later than the original Wiki spec. Does Parity even comply with the EIP? I honestly do not know. The lack of alignment between Geth and Parity on that score has been an issue since 2016.

A Warp-Sync EIP would have been very useful. Aleth was leveraging that functionality at one stage, right, @axic? Is that still the case?

Swarm is "graduated" from EF funding now, and they have their own process, making an EIP moot at this stage:

https://github.com/ethersphere/SWIPs

ETC Labs have started funding Swarm now. And @tgerring has been funding it personally. And they have partnered with @pipermerriam and the Trinity team. Go @zelig :-)

https://medium.com/ethereum-classic-labs/ethereum-classic-labs-partnership-announcement-79328d5055f4

https://twitter.com/BobSummerwill/status/1174071570588815360

Takunori Min · Answer 45 · Mon Jan 06 2020 16:16:10 GMT+0800 (China Standard Time)

Hello, I have a small question.

Why is "Istanbul" used as the name? Is it from the Ethereum Istanbul update?

Péter Szilágyi · Answer 46 · Mon Jan 06 2020 16:18:12 GMT+0800 (China Standard Time)

The Istanbul name here predates the fork.

Takunori Min · Answer 47 · Tue Jan 14 2020 19:03:45 GMT+0800 (China Standard Time)

The Istanbul name here predates the fork.

Predates the fork? So why is it called Istanbul?? It is not related to Ethereum Istanbul, right?

Bob Summerwill · Answer 48 · Wed Jan 15 2020 00:11:12 GMT+0800 (China Standard Time)

Correct, @NoriMin.

IBFT was created by AMIS, a Taiwanese banking consortium, in 2017 and it is completely unrelated to the Istanbul hard fork.

They called it Istanbul as a riff on Byzantine Fault Tolerance.

Byzantium, Constantinople and Istanbul were the names assigned to the phases of what was originally planned as a single hard fork called Metropolis, the phase of the original ETH roadmap prior to Serenity.

Those are all different names which the real-world city of Istanbul has had in its history (and it is a metropolis).

Takunori Min · Answer 49 · Fri Jan 17 2020 21:20:19 GMT+0800 (China Standard Time)

@bobsummerwill
I understood! Thank you:)

RauleHernandeza · Answer 50 · Wed Aug 25 2021 05:46:19 GMT+0800 (China Standard Time)

Hi, sorry, I would like to see if someone could give me details about the IBFT consensus mode.

For example, does IBFT choose a random node as proposer in each round? And does IBFT choose which nodes participate in each round, or does it work with all nodes?

Jason Frame · Answer 51 · Wed Aug 25 2021 07:35:32 GMT+0800 (China Standard Time)

Hi @RauleHernandeza,

The node chosen as proposer cannot be random; the choice is deterministic. The proposer selection policy can use either a round robin approach or a sticky proposer.

Which nodes participate in a round is determined by the voting process, so all nodes will use the same set of validators in a round. The "Validator list voting" section goes into more detail on how this works.
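
A small sketch may help show what "deterministic" means for the two policies. It assumes that round robin moves to the next validator on every block and every round change, while a sticky proposer moves only on a round change, and that validators are ordered by sorting their addresses. The function and its parameters are hypothetical and are not taken from the implementation.

```python
def next_proposer(validators, last_proposer, round_number, policy="roundrobin"):
    """Pick the proposer from a sorted validator list.

    validators:    current validator addresses
    last_proposer: proposer of the previous block, or None at genesis
    round_number:  current round (0 unless round changes happened)
    policy:        "roundrobin" or "sticky" (assumed semantics, see above)
    """
    ordered = sorted(validators)
    if last_proposer is None:
        offset = round_number
    else:
        offset = ordered.index(last_proposer) + round_number
        if policy == "roundrobin":
            offset += 1  # move on even without a round change
    return ordered[offset % len(ordered)]


if __name__ == "__main__":
    vals = ["a", "b", "c", "d"]
    print(next_proposer(vals, "a", 0, "roundrobin"))
    print(next_proposer(vals, "a", 0, "sticky"))
```

Because every node evaluates the same function on the same inputs, all honest nodes agree on the proposer without any randomness.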

RauleHernandeza · Answer 52 · Wed Aug 25 2021 08:06:21 GMT+0800 (China Standard Time)

Hi @RauleHernandeza,

The node to be chosen as proposer cannot be random; it is deterministic. The proposer selection policy can use either a round robin approach or a sticky proposer.

The nodes that participate in a round are determined by the voting process, so all nodes will use the same set of validators in a round. The "Validator list voting" section explains in more detail how this works.

First of all, thank you for answering my question. Now, why can't nodes be chosen randomly? Do you say that because it is not implemented?

And secondly, where can I see the index of sections? I was given the link to the repository, but I still don't know how to find my way through it. I want to take a look at the Validator list voting section.

Jason Frame · Answer 53 · Wed Aug 25 2021 09:25:27 GMT+0800 (China Standard Time)

Hi @RauleHernandeza, Choosing random nodes is not implemented or part of the IBFT EIP.

The validator list voting section I'm referring to is the "Validator list voting" heading in this issue which has some more details.

robertox186 · Answer 54 · Wed Aug 25 2021 21:11:52 GMT+0800 (China Standard Time)

Hello, how is the proposer chosen? And do all the nodes of the network participate in the round?

robertox186 · Answer 55 · Wed Aug 25 2021 23:48:19 GMT+0800 (China Standard Time)

How do the round robin and sticky proposer methods work?
How do they choose the next proposer (in case of a change)?

robertox186 · Answer 56 · Wed Aug 25 2021 23:50:00 GMT+0800 (China Standard Time)

At what point do the validator nodes vote to remove or add a validator node to the network?

Stefano De Angelis · Answer 57 · Thu Aug 26 2021 00:48:27 GMT+0800 (China Standard Time)

@robertox186 the list of validators changes with the transition block (epoch change).

robertox186 · Answer 58 · Thu Aug 26 2021 00:52:47 GMT+0800 (China Standard Time)

That is, not all nodes participate in the round?

robertox186 · Answer 59 · Thu Aug 26 2021 00:59:19 GMT+0800 (China Standard Time)

At what time do you vote to add or remove a node?

Stefano De Angelis · Answer 60 · Thu Aug 26 2021 01:29:13 GMT+0800 (China Standard Time)

@robertox186 for each non-epoch transition block, signers may cast one vote proposing a change to the validator list. Actually, I need to correct myself: if a vote proposal reaches consensus, it takes effect immediately (you do not have to wait for the epoch to end).

Only the validator nodes participate in consensus rounds.
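
The voting behaviour described here can be sketched roughly as below. The majority threshold (more than half of the current validators) is an assumption borrowed from Clique-style voting, and the sketch ignores epochs, block boundaries, and vote cleanup when a voter is removed. All names are illustrative.

```python
def apply_votes(validators, votes):
    """Apply validator-list votes in order; a change takes effect as soon
    as a strict majority of the current validators supports it.

    votes: list of (voter, candidate, authorize) tuples, where authorize
           is True to add the candidate and False to remove it.
    """
    current = set(validators)
    tally = {}  # (candidate, authorize) -> set of voters
    for voter, candidate, authorize in votes:
        if voter not in current:
            continue  # only validators may vote
        if authorize == (candidate in current):
            continue  # vote would not change membership
        key = (candidate, authorize)
        tally.setdefault(key, set()).add(voter)
        if len(tally[key]) > len(current) // 2:
            if authorize:
                current.add(candidate)
            else:
                current.discard(candidate)
            # Drop all pending votes about this candidate.
            tally = {k: v for k, v in tally.items() if k[0] != candidate}
    return sorted(current)


if __name__ == "__main__":
    vals = ["a", "b", "c", "d"]
    print(apply_votes(vals, [("a", "e", True), ("b", "e", True), ("c", "e", True)]))
```

With 4 validators, a third supporting vote is what pushes a proposal past the assumed majority threshold.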

RauleHernandeza · Answer 61 · Fri Aug 27 2021 03:11:08 GMT+0800 (China Standard Time)

Does the timeout shown in the diagram always take the same time? Is the default always 10000 milliseconds?

robertox186 · Answer 62 · Fri Aug 27 2021 05:04:57 GMT+0800 (China Standard Time)

How does ibft pay the stake to the nodes?

robertox186 · Answer 63 · Sat Aug 28 2021 00:44:12 GMT+0800 (China Standard Time)

Good, how does IBFT pay the stake, and can you tell me which part of the code it is in, along with the updating of the validator lists?

github-actions · Answer 64 · Thu Feb 24 2022 01:08:32 GMT+0800 (China Standard Time)

There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.

github-actions · Answer 65 · Thu Mar 10 2022 01:09:22 GMT+0800 (China Standard Time)

This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment.
