tendermint / tmkms

Key Management service for Tendermint Validator nodes

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

KMS double sign false positive on small testnet

yulidai opened this issue · comments

Vestion

  • Gaia V2.0.0
  • KMS V0.6.3

Description

We start a four-validator test network, one of which is use KMS(Called KmsVal below). Whenever it's KmsVal turn to proposal, will get error info. The odds are high, but not 100 percent.

  • the error info of gaia
Vote{2:93BB1F0645CA 160/00/2(Precommit) 000000000000 000000000000 @ 2019-09-23T05:28:17.2211451Z}" err="signerServiceEndpoint returned error #4: double signing requested at height: 160
  • the error info of kms
attempted double sign at h/r/s: 160/0/6 (6B3E0F531E != <nil>)

Question

When this error occurs, will it lead to omission or even double signature?

It seems when KmsVal proposal, there is a high chance that PreCommit will not be signed and get error info: (attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)).

Edit: It should be that this situation will cause problems:

signed Proposal:40A4050BB4 at h/r/s 308/0/3 (0 ms)
signed PreCommit:<nil> at h/r/s 308/0/6 (0 ms)
signed PreVote:CEB28EBB90 at h/r/s 308/1/6 (0 ms)
signed PreCommit:CEB28EBB90 at h/r/s 308/1/6 (0 ms)

Any time you run multiple concurrent validators against the same KMS, you will see errors like this, as the validators are uncoordinated and sending inconsistent signing requests to the KMS. The KMS will do its best to prevent double signing in these cases. @mdyring and @liangping helped test it, but I wouldn't say it's rock solid yet.

For more information, check out this blog post:

https://iqlusion.blog/postmortem-2019-08-08-tendermint-kms-related-cosmos-hub-validator-incident#tendermint-consensus-and-double-signing_1

Note this in particular:

For this reason we recommend validators don’t run this sort of configuration in perpetuity, but use the functionality to failover between validators. Based on this incident, and others we’ll describe below, we in fact recommend you only run in this configuration on testnets for now because we think this operational mode needs a lot more testing to be safe.

If I modify the code to ensure that I sign only once at the same height and same round and same type(proposal/prevote/procommit). Then change code in state.rs:

/**** old code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
{...}

/**** new code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
     && new_state.block_id.is_some() && self.consensus_state.block_id.is_some()
{...}

It seems to solve this problem.
The test seems to avoid the missed signage caused by the error message like attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)

What do you think of this way?

I believe that would reintroduce the double signing vulnerability from #333 which was fixed in #334.

Note that we've already made a considerable effort to allow signing (and re-signing) in all cases which are not considered double signing.

This is covered in this blog post: https://iqlusion.blog/postmortem-2019-08-08-tendermint-kms-related-cosmos-hub-validator-incident#tendermint-consensus-and-double-signing_1

So sign only once at the same height and same round and same type(proposal/prevote/procommit) can't bring security guarantee, right?

Correct: there needs to be special logic for handling proof-of-lock-change (PoLC) which is expressed as the <nil> votes. Without that logic there is a double signing vulnerability as seen in #333.

Any time you run multiple concurrent validators against the same KMS, you will see errors like this, as the validators are uncoordinated and sending inconsistent signing requests to the KMS.

emmm...It seems that when there is only one Validator corresponding to a KMS, this error will also be reported.

If that were true, it would indicate bugs in both the KMS and Tendermint.

Can you paste some loglines where this is happening and confirm the requests originate from the same client?

using method

softsign for test.

tmkms.toml

[[chain]]
id = "ld_test"
key_format = { type = "bech32", account_key_prefix = "cosmospub", consensus_key_prefix = "cosmosvalconspub" }

## Validator configuration
[[validator]]
addr = "tcp://127.0.0.1:16659"
chain_id = "ld_test"
reconnect = true
secret_key = "secret_connection.key"

[[providers.softsign]]
chain_ids = ["ld_test"]
key_format = "json"
path = "priv_validator_key.json"
Logs of KMS
06:15:37 [info] [ld_test@tcp://127.0.0.1:16659] signed Proposal:9FBB94F5F5 at h/r/s 268/0/3 (0 ms)
06:15:37 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:9FBB94F5F5 at h/r/s 268/0/6 (0 ms)
06:15:38 [error] [ld_test:tcp://127.0.0.1:16659] attempted double sign at h/r/s: 268/0/6 (9FBB94F5F5 != <nil>)
06:15:43 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:54C9F3D04E at h/r/s 269/0/6 (0 ms)
06:15:43 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:54C9F3D04E at h/r/s 269/0/6 (0 ms)
06:15:48 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:DAF65E42C9 at h/r/s 270/0/6 (0 ms)
06:15:48 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:DAF65E42C9 at h/r/s 270/0/6 (0 ms)
06:15:54 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:9CCA7DB717 at h/r/s 271/0/6 (0 ms)
06:15:54 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:9CCA7DB717 at h/r/s 271/0/6 (0 ms)
06:15:59 [info] [ld_test@tcp://127.0.0.1:16659] signed Proposal:97A771C714 at h/r/s 272/0/3 (0 ms)
06:15:59 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:97A771C714 at h/r/s 272/0/6 (0 ms)
06:15:59 [error] [ld_test:tcp://127.0.0.1:16659] attempted double sign at h/r/s: 272/0/6 (97A771C714 != <nil>)

Logs of Validators

gaia_3        | I[2019-09-24|06:15:54.546] Committed state                              module=state height=271 txs=0 appHash=5E92E038AB80CE113B1152ACD0DABCC6C5E3564694D8AA164B412667A6B3A4CC
gaia_2        | I[2019-09-24|06:15:54.548] Executed block                               module=state height=271 validTxs=0 invalidTxs=0
gaia_2        | I[2019-09-24|06:15:54.558] Committed state                              module=state height=271 txs=0 appHash=5E92E038AB80CE113B1152ACD0DABCC6C5E3564694D8AA164B412667A6B3A4CC
gaia_1        | I[2019-09-24|06:15:59.874] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
gaia_1        | I[2019-09-24|06:15:59.886] Committed state                              module=state height=272 txs=0 appHash=D9F9C6F45B35EFA856DE018064E0F0F3E7BB358629C94614E36C741051F04542
gaia_kms_1    | E[2019-09-24|06:15:59.966] Error signing vote                           module=consensus height=272 round=0 vote="Vote{2:93BB1F0645CA 272/00/2(Precommit) 000000000000 000000000000 @ 2019-09-24T06:15:59.964628Z}" err="signerServiceEndpoint returned error #4: double signing requested at height: 272"
gaia_2        | I[2019-09-24|06:15:59.977] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
gaia_3        | I[2019-09-24|06:15:59.983] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
gaia_kms_1    | I[2019-09-24|06:15:59.985] Executed block                               module=state height=272 validTxs=0 invalidTxs=0

Are gaia_1, gaia_2, and gaia_3 not separate instances of gaia?

They are separate instances of gaia, genesis.json is:

{
  "genesis_time": "2019-09-22T18:04:18.0316648Z",
  "chain_id": "ld_test",
  "consensus_params": {
    "block": {
      "max_bytes": "22020096",
      "max_gas": "-1",
      "time_iota_ms": "1000"
    },
    "evidence": {
      "max_age": "100000"
    },
    "validator": {
      "pub_key_types": [
        "ed25519"
      ]
    }
  },
  "app_hash": "",
  "app_state": {
    "gov": {
      "starting_proposal_id": "1",
      "deposits": null,
      "votes": null,
      "proposals": null,
      "deposit_params": {
        "min_deposit": [
          {
            "denom": "stake",
            "amount": "10000000"
          }
        ],
        "max_deposit_period": "172800000000000"
      },
      "voting_params": {
        "voting_period": "172800000000000"
      },
      "tally_params": {
        "quorum": "0.334000000000000000",
        "threshold": "0.500000000000000000",
        "veto": "0.334000000000000000"
      }
    },
    "crisis": {
      "constant_fee": {
        "denom": "stake",
        "amount": "1000"
      }
    },
    "slashing": {
      "params": {
        "max_evidence_age": "120000000000",
        "signed_blocks_window": "100",
        "min_signed_per_window": "0.500000000000000000",
        "downtime_jail_duration": "600000000000",
        "slash_fraction_double_sign": "0.050000000000000000",
        "slash_fraction_downtime": "0.010000000000000000"
      },
      "signing_infos": {},
      "missed_blocks": {}
    },
    "accounts": [
      {
        "address": "cosmos1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfrg5ql8",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      },
      {
        "address": "cosmos1gqzx83hluqxqj70gw33ufmcvmcugwh73k0ma5m",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      },
      {
        "address": "cosmos1nzrrq282fauqc2xdp5tmt7mnsf9xzhllcp9lfq",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      },
      {
        "address": "cosmos1q2tpku988upsevfmr0ckcvfnng47fxk5tp9kth",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      }
    ],
    "bank": {
      "send_enabled": true
    },
    "supply": {
      "supply": []
    },
    "genutil": {
      "gentxs": [
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1gqzx83hluqxqj70gw33ufmcvmcugwh73k0ma5m",
                  "validator_address": "cosmosvaloper1gqzx83hluqxqj70gw33ufmcvmcugwh73nm0gcg",
                  "pubkey": "cosmosvalconspub1zcjduepq77h8lt3mzh7fdx6zupp6t5v0h7rhs8tsk0s24stauejjjg8s6hwqgymdrs",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "A+2x1fROB4QInzo43WqL+Hvz//0CBupgRX/cQAdfV0mP"
                },
                "signature": "cHjvlpXt31w0+iR7dDjUIB7fyppZkDZhUsdrj8LE/RoOAp6gaJuSUlKONGhrZOzVFiE5qhXGjv/YJYukUhzHBw=="
              }
            ],
            "memo": "6eb1c4468e1e2b5a994cb44285d7de5345690ec8@172.17.0.4:26656"
          }
        },
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1q2tpku988upsevfmr0ckcvfnng47fxk5tp9kth",
                  "validator_address": "cosmosvaloper1q2tpku988upsevfmr0ckcvfnng47fxk5w43r8y",
                  "pubkey": "cosmosvalconspub1zcjduepqfcvduhnpavv253lgf9swu4ee5855lee3x6pxhsakleg7gr36x5gq0hnnf4",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "A+zEasV3xRemKpsxELYm0O9U2g8puF9V1BfOUKJDa2iM"
                },
                "signature": "4tZHr8ZxQqX7w8MceMNhVTtH2jsakZB6ewmvlM3PhZB1XxZ9pGbiUm0mYLLIwHi064pthGt6LJvfDi6+m6zo0A=="
              }
            ],
            "memo": "b8e38f589cba4466a483d13b66c98b1bcd2a9e10@172.17.0.4:26656"
          }
        },
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1nzrrq282fauqc2xdp5tmt7mnsf9xzhllcp9lfq",
                  "validator_address": "cosmosvaloper1nzrrq282fauqc2xdp5tmt7mnsf9xzhlla4329n",
                  "pubkey": "cosmosvalconspub1zcjduepq9vxlrxqgwvzdk646ue6767n4f6yt60eucr2mhynx949yeh6ddurs3y46wy",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "A7dYDToebAorbKhF24aq3H+vZUXaN2sPcRKKW7wIS0dc"
                },
                "signature": "xlsJvaHAYj3Nm6sOfd6WPwe+ge4CdPmUIb3FUFzzRuV6CsQKEFsj7psxo8r7nWqAzkSERTElcIxS99jpEqJ7Fw=="
              }
            ],
            "memo": "d522889513c78e4706bd42e840fd8755d7d28a84@172.17.0.4:26656"
          }
        },
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfrg5ql8",
                  "validator_address": "cosmosvaloper1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfxuq4n5",
                  "pubkey": "cosmosvalconspub1zcjduepqvjkq0x379vsjups9s338rv97f427atz3r4pu4uq0er87tpdnlf0smzftt9",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "AgtOk6bwPkPOy968rT0gsjAnlFqP9ZIDbHmzlAMOuRTX"
                },
                "signature": "ArQPT7sJxTBWLe6QnXeBw6i36tS+FnxL2fEvxIXPNscEeaGkhS0TNjyb2QaEXX2alGxP7VUluMMxMfDTlXdPXA=="
              }
            ],
            "memo": "ea2f7725d2fea3d0e50ecf79232839058dab3016@172.17.0.4:26656"
          }
        }
      ]
    },
    "auth": {
      "params": {
        "max_memo_characters": "256",
        "tx_sig_limit": "7",
        "tx_size_cost_per_byte": "10",
        "sig_verify_cost_ed25519": "590",
        "sig_verify_cost_secp256k1": "1000"
      }
    },
    "params": null,
    "distribution": {
      "fee_pool": {
        "community_pool": []
      },
      "community_tax": "0.020000000000000000",
      "base_proposer_reward": "0.010000000000000000",
      "bonus_proposer_reward": "0.040000000000000000",
      "withdraw_addr_enabled": true,
      "delegator_withdraw_infos": [],
      "previous_proposer": "",
      "outstanding_rewards": [],
      "validator_accumulated_commissions": [],
      "validator_historical_rewards": [],
      "validator_current_rewards": [],
      "delegator_starting_infos": [],
      "validator_slash_events": []
    },
    "staking": {
      "params": {
        "unbonding_time": "1814400000000000",
        "max_validators": 100,
        "max_entries": 7,
        "bond_denom": "stake"
      },
      "last_total_power": "0",
      "last_validator_powers": null,
      "validators": null,
      "delegations": null,
      "unbonding_delegations": null,
      "redelegations": null,
      "exported": false
    },
    "mint": {
      "minter": {
        "inflation": "0.130000000000000000",
        "annual_provisions": "0.000000000000000000"
      },
      "params": {
        "mint_denom": "stake",
        "inflation_rate_change": "0.130000000000000000",
        "inflation_max": "0.200000000000000000",
        "inflation_min": "0.070000000000000000",
        "goal_bonded": "0.670000000000000000",
        "blocks_per_year": "6311520"
      }
    }
  }
}

Separate instances of gaia (regardless of where they are located, be it the same host or on separate hosts) act in an uncoordinated manner and can send conflicting signing requests to the KMS, so this is to be expected.

I don't understand, because of ONLY ONE validator named gaia_kms_1 connect to the KMS, four validators using different ips in different containers, docker-compose like below:

services:
  gaia_1:
    container_name: gaia_1
    networks:
      kms_test:
        ipv4_address: 172.19.0.2
    command: gaiad start
  gaia_2:
    container_name: gaia_2
    networks:
      kms_test:
        ipv4_address: 172.19.0.3
    command: gaiad start
  gaia_3:
    container_name: gaia_3
    networks:
      kms_test:
        ipv4_address: 172.19.0.4
    command: gaiad start
  gaia_kms_1:
    container_name: gaia_kms_1
    networks:
      kms_test:
        ipv4_address: 172.19.0.5
    ports:
      - 16659:26659
      - 1317:1317
    command: gaiad start

Maybe I need to learn and understand what you mean?

Does each validator have a separate consensus key, or are they sharing one?

If this is a network you are trying to spin up yourself, it seems there may not be enough validators to make progress?

Yes, each validator have a separate consensus key

Can you try installing tmkms directly from GitHub:

cargo install tmkms --git https://github.com/tendermint/kms

It's possible #348 might contain a fix for this scenario.

It looks like it’s okay.

After reading the code of #348 code carefully. I found out that it looks like what i says before:

If I modify the code to ensure that I sign only once at the same height and same round and same type(proposal/prevote/procommit). Then change code in state.rs:

/**** old code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
{...}

/**** new code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
     && new_state.block_id.is_some() && self.consensus_state.block_id.is_some()
{...}

It seems to solve this problem.
The test seems to avoid the missed signage caused by the error message like attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)

:P

Thanks for confirming. Looks like we need to get another release out soon.