KMS connection failure: "session limit of 1048576 messages exceeded"
mdyring opened this issue · comments
We've just experienced the error below ("signing operation failed") on the tmkms side, while connecting to an irisd instance for irishub.
As the current iris release does not support automatic KMS connection recovery, manual intervention was required, leading to some missed blocks in the interim.
A quick search of the GitHub repo doesn't reveal anything related to this, so I am stuck trying to determine what would cause this error. Any help appreciated.
18:31:35 [INFO] [irishub@tcp://xxxx:27659] connected to validator successfully
06:15:14 [ERROR] [irishub@tcp://xxxx:27659] signing operation failed: protocol error: session limit of 1048576 messages exceeded: protocol error: session limit of 1048576 messages exceeded
06:15:15 [INFO] KMS node ID: DD7036834704E2CFF8C7B35C68F8933D18ECA2E8
06:32:23 [ERROR] [irishub@tcp://xxxx:27659] I/O error
These errors are supposed to be handled internally within yubihsm-rs; however, it doesn't look like that happened correctly.
I'll see if I can write some tests to reproduce and handle this better.
@zmanian also commented on Twitter: "This sounds like a secret connection needing to re-key and the lack of auto restart on the Tendermint side causing a failure"
Yep, that's correct
I opened up a PR to automatically initiate a new session and retry sending the command in the event this occurs:
The above PR was included in yubihsm-rs v0.25.0, which is included in #259 (which I will hopefully land today)
#259 is landed, so I'll close this out.
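For anyone landing here later, the "open a new session and retry the command" behaviour described above can be sketched roughly as follows. This is not the code from the PR or the yubihsm-rs API: `Session`, `Client`, `SessionError`, and `send_command` are hypothetical names made up for illustration, and the only detail taken from this thread is the 1048576-message limit shown in the error log.

```rust
/// Per-session message cap, taken from the error text in the log above (2^20).
const SESSION_MESSAGE_LIMIT: u32 = 1_048_576;

/// Hypothetical error type; not the real yubihsm-rs error enum.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum SessionError {
    /// The session has already carried its maximum number of messages.
    LimitExceeded,
    /// Any other failure; propagated to the caller without a retry.
    Io(String),
}

/// Hypothetical stand-in for an encrypted session with a message counter.
struct Session {
    id: u32,
    messages_sent: u32,
    limit: u32,
}

impl Session {
    fn open(id: u32, limit: u32) -> Self {
        Session { id, messages_sent: 0, limit }
    }

    /// Pretend to send a command; fails once the cap has been reached.
    fn send(&mut self, cmd: &str) -> Result<String, SessionError> {
        if self.messages_sent >= self.limit {
            return Err(SessionError::LimitExceeded);
        }
        self.messages_sent += 1;
        Ok(format!("session {}: ok({})", self.id, cmd))
    }
}

/// Hypothetical client that hides session renewal from its callers.
struct Client {
    session: Session,
    limit: u32,
    sessions_opened: u32,
}

impl Client {
    fn new(limit: u32) -> Self {
        Client { session: Session::open(1, limit), limit, sessions_opened: 1 }
    }

    /// Send a command; if the session is exhausted, open a fresh one and
    /// retry exactly once. Every other result is returned unchanged.
    fn send_command(&mut self, cmd: &str) -> Result<String, SessionError> {
        match self.session.send(cmd) {
            Err(SessionError::LimitExceeded) => {
                self.sessions_opened += 1;
                self.session = Session::open(self.sessions_opened, self.limit);
                self.session.send(cmd)
            }
            other => other,
        }
    }
}
```

With a small limit such as `Client::new(2)`, the third call to `send_command` transparently runs on a second session instead of surfacing the limit error to the signer; a real client would pass `SESSION_MESSAGE_LIMIT`. Retrying only once keeps a persistent failure from looping forever.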