Unable to run Axon in a local Kubernetes cluster
serejke opened this issue
I'm trying to start k8s-deploy/k8s/axon/deploy.sh in a local docker-desktop cluster.
I made the following changes:
- create PersistentVolumes to satisfy PersistentVolumeClaims, stored in /Users/serejke/.axon/node-1, /Users/serejke/.axon/node-2, etc.
- disable Ingress (not needed)
- remove nodeSelector: disktype: node4 to make my local Kubernetes nodes compatible
- make replicate = 1 to simplify debugging
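The per-node directories behind those PersistentVolumes should exist before the volumes are used. A minimal sketch, assuming four nodes and a base directory passed as an argument; the function name `prepare_axon_dirs` and the default node count are illustrative and not part of axon-devops:

```shell
#!/bin/sh
# Hypothetical helper: create one data directory per Axon node.
# $1 = base directory (e.g. "$HOME/.axon")
# $2 = node count (assumed default: 4)
prepare_axon_dirs() {
    base_dir="$1"
    node_count="${2:-4}"
    i=1
    while [ "$i" -le "$node_count" ]; do
        # mkdir -p is a no-op when the directory already exists
        mkdir -p "$base_dir/node-$i"
        i=$((i + 1))
    done
}
```

Usage would be `prepare_axon_dirs "$HOME/.axon"`, which yields node-1 through node-4 under the base directory.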
My 4 axon nodes run for a couple of seconds and fail with the following log:
[2023-04-03T18:27:56.212491963+00:00 INFO core_executor::system_contract] execute addr 0xb00d…c15a
[2023-04-03T18:27:56.223528129+00:00 INFO core_executor::system_contract] execute addr 0x4af5…2352
[2023-04-03T18:27:56.229972796+00:00 INFO core_run] Execute the genesis distribute success, genesis state root 0xd01bf2694feaaea8a7d6ee62f4c27c143b41042bdca1eca18e84cb7d2e55f10c, response ExecResp { state_root: 0xd01bf2694feaaea8a7d6ee62f4c27c143b41042bdca1eca18e84cb7d2e55f10c, receipt_root: 0x0378c03f0ac30062de319246880360934dbb384835cf0a6726b9af00dee2b92a, gas_used: 9509750, tx_resp: [TxResp { exit_reason: Succeed(Returned), ret: [], gas_used: 2081462, remain_gas: 27918538, fee_cost: 0, logs: [], code_address: Some(0xc2fd48d60ae16b3fe6e333a9a13763691970d9373d4fab7cc323d7ba06fa9986), removed: false }, TxResp { exit_reason: Succeed(Returned), ret: [], gas_used: 2422845, remain_gas: 215571183603, fee_cost: 0, logs: [Log { address: 0x4af5ec5e3d29d9ddd7f4bf91a022131c41b72352, topics: [0x8be0079c531659141344cd1fd0a4f28419497f9722a3daafe3b4186f6b6457e0, 0x0000000000000000000000000000000000000000000000000000000000000000, 0x0000000000000000000000008ab0cf264df99d83525e9e11c7e4db01558ae1b1], data: [] }, Log { address: 0x4af5ec5e3d29d9ddd7f4bf91a022131c41b72352, topics: [0x2f8788117e7eff1d82e926ec794901d17c78024a50270940304540a733656f0d, 0x0000000000000000000000000000000000000000000000000000000000000000, 0x0000000000000000000000008ab0cf264df99d83525e9e11c7e4db01558ae1b1, 0x0000000000000000000000008ab0cf264df99d83525e9e11c7e4db01558ae1b1], data: [] }], code_address: Some(0x336c11f92895e657a26642914af5ec5e3d29d9ddd7f4bf91a022131c41b72352), removed: false }, TxResp { exit_reason: Succeed(Returned), ret: [], gas_used: 269886, remain_gas: 215573336562, fee_cost: 0, logs: [Log { address: 0xb00d616b820c39619ee29e5144d0226cf8b5c15a, topics: [0xbc7cd75a20ee27fd9adebab32041f755214dbc6bffa90cc0225b39da2e5c2d3b, 0x000000000000000000000000a13763691970d9373d4fab7cc323d7ba06fa9986], data: [] }], code_address: Some(0xb233fb175c5be87ff90fc88eb00d616b820c39619ee29e5144d0226cf8b5c15a), removed: false }, TxResp { exit_reason: Succeed(Returned), ret: [], gas_used: 3386908, remain_gas: 26613092, 
fee_cost: 0, logs: [], code_address: Some(0x2c3a9349df5b162519b17621f67bc4e50d1df92b0e4c61794a4517af6a995cb2), removed: false }, TxResp { exit_reason: Succeed(Returned), ret: [], gas_used: 799309, remain_gas: 29195891, fee_cost: 0, logs: [], code_address: None, removed: false }, TxResp { exit_reason: Succeed(Returned), ret: [], gas_used: 497334, remain_gas: 29502666, fee_cost: 0, logs: [Log { address: 0xb484fd480e598621638f380f404697cd9f58b0f8, topics: [0xbc7cd75a20ee27fd9adebab32041f755214dbc6bffa90cc0225b39da2e5c2d3b, 0x000000000000000000000000f67bc4e50d1df92b0e4c61794a4517af6a995cb2], data: [] }], code_address: Some(0xda6db70ce66da4c6433bb447b484fd480e598621638f380f404697cd9f58b0f8), removed: false }, TxResp { exit_reason: Succeed(Stopped), ret: [], gas_used: 52006, remain_gas: 29947994, fee_cost: 0, logs: [Log { address: 0x4af5ec5e3d29d9ddd7f4bf91a022131c41b72352, topics: [0x2f8788117e7eff1d82e926ec794901d17c78024a50270940304540a733656f0d, 0x241ecf16d79d0f8dbfb92cbc07fe17840425976cf0667f022fe9877caa831b08, 0x000000000000000000000000b484fd480e598621638f380f404697cd9f58b0f8, 0x0000000000000000000000008ab0cf264df99d83525e9e11c7e4db01558ae1b1], data: [] }], code_address: None, removed: false }] }
[2023-04-03T18:27:56.246767713+00:00 INFO core_run] The genesis block is created Block { header: Header { prev_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, proposer: 0x0000000000000000000000000000000000000000, state_root: 0xd01bf2694feaaea8a7d6ee62f4c27c143b41042bdca1eca18e84cb7d2e55f10c, transactions_root: 0x0000000000000000000000000000000000000000000000000000000000000000, signed_txs_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, receipts_root: 0x0000000000000000000000000000000000000000000000000000000000000000, log_bloom: 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, difficulty: 0, timestamp: 1639459018, number: 0, gas_used: 0, gas_limit: 0, extra_data: b"", mixed_hash: None, nonce: 0x0000000000000000, base_fee_per_gas: 1337, proof: Proof { number: 0, round: 0, block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, signature: b"", bitmap: b"" }, last_checkpoint_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, call_system_script_count: 0, chain_id: 10012 }, tx_hashes: [0x3bbe1ebf56b864d91ff5d7505be6df8a13a232a3c5969b30ad5fd254226c6e6b, 0x01240fb109c0c9ca0c095542d04140cc00d13bb66dd262ec088ba1b27424c8ac, 0xdf58cdda98ae3139026750bda1e3100442b59f91e26b6adac5749e3b026219ef, 0xfcbd67037cb8789fcb215cabed5e60a66afec59698a36d2afaa0cda626d66f07, 0x41bdc59db755cd3da3f41c2fdaf936e16d16130b202a8fd3c608c06b14d243ce, 0xca338c9e4eb563817bc363a243602b30c5a9608d94e3a04f948f60d40fc127f8, 
0x378803d6f9517956f38e67c773956cf646625775207080b237e82334cbebcdb2] }
[2023-04-03T18:27:56.690430088+00:00 INFO core_run] prometheus start
[2023-04-03T18:27:56.690726463+00:00 INFO core_run] node starts
[2023-04-03T18:27:56.690817963+00:00 INFO core_run] Data path for block: "./devtools/chain/data1/rocksdb/block_data"
[2023-04-03T18:27:58.731982005+00:00 INFO core_run] Recover 0 tx of number 1 from wal
[2023-04-03T18:28:02.492455799+00:00 INFO core_run] The Genesis block has been initialized.
[2023-04-03T18:28:02.869677841+00:00 INFO core_run] prometheus start
[2023-04-03T18:28:02.869816716+00:00 INFO core_run] node starts
[2023-04-03T18:28:02.869895674+00:00 INFO core_run] Data path for block: "./devtools/chain/data1/rocksdb/block_data"
[2023-04-03T18:28:10.945044678+00:00 INFO core_run] Recover 0 tx of number 1 from wal
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ProtocolError { kind: Executor, error: FutureEpoch }', core/cli/src/lib.rs:56:42
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
where the ProtocolError { kind: Executor, error: FutureEpoch } seems to be the root cause.
This might be a misconfiguration issue on my side, even though I have verified that the created PersistentVolumes are bound to the claims, and that axon nodes create some internal files for RocksDB.
I would greatly appreciate your help in debugging this issue, as we need to be able to start the Axon nodes locally in order to have a full development environment similar to the production (AWS) cluster.
Working on it.
Hey @liya2017, yes, I can allocate a temp directory locally to store all Axon nodes' data. For now this is my /Users/serejke/.axon/ but it may be configurable.
@liya2017 so your change was to use a local PV instead of hostPath? I did it like this:
I replaced:
hostPath:
  path: "/Users/serejke/.axon/node-1"
with
local:
  path: "/Users/serejke/.axon/node-1"
nodeAffinity:
  required:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - docker-desktop
I use docker-desktop for Mac.
The same issue.
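For reference, the `local` and `nodeAffinity` fragments above sit under `spec` of a PersistentVolume. The sketch below prints a complete manifest around them. The PV name, the 10Gi capacity, the reclaim policy, and the `local-storage` storage class are assumptions chosen for illustration; they are not taken from axon-devops, and the storage class has to match whatever the PersistentVolumeClaims request.

```shell
#!/bin/sh
# Hypothetical helper: print a PersistentVolume manifest that uses a `local`
# volume pinned to a single Kubernetes node.
# $1 = node index, $2 = base directory, $3 = Kubernetes node hostname
# Assumed values (not from the thread): PV name, capacity, reclaim policy,
# and storageClassName.
render_local_pv() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: axon-pv-node-$1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: "$2/node-$1"
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - $3
EOF
}
```

The output can be piped into kubectl, for example `render_local_pv 1 "$HOME/.axon" docker-desktop | kubectl apply -f -`.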
@liya2017 I built an image for the release v0.1.0-alpha.5, published on February 7 (commit hash 2684f2d3), and it just works with the same configuration.
I will try with the latest release v0.1.0-alpha.8 and post results here.
Something has changed since February 7. It may be just a configuration issue: axon-devops might be a bit out of date.
and it just works with the same configuration
Great. Yes, it's my problem, sorry.
BTW, why don't you use docker-deploy if you are in Docker mode? We have a docker-deploy directory in the repo. I thought you were using k8s, so I updated it in k8s mode 😀
@liya2017 I use Docker Desktop (not docker-deploy). This is Docker for macOS, and it creates a docker-desktop Kubernetes cluster. So I tried to deploy to k8s ;-) Sorry for confusing you.
Ah, sorry for the misunderstanding, and thanks for sharing. I will learn about it and try it on my side.
Could we close this issue?
@liya2017 I just built an M1 image (and created a task) for the most recent release v0.1.0-alpha.8 and it also works as expected.
The initial problem was with the image built for axon's main branch, that is, the most recent code.
I'd like to ask whether there is any important difference between axon's main and release branches (marked with tags) that might lead to the above problem. Otherwise it might be a bug introduced in the past 2 weeks, and we'd need to escalate this issue to the axon developers. Unfortunately, I'm not experienced with Axon node monitoring/debugging yet and can't provide more details for investigation. Thanks
@serejke The main branch includes this PR: axonweb3/axon#1115. If you are using the main branch to build the image, you should update config.toml and genesis.json, and you also need to clean the data, since genesis.json changed.
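Cleaning the data for a local setup like the one in this thread could look roughly like the sketch below. It assumes each node keeps all of its state under a `node-N` subdirectory of one base directory, as with the /Users/serejke/.axon/node-1 paths above. The function name `clean_axon_data` is hypothetical, and the nodes should be stopped before running it.

```shell
#!/bin/sh
# Hypothetical helper: wipe the on-disk data of every local Axon node so the
# nodes rebuild their state from the updated genesis.json on next start.
# $1 = base directory holding the node-N subdirectories (assumed layout)
clean_axon_data() {
    base_dir="$1"
    if [ -z "$base_dir" ]; then
        echo "usage: clean_axon_data BASE_DIR" >&2
        return 1
    fi
    for node_dir in "$base_dir"/node-*; do
        # Skip the unexpanded glob when no node directories exist
        [ -d "$node_dir" ] || continue
        # Delete everything inside the node directory but keep the directory
        # itself, so the PersistentVolume path stays valid.
        find "$node_dir" -mindepth 1 -delete
    done
}
```

For example, `clean_axon_data "$HOME/.axon"` empties node-1, node-2, and so on while leaving the directories in place.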
@liya2017 I see! Thanks so much. This issue may be closed then