redis / redis

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.

Home Page: http://redis.io

[CRASH] cluster rebalance failed: ASSERTION FAILED

xiaozhuang-a opened this issue

Notice!

  • If a Redis module was involved, please open an issue in the module's repo instead!
  • If you're using docker on Apple M1, please make sure the image you're using was compiled for ARM!

Crash report

Paste the complete crash log between the quotes below. Please include a few lines from the log preceding the crash report to provide some context.

Redis Cluster version: v7.2.4
While rebalancing slots during a cluster expansion (resharding), the source node that slots were being migrated from crashed. We ran the rebalance with redis-cli --cluster rebalance. The failed rebalance left the slots in an abnormal state that redis-cli --cluster fix cannot repair. This is a real headache for us.

99535:M 18 Apr 2024 17:37:28.526 * Starting automatic rewriting of AOF on 100% growth
799535:M 18 Apr 2024 17:37:28.526 * Creating AOF incr file appendonly.aof.208368.incr.aof on background rewrite
799535:M 18 Apr 2024 17:37:28.536 * Background append only file rewriting started by pid 3380172
3380172:C 18 Apr 2024 17:37:29.835 * Successfully created the temporary AOF base file temp-rewriteaof-bg-3380172.aof
3380172:C 18 Apr 2024 17:37:29.840 * Fork CoW for AOF rewrite: current 20 MB, peak 20 MB, average 14 MB
799535:M 18 Apr 2024 17:37:29.844 * Background AOF rewrite terminated with success
799535:M 18 Apr 2024 17:37:29.844 * Successfully renamed the temporary AOF base file temp-rewriteaof-bg-3380172.aof into appendonly.aof.208368.base.rdb
799535:M 18 Apr 2024 17:37:29.845 * Removing the history file appendonly.aof.208367.incr.aof in the background
799535:M 18 Apr 2024 17:37:29.845 * Removing the history file appendonly.aof.208367.base.rdb in the background
799535:M 18 Apr 2024 17:37:29.845 * Background AOF rewrite finished successfully
799535:M 18 Apr 2024 17:37:35.298 # === ASSERTION FAILED ===
799535:M 18 Apr 2024 17:37:35.298 # ==> networking.c:2066 'c->duration == 0' is not true
799535:M 18 Apr 2024 17:37:35.305 # Bio worker thread #0 terminated
799535:M 18 Apr 2024 17:37:35.306 # Bio worker thread #1 terminated
799535:M 18 Apr 2024 17:37:35.306 # Bio worker thread #2 terminated
799535:M 18 Apr 2024 17:37:35.306 # IO thread(tid:139898848896768) terminated
799535:M 18 Apr 2024 17:37:35.306 # IO thread(tid:139898840504064) terminated

Additional information

  1. OS distribution and version
  2. Steps to reproduce (if any)

@xiaozhuang-a did you use any modules? Can you share the full crash log?

No, we are not using any modules.
This is the complete crash log.

Could the crash be caused by our heavy, high-concurrency use of streams and blocking commands in our requests?

Yes, it's probably related to blocking.

Is there a way to avoid this crash so that we can complete the rebalance?

@xiaozhuang-a do you mean you want to avoid this crash?
You can temporarily disable the serverAssert(c->duration == 0); check. In theory, that should have no side effects.
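
For readers unfamiliar with what "disabling" an assertion amounts to, here is a minimal, self-contained sketch. It is not Redis source code: the client struct, the check_duration_invariant helper, and the STRICT_ASSERT switch are all hypothetical names invented for illustration, and the meaning of the duration field is an assumption. It only shows the general idea of turning a hard assertion (which aborts the process, as in the crash log above) into a logged warning so the process keeps running.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Redis client struct. Only the field
 * named in the failed assertion is modeled; its meaning (accumulated
 * command time) is an assumption made for this sketch. */
typedef struct client {
    long long duration;
} client;

/* Build with -DSTRICT_ASSERT=1 to get the hard, process-aborting
 * behaviour. The default (0) is the relaxed variant. */
#ifndef STRICT_ASSERT
#define STRICT_ASSERT 0
#endif

/* Checks the invariant 'c->duration == 0'.
 * Returns 1 if it holds. If it does not hold, either aborts
 * (strict mode) or logs a warning and returns 0 (relaxed mode).
 * The client is never modified. */
int check_duration_invariant(const client *c) {
    if (c->duration == 0) return 1;
#if STRICT_ASSERT
    fprintf(stderr, "=== ASSERTION FAILED === 'c->duration == 0' is not true\n");
    abort();
#else
    fprintf(stderr, "warning: c->duration is %lld, expected 0\n", c->duration);
    return 0;
#endif
}
```

In relaxed mode, calling check_duration_invariant on a client whose duration is 0 returns 1, and calling it on a client with a non-zero duration prints a warning and returns 0 instead of terminating. Whether skipping the real check is safe in a production Redis build is exactly the "in theory" caveat above, so treat this as a temporary workaround for finishing the rebalance, not a fix.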

Yes, thanks.