[RIOT-Redis] Error during live replication: "io.lettuce.core.output.StatusOutput does not support set(long)"
StephanHener opened this issue
Hey,
We are facing an error when attempting to do a live replication. During the migration the logs show this error:
Encountered an error executing step snapshot-replication in job live-replication: io.lettuce.core.output.StatusOutput does not support set(long)
This error seems to occur a couple of times, ultimately leading to the connection closing and RIOT shutting down.
Here is the command we use for replication:
riot-redis --info -h <source_host> -p <source_port> -a <source_password> -n <source_database> --timeout 60 --metrics replicate -h <target_host> -p <target_port> -a <target_password> --tls --tls-verify NONE -n 0 --metrics --mode live --batch 100 --scan-count 2000 --reader-threads 1 --reader-batch 100 --reader-queue 2000 --scan-match "*" --threads 1 --no-verify
The source is a read-only replica running Redis 6.2.8; the target is running Redis 6.2.6.
We use riot-redis 2.18.5 in a Docker container via fieldengineering/riot-redis:v2.18.5, but we also get the same error when running it directly on a local machine.
Here is an example info log:
Job: [FlowJob: [name=live-replication]] launched with the following parameters: [{}]
Executing step: [snapshot-replication]
Executing step: [live-replication]
Listening ? % 0/? (0:00:00 / ?) ?/s
Job: [SimpleJob: [name=live-reader]] launched with the following parameters: [{}]
Executing step: [live-reader]
Scanning 0% 0/41558 (0:00:00 / ?) ?/s
Job: [SimpleJob: [name=scan-reader]] launched with the following parameters: [{}]
Executing step: [scan-reader]
Listening ? % 9/? (0:00:00 / ?) ?/s
Encountered an error executing step live-replication in job live-replication: io.lettuce.core.output.StatusOutput does not support set(long)
Step: [live-replication] executed in 845ms
Listening ? % 27/? (0:00:00 / ?) ?/s
Scanning 1% 700/41558 (0:00:00 / 0:00:35) ?/s
Scanning 3% 1500/41558 (0:00:00 / 0:00:24) ?/s
Scanning 5% 2200/41558 (0:00:01 / 0:00:21) 2200.0/s
Scanning 5% 2400/41558 (0:00:01 / 0:00:24) 2400.0/s
Scanning 7% 3200/41558 (0:00:01 / 0:00:21) 3200.0/s
Scanning 9% 4000/41558 (0:00:02 / 0:00:19) 2000.0/s
Scanning 11% 4900/41558 (0:00:02 / 0:00:17) 2450.0/s
Scanning 13% 5700/41558 (0:00:02 / 0:00:16) 2850.0/s
Scanning 15% 6400/41558 (0:00:03 / 0:00:16) 2133.3/s
Scanning 17% 7300/41558 (0:00:03 / 0:00:15) 2433.3/s
Scanning 19% 8000/41558 (0:00:03 / 0:00:15) 2666.7/s
Scanning 21% 8900/41558 (0:00:03 / 0:00:14) 2966.7/s
Scanning 23% 9800/41558 (0:00:04 / 0:00:13) 2450.0/s
Scanning 25% 10500/41558 (0:00:04 / 0:00:13) 2625.0/s
Scanning 27% 11400/41558 (0:00:04 / 0:00:12) 2850.0/s
Scanning 29% 12100/41558 (0:00:05 / 0:00:12) 2420.0/s
Scanning 30% 12500/41558 (0:00:05 / 0:00:12) 2500.0/s
Exception while closing step execution resources in step live-replication in job live-replication
Scanning 32% 13700/41558 (0:00:07 / 0:00:15) 1957.1/s
Scanning 34% 14400/41558 (0:00:07 / 0:00:14) 2057.1/s
Scanning 35% 14800/41558 (0:00:08 / 0:00:14) 1850.0/s
Encountered an error executing step snapshot-replication in job live-replication: io.lettuce.core.output.StatusOutput does not support set(long)
Step: [snapshot-replication] executed in 8s695ms
Scanning 37% 15500/41558 (0:00:08 / 0:00:14) 1937.5/s
Exception while closing step execution resources in step snapshot-replication in job live-replication
Job: [FlowJob: [name=live-replication]] completed with the following parameters: [{}] and the following status: [FAILED] in 13s724ms
Encountered an error executing step live-reader in job live-reader: Connection closed
Step: [live-reader] executed in 13s755ms
Closing with items still in queue
Exception while closing step execution resources in step live-reader in job live-reader
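To make the failure order in this log easier to see, here is a small Python sketch that pulls out the step-failure lines. The helper name `failed_steps` and the regular expression are only illustrative assumptions about the log format shown above, not part of RIOT.

```python
import re

# Lines copied from the info log in this issue.
LOG = """\
Encountered an error executing step live-replication in job live-replication: io.lettuce.core.output.StatusOutput does not support set(long)
Step: [live-replication] executed in 845ms
Encountered an error executing step snapshot-replication in job live-replication: io.lettuce.core.output.StatusOutput does not support set(long)
Step: [snapshot-replication] executed in 8s695ms
Encountered an error executing step live-reader in job live-reader: Connection closed
"""

# Assumed shape of a step-failure line, based only on the lines above.
ERROR_LINE = re.compile(
    r"Encountered an error executing step (?P<step>\S+) in job (?P<job>\S+): (?P<message>.+)"
)


def failed_steps(log_text):
    """Return (step, job, message) for every step-failure line, in log order."""
    return [
        (m["step"], m["job"], m["message"])
        for m in ERROR_LINE.finditer(log_text)
    ]


failures = failed_steps(LOG)
for step, job, message in failures:
    print(f"{job}/{step}: {message}")
```

Both replication steps report the same StatusOutput error, and the live-reader only fails afterwards with "Connection closed", which may simply be a consequence of the job shutting down rather than a separate problem.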
Running the tool with the debug flag produces the following stacktrace for the error:
io.lettuce.core.RedisException: java.lang.UnsupportedOperationException: io.lettuce.core.output.StatusOutput does not support set(long)
at io.lettuce.core.internal.Exceptions.fromSynchronization(Exceptions.java:106)
at io.lettuce.core.internal.Futures.awaitAll(Futures.java:226)
at io.lettuce.core.LettuceFutures.awaitAll(LettuceFutures.java:59)
at com.redis.spring.batch.RedisItemWriter.write(RedisItemWriter.java:44)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.writeItems(SimpleChunkProcessor.java:193)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.doWrite(SimpleChunkProcessor.java:159)
at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor$3.doWithRetry(FaultTolerantChunkProcessor.java:348)
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:329)
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:255)
at org.springframework.batch.core.step.item.BatchRetryTemplate.execute(BatchRetryTemplate.java:217)
at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor.write(FaultTolerantChunkProcessor.java:444)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:217)
at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:77)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:407)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:331)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:273)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:82)
at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:262)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate.getNextResult(TaskExecutorRepeatTemplate.java:125)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:145)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:258)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:208)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:152)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:68)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:68)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:94)
at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:91)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.UnsupportedOperationException: io.lettuce.core.output.StatusOutput does not support set(long)
at io.lettuce.core.output.CommandOutput.set(CommandOutput.java:107)
at io.lettuce.core.protocol.RedisStateMachine.safeSet(RedisStateMachine.java:778)
at io.lettuce.core.protocol.RedisStateMachine.handleInteger(RedisStateMachine.java:404)
at io.lettuce.core.protocol.RedisStateMachine$State$Type.handle(RedisStateMachine.java:206)
at io.lettuce.core.protocol.RedisStateMachine.doDecode(RedisStateMachine.java:334)
at io.lettuce.core.protocol.RedisStateMachine.decode(RedisStateMachine.java:295)
at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:842)
at io.lettuce.core.protocol.CommandHandler.decode0(CommandHandler.java:793)
at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:767)
at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:659)
at io.lettuce.core.protocol.CommandHandler.channelRead(CommandHandler.java:599)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1373)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:519)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:458)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
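Reading the trace bottom-up: `RedisStateMachine.handleInteger` decoded an integer reply on the TLS connection (so presumably the target) and handed it to a `StatusOutput`, which does not accept integers, so `CommandOutput.set` threw. In other words, a write was expecting a status reply such as `+OK` but received an integer reply (`:N`) instead. The Python sketch below is a simplified model of that mechanism, not Lettuce's actual code; the class and function names only mirror the names in the trace for illustration, and it does not explain why the server answered with an integer.

```python
class CommandOutput:
    """Base output: rejects every reply type unless a subclass overrides it."""

    def set_status(self, value: str) -> None:
        raise NotImplementedError(f"{type(self).__name__} does not support set(status)")

    def set_integer(self, value: int) -> None:
        raise NotImplementedError(f"{type(self).__name__} does not support set(long)")


class StatusOutput(CommandOutput):
    """Output for commands that are expected to answer with a status such as +OK."""

    def __init__(self) -> None:
        self.result = None

    def set_status(self, value: str) -> None:
        self.result = value


def decode_reply(line: bytes, output: CommandOutput) -> None:
    """Dispatch one single-line RESP reply to the output based on its type byte."""
    kind, payload = line[:1], line[1:].rstrip(b"\r\n").decode()
    if kind == b"+":      # simple string (status) reply
        output.set_status(payload)
    elif kind == b":":    # integer reply
        output.set_integer(int(payload))
    else:
        raise ValueError(f"reply type {kind!r} is not modelled in this sketch")


ok = StatusOutput()
decode_reply(b"+OK\r\n", ok)   # expected reply type: accepted
print(ok.result)

try:
    decode_reply(b":1\r\n", StatusOutput())   # integer where a status was expected
except NotImplementedError as exc:
    error_message = str(exc)
    print(error_message)
```

If this reading is right, the useful question for debugging is which write command received the unexpected integer reply, for example because replies and commands got out of step on the connection; the trace alone does not say.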
Some other notes:
- running snapshot mode instead of live mode produces the same error
- after it aborts, the target contains significantly fewer keys than the source, so it looks like it dies before even starting the ongoing (live) migration
- using the --dry-run option works without error, so it looks like this only happens when trying to write to the target cluster
- in terms of key datatype structure, we have roughly 60k keys in Redis:
  - roughly 60% are strings that are usually just created, updated once or twice, and deleted
  - roughly another 40% are zsets with very few entries (<10) that are rarely updated
  - the rest are
    - static keys that don't change
    - zsets that are updated frequently and have more entries (most of them under 500, one with ~20k); these represent job queues, a pattern that, according to the documentation, riot-redis might struggle with. We are not sure this is relevant yet, as we already seem to fail during the initial replication
Unfortunately we didn't find much about that error online. Any idea what could cause this?
Does the error still happen in RIOT 3.x?
Hey,
We have already migrated our Redis instances, and we decided to use a different approach without riot-redis.
I am closing the issue.