Broker may forget to upload segment to remote storage - data loss
ivanyu opened this issue · comments
Start a broker with the following configuration:
remote.log.storage.system.enable=true
remote.log.manager.task.interval.ms=5000
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
rlmm.config.remote.log.metadata.topic.replication.factor=1
remote.log.metadata.manager.listener.name=PLAINTEXT
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.commons.UniversalRemoteStorageManager
rsm.config.chunk.size=102400
rsm.config.object.storage.factory=io.aiven.kafka.tieredstorage.commons.storage.filesystem.FileSystemStorageFactory
rsm.config.object.storage.root=<some dir here>
(it uses our UniversalRemoteStorageManager, but the specific remote storage manager doesn't matter here).
Create the topic with the following command:
~/kafka/kafka_2.13-3.4.0/bin/kafka-topics.sh --bootstrap-server localhost:9092 \
--create --topic topic1 \
--config remote.storage.enable=true \
--config segment.bytes=512000 \
--config retention.ms=3600000 \
--config local.retention.ms=120000
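With these settings, a closed segment falls out of local retention after two minutes, while the overall retention is one hour. A minimal sketch of what that means (the helper name is invented for illustration; Kafka itself also requires the segment to have been uploaded before local deletion, which is exactly the guarantee this issue is about):

```python
# Sketch of the retention windows configured above (names invented).
LOCAL_RETENTION_MS = 120_000      # local.retention.ms: 2 minutes
TOTAL_RETENTION_MS = 3_600_000    # retention.ms: 1 hour

def eligible_for_local_deletion(age_ms):
    # A non-active segment may be removed from local disk once it is
    # older than local.retention.ms -- provided it was uploaded first.
    return age_ms > LOCAL_RETENTION_MS

print(eligible_for_local_deletion(60_000))   # False: still inside local retention
print(eligible_for_local_deletion(180_000))  # True: may be deleted locally
```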
Fill it with the following command:
~/kafka/kafka_2.13-3.4.0/bin/kafka-producer-perf-test.sh \
--topic topic1 --num-records=10000 --throughput -1 --record-size 1000 \
--producer-props acks=1 batch.size=16384 bootstrap.servers=localhost:9092
The last/active segment is 00000000000000009920.
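As a sanity check, the base offsets line up if each segment holds about 496 records (an inferred figure: 512000-byte segments over roughly 1032 bytes per record on disk, payload plus batch framing; the exact overhead is an assumption):

```python
# Back-of-the-envelope check of the observed base offsets.
# Assumption: ~496 records fit per 512000-byte segment (inferred from
# the logs, where consecutive base offsets differ by 496).
num_records = 10_000
records_per_segment = 496

base_offsets = list(range(0, num_records, records_per_segment))
print(base_offsets[:2])   # first segments: [0, 496]
print(base_offsets[-1])   # last/active segment base offset: 9920
```

This also matches the log excerpts below, where 00000000000000009424 is the last uploaded segment before 9920.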
Then the following happens:
- All the current segments on disk are uploaded (except the last, i.e. the active one).
[2023-04-21 07:48:16,094] INFO [RemoteLogManager=0 partition=o46XK3njSDKNWM2AuTeVuQ:topic1-0] Copied 00000000000000000000.log to remote storage with segment-id: RemoteLogSegmentId{topicIdPartition=o46XK3njSDKNWM2AuTeVuQ:topic1-0, id=xw7n8ttqSZKdaZFMDveVvg} (kafka.log.remote.RemoteLogManager$RLMTask)
...
[2023-04-21 07:48:18,218] INFO [RemoteLogManager=0 partition=o46XK3njSDKNWM2AuTeVuQ:topic1-0] Copied 00000000000000009424.log to remote storage with segment-id: RemoteLogSegmentId{topicIdPartition=o46XK3njSDKNWM2AuTeVuQ:topic1-0, id=kAjnjNskSAOn2x4xnGQqlA} (kafka.log.remote.RemoteLogManager$RLMTask)
- A new segment is rolled (I don't know why, since no data are being written, but it doesn't matter):
[2023-04-21 07:50:32,317] INFO [LocalLog partition=topic1-0, dir=/home/ivanyu/kafka/kafka-ts-development/_run-kafka-logs] Rolled new log segment at offset 10000 in 5 ms. (kafka.log.LocalLog)
[2023-04-21 07:50:32,335] INFO [ProducerStateManager partition=topic1-0] Wrote producer snapshot at offset 10000 with 0 producer ids in 18 ms. (kafka.log.ProducerStateManager)
- The 00000000000000009920 segment becomes eligible for uploading and cleaning, but it is not uploaded.
- Old segments are cleaned:
[2023-04-21 07:51:32,342] INFO Deleted log /home/ivanyu/kafka/kafka-ts-development/_run-kafka-logs/topic1-0/00000000000000000000.log.deleted. (kafka.log.LogSegment)
...
[2023-04-21 07:51:32,352] INFO Deleted log /home/ivanyu/kafka/kafka-ts-development/_run-kafka-logs/topic1-0/00000000000000009920.log.deleted. (kafka.log.LogSegment)
including the 00000000000000009920 segment, which was never uploaded.
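The failure mode can be sketched as a timing race between two independent tasks (an illustrative model, not Kafka code; the function name and timings are invented): the RLM copy task wakes up every `remote.log.manager.task.interval.ms` (5 s here) and only considers non-active segments, while retention cleanup runs independently. If a segment rolls and is deleted by retention before the next copy tick sees it, it is lost without ever being uploaded.

```python
import math

# Illustrative model of the race (not actual Kafka code).
def upload_gets_a_chance(roll_time_s, delete_time_s, copy_interval_s=5):
    """True if a copy-task tick falls between segment roll and deletion."""
    # First copy tick at or after the segment became non-active:
    first_tick = math.ceil(roll_time_s / copy_interval_s) * copy_interval_s
    return first_tick <= delete_time_s

# Segment rolls and is deleted in the same retention pass (as in the logs):
print(upload_gets_a_chance(roll_time_s=12.0, delete_time_s=12.1))  # False -> data loss
# Segment rolls well before retention deletes it:
print(upload_gets_a_chance(roll_time_s=12.0, delete_time_s=60.0))  # True
```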
On which branch is this happening?
3.3-2022-10-06-tiered-storage
Did you try running on 3.0 to see if it's also there?
Reproduces on 3.3-2022-10-06-tiered-storage, but works as expected on 3.0-li. The 10000 segment is not rolled on 3.0, so the 9920 segment remains active and is not deleted.
It seems that on 3.3 it reproduces inconsistently, so I can't guarantee it doesn't happen on 3.0 as well.
UPD: no, it's consistent. The segment roll happens at the same moment as the first deletion:
[2023-04-21 13:33:12,765] INFO [LocalLog partition=topic1-0, dir=/home/ivanyu/kafka/kafka-ts-development/_run-kafka-logs] Rolled new log segment at offset 10000 in 4 ms. (kafka.log.LocalLog)
[2023-04-21 13:33:12,784] INFO [ProducerStateManager partition=topic1-0] Wrote producer snapshot at offset 10000 with 0 producer ids in 19 ms. (kafka.log.ProducerStateManager)
[2023-04-21 13:33:12,786] INFO [UnifiedLog partition=topic1-0, dir=/home/ivanyu/kafka/kafka-ts-development/_run-kafka-logs] Deleting segment LogSegment(baseOffset=0, size=502355, lastModifiedTime=1682073046965, largestRecordTimestamp=Some(1682073046848)) due to retention time 120000ms breach based on the largest record timestamp in the segment (kafka.log.UnifiedLog)
@ivanyu I believe I have found the root cause. Can you apply this patch onto the 3.3 branch and confirm whether it solves the problem?
Seems to be working with the patch: the 10000 segment is not created and the 9920 one remains the active segment.
Perfect, thanks! I need to update my GitHub subscriptions; I missed this.
Force updating the branch now with the patch.
Closing the ticket: the 3.3-2022-10-06-tiered-storage branch has been force-pushed with the fix.