aiven / kafka

Mirror of Apache Kafka

Consumption from Tiered Storage is broken

jeqo opened this issue · comments

When testing the 3.3-2022-10-06-tiered-storage branch, consumption seems to be broken compared to 3.0-2022-03-31-tiered-storage.

Test harness cases are failing when trying to consume, e.g. DeleteTopicWithSecondaryStorageTest:

org.opentest4j.AssertionFailedError: Could not consume 3 records of topicA-1 from offset 0 in 60000 ms. 0 message(s) consumed:

	at app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
	at app//org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
	at app//kafka.utils.TestUtils$.pollRecordsUntilTrue(TestUtils.scala:1061)
	at app//kafka.tiered.storage.TieredStorageTestContext.consume(TieredStorageTestContext.scala:151)
	at app//kafka.tiered.storage.ProduceAction.doExecute(TieredStorageTestSpec.scala:282)
	at app//kafka.tiered.storage.TieredStorageTestAction.execute(TieredStorageTestSpec.scala:110)
	at app//kafka.tiered.storage.TieredStorageTestAction.execute$(TieredStorageTestSpec.scala:108)
	at app//kafka.tiered.storage.ProduceAction.execute(TieredStorageTestSpec.scala:216)

I haven't dug into the details of what may be causing this issue yet, but I'm adding it here to keep track.

Many thanks to @ivanyu, who managed to find the core issue. Because of the changes that had to be integrated, a lot of places moved from TopicPartition to TopicIdPartition, and I missed one line where I didn't adjust an equality check, i.e.

if (tp.equals(remoteFetchInfo.topicPartition) && remoteFetchResult.isDone

I applied the fix and force-pushed to the branch (see https://github.com/aiven/kafka/blob/3.3-2022-10-06-tiered-storage/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala#L94), and I can confirm that the various Fetcher tests are now passing.
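
To illustrate why an unadjusted equality check can silently stop matching after a type migration like this, here is a minimal sketch. It is not the code from DelayedRemoteFetch.scala: the two classes are simplified, hypothetical stand-ins for Kafka's TopicPartition and TopicIdPartition (using java.util.UUID for the topic ID), and the thread does not say which side of the real comparison held which type. The point is only that `equals` accepts any object, so comparing the two different types compiles but never returns true.

```java
import java.util.Objects;
import java.util.UUID;

public class TopicIdEqualitySketch {

    // Hypothetical, simplified stand-in for Kafka's TopicPartition.
    public static final class TopicPartition {
        final String topic;
        final int partition;

        public TopicPartition(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof TopicPartition)) return false;
            TopicPartition other = (TopicPartition) o;
            return partition == other.partition && topic.equals(other.topic);
        }

        @Override
        public int hashCode() {
            return Objects.hash(topic, partition);
        }
    }

    // Hypothetical, simplified stand-in for Kafka's TopicIdPartition.
    public static final class TopicIdPartition {
        final UUID topicId;
        final TopicPartition topicPartition;

        public TopicIdPartition(UUID topicId, TopicPartition topicPartition) {
            this.topicId = topicId;
            this.topicPartition = topicPartition;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof TopicIdPartition)) return false;
            TopicIdPartition other = (TopicIdPartition) o;
            return topicId.equals(other.topicId)
                && topicPartition.equals(other.topicPartition);
        }

        @Override
        public int hashCode() {
            return Objects.hash(topicId, topicPartition);
        }
    }

    // Compares values of two different types: compiles, but is always false.
    public static boolean mismatchedCheck(TopicIdPartition key, TopicPartition fetched) {
        return key.equals(fetched);
    }

    // Compares like with like by unwrapping the TopicPartition first.
    public static boolean alignedCheck(TopicIdPartition key, TopicPartition fetched) {
        return key.topicPartition.equals(fetched);
    }

    public static void main(String[] args) {
        TopicIdPartition key =
            new TopicIdPartition(UUID.randomUUID(), new TopicPartition("topicA", 1));
        TopicPartition fetched = new TopicPartition("topicA", 1);

        System.out.println("mismatched types: " + mismatchedCheck(key, fetched));
        System.out.println("aligned types:    " + alignedCheck(key, fetched));
    }
}
```

In a delayed-fetch path, a check that is always false would mean the remote result is never picked up for the partition, which would be consistent with the consumer timing out with 0 messages consumed as in the test failure above.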

@jeqo Can you confirm that it also passes on your end? You might still need the Thread.sleep workaround in RemoteLogManager.onEndpointCreated. Once you confirm this, I will close the ticket.

Closing this as I believe it's fixed; re-open if that's not the case.

@mdedetrich sorry for the late reply. Yes, I can confirm that fetching on 3.3 is working for me. Thank you!