jetstream pull_subscribe never returns and makes stream unusable
jghelto opened this issue · comments
Hello!
I'm using nats-py to jump to different positions in a stream in order to replay messages. My streams are very large (14 million messages of roughly 400 KB each). When I set opt_start_seq in the ConsumerConfig to a sequence deep in the stream (usually anything more than 3 million messages from the beginning) and call jetstream.pull_subscribe, the call never returns a pull subscriber; it times out instead. While the call is in progress, nats stream info on the CLI shows that a consumer has been created, and nats consumer ls lists the consumer with the correct durable ID. However, once jetstream.pull_subscribe times out (timeout=60s), the pull subscriber is None and the CLI can no longer operate on the stream for some time (sometimes up to 30 minutes). After the pull_subscribe times out, I also make a consumer_info call (timeout=60) and a second pull_subscribe call, since the consumer already exists. Both of these calls time out as well.
Also note that pull_subscribe calls that start early in the stream (i.e. less than 3 million messages in) work fine, but they take increasingly longer to complete the further into the stream the start sequence lies. Setting opt_start_seq to the last sequence in the stream, by contrast, comes back very quickly.
Code snippet and log are attached.
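The attached snippet is not reproduced here. For readers without the attachment, the following is only a rough sketch of the kind of call described above, not the exact code from the report. The server URL, the replay_from and check_start_seq names, and the choice to set deliver_policy to BY_START_SEQUENCE alongside opt_start_seq are assumptions made for illustration.

```python
import asyncio


def check_start_seq(start_seq: int, first_seq: int, last_seq: int) -> int:
    """Hypothetical helper: ensure start_seq lies inside the stream's range."""
    if not first_seq <= start_seq <= last_seq:
        raise ValueError(
            f"start_seq {start_seq} is outside stream range {first_seq}..{last_seq}"
        )
    return start_seq


async def replay_from(stream: str, subject: str, durable: str,
                      start_seq: int, timeout: float = 60.0) -> int:
    """Create a pull consumer at start_seq and return the stream sequence
    of the first message fetched."""
    # Imported here so check_start_seq can be used without nats-py installed.
    import nats
    from nats.js.api import ConsumerConfig, DeliverPolicy

    nc = await nats.connect("nats://localhost:4222")  # assumed server URL
    try:
        # The timeout applies to JetStream API requests such as consumer creation.
        js = nc.jetstream(timeout=timeout)

        # Look up the stream bounds and validate the requested position.
        info = await js.stream_info(stream)
        check_start_seq(start_seq, info.state.first_seq, info.state.last_seq)

        # Assumption: the start sequence is paired with BY_START_SEQUENCE.
        config = ConsumerConfig(
            deliver_policy=DeliverPolicy.BY_START_SEQUENCE,
            opt_start_seq=start_seq,
        )

        # This is the call that times out for large start sequences in the report.
        psub = await js.pull_subscribe(
            subject, durable=durable, stream=stream, config=config
        )

        msgs = await psub.fetch(1, timeout=timeout)
        return msgs[0].metadata.sequence.stream
    finally:
        await nc.close()
```

With a server running, something like asyncio.run(replay_from("test", "test", "test", 17676000)) would exercise the slow path; the stream, subject, and durable names here simply mirror the CLI command below and may not match the real setup.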
Of note, the same subscription from the nats CLI comes back in 2 seconds: nats consumer test test --pull --deliver=17676000
However, according to the JetStream server logs, the subscription from the nats-py library took 55 minutes.
2023/06/27 12:38:17.217470 [WRN] Internal subscription on "$JS.API.CONSUMER.DURABLE.CREATE.test.test" took too long: 55m17.0411416s
nats-py version 2.3.1
Nats server: 2.9.19 (nightly build) running in Docker