Always enable Querier BatchIterators
yeya24 opened this issue
Proposal
`-querier.batch-iterators` was marked as the default option in #2344, which was 4 years ago.
https://github.com/cortexproject/cortex/blob/master/pkg/querier/querier.go#L154
There are two other options, `Iterators` and `None`, and each goes down a different code path.
Since the batch iterator has been used for a long time in various production environments and is quite stable, I propose that we make it always on and remove the other two options. This would simplify the codebase quite a bit and make it easier to support querying native histograms.
+1
The plan is to mark the related flags as deprecated in v1.17.0 and remove them entirely in v1.19.0.
👍
Actually, I want to verify whether we can remove the chunk iterators Cortex currently maintains (https://github.com/cortexproject/cortex/blob/master/pkg/querier/querier.go#L154) and reuse the upstream Prometheus storage interface as much as possible.
If the batch iterators prove to have much better performance, then we should definitely keep them. Otherwise, we can switch to the upstream interface instead. By reusing the upstream interface, almost no code change is required to support native histograms on the query path.
Another benefit of entirely removing the chunk handling and Cortex's own iterators is supporting Thanos downsampled chunks out of the box. Otherwise, we would have to implement those iterators for downsampling as well.
Update: I actually tested this with downsampled blocks, and it seems that chunks queried from the Store Gateway are not affected by those chunk iterators at all; only data from ingesters is relevant.
I tested the performance of the Cortex batch iterator, the Cortex iterator, and the upstream Prometheus iterator (mainly `storage.ChainSampleIteratorFromIterables`) and put the benchmark results here. The test covers two operations: `Next` and `Seek`.
The benchmark results are below.
`Seek`
- When the number of chunks is low, the Cortex iterator performs best, followed by the batch iterator, with the Prometheus iterator worst.
- When the number of chunks is large, the Cortex batch iterator outperforms the Cortex iterator, and the Prometheus iterator is far worse: roughly 20x slower than the Cortex batch iterator.
`Next`
- If we iterate over all samples in the chunks, the Prometheus iterator performs best; the Cortex batch iterator is slightly faster than the Cortex iterator.
- As the number of chunks increases, the Cortex batch iterator's performance improves, while the Prometheus iterator gets worse the more chunks we have.
- When the number of chunks is large and we stop early while iterating samples, the Cortex batch iterator performs best and the Cortex iterator worst.
TL;DR:
- I don't think we should replace the Cortex batch iterator and the Cortex iterator with the Prometheus iterator, due to the Prometheus iterator's poor `Seek` performance.
- Since the Cortex batch iterator has been the default for 4 years, I think we should remove the Cortex iterator, even though it seems to have better `Seek` performance in some scenarios.