parallelize `referenced_blocks`
sourcefrog opened this issue · comments
`referenced_blocks` builds a set of all blocks referenced by all indexes. In 0.6.8 this is done on a single thread, but it could fairly easily be parallelized:
- Read multiple bands in parallel.
- Read hunks of indexes in parallel.
This will significantly help the performance of `conserve gc` and `conserve delete`.
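As a rough illustration of the first bullet (reading bands in parallel), here is a minimal sketch using only the standard library. It is not Conserve's actual code: `BlockHash`, `Hunk`, and `Band` are hypothetical in-memory stand-ins for the real index types, and spawning one thread per band is a simplification. A real implementation would more likely use a bounded thread pool and stream hunks from disk.

```rust
use std::collections::HashSet;
use std::thread;

// Hypothetical stand-ins for the real types (assumptions for illustration):
// a block hash is a string, a hunk is the list of block hashes its entries
// reference, and a band is a list of hunks.
type BlockHash = String;
type Hunk = Vec<BlockHash>;
type Band = Vec<Hunk>;

/// Collect every block hash referenced by any band, processing bands in parallel.
fn referenced_blocks(bands: &[Band]) -> HashSet<BlockHash> {
    thread::scope(|s| {
        // One scoped thread per band. Each thread builds its own private set,
        // so no locking is needed while reading.
        let workers: Vec<_> = bands
            .iter()
            .map(|band| {
                s.spawn(move || {
                    band.iter()
                        .flatten()
                        .cloned()
                        .collect::<HashSet<BlockHash>>()
                })
            })
            .collect();

        // Merge the per-band sets on the calling thread.
        let mut all = HashSet::new();
        for worker in workers {
            all.extend(worker.join().expect("band reader thread panicked"));
        }
        all
    })
}
```

Because each worker only produces a set and the result is a union, the order in which bands (or hunks) are read does not matter, which is what makes this step straightforward to parallelize.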
Not strictly the same but related:
- There's no need to stitch indexes when finding referenced blocks, because this amounts to reading the same hunk twice. (However, truncated indexes are probably fairly rare so this is not too high of a priority.)
Lines 238 to 261 in 9f405dc