Further space improvement
dkoslicki opened this issue · comments
David Koslicki commented
The space required can be reduced significantly by only using the k-mers that are present in the union of the training/reference genomes. This will significantly cut down on the size of the Bloom filter of the sample. Since the filter would then no longer contain every k-mer in the sample, we would need a more creative way to estimate the cardinality of the whole sample (e.g. HyperLogLog).
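A rough sketch of what this could look like, just to make the idea concrete. Everything here is an assumption for illustration: the names `kmers`, `HyperLogLog`, and `sketch_sample` are hypothetical, a plain Python `set` stands in for the sample's Bloom filter, and the HyperLogLog is a minimal textbook version (SHA-1 based hash, small-range correction only), not anything from this repository.

```python
import hashlib
import math


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from the first 8 bytes of SHA-1.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                 # top p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)    # constant valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


def sketch_sample(sample_reads, reference_kmers, k):
    """Keep only sample k-mers seen in the reference union, while
    estimating the cardinality of ALL sample k-mers with HyperLogLog."""
    hll = HyperLogLog()
    kept = set()  # stand-in for the (now much smaller) sample Bloom filter
    for read in sample_reads:
        for km in kmers(read, k):
            hll.add(km)                  # every k-mer counts toward cardinality
            if km in reference_kmers:
                kept.add(km)             # only reference k-mers are stored
    return kept, hll.estimate()


reference = set(kmers("ACGTACGTGG", 4))
kept, total = sketch_sample(["ACGTAC", "TTTTGGGG"], reference, 4)
```

The point of the split is that `kept` grows only with the overlap between the sample and the references, while the HyperLogLog registers use a fixed amount of memory regardless of how many distinct k-mers the sample contains.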