brentp / hts-nim

nim wrapper for htslib for parsing genomics data files

Home Page: https://brentp.github.io/hts-nim/

Iteration without for loop

jaudoux opened this issue

Hi @brentp,

I was looking into merging two sorted VCF files using hts-nim. However, I had trouble opening one iterator per file and advancing them alternately, depending on the variant positions in each VCF file.

I finally used the concept of "anonymous functions", as I used to in Perl, but it's not very elegant. Do you happen to have a better pattern (see the code below)?

Thanks in advance,
Jérôme.

import hts

type var_it = (proc(): Variant)

proc get_vcf_it(vcf: VCF): var_it =
  return proc(): Variant =
    for i in vcf:
      return i

var
  vcf: VCF
  v: Variant
  vcf_it: var_it

doAssert(open(vcf, "DP28102008_S16.vcf"))
vcf_it = get_vcf_it(vcf)

v = vcf_it()
echo(v)

v = vcf_it()
echo(v)

hi, yes, you can use a closure iterator:

import hts
import os

proc get_vcf_it(vcf: VCF): iterator(): Variant =
  return iterator(): Variant =
    for i in vcf:
      yield i

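var vcf: VCF

# the VCF path is taken from the first command-line argument
doAssert(open(vcf, paramStr(1)))
var vcf_it = get_vcf_it(vcf)

# each call resumes the iterator and returns the next Variant
echo vcf_it()
echo vcf_it()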
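
To show how two such closure iterators can be advanced alternately, here is a minimal sketch of a two-way merge. It does not use hts-nim at all: `Rec`, `RecIter`, `recIter` and `mergeTwo` are hypothetical names, and a `(chrom, pos)` tuple stands in for a `Variant` so the example runs on its own. The one Nim detail it relies on is that `finished()` only becomes true after a call has run past the last `yield`, so it has to be checked after each call.

```nim
# Stand-in for a Variant (assumption): only the sort key matters for merging.
type
  Rec = tuple[chrom: int, pos: int]
  RecIter = iterator(): Rec {.closure.}

proc recIter(recs: seq[Rec]): RecIter =
  ## Wrap an already-sorted sequence in a closure iterator.
  return iterator(): Rec =
    for r in recs:
      yield r

proc mergeTwo(a, b: RecIter): seq[Rec] =
  ## Merge two sorted streams by always advancing the one with the smaller head.
  # Read one record from each stream; finished() is checked after the call.
  var x = a()
  var aDone = finished(a)
  var y = b()
  var bDone = finished(b)
  while not (aDone and bDone):
    if bDone or (not aDone and x <= y):
      # Stream a holds the smaller head (or b is exhausted): emit and advance a.
      result.add x
      x = a()
      aDone = finished(a)
    else:
      # Stream b holds the smaller head (or a is exhausted): emit and advance b.
      result.add y
      y = b()
      bDone = finished(b)

let first = recIter(@[(chrom: 1, pos: 10), (chrom: 1, pos: 30), (chrom: 2, pos: 5)])
let second = recIter(@[(chrom: 1, pos: 20), (chrom: 2, pos: 1)])
for r in mergeTwo(first, second):
  echo r
```

With real VCF files, each stream would presumably be the iterator returned by `get_vcf_it` on its own `VCF` object, and the comparison would be made on each record's contig and position instead of on a tuple.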

Also, for two files this would be overkill, but you could use a binary heap as a priority queue of sorts to merge any number of files.
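
The heap idea can be sketched as a k-way merge. This is an illustration under assumptions, not necessarily the library meant above: it uses `std/heapqueue` from the Nim standard library, and `Rec`, `RecIter`, `Head`, `recIter` and `mergeAll` are hypothetical names, with a `(chrom, pos)` tuple again standing in for a `Variant`. The heap holds the current head of every stream that is not yet exhausted, so each pop yields the globally smallest record.

```nim
import std/heapqueue

type
  Rec = tuple[chrom: int, pos: int]        # stand-in for a Variant (assumption)
  RecIter = iterator(): Rec {.closure.}
  Head = tuple[rec: Rec, src: int]         # current record plus index of its source

proc recIter(recs: seq[Rec]): RecIter =
  ## Wrap an already-sorted sequence in a closure iterator.
  return iterator(): Rec =
    for r in recs:
      yield r

proc mergeAll(sources: seq[RecIter]): seq[Rec] =
  ## Merge any number of sorted streams using a binary heap of their heads.
  var heap = initHeapQueue[Head]()
  # Prime the heap with the first record of every non-empty stream.
  for i in 0 ..< sources.len:
    let it = sources[i]
    let r = it()
    if not finished(it):
      heap.push((rec: r, src: i))
  while heap.len > 0:
    # Tuples compare field by field, so the smallest (chrom, pos) pops first.
    let head = heap.pop()
    result.add head.rec
    # Refill from the stream that just supplied the popped record.
    let it = sources[head.src]
    let next = it()
    if not finished(it):
      heap.push((rec: next, src: head.src))

let streams = @[
  recIter(@[(chrom: 1, pos: 10), (chrom: 2, pos: 5)]),
  recIter(@[(chrom: 1, pos: 20)]),
  recIter(@[(chrom: 1, pos: 15), (chrom: 3, pos: 1)])
]
for r in mergeAll(streams):
  echo r
```

Each push and pop costs O(log k) for k open streams, which is why this only pays off once there are more than a couple of files.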

Thanks a lot @brentp, I knew there was a cleaner way to do it!

Nice to know that there is a binary heap lib ready to use ;)