clwgg / nQuire

A statistical framework for ploidy estimation using NGS short-read data

create multithreading

rotifergirl opened this issue

I was wondering whether it is possible to run create on multiple threads. I am trying to run it with a BED file of annotated genes (about 15,000), and it is taking a very long time, but I can't find a threading option in the help for create.
If this is not possible currently, is it possible to implement in a future version of nQuire?

Unfortunately, the 'create' subcommand currently has no multithreading implementation. I actually didn't think about this particular application, but it's a great suggestion for cases where the regions are to be concatenated into a single '.bin' file (i.e. using the -y option).
In cases where a separate '.bin' file is created for each BED region, one could split the BED file and run multiple instances of 'nquire create' in parallel. Since your BED file contains genes, though, I suppose this is not the use case you are interested in (that mode is typically used to split the analysis by chromosome, for example).
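For anyone who does want the per-region mode, here is a minimal sketch of the split-and-run idea, assuming a standard whitespace-delimited BED file with the chromosome name in the first column. The helper names `split_bed_by_chrom` and `run_parallel` are hypothetical and not part of nQuire, and the actual command line for each instance is left to the caller.

```python
import subprocess
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_bed_by_chrom(bed_path, out_dir):
    """Write one BED file per chromosome and return the paths created.

    Hypothetical helper: assumes the first column is the chromosome name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    by_chrom = defaultdict(list)
    with open(bed_path) as handle:
        for line in handle:
            # Skip blank lines and BED header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom = line.split(None, 1)[0]
            by_chrom[chrom].append(line if line.endswith("\n") else line + "\n")
    paths = []
    for chrom, lines in by_chrom.items():
        path = out_dir / f"{chrom}.bed"
        path.write_text("".join(lines))
        paths.append(path)
    return paths


def run_parallel(commands, max_workers=4):
    """Run each command (a list of arguments) as a separate process.

    Threads are enough here because each worker only waits on a subprocess.
    Returns the exit codes in the same order as the commands.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Each entry passed to `run_parallel` would be the 'nquire create' invocation for one of the per-chromosome BED files. The flags are deliberately left out of this sketch and should be taken from the help output of 'create'. If disk IO is the bottleneck, a small `max_workers` is probably the safer starting point.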

Also, my feeling is that a large proportion of the 'create' run time is actually due to disk IO, so spreading the calculations over multiple threads would probably have limited benefit. I hope to implement it, at least to benchmark this, but realistically it might take a while. For the time being, I am afraid the only option is to let it run.

Thanks! It did eventually finish; I just wanted to make sure I wasn't missing any options.