cancerit / cgpPindel

Cancer Genome Project Insertion/Deletion detection pipeline based around Pindel

Home Page: http://cancerit.github.io/cgpPindel/

Speed up flagging with IntervalTree

keiranmraine opened this issue

Most of the flagging code performs simple "hit" lookups in tabix files: it only needs to know whether any record overlaps the queried range. This can be handled in exactly the same way as the input generation speed-up.

This will have an additional advantage, as the current code wraps each query in an eval, which is expensive:

my $ret = eval {
    my $iter = $vcf_flagging_unmatched_normals_tabix->query_full($CHROM, $from, $to);
    return $PASS if(!defined $iter); # no valid entries (chromosome not in index) so must pass
    while($iter->next) {
        return $FAIL; # any overlapping record is a hit
    }
    return $PASS;
};
if($@) {
    die $@;
}

It should be possible to hide this in the reuse_unmatched_normals_tabix and reuse_repeats_tabix functions. The change needs to be applied in both FilterRules.pm and FragmentFilterRules.pm.
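
As a rough illustration of the in-memory "hit" lookup, here is a minimal pure-Perl sketch. It is not the cgpPindel implementation and does not use an interval tree module: for yes/no overlap queries it substitutes merged, sorted intervals with a binary search, which should give the same answer. The names build_hit_index, has_hit and flag_from_index are hypothetical, as are the values chosen for $PASS and $FAIL, and intervals are assumed to be closed with both ends inclusive. Loading the records from the tabix file is left out.

```perl
use strict;
use warnings;

# Assumed flag values; the real constants live in the filter modules.
my ($PASS, $FAIL) = (1, 0);

# Hypothetical helper: build a per-chromosome index from
# [chr, start, end] records (closed intervals, assumed convention).
sub build_hit_index {
    my (@records) = @_;
    my %by_chr;
    push @{ $by_chr{ $_->[0] } }, [ $_->[1], $_->[2] ] for @records;

    my %index;
    for my $chr (keys %by_chr) {
        my @sorted = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] }
                     @{ $by_chr{$chr} };
        my @merged;
        for my $iv (@sorted) {
            if (@merged && $iv->[0] <= $merged[-1][1]) {
                # Overlaps the previous interval: extend it if needed.
                $merged[-1][1] = $iv->[1] if $iv->[1] > $merged[-1][1];
            }
            else {
                push @merged, [ @{$iv} ];
            }
        }
        $index{$chr} = \@merged;
    }
    return \%index;
}

# Returns 1 if any indexed interval overlaps [$from, $to], else 0.
sub has_hit {
    my ($index, $chr, $from, $to) = @_;
    my $ivs = $index->{$chr} or return 0; # chromosome not in index

    # Binary search for the first interval whose end is >= $from.
    my ($lo, $hi) = (0, scalar @{$ivs});
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($ivs->[$mid][1] < $from) { $lo = $mid + 1; }
        else                         { $hi = $mid; }
    }
    return ($lo < @{$ivs} && $ivs->[$lo][0] <= $to) ? 1 : 0;
}

# Hypothetical replacement for the eval-wrapped tabix query: a hit fails.
sub flag_from_index {
    my ($index, $chr, $from, $to) = @_;
    return has_hit($index, $chr, $from, $to) ? $FAIL : $PASS;
}

# Usage with made-up records.
my $example_index = build_hit_index(
    [ 'chr1', 100, 200 ],
    [ 'chr1', 150, 300 ],
    [ 'chr1', 500, 600 ],
);
print flag_from_index($example_index, 'chr1', 250, 260) == $FAIL
    ? "FAIL\n" : "PASS\n";
```

Because a lookup like this does no I/O, the per-query eval would no longer be needed, and a chromosome missing from the index simply passes, matching the behaviour of the undefined-iterator case above. The coordinate convention would need checking against whatever the tabix files actually contain.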