genome / pindel

Pindel can detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data. It uses a pattern growth approach to identify the breakpoints of these variants from paired-end short reads.

Suboptimal gap opening

selkovjr opened this issue

I've been running into situations where pindel seemed to call short indels at the wrong locus, though in the general vicinity of the true variant. At a glance they looked like ambiguous representations of the same variant, so I ignored them. Now I can see that they are actually incorrect. Why does it pick a deletion accompanied by four mismatches when a perfectly matching deletion of the same size is sitting next door? Does the algorithm pick the first solution that satisfies all criteria? How can I persuade it to do the right thing without losing sensitivity?

Example (the top line is the reference, with the bases deleted in pindel's call in lowercase; the first alignment is by pindel, the second is a perfect match):

AATTTTTCCTataATAAAACAAATAATGTTAAAATGTTA
AATTTTTCCT---ATAATAAAACAAATGTTAAAATGTTA
AATTTTTCCTATAATAAAACA---AATGTTAAAATGTTA
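
For anyone who wants to check the mismatch counts, here is a minimal sketch that scores every placement of a single deletion of this size. It assumes the top line of the example is the reference and that the read is either alignment with the gap characters removed; `score_deletions` is a name made up for this illustration and has nothing to do with pindel's own code or its pattern growth search.

```python
# Reference and read copied from the example above (gaps removed, case folded).
REF = "AATTTTTCCTATAATAAAACAAATAATGTTAAAATGTTA"
READ = "AATTTTTCCTATAATAAAACAAATGTTAAAATGTTA"


def score_deletions(ref, read):
    """Score every placement of one deletion of size len(ref) - len(read).

    Returns a list of (mismatches, start) tuples, where start is the 0-based
    index of the first deleted reference base. This is a brute-force check,
    not a reimplementation of pindel.
    """
    size = len(ref) - len(read)
    results = []
    # Keep at least one aligned base on each side of the gap.
    for start in range(1, len(read)):
        gapped_ref = ref[:start] + ref[start + size:]
        mismatches = sum(a != b for a, b in zip(gapped_ref, read))
        results.append((mismatches, start))
    return results


scores = score_deletions(REF, READ)
by_start = {start: mm for mm, start in scores}
best = min(mm for mm, _ in scores)
perfect_starts = [start for mm, start in scores if mm == best]

print("pindel placement (start 10):", by_start[10], "mismatches")
print("placement from the last line (start 21):", by_start[21], "mismatches")
print("best score:", best, "at starts", perfect_starts)
```

With the sequences above, the placement pindel reports (deleting the lowercase `ata`) scores four mismatches, while the placement in the last line of the example scores zero.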

Pindel version 0.2.5b9, 20160729, with default parameters