"xxx.bam" indexes as "xxx.bai" or "xxx.bam.bai"
russHyde opened this issue · comments
I was wondering whether you'd accept a pull request to modify how bam-indexes are found by csaw.
Currently, for a given "xxx.bam", the csaw functions die if "xxx.bam.bai" can't be found.
As a suggestion, I think csaw should allow either "xxx.bam.bai" (samtools default) or "xxx.bai" (picard default) names for indexes (and die if neither is present; choose the newer of the two if both are present).
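To make the proposed rule concrete, here is a minimal sketch in Python. It is not csaw code; the function name `find_index_newest` is hypothetical, and the sketch assumes that "newer" means the later file modification time.

```python
from pathlib import Path


def find_index_newest(bam_path: str) -> Path:
    """Return the newer of xxx.bam.bai / xxx.bai; fail if neither exists."""
    bam = Path(bam_path)
    candidates = [
        Path(str(bam) + ".bai"),   # samtools-style name: xxx.bam.bai
        bam.with_suffix(".bai"),   # picard-style name: xxx.bai
    ]
    existing = [p for p in candidates if p.is_file()]
    if not existing:
        raise FileNotFoundError(f"no index found for {bam}")
    # Assumption: "newer" is judged by modification time. On a tie,
    # max() keeps the first candidate, i.e. the samtools-style name.
    return max(existing, key=lambda p: p.stat().st_mtime)
```

Modification times can change when files are copied or restored, so the result of this rule depends on how the files reached the disk.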
Ah, at last. 5 years ago I wondered whether I should provide this option, but I decided against it for simplicity. I was expecting more frequent complaints about this, but yours is the first.
> I was wondering whether you'd accept a pull request to modify how bam-indexes are found by csaw.
In principle, yes. In practice, I'm still waiting on events that will allow #4 to go through. You could make a PR off htsfree, but that's a moving target in itself... you're more than welcome to give it a go, though. I have some suggestions for how to do this if you're interested.
> As a suggestion, I think csaw should allow either "xxx.bam.bai" (samtools default) or "xxx.bai" (picard default) names for indexes (and die if neither is present; choose the newer of the two if both are present).
I would prefer the behaviour to be a bit simpler, and avoid checking time stamps:
- Check if indices are supplied as part of the input. If so, use them.
- Add .bam.bai to the end of the file names, and if they're there, use them.
- Add .bai to the end of the file names, and if they're there, use them.
- Die.
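The lookup order above could be sketched as follows. This is an illustration in Python, not csaw's or Rsamtools' implementation; the function name `resolve_index` and its `index_path` parameter are hypothetical. It assumes that, for "xxx.bam", the two candidate names are "xxx.bam.bai" and "xxx.bai", as in the issue title.

```python
from pathlib import Path
from typing import Optional


def resolve_index(bam_path: str, index_path: Optional[str] = None) -> Path:
    """Pick a BAM index using a fixed priority order, without time stamps."""
    # Step 1: an index supplied with the input is used as given.
    if index_path is not None:
        return Path(index_path)

    bam = Path(bam_path)
    candidates = (
        Path(str(bam) + ".bai"),   # step 2: xxx.bam.bai (samtools-style)
        bam.with_suffix(".bai"),   # step 3: xxx.bai (picard-style)
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    # Step 4: nothing found, so fail.
    raise FileNotFoundError(f"no index found for {bam}")
```

Because the order is fixed, the result depends only on which files exist, so "xxx.bam.bai" wins whenever both names are present.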
Thanks for the reply. In your new branch, it looks like bam-file handling has been pushed down to Rsamtools; this should mean that picard-style indexes are handled (and if not, I'll move this issue to the Rsamtools repository) and that this is no longer an issue. All the best