LTLA / csaw

Clone of the Bioconductor repository for the csaw package.

Home Page: https://bioconductor.org/packages/devel/bioc/html/csaw.html

"xxx.bam" indexes as "xxx.bai" or "xxx.bam.bai"

russHyde opened this issue

I was wondering whether you'd accept a pull request to modify how BAM indexes are found by csaw.

Currently, for a given "xxx.bam", the csaw functions die if xxx.bam.bai can't be found.

As a suggestion, I think csaw should allow either "xxx.bam.bai" (samtools default) or "xxx.bai" (picard default) names for indexes (and die if neither is present; choose the newer of the two if both are present).

Ah, at last. 5 years ago I wondered whether I should provide this option, but I decided against it for simplicity. I was expecting more frequent complaints about this, but yours is the first.

> I was wondering whether you'd accept a pull request to modify how BAM indexes are found by csaw.

In principle, yes. In practice, I'm still waiting on events that will allow #4 to go through. You could make a PR off htsfree, but that's a moving target in itself... you're more than welcome to give it a go, though. I have some suggestions for how to do this if you're interested.

> As a suggestion, I think csaw should allow either "xxx.bam.bai" (samtools default) or "xxx.bai" (picard default) names for indexes (and die if neither is present; choose the newer of the two if both are present).

I would prefer the behaviour to be a bit simpler, and avoid checking time stamps:

  1. Check if indices are supplied as part of the input. If so, use them.
  2. Look for "xxx.bam.bai" (".bai" appended to each BAM file name); if the files are there, use them.
  3. Look for "xxx.bai" (".bam" replaced by ".bai"); if the files are there, use them.
  4. Die.
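
For illustration only, the lookup order above could be sketched roughly as follows. This is written in Python rather than R, and it is not csaw's or Rsamtools' actual code; the function name `find_bam_index` and its `index_path` argument are hypothetical names chosen for the example.

```python
import os
from typing import Optional


def find_bam_index(bam_path: str, index_path: Optional[str] = None) -> str:
    """Return an index path for bam_path using the four-step order above.

    Hypothetical helper for illustration; not part of csaw or Rsamtools.
    """
    # Step 1: an explicitly supplied index is used as given.
    if index_path is not None:
        return index_path

    # Step 2: samtools-style name, "xxx.bam" -> "xxx.bam.bai".
    samtools_style = bam_path + ".bai"
    if os.path.isfile(samtools_style):
        return samtools_style

    # Step 3: picard-style name, "xxx.bam" -> "xxx.bai".
    stem, _ext = os.path.splitext(bam_path)
    picard_style = stem + ".bai"
    if os.path.isfile(picard_style):
        return picard_style

    # Step 4: neither candidate exists, so fail.
    raise FileNotFoundError(f"no index found for {bam_path!r}")
```

Because the samtools-style name is checked first, no time stamps need to be compared when both files are present.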

Thanks for the reply. In your new branch, it looks like BAM file handling has been pushed down to Rsamtools. That should mean picard-style indexes are handled (if not, I'll move this issue to the Rsamtools repository), so this is no longer an issue here. All the best.