shah-rohan / SmartMap

intermediate files and final BAM or BEDPE

fgualdr opened this issue

Hi,
I'm looking into SmartMap for several ongoing projects.
I am currently trying to assemble a similar pipeline, and I would like to use SmartMap instead.

One thing I would like to know is whether it is possible to retrieve the intermediate files and, more generally, whether it is feasible to get a final BAM in which only one mapping per multi-mapped read is kept, chosen according to the weights and scores described in the paper.

I am asking because several downstream tools cannot take bedGraphs as input, such as differential peak and gene callers (e.g. DESeq, which needs raw read counts per feature).
I started digging into the SmartMap code, but I am more of a Python person than a C++ one, so I found it difficult to add such features.

Any insight is appreciated.
Thanks a lot
Francesco

Hi, what intermediate files in particular are you trying to keep?

With regard to the final BAM in which only one multi-map is kept, that is indeed possible. In the SmartMap software, you can use the -r flag to output the final weights for each read. The command would then look like:

SmartMap [other options] -r -g [genome length file] -o [output prefix] [BED or BED.gz file input(s)]

This will output a file with the coordinates for each mapping, the read ID, and the score for that particular mapping. You can then use Python, awk, or the tool of your choice to select the mapping for each read ID with the highest score and keep the coordinates to put into a BED file to be converted directly into a BAM.
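
For the selection step, here is a minimal Python sketch. It assumes the -r output is tab-separated with columns in the order chromosome, start, end, read ID, score; that layout is an assumption on my part, so check your actual file and adjust id_col and score_col if it differs. The function names are just for illustration and are not part of SmartMap.

```python
import csv


def best_mapping_per_read(lines, id_col=3, score_col=4):
    """Keep the highest-scoring mapping for each read ID.

    `lines` is any iterable of tab-separated text lines (an open file works).
    The column positions are assumptions about the -r output layout.
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment/header lines
        read_id = row[id_col]
        score = float(row[score_col])
        # Strict '>' means the first mapping seen wins when scores tie.
        if read_id not in best or score > float(best[read_id][score_col]):
            best[read_id] = row
    return best


def write_bed(best, out):
    """Write one tab-separated line per read to the open file `out`."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in best.values():
        writer.writerow(row)


def select_best(in_path, out_path):
    """Read a weights file from in_path and write the filtered BED to out_path."""
    with open(in_path, newline="") as src:
        best = best_mapping_per_read(src)
    with open(out_path, "w", newline="") as dst:
        write_bed(best, dst)
```

Calling select_best("weights.bed", "best.bed") (both file names hypothetical) would leave one line per read in best.bed. The result is not coordinate-sorted, so sort it before conversion if your BED-to-BAM tool expects that. Ties are broken by file order here; if you would rather pick among tied mappings at random, that is a small change in the comparison.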

Hope that helps!

-Rohan