Comparing Results with Multiple Samples with Same Name
brcopeland opened this issue · comments
I have a WGS pipeline, and when a sample has multiple read groups (one BAM per read group), I want to confirm they all correspond to the same individual before merging them. I tried following the instructions here and found that somalier kept overwriting the same file, which I realized is because each BAM is labeled with the same SM tag in the @RG line of the header. I was able to work around this by writing the somalier extract output into separate directories and renaming the resulting files. Upon running somalier relate, however, I find that all comparisons in, for example, somalier.pairs.tsv reference the same sample name again. If there is a relatedness problem, this would make it difficult to infer which BAM(s) caused it.
I could of course reheader the BAMs to give them distinct SMs, but I would prefer not to have to do that just for this step. Do you have any suggestions for how to accomplish this?
Hi, you can use the --sample-prefix argument to somalier extract for this. Just give each file a unique --sample-prefix and then you'll be able to distinguish them in the output.
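A rough sketch of how that could look for one BAM per read group. The file names (rg1.bam, sites.vcf.gz, reference.fa), the output directory, and the idea of deriving the prefix from the BAM file name are all assumptions for illustration; only the extract options (-d, --sites, -f, --sample-prefix) come from somalier itself. The script only prints the commands, so you can check them before running anything.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs: adjust to your own pipeline.
sites="sites.vcf.gz"     # somalier sites file (assumed name)
ref="reference.fa"       # reference FASTA (assumed name)
outdir="extracted"       # directory for the extracted output (assumed name)

# Derive a unique prefix from the BAM file name, e.g. /data/rg1.bam -> rg1_
# (one possible scheme; any string unique per BAM would do).
prefix_for() {
  local bam="$1"
  local base
  base="$(basename "$bam" .bam)"
  printf '%s_' "$base"
}

# Print the extract command for one BAM without executing it.
extract_cmd() {
  local bam="$1"
  printf 'somalier extract -d %s --sites %s -f %s --sample-prefix %s %s\n' \
    "$outdir" "$sites" "$ref" "$(prefix_for "$bam")" "$bam"
}

# Hypothetical per-read-group BAMs that all carry the same SM tag.
for bam in rg1.bam rg2.bam rg3.bam; do
  extract_cmd "$bam"
done
```

Once the printed commands look right, they can be piped to a shell to run them, followed by somalier relate on the extracted files. The sample names in somalier.pairs.tsv should then carry the per-BAM prefix, so a mismatched read group can be traced back to its BAM.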