brentp / mosdepth

fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing

Unspecific naming of region in summary file

Krannich479 opened this issue · comments

Dear Brent,
first of all, mosdepth is super easy to install with bioconda, extremely fast, and works like a charm. Overall a great tool!

  • WHAT'S THE ISSUE:
    I think I am seeing a minor bug with the region naming in the summary file. I have a BED file in the format
    NC_045512.2 21533 25790 spike
    to focus on the "spike" region. The corresponding output line in the summary report is
    NC_045512.2_region 4257 42824433 10059.77 672 24214
    which seems about right, judging by the depth I observe in IGV. However, the region name NC_045512.2_region is not what I expected.

  • WHAT I EXPECTED:
    I expected an output line in the format
    spike 4257 42824433 10059.77 672 24214
    or, to comply with your suffix,
    spike_region 4257 42824433 10059.77 672 24214

  • ADDITIONAL INFO:
    I am not very familiar with Nim, but as far as I can tell the issue is somewhere around this line:

    write_summary(target.name & "_region", chrom_region_stat, fh_summary)

    Maybe this should be
    write_summary(region & "_region", chrom_region_stat, fh_summary) ?

Best,
Thomas
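
As a quick sanity check of the summary line quoted above, the numbers can be recomputed from the BED interval. This is only a sketch: it assumes the summary columns are chrom, length, bases, mean, min, max, and that the BED interval is 0-based and half-open. The variable names are illustrative.

```python
# Summary line quoted in the issue.
# Assumed column order: chrom, length, bases, mean, min, max.
line = "NC_045512.2_region 4257 42824433 10059.77 672 24214"
chrom, length, bases, mean, depth_min, depth_max = line.split()
length, bases, mean = int(length), int(bases), float(mean)

# BED interval from the issue (assumed 0-based start, exclusive end).
start, end = 21533, 25790

# With a single region, the reported length should equal the region length,
# and the mean should be the total bases divided by that length.
region_length = end - start
recomputed_mean = round(bases / length, 2)

print(region_length == length)
print(recomputed_mean == mean)
```

Both comparisons hold for this single-region BED file, which matches the impression that the depth itself is right and only the label is surprising.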

Addendum: I just realized, using another test BED file, that the <chromName>_region line in <prefix>.mosdepth.summary.txt contains the sum over all regions on that chromosome.

For reference, what I was looking for is in <prefix>.regions.bed.gz, and the README tells me the 5th column's value is the mean by default. Sorry for all the confusion 🙃
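
For anyone landing here with the same question, the per-region values can be pulled out of the regions file with a few lines of Python. This is a minimal sketch, not part of mosdepth: the function name `read_region_means` is made up for illustration, and it assumes a tab-separated file where the last column is the per-region value (the mean by default) and the 4th column is the region name when the input BED had one.

```python
import gzip


def read_region_means(path):
    """Return a list of (label, value) tuples, one per region line."""
    results = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            chrom, start, end = fields[0], fields[1], fields[2]
            # Assumption: with a named input BED the name is column 4 and the
            # value is column 5; otherwise the value is column 4 and a
            # coordinate-based label is used instead.
            label = fields[3] if len(fields) >= 5 else f"{chrom}:{start}-{end}"
            results.append((label, float(fields[-1])))
    return results
```

Calling `read_region_means("sample.regions.bed.gz")`, where `sample` stands in for whatever prefix was passed to mosdepth, would then give one entry per BED region, for example a `spike` entry with its own mean rather than the per-chromosome aggregate from the summary file.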

Case closed.