NationalGenomicsInfrastructure / ngi_reports

Code to generate reports for use by the NGI in SciLifeLab

Home Page: http://nationalgenomicsinfrastructure.github.io/ngi_reports/

Problem parsing PERCENT_DUPLICATION from metrics file

Smeds opened this issue

The resulting ign_aggregate_report has no value for duplication_rate when the duplication value in the Picard metrics file uses a comma (,) instead of a period (.) as the decimal separator.

Example

## METRICS CLASS        picard.sam.DuplicationMetrics
LIBRARY UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED     UNMAPPED_READS UNPAIRED_READ_DUPLICATES        READ_PAIR_DUPLICATES READ_PAIR_OPTICAL_DUPLICATES    PERCENT_DUPLICATION     ESTIMATED_LIBRARY_SIZE
SX444_235.v1    4079278 292064184       8029922 1371353 26034821        3672918 0,090854        1762198382

## HISTOGRAM    java.lang.Double
BIN     VALUE
1.0     1,01171
2.0     1,868899
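
To illustrate the failure and one possible workaround, here is a minimal sketch of a parser that accepts either decimal separator. It is not the code used by ign_aggregate_report; the function names parse_decimal and read_percent_duplication are hypothetical, the input is assumed to be tab-separated, and the example row keeps only three of the columns above for brevity.

```python
def parse_decimal(text):
    """Convert a numeric string to float, accepting ',' or '.' as the decimal separator.

    Assumption: the value has no thousands separators. A string such as
    '1,234.5' would still raise ValueError.
    """
    return float(text.strip().replace(",", "."))


def read_percent_duplication(lines):
    """Return PERCENT_DUPLICATION from the lines of a Picard duplication metrics file."""
    rows = iter(lines)
    for line in rows:
        if line.startswith("## METRICS CLASS"):
            # The header row and the first data row follow the METRICS CLASS line.
            header = next(rows).rstrip("\n").split("\t")
            values = next(rows).rstrip("\n").split("\t")
            record = dict(zip(header, values))
            return parse_decimal(record["PERCENT_DUPLICATION"])
    raise ValueError("no METRICS CLASS section found")


# Shortened version of the example above (hypothetical helper input).
example = [
    "## METRICS CLASS\tpicard.sam.DuplicationMetrics\n",
    "LIBRARY\tPERCENT_DUPLICATION\tESTIMATED_LIBRARY_SIZE\n",
    "SX444_235.v1\t0,090854\t1762198382\n",
]
duplication_rate = read_percent_duplication(example)
```

A plain float("0,090854") raises ValueError in Python, which is one way a report could end up without a duplication value.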

Simple question - why use Swedish number formatting? It might be easier to change that than to go through our whole code base making it compatible.

We actually don't know why we had Swedish number formatting; it happened for a small subset of samples in a project.

Just pitching in - it's probably related to the locale, which can be exported from the ssh session if I'm not mistaken. So if somebody runs something from a shell that has a Swedish locale, that might carry over into the Java program.
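
If the locale is indeed the cause, one way to guard against it would be to launch the tool with a fixed locale in its environment rather than inheriting the caller's. The sketch below is an assumption about how that could be done from Python, not something taken from this repository; with_c_locale is a hypothetical helper, and the child process here is just a Python one-liner standing in for the real command. On most Unix systems a JVM takes its default locale from these variables, and the user.language and user.country system properties are another option.

```python
import os
import subprocess
import sys


def with_c_locale(base_env=None):
    """Return a copy of the environment with the locale forced to 'C'.

    LC_ALL takes precedence over LANG and the other LC_* variables, so
    number formatting in the child should not follow the caller's locale.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = "C"
    env["LANG"] = "C"
    return env


# Simulate a caller whose shell exported a Swedish locale.
swedish_env = dict(os.environ, LANG="sv_SE.UTF-8")

# Stand-in for the real command: the child only reports the locale it sees.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"],
    env=with_c_locale(swedish_env),
    capture_output=True,
    text=True,
    check=True,
)
child_lc_all = result.stdout.strip()
```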

Closing this issue as ign_aggregate_report has been retired.