Problem parsing PERCENT_DUPLICATION from metrics file

Question

Smeds opened this issue 9 years ago · comments

The resulting ign_aggregate_report will have no value for duplication_rate when the duplication value in the Picard metrics file contains "," instead of "." as the decimal separator.

Example

## METRICS CLASS        picard.sam.DuplicationMetrics
LIBRARY UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED     UNMAPPED_READS UNPAIRED_READ_DUPLICATES        READ_PAIR_DUPLICATES READ_PAIR_OPTICAL_DUPLICATES    PERCENT_DUPLICATION     ESTIMATED_LIBRARY_SIZE
SX444_235.v1    4079278 292064184       8029922 1371353 26034821        3672918 0,090854        1762198382

## HISTOGRAM    java.lang.Double
BIN     VALUE
1.0     1,01171
2.0     1,868899
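
For illustration, here is a minimal Python sketch of a reader that tolerates either decimal separator. It is not the ign_aggregate_report code. The function names `parse_decimal` and `read_percent_duplication` are hypothetical, and the sketch assumes the metrics file is tab-separated with the header and value rows directly after the `## METRICS CLASS` line, and that values never contain thousands separators.

```python
def parse_decimal(text):
    """Parse a number that may use ',' instead of '.' as the decimal separator.

    Assumption: the value has no thousands separators (e.g. not '1.234,5').
    """
    return float(text.strip().replace(",", "."))


def read_percent_duplication(lines):
    """Return PERCENT_DUPLICATION from DuplicationMetrics lines, or None.

    Assumption: fields are tab-separated, and the header row and the value
    row directly follow the '## METRICS CLASS' line.
    """
    lines = iter(lines)
    for line in lines:
        if line.startswith("## METRICS CLASS"):
            header = next(lines, "").rstrip("\n").split("\t")
            values = next(lines, "").rstrip("\n").split("\t")
            if "PERCENT_DUPLICATION" not in header or len(values) != len(header):
                return None
            return parse_decimal(values[header.index("PERCENT_DUPLICATION")])
    return None
```

Called with the lines of the example file above (with tabs between fields), `read_percent_duplication` would return the duplication value as a float whether the file says `0,090854` or `0.090854`.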

Phil Ewels · Answer 1 · Wed Dec 09 2015 19:45:03 GMT+0800 (China Standard Time)

Simple question - why use Swedish number formatting? It might be easier to change that than to go through our whole code base making it compatible.

Patrik Smeds · Answer 2 · Wed Dec 09 2015 22:31:15 GMT+0800 (China Standard Time)

We actually don't know why we had Swedish number formatting. It happened for a small subset of samples in a project.

Johan Dahlberg · Answer 3 · Wed Dec 09 2015 23:18:49 GMT+0800 (China Standard Time)

Just pitching in - it's probably related to the locale, which can be exported from the ssh session if I'm not mistaken. So if somebody runs something from a shell that has a Swedish locale, that might carry over into the Java program.
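
If the locale is indeed the cause, one possible workaround is to launch the Java program with a neutral locale in its environment. The sketch below is an assumption about how that could be done from Python, not something confirmed in this thread. The helper name `c_locale_env` is hypothetical, and it relies on the JVM picking up its default locale from these environment variables on Unix-like systems.

```python
import os


def c_locale_env(base=None):
    """Return a copy of the environment with the locale forced to 'C'.

    'base' defaults to the current process environment. The original
    mapping is not modified.
    """
    env = dict(os.environ if base is None else base)
    # LC_ALL overrides the other locale variables; LANG and LC_NUMERIC
    # are set as well so no Swedish setting is left behind.
    for name in ("LC_ALL", "LC_NUMERIC", "LANG"):
        env[name] = "C"
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting the Java command, so a Swedish locale inherited from the login shell does not reach it.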

Anandashankar Anil · Answer 4 · Mon Nov 09 2020 21:06:34 GMT+0800 (China Standard Time)

Closing this issue as ign_aggregate_report has been retired.