mlin / PhyloCSF

Phylogenetic analysis of multi-species genome sequence alignments to identify conserved protein-coding regions

Home Page: http://compbio.mit.edu/PhyloCSF

max_score is incorrect when --allScores option is used

iljungr opened this issue

Run with these arguments:
12flies -f3 --allScores --orf=ATGStop --minCodons=1 Bug.fa --species=dmel,dsim

Result:
Bug.fa orf_score(decibans) -90.7230 93 215
Bug.fa orf_score(decibans) -9.3641 100 111
Bug.fa orf_score(decibans) -15.4172 160 237
Bug.fa orf_score(decibans) -29.8306 65 154
Bug.fa orf_score(decibans) -40.1296 20 154
Bug.fa max_score(decibans) -90.7230 93 215

max_score is clearly not the maximum score here: it is reported as -90.7230, while the highest orf_score listed is -9.3641. Without the --allScores option, max_score is correctly reported as -9.364. In other cases, the reported max_score is not the score with the largest absolute value, not the first score listed, and sometimes not even one of the scores listed. (Those cases should be listed and reported separately.) Similar problems occur with minCodons=25 and with 12 flies.
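The mismatch can be checked mechanically. Below is a minimal Python sketch, not part of PhyloCSF, that parses the output above and compares the reported max_score against the maximum of the listed orf_score values. The helper name `check_max_score` is hypothetical. It assumes the whitespace-separated layout shown in the Result, with the label in the second column and the score in the third.

```python
def check_max_score(output: str) -> tuple[float, float]:
    """Return (reported max_score, maximum of the listed orf_score values)."""
    orf_scores = []
    reported = None
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or unrelated lines
        label = fields[1]
        # Assumed layout: <file> <label>(decibans) <score> <two more columns>
        if label.startswith("orf_score"):
            orf_scores.append(float(fields[2]))
        elif label.startswith("max_score"):
            reported = float(fields[2])
    if reported is None or not orf_scores:
        raise ValueError("need at least one orf_score line and one max_score line")
    return reported, max(orf_scores)


# Output copied from the Result section of this report.
RESULT = """\
Bug.fa orf_score(decibans) -90.7230 93 215
Bug.fa orf_score(decibans) -9.3641 100 111
Bug.fa orf_score(decibans) -15.4172 160 237
Bug.fa orf_score(decibans) -29.8306 65 154
Bug.fa orf_score(decibans) -40.1296 20 154
Bug.fa max_score(decibans) -90.7230 93 215
"""

reported, expected = check_max_score(RESULT)
print(f"reported max_score = {reported}, max of orf_scores = {expected}")
if reported != expected:
    print("MISMATCH: max_score is not the maximum of the listed orf_scores")
```

On the output above, the two values differ, which is the bug being reported.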

Here's Bug.fa:

>dmel
GATAGACATCAATTTGAAAAATGGGCCAAGAGAGCAGGAGCAACGAAAAACAAACACGGCGAACAATGGG
CTACCCAAAAGCAGGCGTAGAACATGAGGAATGGATTTGTTTTAGGATTTCGATTTGGAAACACCCAGTA
TTTGCAACTTGTATATAGATATGACTTTCAGTCGGTCCCCGTTAAATGTGTTGTTATACGGAACAGTCCT
TTCACGTAAACAGCTATCCCAGGACTCTTGAAGCCAGACGGCGACCTATGTGTACTCAACGTTACT
>dsim
GATAGACATCAATTTGAAAAATGGGCCAAAAGAGCAGGAGCAACGAAAAACAAACACGGCGAACAATGGG
CTACCGAAAAGCAGGCGTGGAACATGAGGAATGGGTTTGTTTTAGGATTTCGATTTGGAAACATCCTATT
TTTGCACCTAGTATATAGTTATGACTTTCAGTCGGTCCCCATAAAATGTGTTGTTATATAGAACTGTCCT
CTCACGTAAACAGCCATCCCAGGACTCTTGAAGCCAGACGGCGACCTATGTATACTCAACGTTAGT