mlin / PhyloCSF

Phylogenetic analysis of multi-species genome sequence alignments to identify conserved protein-coding regions

Home Page: http://compbio.mit.edu/PhyloCSF

Some ORFs are missing when --allScores option is used

iljungr opened this issue

Run with these arguments:
12flies -f3 --allScores --orf=ATGStop --minCodons=1 Bug.fa --species=dmel,dsim
Result:
Bug.fa orf_score(decibans) 14.5636 3 14
Bug.fa max_score(decibans) 20.6305 24 35

No orf_score line is reported for the ORFs at 24-35, 42-53, and 63-74, even though one of them (24-35) appears in the max_score line. When PhyloCSF is run on each of these ORFs individually, it computes a score for it.

Here is Bug.fa:

>dmel
aaaatgccctttgggtagcccaaaatgccctttggttagaaaatgccctttgggtagcccaaaatgccctttgggtag
>dsim
aaaatgcctttcggatagcccaaaatgcctttcggatagaaaatgcctttcggatagcccaaaatgcctttcggatag
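
For anyone who wants to double-check the ORF coordinates quoted above, here is a minimal Python sketch that scans the dmel row for ATG-to-stop ORFs in all three forward frames. It is not PhyloCSF's own ORF finder. The function name `find_atg_stop_orfs` is hypothetical, and the coordinate convention (0-based, inclusive, stop codon excluded) is an assumption inferred from the "3 14" in the reported output.

```python
# Illustrative only: a simple ATG..stop scan, not PhyloCSF's implementation.
STOPS = {"taa", "tag", "tga"}

def find_atg_stop_orfs(seq, min_codons=1):
    """Return (start, end) pairs, 0-based inclusive, stop codon excluded (assumed convention)."""
    seq = seq.lower()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "atg":
                start = i  # first in-frame ATG opens an ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i - 1))  # end just before the stop codon
                start = None
    return sorted(orfs)

dmel = ("aaaatgccctttgggtagcccaaaatgccctttggttag"
        "aaaatgccctttgggtagcccaaaatgccctttgggtag")

orfs = find_atg_stop_orfs(dmel)
print(orfs)
```

Under those assumptions the scan finds four ORFs, at 3-14, 24-35, 42-53, and 63-74, so four orf_score lines would be expected rather than one.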

Thanks, this and #8 were due to a lazy data structure that should have been eager. It arose only when using --allScores.
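
For readers unfamiliar with this class of bug, here is a generic Python sketch of how a lazy sequence can silently drop results when it is read more than once. It is not the PhyloCSF code (which is written in OCaml) and does not claim to reproduce the actual defect. The `score_orfs` function and its scores are hypothetical placeholders.

```python
# Generic illustration of lazy vs. eager evaluation; not the actual PhyloCSF bug.

def score_orfs(orfs):
    """Hypothetical scorer: lazily yields (start, end, score) one ORF at a time."""
    for start, end in orfs:
        yield (start, end, float(end - start + 1))  # placeholder score

orfs = [(3, 14), (24, 35), (42, 53), (63, 74)]

# Lazy: the generator can be traversed only once.
lazy = score_orfs(orfs)
best_lazy = max(lazy, key=lambda r: r[2])   # this pass exhausts the generator
remaining_lazy = list(lazy)                 # nothing is left to report

# Eager: materialize the results first, then reuse them freely.
eager = list(score_orfs(orfs))
best_eager = max(eager, key=lambda r: r[2])
all_eager = list(eager)                     # every ORF is still available

print(len(remaining_lazy), len(all_eager))
```

In the lazy version the second pass sees an empty sequence, while the eager version keeps all four results available for both the best-score pass and the per-ORF pass.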