Question about min-orf change for blastx in v2.1.0
Stikus opened this issue · comments
Hello, thanks for the great tool.
We are using https://github.com/humanlongevity/HLA in our pipelines, and it provides great results but is unmaintained, unfortunately.
One of its scripts uses diamond blastx:
open(IN, "diamond blastx -t . -C 20000 --index-mode 1 --seg no --min-score 10 --top 20 -c 1 -d $root/data/hla -q $fastq_file -f tab --quiet -o /dev/stdout |") or die $!;
We changed this command as suggested in issue #140 (comment):
open(IN, "diamond blastx -t . --sensitive --masking 0 --min-score 10 --top 20 -c 1 -d $root/data/hla -q $fastq_file -f tab --quiet -o /dev/stdout |") or die $!;
But after this change in DIAMOND 2.1.0:
The blastx mode will now mask any open reading frame below the minimum required length as specified by --min-orf.
The blastx mode will only count unmasked letters towards the block size.
Results changed.
According to the wiki:
--min-orf/-l #
Ignore translated sequences that do not contain an open reading frame of at least this length. By default this feature is disabled for sequences of length below 30, set to 20 for sequences of length below 100, and set to 40 otherwise. Setting this option to 1 will disable this feature.
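To make the default rule quoted above easier to check against our read lengths, here is a small sketch of it as a shell function. It only restates the wiki text; it is not taken from the DIAMOND source. The function name default_min_orf is made up for illustration, and printing 0 for "disabled" is our own convention, not a DIAMOND value.

```sh
# Illustration only: restates the wiki's description of the --min-orf default.
# Not DIAMOND's code; default_min_orf is a hypothetical helper name.
default_min_orf() {
  # $1 = sequence length, in the same units the wiki uses
  if [ "$1" -lt 30 ]; then
    echo 0    # feature disabled (0 is our own marker for "disabled")
  elif [ "$1" -lt 100 ]; then
    echo 20
  else
    echo 40
  fi
}

default_min_orf 150   # prints 40
```

If this reading of the wiki is right, sequences of length 100 or more would fall under the 40 threshold unless --min-orf is set explicitly.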
We can add --min-orf 1 to the command. We tested it, and it brings back the old results. But should we? Can anyone give us advice?
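For reference, this is roughly the invocation we tested, written as a shell function instead of the Perl open() call. All flags come from the commands above plus --min-orf 1. The function name run_diamond_blastx and the DIAMOND environment variable (used here only to swap in echo for a dry run) are our own assumptions, not part of DIAMOND or the HLA scripts.

```sh
# Sketch of the modified call with --min-orf 1 added.
# run_diamond_blastx and the DIAMOND variable are hypothetical names.
run_diamond_blastx() {
  # $1 = HLA tool root directory, $2 = input FASTQ file
  "${DIAMOND:-diamond}" blastx -t . --sensitive --masking 0 \
    --min-score 10 --top 20 -c 1 --min-orf 1 \
    -d "$1/data/hla" -q "$2" -f tab --quiet -o /dev/stdout
}

# Dry run: print the arguments instead of running diamond.
DIAMOND=echo run_diamond_blastx /path/to/HLA reads.fastq
```

Dropping the DIAMOND=echo prefix runs the real binary, assuming diamond is on the PATH.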
Using this will produce more accurate results at the expense of longer runtime. So you should probably use it if run time is not an issue.
@bbuchfink results with --min-orf 1 are more accurate, correct? Thanks for the fast answer!
@bbuchfink results with --min-orf 1 are more accurate, correct?
yes
Thanks, closing issue