jeffdaily / parasail

Pairwise Sequence Alignment Library

Indexing error in alphabet_aliases

shenker opened this issue

Unless I'm mistaken, I believe i+=1 should be i+=2 in two places:

parasail/src/traceback.c

Lines 43 to 50 in 600fb26

    for (i=0; i<aliases_size; i+=1) {
        if (alphabet_aliases[i] == a) {
            matches |= alphabet_aliases[i+1] == b;
        }
        else if (alphabet_aliases[i+1] == a) {
            matches |= alphabet_aliases[i] == b;
        }
    }

    for (i=0; i<aliases_size; i+=1) {
        if (alphabet_aliases[i] == a) {
            matches |= alphabet_aliases[i+1] == b;
        }
        else if (alphabet_aliases[i+1] == a) {
            matches |= alphabet_aliases[i] == b;
        }
    }

As is, alphabet_aliases = "AaBb" would treat a and B as equivalent, because the i=1 iteration compares the adjacent characters a and B as if they formed a pair. My understanding of the documentation is that alphabet_aliases = "AaBb" should imply the two equivalence pairs A/a and B/b but not a/B.
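
To make the difference concrete, here is a minimal sketch of the pair lookup with a selectable stride. The name `aliases_match` and its `stride` parameter are hypothetical and not part of parasail; only the loop body mirrors the snippet quoted above. The `i + 1 < aliases_size` guard is an added assumption so that the sketch never indexes past the end of the string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of parasail: returns 1 if a and b are
 * listed as an alias pair in `aliases`, scanning with the given stride.
 * stride == 1 reproduces the loop quoted in the issue; stride == 2 is
 * the proposed change, which visits only the pairs (0,1), (2,3), ... */
static int aliases_match(const char *aliases, size_t aliases_size,
                         char a, char b, size_t stride)
{
    int matches = 0;
    size_t i;

    assert(aliases != NULL);
    assert(stride == 1 || stride == 2);

    /* Added guard (assumption): keep aliases[i + 1] in range. */
    for (i = 0; i + 1 < aliases_size; i += stride) {
        if (aliases[i] == a) {
            matches |= aliases[i + 1] == b;
        }
        else if (aliases[i + 1] == a) {
            matches |= aliases[i] == b;
        }
    }
    return matches;
}
```

With "AaBb", a call using stride 1 reports a and B as a match (the overlapping window at i=1), while stride 2 reports only A/a and B/b.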

Additionally, it might be nice to state explicitly in the documentation that each equivalence pair needs to be specified in only one order (i.e., alphabet_aliases = "AB" is equivalent to alphabet_aliases = "ABBA").
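
The ordering claim follows from the loop testing each pair in both directions. A small sketch under the same assumptions (a stride of 2, hypothetical helper names `pair_listed` and `same_equivalences`, nothing here taken from parasail's API) compares two alias strings over a set of symbols:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if (a, b) is an alias pair in `aliases`,
 * reading the string as consecutive pairs (stride of 2). */
static int pair_listed(const char *aliases, char a, char b)
{
    size_t n, i;
    int matches = 0;

    assert(aliases != NULL);
    n = strlen(aliases);
    for (i = 0; i + 1 < n; i += 2) {
        if (aliases[i] == a) {
            matches |= aliases[i + 1] == b;
        }
        else if (aliases[i + 1] == a) {
            matches |= aliases[i] == b;
        }
    }
    return matches;
}

/* Hypothetical helper: 1 if alias strings x and y give the same answer
 * for every ordered pair of characters drawn from `symbols`. */
static int same_equivalences(const char *x, const char *y,
                             const char *symbols)
{
    size_t n = strlen(symbols), i, j;

    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            if (pair_listed(x, symbols[i], symbols[j]) !=
                pair_listed(y, symbols[i], symbols[j])) {
                return 0;
            }
        }
    }
    return 1;
}
```

Calling `same_equivalences("AB", "ABBA", "ABC")` checks all nine ordered pairs over A, B, and C; a check of this shape could also serve as a starting point for a unit test.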

Happy to submit a PR if that would help.

I'd be happy to accept a PR if you can also add a unit test that exercises the problem.