smarco / WFA2-lib

WFA-lib: Wavefront alignment algorithm library v2

Other Landau-Vishkin implementations

RagnarGrootKoerkamp opened this issue · comments

Are you aware of any other competitive / comparable in speed implementation of the basic Landau-Vishkin algorithm that WFA extends?

I tried looking for one, but couldn't find any, and since you also don't compare to them, this may not exist?

I'm not sure what you are looking for. I understand that you are not looking for classical pairwise alignment methods, but for diagonal-transition algorithms (like Landau-Vishkin, O(ND), and WFA). Right?

In that case, I suggest looking into lv89 from @lh3, wfalm from @jeizenga, and DALIGNER from G. Myers.

I hope this helps.
Cheers,

I asked because I was expecting a comparison to one such tool in your paper, since this is the algorithm you extend. The fact that you did not include one seems to indicate that no (competitive) implementation of the diagonal-transition method was available at the time.

  • lv89 seems to be a toy project (and is very recent)
  • wfalm I may include, but in their experiments it seems slower than normal WFA (although its lower memory use may be useful, as WFA does run out of memory on our larger tests. I'll also run WFA2 with the less memory options.)
  • DALIGNER seems to be mostly for local alignments.

Anyway, I just noticed the sentence saying:

> We discarded other methods from the evaluation as their running time was exceedingly long or because their recall was substantially below par.

So I suppose that answers my question :)

Agreed.

> I'll also run WFA2 with the less memory options.

Yes, please!

Let me know if you have any other questions or remarks. These are highly appreciated.

Cheers,

I forwarded @RagnarGrootKoerkamp to WFA2. As I was mentioned here, I need to clarify that lv89 is a spinoff of my gwfa. I called it a toy because of its simplicity. It is fairly efficient, though probably not as efficient as WFA2. I failed to get the correct output from WFA2. I will create a separate issue for that.

PS: the wrong result was due to enabling heuristics. It has been fixed on my end. See #7.

> I forwarded @RagnarGrootKoerkamp to WFA2.

Thanks for forwarding. Much appreciated.

> [...] I need to clarify that lv89 is a spinoff of my gwfa. I said it is a toy due to its simplicity. It is fairly efficient, though probably not as efficient as WFA2.

Well, I think it is a good idea. Simplicity is often preferred over complex and hard-to-integrate tools/libraries. I would really love to offer WFA2 as just a header, but obviously, I can't.

> I failed to get the correct output from WFA2. Will create a separate issue for that.

Please, do.

Perhaps of interest: I implemented an adaptive version of the low memory WFA algorithm in wfalm that decides between three WFA variants on-line, which should have largely eliminated the run time difference between wfalm's implementations of standard and low-memory WFA. That said, I have neither rigorously benchmarked it nor compared the speed to implementations in other repositories.

@jeizenga I may also include wfalm in my benchmark. I haven't looked much into your repo yet, though (it would be simpler if you provided something similar to the tools/align-benchmark in this repo ;).

Also, now that we're all here: would you be interested in a Slack/Discord where we could chat more? After I get the preprint of my own aligner out, I'd love to collaborate.

I'd be happy to chat further :)

In the meantime, I do have a benchmarking utility in the wfalm repository (in the test directory), although it is very rudimentary compared to @smarco's. Maybe still useful though.

Sure, happy to keep on talking.

> Well, I think it is a good idea. Simplicity is often preferred over complex and hard-to-integrate tools/libraries. I would really love to offer WFA2 as just a header, but obviously, I can't.

Is a header-only solution possible but requires significant work, or are there other factors preventing this?

Trying to answer the question properly, you can always put all the WFA2 sources in a single .h/.hpp file. This can be done easily. However, I am reluctant to merge >10K lines of code together in a single file. First, I need to maintain & debug this code on my own; I need it very structured and modularized to reduce the complexity as much as possible. Second, I think that compiling against a library, using API/bindings (C/C++/Python), is attainable for most programmers.

So, I think it is not a matter of offering a single file but of producing simple code (i.e., a few lines) that can do the job in a single header. In fact, I have a reduced version of WFA-edit (a single C file) that I usually use for educational purposes. That is why I think that lv89 from @lh3 is a good idea: it is simple and easy to understand.
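
To illustrate how little code the basic diagonal-transition idea needs, here is a minimal sketch of a Landau-Vishkin-style global edit distance in Python. It is not the reduced WFA-edit file mentioned above, nor lv89 or any WFA2 API; the function name `edit_distance_diagonal_transition` and all variable names are hypothetical, and the sketch favors readability over speed (dictionaries instead of packed arrays, no traceback).

```python
def edit_distance_diagonal_transition(a, b):
    """Unit-cost global edit distance via furthest-reaching points per diagonal.

    Illustrative sketch only (hypothetical helper, not part of WFA2, lv89, or wfalm).
    Diagonal k holds cells (i, j) with k = i - j, where i indexes a and j indexes b.
    """
    n, m = len(a), len(b)
    target_diag = n - m  # diagonal that contains the final cell (n, m)

    def extend(i, k):
        # Slide along diagonal k for free while characters match.
        j = i - k
        while i < n and j < m and a[i] == b[j]:
            i += 1
            j += 1
        return i

    # front[k] = furthest i reached on diagonal k with the current score.
    front = {0: extend(0, 0)}
    score = 0
    while front.get(target_diag, -1) < n:
        score += 1
        nxt = {}
        for k in range(-score, score + 1):
            best = -1
            if k in front:          # substitution: i and j both advance
                best = max(best, front[k] + 1)
            if k - 1 in front:      # only i advances (skip a character of a)
                best = max(best, front[k - 1] + 1)
            if k + 1 in front:      # only j advances (skip a character of b)
                best = max(best, front[k + 1])
            # Keep the point inside the (n + 1) x (m + 1) grid.
            best = min(best, n, m + k)
            if best >= 0 and best - k >= 0:
                nxt[k] = extend(best, k)
        front = nxt
    return score
```

For example, `edit_distance_diagonal_transition("kitten", "sitting")` returns 3. Each round touches only the diagonals reachable with the current score, which is why this family of algorithms is fast on similar sequences.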

I hope this clarifies the question.

Yes, it very much does. Thank you for the detailed answer!