Base level alignment #26
Comments
I think WFA is worth exploring, as we currently rely on SSW, which has its issues as a local aligner, as mentioned in issue #54. The method seems to have great performance; see Table 2 in the WFA paper for timings compared with e.g. SeqAn and ksw2. Furthermore, the implementation is mature: WFA2 provides different penalty models (including dual gap cost penalties!), traceback CIGAR, etc. Also, in strobealign the extension step is over 50% of the runtime for many biological datasets; see the 'aln' field for BIO150 and BIO250 in the attached figure for extension with SSW.
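To make the "dual gap cost" point concrete, here is a minimal sketch of how a two-piece affine gap penalty differs from a single affine one. It is not WFA2 code. The function names are hypothetical and the penalty values (6, 2, 24, 1) are only illustrative assumptions. The convention assumed here is that a gap of length `l` costs `open + l * extend`.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Cost of one gap of `length` bases under a single affine model."""
    return gap_open + length * gap_extend


def dual_affine_gap_cost(length, open1, extend1, open2, extend2):
    """Cost under a two-piece affine model: the cheaper of two affine functions.

    The first piece (cheap to open, expensive to extend) wins for short gaps,
    the second piece (expensive to open, cheap to extend) wins for long gaps.
    """
    return min(
        affine_gap_cost(length, open1, extend1),
        affine_gap_cost(length, open2, extend2),
    )


if __name__ == "__main__":
    # Illustrative penalties only (assumption, not taken from this thread).
    for gap_length in (1, 10, 18, 50):
        single = affine_gap_cost(gap_length, 6, 2)
        dual = dual_affine_gap_cost(gap_length, 6, 2, 24, 1)
        print(gap_length, single, dual)
```

With these assumed values the two pieces cross at a gap length of 18, so longer indels are penalized less steeply than under the single affine model, which is the usual motivation for this penalty model on longer reads.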
This will also make it easy to scale to longer sequences, provided your seed chaining can handle them. BiWFA uses space on the order of the divergence, is consequently very cache coherent, and is actually fast even for ~500 bp sequences.
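For readers unfamiliar with why wavefront methods scale with divergence rather than sequence length, here is a rough sketch of the basic idea for plain edit distance. It is a simplified, unidirectional, score-only version written for illustration. It is not BiWFA and not the WFA2 library API, and the function name is hypothetical. Only the most recent wavefront is kept, so memory grows with the number of differences.

```python
def wavefront_edit_distance(pattern, text):
    """Global edit distance computed with furthest-reaching wavefronts.

    A wavefront maps diagonal k = h - v to the furthest text offset h
    reachable with the current score (v is the matching pattern offset).
    """
    n, m = len(pattern), len(text)
    target_diagonal = m - n

    def extend(k, h):
        # Slide along diagonal k for free while characters match.
        v = h - k
        while v < n and h < m and pattern[v] == text[h]:
            v += 1
            h += 1
        return h

    wavefront = {0: extend(0, 0)}
    score = 0
    # Done once the target diagonal reaches the end of the text.
    while wavefront.get(target_diagonal, -1) < m:
        score += 1
        next_wavefront = {}
        for k in range(-score, score + 1):
            best = -1
            if k in wavefront:          # mismatch: advance both sequences
                best = max(best, wavefront[k] + 1)
            if k - 1 in wavefront:      # insertion: consume one text base
                best = max(best, wavefront[k - 1] + 1)
            if k + 1 in wavefront:      # deletion: consume one pattern base
                best = max(best, wavefront[k + 1])
            if best < 0:
                continue
            if best > m or best - k > n:
                continue                # stepped outside the alignment matrix
            next_wavefront[k] = extend(k, best)
        wavefront = next_wavefront
    return score


if __name__ == "__main__":
    print(wavefront_edit_distance("kitten", "sitting"))
```

Each iteration touches at most `2 * score + 1` diagonals, so two similar 500 bp sequences need only a handful of small wavefronts instead of a full 500 x 500 matrix.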
That's a good point. As mentioned in #24, extending to alignment of long reads is one objective. Strobealign's seeds should be very suitable for long reads, so BiWFA may be the way to go from the start. We are currently exploring WFA (#229), but we have yet to find a way to use the library efficiently compared to SSW.
Note to developer:
The extension step (nucleotide-level alignment) is the bottleneck in strobealign. There are three different ways to reduce this: