Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strobealign as reusable library? #39

Open
ksahlin opened this issue Aug 17, 2022 · 2 comments
Open

Strobealign as reusable library? #39

ksahlin opened this issue Aug 17, 2022 · 2 comments
Labels
feature New feature or request

Comments

@ksahlin
Copy link
Owner

ksahlin commented Aug 17, 2022

Adding the request "Please make it a reusable library" from here. Not sure what it entails or mean.

@ksahlin ksahlin added the feature New feature or request label Aug 17, 2022
@marcelm
Copy link
Collaborator

marcelm commented Sep 20, 2022

The idea is likely that other people would not only be able to use StrobeAlign as a command-line tool, but that they can also write their own (C++) programs that use the StrobeAlign algorithms. This would be similar to how we rely on external libraries such as args, doctest, kseq++, StripedSmithWaterman etc.

This requires a couple of changes and may sound like extra work, but the great thing is that nearly all of these changes are needed anyway to make the code more readable, more testable and generally more maintainable.

For example, I would like be able to write a unit test like this for aligning a paired-end read against a reference:

    auto refs = References::from_fasta("tests/phix.fasta");
    auto index = StrobemerIndex(refs);
    auto r1 = KSeq("read", "ACGT", "AAAA");
    auto r2 = KSeq("read", "CCAA", "BBBB");

    StrobemerMapper mapper(index);
    auto mapped = mapper.map_paired(r1, r2);
    auto aligned = mapped.align();
    CHECK(aligned[0].ref_position == 270);
    CHECK(aligned[1].is_unmapped);

Many of the changes I’ve made recently are intended to make the above work (and we’re getting there).

So while my motivation for the moment is to enable testing, the required refactoring forces us to come up with a good API that can then also be used by others. Not sure if we’ll go that far, but StrobeAlign could essentially become the frontend to a libstrobealign. The frontend would only parse the command-line and perhaps set up threads or so, and then it delegates index creation and mapping to the backend (API). This would mirror the relationship between samtools and htslib.

Here are a couple of things that would need to be done. (We can turn them into issues when the time comes.)

  • Never call exit() from API functions, use exceptions instead (as a consumer of the API, you don’t want your own program to exit just because there was a problem in an API call. Perhaps you want to print a nice error message before exiting).
  • Do not log anything to stdout or stderr from within API functions. (Communicate the information some other way and let the main program decide whether it wants to display that info.)
  • Hide implementation details from the public headers. (For example, if a typedef is only needed in the cpp file, remove it from the hpp.)
  • Add documentation.
  • Ensure const correctness (means essentially: add const to reference parameters wherever possible). This makes it clearer which parameters are modified and which aren’t when calling a function.
  • Use consistent, meaningful names
  • Put all of StrobeAlign into a separate namespace (but not the code in main.cpp).
  • Ensure the CMakeLists.txt files are so that the project can be included via ExternalProject_Add from a different project.

These are just some points that come to mind. All of them (except possibly for the very last one), are things I’d recommend doing anyway, so even if we just aim for StrobeAlign to become a reusable library (but never get there), it would be beneficial because we would get better code in the end.

@ksahlin
Copy link
Owner Author

ksahlin commented Sep 20, 2022

Thanks for the detailed answer! Yes, I think this sounds great and worth spending some time on since it is beneficial for the code base regardless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants