Very short soft-clipped ends #54

Closed
marcelm opened this issue Sep 7, 2022 · 1 comment · Fixed by #250

marcelm commented Sep 7, 2022

All the reads in the phiX test dataset happen to start with a single N base (an artifact of picking the first 100 reads from the run and not random ones). For a read that otherwise matches without errors, StrobeAlign reports the alignment as 1S300=.
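To check how widespread such one-base clips are in an output file, the CIGAR strings can be scanned for short soft-clipped ends. The following is only a minimal sketch: `soft_clipped_ends` is a hypothetical helper name, not part of strobealign or any other tool mentioned here, and it assumes a well-formed CIGAR string as defined in the SAM specification.

```python
import re

# One CIGAR operation: a length followed by an operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_ends(cigar):
    """Return (left, right) soft-clip lengths of a CIGAR string.

    Hypothetical helper for illustration only.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside soft clips, so drop them first.
    ops = [(n, op) for n, op in ops if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right

print(soft_clipped_ends("1S300="))  # the alignment reported above
print(soft_clipped_ends("301M"))    # the BWA-MEM style alignment
```

The first call reports one clipped base on the left end; the second reports none.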

I found this to be unexpected because BWA-MEM reports this as 301M (with the N considered to be a mismatch as one can see from the MD tag). BWA-MEM penalizes soft-clipping (option -L) with a default penalty of 5, so it’ll prefer (at most) one mismatch (penalty 4) over soft clipping.
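The arithmetic behind that preference can be sketched as follows. This is a simplified model, not BWA-MEM's actual code: it assumes a match score of 1, the mismatch penalty of 4 and clipping penalty of 5 quoted above, that the only mismatches are the first few bases of the read, and that clipping is chosen only when the clipped score minus the clipping penalty is at least as high as the end-to-end score. The function name `end_cigar` is made up for this example.

```python
MATCH = 1          # assumed score per matching base
MISMATCH = 4       # mismatch penalty, as quoted above
CLIP_PENALTY = 5   # soft-clipping penalty, as quoted above

def end_cigar(read_len, n, clip_penalty=CLIP_PENALTY):
    """Pick a CIGAR for a read whose first n bases mismatch.

    Simplified model: all other bases match the reference.
    """
    # Score if the first n bases are soft-clipped.
    clipped_score = (read_len - n) * MATCH
    # Score if the alignment extends over the n mismatching bases.
    end_to_end_score = clipped_score - n * MISMATCH
    # Extend to the read end unless clipping wins even after its penalty.
    if end_to_end_score > clipped_score - clip_penalty:
        return f"{read_len}M"
    return f"{n}S{read_len - n}M"

print(end_cigar(301, 1))                  # one mismatch costs 4 < 5
print(end_cigar(301, 1, clip_penalty=0))  # no clipping penalty
print(end_cigar(301, 2))                  # two mismatches cost 8 > 5
```

With the penalty of 5, a single leading mismatch is kept in the alignment (`301M`); without a clipping penalty the same read comes out as `1S300M`, and two leading mismatches are clipped either way.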

On the other hand, minimap2 also soft clips and reports 1S300M.

I think that penalizing soft clipping is beneficial when aligning short reads. It is not important for shotgun sequencing, but for targeted sequencing (amplicons), soft clipping single bases introduces a bias: Any variation at that position in the reference cannot be observed. For minimap2, it’s not so important because it is primarily (AFAIK) for longer reads.

This is probably not a high-priority issue, but I wanted to at least write it down because I was surprised when inspecting the test BAM output.

ksahlin commented Sep 9, 2022

I agree with this. It is probably an artefact of using SSW in local alignment mode. I used ksw2 before, but found that its extension mode occasionally yielded strange alignments over indels, hence the switch (anecdata; unfortunately I didn't log the event anywhere). Finding the right third-party extension alignment tool is left for future work.
