Option to avoid ambiguous nucleotides in generated probes #58

Open
med-ss20 opened this issue May 31, 2024 · 1 comment

@med-ss20

Dear Dr. Metsky,

Is there a way to configure CATCH to avoid generating probes with ambiguous nucleotides and only use standard nucleotides (A, T, C, G)? I understand that it will lead to a greater number of probes for the given dataset.

Thank you very much for any advice and this great tool 👍🏻

Regards,
Sviat

@haydenm
Collaborator

haydenm commented Jun 3, 2024

Hi Sviat,

I'm glad that you find CATCH useful. Yes, there is an option to do what you're looking for! It's --expand-n. The help message for that argument is:

Expand each probe so that 'N' bases are replaced by real
bases; for example, the probe 'ANA' would be replaced
with the probes 'AAA', 'ATA', 'ACA', and 'AGA'; this is
done combinatorially across all 'N' bases in a probe, and
thus the number of new probes grows exponentially with the
number of 'N' bases in a probe. If followed by a command-
line argument (INT), this only expands at most INT randomly
selected N bases, and the rest are replaced with random
unambiguous bases (default INT is 3).
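
To make the behavior in that help text concrete, here is a minimal Python sketch of the expansion it describes. It is an illustration only, not CATCH's actual implementation: the function name `expand_n`, the `max_expand` parameter, and the choice to give every expanded probe the same random fill for the non-expanded Ns are all assumptions.

```python
import itertools
import random

BASES = "ATCG"

def expand_n(probe, max_expand=3, rng=None):
    """Illustrative only (hypothetical helper, not CATCH code).

    Combinatorially expands at most `max_expand` randomly chosen 'N'
    positions; any remaining 'N' is replaced by one random base.
    """
    rng = rng or random.Random(0)
    n_positions = [i for i, base in enumerate(probe) if base == "N"]

    # Randomly pick which N positions get the full combinatorial expansion.
    to_expand = sorted(rng.sample(n_positions, min(max_expand, len(n_positions))))
    chosen = set(to_expand)

    # Fill every other N with a single random unambiguous base
    # (assumption: the same fill is shared by all expanded probes).
    template = list(probe)
    for i in n_positions:
        if i not in chosen:
            template[i] = rng.choice(BASES)

    expanded = []
    for combo in itertools.product(BASES, repeat=len(to_expand)):
        seq = template[:]
        for pos, base in zip(to_expand, combo):
            seq[pos] = base
        expanded.append("".join(seq))
    return expanded

print(expand_n("ANA"))  # ['AAA', 'ATA', 'ACA', 'AGA']
```

Each expanded position multiplies the probe count by 4, so a probe with k expanded Ns yields 4^k probes, which is why the limit on expanded positions matters.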

For example, setting --expand-n 10 combinatorially expands up to 10 N nucleotides with real nucleotides, and replaces the rest randomly with real nucleotides. You could set the value to be the probe length if you want to combinatorially expand all Ns. Note that this does not work with non-N ambiguity characters (e.g., Y); if you have those, my suggestion would be to replace them with N in the input.
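
The suggested preprocessing step (replacing non-N ambiguity characters with N in the input) could be sketched as below. This is a minimal example under stated assumptions: the input is a plain FASTA file, the helper names `mask_fasta_lines` and `mask_fasta_file` are hypothetical, and the file paths in the usage comment are placeholders. It is not part of CATCH.

```python
# IUPAC nucleotide ambiguity codes other than N.
IUPAC_NON_N = "RYSWKMBDHV"

# Map each code to N, preserving case (some FASTA files use lowercase).
_TABLE = str.maketrans(
    IUPAC_NON_N + IUPAC_NON_N.lower(),
    "N" * len(IUPAC_NON_N) + "n" * len(IUPAC_NON_N),
)

def mask_fasta_lines(lines):
    """Yield FASTA lines with non-N ambiguity codes replaced by N.

    Header lines (starting with '>') are passed through unchanged.
    """
    for line in lines:
        yield line if line.startswith(">") else line.translate(_TABLE)

def mask_fasta_file(in_path, out_path):
    """Hypothetical convenience wrapper: rewrite one FASTA file to another."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(mask_fasta_lines(src))

# Example usage (placeholder paths):
# mask_fasta_file("input.fasta", "input.masked.fasta")
```

The masked file can then be given to CATCH with `--expand-n`, so that every ambiguous position is treated as an N.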

Hayden
