Don't Count X for AA Diversity Chart #723

Merged
merged 1 commit into from
May 18, 2019

Conversation

emmahodcroft
Member

With the update that allows augur to recover ambiguous sites for FASTA input, many AA sites can now be 'X' at the tips. At the moment, these are counted as 'mutations' in the diversity chart:

Before the change:
[Image: counts-before]

But these should probably be disregarded from 'entropy' and 'count', just as 'N' is in nucleotide data. I introduced a check similar to the one already present for nucleotides in 'counts'. However, I decided that we may be interested in gaps ('-') in AA alignments (these are a little more stringent than gaps in nucleotides, since they must be in-frame), so I allowed gaps to be counted in both 'entropy' and 'counts'. We can change this in future if it turns out not to be informative.
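For readers who want the rule spelled out, here is a minimal Python sketch of the behaviour described above: 'X' is skipped in both the mutation count and the entropy, while '-' is still counted. This is an illustration only, not the code in this PR. The names `IGNORED_AA_STATES`, `count_mutations` and `site_entropy`, and the `(from_state, position, to_state)` tuple format, are all assumptions made up for the example.

```python
import math
from collections import Counter

# Assumption: only 'X' is treated as unknown for amino acids; '-' is kept.
IGNORED_AA_STATES = frozenset({"X"})


def count_mutations(mutations, ignored=IGNORED_AA_STATES):
    """Count mutations per position, skipping any that involve an ignored state.

    `mutations` is a hypothetical iterable of (from_state, position, to_state).
    """
    counts = Counter()
    for from_state, position, to_state in mutations:
        if from_state in ignored or to_state in ignored:
            continue  # e.g. A->X or X->R is not a real observed change
        counts[position] += 1
    return counts


def site_entropy(tip_states, ignored=IGNORED_AA_STATES):
    """Shannon entropy (natural log) of the states at one site across tips.

    Ignored states are dropped before frequencies are computed, so they
    contribute neither to the numerator nor to the total.
    """
    observed = [s for s in tip_states if s not in ignored]
    if not observed:
        return 0.0
    total = len(observed)
    frequencies = (n / total for n in Counter(observed).values())
    return sum(-p * math.log(p) for p in frequencies)


# Usage: the gap at position 12 is counted, the changes involving 'X' are not.
example_mutations = [("A", 10, "X"), ("A", 10, "T"), ("K", 12, "-"), ("X", 12, "R")]
per_site_counts = count_mutations(example_mutations)
entropy_with_gap = site_entropy(["A", "-", "X"])
```

In this sketch a site whose tips are all 'A' or 'X' has zero entropy, whereas a site with an 'A' and a '-' is treated as having two distinct states.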

After the change:
[Image: counts-after]

@jameshadfield
Member

Thanks @emmahodcroft, this seems very sensible. I'll try to take a look at this over the next ~4 days, but if you don't hear from me, feel free to merge 👍

@jameshadfield jameshadfield merged commit 73b69b4 into master May 18, 2019
@jameshadfield jameshadfield deleted the no_X_count_AA branch May 18, 2019 08:24