-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple improvements to augur clades #1199
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Our current implementation of read_node_data requires that every node in the tree is specified in the (merged) node_data files. For mutations this is overkill -- many nodes don't have mutations and it's overkill to require node_data JSONs to specify things like `"node_name": {"muts": []}`. This may well be the general behaviour we want, but i didn't want to modify the read_node_data function which sees extensive use. A welcome side effect of these changes is that we no longer have to supply both nuc and aa_muts.
See comments in tests/functional/clades.t Also adds / updates comments and docstrings which were noticed as I worked through the code relating to these tests.
Workflows may be using this so I elected to hide it rather than remove it (and warn people it's a no-op if they do happen to be using it)
This function had a few subtle bugs in it which are fixed here, as well as improving the warning message to explain how this may affect clade inference. Note that the presence of sequences on nodes other than the root is not considered by augur clades.
We could check all of these up-front instead of exiting upon the first error, and such a check should be part of validation within augur clades, but this commit is a simple solution to fix a reported bug. Closes #965
A fatal error is raised if no clades are defined, but if a clade is not found on the tree it's only a warning. Suggested in #735
Multiple mutations at the same position on a single branch are now a fatal error. Previous behaviour was to overwrite such mutations when parsing. Suggested by #735.
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## branch-labels #1199 +/- ##
=================================================
+ Coverage 68.66% 68.81% +0.15%
=================================================
Files 64 64
Lines 6852 6898 +46
Branches 1677 1693 +16
=================================================
+ Hits 4705 4747 +42
- Misses 1840 1844 +4
Partials 307 307
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
victorlin
added a commit
that referenced
this pull request
May 9, 2023
For commits 007cb47...e5cfc3a.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Contains a bunch of improvements to
augur clades
which should improve usability quite a bit. See commit messagesfor details.
Commit 22e2444 was motivated by discussion with @rneher and @huddlej in #728 regarding labelling clades at the root of the tree. Clades at the root are correctly annotated if either (a) the root reference sequence is supplied in the node-data JSON or (b) the root node has the clade-defining mutations annotated. We do not use mutations observed elsewhere in the tree to infer the root sequence, but we can do this if it would be useful? #1028 is also relevant here.
Checklist
Addresses 4/5 of the remaining tasks in #735
Closes #965
Closes #1153
Closes #1154