generated from nextstrain/pathogen-repo-guide
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[ingest] fix metadata conflicts across segments
As implemented here, mismatched field values across segments (e.g. segments disagree on the 'date') are resolved by choosing the most common value, with the intention that the disagreements are eventually fixed upstream.

This approach is the third implementation. Initially I resolved disagreements within `group_segments.py` via a provided resolutions YAML. After discussion with @joverlee521 we decided this could be better implemented via `augur curate`. The original implementation here did this _after_ the segment grouping; however, this made it impossible to distinguish disagreements which will be fixed from those which won't.¹

NOTE: Here we use accession as the ID; however, using strain name would be better going forward, as it would reduce the duplication needed in the current format. We can't (currently) do this in oropouche because strain names are added _after_ the curate chain runs.

¹ <#18 (comment)>
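For illustration only, the "most common occurrence" rule could be sketched as below. This is not the code from this commit: the function name `resolve_field`, the `group` key that ties segment records together, and the sample records are all hypothetical, and ties are broken by whichever value was seen first.

```python
from collections import Counter, defaultdict


def resolve_field(records, group_key, field):
    """Map each group to the most common value of `field` among its records.

    Hypothetical helper: `group_key` stands in for whatever identifier links
    the segment records of one sample together.
    """
    values_by_group = defaultdict(list)
    for record in records:
        values_by_group[record[group_key]].append(record[field])

    resolved = {}
    for group, values in values_by_group.items():
        # most_common(1) returns [(value, count)]; ties keep first-seen order.
        resolved[group] = Counter(values).most_common(1)[0][0]
    return resolved


# Made-up records: three segments of sample "g1" disagree on 'date'.
records = [
    {"accession": "A1", "group": "g1", "segment": "L", "date": "2024-01-05"},
    {"accession": "A2", "group": "g1", "segment": "M", "date": "2024-01-05"},
    {"accession": "A3", "group": "g1", "segment": "S", "date": "2024-01-06"},
    {"accession": "B1", "group": "g2", "segment": "L", "date": "2023-12-01"},
]

resolved_dates = resolve_field(records, "group", "date")
```

A majority vote like this only papers over the disagreement in the local output, which fits the stated intention that the underlying records still get corrected upstream.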
1 parent
148d76d
commit cbf0822
Showing 2 changed files with 42 additions and 16 deletions.