Include 2m timespan in Nextstrain GISAID and open profiles #957

Merged
merged 5 commits into from
Jun 1, 2022

trvrb
Member

@trvrb trvrb commented May 28, 2022

Description of proposed changes

This PR extends the previous logic of splitting out 6m and all-time timespans from PR #910 to include a new 2m timespan.

[Screenshot taken 2022-05-28 at 10:37:39 AM]

2m was chosen over 1m so that the "logistic growth" calculation, which uses the previous 6 weeks of frequency pivots, works without modification.
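The arithmetic behind this choice can be sketched as below. This is a rough check, not the workflow's code: it assumes the timespan's min date is simply N calendar months before the current date, and `months_before` and `window_covers_lookback` are hypothetical helper names introduced for illustration.

```python
import calendar
from datetime import date, timedelta


def months_before(d: date, n: int) -> date:
    """Return the date n calendar months before d, clamping the day to the month's length."""
    month_index = d.year * 12 + (d.month - 1) - n
    year, month = divmod(month_index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def window_covers_lookback(current: date, months: int, lookback_weeks: int = 6) -> bool:
    """Check whether a window of `months` calendar months spans at least `lookback_weeks` weeks."""
    window_start = months_before(current, months)
    return current - window_start >= timedelta(weeks=lookback_weeks)


if __name__ == "__main__":
    current = date(2022, 5, 28)
    for months in (1, 2):
        print(f"{months}m window covers 6 weeks: {window_covers_lookback(current, months)}")
```

Six weeks is 42 days, which is longer than any single calendar month but shorter than any two consecutive months, so a 1m window would cut off part of the lookback while a 2m window does not.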

With timespans this narrow, there is an unavoidable interaction with how augur filter subsamples based on --vpm, i.e. viruses per month. A common situation: if the current date is, say, May 15, we end up with

  • a min date of March 15
  • augur filter trying to sample viruses equally from the March, April and May categories

so March and May each have about 2 weeks in which to sample X viruses, while April has 4 weeks for the same X viruses. In terms of viruses per day, March and May therefore end up more densely sampled than April.

This effect will be more pronounced in scenarios where the current date is, say, May 28, so that X viruses are sampled from 3 days in March and from 30 days in April.
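To make the density effect concrete, here is a small sketch of the arithmetic. It is not augur's implementation: it assumes exactly X viruses are drawn from each calendar-month category, that every category has enough sequences available, and that both ends of the window are inclusive (so day counts can differ by one from the rough figures above). The function names and the value of X are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def days_per_month(min_date: date, max_date: date) -> dict:
    """Count how many days of each (year, month) fall inside [min_date, max_date]."""
    counts = Counter()
    day = min_date
    while day <= max_date:
        counts[(day.year, day.month)] += 1
        day += timedelta(days=1)
    return dict(counts)


def sampling_density(min_date: date, max_date: date, viruses_per_month: float) -> dict:
    """Viruses per day in each month category, assuming equal counts per month."""
    return {
        month: viruses_per_month / n_days
        for month, n_days in days_per_month(min_date, max_date).items()
    }


if __name__ == "__main__":
    x = 120  # hypothetical X: viruses sampled per month category
    for current in (date(2022, 5, 15), date(2022, 5, 28)):
        min_date = date(2022, 3, current.day)  # 2m window start
        print(f"window {min_date} to {current}")
        for (year, month), density in sampling_density(min_date, current, x).items():
            print(f"  {year}-{month:02d}: {density:.1f} viruses/day")
```

The later in the month the current date falls, the fewer days remain in the first month category, and the higher its per-day density relative to the fully covered middle month.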

To fully address this, we'd need to extend augur filter with the option of per-week sampling categories in addition to per-month sampling categories, or perhaps some continuous specification. However, I don't think this is a big issue for the current PR, and it's something we can refine once Augur is updated.
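As a rough illustration of why per-week categories would help, the sketch below counts days per ISO week over the same kind of window. It is only arithmetic on dates, not a proposal for how augur filter would implement this; `days_per_iso_week` is a hypothetical name and the inclusive endpoints are an assumption.

```python
from collections import Counter
from datetime import date, timedelta


def days_per_iso_week(min_date: date, max_date: date) -> dict:
    """Count how many days of each ISO (year, week) fall inside [min_date, max_date]."""
    counts = Counter()
    day = min_date
    while day <= max_date:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        counts[(iso[0], iso[1])] += 1
        day += timedelta(days=1)
    return dict(counts)


if __name__ == "__main__":
    weeks = days_per_iso_week(date(2022, 3, 15), date(2022, 5, 15))
    for (year, week), n_days in sorted(weeks.items()):
        print(f"{year}-W{week:02d}: {n_days} days")
```

With weekly categories only the first and last weeks can be partial, and a partial week is short by at most 6 days, so per-day density varies far less than with monthly categories.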

cc @victorlin @huddlej for Augur issue: nextstrain/augur#960

Testing

Trial builds are available at:

Release checklist

  • Update docs/src/reference/change_log.md in this pull request to document these changes by the date they were added.
  • Fix bug in Markdown table

After merging of this PR, we should:

  • update manifest_guest.json to allow viewing of 2m datasets
  • revise nextstrain.org/sars-cov-2 to link to different timespans

trvrb added 2 commits May 28, 2022 09:54
- Replicate logic for "6m" profile for "2m" profile
- Use narrower bandwidth for "2m" frequencies
- Update Markdown description to table out builds
@trvrb trvrb self-assigned this May 28, 2022
@trvrb trvrb merged commit b7e5c81 into master Jun 1, 2022
@trvrb trvrb deleted the 2m-timespan branch June 1, 2022 22:55