Commit
Use metadata-only filtering for subsampling
Replaces FASTA outputs with strain list outputs for the subsample rule so that sequence data are not inspected during most subsampling steps. The exceptions are subsampling jobs that require a priority score calculation that depends on the FASTA sequences of another subsampled group. To handle these jobs, we add a new rule that extracts just those subsampled sequences. Finally, we collect subsampled sequences into a single deduplicated FASTA output using augur filter's new interface, which provides the `--exclude-all` flag and accepts multiple inputs for `--include`. Note that this commit also updates the conda environment to install augur from a GitHub branch instead of an official release.
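
The idea described above can be illustrated with a minimal Python sketch. This is not the workflow's Snakemake rules or augur's implementation; the function names, the `strain` metadata column, and the random per-group sampling are all assumptions made for illustration. It shows each subsampling step selecting strain names from metadata alone, the strain lists being merged without duplicates, and the FASTA being read only once at the end.

```python
import csv
import io
import random


def subsample_strains(metadata_tsv, group_by, per_group, seed=0):
    """Pick up to `per_group` strain names per group using metadata only.

    Assumes a tab-delimited table with a `strain` column (hypothetical name).
    No sequence data is read in this step.
    """
    rng = random.Random(seed)
    groups = {}
    for row in csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t"):
        groups.setdefault(row[group_by], []).append(row["strain"])
    picked = []
    for key in sorted(groups):
        names = sorted(groups[key])
        picked.extend(rng.sample(names, min(per_group, len(names))))
    return picked


def merge_strain_lists(*strain_lists):
    """Combine several strain lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for strain_list in strain_lists:
        for name in strain_list:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


def extract_sequences(fasta_text, strains):
    """Return FASTA records named in `strains`, emitting each name at most once."""
    wanted = set(strains)
    emitted = set()
    kept_lines = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            keep = name in wanted and name not in emitted
            if keep:
                emitted.add(name)
        if keep:
            kept_lines.append(line)
    return "\n".join(kept_lines) + ("\n" if kept_lines else "")
```

Under these assumptions, only `extract_sequences` touches sequence data, which mirrors the commit's goal of deferring FASTA access to the final collection step.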