Commit
refine: Re-implement --year-bounds with error handling
Previously, the value specified was unused in the code. This restores the functionality. I moved the min_max_year argument from the constructor to range() since it is only used there. Fixes #1136.
Showing 7 changed files with 222 additions and 7 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
Setup | ||
|
||
$ source "$TESTDIR"/_setup.sh | ||
|
||
Create metadata with a strain that has partial ambiguity on the century-level | ||
(20XX) so that --year-bounds is applied. | ||
|
||
$ cat >metadata.tsv <<~~ | ||
> strain date | ||
> PAN/CDC_259359_V1_V3/2015 20XX-XX-XX | ||
> ~~ | ||
|
||
|
||
Check that invalid --year-bounds provides useful error messages. | ||
|
||
$ ${AUGUR} refine \ | ||
> --tree "$TESTDIR/../data/tree_raw.nwk" \ | ||
> --alignment "$TESTDIR/../data/aligned.fasta" \ | ||
> --metadata metadata.tsv \ | ||
> --output-tree tree.nwk \ | ||
> --output-node-data branch_lengths.json \ | ||
> --timetree \ | ||
> --year-bounds 1950 1960 1970 \ | ||
> --divergence-units mutations > /dev/null | ||
ERROR: Invalid value for --year-bounds: The year bounds [1950, 1960, 1970] must have only one (lower) or two (lower, upper) bounds. | ||
[2] | ||
|
||
$ ${AUGUR} refine \ | ||
> --tree "$TESTDIR/../data/tree_raw.nwk" \ | ||
> --alignment "$TESTDIR/../data/aligned.fasta" \ | ||
> --metadata metadata.tsv \ | ||
> --output-tree tree.nwk \ | ||
> --output-node-data branch_lengths.json \ | ||
> --timetree \ | ||
> --year-bounds 0 1000 \ | ||
> --divergence-units mutations > /dev/null | ||
ERROR: Invalid value for --year-bounds: 0 is not a valid year. | ||
[2] | ||
|
||
$ ${AUGUR} refine \ | ||
> --tree "$TESTDIR/../data/tree_raw.nwk" \ | ||
> --alignment "$TESTDIR/../data/aligned.fasta" \ | ||
> --metadata metadata.tsv \ | ||
> --output-tree tree.nwk \ | ||
> --output-node-data branch_lengths.json \ | ||
> --timetree \ | ||
> --year-bounds -2000 -3000 \ | ||
> --divergence-units mutations > /dev/null | ||
ERROR: Invalid value for --year-bounds: -3000 is not a valid year. | ||
[2] |
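
The three failing invocations above suggest two checks: the number of bounds must be one or two, and each bound must be a usable year. The following is a minimal Python sketch of that kind of validation, not the code from this commit. The function name `validate_year_bounds` is hypothetical, and using `datetime.date` to decide what counts as a valid year is an assumption, as is sorting the bounds before checking them (which would explain why `-2000 -3000` reports `-3000` first).

```python
import datetime


def validate_year_bounds(bounds):
    """Return (lower, upper) years, or raise ValueError with a readable message.

    Hypothetical helper for illustration only; not Augur's implementation.
    `upper` is None when only a lower bound is given (an assumption of this sketch).
    """
    bounds = list(bounds)
    if len(bounds) not in (1, 2):
        raise ValueError(
            f"The year bounds {bounds} must have only one (lower) "
            "or two (lower, upper) bounds."
        )

    # Sorting makes the argument order irrelevant: "2020 2000" == "2000 2020".
    ordered = sorted(int(year) for year in bounds)

    for year in ordered:
        try:
            # Assumption: a year is valid if datetime.date accepts it (1..9999).
            datetime.date(year, 1, 1)
        except ValueError:
            raise ValueError(f"{year} is not a valid year.") from None

    lower = ordered[0]
    upper = ordered[1] if len(ordered) == 2 else None
    return lower, upper
```

A caller could catch the `ValueError` and prefix it with `Invalid value for --year-bounds:` before exiting with a non-zero status, which is the shape of the messages the tests expect.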
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
Setup | ||
|
||
$ source "$TESTDIR"/_setup.sh | ||
|
||
Create a copy of tests/functional/refine/data/metadata.tsv, adding partial ambiguity on the century-level (20XX) for the first strain PAN/CDC_259359_V1_V3/2015. | ||
|
||
$ cat >metadata.tsv <<~~ | ||
> strain date | ||
> PAN/CDC_259359_V1_V3/2015 20XX-XX-XX | ||
> COL/FLR_00024/2015 2015-12-XX | ||
> PRVABC59 2015-12-XX | ||
> COL/FLR_00008/2015 2015-12-XX | ||
> Colombia/2016/ZC204Se 2016-01-06 | ||
> ZKC2/2016 2016-02-16 | ||
> VEN/UF_1/2016 2016-03-25 | ||
> DOM/2016/BB_0059 2016-04-04 | ||
> BRA/2016/FC_6706 2016-04-08 | ||
> DOM/2016/BB_0183 2016-04-18 | ||
> EcEs062_16 2016-04-XX | ||
> HND/2016/HU_ME59 2016-05-13 | ||
> ~~ | ||
|
||
Limit ambiguous dates to be within (2000, 2020). | ||
|
||
$ ${AUGUR} refine \ | ||
> --tree "$TESTDIR/../data/tree_raw.nwk" \ | ||
> --alignment "$TESTDIR/../data/aligned.fasta" \ | ||
> --metadata metadata.tsv \ | ||
> --output-tree tree.nwk \ | ||
> --output-node-data branch_lengths.json \ | ||
> --timetree \ | ||
> --year-bounds 2000 2020 \ | ||
> --coalescent opt \ | ||
> --date-confidence \ | ||
> --date-inference marginal \ | ||
> --clock-filter-iqd 4 \ | ||
> --seed 314159 \ | ||
> --divergence-units mutations &> /dev/null | ||
|
||
Check that the inferred date is 2020-12-31. | ||
TODO: Using jq would be cleaner, but requires an extra dev dependency. | ||
|
||
$ python3 -c 'import json, sys; print(json.load(sys.stdin)["nodes"]["PAN/CDC_259359_V1_V3/2015"]["date"])' < branch_lengths.json | ||
2020-12-31 | ||
|
||
Reverse the order to check that order does not matter. | ||
|
||
$ ${AUGUR} refine \ | ||
> --tree "$TESTDIR/../data/tree_raw.nwk" \ | ||
> --alignment "$TESTDIR/../data/aligned.fasta" \ | ||
> --metadata metadata.tsv \ | ||
> --output-tree tree.nwk \ | ||
> --output-node-data branch_lengths.json \ | ||
> --timetree \ | ||
> --year-bounds 2020 2000 \ | ||
> --coalescent opt \ | ||
> --date-confidence \ | ||
> --date-inference marginal \ | ||
> --clock-filter-iqd 4 \ | ||
> --seed 314159 \ | ||
> --divergence-units mutations &> /dev/null | ||
|
||
Run the same check as above. | ||
|
||
$ python3 -c 'import json, sys; print(json.load(sys.stdin)["nodes"]["PAN/CDC_259359_V1_V3/2015"]["date"])' < branch_lengths.json | ||
2020-12-31 |
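
To illustrate why both orderings of `--year-bounds` give the same answer, here is a rough Python sketch of how an ambiguous date such as `20XX-XX-XX` could be narrowed to a concrete range. It is an illustration under stated assumptions, not the commit's `range()` method: the function name `ambiguous_date_range` is hypothetical, a month or day containing any `X` is treated as fully ambiguous, and bounds that exclude every candidate year are not handled.

```python
import calendar
import datetime


def ambiguous_date_range(date_str, year_bounds):
    """Resolve a date like '20XX-XX-XX' to (earliest, latest) possible dates.

    Hypothetical helper for illustration only; not Augur's implementation.
    """
    # Sorting the bounds is what makes their order irrelevant.
    lower_year, upper_year = sorted(year_bounds)
    year_s, month_s, day_s = date_str.split("-")

    # Replace each X with the smallest / largest digit, then clamp to the bounds.
    min_year = max(int(year_s.replace("X", "0")), lower_year)
    max_year = min(int(year_s.replace("X", "9")), upper_year)

    if "X" in month_s:
        min_month, max_month = 1, 12
    else:
        min_month = max_month = int(month_s)

    if "X" in day_s:
        min_day = 1
        # Last day of the latest possible month (handles leap years).
        max_day = calendar.monthrange(max_year, max_month)[1]
    else:
        min_day = max_day = int(day_s)

    return (
        datetime.date(min_year, min_month, min_day),
        datetime.date(max_year, max_month, max_day),
    )
```

Under these assumptions, `ambiguous_date_range("20XX-XX-XX", (2000, 2020))` and `ambiguous_date_range("20XX-XX-XX", (2020, 2000))` both span 2000-01-01 through 2020-12-31. The date the tests read from `branch_lengths.json` coincides with the upper end of that range.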