Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(tree): check for all synonyms of conflicting default args #1547

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

corneliusroemer
Copy link
Member

@corneliusroemer corneliusroemer commented Jul 17, 2024

  • fix(tree): check for all synonyms of conflicting args
  • chore(tests): more robust test by adding -redo to iqtree args

Description of proposed changes

See commit messages for details.

Related issue(s)

Inspired by iqtree/iqtree2#273

Checklist

  • Automated checks pass
  • Check if you need to add a changelog message
  • Check if you need to add tests
  • Check if you need to update docs

IQtree2 has multiple synonyms for default args

We were only checking for one of the synonyms. This PR fixes that.

It also uses the currently preferred name for all IQtree options, where "preferred" is defined as the name shown in `iqtree2 -h`

List of changes:
`-nt` -> `-T`
`--ntmax` -> `--threads-max`
`-czb` -> `--polytomy`

List of synonyms:
`-ntmax`==`--threads-max`
`-s`==`--aln`==`--msa`
`-m`==`--model`
If `-redo` is not added, one can get errors like this:
```
   >  --tree-builder-args="-czb -bb 1000 -bnni" \
   >  --override-default-args \
   >  --output "$TMP/tree_raw.nwk" > /dev/null
+
+  ERROR: Shell exited 2 when running: iqtree2 --threads-max 1 -s tree/full_aligned-delim.fasta -m GTR -czb -bb 1000 -bnni > tree/full_aligned-delim.iqtree.log
+  Command output was:
+    Checkpoint (tree/full_aligned-delim.fasta.ckp.gz) indicates that a previous run successfully finished
+    Use `-redo` option if you really want to redo the analysis and overwrite all output files.
+    Use `--redo-tree` option if you want to restore ModelFinder and only redo tree search.
+    Use `--undo` option if you want to continue previous run when changing/adding options.
+
+  [1]
```
Copy link

codecov bot commented Jul 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.13%. Comparing base (c96d448) to head (96b0c04).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1547      +/-   ##
==========================================
+ Coverage   70.09%   70.13%   +0.03%     
==========================================
  Files          75       75              
  Lines        7923     7923              
  Branches     1938     1938              
==========================================
+ Hits         5554     5557       +3     
+ Misses       2085     2082       -3     
  Partials      284      284              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@corneliusroemer
Copy link
Member Author

Would be nice if someone could look at this, e.g. @huddlej @victorlin @tsibley @genehack @joverlee521 @jameshadfield

@@ -261,13 +261,27 @@ def random_string(n):
ofile.write(tmp_line)

# Check tree builder arguments for conflicts with hardcoded defaults.
check_conflicting_args(tree_builder_args, ("-ntmax", "-s", "-m"))
check_conflicting_args(tree_builder_args, (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not at all familiar with this code, ignore if I'm just utterly confused

why are these args hard-coded in place here, instead of being part of the DEFAULT_ARGS data structure around L25? digging a little bit, I see this is a pattern throughout, but I find it sort of confusing that there's a set of hardcoded default arguments for each type of tree-builder, and then we're checking for "conflicts with hardcoded defaults" when running each tree-builder, using a different set of arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think hard-coded args are arguments that are set controlled through augur tree options and for which we want to disallow users from overwriting them. Things like: the input alignment, which augur tree processes and sets directly, there's no reason why a user should change that, we want to make this part of the underlying tree binary "private".

Default args are recommendations for tree builder programs that user may override if they want to without causing harm.

Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that make sense?

It does; albeit the use of "default" to mean two orthogonal things that look like they're different ways of saying the same thing (DEFAULT_ARGS versus "hardcoded defaults") seems inherently confusing.

Maybe "hardcoded defaults" could be changed to "hardcoded options", but this is a nitpick. 🤷

> --override-default-args \
> --output "$TMP/tree_raw.nwk" > /dev/null

Build a tree with conflicting default arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the sake of symmetry, might be worth adding one additional test that is identical to this plus --override-default-args -- or instead having this test duplicate the arguments in the test above without --override-default-args.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cram tests are the slow part of the test suit so I'd probably prefer not to add tests just for symmetry sake - if the tests were fast, then sure!

Copy link
Contributor

@huddlej huddlej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @corneliusroemer! Nice to have the check for modern synonyms and also clearer to use those throughout the docs and tests (-czb has always confused me, for example, and --polytomy is much clearer).

@@ -49,10 +49,21 @@ Since the following custom arguments are incompatible with the default IQ-TREE a
$ ${AUGUR} tree \
> --method iqtree \
> --alignment tree/full_aligned.fasta \
> --tree-builder-args="-czb -bb 1000 -bnni" \
> --tree-builder-args="--polytomy -bb 1000 -bnni --redo" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the addition of --redo necessary here because we're writing the output of all these tests to $TMP? If so, we could just remove references to $TMP throughout this file (following our own cram style guide) and skip the --redo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure anymore why I needed to add the --redo - there was definitely some issue I hit without it - possibly iqtree stubmling upon an existing file from an earlier test without having cleaned up.

Redo does no harm (it should be the default, and one should actively opt into checkpoints). It might not bring many benefits but given there's no harm I thought I'd sprinkle it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would require reorganizing the cram files: #1558

I plan to make a PR for that later, but this shouldn't block on it!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'm not sure we can avoid --redo. I realized while working on #1559 that IQ-TREE puts intermediate files in the directory of the --alignment input file, not the directory of the --output file. Only the 2nd is cleaned up after every test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants