Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tree: RecursionError in writing tree to Newick #328

Closed
trvrb opened this issue Jul 24, 2019 · 18 comments
Closed

tree: RecursionError in writing tree to Newick #328

trvrb opened this issue Jul 24, 2019 · 18 comments
Labels
bug Something isn't working

Comments

@trvrb
Copy link
Member

trvrb commented Jul 24, 2019

I've encountered the following error when running flu trees for H3N2 MP segment. You should be able to reproduce this by running:

augur tree --alignment test.fasta --output test.nwk

with this supplied test.fasta. IQTREE runs just fine and produces an apparently perfectly valid Newick tree (at least when viewed in FigTree), available as test.nwk.

However, augur tree errors out on line 422 (https://github.com/nextstrain/augur/blob/master/augur/tree.py#L422) with:

Traceback (most recent call last):
  File "/Users/trvrb/.pyenv/versions/3.6.1/bin/augur", line 11, in <module>
    load_entry_point('nextstrain-augur', 'console_scripts', 'augur')()
  File "/Users/trvrb/Documents/src/augur/augur/__main__.py", line 10, in main
    return augur.run( argv[1:] )
  File "/Users/trvrb/Documents/src/augur/augur/__init__.py", line 72, in run
    return args.__command__.run(args)
  File "/Users/trvrb/Documents/src/augur/augur/tree.py", line 425, in run
    tree_success = Phylo.write(T, tree_fname, 'newick', format_branch_length='%1.8f')
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/Bio/Phylo/_io.py", line 82, in write
    n = getattr(supported_formats[format], 'write')(trees, fp, **kwargs)
...
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/Bio/Phylo/NewickIO.py", line 280, in <genexpr>
    subtrees = (newickize(sub) for sub in clade)
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/Bio/Phylo/NewickIO.py", line 280, in newickize
    subtrees = (newickize(sub) for sub in clade)
RecursionError: maximum recursion depth exceeded

Something odd with writing out this tree via Bio.Phylo. I believe this is stochastic error and will resolve itself when I rerun subsampling, but still thought I should document it here.

I tried upgrading BioPython to 1.73 as a simple fix to no avail.

@trvrb trvrb added the bug Something isn't working label Jul 24, 2019
@jameshadfield
Copy link
Member

@trvrb have you seen #307? I don't think the docs have been update but you should be able to set the env variable AUGUR_RECURSION_LIMIT to fix this. Better warnings needed tho!

@rneher
Copy link
Member

rneher commented Jul 24, 2019

Yes. This is a known problem and fixed via setting AUGUR_RECURSION_LIMIT.

@rneher rneher closed this as completed Jul 24, 2019
@jameshadfield
Copy link
Member

jameshadfield commented Jul 24, 2019

Thought's on a augur-wide catch for this & a nice error message?

Something like

def run(argv):
    args = make_parser().parse_args(argv)
    try:
        return args.__command__.run(args)
    except RecursionError:
        print("FATAL: Maximum recursion depth reached. You can set the env variable AUGUR_RECURSION_LIMIT to adjust this (currently set at {})".format(sys.getrecursionlimit()))
        sys.exit(2)

@rneher
Copy link
Member

rneher commented Jul 24, 2019

yes. that is a good idea.

@rneher rneher reopened this Jul 24, 2019
@trvrb
Copy link
Member Author

trvrb commented Jul 24, 2019

Thanks both. Is there a reason that augur does not just use a default recursion limit that can be overridden by the environment value? Furthermore, I'm not sure I follow why this needs to exist in environment at all. We should just be able to have a sensible default, right? I must be missing something.

Is there discussion of this that I can reference? I didn't see much in #307.

@rneher
Copy link
Member

rneher commented Jul 24, 2019

I am not sure I understand your question. But here are my 2c.

There is a default and it is used by augur, but I am not sure whether this is an env var.

In [1]: import sys
In [2]: sys.getrecursionlimit()
Out[2]: 3000

The recursion limit exists to guard against infinite recursions. So even if there was a way to set is globally, it is probably not a good idea. Hence we opted for an augur specific environment variable that can be set for example in the Snakefile.
I think these deep recursion happens for very imbalanced trees with lots of polytomies. For almost all applications, the default seems to work fine. But large low diversity data sets cause problems.

I guess setting a default to 10000 wouldn't hurt and would reduce the number of times this comes up -- until people start building bigger trees.

@trvrb
Copy link
Member Author

trvrb commented Jul 24, 2019

Thanks so much for the detailed explanation @rneher. I think I finally understand. Two thoughts:

  1. If it becomes common to breach the Python default recursion limit of 3000, we may want to consider upping the augur default to 10k or whatever. This would just be changing your code in augur/__init__.py: set recursion limit from env variable #307 to something like:
recursion_limit = os.environ.get("AUGUR_RECURSION_LIMIT")
if recursion_limit:
    sys.setrecursionlimit(int(recursion_limit))
else:
    sys.setrecursionlimit(10000)
  1. If this is only being thrown by Phylo.write, maybe there's another function that's more robust to this issue that we should be using to convert to Newick. (Maybe not so important however)

@jameshadfield
Copy link
Member

If this is only being thrown by Phylo.write, maybe there's another function that's more robust to this issue that we should be using to convert to Newick. (Maybe not so important however)

for what it's worth, @evogytis & I had this a lot in parsing BEAST MCC trees as well

@trvrb
Copy link
Member Author

trvrb commented Jul 24, 2019

I think @jameshadfield's suggestion above to improve error message (#328 (comment)) is great. Otherwise I'd consider this issue closed. Thanks for the explanations.

@jameshadfield
Copy link
Member

Will do this now 👍

@batson
Copy link

batson commented Jun 2, 2020

But what if there were a global pandemic with an unprecedented amount of sequences and extremely low diversity? Would that be a time when the default might be too low?

ttung pushed a commit to ttung/biopython that referenced this issue Jul 2, 2020
`newickize` can [overflow the python stack for very deep trees](nextstrain/augur#328).  This is an alternate implementation that uses a deque (as a stack) to store the tree data.  Although this is the same O(N) in terms of storage, the amount of data stored per function call depth in python is significantly higher.
ttung pushed a commit to ttung/biopython that referenced this issue Jul 2, 2020
`newickize` can [overflow the python stack for very deep trees](nextstrain/augur#328).  This is an alternate implementation that uses a deque (as a stack) to store the tree data.  Although this is the same O(N) in terms of storage, the amount of data stored per function call depth in python is significantly higher.
ttung pushed a commit to ttung/biopython that referenced this issue Jul 2, 2020
`newickize` can [overflow the python stack for very deep trees](nextstrain/augur#328).  This is an alternate implementation that uses a deque (as a stack) to store the tree data.  Although this is the same O(N) in terms of storage, the amount of data stored per function call depth in python is significantly higher.
ttung pushed a commit to ttung/biopython that referenced this issue Jul 2, 2020
`newickize` can [overflow the python stack for very deep trees](nextstrain/augur#328).  This is an alternate implementation that uses a deque (as a stack) to store the tree data.  Although this is the same O(N) in terms of storage, the amount of data stored per function call depth in python is significantly higher.
ttung pushed a commit to ttung/biopython that referenced this issue Jul 13, 2020
`newickize` can [overflow the python stack for very deep trees](nextstrain/augur#328).  This is an alternate implementation that uses a deque (as a stack) to store the tree data.  Although this is the same O(N) in terms of storage, the amount of data stored per function call depth in python is significantly higher.
@DanLesman
Copy link

I encountered the same error and adjusted AUGUR_RECURSION_LIMIT. However, iqtree's .ckp.gz checkpoint file seems to have been removed so the tree building has to restart. Is there any way to make it so that the checkpoint file is retained until after all the code executes?

@tsibley
Copy link
Member

tsibley commented Jul 22, 2020

@DanLesman Relevant code is:

augur/augur/tree.py

Lines 184 to 194 in 4c474a9

if clean_up:
#allow user to see chosen model if modeltest was run
if substitution_model.lower() == 'none':
shutil.copyfile(log_file, out_file.replace(out_file.split('/')[-1],"iqtree.log"))
if os.path.isfile(tmp_aln_file):
os.remove(tmp_aln_file)
for ext in [".bionj",".ckp.gz",".iqtree",".mldist",".model.gz",".treefile",".uniqueseq.phy",".model"]:
if os.path.isfile(tmp_aln_file + ext):
os.remove(tmp_aln_file + ext)

I think we'd be receptive to a patch exposing the clean_up function parameter of the build_* functions as a pair of --cleanup / --no-cleanup command-line flags. (Or a different name might be better; haven't thought about it.) Flagging @huddlej and @emmahodcroft here as they've also thought a bit about the CLI of Augur.

@jsan4christ
Copy link

I'm getting this error when I run:

treetime clock --tree new_tree_2.nwk --dates dates.csv --sequence-len 29903 --clock-filter 4


File "/analyses/software/miniconda3/lib/python3.7/site-packages/Bio/Phylo/NewickIO.py", line 300, in
subtrees = (newickize(sub) for sub in clade)
File "/analyses/software/miniconda3/lib/python3.7/site-packages/Bio/Phylo/NewickIO.py", line 293, in newickize
unquoted_label = re.match(token_dict["unquoted node label"], label)
File "/analyses/software/miniconda3/lib/python3.7/re.py", line 175, in match
return _compile(pattern, flags).match(string)
File "/analyses/software/miniconda3/lib/python3.7/re.py", line 275, in _compile
if isinstance(flags, RegexFlag):
RecursionError: maximum recursion depth exceeded while calling a Python object

@tsibley
Copy link
Member

tsibley commented Jul 18, 2022

@jsan4christ As noted above, you can set the AUGUR_RECURSION_LIMIT environment variable to increase the allowed recursion depth. For example:

AUGUR_RECURSION_LIMIT=10000 treetime clock --tree new_tree_2.nwk --dates dates.csv --sequence-len 29903 --clock-filter 4

@jsan4christ
Copy link

Thanks @tsibley, tried that to no avail...even set it to 1 billion and the error is exactly the same. Like treetime does not seem to use the option AUGUR_RECURSION_LIMIT. I also tried removing my nextstrain environment and recreating it but that didn't help too. In recreating it plus re-installing treetime which is now up from version 0.8.6 to 0.9.1 but that too did not help.

@tsibley
Copy link
Member

tsibley commented Jul 22, 2022

@jsan4christ Oh, sorry, I completely missed you were running the treetime command directly. This is something that will need to be handled in TreeTime itself, and you can file an issue over there. Unfortunately Python provides no way to set the recursion limit from the outside, so from the inside the code has to call sys.setrecursionlimit() itself. This is what Augur does based on the AUGUR_RECURSION_LIMIT value.

@jsan4christ
Copy link

I get you. thanks for clarifying. I will go ahead and file the issue.

joverlee521 added a commit that referenced this issue Apr 14, 2023
We've had to use the `AUGUR_RECURSION_LIMIT` environment variable to
bump the recursion limit for SARS-CoV-2 builds¹, mpox builds², and
our `nextstrain-jobs` definition in AWS Batch. From a general search of
the envvar across GitHub³, 10,000 looks like the default that others use
as well (although this may be influenced by our discussions⁴).

¹ nextstrain/ncov#690
² nextstrain/mpox#150
³ https://cs.github.com/?scopeName=All+repos&scope=&q=AUGUR_RECURSION_LIMIT%3D#328
joverlee521 added a commit that referenced this issue Apr 17, 2023
We've had to use the `AUGUR_RECURSION_LIMIT` environment variable to
bump the recursion limit for SARS-CoV-2 builds¹, mpox builds², and
our `nextstrain-jobs` definition in AWS Batch. From a general search of
the envvar across GitHub³, 10,000 looks like the default that others use
as well (although this may be influenced by our discussions⁴).

¹ nextstrain/ncov#690
² nextstrain/mpox#150
³ https://cs.github.com/?scopeName=All+repos&scope=&q=AUGUR_RECURSION_LIMIT%3D#328
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

7 participants