Allow Recovery of Ambiguous Sites for Fasta-input #280

emmahodcroft · 2019-05-06T12:54:28Z

Running ancestral with VCF-input calls TreeTime's recover_var_ambigs function, which recalculates mutations on tip branches to restore ambiguous bases ('N's) at positions where they were present in the original sequence.

TreeTime reconstructs bases at ambiguous sites (where it can), but restoring the ambiguous bases on the tip sequences lets users see what proportion of sequences actually have data at a given site. Reconstructed bases alone could be misleading.

This PR updates the ancestral function with a new flag --keep-ambiguous which can be used to extend the same functionality to Fasta-input sequences.
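As a rough illustration of the idea (not TreeTime's actual recover_var_ambigs implementation), restoring ambiguous bases on a tip amounts to copying each 'N' from the original sequence back over the reconstructed base at the same position. The function names `restore_ambiguous` and `fraction_with_data` and the toy sequences below are hypothetical, chosen only for this sketch:

```python
def restore_ambiguous(original, reconstructed, ambiguous="N"):
    """Return `reconstructed` with ambiguous characters from `original` put back.

    Hypothetical helper for illustration; not part of TreeTime or augur.
    """
    if len(original) != len(reconstructed):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        orig if orig in ambiguous else recon
        for orig, recon in zip(original, reconstructed)
    )


def fraction_with_data(sequences, site, ambiguous="N"):
    """Fraction of sequences whose base at `site` (0-based) is not ambiguous."""
    informative = sum(1 for seq in sequences if seq[site] not in ambiguous)
    return informative / len(sequences)


# Toy tip sequences: three of the four have no data at site 1.
original_tips = ["ACGT", "ANGT", "ANGT", "ANGA"]
# What the tips might look like after reconstruction fills in the Ns.
reconstructed_tips = ["ACGT", "ACGT", "ACGT", "ACGA"]

restored = [
    restore_ambiguous(orig, recon)
    for orig, recon in zip(original_tips, reconstructed_tips)
]
```

In this toy case every reconstructed tip appears to have a base at site 1, while the restored tips show that only one in four sequences actually has data there.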

For example, at a site of interest in Enterovirus...
Without recovering ambiguous sites:

This looks fairly straightforward, and one may be tempted to look and see if there are other associations with clusters where a mutation occurred.

With recovering ambiguous sites (note the colour change):

It becomes clear that for most sequences we have no data at this site, and there are entire clusters without a sequence at this site which have coloured branches just because of ancestral reconstruction. One should be cautious interpreting mutations at this site.

This update includes a TreeTime version check. A bug in older TreeTime versions means that using the --keep-ambiguous flag with Fasta-input returns nonsense, so users are not allowed to use the flag unless they are running TreeTime 0.5.6 or newer.
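A minimal sketch of what such a version gate could look like, using only the standard library. The helper names `version_tuple` and `keep_ambiguous_allowed` are hypothetical, and the installed version is passed in as a plain string rather than read from TreeTime, so this is not the code in the PR:

```python
import re

# Minimum TreeTime version stated in this PR for --keep-ambiguous with Fasta-input.
MIN_TREETIME_VERSION = "0.5.6"


def version_tuple(version):
    """Turn a version string such as '0.5.6' into a comparable tuple of ints.

    Only the first three numeric components are kept; any suffix is ignored.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def keep_ambiguous_allowed(installed_version, minimum=MIN_TREETIME_VERSION):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed_version) >= version_tuple(minimum)
```

Comparing integer tuples rather than raw strings avoids the pitfall where '0.5.10' sorts before '0.5.6' lexically.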

To make this update most useful, it would be great to implement standardised nucleotide/AA colouring in auspice! (Particularly so that -/N/X are always grey.)

jameshadfield · 2019-05-07T05:25:25Z

This looks great @emmahodcroft

> To make this update most useful, it would be great to implement standardised nucleotide/AA colouring in auspice! (Particularly so that -/N/X are always grey.)

Good idea!

emmahodcroft · 2019-05-07T08:53:35Z

@jameshadfield I had a little look into this (standard colouring), but it was a little more complicated than I'd anticipated, so didn't attempt anything yet! Are there any particular reasons for why this couldn't be done, in theory, or anything we'd want to preserve about how this works?

jameshadfield · 2019-05-07T15:19:22Z

> Are there any particular reasons for why this couldn't be done, in theory, or anything we'd want to preserve about how this works?

Not that I can think of -- but we would want to apply it to genotype colouring only, as some datasets may have demes or trait values of X, N, etc.

trvrb · 2019-05-26T18:36:11Z

Chiming in to say that

> To make this update most useful, it would be great to implement standardised nucleotide/AA colouring in auspice! (Particularly so that -/N/X are always grey.)

is a great idea. I've made an auspice issue to preserve it: nextstrain/auspice#727.

Allow recovery ambig sites for fasta-input ancestral

959403c

emmahodcroft requested a review from trvrb May 6, 2019 12:54

ancestral.py: added comments, feed argument through augur functions t…

2a1eac9

…o facilitate adding another command-line argument to control handling of gaps at both ends of the alignment.

rneher merged commit 1439563 into master May 11, 2019

rneher deleted the no_reconst_N branch May 11, 2019 16:49

trvrb mentioned this pull request May 26, 2019

Color gaps, Ns and Xs as gray nextstrain/auspice#727

Closed
