Find smallest subtree with all visible data when using genotype filters #1276
Previously we were not recomputing the MRCA of the filtered nodes if genotype filters were applied, which resulted in the "zoom to selected" button behaving as if the genotype filters did not exist.
For non-genotype ("normal") filters, given a set `X` of visible nodes, we simultaneously find the MRCA of `X` and add the paths from the MRCA to `X`.
For genotype filtering, we wish to find the MRCA[1] of `X` without modifying `X`. Note that this allows situations where the MRCA is not itself a member of `X` (`MRCA ∉ X`). This is introduced in this commit via the function `findFilteredMRCA`, which uses a 3-step process.
Filtering to clade 20C and genotype S 452 R. Left: "Zoom to selected" from PR #1265 wrongly identifies the MRCA of clade 20C and does not take into account the genotype filtering. Right: This PR.
Filtering to the homoplasic mutation 501Y. Left: In PR #1265, filtering to genotypes only did not allow one to "zoom to selected". Right: We can zoom into the subtree containing all samples with 501Y (which happens to be quite close to the root of the tree)
[1] Is there a better word than MRCA? I'm not suggesting that this is the node where the genotypes originated, but rather the node that roots the smallest subtree containing all of the filtered nodes.
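To illustrate what "MRCA" means in the genotype-filter case, here is a minimal sketch of finding the root of the smallest subtree containing a set of visible nodes without modifying that set. It is not the code of `findFilteredMRCA` and does not claim to follow its 3 steps. The node shape (`name`, `parent`, `children`, with `parent === null` at the root) and the helper names `makeNode`, `pathFromRoot` and `smallestSubtreeRoot` are assumptions made for this example only.

```javascript
// Hypothetical node shape: { name, parent, children }; the root has parent === null.
function makeNode(name, parent = null) {
  const node = { name, parent, children: [] };
  if (parent) parent.children.push(node);
  return node;
}

// Path from the root down to `node`, inclusive of both ends.
function pathFromRoot(node) {
  const path = [];
  for (let n = node; n !== null; n = n.parent) path.push(n);
  return path.reverse();
}

// Returns the root of the smallest subtree containing every node in `visible`,
// or undefined if `visible` is empty. `visible` is only read, never modified.
function smallestSubtreeRoot(visible) {
  let common; // longest root-first path shared by all nodes seen so far
  for (const node of visible) {
    const path = pathFromRoot(node);
    if (common === undefined) {
      common = path;
      continue;
    }
    let i = 0;
    while (i < common.length && i < path.length && common[i] === path[i]) i++;
    common = common.slice(0, i);
  }
  return common === undefined ? undefined : common[common.length - 1];
}

// Example tree: root -> (a -> (a1, a2), b)
const root = makeNode("root");
const a = makeNode("a", root);
const a1 = makeNode("a1", a);
const a2 = makeNode("a2", a);
const b = makeNode("b", root);

const visible = new Set([a1, a2]);
const mrca = smallestSubtreeRoot(visible);
console.log(mrca.name, visible.has(mrca)); // prints "a false"
```

In this example the result is the internal node `a`, which is not in the visible set. This is the `MRCA ∉ X` situation described above.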