Phylogeny on merged samples #130
Comments
Hi @freddie090, you can genotype the samples (from the same individual) using the multi-sample mode of The advantage is that you get consistent CNV and clone calls across samples and get an integrated phylogeny. Genotyping using multiple samples can also improve phasing accuracy. Best,
Hi Teng, Ah great, okay - my hunch was the inference would be more robust if it had access to information from all samples at once. I'll give it a go! Thanks
Hi @teng-gao - sorry, just to clarify: After running pileup and phase where I provide a list of BAM files and corresponding sample names (as comma-separated values in a single argument, e.g.: Has the multi-sample mode worked? I wasn't sure whether I should expect a single combined allele counts table for all samples. If not, do you suggest manually merging the expression matrix and allele counts for each sample before running Numbat? Best
Yes, you should get a separate allele count df for each sample. You can then concatenate them (ditto for the expression count matrices) before feeding to
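For anyone following along, here is a minimal sketch of the concatenation step described above. Numbat itself is an R package, so this Python version only illustrates the logic and is not what the maintainers run. The function names (`merge_allele_counts`, `merge_expression`), the `cell` column name, and the in-memory layout are all hypothetical. Prefixing each barcode with its sample name is my own assumption, meant to keep identical 10x barcodes from different samples distinct; the thread does not say whether this is required.

```python
from collections import defaultdict


def merge_allele_counts(per_sample):
    """Concatenate per-sample allele count rows into one table.

    per_sample: dict mapping sample name -> list of row dicts.
    Each row is assumed (hypothetically) to carry a 'cell' barcode key.
    Returns the merged rows plus a barcode -> sample lookup.
    """
    merged = []
    cell_to_sample = {}
    for sample, rows in per_sample.items():
        for row in rows:
            new_row = dict(row)  # copy so the input rows stay unchanged
            # Assumption: prefix barcodes so they stay unique across samples.
            new_row["cell"] = f"{sample}_{row['cell']}"
            cell_to_sample[new_row["cell"]] = sample
            merged.append(new_row)
    return merged, cell_to_sample


def merge_expression(per_sample):
    """Merge sparse gene-by-cell count matrices from several samples.

    per_sample: dict mapping sample name -> {gene: {cell: count}}.
    Genes missing from a sample are simply absent for its cells,
    which a sparse representation treats as zero counts.
    """
    merged = defaultdict(dict)
    for sample, matrix in per_sample.items():
        for gene, counts in matrix.items():
            for cell, count in counts.items():
                # Same barcode prefix as used for the allele counts.
                merged[gene][f"{sample}_{cell}"] = count
    return dict(merged)
```

The barcode-to-sample lookup returned by `merge_allele_counts` is the kind of per-cell annotation you would want to keep around if you later need to tell samples apart in the merged results.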
Okay - and sorry, final Q @teng-gao - are the sample identities preserved somewhere for distinguishing in the phylogeny plots later?
You can plot the sample identities associated with cell barcodes on a sidebar using the
Hi,
I have multiple samples from an experiment, each with ~6000 cells of 10X scRNA data. If I were to try to run NUMBAT on the entire merged experiment, the BAMs would be too big.
Is it possible to run NUMBAT on the independent samples, but then merge the samples for the phylogeny part of the analysis (for example, as was done in Figure 5a of the NUMBAT paper)?
Additionally, if I were to subset the BAMs to a set of high-quality cells, merge the subsetted samples, and then run NUMBAT on these merged BAMs/expression matrices, would this improve the robustness of the analysis? I.e., is there any advantage to the samples being processed simultaneously vs. independently?
Thanks