Error during 1. Iteration #50

Closed
Moxinwu opened this issue Oct 31, 2022 · 4 comments

@Moxinwu

Moxinwu commented Oct 31, 2022

Dear Professor,
I have some questions about using numbat and look forward to your reply.
I downloaded the TNBC1 sample BAM file from https://sra-pub-src-2.s3.amazonaws.com/SRR11546787/BAM_TNBC1.bam.1 and the expression count matrix from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM4476486 . I processed the expression matrix with Seurat only up to clustering (FindClusters()), used aggregate_counts() to prepare the expression reference, and used pileup_and_phase.R to generate TNBC1_allele_counts.tsv.gz. Finally, I ran numbat with run_numbat(), but an error occurred that I do not know how to resolve.

Approximating initial clusters using smoothed expression ..
Mem used: 1.36Gb
number of genes left: 10480
running hclust...
Iteration 1
Mem used: 9.47Gb
Error in arrange():
! Problem with the implicit transmute() step.
✖ Problem while computing ..1 = CHROM.
Caused by error in mask$eval_all_mutate():
! object 'CHROM' not found
Backtrace:

  1. ├─global run_one_sample(...)
  2. │ └─numbat::run_numbat(...)
  3. │ └─numbat:::make_group_bulks(...)
  4. │ └─... %>% arrange(sample)
  5. ├─dplyr::arrange(., sample)
  6. ├─dplyr::mutate(., snp_index = as.integer(snp_id))
  7. ├─dplyr::mutate(., snp_id = factor(snp_id, unique(snp_id)))
  8. ├─dplyr::arrange(., CHROM, POS)
  9. ├─dplyr:::arrange.data.frame(., CHROM, POS)
  10. │ └─dplyr:::arrange_rows(.data, dots)
  11. │ ├─base::withCallingHandlers(...)
  12. │ ├─dplyr::transmute(new_data_frame(.data), !!!quosures)
  13. │ └─dplyr:::transmute.data.frame(new_data_frame(.data), !!!quosures)
  14. │ └─dplyr:::mutate_cols(.data, dots, caller_env = caller_env())
  15. │ ├─base::withCallingHandlers(...)
  16. │ └─mask$eval_all_mutate(quo)
  17. ├─base::.handleSimpleError(...)
  18. │ └─dplyr (local) h(simpleError(msg, call))
  19. │ └─rlang::abort(...)
  20. │ └─rlang:::signal_abort(cnd, .file)
  21. │ └─base::signalCondition(cnd)
  22. └─dplyr (local) <fn>(<dply:::_>)
  23. └─rlang::abort(bullets, call = error_call, parent = parent)
    Warning message:
    In mclapply(groups, mc.cores = ncores, function(g) { :
    all scheduled cores encountered errors in user code
    Execution halted
@teng-gao
Collaborator

Hi @Moxinwu,

Could you share the full command that you used?

Thanks,
Teng

@Moxinwu
Author

Moxinwu commented Oct 31, 2022

Hi,
The code I used is as follows. Thank you for your answer!

Rscript pileup_and_phase.R \
    --label Sample \
    --samples TNBC1 \
    --bams BAM_TNBC1.bam \
    --barcodes TNBC1_barcode.tsv.gz \
    --outdir /public/home/lxw/05.Test/result \
    --gmap /public/home/lxw/02.Software/Eagle_v2.4.1/tables/genetic_map_hg38_withX.txt.gz \
    --eagle /public/home/lxw/02.Software/Eagle_v2.4.1/eagle \
    --snpvcf /public/home/lxw/07.Data/db_numbat/hg38/genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf \
    --paneldir /public/home/lxw/07.Data/db_numbat/hg38/1000G_hg38 \
    --ncores 20

cell snp_id CHROM POS cM REF ALT AD DP GT gene
CGCCAGAGTATCGTGT-1 1_818802_A_G 1 818802 0.488775939022764 A G 0 1 0|1 FAM87B
TCTTAGTCAGCAGGAT-1 1_818802_A_G 1 818802 0.488775939022764 A G 0 1 0|1 FAM87B
TTTCGATGTTGGGCCT-1 1_818802_A_G 1 818802 0.488775939022764 A G 1 1 0|1 FAM87B

library(Seurat)
library(numbat)
load("project.rdata")
allele_data <- read.table(gzfile("TNBC1_allele_counts.tsv.gz"),header = T,fill = T,check.names = F)
count_mat <- project@assays$RNA@counts
cell_annot <- as.data.frame(cbind(colnames(project),as.vector(project@meta.data[,"seurat_clusters"])))
colnames(cell_annot) <- c("cell","group")
head(cell_annot)
cell group
1 AAACCTGCACCTTGTC-1 3
2 AAACGGGAGTCCTCCT-1 2
3 AAACGGGTCCAGAGGA-1 0
4 AAAGATGCAGTTTACG-1 1
5 AAAGCAACAGGAATGC-1 0
6 AAAGCAATCGGAATCT-1 3
ref_expr <- aggregate_counts(count_mat = count_mat,annot = cell_annot)
cell_dict
0 1 2 3 4 5
439 260 226 98 54 20
head(ref_expr)
0 1 2 3 4 5
RP11-34P13.3 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0
FAM138A 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0
OR4F5 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0
RP11-34P13.7 1.040396e-07 3.664114e-07 3.306331e-07 0 0 0
RP11-34P13.8 0.000000e+00 1.221371e-07 0.000000e+00 0 0 0
RP11-34P13.14 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0

out <- run_numbat(count_mat, ref_expr, allele_data,
genome = "hg38",t = 1e-05,ncores = 4,plot = TRUE,out_dir = "./test")

The run produced the following output files:
-rw-r--r-- 1 lxw lxw 1.7M Oct 31 14:56 exp_roll_clust.png
-rw-r--r-- 1 lxw lxw 84M Oct 31 14:54 gexp_roll_wide.tsv.gz
-rw-r--r-- 1 lxw lxw 24K Oct 31 14:54 hc.rds
-rw-r--r-- 1 lxw lxw 1.2K Oct 31 14:56 log.txt
-rw-r--r-- 1 lxw lxw 6.8K Oct 31 14:54 sc_refs.rds

The log file contents are:
INFO [2022-10-31 14:54:21] Mem used: 0.636Gb
INFO [2022-10-31 14:54:24] Approximating initial clusters using smoothed expression ..
INFO [2022-10-31 14:54:25] Mem used: 0.637Gb
INFO [2022-10-31 14:54:50] running hclust...
INFO [2022-10-31 14:56:14] Iteration 1
INFO [2022-10-31 14:56:14] Mem used: 2.18Gb
ERROR [2022-10-31 14:56:20] job 1,2,3,4,5 failed
ERROR [2022-10-31 14:56:20] Error in summarise(., AD = sum(AD), DP = sum(DP), AR = AD/DP, .groups = "drop") :
Problem while computing DP = sum(DP).
ℹ The error occurred in group 1: snp_id = "1_100003004_A_G", CHROM = 1, POS = 100003004, cM = "128.866327498673", REF = "A", ALT = "G", GT = "1|0", gene = "SLC35A3".
Caused by error in sum():
! invalid 'type' (character) of argument

@teng-gao
Collaborator

The problem seems to be with the formatting of the allele count dataframe. Could you try:

allele_data = fread("TNBC1_allele_counts.tsv.gz")
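
For context, a rough sketch of how one might confirm the load worked: the log above shows `sum(DP)` failing on a character column, so it may be worth checking that the numeric columns really are parsed as numbers before calling `run_numbat`. The sketch assumes the file is tab-separated with the header shown earlier in this thread. `check_allele_types` is a hypothetical helper (not part of numbat), and the toy file simply reuses the three example rows posted above.

```r
library(data.table)

# Hypothetical helper (not part of numbat): stop early if any column that
# is summed or sorted numerically was not parsed as a number.
check_allele_types <- function(df) {
  numeric_cols <- c("POS", "cM", "AD", "DP")
  is_num <- vapply(numeric_cols, function(col) is.numeric(df[[col]]), logical(1))
  bad <- numeric_cols[!is_num]
  if (length(bad) > 0) {
    stop("Non-numeric columns: ", paste(bad, collapse = ", "))
  }
  invisible(df)
}

# Toy stand-in for TNBC1_allele_counts.tsv.gz, built from the rows shown above.
# Assumption: the real file uses tabs as the field separator.
toy <- tempfile(fileext = ".tsv")
writeLines(c(
  "cell\tsnp_id\tCHROM\tPOS\tcM\tREF\tALT\tAD\tDP\tGT\tgene",
  "CGCCAGAGTATCGTGT-1\t1_818802_A_G\t1\t818802\t0.488775939022764\tA\tG\t0\t1\t0|1\tFAM87B",
  "TCTTAGTCAGCAGGAT-1\t1_818802_A_G\t1\t818802\t0.488775939022764\tA\tG\t0\t1\t0|1\tFAM87B",
  "TTTCGATGTTGGGCCT-1\t1_818802_A_G\t1\t818802\t0.488775939022764\tA\tG\t1\t1\t0|1\tFAM87B"
), toy)

# sep is given explicitly here to keep the toy example unambiguous;
# the call suggested above relies on fread's separator auto-detection.
allele_data <- fread(toy, sep = "\t")

check_allele_types(allele_data)  # errors if AD, DP, POS or cM came in as text
str(allele_data)                 # inspect the parsed column types
```

On the real data, the same check would be applied to the result of `fread("TNBC1_allele_counts.tsv.gz")` before passing it to `run_numbat`.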

@Moxinwu
Author

Moxinwu commented Oct 31, 2022

Great, it seems OK now. Thanks.

@teng-gao teng-gao closed this as completed Nov 1, 2022