Error during 1. Iteration #50

Moxinwu · 2022-10-31T05:07:42Z

Dear Professor,
I have some questions about using numbat and look forward to your reply.
I downloaded the TNBC1 BAM file from https://sra-pub-src-2.s3.amazonaws.com/SRR11546787/BAM_TNBC1.bam.1 and the expression count matrix from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM4476486 . I processed the expression matrix with Seurat only up to clustering ( FindClusters() ) and used aggregate_counts() to prepare the expression reference. I then used pileup_and_phase.R to generate TNBC1_allele_counts.tsv.gz . Finally, I ran numbat with run_numbat, but an error occurred and I do not know how to resolve it.

Approximating initial clusters using smoothed expression ..
Mem used: 1.36Gb
number of genes left: 10480
running hclust...
Iteration 1
Mem used: 9.47Gb
Error in arrange():
! Problem with the implicit transmute() step.
✖ Problem while computing ..1 = CHROM.
Caused by error in mask$eval_all_mutate():
! object 'CHROM' not found
Backtrace:
▆

├─global run_one_sample(...)
│ └─numbat::run_numbat(...)
│ └─numbat:::make_group_bulks(...)
│ └─... %>% arrange(sample)
├─dplyr::arrange(., sample)
├─dplyr::mutate(., snp_index = as.integer(snp_id))
├─dplyr::mutate(., snp_id = factor(snp_id, unique(snp_id)))
├─dplyr::arrange(., CHROM, POS)
├─dplyr:::arrange.data.frame(., CHROM, POS)
│ └─dplyr:::arrange_rows(.data, dots)
│ ├─base::withCallingHandlers(...)
│ ├─dplyr::transmute(new_data_frame(.data), !!!quosures)
│ └─dplyr:::transmute.data.frame(new_data_frame(.data), !!!quosures)
│ └─dplyr:::mutate_cols(.data, dots, caller_env = caller_env())
│ ├─base::withCallingHandlers(...)
│ └─mask$eval_all_mutate(quo)
├─base::.handleSimpleError(...)
│ └─dplyr (local) h(simpleError(msg, call))
│ └─rlang::abort(...)
│ └─rlang:::signal_abort(cnd, .file)
│ └─base::signalCondition(cnd)
└─dplyr (local) <fn>(<dply:::_>)
└─rlang::abort(bullets, call = error_call, parent = parent)
Warning message:
In mclapply(groups, mc.cores = ncores, function(g) { :
all scheduled cores encountered errors in user code
Execution halted


teng-gao · 2022-10-31T14:59:46Z

Hi @Moxinwu,

Could you share the full command that you used?

Thanks,
Teng

Moxinwu · 2022-10-31T16:25:51Z

Hi,
The code I used is as follows. Thank you for your answer!

Rscript pileup_and_phase.R \
--label Sample \
--samples TNBC1 \
--bams BAM_TNBC1.bam \
--barcodes TNBC1_barcode.tsv.gz \
--outdir /public/home/lxw/05.Test/result \
--gmap /public/home/lxw/02.Software/Eagle_v2.4.1/tables/genetic_map_hg38_withX.txt.gz \
--eagle /public/home/lxw/02.Software/Eagle_v2.4.1/eagle \
--snpvcf /public/home/lxw/07.Data/db_numbat/hg38/genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf \
--paneldir /public/home/lxw/07.Data/db_numbat/hg38/1000G_hg38 \
--ncores 20

cell snp_id CHROM POS cM REF ALT AD DP GT gene
CGCCAGAGTATCGTGT-1 1_818802_A_G 1 818802 0.488775939022764 A G 0 1 0|1 FAM87B
TCTTAGTCAGCAGGAT-1 1_818802_A_G 1 818802 0.488775939022764 A G 0 1 0|1 FAM87B
TTTCGATGTTGGGCCT-1 1_818802_A_G 1 818802 0.488775939022764 A G 1 1 0|1 FAM87B

library(Seurat)
library(numbat)
load("project.rdata")
allele_data <- read.table(gzfile("TNBC1_allele_counts.tsv.gz"),header = T,fill = T,check.names = F)
count_mat <- project@assays$RNA@counts
cell_annot <- as.data.frame(cbind(colnames(project),as.vector(project@meta.data[,"seurat_clusters"])))
colnames(cell_annot) <- c("cell","group")
head(cell_annot)
cell group
1 AAACCTGCACCTTGTC-1 3
2 AAACGGGAGTCCTCCT-1 2
3 AAACGGGTCCAGAGGA-1 0
4 AAAGATGCAGTTTACG-1 1
5 AAAGCAACAGGAATGC-1 0
6 AAAGCAATCGGAATCT-1 3
ref_expr <- aggregate_counts(count_mat = count_mat,annot = cell_annot)
cell_dict
0 1 2 3 4 5
439 260 226 98 54 20
head(ref_expr)
0 1 2 3 4 5
RP11-34P13.3 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0
FAM138A 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0
OR4F5 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0
RP11-34P13.7 1.040396e-07 3.664114e-07 3.306331e-07 0 0 0
RP11-34P13.8 0.000000e+00 1.221371e-07 0.000000e+00 0 0 0
RP11-34P13.14 0.000000e+00 0.000000e+00 0.000000e+00 0 0 0

out <- run_numbat(count_mat, ref_expr, allele_data,
genome = "hg38",t = 1e-05,ncores = 4,plot = TRUE,out_dir = "./test")

The run produced the following output files:
-rw-r--r-- 1 lxw lxw 1.7M Oct 31 14:56 exp_roll_clust.png
-rw-r--r-- 1 lxw lxw 84M Oct 31 14:54 gexp_roll_wide.tsv.gz
-rw-r--r-- 1 lxw lxw 24K Oct 31 14:54 hc.rds
-rw-r--r-- 1 lxw lxw 1.2K Oct 31 14:56 log.txt
-rw-r--r-- 1 lxw lxw 6.8K Oct 31 14:54 sc_refs.rds

The log file contains:
INFO [2022-10-31 14:54:21] Mem used: 0.636Gb
INFO [2022-10-31 14:54:24] Approximating initial clusters using smoothed expression ..
INFO [2022-10-31 14:54:25] Mem used: 0.637Gb
INFO [2022-10-31 14:54:50] running hclust...
INFO [2022-10-31 14:56:14] Iteration 1
INFO [2022-10-31 14:56:14] Mem used: 2.18Gb
ERROR [2022-10-31 14:56:20] job 1,2,3,4,5 failed
ERROR [2022-10-31 14:56:20] Error in summarise(., AD = sum(AD), DP = sum(DP), AR = AD/DP, .groups = "drop") :
Problem while computing DP = sum(DP).
ℹ The error occurred in group 1: snp_id = "1_100003004_A_G", CHROM = 1, POS = 100003004, cM = "128.866327498673", REF = "A", ALT = "G", GT = "1|0", gene = "SLC35A3".
Caused by error in sum():
! invalid 'type' (character) of argument

teng-gao · 2022-10-31T17:32:40Z

The problem seems to be with the formatting of the allele count dataframe. Could you try:

allele_data = fread("TNBC1_allele_counts.tsv.gz")
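
For anyone hitting the same error: the log above reports `invalid 'type' (character) of argument` in `sum(DP)`, which suggests the count columns were not read as numbers. Below is a minimal sketch of how one might check this after loading with `fread`. It assumes the data.table package is installed, and it uses a temporary file built from the three example rows posted earlier in the thread (a stand-in for illustration, not the real TNBC1_allele_counts.tsv.gz). The `col_classes` helper variable is hypothetical.

```r
library(data.table)

# Header plus three rows copied from the example table posted in this thread,
# written to a temporary tab-separated file (illustration only).
header <- c("cell", "snp_id", "CHROM", "POS", "cM", "REF", "ALT", "AD", "DP", "GT", "gene")
rows <- list(
  c("CGCCAGAGTATCGTGT-1", "1_818802_A_G", "1", "818802", "0.488775939022764", "A", "G", "0", "1", "0|1", "FAM87B"),
  c("TCTTAGTCAGCAGGAT-1", "1_818802_A_G", "1", "818802", "0.488775939022764", "A", "G", "0", "1", "0|1", "FAM87B"),
  c("TTTCGATGTTGGGCCT-1", "1_818802_A_G", "1", "818802", "0.488775939022764", "A", "G", "1", "1", "0|1", "FAM87B")
)
tmp <- tempfile(fileext = ".tsv")
writeLines(
  c(paste(header, collapse = "\t"),
    vapply(rows, paste, character(1), collapse = "\t")),
  tmp
)

# Read the table with fread, stating the tab separator explicitly.
allele_data <- fread(tmp, sep = "\t")

# Inspect the class of every column; AD and DP should be numeric, not character.
col_classes <- sapply(allele_data, function(x) class(x)[1])
print(col_classes)

# Fail early, before a long numbat run, if the counts are not numeric.
stopifnot(is.numeric(allele_data$AD), is.numeric(allele_data$DP))
```

On the real file, the same `sapply(...)` check can be run right after `fread("TNBC1_allele_counts.tsv.gz")`. Note that `fread` reads `.gz` files through the R.utils package, so that package needs to be installed as well.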

Moxinwu · 2022-10-31T23:30:43Z

Great, it seems OK now. Thanks.

teng-gao closed this as completed Nov 1, 2022
