run_numbat seems to freeze after starting iteration 1 #20

Closed
liviuspenter opened this issue Feb 18, 2022 · 8 comments

@liviuspenter

Hello,

I wanted to try out numbat on one of our datasets, and I was able to get the pileup and phasing to work.
When I run run_numbat(), it gets to iteration 1 and then seems to freeze without reporting any errors
(R stops running, nothing happens). It does produce output files up to this stage.

I don't know how to debug this problem and would appreciate any insight from you.

Thanks,

livius

> out = run_numbat(counts, ref_hca, df_allele, gtf_hg38, genetic_map_hg38, out_dir = './data/Pool96_30/numbat/')
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 10
init_k = 3
sample_size = 1e+05
max_cost = 1295.1
max_iter = 2
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 50
max_entropy = 0.6
skip_nj = FALSE
exclude_normal = FALSE
diploid_chroms =
ncores = 30
common_diploid = TRUE
Input metrics:
4317 cells
Approximating initial clusters using smoothed expression ..
number of genes left: 9756
Iteration 1

@teng-gao
Collaborator

Thanks for reporting this - could you show your log.txt in the output directory? That should show additional error messages.

@liviuspenter
Author

It's not very informative:

INFO [2022-02-18 14:24:53] Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 10
init_k = 3
sample_size = 1e+05
max_cost = 1295.1
max_iter = 2
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 50
max_entropy = 0.6
skip_nj = FALSE
exclude_normal = FALSE
diploid_chroms =
ncores = 30
common_diploid = TRUE
Input metrics:
4317 cells
INFO [2022-02-18 14:24:53] Approximating initial clusters using smoothed expression ..
INFO [2022-02-18 14:39:35] running hclust...
INFO [2022-02-18 14:39:51] Iteration 1

@teng-gao
Collaborator

Interesting .. first time seeing this. What files does it produce so far?

@liviuspenter
Author

I can see gexp_roll_wide.tsv.gz, hc.rds, hc_nodes.rds and log.txt

@teng-gao
Collaborator

I see. Looks like the HMM did not run. Could you try to read in these output files and step through these lines manually?

https://github.com/kharchenkolab/numbat/blob/main/R/main.R#L114-L148

Probably some kind of error happened in between.

@liviuspenter
Author

It's difficult to run the lines manually. It's not finding various functions like make_group_bulks() or log_error().
One problem was that it couldn't find mclapply(), which I was able to correct by manually loading the parallel package.

I did import the library using library(numbat).

This is my session info:

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin20.6.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /usr/local/Cellar/r/4.1.2/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_1.0.7 SeuratObject_4.0.4 Seurat_4.0.6 numbat_0.1.0

loaded via a namespace (and not attached):
[1] utf8_1.2.2 reticulate_1.22 R.utils_2.11.0 tidyselect_1.1.1 htmlwidgets_1.5.4
[6] grid_4.1.2 BiocParallel_1.28.3 Rtsne_0.15 devtools_2.4.3 munsell_0.5.0
[11] codetools_0.2-18 ica_1.0-2 future_1.23.0 miniUI_0.1.1.1 withr_2.4.3
[16] colorspace_2.0-2 Biobase_2.54.0 logger_0.2.2 rstudioapi_0.13 stats4_4.1.2
[21] ROCR_1.0-11 tensor_1.5 listenv_0.8.0 MatrixGenerics_1.6.0 GenomeInfoDbData_1.2.7
[26] polyclip_1.10-0 farver_2.1.0 rprojroot_2.0.2 parallelly_1.30.0 vctrs_0.3.8
[31] treeio_1.18.1 generics_0.1.1 TH.data_1.1-0 R6_2.5.1 GenomeInfoDb_1.30.0
[36] graphlayouts_0.8.0 bitops_1.0-7 spatstat.utils_2.3-0 cachem_1.0.6 gridGraphics_0.5-1
[41] DelayedArray_0.20.0 assertthat_0.2.1 promises_1.2.0.1 BiocIO_1.4.0 scales_1.1.1
[46] multcomp_1.4-17 ggraph_2.0.5 gtable_0.3.0 extraDistr_1.9.1 globals_0.14.0
[51] processx_3.5.2 goftest_1.2-3 tidygraph_1.2.0 sandwich_3.0-1 rlang_0.4.12
[56] splines_4.1.2 rtracklayer_1.54.0 lazyeval_0.2.2 spatstat.geom_2.3-1 yaml_2.2.1
[61] reshape2_1.4.4 abind_1.4-5 httpuv_1.6.4 usethis_2.1.5 tools_4.1.2
[66] ggplotify_0.1.0 ggplot2_3.3.5 ellipsis_0.3.2 spatstat.core_2.3-2 RColorBrewer_1.1-2
[71] BiocGenerics_0.40.0 sessioninfo_1.2.2 ggridges_0.5.3 Rcpp_1.0.7 plyr_1.8.6
[76] zlibbioc_1.40.0 purrr_0.3.4 RCurl_1.98-1.5 ps_1.6.0 prettyunits_1.1.1
[81] rpart_4.1-15 deldir_1.0-6 pbapply_1.5-0 viridis_0.6.2 cowplot_1.1.1
[86] S4Vectors_0.32.3 zoo_1.8-9 SummarizedExperiment_1.24.0 ggrepel_0.9.1 cluster_2.1.2
[91] fs_1.5.2 magrittr_2.0.1 data.table_1.14.2 scattermore_0.7 lmtest_0.9-39
[96] RANN_2.6.1 mvtnorm_1.1-3 parallelDist_0.2.6 fitdistrplus_1.1-6 matrixStats_0.61.0
[101] pkgload_1.2.4 patchwork_1.1.1 mime_0.12 xtable_1.8-4 XML_3.99-0.8
[106] IRanges_2.28.0 gridExtra_2.3 testthat_3.1.1 compiler_4.1.2 tibble_3.1.6
[111] KernSmooth_2.23-20 crayon_1.4.2 R.oo_1.24.0 htmltools_0.5.2 ggfun_0.0.5
[116] mgcv_1.8-38 later_1.3.0 tidyr_1.1.4 aplot_0.1.2 libcoin_1.0-9
[121] RcppParallel_5.1.5 DBI_1.1.1 tweenr_1.0.2 MASS_7.3-54 Matrix_1.4-0
[126] cli_3.1.0 R.methodsS3_1.8.1 igraph_1.2.10 GenomicRanges_1.46.1 pkgconfig_2.0.3
[131] GenomicAlignments_1.30.0 coin_1.4-2 plotly_4.10.0 spatstat.sparse_2.1-0 ggtree_3.2.1
[136] XVector_0.34.0 yulab.utils_0.0.4 stringr_1.4.0 callr_3.7.0 digest_0.6.29
[141] sctransform_0.3.2 RcppAnnoy_0.0.19 spatstat.data_2.1-2 Biostrings_2.62.0 leiden_0.3.9
[146] tidytree_0.3.7 dendextend_1.15.2 uwot_0.1.11 curl_4.3.2 restfulr_0.0.13
[151] shiny_1.7.1 Rsamtools_2.10.0 modeltools_0.2-23 rjson_0.2.20 lifecycle_1.0.1
[156] nlme_3.1-153 jsonlite_1.7.2 desc_1.4.0 viridisLite_0.4.0 fansi_0.5.0
[161] pillar_1.6.4 lattice_0.20-45 fastmap_1.1.0 httr_1.4.2 pkgbuild_1.3.0
[166] survival_3.2-13 remotes_2.4.2 glue_1.6.0 png_0.1-7 ggforce_0.3.3
[171] stringi_1.7.6 caTools_1.18.2 memoise_2.0.1 irlba_2.3.5 future.apply_1.8.1
[176] ape_5.5

@teng-gao
Collaborator

Right. Please keep in mind that library(numbat) does not attach the packages that numbat depends on to your search path, and it does not expose numbat's internal functions. If you do devtools::load_all('local repo path'), you should have access to all internal functions and dependencies.
Happy to connect via zoom if you're still experiencing trouble.
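
For anyone stepping through package code by hand, the distinction above can be sketched in a few lines of R. The snippet uses only the base parallel package so that it runs without numbat; the commented-out numbat line is a hypothetical illustration and assumes make_group_bulks is an internal (non-exported) function in the installed version.

```r
# Loading a namespace makes a package usable but does not attach it,
# so its functions are not found by bare name until library() is called.
requireNamespace("parallel", quietly = TRUE)
attached <- "package:parallel" %in% search()  # FALSE unless parallel was attached earlier

# Exported functions are always reachable with an explicit `pkg::` prefix.
squares <- parallel::mclapply(1:3, function(i) i^2, mc.cores = 1)

# Non-exported functions need `:::` or getFromNamespace(). Hypothetical example,
# assuming make_group_bulks is internal to the installed numbat version:
#   make_group_bulks <- utils::getFromNamespace("make_group_bulks", "numbat")
```

This would account for mclapply() only being found after the parallel package was loaded manually, while make_group_bulks() and log_error() stayed out of reach.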

@liviuspenter
Author

liviuspenter commented Feb 18, 2022

Ok, I got to the part where it runs make_group_bulks() and it throws the following error (see below).
I would be happy to connect over zoom - I just sent you an email.

# extract cell groupings

subtrees = purrr::keep(nodes, function(x) x$size > 10)
clones = purrr::keep(subtrees, function(x) x$sample %in% 1:3)

normal_cells = c()
bulk_subtrees = make_group_bulks(
groups = subtrees,
count_mat = count_mat,
df_allele = df_allele,
lambdas_ref = lambdas_ref,
gtf_transcript = gtf_transcript,
genetic_map = genetic_map,
min_depth = min_depth,
ncores = 12)
Error: arrange() failed at implicit mutate() step.

  • Problem with mutate() column ..1.
    ..1 = CHROM.
    x object 'CHROM' not found
    Run rlang::last_error() to see where the error occurred.
    In addition: Warning message:
    In mclapply(groups, mc.cores = ncores, function(g) { :

Error: arrange() failed at implicit mutate() step.

  • Problem with mutate() column ..1.
    ..1 = CHROM.
    x object 'CHROM' not found
    Run rlang::last_error() to see where the error occurred.
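
The mclapply() warning in the output above may explain why the original run showed no visible error. When mc.cores is greater than 1, parallel::mclapply() does not stop on a failing job; it stores a "try-error" object in that job's slot and emits only a warning. Below is a minimal sketch of checking for such failures. The worker function and the helper stop_on_worker_error are hypothetical names introduced for illustration and are not part of numbat; try(silent = TRUE) is used to produce the same kind of "try-error" object without forking.

```r
# Hypothetical worker standing in for the per-group function; group "b" fails.
worker <- function(g) {
  if (g == "b") stop("object 'CHROM' not found")
  toupper(g)
}

# Hypothetical helper (not part of numbat): raise the first worker failure
# as a regular error instead of letting it pass silently.
stop_on_worker_error <- function(results) {
  failed <- vapply(results, inherits, logical(1), what = "try-error")
  if (any(failed)) {
    first <- which(failed)[1]
    cond <- attr(results[[first]], "condition")
    stop("worker ", first, " failed: ", conditionMessage(cond))
  }
  results
}

# Each failed job becomes a "try-error" object rather than stopping the loop.
results <- lapply(c("a", "b", "c"), function(g) try(worker(g), silent = TRUE))

# Capture the surfaced error message as a string for inspection.
outcome <- tryCatch(stop_on_worker_error(results), error = conditionMessage)
print(outcome)
```

Another option is to rerun the failing step with mc.cores = 1. In that case mclapply() falls back to lapply(), so the error is raised directly with a normal traceback.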
