How to re-run from the breakpoint #131

Open
daisyyr opened this issue Jun 29, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

daisyyr commented Jun 29, 2023

Hi,

I ran into a problem while running `run_numbat` on a dataset of around 40,000 cells. The run was slow, and the job was killed at the step "Using 4 CNVs to construct phylogeny" when it hit the time limit of our Slurm system (3 days). When I reran the script, it restarted from the beginning instead of resuming from where it stopped. Is there any way to continue the run from where it left off?
```r
if (!dir.exists(paste0(numbat_res_base, selected_patient, '/run_numbat_out'))) {
  dir.create(paste0(numbat_res_base, selected_patient, '/run_numbat_out'))
}

out = run_numbat(
  as(mat_merge, "CsparseMatrix"), # gene x cell integer UMI count matrix has to be a sparse matrix
  ref_internal, # reference expression profile, a gene x cell type normalized expression level matrix, made above
  df_allele_merge, # allele dataframe generated by pileup_and_phase script from the API
  genome = "hg38", # to change if using a different genome
  t = 1e-5,
  ncores = 20,
  ncores_nni = 20,
  plot = TRUE,
  min_LLR = 3,
  out_dir = paste0(numbat_res_base, selected_patient, '/run_numbat_out') # where you want to save the output
)
```

teng-gao added the enhancement label Jul 6, 2023

teng-gao (Collaborator) commented Jul 6, 2023

Hi @daisyyr ,

Thanks for the issue. 40k cells is a record for us. Are these all from the same patient? Unfortunately, there is currently no built-in mechanism for resuming a Numbat run from where it left off. The only workarounds are running with more threads or splitting the cells by sample. Installing the most recent scistreer (v1.2.0) may also help, since it speeds up the phylogeny step.

Best,
Teng
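
For readers who want to try the "split the cells by sample" workaround, here is a minimal sketch in base R. It is not part of Numbat: `split_by_sample` and `sample_of_cell` are hypothetical names, and the sketch assumes the allele dataframe has a `cell` column holding the same barcodes as the count matrix column names. The toy data only stands in for `mat_merge` and `df_allele_merge`.

```r
# Hypothetical helper (not a Numbat function): split a gene x cell count
# matrix and a per-cell allele data frame into one chunk per sample.
# `sample_of_cell` is an assumed named character vector: barcode -> sample.
split_by_sample <- function(count_mat, df_allele, sample_of_cell) {
  cells_by_sample <- split(names(sample_of_cell), unname(sample_of_cell))
  lapply(cells_by_sample, function(cells) {
    # Keep only barcodes that are actually present in the matrix
    cells <- intersect(cells, colnames(count_mat))
    list(
      count_mat = count_mat[, cells, drop = FALSE],
      # Assumes the allele data frame identifies cells in a `cell` column
      df_allele = df_allele[df_allele$cell %in% cells, , drop = FALSE]
    )
  })
}

# Toy data standing in for mat_merge and df_allele_merge
count_mat <- matrix(
  1:12, nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)
df_allele <- data.frame(
  cell = c("cell1", "cell2", "cell3", "cell4", "cell4"),
  AD = c(1, 0, 2, 1, 3)
)
sample_of_cell <- c(cell1 = "A", cell2 = "A", cell3 = "B", cell4 = "B")

chunks <- split_by_sample(count_mat, df_allele, sample_of_cell)
```

Each element of `chunks` could then be passed to its own `run_numbat` call with a separate `out_dir`, for example as one Slurm job per sample, so that no single job has to fit all 40k cells into the time limit.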

Terkild commented Mar 13, 2024

I had a very similar issue: an unintended shutdown after `run_numbat` had been running for more than 2 days (35k cells). I therefore modified the code to allow resuming from the latest saved stop point (based on the output files already written). As an added benefit, it also allows adding further iterations if the previous number was not sufficient to reach convergence, without having to start over.

It may not be the most elegant fix, but it works, and my interrupted run has now finished. Since I have never really done pull requests, and this may not be the "optimal" way to do it, I have just placed the forked repository here for now:
https://github.com/Terkild/numbat
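
The general idea of resuming from saved output can be illustrated with a generic checkpoint pattern. The sketch below is not the code from the fork and does not use Numbat's real output files: `run_with_checkpoints`, `step_fun`, and the `checkpoint_<i>.rds` file names are all hypothetical, chosen only to show how a loop can skip iterations whose results are already on disk and how raising `max_iter` adds iterations to a finished run.

```r
# Generic checkpoint/resume pattern (illustrative only, not the fork's code).
# `step_fun(state, i)` is a hypothetical function that computes iteration i
# from the previous state.
run_with_checkpoints <- function(step_fun, init, max_iter, out_dir) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  ckpt <- function(i) file.path(out_dir, sprintf("checkpoint_%d.rds", i))

  # Find the last iteration whose checkpoint file already exists
  paths <- vapply(seq_len(max_iter), ckpt, character(1))
  done <- which(file.exists(paths))
  start <- if (length(done) > 0) max(done) else 0
  state <- if (start > 0) readRDS(ckpt(start)) else init

  for (i in seq_len(max_iter)) {
    if (i <= start) next  # already computed in an earlier run
    state <- step_fun(state, i)
    # Write to a temporary file first so an interrupted save
    # never leaves a truncated checkpoint under the final name
    tmp <- paste0(ckpt(i), ".tmp")
    saveRDS(state, tmp)
    file.rename(tmp, ckpt(i))
  }
  state
}

# Toy usage: the "work" of iteration i is just adding i to a running total
out_dir <- file.path(tempdir(), "resume_demo")
unlink(out_dir, recursive = TRUE)
add_i <- function(state, i) state + i

first <- run_with_checkpoints(add_i, init = 0, max_iter = 2, out_dir = out_dir)
# A second call with a larger max_iter picks up after iteration 2
resumed <- run_with_checkpoints(add_i, init = 0, max_iter = 4, out_dir = out_dir)
```

The second call reads `checkpoint_2.rds` and only runs iterations 3 and 4. A real implementation would also need to check that the inputs and parameters match those of the interrupted run before reusing its files.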

teng-gao (Collaborator) commented

@Terkild Thank you for sharing. Will review this soon.
