Computational requirements and time #4

Closed
Tom-Jenkins opened this issue May 19, 2020 · 3 comments

Comments

@Tom-Jenkins

Hi, thank you for sharing this program! I was wondering if you can give any estimates of the computational resources and time required to run the single command program?

@mawad89
Collaborator

mawad89 commented May 20, 2020

Hi Tom,
It depends on the number of drafts, the genome size, and the amount of data. However, I can say that the MDM and CCM modules don't need much time or computational resources. LGAM, on the other hand, needs to map the reads against each draft, using bwa in single command mode or another mapping tool if you use the manual steps (I always recommend the manual steps, because you can run the mappings in parallel). After that, the computational requirements of the read separator depend on the fq/fa size. Finally, each linkage group is assembled, and the cost of that depends on the software used for assembly.
So in single command mode you need the time and computational resources to run bwa on all drafts one after another, and then to run the assembly for each linkage group one after another. If you use the step-by-step mode, you can run these steps in parallel.
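
To illustrate the difference between mapping the drafts one after another and mapping them in parallel, here is a minimal Python sketch. It is not part of the pipeline and does not reproduce what the single command does internally. The function names, the even thread split, the `.sam` output naming, and the use of `bwa mem -x pacbio` are all assumptions made for illustration, and each draft is assumed to have been indexed with `bwa index` beforehand.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_mapping_commands(drafts, reads, total_threads):
    """Return one (command, output_path) pair per draft.

    Hypothetical helper: threads are split evenly across the drafts so
    that all mapping jobs can run at the same time on one node.
    """
    if not drafts:
        raise ValueError("at least one draft assembly is required")
    threads_per_job = max(1, total_threads // len(drafts))
    jobs = []
    for draft in drafts:
        # Assumed naming scheme: draft1.fa -> draft1.sam
        out_sam = Path(draft).with_suffix(".sam")
        cmd = ["bwa", "mem", "-x", "pacbio",
               "-t", str(threads_per_job), str(draft), str(reads)]
        jobs.append((cmd, out_sam))
    return jobs


def run_job(job):
    """Run one bwa mapping job and write its SAM output to a file."""
    cmd, out_sam = job
    with open(out_sam, "w") as handle:
        subprocess.run(cmd, stdout=handle, check=True)
    return out_sam


def map_in_parallel(drafts, reads, total_threads=16):
    """Map the reads against every draft concurrently (one job per draft)."""
    jobs = build_mapping_commands(drafts, reads, total_threads)
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        return list(pool.map(run_job, jobs))
```

Calling `map_in_parallel(["draft1.fa", "draft2.fa", "draft3.fa"], "reads.fq")` would start three bwa jobs with 5 threads each. On a cluster, submitting one job per draft to the scheduler would achieve the same effect without sharing a single node.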

@Tom-Jenkins
Author

Hi Mohamed,

Thank you for your reply, that's helpful. Yes, I plan to run the pipeline step-by-step, but I first want to run the single command to get a feel for whether the software is going to be useful for my genome assembly. The single command is easier for this because I have to submit jobs to a high-performance cluster to run anything.

As an example, if you had three draft assemblies, a genome size of 1 Gbp, and 50 G data (PacBio), are you able to give me a rough estimate of time to completion based on a computer capacity of 16 threads and 128 GB memory?

Many thanks,
Tom

@mawad89
Collaborator

mawad89 commented May 22, 2020

Sorry for the late response.
It will need around 3 to 4 days before the assembly process starts,
and the duration of the assembly process will depend on the number of linkage groups.

However, I always find that the most important part is to run it through to the end of CCM with different criteria to find the best scaffolding approach. Then you can run LGAM manually.

In the next few days I will work on a resume option to make this easier.
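
As a rough way to reason about the estimate above, the total time can be treated as the pre-assembly time plus the assembly time of the linkage groups. The sketch below is only a back-of-the-envelope model: the function name is hypothetical, and the per-group durations are placeholders, not measurements from the tool.

```python
def estimate_days(pre_assembly_days, per_group_assembly_days, parallel=False):
    """Estimate total run time in days.

    pre_assembly_days: time until the assembly step starts.
    per_group_assembly_days: one assembly duration per linkage group.
    parallel: if True, assume all groups are assembled at the same time,
              so only the slowest group counts. Otherwise they run one
              after another and the durations add up.
    """
    if parallel:
        assembly = max(per_group_assembly_days, default=0.0)
    else:
        assembly = sum(per_group_assembly_days)
    return pre_assembly_days + assembly


# Placeholder numbers: 3.5 days before assembly, three linkage groups.
sequential = estimate_days(3.5, [0.5, 0.25, 0.25])
concurrent = estimate_days(3.5, [0.5, 0.25, 0.25], parallel=True)
```

With these placeholder durations the sequential estimate is 4.5 days and the parallel one is 4.0 days. The gap grows with the number of linkage groups, which is why the assembly time cannot be given as a single figure.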
