Error: Must group by variables found in `.data`. Column `seg` is not found. Column `sample` is not found. When analyzing two samples bound together #33
Comments
Let me give the complete log. Running under parameters:
Hmm, the last line seems to indicate that the jobs ran out of memory. How much memory did you use for 12 cores? One caveat with the current implementation of Numbat is that it's quite memory-intensive.
124 GB! I thought the last line was due to the error above, raised in one of the cores.
Yes, please check if giving it more memory would solve the issue. We will look into optimizing the memory usage soon.
Thanks.
I'm trying with 20 cores and 32GB mem-per-cpu. I'll let you know
Jose
--------------------------------------
Jose M. Garcia Manteiga PhD
Computational Biologist
Center for Translational Genomics and BioInformatics
Dibit2-Basilica, 4A3
San Raffaele Scientific Institute
Via Olgettina 58, 20132 Milano (MI), Italy
Tel: +39-02-2643-9211
Hello @josegarciamanteiga,
We made some improvements to the runtime and memory usage in Version 0.1.3, which should be overall twice as fast and less memory-intensive. Do let me know if the memory problem persists, since there's still a step that uses mclapply for parallelization.
Thanks,
Teng
Thanks!
I solved it by requesting more memory in total rather than per CPU, letting the job manage it.
But I will next try the new version for new results.
Best
Jose
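Since the thread mentions `mem-per-cpu` (a SLURM flag), the change Jose describes can be sketched as a SLURM batch header. The flag names are SLURM's own; the script name and the sizes below are illustrative assumptions, not values from the thread:

```shell
#!/bin/bash
# Illustrative SLURM header: request one shared memory pool for the job
# (--mem) rather than a per-CPU quota (--mem-per-cpu), so forked
# mclapply workers draw from the same total.
#SBATCH --cpus-per-task=20
#SBATCH --mem=128G            # total for the job; size is illustrative
# instead of: #SBATCH --mem-per-cpu=32G
Rscript run_numbat.R          # hypothetical driver script name
```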
Hi again,
As you suggested, I ran pileup_and_phase.R with the two samples joined by ",". Then I used cbind and rbind on the count matrices and allele dataframes, after substituting the "-1" suffix on the barcodes in the second sample's files with "-2".
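A minimal sketch of that barcode-renaming and binding step, with hypothetical object names (count_mat_1, df_allele_1, and the `cell` column are illustrative, not taken from the thread):

```r
# Sketch of combining two samples for a joint Numbat run.
# Object and column names here are assumptions for illustration.

# Re-suffix the second sample's barcodes from "-1" to "-2"
colnames(count_mat_2) <- sub("-1$", "-2", colnames(count_mat_2))
df_allele_2$cell <- sub("-1$", "-2", df_allele_2$cell)

# Bind gene-by-cell count matrices column-wise,
# and allele dataframes row-wise
count_mat <- cbind(count_mat_1, count_mat_2)
df_allele <- rbind(df_allele_1, df_allele_2)
```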
The code was running smoothly up to the fifth 'Retesting CNVs..' step, where it threw this:
Error: Must group by variables found in `.data`.
* Column `seg` is not found.
* Column `sample` is not found.
Backtrace: ... %>% ... %>% ... (truncated)
Thanks again!
Jose
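For context, this error message comes from dplyr's group_by() when a grouping column is absent from the input dataframe — here, `seg` and `sample` are evidently missing from one of the bound dataframes. A minimal reproduction of the same error class (not Numbat code; exact wording varies by dplyr version):

```r
library(dplyr)

df <- data.frame(x = 1:3)   # has no column named "seg"

# group_by() on a missing column raises this class of error:
try(group_by(df, seg))
# Error: Must group by variables found in `.data`.
# Column `seg` is not found.
```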