Contig failing on ctx63 when path colors above 0 used #2

Closed
er432 opened this issue Jun 16, 2014 · 1 comment


er432 commented Jun 16, 2014

I tried running the following:
$MCCORTEX contigs -m 490G -n 12G --colour 1 -p 0:Coelorachis.clean.ctp -p 1:Vossia.k63.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx > Vossia.clean.k63.fa

And I get this:
[16 Jun 2014 13:01:26-cEm][cmd] /programs/mccortex_5_30_14/bin/ctx63 contigs -m 490G -n 12G --colour 1 -p 0:Coelorachis.clean.ctp -p 1:Vossia.k63.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx
[16 Jun 2014 13:01:26-cEm][cwd] /local/workdir/er432/andropogonae/mccortex_out
[16 Jun 2014 13:01:26-cEm][version] ctx=v0.0.3 zlib=1.2.3 htslib=0.2.0-rc8-6-gd49dfa6-dirty ASSERTS=ON CHECKS=ON k=33..63
[16 Jun 2014 13:01:26-cEm][memory] graph: 305GB
[16 Jun 2014 13:01:26-cEm][memory] paths: 49.6GB
[16 Jun 2014 13:01:26-cEm][memory] total: 354.6GB of 504.8GB RAM
[16 Jun 2014 13:01:26-cEm][hashtable] Allocating table with 12,884,901,888 entries, using 192.5GB
[16 Jun 2014 13:01:26-cEm][hashtable]  number of buckets: 268,435,456, bucket size: 48
[16 Jun 2014 13:02:50-cEm][graph] kmer-size: 63; colours: 3; capacity: 12,884,901,888
[16 Jun 2014 13:04:27-cEm][paths] Setting up path store to use 49.6GB main
[16 Jun 2014 13:04:27-cEm] Loading file refAndSamples.basalAndropogonae.inferredEdges.clean.ctx [3 colours] into colours 0-2
[16 Jun 2014 13:04:27-cEm]  2,223,283,362 kmers, 64.2GB filesize
[16 Jun 2014 13:04:27-cEm][CtxLoad] First col 0, into cols 0..2, file has 3 cols: refAndSamples.basalAndropogonae.inferredEdges.clean.ctx
[16 Jun 2014 13:14:42-cEm] Loaded 2,223,283,362 / 2,223,283,362 (100.00%) of kmers parsed
[16 Jun 2014 13:14:42-cEm][hash] buckets: 268,435,456 [2^28]; bucket size: 48; memory: 192.5GB; occupancy: 2,223,283,362 / 12,884,901,888 (17.25%)
[16 Jun 2014 13:14:42-cEm]  collisions  0: 2223283362
[16 Jun 2014 13:14:42-cEm][PathFormat] With 2 files, require 11859397612 tmp memory [0 extra bytes]
[16 Jun 2014 13:14:42-cEm] Loading file Coelorachis.clean.ctp [1 colour] into colour 0
[16 Jun 2014 13:14:42-cEm]  2,039,725,230 paths, 38.6GB path-bytes, 27,492,743 kmers, 39.2GB filesize
[16 Jun 2014 13:16:45-cEm][paths] Setup tmp path memory to use 11GB [remaining 38.6GB]
[16 Jun 2014 13:16:45-cEm] Loading file Vossia.k63.clean.ctp [1 colour] with colour filter: 0 into colour 1
[16 Jun 2014 13:16:45-cEm]  633,841,256 paths, 11GB path-bytes, 25,553,986 kmers, 11.6GB filesize
[src/kmer/path_store.c:186] Error path_store_add_packed(): Out of memory for paths
[16 Jun 2014 13:18:45-cEm] Fatal Error

Running with only the path file for Vossia, as follows:
$MCCORTEX contigs -m 490G -n 12G --ncontigs 1000000 --print --colour 1 -p 1:Vossia.k63.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx > Vossia.clean.k63.fa

Gives this:
[16 Jun 2014 12:43:19-cEm][cmd] /programs/mccortex_5_30_14/bin/ctx63 contigs -m 490G -n 12G --ncontigs 1000000 --colour 1 -p 1:Vossia.k63.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx
[16 Jun 2014 12:43:19-cEm][cwd] /local/workdir/er432/andropogonae/mccortex_out
[16 Jun 2014 12:43:19-cEm][version] ctx=v0.0.3 zlib=1.2.3 htslib=0.2.0-rc8-6-gd49dfa6-dirty ASSERTS=ON CHECKS=ON k=33..63
[16 Jun 2014 12:43:19-cEm][memory] graph: 305GB
[16 Jun 2014 12:43:19-cEm][memory] paths: 11GB
[16 Jun 2014 12:43:19-cEm][memory] total: 316GB of 504.8GB RAM
[16 Jun 2014 12:43:19-cEm][hashtable] Allocating table with 12,884,901,888 entries, using 192.5GB
[16 Jun 2014 12:43:19-cEm][hashtable]  number of buckets: 268,435,456, bucket size: 48
[16 Jun 2014 12:44:45-cEm][graph] kmer-size: 63; colours: 3; capacity: 12,884,901,888
[16 Jun 2014 12:46:28-cEm][paths] Setting up path store to use 11GB main
[16 Jun 2014 12:46:28-cEm] Loading file refAndSamples.basalAndropogonae.inferredEdges.clean.ctx [3 colours] into colours 0-2
[16 Jun 2014 12:46:28-cEm]  2,223,283,362 kmers, 64.2GB filesize
[16 Jun 2014 12:46:28-cEm][CtxLoad] First col 0, into cols 0..2, file has 3 cols: refAndSamples.basalAndropogonae.inferredEdges.clean.ctx
[16 Jun 2014 12:57:16-cEm] Loaded 2,223,283,362 / 2,223,283,362 (100.00%) of kmers parsed
[16 Jun 2014 12:57:16-cEm][hash] buckets: 268,435,456 [2^28]; bucket size: 48; memory: 192.5GB; occupancy: 2,223,283,362 / 12,884,901,888 (17.25%)
[16 Jun 2014 12:57:16-cEm]  collisions  0: 2223283362
[16 Jun 2014 12:57:16-cEm][PathFormat] With 1 files, require 0 tmp memory [0 extra bytes]
[16 Jun 2014 12:57:16-cEm] Loading file Vossia.k63.clean.ctp [1 colour] with colour filter: 0 into colour 1
[16 Jun 2014 12:57:16-cEm]  633,841,256 paths, 11GB path-bytes, 25,553,986 kmers, 11.6GB filesize
[src/kmer/path_format.c:476] Assert Failed paths_format_merge(): hdr->num_path_bytes == 0 || pstore->tmpstore != ((void *)0)
[16 Jun 2014 12:57:16-cEm] Assert Error

However, I can successfully run when I only try to get contigs for color 0, as follows:
$MCCORTEX contigs -m 490G -n 12G --ncontigs 1000000 --print --colour 0 -p 0:Coelorachis.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx > Coelorachis.clean.k63.fa
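
The memory figures in the logs above can be cross-checked with a little arithmetic. This is only a sketch: it assumes that "GB" in the log means 2^30 bytes, which the output itself does not state, and the function names `occupancy_pct` and `to_gib` are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the log lines above.
buckets=268435456
bucket_size=48
kmers=2223283362
tmp_bytes=11859397612

# Hash table capacity = number of buckets * bucket size.
capacity=$((buckets * bucket_size))

# Occupancy as a percentage of capacity, to two decimal places.
occupancy_pct() {
  awk -v k="$1" -v c="$2" 'BEGIN { printf "%.2f", 100 * k / c }'
}

# Bytes to units of 2^30 bytes (the assumed meaning of "GB" in the log).
to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.1f", b / 1073741824 }'
}

echo "capacity:  $capacity entries"
echo "occupancy: $(occupancy_pct "$kmers" "$capacity")%"
echo "tmp paths: $(to_gib "$tmp_bytes") (log reports 11GB)"
```

If the assumption holds, the capacity and occupancy should agree with the [hashtable] and [hash] lines, and the temporary path memory with the [paths] "Setup tmp path memory" line.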

noporpoise (Member) commented

Sorry it took so long to get to this. You should be able to run each sample one at a time:

$MCCORTEX contigs -m 490G -n 12G --ncontigs 1000000 --print --colour 0 -p Coelorachis.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx > Coelorachis.clean.k63.fa

$MCCORTEX contigs -m 490G -n 12G --ncontigs 1000000 --print --colour 1 -p 1:Vossia.k63.clean.ctp refAndSamples.basalAndropogonae.inferredEdges.clean.ctx > Vossia.clean.k63.fa

But it looks like you're hitting a bug. We have had several similar issues, so we have been rewriting the way we store and represent paths (.ctp files). This work should allow more flexibility and better interoperability of graph annotation files between programs. The develop branch includes this work, and we'll make a release from it soon. However, this means that the file format for .ctp files has changed and you will need to regenerate them. I'm sorry this has been a hassle for you - McCortex is still under development and your feedback is helpful.

Isaac
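
The one-sample-at-a-time approach described above could be wrapped in a small script. This is a sketch only: it reuses the exact flags and file names from the two commands in the reply, prints the commands by default rather than running them, and stops at the first failure when run for real. The `RUN` switch and the function names `contigs_cmd` and `run_sample` are hypothetical, not part of McCortex.

```shell
#!/bin/sh
# Graph file name taken from the commands above.
GRAPH=refAndSamples.basalAndropogonae.inferredEdges.clean.ctx

# Build one contigs command line.
# $1 = colour, $2 = argument to -p, $3 = output FASTA file
contigs_cmd() {
  printf '%s\n' "\$MCCORTEX contigs -m 490G -n 12G --ncontigs 1000000 --print --colour $1 -p $2 $GRAPH > $3"
}

# Print the command (default) or execute it when RUN=1 (hypothetical switch).
run_sample() {
  cmd=$(contigs_cmd "$1" "$2" "$3")
  if [ "${RUN:-0}" = "1" ]; then
    eval "$cmd" || return 1   # report failure so later samples are skipped
  else
    printf '%s\n' "$cmd"
  fi
}

# One sample per invocation; the second runs only if the first succeeds.
run_sample 0 Coelorachis.clean.ctp Coelorachis.clean.k63.fa &&
run_sample 1 1:Vossia.k63.clean.ctp Vossia.clean.k63.fa
```

Running the script as-is only prints the two commands for inspection; setting `RUN=1` with `MCCORTEX` defined would execute them in order.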

noporpoise added a commit that referenced this issue May 28, 2018