chr prefix #129

Open
hidehitofukushima opened this issue Mar 6, 2024 · 0 comments

Hi.
I recently started using Battenberg and have run into a problem I can't resolve, so I'd appreciate your help.

I used the Docker image quay.io/wtsicgp/cgpbattenberg:3.7.1, pulled it as a Singularity image (singularity pull docker://quay.io/wtsicgp/cgpbattenberg:3.7.1), and ran the command below, which produced the following errors.

####################################################################################
#$ -S /bin/bash
#$ -cwd
#$ -pe def_slot 16
#$ -l s_vmem=32G

singularity exec -e --bind /share/xxx/:/share/xxx/ /home/xxxx/database/images/cgpbattenberg_3.7.1.sif \
    battenberg.pl \
    -o ${OUTPUT_DIR} \
    -r ${FASTA}.fai \
    -tb ${TUMOR_BAM} \
    -nb ${NORMAL_BAM} \
    -e /home/xxxx/BATTENBERG/tmp2/reference_hg38/impute_info.txt \
    -u /home/xxxx/BATTENBERG/tmp2/reference_hg38/1000G_loci_hg38/ \
    -ig /home/xxxx/BATTENBERG/tmp2/reference_hg38/ignore_contigs.wtchr \
    -gc /home/xxxx/BATTENBERG/tmp2/reference_hg38/GC_correction_hg38/ \
    -t 16 \
    -c /home/xxxx/BATTENBERG/tmp2/reference_hg38/probloci/probloci.txt.gz \
    -pr WGS \
    -ge XY \
    -ra 38

xxxxxxxxxxxxxxxx_allelecount.1.err:

  • /opt/wtsi-cgp/bin/alleleCounter -l /home/xxxxx/BATTENBERG/result/sample2/xxxxxxxxxxx/tmpBattenberg/1000genomesloci2012_chr1_split1.txt -b /home/xxxxxx/database/links/xxxxxx/result/wgs/xxxxxxxxxxxxxx/bam/yyyyyyyyyyyyyyyyyyyy/yyyyyyyyyyyyyyyyyyyyyyy.markdup.bam -o /home/xxxx/BATTENBERG/result/sample2/xxxxxxxxxxxxxxxxxx/tmpBattenberg/yyyyyyyyyyyyyyyyyyyyyyyyy_alleleFrequencies_chr1_split1.txt -m 20 -r /home/xxxx/database/reference/Homo_sapiens_assembly38.fasta.fai -d
    Reading locis
    Done reading locis
    Multi pos start:
    1054.64user 14.98system 17:53.85elapsed 99%CPU (0avgtext+0avgdata 4563196maxresident)k
    521476inputs+119296outputs (9major+1137847minor)pagefaults 0swaps

xxxxxxxxxxxxxxxx_runbaflog.0.err:

  • cd /home/xxxx/some_directory/tmpBattenberg
  • /usr/bin/Rscript -e 'library(Battenberg); getBAFsAndLogRs(tumourAlleleCountsFile.prefix="87336744-255b-42a1-a1ff-0eded2655d8f_alleleFrequencies_chr", normalAlleleCountsFile.prefix="dace3db7-74a8-408c-9872-d424ee7429bb_alleleFrequencies_chr", figuresFile.prefix="87336744-255b-42a1-a1ff-0eded2655d8f_", BAFnormalFile="87336744-255b-42a1-a1ff-0eded2655d8f_normalBAF.tab", BAFmutantFile="87336744-255b-42a1-a1ff-0eded2655d8f_mutantBAF.tab", logRnormalFile="87336744-255b-42a1-a1ff-0eded2655d8f_normalLogR.tab", logRmutantFile="87336744-255b-42a1-a1ff-0eded2655d8f_mutantLogR.tab", combinedAlleleCountsFile="87336744-255b-42a1-a1ff-0eded2655d8f_alleleCounts.tab", chr_names=as.vector(c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","X")), g1000file.prefix="/home/xxxx/BATTENBERG/tmp2/reference_hg38/1000G_loci_hg38/1000genomesAlleles2012_chr", minCounts=10, samplename="87336744-255b-42a1-a1ff-0eded2655d8f", seed=1488823153) '
    511.57user 12.36system 10:01.43elapsed 87%CPU (0avgtext+0avgdata 19791232maxresident)k
    785890inputs+5320592outputs (122major+2178229minor)pagefaults 0swaps

xxxxxxxxxxxxxxxx_gc_correct.0.err:

  • cd /home/some_directory/tmpBattenberg
  • /usr/bin/Rscript -e 'library(Battenberg); gc.correct.wgs(Tumour_LogR_file="87336744-255b-42a1-a1ff-0eded2655d8f_mutantLogR.tab", outfile="87336744-255b-42a1-a1ff-0eded2655d8f_mutantLogR_gcCorrected.tab", correlations_outfile="87336744-255b-42a1-a1ff-0eded2655d8f_GCwindowCorrelations.txt", gc_content_file_prefix="/home/xxxx/BATTENBERG/tmp2/reference_hg38/GC_correction_hg38/1000_genomes_GC_corr_chr_",replic_timing_file_prefix=NULL, chrom_names=as.vector(c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","X"))) '
    Warning message:
    The path argument of write_tsv() is deprecated as of readr 1.4.0.
    Please use the file argument instead.
    This warning is displayed once every 8 hours.
    Call lifecycle::last_warnings() to see where this warning was generated.
    220.29user 28.44system 4:26.97elapsed 93%CPU (0avgtext+0avgdata 21400400maxresident)k
    2455614inputs+10494816outputs (3major+3085355minor)pagefaults 0swaps

xxxxxxxxxxxxxxxxxx_imputefromaf.7.err:

  • cd /home/some_directory/result/sample2/hogehoge/tmpBattenberg
  • /usr/bin/Rscript -e 'library(Battenberg); generate.impute.input.wgs(chrom="chr7", tumour.allele.counts.file="87336744-255b-42a1-a1ff-0eded2655d8f_alleleFrequencies_chrchr7.txt", normal.allele.counts.file="dace3db7-74a8-408c-9872-d424ee7429bb_alleleFrequencies_chrchr7.txt", output.file="87336744-255b-42a1-a1ff-0eded2655d8f_impute_input_chrchr7.txt", imputeinfofile="/home/ny1fh/BATTENBERG/tmp2/reference_hg38/impute_info.txt", is.male="TRUE", problemLociFile="/home/ny1fh/BATTENBERG/tmp2/reference_hg38/probloci/probloci.txt.gz", useLociFile=NA, heterozygousFilter=0.1) '
    Error in file(file, "rt") : invalid 'description' argument
    Calls: generate.impute.input.wgs -> read.table -> file
    0.26user 0.04system 0:00.30elapsed 99%CPU (0avgtext+0avgdata 73864maxresident)k
    0inputs+48outputs (0major+18718minor)pagefaults 0swaps

####################################################################################

The allelecount step appears to have completed without errors and produced a complete set of xxxxxx_alleleFrequencies_chrZZZZ.txt files.
The runbaflog .err file and the gc_correct .err file also look fine, but the imputefromaf .err file contains the error shown above.
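For anyone reproducing this, here is a minimal sketch of how the "complete set" of allele-count files can be checked. The helper name missing_allele_files is hypothetical, and the chromosome list (1-22 plus X) and the file-name pattern are taken from the logs above rather than from the Battenberg source.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Battenberg): print every expected
# per-chromosome allele-count file that is missing or empty.
# Assumed pattern, taken from the logs: <sample>_alleleFrequencies_chr<N>.txt
# Usage: missing_allele_files <directory> <sample_name>
missing_allele_files() {
    local dir="$1" sample="$2" chrom f
    for chrom in $(seq 1 22) X; do
        f="${dir}/${sample}_alleleFrequencies_chr${chrom}.txt"
        # -s is true only if the file exists and has a size greater than zero
        [ -s "$f" ] || echo "$f"
    done
}
```

An empty result means all 23 files are present and non-empty; any printed path is a file worth inspecting before looking at the later steps.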

As the runbaflog .err and gc_correct .err files show, the chromosome name vectors (chr_names / chrom_names) initially have no 'chr' prefix:
(chrom_names=as.vector(c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","X")))
However, the 'chrom' argument passed to generate.impute.input.wgs does have the 'chr' prefix (chrom="chr7"), so the function is asked to read 'xxxx_alleleFrequencies_chrchr7.txt' as its input file, whereas my result folder contains 'xxxx_alleleFrequencies_chr7.txt'.
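To make the mismatch concrete, here is a small sketch of how a fixed "_alleleFrequencies_chr" prefix combined with an already-prefixed chromosome name yields the doubled "chrchr7". Both function names are hypothetical, and the name pattern is inferred from the logs above; this is not the actual code in battenberg.pl or the R package.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of the file-name pattern seen in the logs:
#   <sample>_alleleFrequencies_chr<chrom>.txt
allele_file_name() {
    local sample="$1" chrom="$2"
    echo "${sample}_alleleFrequencies_chr${chrom}.txt"
}

# Remove one leading "chr" (if present) so that "7" and "chr7"
# both resolve to the same file name.
strip_chr() {
    echo "${1#chr}"
}
```

With these definitions, `allele_file_name sample 7` gives `sample_alleleFrequencies_chr7.txt`, `allele_file_name sample chr7` gives `sample_alleleFrequencies_chrchr7.txt`, and `allele_file_name sample "$(strip_chr chr7)"` gives the first name again.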

As far as I can tell, battenberg.R defines the chrom_names variable in only one place (lines 145-150), and the subsequent functions are called with each 'chrom' taken from chrom_names, so I don't understand why the names change from having no 'chr' prefix to having one.

For reference, the BAM files are aligned to hg38 and their contig names have the 'chr' prefix.
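Since the prefix question depends on which input files use which naming style, a quick way to compare them is sketched below. The helper contig_style is hypothetical; it only looks at the first whitespace-separated column of a file. That column holds the contig name in a .fai index, and I am assuming the same is true of the first column of impute_info.txt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the names in the first column of a
# whitespace-delimited file use the "chr" prefix.
# Prints "chr" if at least one name starts with "chr", otherwise "plain".
# Usage: contig_style <file>
contig_style() {
    if awk '{print $1}' "$1" | grep -q '^chr'; then
        echo "chr"
    else
        echo "plain"
    fi
}
```

Running it on ${FASTA}.fai and on impute_info.txt and getting different answers would at least show that the inputs disagree about naming, although that alone would not prove it is the cause of the error above.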

I also tried recent versions of the distributed source, including the dev branch with the Battenberg R package, as well as all previous images from quay.io. They all got stuck in the getBAFsAndLogRs function with empty xxxxx_mutantBAF.tab files (as someone mentioned in another issue), even when I gave them more cores and memory (e.g. 128G + 10 cores). So I gave up on those versions.
Any help would be appreciated.
Thanks in advance!
