Wdlupdate (#6)
* added cram-to-bam. updated pairedtoubam to use gatk4

* Update README.md

removed gatk software requirement because repo will contain more than one wdl which may use different versions.

* Update README.md

* Wdl now uses a readgroup tsv file as input. Added task to compose a file containing a list of the generated ubams

* minor

* minor

* minor edits

* corrected memory placement

* minor edits

* added bam-to-unmapped-bams wdl

* fixed comment number

* changed to use latest gatk docker

* fastq to bam now uses arrays as input

* updated descriptor for paired fastq to bam

* updated input in description

* added a firecloud version for fastq to Ubam

* minor format changes. changed pairedfastq2Ubam docker to gcr

* Minor update to ReadMe, added default docker to cram2bam

* decreased mem size in cram2bam to reduce cost
bshifaw committed Jul 9, 2019
1 parent d2be83f commit 03b6522
Showing 4 changed files with 62 additions and 2 deletions.
9 changes: 9 additions & 0 deletions README.md
@@ -43,6 +43,15 @@ This WDL converts BAM to unmapped BAMs
#### Outputs
- Sorted Unmapped BAMs

### interleaved-fastq-to-paired-fastq :
This WDL takes a single interleaved (R1+R2) FASTQ file and splits it into separate R1 and R2 FASTQ files (i.e. paired FASTQ). Paired FASTQ files are the input format for the tool that generates unmapped BAMs (the format used in most GATK processing and analysis tools).

#### Requirements/expectations
- Interleaved FASTQ file

#### Outputs
- Separate R1 and R2 FASTQ files (i.e. paired FASTQ)

### Software version requirements :
- GATK4 or later
- Samtools 1.3.1
4 changes: 2 additions & 2 deletions cram-to-bam.inputs.json
@@ -16,8 +16,8 @@
"CramToBamFlow.cram_to_bam_disk_size": "200",

"##_COMMENT3": "MEMORY",
-"CramToBamFlow.validate_sam_file_mem_size": "3500 MB",
-"CramToBamFlow.cram_to_bam_mem_size": "15 GB",
+"CramToBamFlow.validate_sam_file_mem_size": "3750 MB",
+"CramToBamFlow.cram_to_bam_mem_size": "3.75 GB",

"##_COMMENT3": "PREEMPTIBLES",
"CramToBamFlow.ValidateSamFile.preemptible_tries": "3"
7 changes: 7 additions & 0 deletions interleaved-fastq-to-paired-fastq.inputs.json
@@ -0,0 +1,7 @@
{
  "#UninterleaveFastqs.uninterleave_fqs.cpu": "Int? (optional)",
  "#UninterleaveFastqs.uninterleave_fqs.memory": "Int? (optional)",
  "#UninterleaveFastqs.uninterleave_fqs.disk": "Int? (optional)",
  "UninterleaveFastqs.uninterleave_fqs.inputFastq": "gs://gatk-test-data/wgs_fastq/NA12878_20k/H06JUADXX130110.1.ATCACGAT.20k_interleaved.fastq"
}

44 changes: 44 additions & 0 deletions interleaved-fastq-to-paired-fastq.wdl
@@ -0,0 +1,44 @@
#This WDL takes a single interleaved (R1+R2) FASTQ file and splits it into separate R1 and R2 FASTQ files (i.e. paired FASTQ). Paired FASTQ files are the input format for the tool that generates unmapped BAMs (the format used in most GATK processing and analysis tools).
#
#Requirements/expectations
#- Interleaved FASTQ file
#
#Outputs
#- Separate R1 and R2 FASTQ files (i.e. paired FASTQ)
#
##################

workflow UninterleaveFastqs {

  call uninterleave_fqs
}
task uninterleave_fqs {

  File inputFastq

  Int? cpu
  Int? memory
  Int? disk

  String r1_name = basename(inputFastq, ".fastq") + "_reads_1.fastq"
  String r2_name = basename(inputFastq, ".fastq") + "_reads_2.fastq"

  command {
    cat ${inputFastq} | paste - - - - - - - - | \
      tee >(cut -f 1-4 | tr "\t" "\n" > ${r1_name}) | \
      cut -f 5-8 | tr "\t" "\n" > ${r2_name}
  }

  runtime {
    docker: "ubuntu:latest"
    memory: select_first([memory, 8]) + " GB"
    cpu: select_first([cpu, 2])
    zones: "us-central1-c us-central1-b"
    disks: "local-disk " + select_first([disk, 3]) + " HDD"
  }

  output {
    File r1_fastq = "${r1_name}"
    File r2_fastq = "${r2_name}"
  }
}
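
Note on the `command` block above: the task joins every 8 input lines (one R1 record plus one R2 record) into a single tab-separated row with `paste`, then cuts columns 1-4 into the R1 file and columns 5-8 into the R2 file. The following is a minimal standalone sketch of that same column-splitting idea, not the task's actual command. It swaps the `tee >( ... )` process substitution (a bash feature) for an intermediate file so it runs under plain `sh`. The scratch directory, the file names, and the four toy reads are all hypothetical, and the sketch assumes each FASTQ record occupies exactly 4 lines (no wrapped sequences) with R1 always immediately followed by its R2 mate.

```shell
#!/bin/sh
# Sketch only: uninterleave a tiny FASTQ the way the WDL task does,
# but via an intermediate file instead of tee + process substitution.
set -eu

workdir=$(mktemp -d)                              # hypothetical scratch directory
in_fq="$workdir/sample_interleaved.fastq"         # hypothetical input name
r1="$workdir/sample_interleaved_reads_1.fastq"    # mirrors the task's "_reads_1.fastq" suffix
r2="$workdir/sample_interleaved_reads_2.fastq"    # mirrors the task's "_reads_2.fastq" suffix

# Two toy read pairs, interleaved as R1, R2, R1, R2 (4 lines per record).
printf '%s\n' \
  '@pair1/1' 'ACGT' '+' 'IIII' \
  '@pair1/2' 'TGCA' '+' 'JJJJ' \
  '@pair2/1' 'GGCC' '+' 'KKKK' \
  '@pair2/2' 'CCGG' '+' 'LLLL' > "$in_fq"

# Join every 8 lines (one read pair) into one tab-separated row.
paste - - - - - - - - < "$in_fq" > "$workdir/pairs.tsv"

# Columns 1-4 hold the R1 record, columns 5-8 hold the R2 record.
cut -f 1-4 "$workdir/pairs.tsv" | tr '\t' '\n' > "$r1"
cut -f 5-8 "$workdir/pairs.tsv" | tr '\t' '\n' > "$r2"

# Show the resulting R1 file.
cat "$r1"
```

Writing the joined rows to disk costs extra space on large inputs, which is presumably why the task streams through `tee` instead; the sketch trades that for portability and for both output files being complete before the script exits.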
