-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wdlupdate #4
Conversation
removed gatk software requiremnt because repo will contain more than one wdl which may use different versions.
…ile containing a list of the generated ubams
|
||
"##_Comment3": "Disk Space", | ||
"##_ConvertPairedFastQsToUnmappedBamWf.PairedFastQsToUnmappedBAM.disk_space_gb": "(optional) Int?", | ||
"##_ConvertPairedFastQsToUnmappedBamWf.PairedFastQsToUnmappedBAM.machine_mem_gb": "(optional) Int?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listed under disk space but this is a memory parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, just some suggestions to simplify the workflow/task definition.
|
||
"##_Comment3": "Memory", | ||
"##_ConvertPairedFastQsToUnmappedBamWf.CreateUbamList.disk_space_gb": "(optional) Int?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listed under memory but this is a disk space parameter
paired-fastq-to-unmapped-bam.wdl
Outdated
|
||
# Convert multiple pairs of input fastqs in parallel | ||
scatter (readgroup in readgroup_list) { | ||
scatter (i in range(length(readgroup_array))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that the original scatter syntax is less verbose:
scatter (readgroup in readgroup_list) {
call PairedFastQsToUnmappedBAM {
input:
fastq_1 = readgroup_array[1],
fastq_2 = readgroup_array[2],
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We switched to using a readgroup list file instead of an array because there isn't an option for map variables in FC for the other input variables (fastq_pairs and metadata)
paired-fastq-to-unmapped-bam.wdl
Outdated
input: | ||
unmapped_bams = PairedFastQsToUnmappedBAM.output_bam, | ||
ubam_list_name = ubam_list_name, | ||
docker = docker, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adjusting spacing
paired-fastq-to-unmapped-bam.wdl
Outdated
} | ||
output { | ||
File output_bam = "${readgroup_name}.unmapped.bam" | ||
} | ||
} | ||
|
||
task CreateUbamList { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this task be simplified by using a write_lines()
command {
mv ${write_lines(unmapped_bams)} ${ubam_list_name}.unmapped_bams.list
}
I also don't think it needs to explicitly specify disk/mem as the default Cromwell values should be sufficient. You could probably get away with just docker & preemptible count, or even hard-code preemptible to 3 as such a short task will likely not need more than one attempt.
runtime {
docker: docker
preemptible: select_first([preemptible_attempts, 3])
}
README.md
Outdated
@@ -23,9 +23,14 @@ This WDL converts paired FASTQ to uBAM and adds read group information | |||
#### Requirements/expectations | |||
- Pair-end sequencing data in FASTQ format (one file per orientation) | |||
- One or more read groups, one per pair of FASTQ files | |||
- The wdl will accept a file containing a list of the readgroups along with their metadata as the input. The wdl expects a specifi format for the colomuns, please follow the format as provided : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not important but maybe reword this sentence a bit?
This WDL requires an input TSV file such that each row corresponds to a readgroup and the columns contain metadata for each readgroup. The WDL depends on the input TSV to follow a specific format for the columns. Please use the template provided below as the header for the input TSV file:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most suggestions are around syntax. For the FC version of these WDLs, it would be good to have the default docker be from GCR and not DockerHub. Secondarily, I believe there's a place where the extensions need adjustment. Other than that, it looks good to me!
bam-to-unmapped-bams.wdl
Outdated
## - Sorted Unmapped BAMs | ||
## | ||
## Cromwell version support | ||
## - Successfully tested on v31 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
call SortSam { | ||
input: | ||
input_bam = unmapped_bam, | ||
sorted_bam_name = output_basename + ".unmapped.bam", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't need the + ".unmapped.bam"
anymore (as output_basename
includes that extension)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think basename(unmapped_bam, ".coord.sorted.unmapped.bam")
removes the unmapped.bam
part so this line adds back without the .coord.sorted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fudge--YES
paired-fastq-to-unmapped-bam-fc.wdl
Outdated
# Convert pair of FASTQs to uBAM | ||
call PairedFastQsToUnmappedBAM { | ||
input: | ||
fastq_1 = readgroup_array[i][1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry this is a small thing, but it would be nice to indent this list of inputs :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paired-fastq-to-unmapped-bam-fc.wdl
Outdated
} | ||
} | ||
|
||
#Create a file with a list of the generated ubams |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adjust spacing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paired-fastq-to-unmapped-bam-fc.wdl
Outdated
#Create a file with a list of the generated ubams | ||
call CreateFoFN { | ||
input: | ||
array_of_files = PairedFastQsToUnmappedBAM.output_bam, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
"ConvertPairedFastQsToUnmappedBamWf.PairedFastQsToUnmappedBAM.disk_size": 200 | ||
} | ||
"##_Comment4": "Docker", | ||
"#ConvertPairedFastQsToUnmappedBamWf.gatk_docker_override": "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the line below, possibly replace empty ""
with (optional) String?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paired-fastq-to-unmapped-bam-fc.wdl
Outdated
String ubam_list_name = basename(readgroup_list,".list") + "unmapped.bam.list" | ||
|
||
String? gatk_docker_override | ||
String gatk_docker = select_first([gatk_docker_override, "broadinstitute/gatk:latest"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can the default docker be the ones in gcr instead of dockerhub? Especially as this is the FireCloud version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've modified the ones in FC to use gcr. The idea was to have the ones in git use docker to avoid any authentication issues for users running on different platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops no i didn't , I forgot this was the FC version. thanks for catching this.
paired-fastq-to-unmapped-bam.wdl
Outdated
} | ||
} | ||
|
||
#Create a file with a list of the generated ubams |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spacing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paired-fastq-to-unmapped-bam.wdl
Outdated
#Create a file with a list of the generated ubams | ||
call CreateFoFN { | ||
input: | ||
array_of_files = PairedFastQsToUnmappedBAM.output_bam, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paired-fastq-to-unmapped-bam.wdl
Outdated
} | ||
runtime { | ||
docker: docker | ||
memory: mem_size | ||
memory: select_first([machine_mem_gb,10]) + " GB" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spacing after comma
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pairedtoubam Wdl now uses a readgroup tsv file as input, also added task to compose a file containing a list of the generated ubams