cnv_state Column Description is Vague #152
Comments
Thanks for the issue.
2N regions should be designated neutral (NEU) and used as the baseline, and the 3N/5N regions as heterozygous gain (AMP), even if the average ploidy is close to 3. In the paper benchmark we only used HMFtools to produce CNV calls from WGS, which we compared with Numbat calls from scRNA. We didn't use it as input to Numbat analysis. See more at |
Ah, I understand. It will be easy for me to code. Finally, for |
Ah, great, so the segment ID syntax is not used by the software. For the future benefit of other PURPLE users:
makeCNVinput <- function(directory)
{
    # All sampleID.purple.segment.tsv files, with full paths so they can be read from any working directory.
    segmentFiles <- list.files(directory, "purple\\.segment\\.tsv$", full.names = TRUE)
    invisible(lapply(segmentFiles, function(segmentFile)
    {
        segmentTable <- read.delim(segmentFile)
        # PURPLE reports fractional copy numbers; round them to integers before classifying.
        segmentTable$minorAlleleCopyNumber <- round(segmentTable$minorAlleleCopyNumber)
        segmentTable$majorAlleleCopyNumber <- round(segmentTable$majorAlleleCopyNumber)
        segmentTable$tumorCopyNumber <- round(segmentTable$tumorCopyNumber)
        # One state per segment, neutral by default. A length-one "neu" would leave unmatched segments as NA.
        cnv_state <- rep("neu", nrow(segmentTable))
        isBalancedDel <- segmentTable$tumorCopyNumber == 0
        cnv_state[isBalancedDel] <- "bdel"
        isDel <- segmentTable$tumorCopyNumber == 1
        cnv_state[isDel] <- "del"
        # Even total copy number with no minor allele.
        isLOH <- segmentTable$tumorCopyNumber %in% seq(2, 100, 2) & segmentTable$minorAlleleCopyNumber == 0
        cnv_state[isLOH] <- "loh"
        # Odd total copy number of three or more.
        isAmp <- segmentTable$tumorCopyNumber %in% seq(3, 99, 2)
        cnv_state[isAmp] <- "amp"
        # Even total copy number of four or more with equal allele counts.
        isBalancedAmp <- segmentTable$tumorCopyNumber %in% seq(4, 100, 2) & segmentTable$minorAlleleCopyNumber == segmentTable$majorAlleleCopyNumber
        cnv_state[isBalancedAmp] <- "bamp"
        # Even totals with unequal, non-zero allele counts (e.g. 3 + 1) match no rule and stay "neu".
        requiredTable <- data.frame(CHROM = segmentTable[, "chromosome"],
                                    segment = paste("seg", seq_len(nrow(segmentTable)), sep = ''),
                                    seg_start = segmentTable[, "start"],
                                    seg_end = segmentTable[, "end"],
                                    cnv_state = cnv_state)
        # Write next to the input file; only the file name is changed, not the directory path.
        outFile <- file.path(dirname(segmentFile), gsub("purple", "forNumbat", basename(segmentFile)))
        write.table(requiredTable, file = outFile, sep = '\t', quote = FALSE, row.names = FALSE)
    }))
}
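For anyone adapting the function above, the classification rules can be checked on a toy table before running them on real PURPLE output. This is only a sketch: `classifyCnvState` is a hypothetical helper written for this illustration (it is not part of Numbat or HMFtools), the copy numbers are made up, and it assumes the values have already been rounded to integers. The labels depend only on the absolute copy number of each segment, so a 2N segment is "neu" and 3N or 5N segments are "amp" whatever the average ploidy of the genome is.

```r
# Hypothetical helper mirroring the rules in makeCNVinput(); inputs are
# assumed to be integer (already rounded) copy numbers of equal length.
classifyCnvState <- function(total, minor, major)
{
    state <- rep("neu", length(total))                    # default: neutral
    state[total == 0] <- "bdel"                           # both copies lost
    state[total == 1] <- "del"                            # one copy lost
    state[total >= 2 & total %% 2 == 0 & minor == 0] <- "loh"
    state[total >= 3 & total %% 2 == 1] <- "amp"          # 3N, 5N, ...
    state[total >= 4 & total %% 2 == 0 & minor == major] <- "bamp"
    state
}

# Made-up segments covering each rule once (plus a second LOH and AMP case).
toy <- data.frame(total = c(0, 1, 2, 2, 3, 4, 4, 5),
                  minor = c(0, 0, 1, 0, 1, 2, 0, 2),
                  major = c(0, 1, 1, 2, 2, 2, 4, 3))
toy$cnv_state <- classifyCnvState(toy$total, toy$minor, toy$major)
print(toy)
```

Segments with an even total and unequal, non-zero allele counts (for example 3 + 1) match none of the rules and stay "neu", exactly as in the function above; whether that is the desired label for such segments is worth deciding before using the output.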
The cnv_state definition should be improved. I have whole genome sequencing for each sample and have used the Purity Ploidy Estimator (PURPLE) to infer purity-adjusted copy number. If the overall genome ploidy is three and a chromosome arm has a copy number of two, what should that be coded as? What if the chromosome arm has a copy number of five? In the journal article, I also noticed that you use HMFtools for your whole genome sequencing analysis. So, is there a conversion script available that converts PURPLE output into a file suitable for Numbat input?