Keiran Raine edited this page Aug 13, 2018 · 4 revisions

BEDPE file format:

| Heading | Type | Description |
| --- | --- | --- |
| chr1 | String | Chromosome of the lower breakpoint |
| start1 | 0-based int | Start coordinate of the lower breakpoint |
| end1 | 1-based int | End coordinate of the lower breakpoint |
| chr2 | String | Chromosome of the higher breakpoint |
| start2 | 0-based int | Start coordinate of the higher breakpoint |
| end2 | 1-based int | End coordinate of the higher breakpoint |
| id/name | String | ID of event, correlates with VCF |
| brass_score | int | Number of aberrant pairs contributing to the rearrangement group |
| strand1 | [+-] | Strand of end in 'genomic' context, see the Strand1/2 table |
| strand2 | [+-] | Strand of end in 'genomic' context, see the Strand1/2 table |
| sample | String | Name of sample as found in the BAM RG header SM field; each contributing sample is listed |
| svclass | String | Basic event type: deletion, inversion, tandem-duplication, translocation |
| bkdist | int | Distance between inner edges of breakpoints (-1 if on different chromosomes) |
| assembly_score | int | 0-100. A "niceness" score for the Velvet assembly graph. A score of 100 indicates a perfect graph with five vertices forming a quintet. Points are deducted for isolated vertices and cruft hanging off the quintet, and major points for extra cycles and large-scale graph cruftiness. |
| readpair names | String | CSV of read pair names found in the aberrant pair grouping (reads with mapping quality MPQ >= 6) |
| readpair count | int | Count of read pairs found in the aberrant pair grouping (analogous to brass_score) |
| bal_trans | String | ID of the event that describes the reciprocal balanced translocation event |
| inv | String | ID of the event that describes the reciprocal inversion event |
| occL | int | Count of events that share the lower coordinate (chr1/start1/end1) within a 500 bp window |
| occH | int | Count of events that share the higher coordinate (chr2/start2/end2) within a 500 bp window |
| copynumber_flag | char | Indicates the presence of a copy-number change point in the ASCAT NGS result |
| range_blat | int | Flag indicating the degree of homology between the two sides of the breakpoint. For small distances this will always be high, since a region is being compared against itself. range_blat filtering should therefore not be used in isolation. If needed, range_blat can be required to be less than 100, as long as the distance is greater than 1000. |
| Brass Notation | String | A string containing the BRASS description of the breakpoint, see the BRASS notation table at the end |
| non-template | String | Non-templated sequence |
| micro-homology | String | |
| assembled readnames | String | CSV of all individual reads that formed part of the ALT assembly path |
| gene | | Gene affected by disruption of the coding direction |
| gene_id | | ID of the gene affected by disruption of the coding direction |
| transcript_id | | ID of the transcript affected by disruption of the coding direction |
| strand | | Coding strand of the affected feature |
| phase | | Phase of the breakpoint in the affected feature |
| region | | |
| region_number | | |
| total_region_count | | |
| first/last | | First or last element of the exon structure |
| fusion_flag | | See BEDPE - Fusion-flag |
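
As a rough illustration of how these columns might be consumed, the sketch below reads the first twelve fields into a dictionary. It assumes the file is tab-delimited, that header or comment lines begin with `#`, and that the columns appear in the order listed above; none of this is confirmed by the page itself. The names `parse_bedpe`, `span_length` and the example record are hypothetical.

```python
import csv
import io

# Assumed order of the first twelve columns, taken from the field list.
CORE_COLUMNS = [
    "chr1", "start1", "end1", "chr2", "start2", "end2",
    "id", "brass_score", "strand1", "strand2", "sample", "svclass",
]
INT_COLUMNS = {"start1", "end1", "start2", "end2", "brass_score"}


def parse_bedpe(handle):
    """Yield one dict per record, keyed by the core column names."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumption: header and comment lines start with '#'.
        if not row or row[0].startswith("#"):
            continue
        if len(row) < len(CORE_COLUMNS):
            raise ValueError(f"expected at least 12 columns, got {len(row)}")
        record = dict(zip(CORE_COLUMNS, row))
        for name in INT_COLUMNS:
            record[name] = int(record[name])
        # Remaining columns are kept positionally and left as strings.
        record["extra"] = row[len(CORE_COLUMNS):]
        yield record


def span_length(start, end):
    """Length of an interval with a 0-based start and a 1-based end."""
    return end - start


# Made-up record for illustration only; not real BRASS output.
example = io.StringIO(
    "# chr1\tstart1\tend1\n"
    "1\t999\t1000\t1\t4999\t5000\texample_id\t12\t+\t+\tSAMPLE_A\tdeletion\t3999\n"
)
records = list(parse_bedpe(example))
```

Because start coordinates are 0-based and end coordinates are 1-based, `end - start` gives the interval length directly without a `+ 1` correction.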

Strand1/2

| SV Type | Strand (1/2) |
| --- | --- |
| Inversion | +/- |
| Inversion | -/+ |
| Deletion | +/+ |
| Tandem Dup. | -/- |
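
The strand table can be expressed as a simple lookup. This is only a sketch of the mapping as listed; the function name `sv_type_from_strands` is hypothetical, and strand pairs for events outside the table (for example translocations) are not covered, so they return `None` here.

```python
# Strand pair (strand1, strand2) to SV type, copied from the Strand1/2 table.
STRAND_TO_SV_TYPE = {
    ("+", "-"): "Inversion",
    ("-", "+"): "Inversion",
    ("+", "+"): "Deletion",
    ("-", "-"): "Tandem Dup.",
}


def sv_type_from_strands(strand1, strand2):
    """Return the SV type for a strand pair, or None if it is not in the table."""
    return STRAND_TO_SV_TYPE.get((strand1, strand2))
```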

BRASS notation

| Paired-flag | Lower chr | Lower breakpoint | Higher chr | Higher breakpoint | NTS | Microhomology | Phase II output |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | 4 | 114974732 | 4 | 115115727 | - | G | Chr.4- 114974733(32)--G--115115728(27) Chr.4- (score 94) |
| 8 | 6 | 126370389 | 6 | 134006924 | GGCAAATATACTCTT | - | Chr.6- 126370389] GGCAAATATACTCTT [134006924 Chr.6 (score 95) |
| 32 | 1 | 241559717 | 12 | 132126269 | - | - | Chr.1- 241559717][132126269 Chr.12 (score 90) |

Incomplete alignment for query to target assembly

When the assembled reads do not completely match the ends of the reference-based target, additional notation is added to the Chr.X elements of the 'Phase II output':

Chr.17  290000(200)--NNN--1350000(44)  Chr.17[@17]  (score 75)
Chr.3  186000000] NNN [11600000  Chr.16-[@651]  (score 94)

The value after the @ denotes where the divergence occurs. If the end is considered the low end (internally), the value is the point at which the query starts to match. If it is the high end (internally), the value is the point at which the query starts to diverge. Scores are adjusted accordingly.
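
A minimal sketch of pulling the score and any divergence annotations out of a 'Phase II output' string. The regular expressions are inferred only from the example strings on this page, so other variants of the notation may not match; `parse_phase2` is a hypothetical helper, not part of BRASS.

```python
import re

# Patterns inferred from the examples on this page (an assumption).
SCORE_RE = re.compile(r"\(score (\d+)\)")
DIVERGENCE_RE = re.compile(r"(Chr\.\w+-?)\[@(\d+)\]")


def parse_phase2(text):
    """Extract the score and any Chr.X[@value] divergence annotations."""
    score_match = SCORE_RE.search(text)
    score = int(score_match.group(1)) if score_match else None
    divergences = [
        (chrom, int(value)) for chrom, value in DIVERGENCE_RE.findall(text)
    ]
    return {"score": score, "divergences": divergences}


first = parse_phase2(
    "Chr.17  290000(200)--NNN--1350000(44)  Chr.17[@17]  (score 75)"
)
second = parse_phase2(
    "Chr.3  186000000] NNN [11600000  Chr.16-[@651]  (score 94)"
)
```

For the two strings above, `first` holds a score of 75 with a divergence of 17 on `Chr.17`, and `second` holds a score of 94 with a divergence of 651 on `Chr.16-`.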