dot plots
tobiasrausch committed Jun 30, 2023
1 parent 5baf5a4 commit 4effe50
Showing 1 changed file with 12 additions and 5 deletions.
17 changes: 12 additions & 5 deletions README.md
@@ -49,7 +49,7 @@ alfred qc -r chr2.fa -o qc.tsv.gz -j qc.json.gz tumor.bam
zcat qc.tsv.gz | grep ^ME | datamash transpose
```
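
If `datamash` is not installed, the same transposition can be sketched with `awk`. The snippet below is only an illustration on a made-up two-row table: the column names `colA` and `colB` and their values are placeholders, not Alfred's actual output columns.

```bash
# Toy stand-in for the rows of qc.tsv.gz that start with "ME"
# (hypothetical column names and values, for illustration only).
toy=$(printf 'ME\tcolA\tcolB\nME\t1\t2\n')

# Transpose rows and columns with awk instead of `datamash transpose`.
# For the real file, the input would come from: zcat qc.tsv.gz | grep ^ME
transposed=$(printf '%s\n' "$toy" | grep '^ME' | awk -F'\t' '
  # Store every cell, remembering the widest row.
  { for (i = 1; i <= NF; i++) cell[NR, i] = $i; if (NF > ncol) ncol = NF }
  END {
    # Emit one output line per input column.
    for (i = 1; i <= ncol; i++) {
      line = cell[1, i]
      for (r = 2; r <= NR; r++) line = line "\t" cell[r, i]
      print line
    }
  }')
echo "$transposed"
```

Each input column becomes one output line, so a header row and a value row end up side by side as name and value pairs.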

Instead of parsing the tab-delimited file, you can also upload the JSON file `qc.json.gz` to the [Alfred web application](https://www.gear-genomics.com/alfred/).
Instead of parsing the tab-delimited file, you can also upload the JSON file `qc.json.gz` to the [Alfred web application](https://www.gear-genomics.com/alfred/) available on [gear-genomics.com](https://www.gear-genomics.com/).


As you can see from the QC results, the data has been downsampled to 7x coverage to speed up all analyses.
@@ -77,7 +77,7 @@ igv -g chr2.fa

Once IGV has started, use 'File' and 'Load from File' to load the `tumor.bam` and `control.bam` alignment files. Then import the `svs.bed` file from your working directory using 'Regions' and 'Import Regions'.
You can then easily navigate to the structural variants with 'Regions' and 'Region Navigator'.
Select a structural vaariant in the region navigator and click 'View', which will center the IGV alignment view on the selected structural variant.
Select a structural variant in the region navigator and click 'View', which will center the IGV alignment view on the selected structural variant.
You can zoom in and out using the '+' and '-' signs in the toolbar at the top.
To highlight the abnormal paired-ends, right-click on the BAM file in IGV and activate 'View as pairs'. In the same menu, open 'Color alignments by' and switch to 'pair orientation' for inversions and duplications. For deletions, color the alignments by 'insert size'.

@@ -100,11 +100,18 @@ cat svs.bed | grep "complex"
As part of the [1000 Genomes SV consortium](https://www.nature.com/articles/nature15394) we validated some of the above complex SVs using PacBio. The reads are in separate FASTA files called `pacbio.sv1.fa` and `pacbio.sv2.fa`. We need the subsequence of the reference to create a pairwise dotplot of the PacBio read against the reference. [SAMtools](http://www.htslib.org) is a convenient tool to extract such subsequences of a FASTA file.

```bash
samtools faidx chr2.fa chr2:18905691-18907969 > sv1.fa
samtools faidx chr2.fa chr2:96210505-96212783 > sv2.fa
samtools faidx chr2.fa chr2:18905691-18907969 | sed 's/^>.*$/>reference/' > sv1.fa
samtools faidx chr2.fa chr2:96210505-96212783 | sed 's/^>.*$/>reference/' > sv2.fa
```
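
The `sed` step in the updated commands replaces the FASTA header that `samtools faidx` writes (the region string) with the fixed name `reference`, presumably so that the sequence carries a short, stable label. A minimal sketch of its effect on a made-up record, where the four-base sequence is only a placeholder:

```bash
# Toy FASTA record shaped like `samtools faidx` output; the sequence is a placeholder.
record=$(printf '>chr2:18905691-18907969\nACGT\n')

# Rewrite every header line (starting with ">") to ">reference";
# sequence lines pass through unchanged.
renamed=$(printf '%s\n' "$record" | sed 's/^>.*$/>reference/')
echo "$renamed"
```

Because the pattern is anchored with `^>`, only header lines are touched.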

Please align the above genomic reference subsequences `sv1.fa` and `sv2.fa` against the respective PacBio read `pacbio.sv1.fa` and `pacbio.sv2.fa` using [Maze](https://www.gear-genomics.com/maze/) available on [gear-genomics.com](https://www.gear-genomics.com/).
Please create a dot plot of the above genomic reference subsequences `sv1.fa` and `sv2.fa` against the respective PacBio reads `pacbio.sv1.fa` and `pacbio.sv2.fa` using [Maze](https://www.gear-genomics.com/maze/) or on the command line using [wally](https://github.com/tobiasrausch/wally).

```bash
cat pacbio.sv1.fa >> sv1.fa
wally dotplot sv1.fa
cat pacbio.sv2.fa >> sv2.fa
wally dotplot sv2.fa
```
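
Appending a PacBio read to the reference subsequence yields a FASTA file with two records. A quick sanity check before plotting is to count the header lines. The sketch below uses made-up sequences and temporary files rather than the tutorial data, and the read name `read1` is hypothetical.

```bash
# Work in a temporary directory so the tutorial files are not touched.
tmpdir=$(mktemp -d)
printf '>reference\nACGTACGT\n' > "$tmpdir/ref.fa"   # placeholder reference subsequence
printf '>read1\nACGTTCGT\n' > "$tmpdir/read.fa"      # placeholder PacBio read

# Append the read to the reference file with `cat ... >>`.
cat "$tmpdir/read.fa" >> "$tmpdir/ref.fa"

# Count FASTA records: one header line (">") per sequence.
n_records=$(grep -c '^>' "$tmpdir/ref.fa")
echo "$n_records"
rm -r "$tmpdir"
```

Since `>>` appends, running the `cat` command twice adds the read twice; a count other than 2 would flag that.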

***Exercises***
