Isaac Turner edited this page Oct 18, 2015 · 1 revision

Each read of length read_length contains read_length - kmer_size + 1 kmers. Kmer coverage is therefore:

kmer_coverage = coverage * (read_length - kmer_size + 1) / read_length

kmer_coverage is the expected number of times each kmer is seen.
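
As a quick check of the formula above, here is a minimal Python sketch. The function name, the argument validation, and the example values (30x coverage, 100 bp reads, k=31) are illustrative assumptions, not part of the original page.

```python
def kmer_coverage(coverage, read_length, kmer_size):
    """Expected number of times each kmer is seen, ignoring sequencing errors.

    coverage    -- sequencing coverage (depth), as used in the formula above
    read_length -- length of each read in bases
    kmer_size   -- kmer length in bases
    """
    # Assumption: reads shorter than kmer_size contribute no kmers,
    # so we reject that case rather than return a negative value.
    if kmer_size < 1 or kmer_size > read_length:
        raise ValueError("kmer_size must be between 1 and read_length")
    # Each read yields (read_length - kmer_size + 1) kmers.
    kmers_per_read = read_length - kmer_size + 1
    return coverage * kmers_per_read / read_length


if __name__ == "__main__":
    # Hypothetical example: 30x coverage, 100 bp reads, k = 31
    print(kmer_coverage(30, 100, 31))  # 21.0
```

Note that kmer coverage is always at most the base coverage, and the two are equal only when kmer_size is 1.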

If we take the per-base sequencing error rate (err_rate) into account, the expected coverage of error-free kmers is approximately:

kmer_coverage = coverage * (read_length - kmer_size + 1) / read_length * (1-err_rate*kmer_size)
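
The factor (1 - err_rate*kmer_size) can be read as a first-order approximation of (1 - err_rate)^kmer_size, the probability that all kmer_size bases of a kmer are error-free when errors are independent. That interpretation, the function names, and the example values below are assumptions for illustration and are not taken from the original page. The sketch compares the two forms:

```python
def kmer_coverage_linear(coverage, read_length, kmer_size, err_rate):
    """Error-adjusted kmer coverage using the formula on this page."""
    kmers_per_read = read_length - kmer_size + 1
    return coverage * kmers_per_read / read_length * (1 - err_rate * kmer_size)


def kmer_coverage_exact(coverage, read_length, kmer_size, err_rate):
    """Same quantity with the unapproximated error-free probability.

    Assumption: errors are independent across bases, so a kmer is
    error-free with probability (1 - err_rate) ** kmer_size.
    """
    kmers_per_read = read_length - kmer_size + 1
    return coverage * kmers_per_read / read_length * (1 - err_rate) ** kmer_size


if __name__ == "__main__":
    # Hypothetical example: 30x coverage, 100 bp reads, k = 31, 1% error rate
    args = (30, 100, 31, 0.01)
    print(f"linear: {kmer_coverage_linear(*args):.2f}")
    print(f"exact:  {kmer_coverage_exact(*args):.2f}")
```

The linear form slightly underestimates the exact value, and the gap grows as err_rate * kmer_size increases; once err_rate * kmer_size reaches 1 the linear form is no longer meaningful.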