Intensity Normalization

Robert Millikin edited this page Jan 7, 2020 · 10 revisions

Non-fractionated samples

FlashLFQ's intensity normalization is fairly straightforward for non-fractionated samples. A median-normalization method is employed to set the median peptide intensity difference between any two samples to zero. This means that the median peptide is assumed to have the same abundance in sample A and sample B.

The way this works in FlashLFQ is:

  1. A sample is chosen as the reference sample (the first sample of the first condition, conditions in alphabetical order)
  2. The reference sample (sample A) is compared with each other sample (sample B), one at a time (pairwise)
  3. A list of peptides that have non-zero intensities in both samples is generated
  4. The median peptide fold-change (sample B relative to sample A) across this list is found
  5. Every peptide's intensity in sample B is multiplied by 1/(median fold-change)

This means that if the median peptide's abundance in Sample B is 110% of that in Sample A, then every peptide in Sample B is multiplied by ~0.91.
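The steps above can be sketched in a few lines of Python. This is a minimal illustration, not FlashLFQ's actual code: the function names, the peptide names, and the intensities are hypothetical, and taking the median of linear (rather than log-transformed) fold-changes is an assumption.

```python
from statistics import median


def median_normalization_factor(reference, sample):
    """Return the factor by which every intensity in `sample` is multiplied.

    Both arguments map peptide -> intensity. Only peptides with non-zero
    intensity in both samples contribute to the median fold-change.
    """
    fold_changes = [
        sample[peptide] / reference[peptide]
        for peptide in reference
        if reference[peptide] > 0 and sample.get(peptide, 0) > 0
    ]
    if not fold_changes:
        return 1.0  # nothing in common, so leave the sample unchanged
    return 1.0 / median(fold_changes)


def normalize(reference, sample):
    """Scale every intensity in `sample` by the median-normalization factor."""
    factor = median_normalization_factor(reference, sample)
    return {peptide: intensity * factor for peptide, intensity in sample.items()}


# Hypothetical data: the median peptide in sample B is at 110% of sample A.
sample_a = {"PEPTIDEK": 1000.0, "SAMPLER": 2000.0, "ELVISK": 500.0, "MISSINGR": 0.0}
sample_b = {"PEPTIDEK": 1100.0, "SAMPLER": 2200.0, "ELVISK": 1500.0, "MISSINGR": 300.0}

factor = median_normalization_factor(sample_a, sample_b)  # 1/1.1, about 0.91
normalized_b = normalize(sample_a, sample_b)
```

Note that `ELVISK`, which changes threefold, does not move the median, and `MISSINGR` is excluded from the comparison because its intensity is zero in sample A; it is still rescaled along with the rest of sample B.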

The assumption of this type of normalization is that the median protein/peptide does not change in abundance. In other words, most proteins/peptides are assumed to not be changing, and only a minority of proteins are changing in abundance between the samples (possibly as a result of some treatment). This assumption is roughly true in most samples, but it does not hold in some types of experiments, and it may not be appropriate for your particular experiment.

Fractionated samples

For fractionated samples, the math is more involved, but the principle remains the same. FlashLFQ's fraction normalization is similar to MaxQuant's. The typical protein/peptide is assumed not to change in abundance between samples. Each fraction is assigned a "normalization coefficient" like the one described above: a constant by which every intensity in that fraction is multiplied. Finding the normalization coefficient is trivial for non-fractionated data, but for fractionated data it becomes a multidimensional optimization problem (one dimension per fraction), in which each fraction has its own normalization coefficient and the summed intensity difference between the samples is the error to be minimized. FlashLFQ uses a bounded Nelder-Mead optimizer to find the optimal fraction normalization coefficients. The normalization coefficient for each fraction must be >0.3 and <3.0.
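To make the shape of this optimization problem concrete, here is a rough sketch in Python. Several things in it are assumptions rather than FlashLFQ's implementation: the error function (summed absolute log2 difference against a reference sample whose coefficients are fixed at 1), the function names, and the data are all hypothetical. A coarse grid search also stands in for the bounded Nelder-Mead optimizer, purely to keep the example dependency-free.

```python
import itertools
import math

LOWER, UPPER = 0.3, 3.0  # coefficient bounds stated in the text (exclusive)


def summed_intensities(fractions, coefficients):
    """Sum each peptide's intensity over fractions, scaling each fraction."""
    totals = {}
    for fraction, coefficient in zip(fractions, coefficients):
        for peptide, intensity in fraction.items():
            totals[peptide] = totals.get(peptide, 0.0) + coefficient * intensity
    return totals


def normalization_error(reference_fractions, sample_fractions, coefficients):
    """Assumed error: summed |log2 fold-change| over peptides seen in both samples."""
    ref = summed_intensities(reference_fractions, [1.0] * len(reference_fractions))
    other = summed_intensities(sample_fractions, coefficients)
    shared = [p for p in ref if ref[p] > 0 and other.get(p, 0) > 0]
    return sum(abs(math.log2(other[p] / ref[p])) for p in shared)


def grid_search(reference_fractions, sample_fractions, step=0.05):
    """Try every coefficient combination on a grid strictly inside the bounds."""
    n_steps = round((UPPER - LOWER) / step)
    grid = [round(LOWER + step * k, 10) for k in range(1, n_steps)]
    return min(
        itertools.product(grid, repeat=len(sample_fractions)),
        key=lambda c: normalization_error(reference_fractions, sample_fractions, c),
    )


# Hypothetical two-fraction experiment. Sample B's first fraction is twice as
# intense as the reference's and its second fraction is 0.8 times as intense.
reference = [{"P1": 100.0, "P2": 200.0}, {"P2": 50.0, "P3": 400.0, "P4": 300.0}]
sample_b = [{"P1": 200.0, "P2": 400.0}, {"P2": 40.0, "P3": 320.0, "P4": 240.0}]

best = grid_search(reference, sample_b)  # one coefficient per fraction
```

The number of grid points grows exponentially with the number of fractions, so a grid search is only workable for a toy case like this one. That scaling is the reason a simplex method such as Nelder-Mead is a natural choice once an experiment has many fractions.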