Amino Acid Substitution Discovery

trishorts edited this page Sep 11, 2017 · 1 revision

Only a small number of proteomics samples have a reference database available at UniProt whose complete protein sequences match those of the sample (e.g. E. coli K12, S. cerevisiae S288c and Mus musculus C57BL/6J). Nearly all other samples contain amino acid variants, which result from differences between the genome of the sample and the reference genome. Amino acid variants can be discovered in a sample by careful sequencing of its genome or transcriptome. These variants can then be coded into the protein database and used in proteomics database searching. This approach is called proteogenomics (see below for additional references).

MetaMorpheus can be used to discover amino acid substitutions without sequencing the organism. This is done during the G-PTM-D search task. A user wishing to discover amino acid variants should select "G-PTM-D Modifications/1 nucleotide substitution". The substitutions enabled by this option are those that can arise from a single nucleotide substitution at the genomic level. They are the most likely candidates, a fact borne out by their near-complete inclusion in the list of observed amino acid variants reported at UniProt. A second category, "2+ nucleotide substitution", is also available, but users are cautioned against using this list of variants without further validation.
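To illustrate what "1 nucleotide substitution" means, the following minimal Python sketch enumerates the amino acid pairs that are one base change apart under the standard genetic code. It is not MetaMorpheus code and does not reproduce the exact list the software ships with; the function name `single_nucleotide_substitutions` and the choice to exclude stop codons are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
# '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_substitutions():
    """Return the set of (original, variant) amino acid pairs that differ
    by exactly one base in at least one pair of codons.

    Stop codons are skipped here (an assumption for this sketch), so only
    amino acid to amino acid changes are reported.
    """
    pairs = set()
    for codon, original in CODON_TABLE.items():
        if original == "*":
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutated = codon[:position] + base + codon[position + 1:]
                variant = CODON_TABLE[mutated]
                if variant != "*" and variant != original:
                    pairs.add((original, variant))
    return pairs
```

For example, `("A", "V")` is in the returned set because GCT and GTT differ by one base, whereas `("W", "F")` is not: the only tryptophan codon, TGG, needs two base changes to reach either phenylalanine codon, so that substitution would fall under the "2+ nucleotide substitution" category.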

Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation. Sheynkman GM, Shortreed MR, Cesnik AJ, Smith LM. Annu Rev Anal Chem (Palo Alto Calif). 2016 Jun 12;9(1):521-45. doi: 10.1146/annurev-anchem-071015-041722. Epub 2016 Mar 30.

Human Proteomic Variation Revealed by Combining RNA-Seq Proteogenomics and Global Post-Translational Modification (G-PTM) Search Strategy. Cesnik AJ, Shortreed MR, Sheynkman GM, Frey BL, Smith LM. J Proteome Res. 2016 Mar 4;15(3):800-8. doi: 10.1021/acs.jproteome.5b00817. Epub 2016 Jan 12.

Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations. Sheynkman GM, Johnson JE, Jagtap PD, Shortreed MR, Onsongo G, Frey BL, Griffin TJ, Smith LM. BMC Genomics. 2014 Aug 22;15:703. doi: 10.1186/1471-2164-15-703.

Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. J Proteome Res. 2014 Jan 3;13(1):228-40. doi: 10.1021/pr4009207. Epub 2013 Nov 11.