Quantifying homologous proteins and proteoforms

08/05/2017
by   Dmitry Malioutov, et al.
MIT

Many proteoforms, arising from alternative splicing, post-translational modifications (PTMs), or paralogous genes, have distinct biological functions; histone PTM proteoforms are one example. However, their quantification by existing bottom-up mass-spectrometry (MS) methods is undermined by peptide-specific biases. To avoid these biases, we developed and implemented a first-principles model (HIquant) for quantifying proteoform stoichiometries. We characterized when MS data allow inferring proteoform stoichiometries by HIquant, derived an algorithm for optimal inference, and demonstrated experimentally high accuracy in quantifying fractional PTM occupancy without using external standards, even in the challenging case of the histone modification code. The HIquant server is available at: https://web.northeastern.edu/slavov/2014_HIquant/


Figure 1

[Figure 1 image: panel (a), H3K4 methyl-proteoforms and model; panel (b). Labeled quantities: peptide levels across conditions, peptide-specific bias (nuisance), set of proteins containing the peptide, protein levels across conditions.]
Figure 1: Model for inferring stoichiometries among proteoforms and paralogous proteins independently of peptide-specific biases. (a) One shared and three unique peptides of H3 proteoforms illustrate a very simple case of HIquant. HIquant models the peptide levels measured across conditions as a superposition of the protein levels, scaled by unknown peptide-specific biases (nuisances). These coupled equations can be written in matrix form, and the solution infers the methylation stoichiometry independently of the nuisances. (b) The general form of the model for K proteoforms (or homologous proteins) with M peptides quantified across N conditions can be formulated and solved. In many, albeit not all, cases an optimal and unique solution can be found, even in the absence of unique peptides; see Supplementary Fig. 1 and Supplemental Information.
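
The model in the caption can be illustrated with a small numerical sketch. This is not the HIquant implementation; it is one possible way to solve the coupled equations, under stated assumptions. All names here are hypothetical: X (M x N) holds peptide levels, S (M x K) is a 0/1 matrix marking which proteoforms contain each peptide, P (K x N) holds proteoform levels, and z holds the peptide-specific biases, so that X[i, n] = z[i] * sum_k S[i, k] * P[k, n]. Writing y = 1/z makes the system linear and homogeneous in (y, P), and a solution defined up to one global scale can be read off the right singular vector with the smallest singular value.

```python
import numpy as np

def infer_levels(X, S):
    """Solve y_i * X[i, n] - sum_k S[i, k] * P[k, n] = 0 for (y, P).

    X: (M, N) peptide levels; S: (M, K) peptide-to-proteoform membership.
    Returns P scaled to sum to 1, and y (inverse biases) on the same scale.
    Assumes the null space of the system is one-dimensional.
    """
    X = np.asarray(X, dtype=float)
    S = np.asarray(S, dtype=float)
    M, N = X.shape
    K = S.shape[1]
    # Unknown vector layout: [y_1..y_M, P[0,0..N-1], ..., P[K-1,0..N-1]]
    A = np.zeros((M * N, M + K * N))
    for i in range(M):
        for n in range(N):
            row = i * N + n
            A[row, i] = X[i, n]                       # coefficient of y_i
            A[row, M + np.arange(K) * N + n] = -S[i]  # coefficients of P[k, n]
    # The last right singular vector spans the (approximate) null space of A.
    _, _, Vt = np.linalg.svd(A)
    v = Vt[-1]
    if v[:M].sum() < 0:  # fix the arbitrary sign so that levels are positive
        v = -v
    P = v[M:].reshape(K, N)
    scale = P.sum()
    return P / scale, v[:M] / scale

# Hypothetical data: two proteoforms, two unique peptides, one shared peptide.
S = np.array([[1, 0], [0, 1], [1, 1]])
P_true = np.array([[1.0, 2.0], [3.0, 1.0]])
z = np.array([0.5, 2.0, 1.5])        # unknown to the solver
X = z[:, None] * (S @ P_true)        # simulated, noise-free peptide levels
P_hat, y_hat = infer_levels(X, S)
```

In this noise-free example the recovered levels match the simulated ones up to the global scale, without the solver ever seeing z. With noisy data the same singular vector gives a least-squares-style estimate, and a null space of dimension greater than one would signal that the stoichiometry is not uniquely determined by the data.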

Figure 2


Figure 2: HIquant accurately quantifies ratios across alkylated proteoforms of a spiked-in standard. (a) Schematic diagram of a validation experiment. We prepared a gold standard of proteoforms from the dynamic universal proteomics standard (UPS2) whose cysteines were covalently modified either with iodoacetamide or with vinylpyridine. Upon digestion, these modified UPS2 proteins generate many shared peptides (peptides not containing cysteine) and a few unique peptides (peptides containing cysteine). The modified UPS2 proteins were mixed with one another at known ratios, mixed with yeast lysate, digested, and quantified by MS. The proteoform ratios that HIquant inferred from the MS data were compared to the mixing ratios. (b) The ratios across the alkylated isoforms of UPS2 inferred by HIquant (y-axis) accurately reflect the mixing ratios (x-axis). (c) Comparison of the error in proteoform ratios inferred by HIquant with the error in ratios inferred from the precursor ion areas and the reporter ion (RI) ratios.

Figure 3


Figure 3: HIquant accurately infers stoichiometries and confidence intervals across PTM site occupancies of histone 3. (a) Histone 3 peptides were quantified by SRM across 7 perturbations, and the fractional site occupancies for K4 methylation were estimated by two methods: estimates inferred by HIquant without using external standards are plotted against the corresponding estimates based on MasterMix external standards with known concentrations [creech2015building]. Each marker shape corresponds to the PTM site(s) shown in the legend; methylation is denoted with “me” and acetylation with “ac” followed by the number of methyl/acetyl groups. (b) The validation method from (a) was extended to another set of more complex fractional site occupancies on K9 methylation and K14 acetylation.
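
A fractional site occupancy can be computed from proteoform levels as each proteoform's share of the total in a given condition. The sketch below is illustrative only: the function name and the numbers are hypothetical, and it assumes the rows cover every form of the site (for example unmodified, me1, and me2). It also shows why levels known only up to a global scale are sufficient, since the scale cancels in the ratio.

```python
import numpy as np

def fractional_occupancy(P):
    """Return each proteoform's fraction of the per-condition total.

    P: (K, N) non-negative levels; rows are proteoforms, columns are conditions.
    """
    P = np.asarray(P, dtype=float)
    totals = P.sum(axis=0, keepdims=True)  # total site abundance per condition
    return P / totals

# Hypothetical relative levels for three forms of one site in two conditions.
levels = np.array([[6.0, 2.0],
                   [3.0, 2.0],
                   [1.0, 4.0]])
occupancy = fractional_occupancy(levels)
occupancy_rescaled = fractional_occupancy(7.5 * levels)  # arbitrary global scale
```

Each column of `occupancy` sums to one, and rescaling all levels by a common factor leaves the occupancies unchanged.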

Supplemental Figure 1

[Figure S1 image: panel (a), RP L6 paralogs; panel (b), phospho-proteoforms; panel (c).]
Figure S1: Model for inferring stoichiometries among proteoforms and paralogous proteins independently of peptide-specific biases. (a) One shared and two unique peptides from the two paralogs of ribosomal protein L6 illustrate the simplest case of HIquant. HIquant models the peptide levels measured across two conditions as a superposition of the protein levels, scaled by unknown peptide-specific nuisances. These coupled equations can be written in matrix form, and the solution infers the stoichiometry independently of the nuisances. (b) The shared and unique peptides of proteoforms (as illustrated by PDHA1 phospho-proteoforms) can be modeled as in panel (a). (c) The matrix system from (a) generalizes to K proteoforms (and homologous proteins) with M peptides quantified across N conditions. In many, albeit not all, cases an optimal and unique solution can be found, even in the absence of unique peptides. See Supplemental Information for details.
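
Written out, the simplest case in panel (a) might look as follows. The original notation did not survive extraction, so the symbols are placeholders chosen here: x_{1n} and x_{2n} are the levels of the two unique peptides in condition n, x_{3n} is the level of the shared peptide, p_{1n} and p_{2n} are the levels of the two paralogs, and z_i is the unknown bias of peptide i.

```latex
\begin{aligned}
x_{1n} &= z_1\, p_{1n}, \\
x_{2n} &= z_2\, p_{2n}, \\
x_{3n} &= z_3\,\bigl(p_{1n} + p_{2n}\bigr), \qquad n = 1, 2, \\[4pt]
\Longrightarrow\quad
\frac{x_{3n}}{z_3} &= \frac{x_{1n}}{z_1} + \frac{x_{2n}}{z_2},
\qquad
\frac{p_{1n}}{p_{2n}} = \frac{x_{1n}/z_1}{x_{2n}/z_2}.
\end{aligned}
```

The third line gives two homogeneous linear equations (one per condition) in the three unknowns 1/z_1, 1/z_2, and 1/z_3. Under these assumptions the unknowns are determined up to a common factor whenever the paralog ratio p_{1n}/p_{2n} differs between the two conditions, and that common factor cancels in the stoichiometry p_{1n}/p_{2n}.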