Oral Presentation 25th Annual Lorne Proteomics Symposium 2020

Analytical guidelines for co-fractionation mass spectrometry obtained through global profiling of gold standard Saccharomyces cerevisiae protein complexes (#37)

Chi Nam Ignatius Pang 1, Daniel Weissberger 1, Sara Ballouz 2, Loic M Thibaut 3, Joseph R Gillis 2, Marc R Wilkins 1, Gene Hart-Smith 4
  1. School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW, Australia
  2. Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY, USA
  3. School of Mathematics and Statistics, UNSW, Sydney, NSW, Australia
  4. Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia

Co-fractionation mass spectrometry (CF-MS) is a method by which endogenous and unmanipulated protein complexes can be analysed on a broad scale in single experiments. CF-MS involves extensive biochemical fractionation of protein complexes using one or more non-denaturing chromatographic techniques (e.g. size exclusion chromatography (SEC)), followed by quantitative proteomics of each fraction. Subunits from the same intact complex will have highly correlated fractionation profiles.
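The idea that co-complexed subunits share fractionation profiles can be illustrated with a minimal sketch. It assumes each protein is represented by a list of quantitative intensities, one per fraction; the profile values and the names `subunit_a`, `subunit_b` and `unrelated` are hypothetical and are not data from this study. Spearman correlation is used here only because it is one of the metrics named in this abstract.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        # Undefined for a flat profile; reported as 0.0 by assumption.
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same fractions")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical intensities across eight SEC fractions.
subunit_a = [0, 1, 5, 20, 12, 3, 1, 0]
subunit_b = [0, 2, 6, 25, 10, 4, 1, 0]
unrelated = [15, 9, 4, 1, 0, 0, 2, 8]

same_complex_score = spearman(subunit_a, subunit_b)
unrelated_score = spearman(subunit_a, unrelated)
```

In this toy example the two co-eluting profiles score close to 1, whereas the profile that peaks in different fractions scores below 0. Real datasets would also need handling of missing values and noise, which this sketch omits.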

Despite its demonstrated utility (1-2), best practice approaches for CF-MS remain undefined. Here we gain insight into how to best collect and interpret CF-MS data by benchmarking CF-MS datasets against gold standard complexes in Saccharomyces cerevisiae, one of the few organisms for which high proteome-coverage reference libraries of gold standard complexes exist.

By benchmarking experimental and modelled CF-MS datasets, we find that co-analysis of data from complementary biochemical fractionation methods (e.g. using Fisher’s combined probability test) identifies complexes with greater efficiency than analysis of data from a single fractionation method. Systematic identification of gold standard complexes using 17 correlation metrics indicates that some metrics (e.g. Spearman correlation) are more effective than others (e.g. mutual information).
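As an illustration of how evidence from two fractionation methods might be combined, the sketch below applies Fisher’s combined probability test to per-method p-values for a single protein pair. How such p-values were obtained in this study is not described here, so the inputs are hypothetical, and the function name `fisher_combined_p` is introduced only for this example. The test statistic is X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis when the k p-values are independent.

```python
from math import exp, factorial, log


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    For an even number of degrees of freedom (2k), the chi-square survival
    function has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!,
    so no statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # equals X / 2
    return exp(-half_stat) * sum(half_stat ** i / factorial(i) for i in range(k))


# Hypothetical p-values for one protein pair from two fractionation methods.
p_method_one = 0.05
p_method_two = 0.05
combined = fisher_combined_p([p_method_one, p_method_two])
```

Two individually borderline p-values give a smaller combined p-value, which is the sense in which complementary fractionation methods can reinforce each other. The independence assumption should be checked before applying the test to real data.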

Many fractionation profiles that could not be benchmarked were nonetheless highly correlated, and may therefore derive from novel complexes. Principal component analysis of gold standard and putative novel complexes indicated that novel complexes frequently elute in later SEC fractions, and are therefore often small. To test whether orthogonal data (e.g. Gene Ontology (GO) annotations) assist the prediction of these novel complexes, the Extending ‘Guilt-by-Association’ by Degree R package (3) was used. These analyses indicated that the identification of gold standard complexes is likely to benefit from the integration of GO data, whereas the prediction of novel complexes is not. This suggests that orthogonal experimental validation (e.g. cross-linking mass spectrometry) may be required to confirm novel complexes in CF-MS datasets.

  1. Wan et al. 2015 Nature 525: 339-44
  2. Heusel et al. 2019 Molecular Systems Biology 15(1)
  3. Ballouz et al. 2017 Bioinformatics 33: 612-24