The study of proteins provides knowledge that cannot be deduced from the study of genes or transcripts, because it is the proteins, not the genes, that determine cell phenotypes, in part through processes such as post-translational modifications (PTMs). Mass spectrometry (MS) is the core technology of proteomics, capable of both identifying and quantifying proteins in biological samples.
In the bottom-up approach to MS, proteins are first cleaved into peptides (smaller segments of proteins), which then enter the mass spectrometer, so measurements are made at the peptide level. Some peptides may be assigned to several proteins; these are referred to as shared or degenerate peptides. Peptide sharing cannot be resolved by technological advancement, because it is caused by sequence homology, often related to protein families (groups of evolutionarily related proteins) or protein variants (similar proteins originating from one gene or gene family).
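To make the notion of a shared peptide concrete, the following minimal sketch digests three toy proteins and lists the peptides that map to more than one of them. The sequences, the protein names and the helper functions are hypothetical, and the cleavage rule is a simplified trypsin rule (cut after K or R unless followed by P, no missed cleavages); it is an illustration under these assumptions, not a description of any particular search engine. It relies on `re.split` handling empty matches, which requires Python 3.7 or later.

```python
import re
from collections import defaultdict

# Hypothetical toy sequences, chosen only so that two proteins share peptides.
PROTEINS = {
    "PROT_A": "AAAKCCCRDDDK",
    "PROT_B": "AAAKEEERDDDK",
    "PROT_C": "GGGKHHHR",
}


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", sequence) if pep]


def map_peptides(proteins):
    """Return {peptide: set of protein names whose digest contains it}."""
    mapping = defaultdict(set)
    for name, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            mapping[peptide].add(name)
    return dict(mapping)


peptide_to_proteins = map_peptides(PROTEINS)

# A peptide is shared (degenerate) if it maps to more than one protein.
shared = {pep: prots for pep, prots in peptide_to_proteins.items() if len(prots) > 1}
unique = {pep: prots for pep, prots in peptide_to_proteins.items() if len(prots) == 1}

for pep in sorted(shared):
    print(pep, "is shared by", sorted(shared[pep]))
```

In this toy case the two homologous proteins differ in a single peptide each, so most of their peptide-level evidence is shared, which mirrors the situation described for protein families and variants.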
Currently, the most prevalent approach to handling shared peptides in protein quantification is to ignore them. This leads to several issues. For instance, fewer proteins are identified, because proteins identified only by shared peptides are discarded from the analysis. In addition, removing shared peptides alters protein abundance estimates and their precision.
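The consequence of discarding shared peptides can be sketched with a small, purely illustrative example. The peptide-to-protein assignments below are hypothetical, as is the helper `peptides_per_protein`; the sketch only counts the peptides available as evidence for each protein with and without shared peptides, and does not estimate abundances.

```python
# Hypothetical peptide-to-protein assignments (illustrative, not real data).
PEPTIDE_TO_PROTEINS = {
    "AAAK": {"PROT_A", "PROT_B"},  # shared
    "CCCR": {"PROT_A"},            # unique to PROT_A
    "DDDK": {"PROT_B", "PROT_C"},  # shared
    "GGGK": {"PROT_C"},            # unique to PROT_C
}


def peptides_per_protein(mapping, keep_shared):
    """Return {protein: sorted peptides usable as evidence for it}."""
    evidence = {}
    for peptide, proteins in mapping.items():
        # Skip peptides assigned to several proteins when shared ones are ignored.
        if not keep_shared and len(proteins) > 1:
            continue
        for protein in proteins:
            evidence.setdefault(protein, []).append(peptide)
    return {protein: sorted(peps) for protein, peps in evidence.items()}


with_shared = peptides_per_protein(PEPTIDE_TO_PROTEINS, keep_shared=True)
unique_only = peptides_per_protein(PEPTIDE_TO_PROTEINS, keep_shared=False)

# Proteins supported only by shared peptides disappear from the analysis.
lost_proteins = set(with_shared) - set(unique_only)

print("Proteins lost:", sorted(lost_proteins))
for protein in sorted(with_shared):
    print(protein, len(with_shared[protein]), "->", len(unique_only.get(protein, [])))
```

Here one protein is supported only by shared peptides and drops out entirely, while the remaining proteins keep fewer peptides each, which is the kind of evidence loss that affects the precision of abundance estimates.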
The main goal of this project is the development of a novel statistical modelling framework capable of including information from shared peptides in protein quantification analysis.