Important progress in the study of N-linked glycoproteomics of human serum
On February 29, 2020, the paper "Large-scale Identification of N-linked Intact Glycopeptides in Human Serum using HILIC Enrichment and Spectral Library Search" was published online in Molecular & Cellular Proteomics. The work was carried out by Prof. YANG Fuquan's group at the Institute of Biophysics, Chinese Academy of Sciences, and Prof. FU Yan's group at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences.
Glycosylation is one of the most important and prevalent post-translational modifications of proteins. Protein glycosylation plays vital roles in cells, including determining protein folding, trafficking and stability, and regulating various biological processes such as cell growth, cell-cell communication, cell-matrix interactions, viral replication and immune defense. Aberrant protein glycosylation is usually associated with the pathological progression of many diseases, including cancer, neurodegenerative disorders, pulmonary diseases, blood disorders, and genetic diseases. Most glycoproteins are potential drug targets and disease-related biomarkers.
Protein glycosylation mainly includes N-linked glycosylation and O-linked glycosylation, among which N-linked glycosylation accounts for about 70%. N-linked glycoproteins are widely distributed, from the surfaces of various cell types to human body fluids such as serum, cerebrospinal fluid and urine. Glycoproteins secreted into body fluids are thought to provide a detailed window into the state of health of an individual. These features make glycoproteins a highly interesting class of proteins for clinical and biological research. Protein glycosylation is exceptionally complex, exhibiting both macroheterogeneity and microheterogeneity, which makes glycoproteomics studies extremely challenging. N-glycoproteomics of human serum is even more challenging due to the wide dynamic range of serum protein abundances, the low abundance of N-glycoproteins, the lack of a complete serum N-glycan database and the existence of proteoforms.
In this study, serum proteins were first separated into low-abundance and high-abundance protein fractions by acetonitrile precipitation. After digestion, the N-linked intact glycopeptides were enriched by hydrophilic interaction liquid chromatography (HILIC), and a portion of the enriched N-linked intact glycopeptides was treated with N-Glycosidase F (PNGase F) to generate N-linked de-glycopeptides. Both the N-linked intact glycopeptides and the de-glycopeptides were analyzed by LC-MS/MS.
N-linked de-glycopeptides were first identified by searching their MS/MS spectra against human protein sequences, with four types of N-linked glycosylation sequence motifs (NXS/T/C/V, X≠P) used to recognize the N-linked de-glycopeptides. The spectra of the identified N-linked de-glycopeptides were then used to construct the spectral library of N-linked de-glycopeptides, with a series of Y ions (Y1, Y2…5) added to each spectrum. A database of 739 N-glycan masses was also constructed.
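The motif rule above (N, then any residue except P, then S, T, C or V) can be illustrated with a minimal Python sketch. It only locates candidate sites in a sequence; the function name `find_sequons` and the example sequence are hypothetical, and the sketch makes no claim about how the search engine used in the study applies the motifs.

```python
import re

# N-X-S/T/C/V with X != P. A lookahead is used so that overlapping
# motifs (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][STCV])")

def find_sequons(sequence):
    """Return 1-based positions of the N residue in each candidate motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]

# Hypothetical toy sequence, not a real protein.
example = "MNGSANPTNNTS"
print(find_sequons(example))  # [2, 9, 10]; N6 is skipped because X is P
```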
The identification of N-linked intact glycopeptides was performed with a spectral library search strategy using the pMatchGlyco software, the spectral library of N-linked de-glycopeptides and the N-glycan mass database. Compared with the sequence search method, the spectral library search method offers higher sensitivity and search speed. Moreover, precursor mass optimization and the consideration of semi-specific digestion and abundant chemical modifications further improved the identification sensitivity.
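As a rough illustration of how a peptide library and a glycan mass database can be combined, the sketch below pairs each library peptide with every glycan mass whose sum matches an observed precursor mass within a tolerance. This is a simplified assumption for illustration, not the pMatchGlyco algorithm: the function name `candidate_pairs`, the 10 ppm tolerance, the peptide labels and the peptide masses are all hypothetical, and the sketch ignores details such as the mass difference between an intact and a PNGase F-treated peptide.

```python
from bisect import bisect_left, bisect_right

def candidate_pairs(precursor_mass, peptide_masses, glycan_masses, tol_ppm=10.0):
    """List (peptide, glycan_mass) pairs with peptide + glycan ~= precursor.

    precursor_mass : neutral mass of the intact glycopeptide (Da)
    peptide_masses : dict mapping peptide label -> neutral mass (Da)
    glycan_masses  : iterable of N-glycan masses (Da)
    tol_ppm        : mass tolerance in ppm (assumed value)
    """
    tol = precursor_mass * tol_ppm * 1e-6
    glycans = sorted(glycan_masses)
    hits = []
    for peptide, pep_mass in peptide_masses.items():
        # The glycan must account for the mass not explained by the peptide.
        target = precursor_mass - pep_mass
        lo = bisect_left(glycans, target - tol)
        hi = bisect_right(glycans, target + tol)
        hits.extend((peptide, g) for g in glycans[lo:hi])
    return hits

# Hypothetical inputs: two made-up peptide masses and two glycan masses
# (HexNAc2Hex3 and HexNAc2Hex5, computed from monosaccharide residue masses).
peptides = {"PEP_A": 1000.0, "PEP_B": 1500.0}
glycans = [892.3172, 1216.4228]
hits = candidate_pairs(1000.0 + 1216.4228, peptides, glycans)
print(hits)
```

Sorting the glycan masses once and using binary search keeps each lookup cheap even when the glycan database holds several hundred entries.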
Figure 1. The strategy and experimental design for the identification of N-linked intact glycopeptides from N-linked glycoproteins in human serum.
A: Protein and N-linked intact glycopeptide sample processing with or without fractionation.
B: Data analysis workflow for N-linked intact glycopeptide identification.
(Image by Dr. YANG Fuquan's group)
In total, 526 N-linked glycoproteins, 1,036 N-linked glycosites, 22,677 N-linked intact glycopeptides and 738 N-glycan masses were identified under 1% FDR, representing the most in-depth N-glycoproteome of human serum identified by LC-MS/MS at the N-linked intact glycopeptide level. Transferrin is a well-known serum glycoprotein. In serum transferrin, four N-linked glycosites (N432, N523, N630 and N637) were identified, with 371, 2, 364 and 34 N-glycans at the respective sites. In the commercial serum transferrin standard, five N-linked glycosites (N432, N491, N523, N630 and N637) were identified, with 559, 5, 6, 547 and 117 kinds of N-glycans at the respective sites. These results demonstrate the microheterogeneity of glycosylation in serum transferrin.
This is the second collaboration between YANG's group and FU's group, following their development in 2018 of pMatchGlyco, a software tool for the analysis of N-linked intact glycopeptides (https://www.ncbi.nlm.nih.gov/pubmed/30186849).
Article link: https://www.mcponline.org/content/19/4/672
Contact: YANG Fuquan
Institute of Biophysics, Chinese Academy of Sciences
Beijing 100101, China
(Reported by Dr. YANG Fuquan's group)