Where does the amino acid sequence start and where does it end?
The terminal amino acid sequences are important for the biological functions of a protein. They influence the protein distribution to different cellular locations, and the protein degradation and turn-over rate.
For recombinant proteins, it is thus important to confirm that the N- and C-termini are as predicted from the gene. You also need to confirm that the protein expression is correct. In pharmaceutical proteins, it becomes critically important to confirm that the protein termini are correct and identical from batch to batch.
Frequently observed N-terminal variations include incomplete processing of the signal sequence, lack of removal of the N-terminal methionine residue, and partial or complete formation of pyroglutamic acid from N-terminal Glu or Gln residues.
Common variations of the C-terminus are sequence truncations, for example removal of the C-terminal Lys residue of the heavy chain in monoclonal antibodies. These variations of the protein termini may arise from differences in fermentation conditions and protein purification buffers. Therefore, N- and C-terminal sequencing is an important quality control analysis. It is also mandatory under the ICH Q6B Guidance on test procedures for biotechnological products.
You can confirm and identify the exact N- and C-termini of a pure protein by a combination of the following techniques:
N-terminal Edman degradation
Protein sequencing by cyclic Edman degradation can typically identify 5-30 residues from the protein N-terminus. The Edman chemistry does not work on N-terminal residues that are blocked for PITC coupling, for example by pyroglutamic acid formation or acetylation. Edman sequencing requires 2-10 micrograms of the pure protein in solution or on a PVDF membrane blot.
Top Down sequencing by MALDI ISD
Mass spectrometric sequence information can be obtained by fragmenting the intact protein. This typically confirms 20-80 residues from each end by accurate masses and mass differences. The technique can fragment and sequence both the N- and C-termini in the same mass spectrum.
Modifications such as pyroglutamic acid, acetylation, truncations, signal sequences, and leading Met residues can be detected by their specific masses. Top-down sequencing by MALDI ISD requires 20-100 micrograms of the purified protein in solution.
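To illustrate how terminal variants show up as specific mass differences, here is a minimal Python sketch. It lists the expected mass shift for a few of the variants mentioned above, relative to the unmodified sequence. The function names and the example sequence are hypothetical, and the sketch uses monoisotopic masses for simplicity; for large intact proteins, average masses are often used instead.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565    # H2O
AMMONIA = 17.026549  # NH3
ACETYL = 42.010565   # C2H2O added by acetylation


def mono_mass(seq):
    """Monoisotopic mass of an unmodified, linear peptide or protein."""
    return sum(MONO[aa] for aa in seq) + WATER


def terminal_variants(seq):
    """Return {variant name: mass shift in Da} for common terminal variants.

    Only variants that are chemically possible for this sequence are listed.
    """
    variants = {"unmodified": 0.0, "N-terminal acetylation": ACETYL}
    if seq[0] == "Q":
        variants["pyroGlu (from Gln)"] = -AMMONIA  # cyclization loses NH3
    elif seq[0] == "E":
        variants["pyroGlu (from Glu)"] = -WATER    # cyclization loses H2O
    if seq[0] == "M":
        variants["Met removed"] = -MONO["M"]
    if seq[-1] == "K":
        variants["C-terminal Lys removed"] = -MONO["K"]
    return variants


if __name__ == "__main__":
    seq = "QVQLVESGGK"  # hypothetical example sequence
    base = mono_mass(seq)
    for name, shift in terminal_variants(seq).items():
        print(f"{name:26s} {base + shift:10.4f} Da ({shift:+.4f})")
```

Comparing an observed mass against each candidate, within the mass accuracy of the instrument, indicates which terminal variant is present.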
Mass spectrometric peptide mapping by protease cleavage and LC MS/MS analysis
The protein is cleaved enzymatically, the resulting peptides are analyzed by LC MS/MS, and the peptide data are correlated with the expected amino acid sequence. This typically gives a sequence coverage of 30-90%.
The ionization efficiency in the mass spectrometer is much higher for most peptides than for the intact protein. Therefore, you can detect different peptides at much higher sensitivity. The peptide sequence coverage is rarely 100%, because some peptides may be lost in sample preparation or not detected by the LC MS/MS. Consequently, you cannot be completely sure that the terminal peptides are found.
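The point about incomplete coverage can be made concrete with a short Python sketch. It computes the sequence coverage from a list of identified peptides and reports separately whether the first and last residues are covered. The function name, the protein sequence, and the peptide list are hypothetical examples, not output from any particular peptide mapping software.

```python
def sequence_coverage(protein, peptides):
    """Return overall coverage and whether each terminus is covered.

    A residue counts as covered if it lies inside at least one
    identified peptide that matches the expected sequence exactly.
    """
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # ignore empty entries
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return {
        "coverage": sum(covered) / len(protein),
        "n_term_covered": covered[0],
        "c_term_covered": covered[-1],
    }


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVK"          # hypothetical sequence
    identified = ["TAYIAK", "QISFVK"]     # hypothetical identifications
    print(sequence_coverage(protein, identified))
```

In this example the coverage is 75%, and the C-terminus is covered but the N-terminus is not, so a good overall coverage figure alone does not confirm both termini.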
The peptide mapping software should also consider possible terminal modifications. In addition, you must allow nonspecific cleavage rules for the protease in order to detect the terminal peptides. You may use a combination of LC MS/MS peptide mapping with 2 or 3 different proteases to increase the sequence coverage. This is also a good way to confirm the observed terminal peptides.
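As a rough illustration of why a second protease helps, the following Python sketch performs a simplified in silico digestion and compares the terminal peptides produced by two enzymes. The cleavage rules are textbook simplifications chosen for this example (trypsin after K or R but not before P, chymotrypsin after F, Y, or W), the sequence is hypothetical, and missed or nonspecific cleavages are not modeled.

```python
def digest(seq, cleave_after, blocked_by=""):
    """Cleave seq C-terminal to any residue in cleave_after.

    Cleavage is skipped when the next residue is in blocked_by.
    Missed and nonspecific cleavages are not modeled.
    """
    peptides, start = [], 0
    for i, aa in enumerate(seq[:-1]):  # never cleave after the last residue
        if aa in cleave_after and seq[i + 1] not in blocked_by:
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


# Simplified rules, assumed for illustration only.
PROTEASES = {
    "trypsin": {"cleave_after": "KR", "blocked_by": "P"},
    "chymotrypsin": {"cleave_after": "FYW", "blocked_by": "P"},
}


def terminal_peptides(seq):
    """Return {protease: (N-terminal peptide, C-terminal peptide)}."""
    result = {}
    for name, rule in PROTEASES.items():
        peptides = digest(seq, **rule)
        result[name] = (peptides[0], peptides[-1])
    return result


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVK"  # hypothetical sequence
    for name, (n_pep, c_pep) in terminal_peptides(seq).items():
        print(f"{name:13s} N-term: {n_pep:8s} C-term: {c_pep}")
```

For this sequence, trypsin gives a two-residue N-terminal peptide that is unlikely to be detected, while chymotrypsin gives a longer one; at the C-terminus the situation is reversed. Each protease therefore covers the terminus that the other one may miss.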
Due to the inherent features and limitations of these three techniques, it is advisable to combine the N- and C-terminal analyses with an intact protein molecular weight (Mw) determination.
Alphalyse has expertise in N- and C-terminal sequencing from a variety of projects. Please see details of the analyses on our website, and contact us if you are interested in discussing N- and C-terminal sequencing of your protein.