The development of mass spectrometry (MS) opened a new frontier in amino acid sequencing and led to a decline in the use of Edman degradation. A typical MS workflow involves enzymatic digestion of proteins, ionization of the resulting peptides, separation of the ions according to their mass-to-charge ratios, and detection with an ion detector.
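The digestion and mass-to-charge steps can be illustrated with a minimal Python sketch. It assumes trypsin as the digestion enzyme (the text does not name one), uses the common simplified rule of cleaving after K or R unless the next residue is P, and relies on standard monoisotopic residue masses. The function names `tryptic_digest` and `mz` are hypothetical and chosen only for illustration.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the summed residues
PROTON = 1.007276   # mass of each proton added on ionization


def tryptic_digest(sequence):
    """Split a protein after K or R, except when the next residue is P.

    Assumption: trypsin-like specificity, no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mz(peptide, charge=1):
    """Mass-to-charge ratio of a peptide carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical input sequence, used only to exercise the functions.
    for peptide in tryptic_digest("MKWVTFISLLR"):
        print(peptide, round(mz(peptide, charge=2), 4))
```

Because the ratio divides by the charge, the same peptide appears at different positions in the spectrum depending on how many protons it carries.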
A major pain point of the bottom-up MS approach described above is its reliance on thoroughly annotated sequence libraries, because database searches can only match fragments of proteins rather than entire proteins. Because proteins must be digested into peptides of around 5-20 amino acids before identification, and because some genomes have yet to be sequenced, some proteins sequenced via mass spectrometry can be misidentified. To avoid the challenges posed by reference libraries, algorithms can be designed to automatically identify the most likely candidates matching the protein of interest [2].
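The dependence on a reference library can be shown with a deliberately simplified sketch. It assumes the peptide sequence is already known and matches it as a substring against library entries, whereas real search engines compare measured spectra against predicted ones. The library contents and the function name `match_peptide` are hypothetical.

```python
def match_peptide(peptide, library):
    """Return the names of library proteins that contain the peptide.

    Simplification: plain substring matching on sequences, standing in
    for the spectrum-based scoring that real search engines perform.
    """
    return sorted(name for name, seq in library.items() if peptide in seq)


# Hypothetical reference library with made-up sequences.
LIBRARY = {
    "protein_A": "MKWVTFISLLRAGSDEK",
    "protein_B": "MSTAGSDEKLLPR",
}

if __name__ == "__main__":
    print(match_peptide("WVTFISLLR", LIBRARY))  # found in one entry
    print(match_peptide("AGSDEK", LIBRARY))     # shared by two entries
    print(match_peptide("QQNNPYR", LIBRARY))    # absent from the library
```

A short peptide shared by several entries gives an ambiguous match, and a peptide from a protein missing from the library gives none, which is how an incomplete library can lead to missed or wrong identifications.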
Even after a peptide has been successfully identified, it remains difficult to discover isoforms and assign post-translational modifications (PTMs). Enrichment strategies, such as chromatography and ion exchange, are often used to locate and decipher PTMs; however, these strategies are challenging to execute.
In the past, MS techniques suffered from low sensitivity, but advances over the years have improved sensitivity for amino acid sequencing. MS can be coupled with assays such as antibody-based enrichment to offer up to a 100,000-fold improvement in sensitivity. In addition, sensitivity concerns can be addressed using a nanopore, an analytical tool with extreme sensitivity.
A better alternative is de novo protein sequencing by liquid chromatography tandem mass spectrometry (LC-MS/MS), which does not rely on databases. Using data from the mass spectrometer and machine learning algorithms, the protein sequence can be deciphered directly, bypassing the shortcomings described in the paragraphs above. However, this technology is not widely available, as many laboratories worldwide have not fully integrated machine learning capabilities with their wet-bench processes [2].
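The basic idea behind database-free sequencing can be sketched without any machine learning: the mass difference between consecutive fragment-ion peaks is matched to the residue mass of an amino acid. The sketch below is not the machine learning method described above. It assumes a clean, complete ladder of singly charged prefix (b-type) ions that starts at the proton mass, which real spectra rarely provide. The function name `read_ladder` and the tolerance value are hypothetical.

```python
# Standard monoisotopic residue masses in daltons.
# Isoleucine is omitted because it has the same mass as leucine,
# so "L" here stands for either L or I.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tol=0.01):
    """Translate gaps between consecutive peaks into residues.

    Assumptions: `peaks` are singly charged prefix-ion m/z values that
    form a complete ladder; `tol` is an illustrative tolerance in Da.
    Gaps that match no residue within the tolerance are reported as "?".
    """
    ordered = sorted(peaks)
    sequence = []
    for low, high in zip(ordered, ordered[1:]):
        gap = high - low
        residue, mass = min(RESIDUE_MASS.items(),
                            key=lambda item: abs(item[1] - gap))
        sequence.append(residue if abs(mass - gap) <= tol else "?")
    return "".join(sequence)


if __name__ == "__main__":
    # Idealized ladder for a made-up three-residue peptide.
    print(read_ladder([1.007276, 58.028736, 129.065846, 216.097876]))
```

Real spectra contain noise, missing peaks, and several ion types, which is one reason practical de novo tools rely on more sophisticated scoring and learned models rather than a direct lookup like this one.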