Review: Antibody Protein Sequence Analysis Using Mass Spectrometry

One important piece of information researchers need early in antibody drug research and development is the sequence of the antibody protein. With advances in mass spectrometry instrumentation and technologies, it is helpful, and sometimes critical, to conduct sequence analysis using mass spec experiments. This article highlights three technologies often used in this context.

  • Intact Mass Analysis
  • Peptide Mapping
  • Protein de novo Sequencing

The first two methods, intact mass analysis and peptide mapping, require prior knowledge of the antibody’s amino acid sequence. The third, de novo sequencing, does not; instead, the amino acid sequence is derived directly from the LC-MS/MS data.

Intact Mass Analysis

Intact mass analysis measures the molecular weight of the intact antibody protein. It may measure the antibody protein in its native form, or it may measure the heavy and light chains separately after reduction. In the case of antibody fragments, it may also measure the fragments, such as Fab or VH+CH1.

When the antibody’s primary sequence is known, the intact mass provides additional evidence to confirm that sequence. However, certain sequence variations, such as a leucine/isoleucine substitution or the swap of two amino acids, cannot be detected this way because they leave the total mass unchanged. More importantly, intact mass analysis can help derive the relative ratios of expressed glycoforms conferring biological activity or immunogenicity.
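To see why the intact mass reflects composition but not residue order, the sketch below sums average residue masses for a short chain. The sequence, the function name, and the simplification to an unmodified linear chain (no glycans, disulfide bonds, or other modifications) are illustrative assumptions, not the method of any particular software.

```python
# Average residue masses in Da (amino acid minus one water).
AVG = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free termini


def chain_mass(seq):
    """Average mass of an unmodified linear chain (simplifying assumption)."""
    return sum(AVG[aa] for aa in seq) + WATER


reference = "DIQMTQSPSSLSASVGDR"               # hypothetical example sequence
leu_to_ile = reference.replace("L", "I")        # Leu -> Ile substitution
swapped = "ID" + reference[2:]                  # first two residues swapped
ser_to_ala = reference.replace("S", "A", 1)     # a point mutation that shifts mass

for name, seq in [("reference", reference), ("Leu->Ile", leu_to_ile),
                  ("swap", swapped), ("Ser->Ala", ser_to_ala)]:
    print(f"{name:10s} {chain_mass(seq):.4f}")
```

The Leu->Ile and swapped variants print the same mass as the reference, while the Ser->Ala variant is lighter by about 16 Da (one oxygen), which is the kind of change intact mass analysis can flag.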

The general steps of a typical intact mass analysis are as follows.

  1. Perform LC-MS on the purified antibody protein or its fragments to generate a full MS spectrum
  2. Deconvolute the highly charged full MS spectrum
  3. Analyze and interpret the data
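The deconvolution step can be illustrated with a simplified calculation. Assuming an idealized spectrum in which neighbouring peaks of the envelope belong to the same protein and differ by exactly one proton, the charge and the neutral mass follow from two adjacent m/z values. The function names and the simulated 148,000 Da protein are assumptions for illustration; real deconvolution software handles noise, isotopes, and overlapping species with more robust algorithms.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(mass, z):
    """m/z of a protein of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z


def deconvolute_pair(mz_high, mz_low):
    """Charge and neutral mass from two adjacent peaks.

    mz_high carries charge z and mz_low carries charge z + 1, so
    z = (mz_low - PROTON) / (mz_high - mz_low).
    """
    z = round((mz_low - PROTON) / (mz_high - mz_low))
    return z, z * (mz_high - PROTON)


def deconvolute(peaks):
    """Average the mass estimates from every adjacent pair in the envelope."""
    ordered = sorted(peaks, reverse=True)
    masses = [deconvolute_pair(a, b)[1] for a, b in zip(ordered, ordered[1:])]
    return sum(masses) / len(masses)


# Simulated charge envelope for a hypothetical 148,000 Da antibody.
envelope = [mz(148000.0, z) for z in range(45, 56)]
print(round(deconvolute(envelope), 2))
```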

The image below illustrates a highly charged envelope of the target protein (B) as well as the corresponding deconvoluted spectrum (C).

Intact Mass

Peptide Mapping

Peptide mapping is the mass spectrometric analysis of the peptides generated by digesting a protein. It is a comparative procedure and can be used as an identity test for proteins. It requires a reference, whether a reference sequence, a reference standard, or a reference material, to compare and contrast with the target of interest.

Peptide mapping is particularly useful for finding differences between two or more samples or conditions when visualized in the liquid chromatogram. It can also be used to confirm the sequence of the target protein against a known sequence and to discover point mutations. The technology uses more information than intact mass measurement and is therefore more reliable. However, it still has difficulty detecting some subtle mutations, such as the swap of two amino acids.

The general steps of peptide mapping are as follows.

  1. Digest the monoclonal antibody protein into peptides using one or multiple enzymes with diverse cleavage sites
  2. Separate the peptides using chromatographic technologies such as reverse-phase HPLC
  3. Identify the peptides. When MS/MS is used for peptide identification, a database search engine such as Mascot or SEQUEST may be used
  4. Report and visualize the analysis by mapping the peptides to the LC chromatogram or the known protein sequence
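The digestion and mapping steps can be sketched in silico. The example below assumes trypsin with the commonly cited rule (cleave after K or R, but not before P), no missed cleavages, and a hypothetical reference sequence; the function names are illustrative and do not correspond to Mascot, SEQUEST, or any other tool.

```python
def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq[:-1]):
        if aa in "KR" and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


def sequence_coverage(reference, identified):
    """Fraction of reference residues covered by at least one identified peptide."""
    covered = [False] * len(reference)
    for pep in identified:
        pos = reference.find(pep)
        while pos != -1:
            for i in range(pos, pos + len(pep)):
                covered[i] = True
            pos = reference.find(pep, pos + 1)
    return sum(covered) / len(reference), covered


# Hypothetical reference sequence used only for illustration.
reference = "DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPK"
expected = tryptic_digest(reference)
print(expected)

# Pretend only two of the expected peptides were identified by MS/MS.
identified = [expected[0], expected[2]]
fraction, covered = sequence_coverage(reference, identified)
print(f"coverage: {fraction:.0%}")
print("".join(aa if hit else aa.lower() for aa, hit in zip(reference, covered)))
```

The last line prints the reference with uncovered residues in lower case, a plain-text stand-in for the sequence map described in step 4.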

The image below illustrates a tryptic peptide mapping comparing three monoclonal antibodies.

Peptide Mapping on LC

The image below shows how mapping peptides to a known protein sequence can help confirm the sequence information.

Peptide Sequence Map

Protein de novo Sequencing

Protein de novo sequencing is the process of deriving the antibody protein sequence directly from the mass spec data without prior knowledge of the sequence. It measures the peptides using a high mass accuracy instrument and derives the protein sequence from the consensus of the overlapping peptides as well as the fragment ion information.

At the core of this technology is a peptide de novo sequencing algorithm that can produce both accurate de novo peptides and an accurate score for each amino acid. This is particularly important for the complementarity-determining regions (CDRs) of an antibody. The six CDRs from the heavy and light chains are highly variable. This provides the diversity the immune system requires, but at the same time it makes database sequences unreliable as references for those regions.
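The basic principle behind reading residues from fragment ions can be shown with a toy example: the mass difference between consecutive b-ions corresponds to one residue. The sketch below assumes a complete, noise-free, singly charged b-ion ladder and uses hypothetical function names; it is not the algorithm of any specific de novo sequencing software, which must also score incomplete and noisy spectra.

```python
# Monoisotopic residue masses in Da, ordered by mass.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ions(peptide):
    """Singly charged b-ion m/z values b1..bn for an unmodified peptide."""
    ions, total = [], PROTON
    for aa in peptide:
        total += MONO[aa]
        ions.append(total)
    return ions


def read_ladder(mzs, tol=0.01):
    """Assign residues to the gaps between consecutive ions within `tol` Da."""
    calls = []
    for low, high in zip(mzs, mzs[1:]):
        delta = high - low
        matches = [aa for aa, m in MONO.items() if abs(m - delta) <= tol]
        calls.append("/".join(matches) if matches else "?")
    return calls


ladder = b_ions("AQLK")                # hypothetical peptide
print(read_ladder(ladder, tol=0.01))   # tight tolerance
print(read_ladder(ladder, tol=0.05))   # loose tolerance
```

With the tight tolerance, Q and K (about 0.036 Da apart) are resolved, which is one reason a high mass accuracy instrument matters. L and I have identical masses and remain ambiguous at any tolerance in this simple scheme.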

The general steps of protein de novo sequencing are as follows.

  1. Digest the monoclonal antibody protein into peptides using one or multiple enzymes with diverse cleavage sites
  2. Separate the peptides using chromatographic technologies such as reverse-phase HPLC
  3. Produce tandem mass spectra using a high mass accuracy mass spectrometer
  4. Perform peptide de novo sequencing to generate de novo peptides
  5. Assemble the de novo peptides back into the full antibody protein sequence
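The assembly step relies on overlaps between peptides produced by enzymes with different cleavage sites. The sketch below is a minimal greedy overlap merge under strong assumptions: every peptide is error-free, overlaps of at least three residues are trusted, and per-residue confidence scores are ignored. The function names and peptides are hypothetical; production assemblers are considerably more sophisticated.

```python
def overlap(left, right, min_len):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def assemble(peptides, min_len=3):
    """Greedily merge the pair with the longest overlap until none remain."""
    contigs = list(dict.fromkeys(peptides))  # drop exact duplicates
    # Drop peptides fully contained in another peptide.
    contigs = [p for p in contigs
               if not any(p != q and p in q for q in contigs)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


# Hypothetical overlapping peptides, as if from digests with different enzymes.
peptides = ["DIQMTQSPSSL", "SPSSLSASVGDR", "SASVGDRVTITCR"]
print(assemble(peptides))
```

When no overlap of the minimum length exists, the function returns several contigs rather than guessing, which mirrors why multiple enzymes with diverse cleavage sites are used to obtain enough overlapping peptides.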

The image below visualizes the peptides identified in a de novo sequenced protein by Rapid Novor’s antibody sequencing software. Bold font highlights the amino acids confidently supported by the MS/MS fragment ion peaks. The overlapping peptides collectively support the correctness of each amino acid of the antibody protein.

Rapid Novor de novo antibody sequencing

(Image credit: Screenshot of Rapid Novor antibody sequencing software)


The three technologies each have their own advantages. The complexity of the experiments and data analysis increases significantly from intact mass analysis to de novo sequencing, and so do the required expertise and the associated cost. If one is confident about the primary sequence and the purpose is to guard against sequence variations that significantly change the intact mass of the whole protein, or the mass or retention time of certain peptides, then intact mass analysis and peptide mapping are the methods of choice due to their lower cost and faster turnaround. If one requires confidence in every single amino acid of the antibody, then protein de novo sequencing is the way to go.
