HAL Id: hal-02329885
https://hal.archives-ouvertes.fr/hal-02329885
Submitted on 23 Oct 2019
Separation Methods hyphenated to Mass Spectrometry for the Characterization of the Protein Glycosylation at the Intact Level
Julien Camperi, Valérie Pichon, Nathalie Delaunay
To cite this version:
Julien Camperi, Valérie Pichon, Nathalie Delaunay. Separation Methods hyphenated to Mass Spectrometry for the Characterization of the Protein Glycosylation at the Intact Level. Journal of Pharmaceutical and Biomedical Analysis, Elsevier, In press, ⟨10.1016/j.jpba.2019.112921⟩. ⟨hal-02329885⟩
Separation Methods hyphenated to Mass Spectrometry for the Characterization of the Protein Glycosylation at the Intact Level
Julien Camperi (a), Valérie Pichon (a,b), Nathalie Delaunay (a)
(a) Laboratory of Analytical, Bioanalytical Sciences and Miniaturization, UMR CBI 8231 CNRS - ESPCI Paris, PSL University, Paris, France
(b) Sorbonne Université, Paris, France
Corresponding author: Nathalie Delaunay, [email protected]
Corresponding author institution: LSABM, UMR CBI 8231 CNRS – ESPCI Paris, 10 rue Vauquelin, 75005 Paris, France.
Abstract
Glycosylation is one of the most common post-translational modifications of proteins and affects their biological activity, solubility, and half-life. Therefore, its characterization is of great interest in proteomics, particularly from a diagnostic and therapeutic point of view.
However, the number and type of glycosylation sites, the degree of site occupancy, and the different possible structures of glycans can lead to a very large number of isoforms for a given protein, called glycoforms. The identification of these glycoforms constitutes an important analytical challenge. Indeed, to attempt to characterize all of them, it is necessary to develop efficient separation methods associated with a sensitive and informative detection mode, such as mass spectrometry (MS). Most analytical methods are based on bottom-up proteomics, which consists of analyzing the protein at the glycopeptide level after its digestion. Even if this approach provides essential information, including the localization and composition of glycans on the protein, it is also characterized by a loss of information on macro-heterogeneity, i.e. the nature of the glycans present on a given glycoform. The analysis of glycoforms at the intact level can overcome this disadvantage. The aim of this review is to detail the state of the art of separation methods that can be easily hyphenated with MS for the characterization of protein glycosylation at the intact level. The different electrophoretic and chromatographic approaches are discussed in detail. The miniaturization of these separation
the development and optimization of the separation step to achieve high resolution between isoforms, the recent ones are much more application-oriented, such as clinical diagnosis, quality control, and glycoprotein monitoring in formulations or biological samples.
Keywords: Intact protein; Glycosylation; Capillary electrophoresis; Liquid chromatography; Mass spectrometry
Abbreviations
ACN, acetonitrile; AGP, α1-acid glycoprotein; Asn, asparagine; BGE, background electrolyte; BMP-2, bone morphogenetic protein-2; CGE, capillary gel electrophoresis; CIEF, capillary isoelectric focusing; DS, dextran sulfate; EIE, extracted ion electropherogram; EOF, electroosmotic flow; EPO, erythropoietin; FA, formic acid; FT, Fourier Transform; Fuc, fucose; Gal, galactose; GalNAc, N-acetylgalactosamine; GlcNAc, N-acetylglucosamine; hCG, human chorionic gonadotropin; hCGα, α-subunit of human chorionic gonadotropin; HIC, hydrophobic interaction chromatography; HRMS, high-resolution mass spectrometry; ICR, ion cyclotron resonance; IFN-β, interferon-β-1a; mAb, monoclonal antibody; Man, mannose; PB, polybrene; PCA, principal component analysis; PEI, polyethylenimine; PTMs, post-translational modifications; Q, quadrupole; RNase B, ribonuclease B; Ser, serine; Tf, transferrin; TFA, trifluoroacetic acid; Thr, threonine; TQ, triple quadrupole
1. Introduction
The study of glycoproteins is a rapidly growing field, which is not surprising since approximately 70% of human proteins are glycosylated [1,2] and many protein properties, such as stability, cell-cell communication, localization, half-life, and activity, depend largely on glycosylation [2-5]. The impact of protein glycosylation is evident in diseases related to glycosylation disorders [6], such as autoimmune diseases [7,8], Alzheimer’s disease [9], cancers [10-12], or genetic defects [13,14]. This is why the glycosylation state of some proteins of interest in biological fluids may serve as a disease biomarker. Moreover, many glycoproteins have been approved as biopharmaceutical drugs, such as erythropoietin (EPO) and monoclonal antibodies (mAbs). Since their glycosylation state affects their stability, solubility, and clearance, its characterization is essential during their development (discovery, preclinical, and clinical steps), production, storage, and delivery [4,5]. Many steps are therefore concerned, such as clone selection, pharmacokinetic and pharmacodynamic studies, in-process control, quality control of the final product, and lot-to-lot consistency in glycan composition, sialic acid content, glycan profiles, or glycan sequences. However, the characterization of protein glycosylation poses huge analytical challenges, as glycosylation is one of the most complex post-translational modifications (PTMs).
Two types of glycans can be linked to proteins: N- and O-linked glycans (Figure 1). N-glycosylation corresponds to the attachment of a glycan to the protein via an asparagine (Asn). Due to their common biosynthesis pathway [15], all N-glycans share a common core structure composed of 2 GlcNAc and 3 mannose (Man) residues and can be classified into 3 groups: (i) high-mannose, containing only two GlcNAc and a variable number of Man, (ii) complex, composed of different monosaccharides in addition to the common structure, and (iii) hybrid, combining features of both high-mannose and complex-type species. Complex N-glycans are the main forms in humans. With regard to O-glycosylation, the most common forms are initiated by the addition of a single monosaccharide, GlcNAc (O-GlcNAcylation) or GalNAc (mucin-type glycosylation), to Ser or Thr residues of proteins. These O-linked glycans have no common structure and their size can vary, ranging from a single monosaccharide residue to extended chains, and can include a large number of different monosaccharides [16].
For each glycoprotein, potential N- or O-glycosylation sites may be occupied in varying proportions and different glycans can be present at a given glycosylation site, resulting in micro-heterogeneity for each site. In addition, glycoproteins may also have other types of PTMs including acetylation, phosphorylation, methylation, etc. Consequently, glycosylation and other PTMs lead to a large number of isoforms for a given protein, with different molecular weights and concentrations.
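As a rough illustration of how site occupancy and micro-heterogeneity multiply, the following sketch computes an upper bound on the number of site-specific glycoforms as the product of the number of possible states at each site. The site counts and glycan numbers are hypothetical values chosen for illustration; the calculation ignores other PTMs and any biosynthetic constraints, and distinct combinations may share the same mass.

```python
from math import prod

def count_glycoforms(glycans_per_site, allow_unoccupied=True):
    """Upper bound on site-specific glycoforms.

    glycans_per_site: number of distinct glycans possible at each site.
    allow_unoccupied: if True, each site may also carry no glycan.
    """
    extra_state = 1 if allow_unoccupied else 0
    return prod(n + extra_state for n in glycans_per_site)

# Hypothetical protein (assumption): 3 N-sites with 10 candidate glycans each
# and 1 O-site with 4 candidate glycans.
sites = [10, 10, 10, 4]
print(count_glycoforms(sites))                          # 11 * 11 * 11 * 5 = 6655
print(count_glycoforms(sites, allow_unoccupied=False))  # 10 * 10 * 10 * 4 = 4000
```

Even with these modest assumed numbers, the count reaches several thousand, which is consistent with the separation challenge described in this review.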
The most commonly used approach to characterize protein glycosylation is the bottom-up one, based on the analysis of glycopeptides resulting from protein digestion, which allows the identification and localization of the glycans on the protein [5,17-22]. This site-specific information is useful for assessing the micro-heterogeneity of each glycosylation site and for confirming the possible presence of other PTMs. Nevertheless, this requires several pre-treatment steps: often preliminary reduction and alkylation, an efficient enzymatic digestion of the protein, and purification and/or enrichment of glycopeptides prior to liquid chromatography (LC)- or capillary electrophoresis (CE)-mass spectrometry (MS) analysis.
Figure 1: Representation of (A) N- and (B) O-glycans.
Another way to study protein glycosylation is to analyze glycans released by glycoproteins or glycopeptides [5,19,21-25]. This approach allows the determination of the intrinsic structural heterogeneity of glycans resulting from branching, isomerism, and linkage position between different monosaccharides. As for the glycopeptide analysis, a reduction/alkylation step followed by enzymatic digestion or chemical reaction is usually necessary to release the glycans [26]. Several purification/extraction steps can be carried out to facilitate the identification and quantitation of glycans by removing peptides, proteins, salts, and some other molecules added during the sample pre-treatment [27]. A derivatization step is often performed to improve both their separation in LC or CE and detection by fluorescence or MS.
The analysis of protein glycosylation can also be performed at the intact level [5,22,28-30], leading to different and complementary information compared to the two approaches described above. For example, the analysis of the protein isoforms at the intact level is the only way to correlate an isoform with its different PTMs and to determine the total number of isoforms and their relative abundances. This is very relevant when some isoforms are biomarkers. Moreover, the two previous approaches require one or more enzymatic digestion steps, sometimes combined with enrichment and/or derivatization steps, thus leading to a long and costly analytical protocol. In contrast, the intact approach does not require complex sample pretreatment; it is therefore faster and cheaper, and it avoids modifications induced by sample handling. For all these reasons, the characterization of glycoproteins at the intact level has expanded rapidly over the past decade.
[Figure 1 panels: (A) N-glycans linked to Asn (complex, hybrid, and high-mannose types); (B) O-glycans linked to Thr/Ser. Legend: mannose (Man), galactose (Gal), N-acetylneuraminic acid (NeuAc), N-acetylglucosamine (GlcNAc), fucose (Fuc).]
However, the main challenge in this case is to achieve an efficient separation of isoforms, which may be difficult due to their structural similarities, especially for proteins with several glycosylation sites leading to a high number of glycoforms. LC and CE are both high-performance separation methods that can help to achieve this objective. They are often hyphenated to MS via an electrospray ionization (ESI) source, thus providing sensitivity, selectivity, and information on glycoform identity [5,22]. The hyphenation of the separation step with MS dramatically improves the efficiency of the ionization step and leads to simplified mass spectra, even when using high-resolution MS (HRMS). With regard to the literature, some reviews have already been published on CE and LC coupled with MS for the analysis of intact proteins [28,31-35], but they have not specifically focused on characterizing their glycosylation. Therefore, we propose in this review to discuss in detail the potential of CE and LC hyphenated to mass spectrometry for the characterization of the glycosylation of proteins analyzed in their intact form. Approaches used to reduce the size of large proteins to obtain fragments before analysis, such as the middle-up approach, will not be discussed here, as they have already been presented in other reviews [17,21,22]. Two types of applications are concerned: the analysis of (i) pharmaceutical preparations containing the protein at a high concentration in a quite simple solution and (ii) biological samples of much greater complexity with potentially low concentrations of the target glycoprotein. This second application requires careful sample handling before the separation and detection steps, as already discussed in some reviews [5,19,20,36].
2. Capillary Electrophoresis
Different CE modes are regularly used for the analysis of intact proteins, such as capillary zone electrophoresis (CZE), capillary isoelectric focusing (CIEF), or capillary gel electrophoresis (CGE). CGE can separate glycoforms, because glycans modify the mobility of the glycoform in gel electrophoresis through their mass but also through their charge, since sodium dodecyl sulfate molecules do not interact with them. Nevertheless, CGE is not compatible with direct coupling to MS, which prevents further characterization of isoforms.
Therefore, only CZE and CIEF are addressed in this review for their contribution to the glycosylation characterization.
2.1 Capillary zone electrophoresis
Historically, CZE is the most widely used method for the separation of intact proteins and its potential has already been extensively described in the literature [28,31,33]. It is characterized by high efficiencies, especially for proteins as they have a low diffusion coefficient; it can use native aqueous conditions, preventing denaturation or conformational change due to a stationary phase or solvent; and it can separate protein isoforms according to their charge-to-size ratio. CZE can also be easily hyphenated with MS via an ESI source when a volatile background electrolyte (BGE) is used. This hyphenation allows the characterization and identification of isoforms, particularly with HRMS analyzers [32,37].
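The link between a low diffusion coefficient and high efficiency can be illustrated with the classical diffusion-limited expression for the plate number in CZE, N = μ_app·V/(2D). The sketch below is only indicative: it assumes that longitudinal diffusion is the sole source of band broadening and that detection occurs at the capillary outlet, and the mobility, voltage, and diffusion coefficients are hypothetical values, not data from the cited studies.

```python
def plate_number(apparent_mobility, voltage, diffusion_coefficient):
    """Diffusion-limited plate number N = mu_app * V / (2 * D).

    apparent_mobility in m^2 V^-1 s^-1, voltage in V,
    diffusion_coefficient in m^2 s^-1.
    """
    return apparent_mobility * voltage / (2 * diffusion_coefficient)

# Hypothetical values (assumptions for illustration only)
MU_APP = 2e-8       # apparent mobility, m^2/(V s)
VOLTAGE = 30_000    # applied voltage, V
D_PROTEIN = 5e-11   # diffusion coefficient of a protein, m^2/s
D_SMALL = 5e-10     # diffusion coefficient of a small molecule, m^2/s

n_protein = plate_number(MU_APP, VOLTAGE, D_PROTEIN)
n_small = plate_number(MU_APP, VOLTAGE, D_SMALL)
print(f"protein: N = {n_protein:.2e}; small molecule: N = {n_small:.2e}")
```

Under these assumptions, a tenfold lower diffusion coefficient gives a tenfold higher theoretical plate number; real efficiencies are lower because adsorption, injection plug length, and Joule heating also contribute to band broadening.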
Focusing on glycoproteins, CZE-MS has been used for their analysis at the intact level to characterize their glycosylation (Table 1), either for glycoprofiling and fingerprinting purposes, or for identification in clinical diagnosis, quality control, or monitoring in formulations. More than 30 papers have been published, one-third dealing with EPO (3 N- and 1 O-glycosylation sites) and two-thirds with other glycoproteins such as ribonuclease B (RNase B, 1 N-glycosylation site), transferrin (Tf, 2 N-glycosylation sites), mAbs (2 N-glycosylation sites), α1-acid glycoprotein (AGP, 5 N-glycosylation sites), human chorionic gonadotropin (hCG, 4 N- and 4 O-glycosylation sites), or interferon-β-1a (IFN-β, 1 N-glycosylation site).
2.1.1 Sample considerations
Most analyses used protein concentrations of a few mg/ml. Some authors increased the injection volume to 2.7 or 4% of the capillary volume, but this negatively affected the separation resolution [61,62]. Indeed, as large injection volumes in CZE induce dramatic band broadening, high concentrations are required to inject sufficient sample amounts. Nevertheless, they may induce the formation of aggregates [44]. It is worth noting that a large number of isoforms decreases the sensitivity for any given isoform. As an example, more than 100 isoforms were detected for EPO, with signals varying by 2 to 3 orders of magnitude, with a sensitivity in the femtomole range [47].
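To give an order of magnitude for the injected amounts discussed above, the following sketch converts a plug expressed as a percentage of the capillary volume into a volume and a mass. The capillary dimensions (50 µm inner diameter, 80 cm length) and the 1 mg/ml concentration are assumptions for illustration only; they are not taken from the cited studies.

```python
from math import pi

def capillary_volume_nl(inner_diameter_um, length_cm):
    """Internal volume of a cylindrical capillary, in nanolitres."""
    radius_m = inner_diameter_um * 1e-6 / 2
    volume_m3 = pi * radius_m**2 * (length_cm * 1e-2)
    return volume_m3 * 1e12  # 1 m^3 = 1e12 nL

def injected_mass_ng(concentration_mg_per_ml, plug_volume_nl):
    """Injected mass in ng (1 mg/ml is equivalent to 1 ng/nL)."""
    return concentration_mg_per_ml * plug_volume_nl

# Hypothetical capillary and sample concentration (assumptions)
total_nl = capillary_volume_nl(inner_diameter_um=50, length_cm=80)
for percent in (2.7, 4.0):
    plug_nl = total_nl * percent / 100
    mass_ng = injected_mass_ng(1.0, plug_nl)
    print(f"{percent}% of {total_nl:.0f} nL -> {plug_nl:.1f} nL, {mass_ng:.1f} ng")
```

With these assumed dimensions, even a plug of a few percent of the capillary corresponds to only tens of nanolitres, which is why concentrations in the mg/ml range are typically needed.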
When pharmaceutical formulations are analyzed, the glycoprotein concentration can easily reach the mg/ml range, often after a simple centrifugal filtration step that also eliminates salts and excipients. However, if the targeted protein is in a biological sample, its concentration may be very low and it may be present among other highly concentrated proteins. As an example, the concentration of endogenous EPO is in the ng/l range in human urine and about 10 ng/l in serum. In that case, a pretreatment step must be optimized before analysis. CZE-MS can be very useful for characterizing samples purified by different methods, in order to select the one that reduces sample preparation time, has a high extraction yield, and minimizes chemical modifications and, therefore, changes in the isoform pattern. For example, two methods for purifying AGP from human serum were evaluated, in particular with regard to the potential AGP desialylation due to the use of acidic conditions [41]. One was based on the use of an acidic reagent for precipitation that was neutralized immediately after precipitation and before immunoextraction. In the second one, the acidic precipitation before the immunoextraction step was skipped, which led to higher yields.
2.1.2 Capillary coating
After sample injection, the best possible separation must be obtained in order to favor the ionization of all isoforms as much as possible, especially the minor ones, which in turn improves the ion signal intensity and the quality of the mass spectra. It is often necessary to coat the capillary to both limit protein adsorption and control electroosmotic flow (EOF), as shown in Table 1. For the coupling of CZE with MS, a dynamic coating requiring the constant presence of a non-volatile coating agent in the BGE is excluded, despite its simplicity and stability. A statically adsorbed coating may be used instead; it results from successive percolation(s) of one or more polymer solutions through the capillary prior to the analysis, while the capillary outlet is outside the ESI source, and leads to a semi-permanent coating. A permanent, i.e. covalently immobilized, coating can also be used, avoiding frequent recoating and potential contamination of the MS source. Several reviews describing these different types of coating are available [64,65].
As can be seen in Table 1, three approaches are used:
- a cationic coating, giving an anodic EOF (written A in the EOF column), with an acidic BGE. Isoforms are positively charged, favoring their electrostatic repulsion from the positively charged capillary surface. EOF and analyte mobility are in opposite directions, which may increase resolution but also analysis time if the EOF is not too high. This is the most common approach.
- a neutral coating suppressing EOF (written 0 in the EOF column), with an acidic BGE. Isoforms are positively charged and resolution depends only on their electrophoretic mobility difference.
- an anionic coating, leading to a cathodic EOF (written C in the EOF column), with a basic BGE. Isoforms are negatively charged, favoring their electrostatic repulsion from the negatively charged capillary surface. EOF and analyte mobility are in opposite directions. This is the least used configuration, because an acidic BGE is often preferred to improve MS ionization in positive mode.
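The influence of a counter-directional EOF on resolution and analysis time can be sketched with the classical diffusion-limited CZE expressions, Rs = Δμ/(4√2) · √(V/(D·|μ + μ_EOF|)) and t = L²/(|μ + μ_EOF|·V). This is a minimal sketch under idealized assumptions (no adsorption, no Joule heating, negligible injection plug, detection at the capillary outlet); all numerical values are hypothetical and are not taken from the studies listed in Table 1.

```python
from math import sqrt

def resolution(delta_mu, mu_mean, mu_eof, voltage, diffusion):
    """Diffusion-limited resolution between two isoforms.

    Mobilities in m^2 V^-1 s^-1 (signed), voltage in V, diffusion in m^2 s^-1.
    """
    net = abs(mu_mean + mu_eof)
    return delta_mu / (4 * sqrt(2)) * sqrt(voltage / (diffusion * net))

def migration_time_s(mu_mean, mu_eof, voltage, length_m):
    """Migration time to the capillary outlet for a capillary of length_m."""
    net = abs(mu_mean + mu_eof)
    return length_m**2 / (net * voltage)

# Hypothetical parameters (assumptions for illustration only)
MU, DELTA_MU = 1.0e-8, 1.0e-10   # mean isoform mobility and mobility difference
VOLTAGE, LENGTH, DIFF = 30_000, 0.8, 5e-11

scenarios = {
    "neutral coating (EOF suppressed)": 0.0,
    "moderate counter-EOF": -1.5e-8,
    "strong counter-EOF": -3.0e-8,
}
for name, mu_eof in scenarios.items():
    rs = resolution(DELTA_MU, MU, mu_eof, VOLTAGE, DIFF)
    t_min = migration_time_s(MU, mu_eof, VOLTAGE, LENGTH) / 60
    print(f"{name}: Rs = {rs:.2f}, t = {t_min:.1f} min")
```

In this idealized model, a counter-EOF improves resolution only when it leaves a small net mobility; a strong counter-EOF shortens the analysis but lowers resolution compared with a suppressed EOF.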
For the first approach, the first impressive results showing the analysis of many isoforms in CZE-ESI-MS were obtained with EPO [45,47] using a cationic Polybrene (PB)-based coating combined with an acidic BGE. Although there are some exceptions, such as the strong adsorption observed for the Tf glycoforms regardless of cationic polyethylenimine (PEI) concentration, coating procedure, BGE composition, and pH [62], these conditions generally provide high efficiencies. For example, efficiencies between 350,000 and 450,000 were obtained for IFN-β, allowing the resolution of isobaric positional isomers of a single sialic acid on a biantennary glycan, and the detection and quantification of 138 isoforms [58]. Another cationic polymer containing an amine (LA 113) was also reported for capillary coating and the analysis of VEGF165 isoforms [63]. A multilayer coating based on PB-dextran sulfate (DS)-PB was used to improve EOF reproducibility and coating stability [57,66].
In different studies, cationic and neutral coatings were compared (see Table 1). For example, a cationic PB coating was compared to a polyacrylamide (PA)-based coating at acidic pH [38]. For both fetuin and AGP, the separation of the main sialoforms was considerably improved by the suppression of EOF. Indeed, in this case, the CZE separation results only from the electrophoretic mobilities of the isoforms, providing a better separation resolution within an acceptable analysis time. In addition, partially resolved peaks appeared, corresponding to glycoforms with the same sialic acid number but with an additional HexHexNAc. As expected, the increase in resolution improved the quality of the mass spectra and enhanced sensitivity. A similar study carried out with EPO came to the same conclusions [48].
A more detailed study was performed by Neusüß and co-workers to compare coatings based on PA, UltraTrol High Reverse (HR), underivatized PAA, N,N-dimethylacrylamide-ethylpyrrolidine methacrylate, or PEI, for the separation of EPO isoforms and a mixture of small and medium-sized proteins (<33 kDa) [52]. The best resolution was obtained with the neutral coatings.
Gimenez et al. also compared neutral PA and cationic UltraTrol HR coatings for the analysis of EPO and novel erythropoiesis-stimulating protein (NESP) [49]. Once again, the neutral coating gave a higher resolution than the cationic one, at the cost of a longer analysis time. However, the reproducibility studies demonstrated a lack of stability and bleeding of the PA coating, resulting in a dramatic loss of MS sensitivity and the presence of interfering peaks in the spectra. The cationic coating was more stable. It is worth noting that, to maintain the performance of the neutral coating, the coating has to be renewed every 5 to 10 runs, which requires removing the capillary from the CE-ESI-MS interface to avoid contamination of the ESI source [51,52]. A rinsing step with 3 M HCl before coating also helps to maintain reproducible migration times.
Therefore, it appears in the studies presented in Table 1 that most often the second approach involving an acidic BGE plus a neutral coating that suppresses EOF led to higher resolution than the first approach with a cationic coating. Although the EOF and analyte mobility are in opposite directions, which should increase resolution, this was not observed because the EOF was too high. Indeed, in another study with a cationic coating with an acrylamide-pyrrolidine methacrylate copolymer generating a moderate EOF, a better separation of AGP isoforms was obtained and in less time than with a neutral coating [41].
Recently, a new neutral coating (Neutral OptiMS) was evaluated for mAb characterization [59]. It consists of a first hydrophobic layer to protect siloxanes from hydrolysis and a second layer of polyacrylamide providing a hydrophilic surface, a design intended to improve coating stability. Coating stability must therefore also be taken into account when choosing the most appropriate coating. The choice also depends on the objective, i.e. characterizing the greatest number of glycoforms, even if the analysis time is long, or performing a fast analysis at the expense of separation resolution [48,53,54].
In the third approach, an anionic coating and a BGE with a pH higher than the pI values of the protein glycoforms are preferred. As an example, a PB coating and a wide range of pH and ionic strength values did not give a successful separation of Tf glycoforms, unlike an anionic coating and a BGE composed of 25 mM ammonium acetate at pH 8.5 [62]. The anionic coating was obtained with a first layer of PB followed by a second DS layer, leading to a pH-independent cathodic EOF.
Another criterion for selecting the nature of the coating may be the objective of analyzing the protein in denaturing or native conditions. Indeed, strongly acidic BGEs, which may also contain an organic solvent, can induce protein denaturation. To analyze the protein in native conditions, a BGE with a physiological pH is required. Therefore, Karger and co-workers selected a cationic coating for the characterization of mAbs in denaturing conditions and a neutral one when native conditions were used [60]. Nevertheless, the separation of the 2x-glycosylated mAb (with glycans present on both Fc/2 domains), the 1x-glycosylated mAb (with glycans present on one of the two Fc/2 domains), and the aglycosylated mAb was only observed with the BGE composed of 10% isopropanol in 2% FA, i.e. in denaturing conditions.
2.1.3 Other CZE parameters
In addition to the capillary coating, the BGE composition must also be optimized, i.e. the pH, ionic strength, and nature and content of any additive and/or solvent, which all affect the ionization state of the isoforms, their mobility, solubility, and potential adsorption on the capillary wall. Moreover, the BGE must be volatile. It affects not only the separation (resolution and analysis time) but also the ESI process, and hence the sensitivity of the final method. An increase in the ionic strength of the BGE [56] or the addition of a solvent such as MeOH [45] can improve resolution. The generated current, and therefore Joule heating, is also a criterion, and limit values are recommended with some CE-MS interfaces. Other conventional CZE parameters such as capillary length, voltage, and/or temperature must also be optimized in parallel with the BGE composition, as was done for example for the analysis of neo-glycoproteins [66].
After separation in CZE, the glycoforms must be ionized for their introduction in the MS analyzer. This requires a volatile BGE, although in some cases, the use of non-volatile agents such as urea cannot always be avoided to achieve separation. For example, the separation of the AT isoforms required the introduction of urea at 4 M [43]. A neutral coating was thus necessary to completely suppress the EOF, preventing the introduction of high amounts of urea into the ESI source. Another option may be to carry out two-dimensional CZE-CZE-MS, with a four-port valve interface allowing the use of a non-volatile BGE in the first dimension leading to high resolution and a volatile BGE in the second dimension to favor the MS signal [68].
2.1.4 CE-MS interfaces
With regard to the CE-MS interface, CE can be coupled to ESI-MS by sheathless or sheath-liquid interfaces, as reviewed elsewhere [31,69]. Nevertheless, Table 1 shows that by far the most commonly used interface for glycoprotein characterization is the coaxial junction with a sheath liquid, the sheath liquid being the most important parameter affecting signal intensity and spray stability. It is composed of water with an organic solvent, e.g. methanol or isopropanol, and a low concentration of acetic acid or FA, usually between 0.1 and 1%, to promote ionization in positive mode. Its composition must be carefully optimized, as demonstrated for the ionization of Tf glycoforms, by varying the type and content of organic solvent (methanol and 2-propanol) and modifier (acetic and formic acids, between 0.05 and 1%) [61]. In this case, the best results were obtained with 90% MeOH and 0.5% FA, after comparing the optimized water/organic solvent ratios for each solvent. Another study dealing with the separation of Tf isoforms compared methanol, 2-propanol, and acetonitrile (ACN); both methanol and ACN led to lower signal intensities than 2-propanol at a proportion of 50% [62]. A higher content of organic solvent induced unstable ESI currents, while a lower proportion led to poor ionization. The addition of 5% of FA was necessary to ensure substantial ionization of the Tf isoforms. Acetic acid gave lower ionization and higher noise and instability, while trifluoroacetic acid (TFA) led to strong ion suppression. In another study, it was also observed that FA induced higher ionization, but noise in the MS spectra was higher than when acetic acid was used [63].
In some cases, a compromise must be found between separation and detection to select the most appropriate solvent. For the analysis of EPO isoforms, 2-propanol, methanol, and ACN were compared in combination with different ratios of water in the BGE [52]. Methanol had a similar performance to 2-propanol in terms of separation efficiency and electrospray stability, but led to intensities about 3 times lower comparing equal water/organic solvent ratios. A strong instability of the electrospray was observed with ACN.
The sheath liquid flow-rate also affects sensitivity, which improves as the flow-rate decreases. This is why the minimum flow-rate value allowing a stable spray is often selected [52,61]. The nebulizer gas flow can induce an aspirating effect, leading to a decrease in migration times [52,61]. This effect decreases when the nebulizer gas pressure is lowered, but so does the ion intensity.
In 2011, a sheathless CE-MS interface, called CESI, was commercialized. It was first evaluated for glycoform profiling by Somsen and co-workers [53]. A neutral coating and a BGE with an acidic pH were used to favor the resolution of glycoforms of IFN-β and EPO. During the analysis, a pressure of 0.5 psi was applied at the capillary inlet, inducing an overall flow-rate of 5 nl/min to ensure a stable electrospray since a near-zero EOF was obtained. Crucial interfacing parameters, such as the dry gas flow-rate, the ESI voltage, and the distance between the capillary tip and the plate, were optimized. After optimization, more than 80 isoforms including 18 glycoforms were detected for IFN-β in a single CZE-MS run, and 250 isoforms including 74 glycoforms for EPO. A high sensitivity and a wide linearity range were reached. Indeed, assuming an equal ESI-MS response for all glycoforms and taking into account the injected protein concentration and the relative peak areas calculated from extracted ion electropherograms (EIEs), IFN-β glycoform concentrations varied between 0.53 and 946 nM (about 0.01-20 µg/ml). Similarly, the relative intensities extended over more than 3 orders of magnitude from the most abundant to the least abundant glycoforms detected. This interface and a neutral coating were then evaluated for mAb characterization [59,60]. Again, a small pressure of 0.5 or 3 psi was applied at the capillary inlet to ensure a stable electrospray.
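The concentration estimate described above can be reproduced with simple arithmetic. The sketch below distributes a total molar concentration according to relative EIE peak areas (assuming, as in the text, an equal ESI-MS response for all glycoforms) and converts nM into µg/ml. The molar mass of about 22.5 kDa used for IFN-β and the peak areas are assumptions chosen for illustration; they are not values reported in [53].

```python
def glycoform_concentrations(total_nM, peak_areas):
    """Split a total molar concentration according to relative peak areas,
    assuming an equal ESI-MS response for all glycoforms."""
    total_area = sum(peak_areas.values())
    return {name: total_nM * area / total_area
            for name, area in peak_areas.items()}

def nM_to_ug_per_ml(conc_nM, molar_mass_g_per_mol):
    """Convert nM to ug/ml: nmol/L * g/mol = ng/L, and 1 ng/L = 1e-6 ug/ml."""
    return conc_nM * molar_mass_g_per_mol * 1e-6

ASSUMED_MOLAR_MASS = 22_500  # g/mol, approximate value assumed for IFN-beta

for conc_nM in (0.53, 946):
    ug_ml = nM_to_ug_per_ml(conc_nM, ASSUMED_MOLAR_MASS)
    print(f"{conc_nM} nM -> {ug_ml:.3f} ug/ml")

# Hypothetical relative peak areas (arbitrary units) for three glycoforms
areas = {"glycoform_1": 700.0, "glycoform_2": 250.0, "glycoform_3": 50.0}
print(glycoform_concentrations(1000.0, areas))
```

With the assumed molar mass, the two limits convert to roughly 0.01 and 20 µg/ml, in line with the range quoted in the text.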
2.1.5 Performances
With optimized BGE and coating, glycoforms differing in their number of sialic acids can be easily separated, as demonstrated for Tf [67], EPO [45,47,53], IFN-β [53], or hCG α-subunit (hCGα) [54]. This is illustrated in Figure 2, which shows the analysis of a drug based on recombinant human EPO. The EPO glycoforms are resolved over a 20-min time window (Figure 2A). Figure 2B shows the deconvoluted mass spectrum obtained at the apex of the peak migrating at 38.0 min, revealing a glycoform with a mass of 29,597 Da. Minor signals at +16 Da and +42 Da relative to the main mass indicate the presence of oxidation and acetylation products. A total of 74 distinct glycoforms were detected and Figure 2C1 shows the glycoform resolution obtained by combining the high efficiency of CZE with HRMS. Focusing on the 14+ charge state (Figure 2C2), glycoforms containing the same number of sialic acid residues lie along diagonal rows, a difference of one sialic acid residue causing a migration time shift of about 2 min. The separation of isoforms in CZE is mainly driven by their charge differences because they often have a similar size. However, a HexHexNAc unit also induced a shift in migration time for EPO (Figure 2C3), although this moiety should not be charged in the BGE used for the separation at pH 2.1. This shift was smaller (about 0.5 min per HexHexNAc), since a HexHexNAc unit contributes only about 1% of the total mass of the protein. Similar results were obtained for hCGα [54] and IFN-β [53,57]. Similarly, isoforms of RNase B differing by only one neutral sugar, Man, were successfully separated [56]. This was also the case for bone morphogenetic protein-2 (BMP-2) isoforms differing in their number of mannose units, but a preliminary reduction-alkylation step was mandatory [44].
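The mass relationships mentioned above can be checked with a short calculation. The sketch below uses standard average residue masses for hexose, N-acetylhexosamine, and N-acetylneuraminic acid and the usual relation m/z = (M + z·m_proton)/z; only the 29,597 Da mass and the 14+ charge state come from the text, and the script is an illustrative aid rather than the data treatment used in [53].

```python
PROTON_MASS = 1.00728  # Da

# Standard average masses of glycan residues (monosaccharide minus water), in Da
AVERAGE_RESIDUE_MASS = {"Hex": 162.1406, "HexNAc": 203.1925, "NeuAc": 291.2546}

def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def residue_mass(composition):
    """Mass added by a set of residues, e.g. {"Hex": 1, "HexNAc": 1}."""
    return sum(AVERAGE_RESIDUE_MASS[name] * n for name, n in composition.items())

main_mass = 29_597.0  # Da, glycoform mass quoted in the text
hex_hexnac = residue_mass({"Hex": 1, "HexNAc": 1})

print(f"m/z at 14+: {mz(main_mass, 14):.2f}")
print(f"HexHexNAc: {hex_hexnac:.2f} Da, "
      f"{100 * hex_hexnac / main_mass:.2f}% of the protein mass")
print(f"m/z spacing at 14+ per HexHexNAc: {hex_hexnac / 14:.2f}")
print(f"m/z spacing at 14+ per NeuAc: {AVERAGE_RESIDUE_MASS['NeuAc'] / 14:.2f}")
```

The HexHexNAc increment of about 365 Da corresponds to roughly 1.2% of the protein mass, consistent with the "about 1%" figure given in the text.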
In most of the works reported in Table 1, figures of merit such as the repeatability of migration times and peak areas, LODs, and LOQs are very rarely given. The relative standard deviations (RSDs) of migration times and peak areas given by some authors are in agreement with those usually obtained in CE-MS. As an example, the intra-day (n=8) and inter-day (n=11, 2 days) precision of migration times of 8 EPO isoforms was between 1.4 and 1.9% [52]. For peak areas, RSDs between 6.7 and 18.0% were calculated. Nevertheless, the peak area repeatability may be affected by the number of charge states considered to obtain the extracted ion electropherograms, as is also the case when calculating the LOD and LOQ values. It is worth noting that LOD and LOQ must be precisely defined when analyzing glycoforms. Indeed, they may be related to the concentration of the whole protein or of a single glycoform. When many glycoforms are present, nearly 100 for EPO for example, this dramatically affects the reported value. Although it is difficult to compare different MS and CE instruments, which can also use different capillary dimensions and injected volumes, values in the order of some hundreds of µg/ml were published for the total concentration of EPO, with a sheath-liquid interface and a quadrupole-time-of-flight (qTOF) analyzer [52].
Figure 2: Sheathless CE-MS of EPO (0.2 mg/ml) employing a neutral coated capillary. (A) BPE; (B) deconvoluted mass spectrum obtained at the apex of the peak migrating at 38.0 min (*); (C1) presentation of the electropherogram as a contour plot to show the high resolution obtained by combining CE with HRMS with zooms of (C2) the 14+ charge state of the glycoforms and (C3) the 14+ charge state of the SiA13 sialoforms. Reprinted with permission from [53].
With a CESI interface hyphenated with a TOF and a neutral coating capillary (requiring the application of a pressure of 0.5 psi at the capillary inlet to ensure a stable spray), the RSDs (n=10) of migration times and peak areas for the main glycoforms of IFN-β were 1.9% and 11%, respectively [53]. In repeated EPO analysis, they were 2.4% and 13% (n=6), respectively. The detection limit was in the picomolar-range (some tens of µg/ml). With a CESI interface hyphenated with an Orbitrap and a cationic PEI coating capillary, RSDs (n=4) of the migration times below 2% from run-to-run were obtained [58]. To compensate for variations in EOF, the free solution mobility of each isoform was determined and RSD values (n=6) less than 0.6% were observed for different capillaries over several days.
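The RSD values quoted above are simply the sample standard deviation of the replicate measurements divided by their mean. The following minimal sketch shows that calculation; the function name and the replicate migration times are hypothetical and are not taken from the cited studies.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero, RSD is undefined")
    return 100.0 * stdev(values) / abs(m)


# Hypothetical replicate migration times (min), for illustration only
migration_times = [38.0, 38.4, 37.6, 38.9, 37.9, 38.2]
print(f"RSD of migration times: {rsd_percent(migration_times):.1f}%")
```

The same function applies to peak areas or to effective mobilities, which is one way to see how much of the run-to-run variation is removed once EOF fluctuations are compensated.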
Linearity is also a key parameter, especially when glycoforms may be present in a wide dynamic range for a given glycoprotein, some of them being very abundant and others being minor. A linearity of one order of magnitude was observed for EPO glycoforms, which is quite low, but no higher concentration of EPO was available in this case [52].
2.1.6 Applications
For biopharmaceutical applications, CZE-(TOF)MS analyses followed by multivariate statistics were used to compare, for example, EPO preparations [51]. The relative peak areas of the selected intact EPO isoforms were used as variables in principal component analysis (PCA) and hierarchical agglomerative clustering. Both strategies allowed EPO formulations to be differentiated according to manufacturer, production cell line, and batch number. The results of the PCA on the different EPO preparations are presented in Figure 3. Small changes in antennarity, sialylation, and acetylation of isoforms were observed. This kind of strategy can be useful for quality control, optimization of the production, or comparison of a biosimilar with its innovator.
CZE-MS can also be used to accurately determine a precise mass after deconvolution of the charge distribution. However, elucidation of the composition of the intact glycoforms is impossible because different glycan combinations have the same mass, even with a very high resolution MS analyzer, such as Fourier transform-based MS [54]. This is why complementary approaches must also be implemented to characterize the glycoforms at the glycopeptide or glycan levels [38,39,48,54,56,63].
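As a reminder of how a mass is recovered from the charge distribution, the sketch below applies the classical two-peak calculation to adjacent charge states of the same species. It is only an illustration of the principle, not the deconvolution algorithm used in the cited works, and the function names are hypothetical. The 29,597 Da value is the EPO glycoform mass quoted for Figure 2B.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(mass, z):
    """m/z of the [M + zH]z+ ion of a species of neutral mass `mass`."""
    return (mass + z * PROTON) / z


def deconvolute_pair(mz_high, mz_low):
    """Charge and neutral mass from two adjacent charge-state peaks.

    mz_high carries charge z and mz_low carries charge z + 1.
    """
    z = round((mz_low - PROTON) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON)
    return z, mass


# Simulate the 14+ and 15+ ions of a 29,597 Da glycoform, then invert them
peak_14 = mz(29597.0, 14)
peak_15 = mz(29597.0, 15)
z, mass = deconvolute_pair(peak_14, peak_15)
print(z, round(mass, 1))
```

Real deconvolution software combines all charge states and accounts for peak width and noise, but the mass obtained is still a single number, which is why glycan combinations of identical mass cannot be distinguished at this stage.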
Figure 3: Score plot for principal component 1 (PC 1) and principal component 2 (PC 2) of the principal component analysis on the different EPO preparations analyzed by CZE-MS.
Reprinted with permission from [51].
Nevertheless, in some cases, the combination of MS spectra and CZE migration times allows the identification of major isoforms differing by only 1 Da (isoforms with one less sialic acid and two more fucose), as observed when analyzing AGP purified from human serum [39].
Figure 4 shows the electropherogram obtained with many peaks over 3 min, and the spectra and deconvoluted spectra of peaks 5 and 6. Differences in isoform composition can easily be observed. Indeed, AGP isoforms differing in one sialic acid have a molecular mass difference of 291 Da and are observed in two consecutive time windows: isoforms with a higher sialic acid content appear at lower migration times (Figure 4B). Within the same time window, observed masses correspond to the addition of HexHexNAc units (365 Da) or fucosylation (146 Da) (Figure 4C). A proof-of-concept study also demonstrated that the analysis of AGP by CZE-MS combined with statistical techniques allowed the identification of potential biomarkers of bladder cancer [40]. CZE-UV did not provide statistically different variables to differentiate healthy patients from cancer patients, but CZE-ESI-TOF proved to be a very appropriate technique to achieve this objective when the CZE migration times are taken into account together with the MS data.
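The mass increments discussed above can be checked with a few lines of arithmetic. The sketch below uses standard average residue masses of the monosaccharides (values not given in the text, rounded to four decimals) and a hypothetical helper, `mass_shift`, to show why exchanging one sialic acid for two fucoses changes the mass by only about 1 Da.

```python
# Average residue masses (Da) of common monosaccharides as bound in a glycan
RESIDUE_MASS = {
    "Hex": 162.1406,
    "HexNAc": 203.1925,
    "Fuc": 146.1412,
    "NeuAc": 291.2546,  # N-acetylneuraminic (sialic) acid
}


def mass_shift(delta):
    """Mass change (Da) for a change in composition, e.g. {"Fuc": 2}."""
    return sum(RESIDUE_MASS[unit] * count for unit, count in delta.items())


print(round(mass_shift({"NeuAc": 1})))                  # one sialic acid
print(round(mass_shift({"Hex": 1, "HexNAc": 1})))       # one HexHexNAc unit
print(round(mass_shift({"Fuc": 1})))                    # one fucose
print(round(mass_shift({"NeuAc": -1, "Fuc": 2}), 2))    # near-isobaric pair
```

Because such a pair differs in sialic acid content, and hence in charge, the two isoforms fall in different migration time windows, which is what allows CZE to separate species that MS alone can barely distinguish.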
Figure 4: Analysis by sheath liquid CE-MS of intact AGP (8 mg/ml) purified from human serum. (A) Base peak electropherogram. (B) spectra of peaks 5 and 6. (C) Deconvoluted spectra of peaks 5 and 6. ORM1 and ORM2 are the two main AGP isoforms, differing in 21 amino acids. The ORM1 isoform presents 3 different variants, ORM1F1, ORM1F2, and ORM1S, while the ORM2 isoform presents the ORM2A variant. Reprinted with permission from [39].
Recently, CZE-MS with a triple-quadrupole (TQ) analyzer was used for the analysis of intact hCG and demonstrated its high potential for a fingerprinting approach, as it differentiated the isoform composition of 2 hCG-based drugs [55]. For identification, a high mass resolution is mandatory and, for isotopic resolution, the resolving power value must be higher than the molecular weight of the protein [50]. For example, proteins with a MW of 30 kDa require an MS with a resolving power of about 40,000 or more to be isotopically resolved. In this paper, the analysis illustrated that TOF MS is adequate for the isotopic resolution of intact proteins up to about 20 kDa (benchtop instrument) or 30 kDa (high resolution instrument).
Other high resolution MS analyzers can be used, such as Orbitrap or Fourier transform ion cyclotron resonance (FT-ICR). However, the resolution of these mass spectrometers increases with the acquisition time, so that high resolution comes at the cost of a lower scan rate, which can be a serious issue when coupling to fast and high performance separation techniques like CE.
MS also allows in-source fragmentation to be performed. This was first carried out for the analysis of an intact glycoprotein by CZE-MS to obtain carbohydrate-specific reporter ions and to compare the internal ratio of Hex+/HexNAc+ of rhBMP-2 isoforms [44].
With regard to quantification, unlike glycopeptides, for which it is not possible to assume that they have similar ESI ionization yields, this is possible for different intact glycoforms of a protein. Indeed, they often have the same maximum of the charge distribution after ESI ionization. In this case, the determination of the peak areas from the EIEs allows the comparison of their relative abundances, as it was already demonstrated by several authors [43,54,57]. The repeatability was studied using peak areas normalized by that of the most intense glycoform obtained during the analysis by CZE-ESI-(FT)MS of hCGα for which 60 glycoforms were detected [54]. The average RSD (n=3) of the peak areas was less than 10%.
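The normalization described above can be sketched as follows, under the assumption stated in the text that the glycoforms of a given protein have similar ionization yields. The function names and the peak areas are hypothetical and serve only to illustrate the two usual ways of expressing relative abundances.

```python
def normalize_to_most_intense(areas):
    """Express each peak area relative to the most intense glycoform (= 1)."""
    top = max(areas.values())
    return {name: area / top for name, area in areas.items()}


def relative_abundance_percent(areas):
    """Express each peak area as a percentage of the summed signal."""
    total = sum(areas.values())
    return {name: 100.0 * area / total for name, area in areas.items()}


# Hypothetical EIE peak areas (arbitrary units), for illustration only
eie_areas = {"glycoform_1": 4.0e5, "glycoform_2": 2.0e5, "glycoform_3": 5.0e4}

print(normalize_to_most_intense(eie_areas))
print(relative_abundance_percent(eie_areas))
```

Normalizing to the most intense glycoform removes the run-to-run variation of the absolute signal, so the repeatability of the normalized areas is usually better than that of the raw areas.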
Karger and coll. used a different method to do the relative quantitation of IFN-β isoforms [58]. They summed the EIE integrated signals of the 6 charge states with the highest ion intensities and divided it by the sum of the total signal for all proteoforms. The signal extraction window was centered (+65%/-35%) around the theoretical average masses for the given charge state within mass tolerance windows of 1.4, 1.3, 1.25, 1.15, 1.0 Da (z = +11 to +16, respectively). Considering a total relative abundance higher than 1.2% of the total peak intensity, a total of 55 isoforms were quantitated with less than 20% RSD (n=6). They used these quantitative data to try to correlate some protein modifications. Nevertheless, to take into account the potential differences in ionization efficiency of each glycoform, it is preferable to compare the results on a relative basis, such as from batch to batch or during a
As a perspective, some studies have recently described the implementation of separation in a microchip channel instead of a fused-silica capillary [42,70]. Ramsey and coll. carried out the separation of intact mAb variants (Infliximab and two other mAbs) in a glass etched channel of a chip (23-cm long, 10-µm depth, and 70-µm full width) having an integrated ESI emitter [70]. To avoid adsorption of the analyte and control EOF, a surface coating using chemical vapor deposition of an aminopropylsilane-based layer and a covalent modification of the resulting surface with polyethylene glycol were performed. This surface coating led to highly efficient separations, as illustrated in Figure 5, for Infliximab with a BGE composed of 10%
2-propanol with 0.2% acetic acid (pH 3.2), which is fully MS compatible, unlike the BGE traditionally used for mAb charge variants based on triethylenetetramine or ε-aminocaproic acid. As shown in Figure 5, at least 4 mAb glycoforms can be identified per lysine variant, resulting in at least 12 mAb isoforms identified. Therefore, this miniaturized approach allows the determination of the mass of intact mAb variants and the assessment of charge heterogeneity and other glycoform information with a total analysis time of less than 4 min and a stable channel surface that does not require regular regeneration of the coating. This technique was then used to characterize other mAb and intact antibody drug conjugate variants [42].
Figure 5: (A) Schematic for CE-ESI-MS chip device with a 23 cm separation channel with an enlarged image of the asymmetric turn tapering. Red channels indicate an aminopropylsilane (APS) coating while black channels indicate an APS-polyethylene glycol coating. S, sample reservoir; B, BGE reservoir; SW, sample waste reservoir; EO, electroosmotic pump. (B) Analysis of intact Infliximab using the CE-ESI-MS chip device. (a) Identified lysine variant bands are labeled as 2-K, 1-K, and 0-K, which correspond to 2-Lysine, 1-Lysine, and 0-Lysine variants, respectively. (b) Deconvoluted mass spectra for each lysine variant with glycosylation structures above the mass of each peak. Reprinted with permission from [70].
2.2 Capillary isoelectric focusing
The on-line hyphenation of CIEF with MS is not straightforward due to the non-volatile ampholytes, acids, and bases needed to get separations with high resolution. Several strategies can overcome this problem, such as the use of low ampholyte concentration, an interim separation by chromatography, the use of a dialysis interface to remove the interfering substances, or the partial ampholyte filling technique, as reviewed by Hühner et al. [71]. As an example, CIEF was applied to the analysis of bovine serum apotransferrin glycoforms to separate the di-, tri-, and tetrasialotransferrins [72].
Very recently, CIEF has been directly coupled to MS with a stainless-steel flow-through microvial ESI interface [73]. A very small amount of an intact mAb (30 ng of Infliximab) was analyzed and 15 intact molecular weight values originating from glycosylation heterogeneity and charge variation were observed, while 13 were identified thanks to the use of an Orbitrap analyzer. A recent paper dealing with the coupling of imaged CIEF with MS using a nanoliter valve for the analysis of intact mAb variants demonstrated once again the high potential of CIEF-MS for the analysis of protein glycoforms [74].
It is also worth noting that Shimura reviewed the advances of CIEF implemented in microchips for the separation of proteins at the intact level [75]. However, separation resolution and automation still need to be improved before using this technique for the analysis of protein glycosylation.
3. Liquid chromatography
LC is also a powerful method for the separation of intact proteins due to its high resolving power, good reproducibility, and compatibility with MS [76]. In recent years, significant developments have occurred to improve its potential for protein analysis, such as the development of bio-inert systems to prevent the risk of adsorption. New stationary phases adapted for the analysis of intact proteins, such as sub-2-µm particles and superficially porous particles [77] with large pore sizes (> 200 Å), also appeared, leading to higher efficiencies in shorter analysis times. The performance of these new stationary phases packed in columns with conventional dimensions (2.1 x 150 mm), especially for the separation of therapeutic mAbs, was recently reviewed [78,79]. Columns with nano (25 µm < i.d. < 100 µm) and micro (100 µm < i.d. < 1 mm) formats are also interesting because they lead to high sensitivity, while requiring less sample and mobile phase, and can be directly hyphenated with nano-ESI MS [80].
A wide variety of separation modes can be used for protein analysis, including hydrophilic interaction liquid chromatography (HILIC), reversed-phase liquid chromatography (RPLC), hydrophobic interaction chromatography (HIC), ion-exchange chromatography, and size exclusion chromatography [76,81]. The consideration of the MS compatibility criterion explains why RPLC and HILIC have been the most widely used in this field.
3.1 Reversed-phase liquid chromatography
RPLC is widely used to separate glycopeptides obtained after enzymatic digestion of the protein [82,83], but its application to the analysis of intact proteins is more complex due to their adsorption, pore exclusion, and low diffusion coefficient [84]. However, some strategies were proposed to overcome these problems by using (i) silica-based stationary phases with restricted access to residual silanols (e.g. end-capped, hybrid silica, high density bonding, or embedded polar group stationary phase), (ii) elevated temperature, (iii) particles with large pore sizes (> 200 Å), and (iv) addition of TFA or FA in the mobile phase to reduce adsorption, as already mentioned in reviews devoted to biopharmaceutical characterization [33,85].
Focusing on studies on the characterization of protein glycosylation by RPLC, a review of the literature is presented in Table 2. It deals mainly with studies on mAbs, but also on EPO, Tf or hCG. The most commonly used stationary phase is a C8-bonded silica. It is interesting to note that a Tf-exendin fusion protein was used to study the role of sugars in interactions between the protein backbone and the hydrophobic ligand on the LC stationary phase [102].
Indeed, this protein has the N-linked sites removed, leading to a less complex glycoprotein containing only linear Man residues at the Ser and Thr residues. A C18-bonded silica was selected and it was observed that retention time increases as the number of Man decreases.
A gradient composed of a mixture of ACN and water is generally applied, sometimes with some n-propanol, which can help to mask residual silanol groups. A small content (0.02-0.1%) of an acid is always added in the mobile phase either to favor glycoform separation (TFA) or to enhance the MS signal (FA). Since FA does not improve selectivity and TFA reduces the MS signal, they are sometimes both introduced in the mobile phase, but keeping TFA content as low as possible (0.02%). In Table 2, it also appears that a high column temperature is always chosen, between 50 and 80°C. Indeed, an increase in temperature has a significant positive impact on glycoform separation, as observed for example with a recombinant hCG-based drug analysis giving more than 10 chromatographic peaks at 65°C (Figure 6) instead of 6 at 35°C [88]. It is worth noting that the large peak between 33 and 35 min
corresponds to the co-elution of hCG isoforms, that are very numerous because hCG has 8 glycosylation sites plus other potential PTMs.
Figure 6: TIC chromatogram obtained by RPLC-MS analysis of a recombinant hCG-based drug. Reprinted with permission from [88].
One of the first papers dealing with the RPLC-MS analysis of an intact glycoprotein also concerns hCG [87]. It combined the results of glycan identification after tryptic digestion with molecular mass measurements of intact hCG after its separation on a monolithic capillary column and detection in a quadrupole IT analyzer tuned to detect low charge states of glycoforms. The annotation of glycoforms observed in deconvoluted mass spectra was accomplished by an algorithm comparing the measured molecular masses with a database derived from the known glycan composition and possible modifications of the protein backbone. This method allowed the annotation of 7 glycoforms of hCGα in Pregnyl, an hCG-based drug, with a minimal sample preparation (only ultra-filtration through a membrane with a cutoff of 5000) by LC analysis in less than 15 min.
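The principle of such a mass-matching annotation can be sketched as follows. This is not the algorithm of [87]: it is a simplified illustration in which a single glycan is attached to the backbone, and the backbone mass, the glycan list, the modification list, the tolerance, and the function names are all hypothetical. Only the average monosaccharide residue masses are standard values.

```python
# Average residue masses (Da) of monosaccharides as bound in a glycan
RESIDUE_MASS = {"Hex": 162.1406, "HexNAc": 203.1925,
                "Fuc": 146.1412, "NeuAc": 291.2546}


def glycan_mass(composition):
    """Mass added to the protein by a glycan of the given composition."""
    return sum(RESIDUE_MASS[unit] * n for unit, n in composition.items())


def build_database(backbone_mass, glycans, modifications):
    """Theoretical masses for every glycan / backbone modification pair."""
    return [(backbone_mass + glycan_mass(comp) + shift, gname, mname)
            for gname, comp in glycans.items()
            for mname, shift in modifications.items()]


def annotate(measured_mass, database, tol_da=2.0):
    """Candidates within the tolerance, closest theoretical mass first."""
    hits = [(abs(measured_mass - mass), gname, mname)
            for mass, gname, mname in database
            if abs(measured_mass - mass) <= tol_da]
    return [(gname, mname) for _, gname, mname in sorted(hits)]


# Hypothetical inputs, for illustration only
glycans = {"H5N4": {"Hex": 5, "HexNAc": 4},
           "H5N4S1": {"Hex": 5, "HexNAc": 4, "NeuAc": 1}}
modifications = {"none": 0.0, "oxidation": 15.9994}
database = build_database(10000.0, glycans, modifications)
print(annotate(11623.9, database))
```

A measured mass may return several candidates when different combinations fall within the tolerance, which is why the glycan compositions established at the glycopeptide or glycan level are needed to restrict the database.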
Of course, this approach was also quickly evaluated for the analysis of mAbs, which are proteins with significantly higher molecular masses [90-92]. Although their wide isotopic distribution makes it difficult to resolve their glycoforms completely, it provides an overall view of their heterogeneity. This allows a fast comparison of mAb batches, as illustrated in Figure 7 (top). Today, the use of UHPLC hyphenated with a qTOF analyzer allows a better accuracy and precision for identification with a duration of less than 10 min per analysis [94].
Figure 7: (Top) Deconvoluted spectra of two different lots (a and b) of the same recombinant mAb analyzed by RPLC-MS at the intact level. The peak labels refer to the various glycoforms identified based on the deconvoluted mass. Reprinted with permission from [92].
(Bottom) Analysis by nanoRPLC-chip-MS of two different mAbs, Trastuzumab and Bevacizumab. (A) TIC with elution of Trastuzumab highlighted in shaded grey. (B) Multiple charged ion distribution of Trastuzumab. Reconstructed mass spectrum with annotation of the glycoforms of (C) Trastuzumab and (D) Bevacizumab. Reprinted with permission from [95].
The hyphenation of LC with an analyzer with an even higher mass resolution such as an Orbitrap is also of interest. This was used for the characterization of mAb variants in fermentation broth, as in the dilute-and-shoot method developed by Huber and coll. [96]. With a short RPLC-(qOrbitrap)MS analysis achieved in 15 min, N-glycosylation and truncation
variants of the expressed mAb were identified at the intact protein level. The robustness and repeatability of the method were assessed with regard to retention time, peak area, and mass resolution. The precision of the molecular mass determination, expressed as a 95% confidence interval, ranged from ± 0.16 to ± 2.89 Da for 6 replicates. After 965 injections of the mAb reference material or fermentation samples, retention times and peak widths at half-height differed by only 4.8 and 0.06 s, respectively. A relative quantification of mAb glycoforms was then performed with extracted ion current chromatograms based on the theoretical mass of the respective species. The average fractional abundance of each glycoform was calculated from the measured peak areas. However, the absence of a validated mAb standard hinders the direct evaluation of the trueness of glycoform quantification by MS. In addition, the ability of LC-(Orbitrap)MS analysis of an intact protein to characterize its glycosylation may be limited when some glycoforms are in low abundance and/or when many glycosylation sites and other PTMs are present. A recent study investigated the effect of various parameters of an Orbitrap hyphenated with LC, including source temperature, microscan type and quantity, mass resolution, and automatic gain control on spectral quality [97]. Several human proteins of diagnostic or of therapeutic interest were selected as model proteins (C-reactive protein, vitamin D-binding protein, Tf, and IgG). For example, the optimized source temperature was found to be specific to each protein. A methodological framework was established to ensure a robust glycoform identification, before their quantification.
One of the current LC trends is miniaturization. This is also the case for glycosylation characterization. Harazono et al. compared the glycoform profiles of the innovator and a biosimilar of EPO by nanoRPLC-MS [86]. Due to the high micro-heterogeneity of glycans, the chromatograms showed a broad, unresolved cluster with about ten apexes, and the mass spectrum was obtained either over the entire elution period (18.7-20 min) or over short elution periods (18.7-18.9 min, 18.9-19 min, …) before deconvolution. The first approach is fast and led to the detection of major glycoforms, while the second takes more time but shows minor glycoforms. The first approach was used to compare EPO innovators and biosimilars, as the RSDs of the relative peak intensities, assumed to correspond to those measured on MS spectra, were close to or less than 10% for major-to-moderate glycoforms (n=4). RPLC-MS allowed rapid glycoform profiling and assessment of the similarity of glycoprotein-based pharmaceuticals. The second approach demonstrated that glycoforms containing fewer sialic acids eluted much earlier, and those containing larger glycans eluted slightly earlier.
Some authors also proposed to carry out separations on chips [95,100,101]. However, a
Therefore, the use of a trapping column may be a good alternative to increase the injected sample volumes and to concentrate isoforms before their separation, although care must be taken to avoid the potential loss of the most hydrophilic glycoforms during trapping on the reduced size precolumn [95]. This method was used to compare the glycosylation profiles of different monoclonal IgG-based biopharmaceuticals at the intact level, as illustrated for Trastuzumab and Bevacizumab (Figure 7 bottom). Its robustness was assessed with RSD values of 2.2% and 3.7% in terms of relative abundance for intra- and inter-assay variability, respectively. It also allowed the evaluation of batch-to-batch variations in the glycoform heterogeneity. The same group also applied this method for the glycoprofiling of Tf in plasma samples of controls, patients with known defects, and patients with secondary or unsolved cause of abnormal glycosylation [100]. After an immunoextraction step, the analysis led to distinct glycosylation profiles, which facilitated the identification of the specific congenital disorders of glycosylation subtype.
With regard to diagnosis, another example is the RPLC-MS analysis of 6 multifucosylated glycoforms of the basic proline-rich protein IB-8a CON1+ (so called according to its interaction with concanavalin A) in human saliva [89]. The sample pretreatment consisted of a simple dilution (1/1) with an acidic solution (0.2% TFA) and a centrifugation step for 10 min.
A similar study was performed by the same group to characterize the glycoforms of a basic salivary proline-rich protein 3M [99].
3.2 Hydrophilic interaction liquid chromatography
There is an ongoing effort to evaluate the potential of HILIC for the characterization of intact glycoproteins [103-107]. Indeed, HILIC is already routinely used for the analysis of released glycans and glycopeptides [108]. With this mode, retention is mainly based on partitioning of the analytes between a water-enriched layer formed at the surface of a polar stationary phase and the hydro-organic mobile phase, and on interactions with the stationary phase (hydrogen bonds, ionic and dipole-dipole interactions) [103,109]. The high organic solvent content of the mobile phase improves its volatility, which favors the hyphenation with MS, leading to an improvement in sensitivity for a large variety of compounds [110]. Periat et al. were the first to demonstrate the potential of HILIC as an orthogonal approach to reversed-phase LC for the characterization of mAb glycoforms at the protein level [111]. For the characterization of protein glycosylation by HILIC analysis hyphenated to MS, an overview of the literature is presented in Table 3. As far as we know,
only a few papers have been published, dealing with RNase B [114], neo-glycoprotein such as Ag85B and TB10.4 [112], or IFN-β and EPO [113].
Several HILIC columns were tested to separate the intact isoforms of these glycoproteins, but amide-based silica seems to be the gold standard (see Table 3). Most amide-based stationary phases have pore sizes between 80 and 150 Å, which can exclude the largest glycoforms from the pores. To the best of our knowledge, only the AdvanceBio Glycan Mapping® (Agilent Technologies) and Glycoprotein BEH Amide® (Waters) columns have a pore size of 300 Å.
Zhang et al. reported the use of an innovative, laboratory-made stationary phase based on non-porous silica particles bonded with linear polyacrylamide chains for the separation of 5 intact high-mannose N-glycoforms of RNase B [114]. The brush layer was designed to be closely spaced to sterically exclude proteins, thus avoiding protein-silanol interactions, while glycans can penetrate it. With regard to the retention mechanism, the protein backbone would be adsorbed just on the surface of the polyacrylamide and the glycan moieties would contribute as a function of their partitioning in the hydrophilic layer. RNase B was selected as a model protein, with N-glycans varying from 5 to 9 Man groups. Its 5 glycoforms were baseline resolved and even more peaks, corresponding to glycoform isomers, were resolved using 700 nm particles compared to those of 1.4 µm, but only when HILIC was coupled with UV, which allows the use of a high percentage of TFA (0.1%) in the mobile phase. The performances were better than those obtained with two commercial HILIC columns (BEH amide column, 2.1 x 30 mm, 1.7 µm, and a TSKgel amide-80 column, 2 x 50 mm, 3 µm), with a separation temperature of only 30°C compared to 60°C for the other two. In HILIC-MS, the amount of TFA was reduced to 0.02% until acceptable MS spectra were obtained and 0.5% of FA was included. In this case, the resolution was divided by a factor of 2.
Typically, the mobile phase consists of a high percentage of ACN plus water acidified with TFA, acting as an ion-pairing agent and generating some solvating effect that both impact selectivity, retention, resolution, and peak shape. Indeed, Tengattini et al. demonstrated its large influence on the separation resolution of intact glycoforms of RNase B [112]. Different TFA contents were tested and compared to the use of other acidic additives such as FA or perchloric acid. From these experiments, it appears that at least 0.05% of TFA must be used to achieve a satisfactory separation of glycoforms.
TFA is a corrosive agent that can cause irreversible damage to the MS detector and also leads to a high ion suppression effect. Zhang et al. reduced its concentration by using 0.05% instead of 0.1% (used in UV conditions) and adding 0.5% of FA. Nevertheless, a dramatic loss of
study, Dominguez-Vega et al. reported about the interest of in-source collision-induced dissociation, which minimizes the protein-signal suppression induced by TFA [113].
Therefore, TFA in mobile phase is essential in HILIC to obtain an efficient separation of intact isoforms, but it results in a dramatic decrease in signal, highlighting the current limitations of HILIC-MS for the analysis of protein glycosylation.
Another main challenge in HILIC for the analysis of intact glycoproteins is to determine injection conditions that will not lead to significant deformation of chromatographic peaks.
Indeed, the dissolution of glycoproteins in high ACN content (e.g. 70-90%) close to the mobile phase composition cannot be achieved because they precipitate [35]. Nevertheless, some studies reported the use of relatively high percentages (i.e. 50-90%) of organic solvent [112,114], but there is a risk of having only a partial dissolution of the protein isoforms.
Therefore, another approach consists in dissolving the protein in water and injecting a low volume to prevent peak shape deformation induced by the high percentage of water [113]. For an injection volume of 1 μl, the sample-to-column volume ratio is about 0.2% for a standard LC column (150 x 2.1 mm, approx. 415 µl with a porosity value of 0.8), which is acceptable for the amount of water introduced on the HILIC column.
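The column volume and ratio quoted above follow from the geometry of the column. The short sketch below reproduces this arithmetic; the function name is hypothetical, and the porosity of 0.8 is the value given in the text.

```python
import math


def column_void_volume_ul(length_mm, id_mm, porosity):
    """Mobile phase volume of a packed column in µl (1 mm^3 = 1 µl)."""
    geometric_volume = math.pi * (id_mm / 2) ** 2 * length_mm
    return geometric_volume * porosity


void_volume = column_void_volume_ul(length_mm=150, id_mm=2.1, porosity=0.8)
injection_ul = 1.0
ratio_percent = 100.0 * injection_ul / void_volume

print(f"Column void volume: {void_volume:.0f} µl")
print(f"Sample-to-column volume ratio: {ratio_percent:.2f}%")
```

The same function shows how quickly the ratio grows on smaller formats: for a 1 mm i.d. column of the same length, a 1 µl injection already represents about 1% of the column volume.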
With regard to column temperature, Periat et al. demonstrated that temperature does not have as much effect in HILIC as in RPLC [111]. In Table 3, it appears that values between 30 and 60°C are commonly used for the separation of glycoforms.
For applications, HILIC coupled to MS showed its potential for the characterization of intact neo-glycoproteins obtained after a glycosylation procedure from ribonuclease A, and 2 antigenic proteins, namely TB10.4 and Ag85B [112]. Additional MS/MS experiments confirmed the localization of the N-glycosylation site. In this case, selectivity depended mainly on the size and number of glycans attached to the protein, but it should be noted that these semi-synthetic glycoproteins contain glycans composed only of Man, which highly reduces the number of isoforms and thus the complexity of HILIC separation.
A HILIC-MS method was also applied to the profiling of IFN-β and EPO [113]. A chromatogram of IFN-β is presented in Figure 8A. Even if an incomplete separation was obtained, various glycoforms were detected, differing by the number of Fuc, hexose, and sialic acid. Figure 8B shows that the neutral glycan units contribute significantly to glycoform separation, whereas terminal sialic acids only slightly affect the HILIC retention. Other PTMs such as oxidation, succinimide intermediates, and loss of N-terminal methionine were also observed (Figure 8C). The same analytical method was applied to the analysis of EPO and
allowed the elucidation of the composition of 12 and 51 glycoforms of IFN-β and EPO, respectively.
Figure 8: HILIC-MS of IFN-β. (A) Base-peak chromatogram; (B) Extracted-ion chromatograms of the detected glycoforms with tentative glycan assignment; (C) Extracted-ion chromatograms of the most abundant glycoforms. Reprinted with permission from [113].
3.3 Other LC modes
Ion-exchange chromatography (IEX) provides a separation mode orthogonal to RPLC and is very useful for the characterization of charge variants, and thus of the glycoforms. Nevertheless, it requires mobile phases with high concentrations of non-volatile salts that are not easily compatible with MS. However, a strategy based on on-line two-dimensional LC-MS can be used, coupling IEX with RPLC in order to separate glycoforms according to their charge in the first dimension and desalt the medium with a small RPLC trapping column in the second. This approach was used, for example, to characterize mAb glycoforms [115]. Up to 6 fractions of interest from the ion-exchange separation were analyzed by trapping and desalting the fractions onto a series of reversed-phase trap cartridges with subsequent on-line analysis by MS. The
sialylation. It is important to note that if, in this study, the sole objective of the RPLC sorbent was a rapid desalting of fractions prior to their MS analysis, a column could be incorporated for additional separation prior to MS detection. The optimization of MS-compatible mobile phases for IEX separation of mAbs was recently investigated and seems to be very promising in the future for glycoform separation [116].
HIC is a popular non-denaturing separation mode to analyze proteins, but most often it uses mobile phases that are not compatible with MS. Recently, Ge and coll. reported the first on-line coupling of HIC to MS using more hydrophobic columns and ammonium acetate with an organic solvent as mobile phase for the analysis of proteins [117]. This new methodology was then investigated with mAbs and allowed the determination of relative hydrophobicity with HIC, and of intact masses and glycosylation profiles with MS [118]. However, the addition of organic solvent to the mobile phase can denature the protein and therefore one of the key advantages of the HIC mode could be lost. The growing interest in this approach should expand the toolbox for characterizing glycoproteins.
Similarly, it is generally considered that size exclusion chromatography (SEC) cannot provide any information on glycosylation, as it has the lowest efficiency and resolution of all chromatographic modes. Nevertheless, SEC using a mobile phase based on ACN with 0.1%
FA and 0.1% TFA and hyphenated with a qTOF MS analyzer has recently been successfully applied to the analysis of intact Trastuzumab and a detailed glycosylation pattern was obtained [119]. This can allow the analysis of proteins in their native state, which is particularly important to study molecules maintained by non-covalent interactions.
Affinity chromatography can also be an effective technique to separate glycoforms, based on their reversible adsorption on a stationary phase with specific ligands immobilized on its surface. Affinity chromatography involving a protein A-based sorbent is useful to recover antibodies from complex solutions such as culture supernatants. This technique was thus hyphenated on-line with RPLC-MS via a desalting pre-column to evaluate the glycan heterogeneity of different mAbs diluted in water, serum-free medium, and Chinese hamster ovary cell culture supernatant [120]. All three types of samples gave similar glycoform patterns, suggesting that this system avoids interferences from media components or host cell proteins. The analysis of Rituximab samples diluted with serum-free medium was repeated 6 times per day for 3 days. The heights of the 4 major peaks in the deconvoluted MS spectra were measured and the RSDs (3 days) of each relative peak height against total peak height ranged from 1.0 to 5.3%. Two other mAbs (Trastuzumab and Palivizumab) were also