
Proteomic Tools

1.6 What is proteomics?

1.6.5 Ultimate proteomic tool: mass spectrometry

One reason for the rapid expansion of proteomics is the improved efficiency of existing techniques and the development of new technologies, including nanoscale liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) and matrix-assisted laser desorption/ionisation (MALDI) (18). These techniques are based on mass spectrometry (MS) (immuno-MS, MS profiling), on protein arrays (antibody arrays, reverse-phase arrays), on RNA assays (oligonucleotide microarrays, gene chips) or on DNA assays (chromatin immunoprecipitation, high-density DNA microarrays) (27, 28). Proteomics is now the most widely used approach to evaluate differential expression in tissues, biofluids and enzymatic pathways, as well as for disease and toxicological screening (19).

Figure 9 shows a typical workflow in a proteomic study. A protein population is first extracted from a biological sample. It is then separated by one- or multi-dimensional gel electrophoresis and digested (trypsin, endoproteinase) prior to identification by mass spectrometry. An alternative approach consists of digesting the proteins first and separating the resulting peptides by high-performance liquid chromatography (HPLC) before identification. Peptides are then ionized by electrospray ionization (ESI) or matrix-assisted laser desorption/ionization (MALDI) and analysed by various mass spectrometers, including quadrupole or time-of-flight (TOF) instruments. Finally, the peptide sequencing data obtained are searched against protein databases using database-search software (Sequest, Mascot, etc.). Examples of the reagents or techniques used at each step of this workflow are given beneath each arrow.

2D, two-dimensional; FTICR, Fourier-transform ion cyclotron resonance; HPLC, high-performance liquid chromatography

Figure 9: Typical workflow in proteomic studies, from Steen et al. (20).
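The digestion step of this workflow can be illustrated in silico. The sketch below assumes the commonly used simplified trypsin rule (cleavage after lysine or arginine, except when the next residue is proline); the function name `tryptic_digest`, the `missed_cleavages` parameter and the example sequence are hypothetical and only serve the illustration.

```python
import re


def tryptic_digest(sequence, missed_cleavages=0):
    """Return the peptides produced by a simplified trypsin rule.

    Assumption: trypsin cleaves C-terminal to K or R unless the next
    residue is P. Real digestions are less clean than this rule.
    """
    # Zero-width split points (Python 3.7+): after K/R, not before P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Hypothetical sequence, not taken from any real protein.
print(tryptic_digest("MKTAYRPEGKLLR"))
```

For this toy sequence the call returns `['MK', 'TAYRPEGK', 'LLR']`: the arginine followed by proline is not cleaved. Allowing one missed cleavage adds the joined peptides `MKTAYRPEGK` and `TAYRPEGKLLR`.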

The use of tandem mass spectrometry (MS/MS) for protein analysis has been developed over the last twenty years. Before this tool was applied to proteins, the method of choice for determining amino acid sequences was Edman degradation. Its principle is the sequential removal of one amino acid at a time from the N-terminus of the protein, followed by identification of the cleaved amino acid. In 1980, Shimonishi et al. proposed combining mass spectrometry with Edman degradation by measuring the mass of the peptides after each amino acid removal (21). However, extending this approach to biological samples faced a fundamental problem: transferring highly polar, completely non-volatile molecules with a mass of several kDa into the gas phase without damaging them (20). After several improvements, peptide identification by mass spectrometry changed radically around 1990 with the development of two ionization methods for large molecules: electrospray ionisation (ESI) by Fenn et al. and matrix-assisted laser desorption/ionisation (MALDI) by Karas et al. (32, 33). The rapid increase in the number of available protein sequence databases also contributed to this expansion. The discovery of these two methods was the starting point of the proteomic approach called peptide mass fingerprinting (PMF), which measures the masses of the peptides in a digest and correlates them with theoretical masses computed from protein sequence databases (34, 35). Several software tools have been proposed to identify proteins from PMF data, such as Mascot, ProFound or Aldente (22). Finally, another identification approach, used with ESI, is based on correlating MS/MS spectra with the theoretical peptide fragment masses derived from databases (23).
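The principle of peptide mass fingerprinting can be sketched in a few lines. The example below is a deliberately naive illustration and not the scoring used by Mascot, ProFound or Aldente: it simply counts how many observed masses fall within a tolerance of a theoretical peptide mass. The toy database, the protein names, the tolerance `tol_da` and the function names are assumptions made for the example; the residue masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def pmf_score(observed_masses, theoretical_peptides, tol_da=0.2):
    """Count observed masses that match any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in theoretical_peptides]
    return sum(
        any(abs(obs - theo) <= tol_da for theo in theoretical)
        for obs in observed_masses
    )


def rank_proteins(observed_masses, database, tol_da=0.2):
    """Rank database entries by their number of matching masses."""
    scores = {
        name: pmf_score(observed_masses, peptides, tol_da)
        for name, peptides in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical database of pre-digested proteins and observed masses.
toy_database = {
    "protein_A": ["MK", "TAYRPEGK", "LLR"],
    "protein_B": ["GASPVK", "TCLR"],
}
observed = [277.15, 400.28, 920.47]
print(rank_proteins(observed, toy_database))
```

With these toy values, all three observed masses match `protein_A` and none match `protein_B`. Real search engines replace the simple count with a probabilistic score that accounts for database size and mass accuracy.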

The mass spectrometer is the key element in determining peptide masses. This powerful tool is composed of three elements: an ionization source, a mass analyzer and a detector. ESI and MALDI are the most commonly used ionization sources (20). With MALDI, the sample must be co-crystallized with an ultraviolet-absorbing matrix, which is a low-molecular-weight aromatic acid. During irradiation, a focused laser of suitable wavelength targets the matrix, causing the molecules to sublime and transfer into the gas phase (20). The ions formed are then accelerated by electric potentials into a mass analyser.

In contrast to MALDI, the ESI source sprays the peptide solution under high voltage (several kV) to create highly positively charged microdroplets and generate ionized peptides (Figure 10). The liquid eluting from the chromatography column contains the peptides, which are electrostatically dispersed. Once the droplets are nebulized, the solvent evaporates, which decreases the size of the droplets and increases their charge density.

Figure 10: Two ionization techniques: matrix-assisted laser desorption/ionization (MALDI) (a) and electrospray ionization (ESI) (b), from Steen et al. (20). MALDI consists of targeting the peptides, co-crystallized with a matrix, with a laser, which ionizes them. ESI consists of applying a high voltage between the needle at the end of an HPLC and the mass spectrometer, which ionizes the peptides.
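Because ESI tends to produce multiply protonated peptides, the same peptide can appear at several m/z values. The short sketch below shows the usual relation between neutral mass, charge state and m/z for positive-mode ions, and the rule of thumb that isotope peaks of a z-charged ion are spaced by about 1.00335/z. The function names are hypothetical, and the example mass is an illustrative value for an arbitrary peptide.

```python
PROTON = 1.007276          # mass of a proton in Da
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` extra protons."""
    return (neutral_mass + charge * PROTON) / charge


def neutral_mass(mz_value, charge):
    """Neutral mass recovered from an observed m/z and charge state."""
    return charge * (mz_value - PROTON)


def charge_from_isotope_spacing(delta_mz):
    """Estimate the charge state from the spacing between isotope peaks."""
    return round(ISOTOPE_SPACING / delta_mz)


mass = 920.47156  # illustrative neutral peptide mass in Da
for z in (1, 2, 3):
    print(z, round(mz(mass, z), 4))
```

For this mass the doubly charged ion is expected near m/z 461.2431, and isotope peaks spaced by roughly 0.5 would indicate a charge state of 2.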

Finally, the ionized molecules are injected into the mass spectrometer and peptide fragmentation is induced by collision with gas molecules. This collision cleaves amide bonds, creating b-ions when the charge is retained by the amino-terminal fragment or y-ions when it is retained by the carboxy-terminal fragment (Figure 11) (20). Other types of fragments (e.g. a-ions) can also be produced, depending on the type of ionisation (e.g. MALDI) and on the mass spectrometer.

Figure 11: Fragmentation process during collision in the mass spectrometer, from Steen et al. (20). b-ions are fragments that retain the charge on the amino-terminal side and y-ions are fragments that retain the charge on the carboxy-terminal side.
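The b- and y-ion series described above can be computed directly from a peptide sequence. The following sketch assumes singly charged fragments and an unmodified peptide; the function name `fragment_ions` and the example peptide are hypothetical, and the residue masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_i holds the first i residues plus one proton.
    y_i holds the last i residues plus water plus one proton.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


# Hypothetical tripeptide used only as an illustration.
b_series, y_series = fragment_ions("LLR")
print([round(m, 4) for m in b_series])
print([round(m, 4) for m in y_series])
```

Consecutive peaks within one series differ by exactly one residue mass, which is what allows the sequence to be read from the spectrum. Leucine and isoleucine have the same mass and cannot be distinguished this way.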

The mass analyzer separates the peptide ions according to their mass-to-charge ratio (m/z). This ratio is then recorded by the detector. Redundancy can be reduced by using an exclusion list containing the previously fragmented precursor ions (24). Each selected peptide can then be fragmented by collision-induced dissociation (CID) into different fragment ions, which are recorded as the MS/MS spectrum. This spectrum is composed of the m/z ratio of the precursor ion, which reflects its mass and charge state, as well as some b- and y-ions representing part of the amino acid sequence of the peptide (Figure 12). Each peak of the HPLC run represents a group of the most abundant peptides (a). The amount of peptide is therefore given by the height of each peak. The spectrometer can then select the most abundant peptides present in each peak and measure their total mass (MS) (b). Finally, these selected peptides are fragmented and the MS/MS spectrum of each peptide gives its sequence (c).

Figure 12: Chromatogram (a), MS (b) and MS/MS (c) spectra of a mixture of peptides separated on an HPLC column, from Steen et al. (20). The mass spectrometer selects the most intense peptides in each chromatogram peak. The mass of these peptides is then determined during MS. Finally, each peptide is fragmented and its sequence is obtained during MS/MS.
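The selection of the most intense precursors, combined with an exclusion list, can be sketched as follows. This is a simplified illustration and not the acquisition logic of any particular instrument: the function name `select_precursors`, the `top_n` and `tol` parameters and the peak values are assumptions, and the sketch never expires entries from the exclusion list, whereas real instruments typically remove them after a time window.

```python
def select_precursors(survey_scan, exclusion, top_n=3, tol=0.01):
    """Pick the `top_n` most intense peaks not already fragmented.

    survey_scan: list of (mz, intensity) tuples from one MS scan.
    exclusion:   list of previously selected m/z values; it is
                 extended in place with the new selection.
    """
    selected = []
    by_intensity = sorted(survey_scan, key=lambda peak: peak[1], reverse=True)
    for mz_value, _intensity in by_intensity:
        if len(selected) == top_n:
            break
        # Skip precursors that were fragmented in an earlier cycle.
        if any(abs(mz_value - excluded) <= tol for excluded in exclusion):
            continue
        selected.append(mz_value)
    exclusion.extend(selected)
    return selected


# Hypothetical survey scan, repeated over two acquisition cycles.
scan = [(461.24, 900.0), (400.28, 500.0), (277.15, 100.0), (600.30, 50.0)]
exclusion_list = []
print(select_precursors(scan, exclusion_list, top_n=2))
print(select_precursors(scan, exclusion_list, top_n=2))
```

In the first cycle the two most intense peaks are chosen; in the second cycle they are excluded, so the two less abundant peptides are fragmented instead of the same precursors again.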

Three different approaches are commonly used to determine the correct identity of the analyzed peptides (Figure 13). The first one, called peptide sequence tags (PeptideSearch), is based on interpretative models, where it is assumed that each MS/MS spectrum contains at least one continuous series of fragment ions giving a short amino acid sequence called the amino acid tag (panel a). Together with the amino-terminal mass (m1) and the carboxy-terminal mass (m3), the peptide sequence tag is searched in the database to match the complete peptide sequence (20). The second approach (panel b) is based on descriptive models such as the Sequest algorithm, which mathematically correlates the experimental MS/MS spectrum with the theoretically predicted MS/MS spectrum. The quality of the match is then quantified as a cross-correlation score reflecting the similarity between the spectra (20). The third approach (panel c) relies on statistical and probability models. The Mascot search engine, which is based on the MOWSE scoring algorithm (25), evaluates all the matches between the input MS/MS spectrum and peptide sequences from the database. A probability (P) that the match is a random event is calculated for each peptide, and a score is returned as -10 × log10(P) for convenience. Further information on database-searching software can be found in recent reviews (26, 27).
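The score transformation mentioned for the third approach is easy to make concrete. The sketch below only converts between a probability P and the corresponding -10 × log10(P) score; how P itself is derived is internal to the search engine and is not reproduced here. The function names are hypothetical.

```python
import math


def score_from_probability(p):
    """Score defined as -10 * log10(P), for 0 < P <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("P must lie in the interval (0, 1]")
    return -10.0 * math.log10(p)


def probability_from_score(score):
    """Inverse transformation: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


for p in (0.05, 0.001):
    print(p, round(score_from_probability(p), 2))
```

A probability of 0.05 corresponds to a score of about 13, and each tenfold decrease in P adds 10 to the score, so higher scores indicate matches that are less likely to be random.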

Figure 13: Different approaches to the identification of peptides, from Steen et al. (20). These approaches consist of (a) matching a sequence against a database using the most intense fragments, (b) determining similarities with a theoretical spectrum and (c) calculating all predicted fragments contained in the database.
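The idea behind approach (b), scoring the similarity between an experimental and a theoretical spectrum, can be illustrated with a much simpler measure than the one Sequest uses. The sketch below computes a normalized dot product (cosine similarity) between two spectra after binning their peaks; it is a stand-in chosen for brevity and is not the Sequest cross-correlation score. The function names, the bin width and the peak lists are assumptions made for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into m/z bins of width `bin_width`."""
    binned = defaultdict(float)
    for mz_value, intensity in peaks:
        binned[int(mz_value / bin_width)] += intensity
    return binned


def cosine_similarity(spectrum_a, spectrum_b, bin_width=1.0):
    """Normalized dot product of two binned spectra, between 0 and 1."""
    a = bin_spectrum(spectrum_a, bin_width)
    b = bin_spectrum(spectrum_b, bin_width)
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical (m/z, intensity) peak lists.
experimental = [(114.09, 40.0), (175.12, 100.0), (227.18, 30.0), (288.20, 80.0)]
theoretical = [(114.09, 1.0), (175.12, 1.0), (227.18, 1.0), (288.20, 1.0)]
unrelated = [(150.50, 1.0), (301.70, 1.0)]
print(round(cosine_similarity(experimental, theoretical), 3))
print(cosine_similarity(experimental, unrelated))
```

The candidate peptide whose theoretical spectrum gives the highest similarity would be reported as the best match. A spectrum sharing no bins with the experimental one scores 0.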