
Identification and characterization of two new archaeal methyltransferases forming 1-methyladenosine or 1-methyladenosine and 1-methylguanosine in transfer RNA


Academic year: 2021


Faculté des Sciences
Département de Biologie Moléculaire
Laboratoire de Microbiologie

Identification and characterization of two new archaeal methyltransferases forming 1-methyladenosine or 1-methyladenosine and 1-methylguanosine in transfer RNA

Promotor: Prof. Dr. L. Droogmans
Co-promotor: Dr. M. Roovers

Submitted in fulfillment of the requirements for the degree of Doctor (Ph.D.) in Sciences
Morgane Kempenaers
Brussels, 2011


Acknowledgments

After four years spent in the microbiology laboratory of the ULB and at the Jean-Marie Wiame research institute, I have reached the end of my doctoral thesis. These years have been particularly enriching, both scientifically and personally.

I would therefore like to thank everyone who helped make these last years a pleasant experience.

First of all, I would like to warmly thank my supervisor, Louis Droogmans, for giving me the opportunity to do my doctorate on a fascinating subject and for kindly welcoming me into his group. Thank you, Louis, for your guidance, for sharing your knowledge, and for our discussions, sometimes a little tense but always enriching! Thank you for everything!

To Martine Roovers, a huge and very warm thank you; I could not have dreamed of a better co-supervisor, colleague and friend! Thank you for putting up with all my questions, and thank you for our discussions, scientific or entirely unscientific, our chats and our fits of laughter! Thank you for your good humour; what a pleasure to have spent these four years in your company! I will miss having a “little mum” at my side every day!

Speaking of “mums”, I also want to thank my second “mum” at the institute, Françoise Van Vliet. Thank you for your generosity, discreet but always present! I hope you will find someone to accompany you on your jog around the canal! It was a great pleasure to work alongside you!

I also want to thank Yamina Oudjama for guiding my first steps at the institute, for introducing me to the handling of the Akta and, of course, for your friendship. Thanks also to Daniel Gigot for all his technical advice and for the immense help he provided during the attempts to purify Tk0422p.

A very big thank you to Evelyne Dubois for welcoming me to the institute; it was a great pleasure to work in such an environment.

I want to thank all my colleagues at the research institute, Isabelle, Dany, Fabienne, papi, Raph, Nathalie B., Martine L., Chantal, Marc, Cédric, Virginie, Christianne, Mélanie, Dédé, Nathalie the redhead, Nathalie the blonde, Bertrand, Corinne, Françoise Bex, Luc, Catherine, Sigrid, Sandra, Carla, as well as those I came across at some point at the institute, Alfred, Sébastien, Julie, Geoffray, Angela, Bart, Eric, … A very big thank you also to Nicole for your good humour and your joie de vivre! A very big thank you to all the students who shared a few months in the lab with me: thanks to Joanne, Charles-Henri, Céline, Thierry, Aurélie, Rabi, Ariane, Jess, Marianne, Michèle, Ben, David, Laïla, and all the others, and a very special thank you to Ikram, for your friendship, which is very dear to me! Many thanks also to Aria, Momo and Jonathan; good luck with your respective theses! And a piece of good advice: hang in there, because it is worth it!

Many thanks also to Laurence Van Nedervelde for the help with HPLC, and to all my friends at Meurice and the UBT.

I would like to wholeheartedly thank Wim Versees and Marcus Fislage for their help and advice on protein crystallization, and for their friendship and unfailing good mood! It was a real pleasure to work with you! Thanks to everyone who helped me in your lab.

I would also like to wholeheartedly thank Eveline Peeters, Danny Charlier, Amelia, NingNing and Phu for introducing me to the genetics of Sulfolobus, and for suffering with me through the difficulty of obtaining a knockout! Thank you for the very pleasant atmosphere in your lab, for all the good advice, and for everything!

I would like to thank my university friends from the ULB, first of all Isa, Basilou and Xav, our gang of four from the first undergraduate year; you helped me enjoy that first year, which frightened me a little at the start! Thank you for still being there 7-8 years later, thank you for everything.

A huge thank you to the four “lustucru” who joined the gang, Caro, Bert, Jo and Triton, and who helped me get through the second undergraduate year! Thank you, Caro, for the petits beurres, and of course for everything else! Thanks to all of you for the fits of laughter in geology class, the games of Whist, and the Mogwayadactyle!

Thanks also to Damien, Vincent, Quentin, Germain, Steph, Seb, Yan, Caza and Cazette and all the others, biologists, ULB people, or friends met through one of them.

I would also like to thank all my friends from primary school and thereabouts for still being there after some twenty years; even if we see each other less often, you matter and always will matter to me! A huge thank you to my family, my parents, my brothers and sisters, my uncles, aunts, cousins and grandparents, for helping me become who I am. Special thanks to my twin Gwenn, for always being there when I need you, and for our coffees at Exki!

To conclude these acknowledgments, I would like to say from the bottom of my heart a huge thank you to Anne Verbist and Marina Riemslagh, for helping me make my life brighter and enjoy it fully! Thank you for your help; I will be eternally grateful to you!

I would also like to thank the UCPA for allowing me to meet great new friends, and all the friends whom I have not mentioned by name in these acknowledgments! Finally, I thank the Fonds pour la formation à la recherche dans l’industrie et dans l’agriculture (FRIA) for funding this work.

Thank you all, for everything

Table of contents

I. Introduction
  1. History
    1.1 Discovery of nucleic acids
    1.2 Animal nucleic acid versus plant nucleic acid
  2. The ribonucleic acid (RNA)
    2.1 Different categories of RNA
    2.2 The transfer RNA (tRNA)
      2.2.1 tRNA biosynthesis
        i) 5’-end processing
        ii) 3’-end processing
        iii) 3’-end CCA addition
        iv) Intron splicing
        v) Nucleoside modifications
        vi) Nuclear export in eukaryotes
      2.2.2 Ribonucleosides modifications
        i) Examples of modified nucleosides with “functional” roles
          a) Function of a modified nucleoside at position 34 of tRNA: the case of mnm5s2U
          b) Function of a modified nucleoside at position 37 of tRNA: the case of t6A
        ii) Examples of modified nucleosides with “structural” roles
          a) Modified nucleoside with local “structural” effect
          b) Modified nucleoside with global “structural” effect
  3. RNA methyltransferases (MTases)
    3.1 The various superfamilies of RNA MTases
      3.1.1 The Rossmann-fold superfamily
      3.1.2 The SPOUT superfamily
      3.1.3 The radical SAM superfamily
      3.1.4 The FAD/NAD(P)-binding protein superfamily
    3.2 tRNA methylation in Archaea
      3.2.1 Base mono-methylating tRNA MTases
        i) The tRNA (m1G37) MTase
        ii) The tRNA (m5U) MTase from Pyrococcus abyssi
        iii) The tRNA (m1A) MTase from P. abyssi
        iv) The tRNA (m5C) MTase from P. abyssi
        v) The tRNA (m2G) MTase from Methanocaldococcus jannaschii
      3.2.2 Base mono- and di-methylating tRNA MTases
      3.2.3 Ribose methylating tRNA MTases

II. Aim of the thesis

III. Results
  1. Functional characterization of archaeal MTases related to Trm10p, a tRNA (m1G9) MTase from the yeast Saccharomyces cerevisiae
    1.1 Cloning and expression of the archaeal Trm10p-related proteins Sso1087p from Sulfolobus solfataricus and Pab0880p from Pyrococcus abyssi
    1.2 Analysis of the tRNA MTase activities of Sso1087p and Pab0880p
    1.3 Sso1087p from S. solfataricus catalyzes formation of 1-methyladenosine on in vitro transcripts of archaeal tRNAs
    1.4 Identification and cloning of homologs to Sso1087p and Pab0880p in other archaeal species
    1.5 Analysis of tRNA MTase activities of Saci_1677p from S. acidocaldarius and Tk0422p from T. kodakaraensis and yeast Trm10p
    1.6 Confirmation of the nature of the modification catalyzed by the archaeal Trm10p-related proteins using in vitro transcripts of archaeal tRNAs
    1.7 Saci_1677p from S. acidocaldarius and Tk0422p from T. kodakaraensis act at position 9 of tRNAs
    1.8 Further characterization of the unique tRNA (m1A-m1G) MTase activity of Tk0422p from T. kodakaraensis
    1.9 Tk0422p catalyzes both m1A and m1G formation in vivo on E. coli tRNAs
    1.10 Insight into the in vivo function of Sulfolobus sp. Trm10p-related proteins: inactivation attempts of SSO1087 gene in S. solfataricus and of Saci_1677 gene in S. acidocaldarius
      1.10.1 Inactivation of SSO1087 gene from S. solfataricus
      1.10.2 Inactivation of Saci_1677 gene from S. acidocaldarius
  2. Structural characterization of the yeast Trm10p MTase, and of the archaeal Trm10p-related MTases Saci_1677p from S. acidocaldarius and Tk0422p from T. kodakaraensis
    2.1 Purification of S. cerevisiae Trm10p
    2.2 Structural analysis of Tk0422p from T. kodakaraensis
      2.2.1 Purification of Tk0422p
      2.2.2 Determination of tRNA elements recognized by Tk0422p
      2.2.3 Mutagenesis studies of Tk0422p
    2.3 Structural analysis of Saci_1677p from S. acidocaldarius
      2.3.1 Purification of Saci_1677p
      2.3.2 Determination of tRNA elements recognized by Saci_1677p
      2.3.3 Crystallization of Saci_1677p
  3. Cloning and attempts to characterize human and murine Trm10p-related proteins
    3.1 In silico analysis of the Trm10p paralogs in human and mouse
    3.2 Cloning and expression of the human and murine Trm10p-related proteins Rg9mtd1
    3.3 Analysis of tRNA MTase activities of human H1 and H1-export and murine M1 and M1-export recombinant proteins

IV. Discussion

V. Materials and Methods
  5.1 Materials
    5.1.1 Strains
    5.1.2 Plasmids
    5.1.3 Genomic DNAs
    5.1.4 Oligonucleotide sequences (from 5’ to 3’) for PCR DNA amplification
    5.1.5 Culture media
    5.1.6 Buffers and solutions
    5.1.7 Antibiotics
    5.1.8 Enzymes
    5.1.9 Radioactive isotopes
    5.1.10 Molecular mass markers
  5.2 Methods
    5.2.1 PCR amplification of a DNA sequence from genomic or plasmidic DNA
    5.2.2 Site directed mutagenesis
    5.2.3 DNA electrophoresis in agarose gel
    5.2.4 DNA extraction from agarose gel
    5.2.5 Protein electrophoresis in polyacrylamide-SDS gel
    5.2.6 Molecular cloning
    5.2.7 Transformation of E. coli cells
    5.2.8 Minipreparation and midipreparation of plasmidic DNA
    5.2.9 DNA sequencing
    5.2.10 Recombinant proteins expression
    5.2.11 Preparation of cell extracts
    5.2.12 General procedure for protein purification
    5.2.13 Measurement of DNA and protein concentration
    5.2.14 Cloning of tRNA genes
    5.2.15 in vitro transcription of tRNA genes
    5.2.16 Labelling of the truncated tRNAminiAla
    5.2.17 Preparation of polyacrylamide gel to purify the in vitro transcripts
    5.2.18 Two-dimensional thin layer chromatography
    5.2.19 tRNA MTase assay
    5.2.20 Quantification of the intensity of the various spots on the TLC plates or on the autoradiographies
    5.2.21 Localization of modified nucleosides in tRNAs
    5.2.22 HPLC analysis to determine the in vivo presence of m1A or m1G on E. coli tRNAs
    5.2.23 Gene inactivation in S. solfataricus
    5.2.24 Gene inactivation in S. acidocaldarius
    5.2.25 Crystallography

VI. Article

VII. Bibliography

VIII. Abstract

I. Introduction

1. History

1.1 Discovery of nucleic acids

Research into nucleic acids began in Germany in 1869-1870, when the young physician Johann Friedrich Miescher set out to determine the chemical composition of leukocytes, which he collected from the pus on fresh surgical bandages obtained from the nearby hospital in Tübingen. Their “histological purity” allowed him to achieve the most complete purification of the chemical building blocks that constitute the cells (1, 2). Miescher noticed that a substance precipitated from the solution when acid was added and dissolved again when alkali was added.

Since this compound was present in the nucleus, Miescher named it “nuclein”. He showed that nuclein was not sensitive to the protease pepsin, and that its composition differed from that of proteins because it lacked sulfur but contained a large amount of phosphorus together with carbon, hydrogen, oxygen and nitrogen. Miescher continued his experiments on nuclein using salmon sperm cells, because their heads consist almost exclusively of a nucleus. With these cells, he confirmed the composition of nuclein and stated that all the phosphorus in nuclein was present as phosphoric acid. Miescher concluded that in salmon spermatozoa, the acidic nuclein is bound in a salt-like state to a basic molecule, which he called “protamin” (3). What Miescher had identified as nuclein was in fact a deoxyribonucleoprotein.

Following Miescher’s pioneering work on nuclein, Albrecht Kossel showed that nuclein contained a protein portion and a non-protein portion (4), but it was Richard Altmann who first obtained a protein-free material, which he named nucleic acid (5).


1.2 Animal nucleic acid versus plant nucleic acid

In the last two decades of the 19th century, Albrecht Kossel succeeded in isolating and characterizing the bases adenine (A), guanine (G), thymine (T) and cytosine (C) in nucleic acid from calf thymus and beef spleen (6, 7, 8, 9, 10). At the very beginning of the 20th century, Ascoli identified a new compound, uracil (U), isolated from what was then called “yeast nucleic acid” (11).

In that period, scientists used two major sources of nucleic acid: calf thymus and yeast. In 1909, Phoebus Levene and Walter Jacobs hydrolysed yeast nucleic acid under mild conditions with pyridine to generate what they called the four pentose nucleosides: adenosine, guanosine, cytidine and uridine. These were identified as derivatives of the four bases adenine, guanine, cytosine and uracil. They also showed that yeast nucleic acid contained a phosphoric acid and a sugar molecule, identified as a pentose (12). At that time, the thymus nucleic acid from animal cells appeared resistant to alkaline hydrolysis, so it took twenty more years to determine its composition.

In 1929, Levene discovered, using the action of enzymes on thymus nucleic acid, that it contains thymidine in addition to adenosine, guanosine and cytidine (13). This nucleic acid also contained a phosphoric acid and a sugar molecule different from the one found in yeast nucleic acid. At first, this sugar was thought to be a hexose, but it was finally characterized as a 2-deoxypentose.

As most nucleic acids from animal cells resembled thymus nucleic acid, and as the only other nucleic acid prepared in large quantity from a plant (named triticonucleic acid) resembled yeast nucleic acid, the idea arose that there were only two sorts of nucleic acids in nature, “animal nucleic acid” and “plant nucleic acid”. Jones emphasized this idea in his monograph: “There are but two nucleic acids in nature, one obtainable from the nuclei of animal cells, and the other from the nuclei of plant cells” (14).

Nonetheless, it had been known for some time that pentose derivatives were present in animal tissue. Hammarsten had shown that a β-nucleoprotein found in mammalian pancreas contained a pentose sugar (15), and Calvery had shown that chicken embryos also contained some pentose nucleic acid (16). In the same manner, “animal nucleic acid” had also been isolated from plant tissues and from yeast (17, 18). Therefore, Calvery stated that: “Since the terms thymus or animal nucleic acid and yeast or plant nucleic acid no longer possess their original significance, […] I suggest that the terms hexose nucleic acid and pentose nucleic acid [should] be substituted for them” (16).

Since the “hexose” and pentose sugars were characterized as 2-desoxyribose and ribose, the terms Desoxyribose Nucleic Acid (DNA) and Ribose Nucleic Acid (RNA) were finally adopted to replace animal and plant nucleic acid.

Finally, Feulgen, Behrens and Mahdihassan successfully separated the nucleus of the cell from all cytoplasmic constituents, showing that DNA was in fact a constituent of all cell nuclei, independently of their origin, and that RNA was mainly a cytoplasmic constituent of all cells (19, 20). The final proof that RNA was a normal constituent of all cells came from the ultraviolet spectrophotometric studies of Caspersson (21), the histochemical observations of Brachet (22), and the chemical analysis of Davidson and Waymouth (23).

2. The Ribonucleic acid (RNA)

2.1 Different categories of RNA

The discovery of the various types of RNA in cells is tightly linked to the discovery of the mechanism of translation. During the 1950s, turnover experiments showed that the labelling of RNA in cells was inhomogeneous. Part of the RNA was found in the nucleus, part in particles in the cytoplasm, and part was found “soluble” in the cytoplasm. It was also shown that the turnover of these three fractions was different, meaning that RNA in cells is metabolically very inhomogeneous. These observations led to the suggestion that RNA could have more than one role in cells. Among the first to suggest a link between RNA and protein synthesis were Caspersson and Brachet, who showed that there was more RNA in organs that synthesize and secrete a lot of protein, such as the pancreas or the gastric mucosa, than in energy-consuming organs such as the heart or the muscles (21, 22).

Further evidence supporting the role of RNA in protein synthesis came from the work of Zamecnik and co-workers (24, 25), who showed, using radioactive amino acids, that microsomes, now known as ribosomes, were important for protein synthesis. They were the first to develop a cell-free system to study protein synthesis. This system used amino acids, an ATP-generating component, a soluble fraction of the cytoplasm, and ribosomes. During this in vitro experiment, evidence was obtained that the radioactive amino acids could be covalently linked to a “soluble” RNA present in the cytoplasmic fraction. In the presence of this labelled RNA, the ribosomal fraction, and GTP, the labelled amino acid was transferred from the “soluble” RNA, now known as transfer RNA (tRNA), to the ribosomal particle (26).

At that time, it was known that the genetic information was carried on DNA in the nucleus and that protein synthesis took place in the cytoplasm of cells, leading people to hypothesize that there should be a “messenger” molecule carrying information from genes to proteins. It was at first thought that the long RNA molecules present in the ribosomal particles (identified by Colter and co-workers (27)) could serve as “template”, leading to the hypothesis of “one gene-one ribosome-one protein” (Crick (28)). In Crick’s second hypothesis, the so-called adaptor hypothesis, the amino acid would be carried to the “template” RNA (ribosomal RNA, rRNA) by an adaptor molecule, which he thought would be a trinucleotide potentially derived from the transfer RNA identified previously by Zamecnik and colleagues. It was commonly thought at that time that the 70-75 nucleotide long tRNA would be too big to fit on rRNA, but kinetic studies later showed that the quantity of tRNA molecules and the rate of transfer were high enough to accomplish this activity (29, 30).

Finally, the existence of a “messenger” RNA (mRNA), a metabolically labile, uncharacterised RNA involved in protein synthesis and distinct from the more stable rRNA, was first suggested by Volkin and Astrachan (31), and confirmed by Brenner, Jacob and Meselson (32) and others. Indeed, they showed in phage or bacterial systems that the RNA that directs translation could change more rapidly than the life span of the ribosomal particle should have permitted, and must therefore consist of a separate, newly formed strand of RNA that turns over rapidly and associates with and dissociates from existing ribosomal particles. Nirenberg and Matthaei tested this hypothesis by using a synthetic polyribonucleotide as mRNA. In a cell-free system, they incubated polyuridylic acid with phenylalanine and observed the formation of polyphenylalanine (33). This constituted the first breakthrough in deciphering the genetic code. Ochoa and colleagues then found that polyadenylic acid codes for polylysine, and by 1966, the entire triplet genetic code was deciphered (34).
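The homopolymer experiments described above can be illustrated with a minimal Python sketch. It assumes a codon table reduced to the two assignments mentioned in the text (UUU for phenylalanine, AAA for lysine); the function name `translate_homopolymer` is hypothetical and chosen only for this illustration, and the sketch ignores start and stop signals entirely.

```python
# Deliberately minimal codon table: only the two homopolymer codons
# discussed in the text (poly-U -> polyphenylalanine, poly-A -> polylysine).
CODON_TABLE = {"UUU": "Phe", "AAA": "Lys"}


def translate_homopolymer(rna: str) -> list[str]:
    """Read an RNA string in non-overlapping triplets from its 5' end.

    Trailing nucleotides that do not fill a complete triplet are ignored.
    Codons absent from the reduced table are reported as "?".
    """
    usable_length = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable_length, 3)]
    return [CODON_TABLE.get(codon, "?") for codon in codons]


if __name__ == "__main__":
    poly_u = "U" * 12  # synthetic polyuridylic acid, 12 nt
    poly_a = "A" * 9   # synthetic polyadenylic acid, 9 nt
    print("-".join(translate_homopolymer(poly_u)))
    print("-".join(translate_homopolymer(poly_a)))
```

Because a homopolymer contains the same triplet in every reading frame, the frame does not matter here, which is one reason such templates were convenient for the first codon assignments.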

In summary, three major RNA categories were discovered in cells during the attempt to understand the mechanism of protein synthesis. The first is the messenger RNA (mRNA), which is a copy of the DNA genes present in the nucleus and serves as the template for protein synthesis. The second is the transfer RNA (tRNA), which carries the amino acids to the ribosomes and pairs with the template mRNA through nucleotide triplets. The third is the ribosomal RNA (rRNA), which constitutes the core of the ribosomal particle and pairs with mRNA to position it correctly for protein synthesis.

2.2 The transfer RNA (tRNA)

The tRNAs are small RNA molecules that carry amino acids to the growing polypeptide chains on the ribosomes. Each tRNA is specific for a given amino acid, and tRNAs that accept the same amino acid are called “isoacceptors”. tRNA species have to be different enough to allow correct recognition by their specific aminoacyl-tRNA synthetases (aaRS), and to be correctly charged with their specific amino acid. At the same time, tRNAs must be similar enough to be recognized by the translation elongation factors and to fit the tRNA-binding sites of the ribosome. Thus, all canonical tRNAs have a comparable length, from 74 to 95 nucleotides (nt), and similar folding for their secondary (cloverleaf) and tertiary (L-shape) structures (figure 1).

Figure 1: Secondary “cloverleaf” and tertiary “L-shape” structures of tRNAs (adapted from (35)).

By convention, the nucleotides of tRNAs are numbered from 1 to 76 according to the yeast tRNAPhe (figure 2) (36).

Figure 2: Nucleotide numbering in tRNA based on the sequence of the yeast tRNAPhe (36).

The cloverleaf is made of five regions, namely the acceptor stem, bearing the amino acid on the CCA triplet at the 3’ terminus, the D stem and loop, the anticodon stem and loop, where nt 34 to 36 form the anticodon, the variable loop, and the T stem and loop. The length of the different stem regions of tRNA is generally well conserved, consisting of seven base pairs (bp) in the acceptor stem, five bp in the T and anticodon stems, and three or four bp in the D stem. In contrast, as its name suggests, the length of the variable loop is not constant. Depending on this length, tRNAs can be divided into two classes. Class I tRNAs have a variable loop of four or five nt, while the variable loop of class II tRNAs can contain from ten to twenty-four nt. The tertiary L-shape structure of tRNA is obtained by the stacking of the acceptor stem onto the T stem to yield the acceptor arm, and of the D stem on the anticodon stem to form the D arm. This similar folding of all canonical tRNAs is made possible by the fixed length of the stems and loops, and by the semi-conserved and conserved nt participating in secondary and tertiary interactions (figure 3) (37).
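The class I / class II distinction described above reduces to a simple rule on the variable loop length. The following Python sketch encodes only the ranges stated in the text (four or five nt for class I, ten to twenty-four nt for class II); the function name `trna_class` and the "unclassified" fallback for lengths outside both ranges are assumptions made for this illustration.

```python
def trna_class(variable_loop_length: int) -> str:
    """Assign a tRNA class from the length (in nt) of its variable loop.

    Ranges follow the text: 4-5 nt -> class I, 10-24 nt -> class II.
    Any other length is reported as "unclassified" (an assumption of
    this sketch, not a category defined in the text).
    """
    if variable_loop_length < 0:
        raise ValueError("loop length cannot be negative")
    if 4 <= variable_loop_length <= 5:
        return "class I"
    if 10 <= variable_loop_length <= 24:
        return "class II"
    return "unclassified"


if __name__ == "__main__":
    for length in (4, 5, 13, 24, 7):
        print(length, "nt ->", trna_class(length))
```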

Figure 3: Interactions between semi-conserved or conserved nucleotides involved in tertiary structure folding in tRNAs (37). The picture represents the yeast tRNAPhe.

2.2.1 tRNA biosynthesis

The biosynthesis of the tRNA molecules is similar in the three domains of life (Eukarya, Bacteria and Archaea), and starts with the transcription by the RNA polymerase (pol III in eukaryotes) of the tRNA genes as pre-tRNA transcripts. These transcripts are then processed in five or six steps to generate the mature tRNA (figure 4).

i) 5’-end processing
ii) 3’-end processing
iii) 3’-end CCA addition
iv) Intron splicing
v) Nucleoside modifications
vi) Nuclear export in eukaryotes

Figure 4: Schematic view of a tRNA precursor (left) and of a mature (right) unmodified tRNA. The 5’-leader sequence, 3’ tail and intron are in blue, the nucleotides of mature tRNA are in green except for the anticodon that is colored in red. Picture adapted from (38).

i) 5’-end processing

In the three domains of life, pre-tRNA transcripts usually contain a 5’-leader sequence, which has to be removed by an endonuclease, ribonuclease P (RNase P). RNase P is a ribonucleoprotein in which the RNA possesses the catalytic activity (39). The organization of RNase P differs depending on its origin. In Bacteria, RNase P is formed by the association of one RNA molecule and one protein, while the archaeal RNase P contains one RNA component and five protein subunits (40). A slightly different situation exists in eukaryotes, where RNase P is much bigger, consisting of one RNA molecule and nine or ten protein subunits. In this case, the RNA is poorly catalytic by itself and needs the protein subunits to become truly catalytic (41, 42, 43).

There are two major exceptions to this 5’-end processing, one found in human mitochondria and the other in the archaeon Nanoarchaeum equitans.

The human mitochondrial RNase P is devoid of any RNA component and is composed solely of three proteins that together exert the catalytic activity (44). The situation is even more surprising in N. equitans, where no RNase P gene could be identified. It has been shown that the tRNA genes of this archaeon are transcribed without a 5’ leader and are functional without further processing (45).

ii) 3’-end processing

The pre-tRNAs from all domains always contain a 3’-end tail that has to be removed to form the mature tRNA. This maturation starts with the cleavage of the tail by an endonuclease, followed by the degradation of the cleaved fragment by various exonucleases (46, 47, 48, 49). RNase Z plays a major role in removing the trailer in various organisms, cleaving tRNAs just after the discriminator base N73 prior to the addition of CCA (50). The CCA constitutes an anti-determinant for RNase Z, preventing mature tRNAs from being cleaved and thus preventing a futile cycle (51).

iii) 3’-end CCA addition

All tRNAs must possess the CCA triplet at their 3’ end (positions 74-75-76) to be able to bind their cognate amino acid and to form the peptide bond on the ribosome (52). This CCA is not always encoded in the tRNA gene, but a CCA-adding enzyme uses CTP and ATP as substrates to construct or repair the 3’-CCA (53). This CCA-adding enzyme is found in the three domains of life and is essential in organisms where the tRNA genes do not encode the terminal CCA.

The archaeal CCA-adding enzyme differs from the bacterial and eukaryal ones, but all these enzymes contain four domains: head, neck, body and tail. They catalyze the addition of CCA to the 3’ end without any nucleic acid template.
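The "construct or repair" behaviour of the CCA-adding enzyme can be pictured as completing whatever remains of positions 74-76. The Python sketch below is a purely illustrative bookkeeping model, not a description of the enzymatic mechanism: given the nucleotides already present after the discriminator base N73, it lists which nucleoside triphosphates (CTP or ATP, as stated in the text) would be consumed to restore a full CCA. The function name `ntps_needed_for_cca` is hypothetical.

```python
# Substrate used for each nucleotide of the CCA triplet, as stated in the text.
NTP_FOR_NUCLEOTIDE = {"C": "CTP", "A": "ATP"}
TARGET = "CCA"


def ntps_needed_for_cca(existing_tail: str) -> list[str]:
    """Return the NTPs needed to complete the 3'-terminal CCA.

    `existing_tail` holds the nucleotides already present at positions
    74-76 (after the discriminator base N73): "", "C", "CC" or "CCA".
    Any other tail is rejected, since this sketch only models completion
    of a partial CCA and not removal of unrelated 3' sequences.
    """
    if not TARGET.startswith(existing_tail):
        raise ValueError(f"tail {existing_tail!r} is not a partial CCA")
    missing = TARGET[len(existing_tail):]
    return [NTP_FOR_NUCLEOTIDE[nucleotide] for nucleotide in missing]


if __name__ == "__main__":
    for tail in ("", "C", "CC", "CCA"):
        print(repr(tail), "->", ntps_needed_for_cca(tail))
```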

iv) Intron splicing

Only a minority of tRNA genes contain an intron in their sequence, but these intron-containing tRNA genes can be found in all sequenced eukaryal and archaeal species. In eukaryotes, the intron is always found between positions 37 and 38 of tRNA, and no more than one intron per tRNA gene can be found (54). The splicing endonuclease recognizes the body of the mature tRNA and uses a “ruler mechanism” to measure the distance to the cleavage site (see figure 5) (55).

Figure 5: Schematic view of tRNA intron splicing and ligation in eukaryotes. The anticodon is in red, the intron in blue. The endonuclease cleaves the tRNA molecule at each intron-exon border, generating tRNA halves with 2’-3’ cyclic phosphate (5’ half tRNA) and 5’-OH (3’ half tRNA) extremities. The 5’-OH is then phosphorylated, the 2’-3’ cyclic phosphate is opened to a 2’ phosphate, and the halves are ligated to form the spliced tRNA (figure from (59)).

The situation is different in Archaea, which display the most intron insertions in tRNA genes. Approximately 15% of all archaeal tRNA genes contain introns. These vary in size, from 16 to 44 nt, and in position. They are mostly found between positions 37 and 38 as in eukaryotes, but can also be inserted at other positions, and some tRNAs even contain two or three introns. The splicing determinant in Archaea is the bulge-helix-bulge structure of the intron, where a central four nt helix is flanked on both sides by three nt bulges (figure 6) (56, 57).

Figure 6: Schematic view of archaeal tRNA introns and bulge-helix-bulge motif required for their splicing (boxed). Picture adapted from (60).

Finally, some bacterial tRNA genes contain a different type of intron within the anticodon region. The tRNAs containing these introns are self-splicing tRNAs, able to cleave the phosphodiester bond to eliminate the intron and then to ligate the mature tRNA sequences together (58).

v) Nucleoside modifications

One of the striking characteristics of tRNAs from all organisms is their large number of post-transcriptional nucleotide modifications. Ninety-two different tRNA modifications are listed in the RNA modification database (http://Biochem.ncsu.edu/RNAmods), and a median of eight modifications can be found per tRNA species (61). Some examples of nucleotide modifications are given in figure 7.

In the last decade, many genes and corresponding modification enzymes have been identified and characterized in several organisms, and the function of the modified nucleosides is beginning to be better understood. They follow mainly three rules: first, many modifications in or around the anticodon play a role in the efficiency or fidelity of translation. Second, many modifications in the main body of tRNA affect tRNA folding and stability. Third, some modifications affect the tRNA identity. Examples of these roles will be given in the next chapter of the introduction.

Figure 7: Examples of tRNA modifications, pictures from the RNA modification database (http://Biochem.ncsu.edu/RNAmods). m = methyl, t = threonylcarbamoyl, Ψ = pseudouridine, rT = ribothymidine, D = dihydrouridine, mnm = methylaminomethyl, s = thiol, tm = taurinomethyl.

vi) Nuclear export in eukaryotes

In Bacteria and Archaea, the transcription and maturation of tRNA take place in the cytoplasm, while in eukaryotes, tRNAs are transcribed and partly matured in the nucleus, and have to be exported to the cytoplasm to exert their function. This nuclear export follows the Ran-GTPase pathway, where the exportin Xpo-t binds the tRNA in the presence of GTP, and this complex moves through nuclear pores to the cytoplasm. There, GTP is hydrolysed to GDP, promoting complex dissociation and releasing the tRNA into the cytosol (62).

These various transcription and processing steps constitute the usual tRNA synthesis pathway. However, some tRNAs are transcribed in a totally different manner, especially in Archaea. In N. equitans, some tRNAs are expressed as half molecules, each half being encoded by a tRNA half gene that possesses its own promoter and is not necessarily located in the vicinity of its matching half (63, 64). These genes are called “split tRNA genes”. The matching tRNA half molecules contain additional complementary nucleotide sequences at their 5’ or 3’ ends. These sequences pair to form a 14-nt long helix that mediates the joining of the two tRNA halves. At the junction between the tRNA and the helix, a BHB motif is formed, allowing splicing of the additional sequence and formation of the mature tRNA (figure 8).

Figure 8: Schematic view of “split tRNA genes” in N. equitans, and of the formation of mature tRNA from these “split genes” (59).

An even more complex tRNA gene organization can be found in the archaeon Caldivirga maquilingensis. In this organism, the tRNAGly (anticodon UCC) is encoded in three independent fragments, a 3’ half, a shortened 5’ half and a 12 nt long fragment containing the anticodon (65).

A last unusual tRNA gene organization was found in the red alga Cyanidioschyzon merolae (66). In this organism, the tRNA genes are also fragmented into 5’ and 3’ halves, but the two halves are arranged in inverted order within a single gene. This gene organization is called “permutated tRNAs”. The promoter of this gene is followed by the 3’ half, which precedes the matching 5’ half. The processing is similar to that of the split tRNA genes in N. equitans (figure 9).

Figure 9: Schematic view of C. merolae permutated tRNAs. The green sequence connects the 3’ part of the tRNA with its 5’ part. This green sequence is possibly excised by RNase P and tRNase Z (60).

Finally, another unusual tRNA maturation event is found in the archaeon Methanopyrus kandleri. The majority of the tRNA genes of this organism encode a C at position 8 of the tRNA, a position that usually bears the invariant nucleotide U8. U8 participates in tertiary interactions with A14, allowing correct folding of tRNA. In M. kandleri, the pre-tRNAs are edited from C8 to U8 by the CDAT8 enzyme (cytidine deaminase acting on tRNA base 8), correcting a nucleotide change that would otherwise impair the correct folding of tRNAs (figure 10) (67).
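At the sequence level, the C8-to-U8 editing event can be sketched as a simple string operation. The function name `edit_c8_to_u8` and the example sequence are hypothetical (the sequence is not a real M. kandleri pre-tRNA), and the sketch assumes that the string starts at tRNA position 1, so that position 8 corresponds to index 7.

```python
def edit_c8_to_u8(pre_trna: str) -> str:
    """Return the sequence with a C at tRNA position 8 changed to U.

    Assumes the string starts at tRNA position 1 (position 8 = index 7).
    Sequences without a C at position 8 are returned unchanged.
    """
    index = 7
    if len(pre_trna) > index and pre_trna[index] == "C":
        return pre_trna[:index] + "U" + pre_trna[index + 1:]
    return pre_trna


# Hypothetical 5' end of a pre-tRNA carrying C8 (illustrative only)
print(edit_c8_to_u8("GCCGGGGCAGCUCAGU"))
```

A sequence that already carries U8, or that is too short to contain position 8, passes through this function untouched.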

Figure 10: C-to-U editing in M. kandleri tRNAs (60).

2.2.2 Ribonucleoside modifications

In the early years of RNA studies, scientists thought that RNA molecules were composed exclusively of the four canonical nucleosides adenosine (A), guanosine (G), cytidine (C) and uridine (U). Yet, in 1951, Cohn and Volkin identified a new compound in a hydrolyzate of yeast tRNA by paper chromatography (68). This compound was later characterized as 5-ribosyluracil, now called pseudouridine (Ψ) (69). One year later, another type of non-canonical nucleoside was identified in RNA, resulting from methylation of the 2’ hydroxyl group of ribonucleosides (70, 71). In the same period, base-methylated nucleosides were also detected and characterized in RNA, such as 5-methyluridine (m5U or ribo-T), 5-methylcytidine (m5C), 1-methyladenosine (m1A), 1-methylguanosine (m1G), … (72, 73, 74, 75). By 1970, twenty years after the discovery of Ψ, 35 modified nucleosides had been identified in RNA, and to date, around 110 different modifications are known to occur in RNA molecules (some examples are shown in figure 7) (76, 77).

The synthesis of these modified nucleosides was initially puzzling, as scientists believed them to be incorporated directly by RNA polymerase into the growing polynucleotide. However, Svensson and co-workers as well as Fleissner and Borek showed that these modified nucleosides are synthesized post-transcriptionally (78, 79).

The RNA modifications can be quite diverse, ranging from simple chemical alterations such as methylations, deaminations, acetylations, reductions, thiolations, etc. to complex chemical modifications consisting of the addition of a long lateral chain or of several substituents on the same nucleoside. Among the different nucleoside modifications, base and ribose methylations, as well as pseudouridylations, are the most frequently encountered (76, 77).

More than half of the modified nucleosides found in RNA are specific to one of the three domains of life, suggesting that they arose late in evolution, after the separation of the different domains. About one quarter of the modifications are shared by two domains, and the last quarter are found in organisms from all three domains. These latter modifications are generally quite simple, and several are found in all types of RNA molecules. This observation led to the hypothesis that they correspond to relics of modifications present in the last universal cellular ancestor (LUCA) (figure 11).

Figure 11: Phylogenetic distribution of modified nucleosides in the three domains of life. Picture adapted from (80).

It is now generally accepted that all cellular RNAs contain modified nucleosides, but the largest number and the greatest variety of these modifications are found in tRNAs (figure 12). In these molecules, the anticodon loop is the most heavily modified region, particularly at positions 34 and 37 (3’ adjacent to the anticodon), and modifications at these positions are known to be important for mRNA decoding (81, 82). In contrast, the function of modifications present outside of the anticodon domain, in the structural core of tRNA, is poorly understood. These modifications are typically less sophisticated than those found in the anticodon, consisting mainly of methylated nucleosides and Ψ. Understanding the role of an individual modification in the core of tRNA is hampered by the apparent lack of consequence of its absence (83).

Indeed, most of the time, the effect of one modification becomes apparent only when additional modifications or elements of the tRNA structure are also missing (84, 85). Nevertheless, the common perception in the field of tRNA modification is that modified nucleosides in the core of the tRNA would play a role in the structural stability and flexibility of the RNA molecule (86, 87).

Figure 12: Distribution of modified nucleosides by RNA type. Picture adapted from (80).

The tRNA modified nucleosides can thus be seen in a simplified view as having a “functional” role if found in the anticodon region, or a “structural” role if found in the core of the tRNA molecule.

The following section will give some examples of “functional” or “structural” tRNA modifications, although it is sometimes difficult to separate modifications as having only a “structural” or only a “functional” role. A later chapter of this introduction will focus mainly on methylated nucleosides and on the enzymes catalyzing their formation.

i) Examples of modified nucleosides with “functional” roles

To date, more than six hundred tRNAs from one hundred different organisms of the three domains of life have been sequenced, and the type and location of their modified nucleosides have been identified (76). A map of the modifications found at different positions of tRNA is presented in figure 13. Some positions in tRNAs are never modified, while others can be modified in a wide variety of ways. This is especially the case for positions 34 and 37 of the anticodon loop. Indeed, 28 different modified nucleosides are found at the wobble position of the anticodon (position 34), and 17 different modifications are found at position 37, the position 3’ adjacent to the anticodon.

This variety in nucleoside modifications suggests that their presence is important for the decoding capacity of tRNAs. The decoding information of the tRNA may in fact extend beyond the anticodon, especially toward the 3’ side of the tRNA (88, 89).

Figure 13: Schematic representation of tRNA cloverleaf structure and positions where a given modified nucleoside has been encountered (90).

a) Function of a modified nucleoside at position 34 of tRNA: the case of mnm5s2U

A compilation of tRNA sequences highlighted the presence of a large number of modifications at position 34 of tRNAs. In his wobble hypothesis, Crick suggested that while the first two positions of the codon (positions 35 and 36 of the anticodon) were highly likely to follow the strict standard base pairing rule, the base-pairing of the third position of the codon (first position of the anticodon) could tolerate a certain amount of play, or wobble, such that more than one possibility of pairing would be tolerated (91). This could explain the degeneracy of the genetic code, where most amino acids have synonymous codons, differing mainly in the third letter (figure 14).

Figure 14: The genetic code.

The large number of modified nucleosides found at position 34 could therefore represent adaptations developed to influence the decoding of mRNAs. Among the 28 different modified nucleosides found at the wobble position, 17 are uridine derivatives, suggesting that the decoding capacity of tRNAs with a modified uridine at this position is improved compared to tRNAs with an unmodified U34. It has been shown that modifications of U34 can either extend recognition to all four canonical nucleosides at the last position of the mRNA codon, allowing recognition of the four codons in the 4-fold degenerate codon boxes of the genetic code, or restrict recognition to one or two cognate nucleosides in the mixed codon boxes containing 2-fold degenerate codons.
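The distinction between 4-fold degenerate codon boxes and mixed codon boxes can be made concrete with a short Python sketch based on the standard genetic code. The names `codon_box` and `is_fourfold_degenerate` are introduced here for illustration only, and a "box" is taken to mean the four codons sharing the same first two bases.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_box(first_two: str) -> dict:
    """Map each third codon base to the encoded amino acid for one codon box."""
    return {third: CODON_TABLE[first_two + third] for third in BASES}


def is_fourfold_degenerate(first_two: str) -> bool:
    """True if all four codons of the box encode the same amino acid."""
    return len(set(codon_box(first_two).values())) == 1


# Boxes in which the third codon base does not change the amino acid
fourfold = [a + b for a, b in product(BASES, repeat=2) if is_fourfold_degenerate(a + b)]
print(fourfold)

# A mixed box: AAU/AAC encode Asn (N), AAA/AAG encode Lys (K)
print(codon_box("AA"))
```

In a 4-fold degenerate box, a wobble nucleoside with expanded recognition is harmless, whereas in a mixed box such as AAN, reading beyond the two cognate codons would insert the wrong amino acid.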

For example, the modification mnm5s2U (see figure 7) is a complex modification found in all tRNAs reading NAA codons (where N represents C, A or G), which belong to mixed codon boxes in which two codons correspond to one amino acid and the two others correspond to a second amino acid, as for histidine/glutamine (His/Gln), asparagine/lysine (Asn/Lys), and aspartate/glutamate (Asp/Glu). The corresponding unmodified anticodon sequences are 5’-U34-U35-U/C/G36-3’. Because the stacking between uridines is poor, an anticodon with such a U-rich sequence should be unstable (92). The thiolation at position 2 of U improves the stacking interactions between uridines, thereby stabilizing the anticodon-codon pairing (93, 94), which can explain why tRNAs from all organisms reading NAA codons have thiolated uridine derivatives as the wobble nucleoside. It was also shown that the mnm5 side chain of U34 could increase G recognition at the third position of the codon by tRNALys (UUU), thereby improving the reading of 2-fold degenerate codon boxes (95).

The “functional” effects of modified nucleosides are not limited to codon-anticodon pairing; they can also play a role in reading-frame maintenance on the ribosome. Indeed, reading-frame maintenance is a key part of translation because the amino acid sequence of a protein synthesized after frame-shifting would be totally different from the one expected. Moreover, the ribosomes frequently encounter a stop codon after a frame-shift event, leading to the synthesis of truncated, non-functional proteins. Hypomodified tRNAs may affect reading-frame maintenance by altering the binding rate of the ternary complex (aminoacylated tRNA-GTP-elongation factor EF-Tu) to the ribosome, or by altering the dissociation rate of the tRNA bound to the A or P sites of the ribosome.

The modified nucleoside mnm5s2U34 not only affects codon-anticodon recognition, but is also implicated in reading-frame maintenance. Indeed, in E. coli, a tRNAGln deficient in mnm5s2U34 is unable to bind correctly to the P-site of the ribosome, resulting in frame-shifting (96).

b) Function of a modified nucleoside at position 37 of tRNA: the case of t6A

Some nucleotides are always found at the same position of the tRNA molecule, while some positions can only bear a purine and others can only accommodate a pyrimidine. These nucleotides are considered invariant or semi-invariant respectively, and play crucial roles in the secondary and tertiary interactions leading to the tertiary folding of tRNAs (97). Position 37 (3’ adjacent to the anticodon) is one of the tRNA sites always occupied by a semi-invariant nucleotide. This position always contains a purine nucleotide, which is often modified with diverse chemical groups. Seven different modified guanosines and ten different modified adenosines can be found at this position, some of them being complex chemical modifications. Among them, the t6A modification (N6-threonylcarbamoyladenosine) (see figure 7) is present in E. coli tRNALys (anticodon mnm5s2UUU) or in the human tRNALys3. It improves codon-anticodon pairing indirectly by creating an open loop structure: it prevents interaction between A37 and U33 (invariant nt), which destabilizes the closed anticodon loop, and it orders the structure by improving the stacking of its 3’-side (98). This example emphasizes the difficulty of classifying nucleoside modifications as having only “functional” or “structural” roles. A further effect of the t6A modification is a direct stabilization of codon-anticodon pairing by an increase of the stacking interactions. This effect was identified in E. coli and yeast by the observation that tRNAIle (anticodon GAU) (E. coli) or tRNAArgIII (yeast) deficient in the t6A modification bind less well to their cognate codon than wild-type tRNAs (99, 100).

ii) Examples of modified nucleosides with “structural” roles

A correctly arranged structure of tRNA is crucial for its biological function, and most of the time, an in vitro transcribed tRNA spontaneously adopts the canonical secondary and tertiary folds. These in vitro transcripts do not contain any modified nucleosides. Therefore, it is generally accepted that modifications are not essential for the global folding of tRNAs, but rather fine-tune the tRNA structure locally. Although most unmodified tRNAs are functional in vitro in aminoacylation assays as well as in in vitro translation assays, recent evidence indicates that tRNAs are non-functional in vivo without the post-transcriptional modifications that reinforce the structural tRNA core (84, 85, 101). Two examples of the “structural” function of modified nucleosides will be given, one concerning a modification with a subtle role in tRNA structure, and the second concerning a crucial modified nucleoside essential for the correct folding of tRNA.

a) Modified nucleoside with local “structural” effect

Most of the modified nucleosides influence parameters of the local tRNA structure, notably by reinforcing or forming the secondary and tertiary interactions inside the tRNA molecule. The methylated nucleoside 7-methylguanosine (m7G) (see figure 7) is found at position 46 of the variable loop of yeast tRNAPhe, and is involved in tertiary interactions with the base pair C13-G22 of the D-stem and loop (102, 103). m7G46 acts in two different ways to stabilize the tertiary structure of tRNAs: on the one hand, it makes two hydrogen bonds with G22 of the D-stem and loop, and on the other hand, its positive charge (m7G is one of the few methylated nucleosides bearing a positive charge, together with 1-methyladenosine (m1A) and 3-methylcytidine (m3C)) stabilizes the tertiary structure by electrostatic interactions with the phosphodiester backbone of tRNAs. These interactions help to stabilize the variable loop in the center of the tRNA molecule.

Despite its structural role, the modification m7G46 is not essential in yeast tRNAs, since deletion of the corresponding enzyme only generates a growth defect in a narrow window of growth conditions (104). Nevertheless, Phizicky and co-workers showed that combined deletion of the m7G46 methyltransferase (MTase) gene (Trm8) with the gene of one of the other modifying enzymes acting outside of the anticodon in yeast tRNAs generates a strong growth defect, the strongest growth-defect phenotypes being observed for the combined absence of m7G46 and m5C49, of m7G46 and Ψ13, and of m7G46 and D47 (85). This growth defect is related to the degradation of unstable hypomodified mature tRNAs by a rapid tRNA decay (RTD) pathway. The yeast tRNAVal(AAC) is particularly sensitive to the combined loss of modifications, since it is the only tRNA species that is reduced in amount in all double mutants characterized in this study. This tRNA contains the four modified nucleosides whose combined absence produced the most pronounced growth defect phenotypes. All these modifications are potentially implicated in maintaining the three-dimensional structure of tRNAs, making it plausible that the structure of this tRNA is particularly sensitive and requires stabilization by a constellation of modifications.

The above example emphasizes the fact that a single “structural” modification is not necessarily crucial for the global tRNA folding, but improves the tRNA structure locally, and that the resulting network of local stabilizations is crucial to maintain the correct global three-dimensional structure of tRNAs.

b) Modified nucleoside with global “structural” effect

Large structural effects in RNA include dissolution and re-formation of helical structures. This implies changes in base pairing, that is, in Watson-Crick interactions. Methylations are the simplest and most widespread modifications known to block hydrogen bond formation between nucleotides (see figure 15).

Figure 15: Canonical nucleosides as well as pseudouridine (Ψ) can form hydrogen bonds on their “C-H” edge, Watson-Crick edge or sugar edge. These hydrogen bonds participate in secondary and tertiary folding of tRNAs. The modified nucleosides, essentially methylated nucleosides, blocking the hydrogen bond formation are indicated (105).

One of the best examples of a modified nucleoside generating a global structural rearrangement of the tRNA molecule concerns the human mitochondrial lysine tRNA (mttRNALys) (106, 107). In vitro transcripts of this tRNA, devoid of any modification, do not fold into the expected canonical cloverleaf, but rather adopt an alternative fold consisting of an extended hairpin structure with several bulges (figure 16). This extended structure cannot adopt the canonical three-dimensional L-shaped fold. The unusual folding of the human mttRNALys transcript indicates that one or several of the six modified nucleosides present in the native tRNA are required for, and responsible for, the formation of the cloverleaf structure. These modified nucleosides are m1A9, m2G10, Ψ27, Ψ28, tm5s2U (a hypermodified uridine) and t6A. In the native mttRNALys, A9 is present in a single-stranded region, while it is base paired in the in vitro transcript, where it contributes to the formation of the extended acceptor stem; therefore it could be the key element in the folding process of mttRNALys.

Indeed, Helm and co-workers, using a chimeric tRNA possessing m1A9 as its only modified nucleoside, and comparing its structure to both the native tRNA and the in vitro transcribed unmodified tRNA, confirmed that m1A9 is sufficient to lead to the correct cloverleaf folding of the tRNA. This effect is due to the blocking of Watson-Crick base pairing between A9 and U64, which also prevents pairing between residues 8 and 65 and between residues 10 and 63.
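The blocking effect of the N1 methylation on Watson-Crick pairing can be captured in a deliberately simplified Python sketch. The function `can_pair` and the set `WATSON_CRICK_BLOCKED` are hypothetical names introduced for this illustration; only m1A, the case discussed here, is listed as a blocking modification, and neither wobble pairs nor tertiary interactions are modelled.

```python
# Canonical Watson-Crick pairs between unmodified RNA nucleosides
WATSON_CRICK_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Modified nucleosides treated in this sketch as unable to use their
# Watson-Crick edge. Only m1A is listed, as it is the case discussed in the text.
WATSON_CRICK_BLOCKED = {"m1A"}


def can_pair(nucleoside_a: str, nucleoside_b: str) -> bool:
    """True if the two nucleosides can form a canonical Watson-Crick pair."""
    if nucleoside_a in WATSON_CRICK_BLOCKED or nucleoside_b in WATSON_CRICK_BLOCKED:
        return False
    return (nucleoside_a, nucleoside_b) in WATSON_CRICK_PAIRS


print(can_pair("A", "U"))    # unmodified A9 facing U64
print(can_pair("m1A", "U"))  # methylated A9 facing U64
```

Under these assumptions, the unmodified A9 can pair with U64, as in the extended hairpin of the in vitro transcript, whereas m1A9 cannot, which is consistent with the refolding into the cloverleaf described above.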

Figure 16: Structure of the native human mitochondrial tRNA for Lysine (mttRNALys) and of the corresponding unmodified in vitro transcript (108).

Animal mitochondrial tRNAs present particular structural features which differentiate them from canonical tRNAs. Indeed, they differ in the size of the stem and loop regions, and they lack semi-conserved or conserved residues known to be essential for the three-dimensional structure (97). Therefore, the minimal structural characteristics of the cloverleaf could be reached only after certain modification steps in these tRNAs.

3. RNA methyltransferases (MTases)

3.1 The various superfamilies of RNA MTases

Among the naturally occurring RNA modifications, base and ribose methylations are by far the most frequently encountered. They constitute about 2/3 of all RNA modified nucleosides (77, 109). The methylation can be targeted to several atoms of the base, or to the 2’ hydroxyl group of the ribose. Concerning base methylations, almost all nitrogen atoms can be methylated, except the nitrogen involved in the glycosidic bond and positions N7 and N3 of adenosine. These methylations can be endo- or exocyclic. Some carbon atoms are also targets of methylation, such as the carbon atoms at position 5 in pyrimidines and at positions 2 and 8 in adenosine. Methylated nucleosides can be found in various RNA species, such as tRNA, rRNA, mRNA, snRNA, miRNA and in viral RNA. These methylated nucleosides are formed by modification enzymes called RNA methyltransferases (MTases), which belong to four unrelated superfamilies characterized by different three-dimensional folds (figure 17) (110). RNA MTases from the Rossmann-fold superfamily, the SPOUT superfamily and the radical SAM superfamily use S-adenosyl-L-methionine (AdoMet) as methyl donor, while MTases from the FAD/NAD(P)-binding protein superfamily use 5,10-methylenetetrahydrofolate (5,10-CH2-THF) as methyl donor.

Most of the AdoMet-dependent MTases use a bimolecular nucleophilic substitution mechanism to catalyze direct transfer of the methyl group of AdoMet to their substrates, except for MTases from the radical SAM superfamily. These enzymes use a totally different mechanism to methylate their substrate, which will be explained in the paragraph concerning this family of MTases.

Figure 17: Structure of four MTases from the different superfamilies of RNA MTases. (A) Crystal structure of the mRNA cap 1 MTase VP39 with the RFM catalytic domain. (B) Theoretical model of the structure of Cfr MTase, with the radical SAM catalytic domain (J. Bujnicki, unpublished data). (C) Crystal structure of RlmH, an rRNA MTase with a SPOUT catalytic domain. The enzyme is a dimer, but only one subunit is shown here. (D) Theoretical model of the structure of the FAD/NAD(P)-binding domain of TrmFO MTase (J. Bujnicki, unpublished data). The protein molecules are shown in the cartoon representation, ligands and RNA are shown as black sticks, the conserved core of the catalytic domain is shown in red, additional elements are shown in grey (110).

3.1.1 The Rossmann-fold superfamily

Most of the known MTases belong to this family. Their structure comprises a 7-stranded β sheet (6↓7↑5↓4↓1↓2↓3↓) flanked by α helices to form an αβα sandwich that resembles a Rossmann fold (6↓5↓4↓1↓2↓3↓) into which a seventh β-strand has been inserted in an antiparallel manner (figure 18). Therefore, these enzymes are called Rossmann-fold MTases (RFM).

The first β-strand contains a GxGxG sequence, characteristic of a nucleotide binding site, and the second β-strand ends with an acidic residue, implicated in hydrogen bonding with the ribose of AdoMet.

RFM are frequently monomeric, although di-, tri- or tetrameric structures have been reported. For example, the PabTrmU54 MTase from the euryarchaeon P. abyssi is monomeric, while the TrmI MTase from the same organism has an unusual tetrameric structure (111, 112).

Figure 18: Example of the fold and topology of an RFM. Triangles correspond to β-strands, circles to α helices. Picture adapted from (113).

3.1.2 The SPOUT superfamily

The second largest group of RNA MTases is the SPOUT superfamily, whose name comes from the first two members of this family, the SpoU and TrmD MTases (114, 115). The structure of these enzymes comprises a six-stranded parallel β sheet (6↑4↑5↑1↑2↑3↑) flanked by seven α helices, the first three strands of the sheet forming half a Rossmann fold. One of the most remarkable features of this family is the presence of a so-called trefoil knot in the C-terminal part of the domain.

Indeed, about 30 residues of the C-terminus of the molecule go through a loop formed by the last three strands of the β sheet. This knot is crucial for the MTase function, being responsible for AdoMet binding (figure 19). All known SPOUT MTases are dimers, the catalytic site being formed at the interface of two monomers, as can be seen in the crystal structure of the tRNA MTase aTrm56 from P. horikoshii, a tRNA 2’-O-methylcytidine MTase (116).

Figure 19: Example of fold and topology of a SPOUT MTase. Triangles correspond to β-strands, circles to α helices. Picture adapted from (113).

MTases from the RFM and SPOUT superfamilies catalyze methylation of their substrates by a bimolecular nucleophilic substitution mechanism. This mechanism has been described for the SPOUT MTase TrmD, catalyzing m1G37 formation in tRNAs (figure 20) (117, 118). The methylation is probably initiated by an aspartate residue of TrmD (D169). Indeed, D169 is positioned near the N1 atom of G37, and would deprotonate it. The deprotonated G can therefore attack the methyl group of AdoMet, leading to m1G37 formation in tRNAs.

Figure 20: Model of m1G37 formation by TrmD (118).

3.1.3 The radical SAM superfamily

Among the enzymes of this family, only two RNA MTases have been identified to date, Cfr and RlmN, both acting on rRNAs (119, 120). Enzymes belonging to this family possess a TIM-barrel fold, which comprises eight tandemly arranged α/β elements (1↓2↓3↓4↓5↓6↓7↓8↓), although most of them lack two of these α/β elements and exhibit a “3/4 barrel” fold (figure 21). This is notably the case of E. coli RlmN, which possesses an α6/β6 organization (121). Despite their structural similarity to the Rossmann fold, radical SAM enzymes bind AdoMet in a completely different way, through coordination by an Fe-S cluster. No tRNA MTase belonging to this family has yet been discovered.

Figure 21: Example of fold and topology of a radical SAM MTase. Triangles correspond to β-strands, circles to α helices. Picture adapted from (113).

The radical SAM MTases do not catalyze direct transfer of a methyl group to the rRNA; instead, the methylation involves two methyl group transfer steps (122, 123). The proposed mechanism for RlmN, an m2A2503 rRNA MTase, involves a first transfer of the methyl group from AdoMet to a cysteine of the MTase (Cys355). A second AdoMet molecule is then cleaved to form a 5’-deoxyadenosyl 5’-radical (5’-dA•) that abstracts a hydrogen atom from the protein-bound methyl group. This yields a neutral, carbon-centered radical that attacks the carbon C2 of A2503 in rRNA. By electron reorganization, the methyl group bound to the MTase is finally transferred to the carbon C2 of A, forming m2A in rRNA (figure 22).

Figure 22: Mechanism of m2A2503 formation in 23S rRNA by the RlmN protein, an MTase of the radical SAM superfamily (123).

3.1.4 The FAD/NAD(P)-binding protein superfamily

The enzymes belonging to this family present a typical Rossmann fold with six parallel β-strands (6↓5↓4↓1↓2↓3↓), the co-factor binding site being formed by the loops following strands 1, 2 and 3. Only one protein from this superfamily is known to catalyze RNA methylation: TrmFO, a tRNA (m5U54) MTase found in Bacillus subtilis (124, 125).

3.2 tRNA methylation in Archaea

In 1977, Carl Woese and George Fox used ribosomal RNA sequences to draw the phylogenetic tree of all living organisms. Contrary to the previously accepted dichotomous classification into Prokaryotes and Eukaryotes, they showed that comparison of 16S or 18S rRNA sequences defined not two but three different domains of life, which they named Archaebacteria, Eubacteria and Urkaryotes (126). At that time, the archaebacterial domain, now called the archaeal domain, was represented solely by some methanogenic “bacteria”, a class of anaerobic organisms possessing a metabolism based on the reduction of carbon dioxide to methane. The following years witnessed the discovery of a plethora of new archaeal species, and further phylogenetic analyses made it possible to classify these new species into four different phyla, namely the Crenarchaeota, Euryarchaeota, Nanoarchaeota and Korarchaeota.

The majority of the archaeal species known to date belong to the Crenarchaeota and to the Euryarchaeota, and some of them have extreme lifestyles, being adapted to environments usually considered highly deleterious for life. Indeed, most of the crenarchaeal species were isolated from extremely hot and, more recently, also from extremely cold environments, such as hot springs, geysers and solfataras, or polar seas and abysses. These organisms are called thermophiles or hyperthermophiles if adapted to high temperatures, and psychrophiles if adapted to cold temperatures. The euryarchaeal phylum contains physiologically diverse organisms. As in the Crenarchaeota, many euryarchaeal species are thermophiles or hyperthermophiles, but this phylum also contains organisms adapted to other extreme parameters, such as pressure (piezophilic organisms), salt concentration (halophilic organisms), and pH (acidophilic organisms if adapted to low pH, alkaliphilic organisms if adapted to high pH). This phylum also includes the methanogens, a group of archaea whose metabolism produces methane gas. The Nanoarchaeota phylum contains only one genus, Nanoarchaeum, a small parasitic archaeon living on the surface of the cells of the crenarchaeon Ignicoccus. The Korarchaeota were discovered by 16S rRNA analysis of organisms living in hot springs, but their existence could not be confirmed until very recently, by the sequencing of the genome of ‘Ca. Korarchaeum cryptofilum’ (127). Finally, the existence of two other archaeal phyla, the Thaumarchaeota and the Aigarchaeota, has recently been proposed, based on the results of genome sequencing (128). The relations between the archaeal phyla are shown in figure 23.

Figure 23: Phylogenetic tree of Archaea, based on the comparison of 16S rRNA sequences. Picture from Brock, Biologie des micro-organismes.

Among the most studied archaeal organisms are Sulfolobus species (Crenarchaeota) and Thermococcus species (Euryarchaeota). Sulfolobus species have been isolated from sulfur-rich, hot acidic springs. These organisms have a spherical lobed morphology, and they grow optimally at temperatures around 80°C and at pH values between 1 and 5. They are aerobic chemolithotrophs, oxidizing elemental sulfur to sulfuric acid. The genomes of some Sulfolobus species, including S. solfataricus and S. acidocaldarius, have been sequenced (129, 130), and they consist of one circular chromosome of 2.99 million and 2.22 million bp respectively, with a GC content of 37%. While the S. solfataricus genome contains 200 insertion sequence elements and many putative non-autonomous mobile elements, no such elements could be identified in the genome of S. acidocaldarius, which makes it a more genetically stable strain. However, the S. acidocaldarius genome lacks some of the genes for sugar transporters, thus limiting its growth to a narrow range of carbon sources.

Thermococcus sp. were also isolated from hot springs, but contrary to Sulfolobus sp., they are strictly anaerobic organisms. Their morphology consists of spherical cells with many polar flagella. Thermococcus sp. are chemoorganotrophs, growing on organic substrates in the presence of elemental sulfur, which is reduced to hydrogen sulfide. Their optimal growth temperature lies between 75°C and 95°C, and their optimal pH of growth is 7. The genome of Thermococcus kodakaraensis has been sequenced (131), and consists of one single circular chromosome of 2.09 million bp, with a GC content of 38%. No extrachromosomal elements were detected in this organism.

Although information about the physiology and genetic organization of archaeal organisms is available, the molecular mechanisms taking place in these organisms are still less well understood than in Bacteria or in Eukarya. This lack of data is well illustrated by the tRNA modification machinery, since only eight different tRNA MTases have been characterized in Archaea. These MTases differ in terms of structure, function or mechanism.

These enzymes can be classified into three arbitrary groups, namely mono-methylating (I) or di-methylating (II) enzymes acting on the bases of nucleosides, and ribose-methylating enzymes (III). The different modified nucleosides formed by the characterized archaeal tRNA MTases are described in figure 7.

3.2.1 Base mono-methylating tRNA MTases

Five of the known archaeal tRNA MTases belong to this arbitrary group.

i) The tRNA (m1G37) MTase

The modified nucleoside m1G is found at position 37 of tRNAs specific for leucine (CUN codon, N is one of the four canonical nucleosides), proline (CCN) and arginine (CGG) from the three domains of life (132). A strong evolutionary pressure must have been maintained against an unmodified G37, since almost all tRNAs containing a guanosine in position 37 sequenced to date are methylated on position N1 to form m1G37 (133). In Archaea, this modification is catalyzed by Trm5p, a tRNA MTase from the RFM superfamily.

Homologs of Trm5p are found in Eukarya, but not in Bacteria, where the same modification is catalyzed by TrmD, a SPOUT MTase unrelated to Trm5p (134). This phylogenetic distribution illustrates that m1G37 formation in tRNAs from the three domains of life results from convergent evolution. This modification was shown to be dispensable in E. coli and in yeast, although the mutants present severe growth defects, but it is essential for growth in Streptococcus pneumoniae (135). Thus the presence of m1G37 in tRNAs appears to exert a strong beneficial effect on tRNA function. It was later shown that m1G37 exerts this effect by preventing frameshifting on the ribosome.

ii) The tRNA (m5U) MTase from Pyrococcus abyssi

5-methyluridine (m5U), also called ribothymidine (rT), is found in tRNAs and rRNAs in most eukaryal and bacterial organisms. In tRNAs, m5U is formed at position 54 of the so-called T loop. Its formation is catalyzed by Trm2 in eukaryotes and TrmA in bacteria, two homologous enzymes from the RFM superfamily (136, 137). In contrast to eukaryotes and bacteria, m5U54 is rarely found in archaeal tRNAs. Its presence in P. abyssi tRNAs is due to the tRNA-specific MTase PabTrmU54 (138). Surprisingly, while PabTrmU54 is specific for tRNA methylation, phylogenetic analysis showed that it is more closely related to the bacterial rRNA (m5U) MTase RumA than to the tRNA MTases Trm2 and TrmA. This discovery led to the hypothesis that the PabTrmU54 gene was acquired by P. abyssi or one of its ancestors by horizontal gene transfer from a bacterial donor.

iii) The tRNA (m1A) MTase from P. abyssi

The modified nucleoside 1-methyladenosine (m1A) can be found at several positions of various tRNAs, namely positions 8, 9, 14, 22, 57 and 58, m1A58 being the only one found in tRNAs from organisms belonging to the three domains of life. The presence of m1A at position 57 is specific to archaeal tRNAs, m1A57 being an obligatory intermediate in the biosynthesis of 1-methylinosine (m1I) (139). In yeast, m1A58 formation is catalyzed by a heterotetrameric enzyme composed of two evolutionarily related subunits, Gcd10p and Gcd14p. The Gcd10p subunit is responsible for tRNA binding, while Gcd14p binds AdoMet and catalyzes the methylation reaction (140). In thermophilic bacteria and hyperthermophilic archaea, m1A58 formation is catalyzed by an enzyme homologous to the yeast Gcd14p MTase, but this enzyme is a homotetramer of a Gcd14p-like protein. The enzyme from the bacterium Thermus thermophilus is site-specific, acting only at position 58 of tRNAs, while the P. abyssi enzyme is region-specific, acting on both positions 57 and 58 of tRNAs (112). m1A58 in tRNA makes an unusual reverse Hoogsteen pair with T54 (see figure
