Thesis
Reference
Translational control of the iab-8 “lncRNA”: a CNS-Specific transcript from the Bithorax complex involved in fly sexual behaviors
FREI, Yohan
Abstract
The iab-8 transcript, expressed in the CNS, is essential in Drosophila for fertility in both sexes.
After two unsuccessful attempts at generating a reporter line that could recapitulate the pattern of expression of the iab-8 RNA, our project became intertwined with a second project centered on a micropeptide that could be expressed from a version of the iab-8 RNA called msa. Using transfection in tissue culture cells and in vivo transgenic constructs, we characterized short open reading frames present in the first two exons of the iab-8 transcript that act as translational repressors for the micropeptide. Using what we learned, we generated a Gal4 reporter line that allowed for the characterization of the pattern of expression of the iab-8 RNA in different tissues at both the larval stage and in the adult.
These findings led to numerous hypotheses about how the iab-8 RNA may function in reproduction in males and females.
FREI, Yohan. Translational control of the iab-8 “lncRNA”: a CNS-Specific transcript from the Bithorax complex involved in fly sexual behaviors. Thèse de doctorat : Univ.
Genève, 2019, no. Sc. 5396
DOI : 10.13097/archive-ouverte/unige:125556 URN : urn:nbn:ch:unige-1255563
Available at:
http://archive-ouverte.unige.ch/unige:125556
UNIVERSITÉ DE GENÈVE FACULTÉ DES SCIENCES Département de Génétique & Évolution Professeur François Karch
Translational Control of the iab-8 “lncRNA”: a CNS-Specific Transcript from the Bithorax Complex Involved in Fly Sexual Behaviors
THESIS
presented to the Faculty of Science of the University of Geneva to obtain the degree of Doctor of Science, specialization in Biology
by Yohan Frei
from France
Thesis no. 5396 Geneva
Atelier d’impression numérique-Repromail 2019
UNIVERSITÉ DE GENÈVE
FACULTÉ DES SCIENCES
DOCTORATE IN SCIENCE, SPECIALIZATION IN BIOLOGY
Thesis of Mr Yohan FREI
entitled
«Translational Control of the iab-8 "lncRNA": a CNS-Specific Transcript from the Bithorax Complex Involved in Fly Sexual Behaviors»
The Faculty of Science, on the recommendation of Mr F. KARCH, full professor and thesis director (Department of Genetics and Evolution), Mr I. RODRIGUEZ, full professor (Department of Genetics and Evolution), Mr R. MAEDA, doctor (Department of Genetics and Evolution), and Mr S. GOODWIN, professor (Department of Physiology, Anatomy and Genetics, Oxford University, Sherington, United Kingdom), authorizes the printing of this thesis, without expressing any opinion on the propositions stated therein.
Geneva, 14 October 2019
Thesis - 5396 -
The Dean
TABLE OF CONTENTS
RÉSUMÉ
ABSTRACT
INTRODUCTION
1-INTRODUCTION TO THE HOMEOTIC GENES
The nature of the homeotic genes
Role and organization of the homeotic genes
Drosophila, a classical model for Hox gene study
2-THE BITHORAX COMPLEX IN DROSOPHILA
Dissecting the BX-C, an Ed. B. Lewis narrative
Introduction to the Bithorax complex
The open for business model
Transcription in the Bithorax complex
3-THE BX-C AND THE IAB-8 LONG NON-CODING RNA
Long non-coding RNAs in the BX-C
Specific regulation implicating the iab-8 transcript
The sterility phenotype involving the iab-8 ncRNA
The male sterility phenotype and the muscle of Lawrence
iab-8 ncRNA and CNS
4-THE IAB-8 NCRNA, A POTENTIAL SOURCE FOR MICROPEPTIDES
The exon-8 miPep, introduction to micropeptides
Example of Two micropeptides during development
5-THE IAB-8 NCRNA REPORTER LINE PROJECT
6-OPEN READING FRAMES IN REGULATING TRANSLATION
Upstream and overlapping ORF for CDS regulation
Examples of uORF regulating genes translation
Short ORF regulation in Drosophila
RESULTS
1- IAB-8 NCRNA, CHARACTERIZATION OF A CRYPTIC TRANSLATIONAL REGULATORY ELEMENT
Inserting a Gal4 coding sequence within exon-2 of the iab-8 ncRNA
2-THE IAB-8 NCRNA EXON-8 MICRO-PEPTIDE EXPRESSION
3-ATTEMPT TO UNLOCK MCHERRY INHIBITION BY DELETING EXON-2
Conclusion of the in vivo study
4-USING S2-CELL TRANSFECTION TO IDENTIFY THE TRANSLATIONAL REGULATION MECHANISM
4.1 Translation repression or translation activation?
4.2 Repressive mechanism cannot be linked to secondary structure
4.3 Translation repression regulated by uORFs
4.4 Validation in Drosophila
4.5 Study on the iab-8 RNA potential for peptide
uORF expression in Drosophila
5-CONSERVATION OF ORF1&2
ORF 1 conservation
ORF 2 conservation
6-GENERATION OF THE IAB-8-EXON-1-PROMOTER PLATFORM
7-STUDY OF ORF1 AND ORF2 IN THE IAB-8RNA
8- IAB-8RNA5’ ALTERNATIVE START SITE
9.2 Expression pattern in adult CNS, the Gal4-62 line
9.3 Projection on the reproduction tract with the Gal4-62 line
9.4 Expression pattern in larvae, the Gal4-74 line
9.5 Innervation of the terminalia in the Gal4-74 line
9.6 Using trans-Tango with Gal4-74 to determine downstream connection in males
GENERAL DISCUSSION
1-ANALYSIS OF THE IAB-8RNA FIRSTS EXONS
2- IAB-8RNA PEPTIDES
3- IAB-8RNA NEW PATTERN OF EXPRESSION
iab-8 RNA expression in larvae
iab-8 RNA expression in the adult
Innervation of the digestive tract
Connections on the adult terminalia
CONCLUSION
TABLES OF FIGURES
ABBREVIATIONS:
MATERIEL AND METHODS
APPENDIX
BIBLIOGRAPHY
Acknowledgements
GIBBS
-We'd best drop canvas, sir!
JACK
-She can hold a bit longer.
The wind picks up, howling.
(Jack smiles)
GIBBS (shouts)
-What's in your head to put you in such a fine mood?
JACK (shouts)
-We're catching up!
Jack turns back to the sea, enjoying himself.
Gibbs stares at him like he's a crazy man.
Pirates of the Caribbean: The Curse of the Black Pearl (2003)
I would like to thank the external members of my thesis committee, Pr. Stephen Goodwin from Oxford University, a specialist in the neurobiology of the Drosophila reproductive tract, and Pr. Ivan Rodriguez, whose laboratory works on neurobiology in mice and Drosophila and who is also our head of department. They both agreed to take time out of their busy schedules to be present for me, to evaluate my work, come to my defence, challenge me and judge me; I consider this a huge favor. I was very happy when they both accepted to become members of the jury for my defence. I would like to thank Pr. Goodwin personally; during his stay in Geneva, our inspirational conversations, his natural positive energy and all his suggestions about what to do with our results really dug out my scientific motivation, which had been tamed by two months of preparation for the D-day. I really appreciated his presence at this moment. I would also like to thank Pr. Rodriguez; he evaluated me during my master's, and he and François gave me my Master's degree 6 years ago. I really wanted him to accept becoming one of the jury members of my thesis defence, simply because it felt important to me to show my progress to somebody who had already evaluated my work in the past.
I would like to thank François, my PhD director. First of all, he took the risk of accepting me as a PhD student in his laboratory and trusted me, and I would like to thank him for that first. During all my PhD projects, I could not have dreamed of a boss as involved in my work as he was, and I know that this is rare and was energy-consuming for him. Despite all our “electric” conversations, he always considered my scientific ideas as worthy as any, always encouraged me to give my opinions, and then challenged them. This contributed to shaping my scientific skills and mindset and motivated me to always go further. He had to deal with my stubbornness and originality; thinking back, I feel very thankful that François had the patience to deal with me. I hope I was not as hard on his back as a crazy wild horse and that the work I produced under his supervision was worthwhile for him. François, I do not regret all these years, I hope you don’t either.
I would like to thank Rob Maeda, who supervised my work. Rob was very supportive and always available. I feel bad that I was so demanding of him; he helped so much, and even when I was not sure about a hypothesis, Rob always took it seriously and helped me develop it. Rob helped a lot with the project and during the writing of the thesis, and it is partially thanks to him that I was able to do what I did. Thank you Rob for your energy, original ideas and motivation, I will never forget it.
I would like to thank Dragan, who supervised my master's in the lab. Dragan was a bit like Dr Jekyll and Mr Hyde, a party monster during the night and a serious scientist during the day. I had quality time with him in both of these circadian periods. He always pushed me, and even after he was no longer in charge of me, he often offered to help me with my project.
I would like to thank Fabienne; she was very helpful during my “injection” period. She also helped a lot with plasmid design and genetic engineering in general, and she provided lots of primers that saved me a lot of time. Thank you for always being available and for helping me in my project.
I would like to thank Annick, for her positivity, her help when I wanted to design new experiments and didn’t know how to do it, and also for just asking, from time to time, if everything was alright.
I would like to thank Daniel, Clement, Virginie, Elodie and Nicolas. They sometimes listened to me, gave valuable opinions about my experiments and were also available for casual conversation and jokes, which helped me a lot to cope with the ups and downs of scientific research. They were both good friends and colleagues during my stay. A special mention to Clement, who taught me to work on tissue culture cells, and was a good desk neighbor.
I would also like to thank Mikael and Bastien, the “squirrels” of the lab who did their master's with us. They bring joy and freshness. Everybody is always happy to see a squirrel, time stops when you see one and happiness comes.
I would like to thank Pr. Roman Ulm and Pr. Thomas Schalch, the members of my TAC committee. Their involvement and guidance in my project were very helpful; thank you for your involvement in my work.
I would like to thank Javier and Ghyslaine, this dynamic duo sparkled the laboratory ambiance with original anecdotes and surprises.
I would also like to thank Jorge, Benjamin and Eva for their excellent technical support but also for their contribution to the good atmosphere in the lab.
I would like to thank our bio-imaging platform specialists, Christoph and Jérôme. Both of you were particularly available to help with the microscope settings when I began using the confocal, and thanks to you, I was able to take good pictures, which is very important.
I would like to thank Dani Garaulet; during a brief stay in the lab, he showed me how to dissect adult Drosophila brains, which saved me a lot of time and frustration. Despite the fact that we are working on the same subject and could be in competition, Dani didn’t care and provided great help. This is very rare and I would like to thank him for it.
I would like to thank our administrative team; they were always available for any question or request. I would like to thank all the secretaries of the “secretariat des étudiants” and the “secretariat de Biologie”, and of course Mme Corinne Mathey-Ebener, the administrative manager of our department.
I would like to thank Pr. Emi Nagoshi and the members of her laboratory, specifically Luca, Pedro, Anatoly and Rafael. They provided some antibodies, they set up a behavior assay for me and they provided some Drosophila lines.
Thank you for being so nice and available.
I would like to thank the “International PhD program in Life Sciences” doctoral school and the FNS for funding my research. I was able to do my PhD in good conditions thanks to these two important institutions.
I would like to thank my closest friends from the University, principally the “manger” group, Dr. Alexis Assens, Stephane Hagmann, Dr. Pedro Machado, Ivan Dalvit and Quentin Dietschi. It is important to have true friends in a place of work. Our daily meetings during lunch breaks helped me a lot to change my perspective on things, and they also gave very good advice and ideas about the project. They were also very helpful when I needed something that we didn’t have in the lab, such as binoculars, antibodies, restriction enzymes, etc.
I would like to thank equally all my very good friends from Geneva, and specifically Anthony Jos, Amine Ouadahi (the NN secret group), Alban Guardien, Alban Damoneville, cpain de moi Brian Rocher, Prim Baert, Geofrey Debril, Sanja Vuckovic, Stephanie Taillé, Riccardo Bocchi, Yvan Gonzales, Lorris Cavagliotti, Alexis Kauffmann, L’anglais, La zone and L’embrouille. They are my friends, and they were very supportive in my PhD, thanks a lot guys!
Of course I would like to thank my mother, my step-father, my two brothers and my sister. My family helped a lot to make my life easier, always giving me a hand and being very supportive during my studies, I am very lucky to have such a family. My in-laws were also particularly helpful, Urs, Silke, Elisabeth and Dominic, thank you guys!
Above all, I would like to thank Eva, my wife, for her patience, her positive energy and her involvement. Thank you for being my voice of reason, for helping me find the answers to my questions, and for giving me the courage to try. And thank you for giving me the most beautiful baby boy, our little Paul. You created a home and helped me see that there is a bright future waiting for me.
Résumé
The homeotic genes are at the very essence of our development; by coordinating gene expression during embryogenesis, these transcription factors are responsible for the body plan of living beings. The homeotic genes were first described in Drosophila, mainly through the work of Ed. B. Lewis on the Bithorax complex (BX-C) nearly 70 years ago. In Drosophila, the BX-C is a 300 kb genomic region located on the third chromosome that comprises three homeotic genes called Ubx, abd-A and Abd-B, which specify the identity of the segments that form the posterior part of the thorax, the entire abdomen and the anal and genital structures.
The regulation of these genes along the A/P axis is mediated by 9 cis-regulatory domains that are aligned along the chromosome in a manner collinear with the segments they specify. Although more than one homeotic gene is often present in a given segment, most of the time no cell expresses two adjacent homeotic genes at the same time, because the more posterior homeotic gene actively represses the expression of the more anterior homeotic genes; this feature is defined as the posterior dominance rule.
However, as every rule has its exception, in the central nervous system ABD-A is not repressed by ABD-B but by the activity of a transcription unit, the iab-8 lncRNA, which is 94 kb long and covers the intergenic region between Abd-B and abd-A. Interestingly, this RNA is spliced, taking one exon from each of the cis-regulatory domains; after splicing, the size of this RNA is 2.8 kb. At the very first stages of development, this transcript is expressed in the epidermis in PS13 and PS14, but from the moment the central nervous system (CNS) develops, its expression pattern quickly becomes restricted to PS13 and PS14 of the CNS.
In the intron that separates exons 5 and 6, the iab-8 lncRNA encodes a microRNA called miR-iab-8. This miRNA has been shown to target several transcription factors, such as abd-A and Ubx, exd and hth. This microRNA plays a crucial role in Drosophila fertility. Indeed, while miR-iab-8 mutant males are unable to bend their abdomen sufficiently to copulate, in females the oocytes fail to transit along the oviduct.
The expression pattern of the iab-8 transcript has mainly been characterized by in situ hybridization experiments. This technique is particularly difficult to perform during the larval and pupal stages and in adult flies, and very little is known about the structures of the nervous system that express the iab-8 ncRNA or about its involvement in the control of reproductive behaviors. To deepen our knowledge of this subject, we attempted to establish a Drosophila line that would recapitulate the expression pattern of this transcript. As a first step, we used reporter gene sequences in exon-2 or in exon-3. Unfortunately, these constructs produced no signal, meaning that the reporter genes were not translated from these sites.
While carrying out a parallel study on the expression of a micropeptide (miPep) present in the last exon of the iab-8 lncRNA and of a variant of this transcript called msa, members of our laboratory discovered that the miPep could be expressed from the msa mRNA but not from that of the iab-8 lncRNA. Having discovered that this peptide was differentially expressed from the two transcripts, we then decided to study the nature of the mechanism that prevented its expression in the CNS.
Using cultured cells as an experimental model, we identified several ORFs in the first two exons of the iab-8 lncRNA transcript that act as translational repressors of downstream sequences. Using transgenic constructs, we confirmed these results in vivo and concluded that expression of the miPep from the iab-8 lncRNA messenger RNA was inhibited by the translation of several upstream ORFs.
In the hope of determining the biological function of these uORFs, we generated a new ΦC31-mediated cassette exchange platform allowing us to modify at will the region containing the promoter and exon-1 of the iab-8 lncRNA. Using this platform, we generated several fusions of these ORFs with the fluorescent protein GFP, but these constructs do not produce detectable levels of GFP.
We then decided to use the injection platform that allows modification of exon-1 to generate a reporter line for the iab-8 lncRNA, as we had wished at the beginning of my doctorate. We therefore inserted the sequence of the Gal4 reporter gene upstream of the sequences containing the repressive uORFs of exon-1, and integrated this construct into our platform. Using different UAS-reporter gene constructs, we identified a group of about 90 iab-8-lncRNA+ neurons located at the posterior end of the ventral nerve cord of embryos, larvae and adult flies. Using postsynaptic labeling techniques, we showed that the iab-8-lncRNA+ neurons establish postsynaptic connections with neurons of the CNS. We also found that the iab-8-lncRNA+ neurons send projections through the median abdominal nerve and innervate almost all the organs that make up the reproductive system of the fly, in both females and males, thus establishing a physical link between the neurons that express miR-iab-8, which is critical for fertility, and the genital tract.
Abstract
The homeotic genes are at the very essence of our development, coordinating gene expression during embryogenesis to specify the structures that form along our antero-posterior axis (as well as the proximo-distal axis of the tetrapod limbs). The homeotic genes were first described in Drosophila through the work of Ed. B. Lewis on the Bithorax Complex, nearly 70 years ago. In Drosophila, the Bithorax complex (BX-C) is a ~300 kb genomic region located on the third chromosome; this complex contains three homeotic genes called Ubx, abd-A and Abd-B that specify the segments forming the posterior thorax and the abdomen. Their regulation along the A/P axis is mediated by nine cis-regulatory domains aligned along the chromosome in a fashion collinear with the segments whose development they control. Although more than one homeotic gene can be expressed within the same segment, most cells do not co-express two adjacent homeotic genes at the same time. This is because the more posterior Hox genes repress the expression of the more anterior ones. This generalization is known as the posterior dominance rule.
However, as with many rules, one exception has been characterized. In the central nervous system (CNS), downregulation of ABD-A is not mediated by ABD-B. Instead, ABD-A is repressed by a long non-coding RNA called the iab-8 lncRNA. The iab-8 lncRNA is 94 kb long, spanning the entire intergenic region between abd-A and Abd-B. Interestingly, this transcript is spliced, capped and polyadenylated. After splicing, the transcript size is reduced to only 2.8 kb. At the early stages of development, this transcript is expressed in PS13 and PS14 of the posterior epidermis. From the time the nervous system develops, however, its expression seems to become restricted to, and maintained in, PS13 and PS14 of the CNS.
The iab-8 lncRNA is the template for the production of a microRNA called miR-iab-8. The sequence of this miRNA is located in the intronic region between exons 5 and 6 of the iab-8 lncRNA. It has been demonstrated that miR-iab-8 targets multiple transcription factors such as abd-A and Ubx, hth and exd. Flies mutant for the iab-8 lncRNA are completely sterile in both sexes; female flies do not lay eggs and males cannot bend their abdomen to copulate.
The pattern of expression of the iab-8 transcript was mainly determined by in situ hybridization experiments. Due to the difficulties in performing in situ hybridization in larval and adult tissues, little is known about its expression at later stages of development and hence about its role in reproduction. In order to tackle these questions, we decided to generate a reporter line that could recapitulate its pattern of expression. Using two ΦC31-mediated landing platforms, we inserted reporter gene sequences into the native sequence of both iab-8 lncRNA exons 2 and 3. Unfortunately, we were not able to document reporter gene expression in either of these two lines.
In parallel to these studies, we started investigating the expression of a potential micropeptide (miPep) present in the last exon of the iab-8 lncRNA and a splice variant of this transcript, called msa. Members of our laboratory
discovered that the miPep protein could be made from the msa template but not from the iab-8 lncRNA. Discovering that this peptide was differentially expressed
between the two transcripts, we decided to study the nature of the mechanism that prevented miPep expression from the iab-8 lncRNA in the CNS.
Using tissue culture cells as a model, we identified several upstream ORFs in the iab-8 lncRNA exons 1&2 that act as translational repressors. Using transgenic constructs, we confirmed these results in the fly and concluded that miPep expression from the iab-8 transcript was prevented by the translation of several uORFs.
To investigate the biological function of these uORFs, we generated a new ΦC31-mediated cassette exchange platform to modify the exon-1/promoter region of the iab-8 lncRNA within the BX-C. Using this platform, we generated multiple ORF-GFP knock-in fusions, but unfortunately, our preliminary results with these lines do not show detectable levels of GFP.
Given that the original goal of my project was to generate a reporter line for the iab-8 lncRNA, we then decided to use the exon-1 platform and what we learned about uORFs to create this reporter. For this, we inserted the Gal4 coding sequence in exon-1 of the iab-8 RNA, upstream of the repressive uORF, and integrated this into our new exon-1/iab-8 RNA promoter platform. Using this line in combination with different UAS reporter gene constructs, we identified a cluster of ~90 iab-8-lncRNA+ neurons located at the posterior tip of the VNC in both larvae and adults. Using postsynaptic labeling techniques, we show that the iab-8-RNA+ neurons make postsynaptic connections with neurons within the CNS and also identify iab-8-RNA+ neuronal projections that travel through the Abdominal Nerve Trunk to innervate almost all the organs of the internal terminalia (including most of the reproductive tract) in both males and females. These physical links between the neurons that express the iab-8 lncRNA and the reproductive organs they innervate can now be investigated for their role in fertility and their modulation of characteristic physical changes that occur after mating.
INTRODUCTION
1- Introduction to the homeotic genes
The nature of the homeotic genes
The homeotic genes are important during embryogenesis for all bilaterians. These genes are usually expressed from the gastrulation stage throughout the rest of the life of the organism (Iimura & Pourquié 2006). Each homeotic gene is responsible for the correct patterning of significant body portions (reviewed in (Deschamps & Duboule 2017)). These genes have different roles, but they all share a similarity in their sequence: they contain the motif coding for the homeobox domain. This domain allows the protein to bind to DNA and trigger the expression or repression of other genes. This is why homeotic genes, or Hox genes, belong to the category of genes encoding Transcription Factors (TF) (reviewed in (Gehring 1987)). Each TF can affect the transcription of one or multiple genes by binding to specific DNA sequences called enhancers or repressors. It is the binding of the transcription factors to their enhancer/repressor sequences that triggers the series of molecular mechanisms controlling the transcription of the associated gene (see Fig. 1).
Fig. 1 Enhancer-mediated gene activation by Transcription Factor activity. In this graphic representation, the horizontal lines represent DNA sequences. Each specific sequence is color-coded: in blue, the enhancer sequence; in red, the promoter sequence; and in green, the gene sequence. The black line represents non-specific sequences. The blue box represents a Transcription Factor, and the black arrow indicates the recognition and pairing of the TF to its specific binding sequence, the enhancer (top). The action of this binding activates the promoter (blue arrow), leading to transcription (red arrow) of the gene (green wavy line).
Role and organization of the homeotic genes
During development, the homeotic genes act as molecular master
regulators; they are responsible for the correct expression of thousands of genes.
The misexpression of one Hox gene can have a significant impact on the future organism, often resulting in a homeotic transformation in which a body part is replaced by another body part that normally develops elsewhere on the body of the organism (Bateson 1894). It was by analyzing these defects that the scientific community was able to determine the function of these genes.
Homeotic gene expression is tightly regulated along the anterior-posterior axis of the body and is also regulated along the proximo-distal axis of the developing limbs. During embryonic development, the Hox genes are expressed in a specific pattern; this triggers the localized expression of specific subsets of genes, leading to the formation of the different body parts (Reviewed in (Luo et al. 2019)).
Interestingly, the homeotic genes are not randomly positioned on the chromosome. First, they are grouped in clusters, and the number of clusters depends on the phylum. Second, the order of the Hox genes along the DNA sequence seems to be important. Indeed, their expression follows a temporal regulation that results in a spatial pattern in vertebrates (Izpisúa-Belmonte et al. 1991; Lewis 1978). This expression is collinear with their alignment along the chromosome. For instance, in mice, there are 39 Hox genes distributed into four different Hox clusters called HoxA, HoxB, HoxC, and HoxD (Reviewed in (Luo et al. 2019)). Each gene is named with a letter and a number: the letter corresponds to the cluster name, and the number is attributed according to its position in the complex (from 3’ to 5’ relative to their transcription; see Fig. 2A). During limb formation, for instance, Fig. 2B shows that the more 5’ the Hox gene is, the more distal and the later its expression pattern is. Fig. 2C illustrates that Hox genes are essential for the correct development of specific body portions: some deletions of Hox components result in missing body structures in the future animal. The more distal Hox gene deletions result in the loss of the whole digit anatomy, while medial Hox gene deletions result in the loss of the central parts of the limb (Reviewed in (Denis Duboule 2007)).
Fig. 2 Mammalian Hox architecture, expression and influence. (A) General representation of the mammalian Hox gene clusters. The black line represents the DNA sequence, and the green boxes the Hox genes. The 39 Hox genes are distributed into four different Hox clusters called HoxA, HoxB, HoxC, and HoxD. Each gene name is coded with a letter and a number: the letter corresponds to the cluster name, and the number is attributed by the position along the 3’ to 5’ DNA sequence relative to the Hox gene sequences (adapted from (Mallo 2018)). (B) Graphic representation of mouse limb buds with the patterns of expression of the HoxA and HoxD genes during early and late phases of development. The top half shows the HoxA cluster, and the bottom half the HoxD cluster (genes are in the center (black boxes)). From left to right, different Hox gene expression patterns are represented in the mouse limb bud for early (top) and late (bottom) developmental stages (adapted from (Denis Duboule 2007)). (C) Graphic representation of the developmental effect of different Hox gene deletions (bottom text) on mouse forelimb development. From left to right, the first representation shows the correct development of the forelimb. In the second representation, the more distal anatomy is missing in a HoxA13; HoxD13 deletion. The third representation displays the loss of the central parts of the forelimb when medial Hox genes are deleted. The last representation shows the absence of forelimb development when the two entire Hox clusters are deleted. S stands for stylopod (upper arm); Z for zeugopod (lower arm) and A for autopod (carpus and digits) (adapted from (Denis Duboule 2007)).
Drosophila, a classical model for Hox gene study
Several animals are widely used by the scientific community for Hox gene studies. The mouse, the frog, and the fruit fly have been the best characterized models (Krumlauf 2018). In order to try to decipher the mechanism that regulates human embryonic development, the advantages associated with the use of a vertebrate model are easy to understand. Studies on such animals allowed the scientific community to understand the mechanism of at least ten human genetic diseases (Reviewed in (Quinonez & Innis 2014) ).
However, the Hox genes were first found in Drosophila melanogaster, and for molecular genetic studies, the fly holds a number of advantages. First, it was the second animal whose genome was sequenced (Adams et al. 2000), after the nematode C. elegans (C. elegans Sequencing Consortium 1998). The DNA sequences of several other Drosophila species have now been determined and can be accessed on many online platforms (flybase.org, ncbi…). Second, any scientist who chooses Drosophila as a model organism can build on decades of results provided by the scientific community. This exceptional amount of knowledge is a plus for planning experiments and interpreting results. Third, Drosophila are easy to grow in the lab; their life cycle is fast (generation time of 10 days) and they produce a large number of offspring, growing on an inexpensive medium. These advantages allow scientists to explore different hypotheses and create new protocols to generate fast and statistically significant results.
Some biological features now known to be common to most animals were first described in Drosophila. Five of these discoveries were eventually recognized with Nobel Prizes:
• 1933: awarded to Thomas Hunt Morgan for his work linking chromosomes and heredity.
• 1946: awarded to Hermann Joseph Muller for his work on X-ray radiation and DNA mutations.
• 1995: awarded to Edward B. Lewis, Christiane Nüsslein-Volhard and Eric F. Wieschaus for their discoveries on the genetic control of early embryonic development, work that had a high impact on the field of developmental biology.
• 2011: awarded to Bruce A. Beutler and Jules A. Hoffmann for their work on innate immunity.
• 2017: awarded to Jeffrey C. Hall, Michael Rosbash and Michael W. Young for their discoveries on the molecular mechanisms that control the circadian clock.
(Source : nobelprize.org)
The life cycle of Drosophila melanogaster takes approximately ten days at 25°C. The female fly first deposits an egg of approximately 0.5 mm (±0.003) (Markow et al. 2009). After 24 hours, a small larva (~0.5 mm) called L1 (first larval stage) emerges from the egg. This larva feeds and grows for the next four days, passing through the L2 stage (~2.5 mm) and finally forming an L3 larva (~4 mm) (Ormerod et al. 2017). At day 5, the larva enters metamorphosis: it forms a pupa, a stage that lasts five days. During metamorphosis, most of the larval tissues are digested and new tissues are made, leading to the formation of an adult fly that is able to reproduce.
The body plan of a fly is composed of different structures called segments, distributed along the A/P axis. The head (mandibular, maxillary, and labial segments) is followed by the three thoracic segments (T1, T2 and T3). Each of the thoracic segments carries a pair of legs; in addition, the second thoracic segment carries a pair of wings and the third a pair of specialized flight organs called the halteres. Following the thorax, the abdomen of the fly is composed of seven segments (A1 to A7) in females and six in males (reviewed in (A et al. 2004)). During the ten days of development, homeotic genes are expressed in different tissues to coordinate the expression of genes essential for the correct development of the fly. The homeotic genes in Drosophila are located in two different clusters, both on the third chromosome, called the Antennapedia complex (ANT-C) and the Bithorax complex (BX-C). The ANT-C is responsible for the development of the anterior segments, while the BX-C determines the identity of the segments from the third thoracic segment to the last abdominal segment (Lewis 1978).
2- The Bithorax complex in Drosophila
Dissecting the BX-C, an Ed. B. Lewis narrative
Edward B. Lewis was one of the three laureates of the 1995 Nobel Prize in Medicine, awarded for his contribution to developmental biology in elucidating the components of the Bithorax complex. He was also the first to propose that the BX-C displays spatial colinearity.
In 1946, Lewis started to work on several mutations exhibiting a homeotic transformation of the third thoracic segment (T3) toward the second thoracic segment (T2), called bithorax (bx) (Lewis 1992). These mutants display a second pair of rudimentary wings on the third thoracic segment, which usually carries the halteres. First characterized by Bridges in 1915 (as reported by (Dobzhansky 1968)), the bx1 allele was weak and variable. Later, Stern found a more penetrant mutant called bx3 (see Fig. 3) (as reported by (Dobzhansky 1968)). This last mutant is more commonly used due to its relative penetrance (Lewis 1992). By combining different mutations that affect the development of the segments, such as bx3, pbx (postbithorax) and Ubx (Ultrabithorax), Lewis noticed different types of interactions. This first led Lewis to think that BX-C regulation was organized in an “operon-like” fashion. Some mutations affected one segment (bx3, pbx) while others affected several consecutive segments (Ubx) (Lewis 1963).
Fig. 3 Classical mutations of the BX-C. (A) and (B) display photographs of the dorsal view of Drosophila males. (A) Wild-type male (mutant for held out in order to spread the wings for a better view of the thorax and abdomen). (B) abx pbx bx3 mutant fly, showing the transformation of the third thoracic segment into a duplicate of the second thoracic segment (Lewis 1992).
In his 1978 review, Lewis reported his analysis of an impressive number of X-ray-induced mutations of the BX-C. By analyzing the phenotypes and mapping all his mutants, he discovered that the BX-C is organized into nine segment-specific functions. Lewis noticed that each function was important for determining the identity of one of the segments that compose the posterior half of the fly. Mutations affecting the identity of different parts of T3 were named abx (mostly active in the dorsal part of T3), bx (mostly active in the anterior part of T3) and pbx (mostly active in the posterior part of T3). Mutations affecting the first abdominal segment were named bxd. Finally, mutations affecting the identities of the 2nd through 8th abdominal segments were named iab-2 through iab-8, respectively.
By mapping these regions, Lewis noticed that they were organized along the chromosome in a striking order. The mutations that affect the third thoracic segment were located toward the centromere, while mutations affecting more posterior segments mapped progressively closer to the telomere. This revealed a spatial colinearity between the order of the homeotic elements along the DNA sequence and the order of the segments they affect along the A/P axis (Lewis 1978).
Lewis also successfully isolated a complete BX-C deletion, which is lethal at a late embryonic stage. However, by analyzing the tracheal and cuticular patterning of the embryos, he was able to determine that the absence of the BX-C led to all abdominal segments displaying a T2-like identity. Based on this, Lewis postulated that the thoracic-like segment must be the ground-state pattern, and that the more posterior a segment is, the more “advanced” it is from this ground state. The second main hypothesis of his work was that, along the A/P axis, as soon as a BX-C “gene” is activated in one segment, it remains active from its anterior limit to the last posterior segment, in an additive fashion (Lewis 1978).
During the 1980s, Prof. D. Hogness led the molecular analysis of the BX-C, mostly by analyzing the segment-specific mutants published by Lewis in 1978. In 1983, the map of the centromeric half of the BX-C was described using chromosome walking (Bender et al. 1983), and the complete map was published in 1985 (Karch et al. 1985). It was through this work that the BX-C was first cloned. These studies molecularly confirmed Lewis’s spatial colinearity rule, mentioned above.
BX-C Hox gene identification also happened during the mid-80s. The three Hox genes of the complex, Ubx, abd-A and Abd-B, correspond to the three embryonic-lethal complementation groups of the BX-C (Sánchez-Herrero et al. 1985; Tiong et al. 1985). In 1983, the Gehring and Kaufman labs were performing similar work on the ANT-C and identified the Antp transcription unit (Garber et al. 1983; Scott et al. 1983). Relatively rapidly, sequence comparison between Ubx and Antp allowed for the identification of the Antp homeobox. Then, additional DNA hybridization experiments showed the presence of two additional homeobox sequences in the BX-C that turned out to correspond to the abd-A and Abd-B genes (McGinnis et al. 1984; Scott & Weiner 1984; Regulski et al. 1985; Maeda & Karch 2009). Finally, the presence of only three homeobox sequences in the BX-C was confirmed by the sequencing of the whole BX-C region in 1995 (Martin et al. 1995).
Introduction to the Bithorax complex
The BX-C is a 300 kb genomic region containing three homeotic genes, Ubx, abd-A, and Abd-B. Each Hox gene is responsible for determining the identity of a subset of segments during development. Ubx determines the identity of the segments from the posterior half of the second thoracic segment (T2) to the first abdominal segment (A1), abd-A determines the identity of the segments from the second abdominal segment (A2) to the fourth abdominal segment (A4), and Abd-B determines the identity of the segments from the fifth abdominal segment (A5) to the posterior end of the fly (see Fig. 4) (reviewed in (Maeda & Karch 2006)).
Fig. 4 Synopsis of the BX-C in D. melanogaster. At the top is a representation of a Drosophila fly from the head (left) to the abdomen (right). The horizontal black line represents the DNA sequence of the BX-C in kilobases. The limits of the cis-regulatory domains are indicated by brackets and names (above the line) and by colored boxes (below the line). The bottom part displays the splicing pattern of the three homeotic genes Ubx, abd-A and Abd-B. The color code associates each Hox gene with its cis-regulatory domains and the segments whose identity it determines. As an example, the domains shaded in red drive Ubx expression during development and are responsible for the correct patterning of the segments shown in the same color.
(adapted from (Maeda & Karch 2006)).
Lewis identified nine segment-specific functions important for determining segment identity; however, the BX-C contains only three homeotic genes (Ubx, abd-A, and Abd-B). One could wonder how the nine segment-specific functions and the three Hox genes interact to coordinate the identity of nine different segments of a fly. Thanks to the mutational analysis first described by Lewis (Lewis 1978) and the cloning of the BX-C in the mid-80s (Bender et al. 1983; Karch et al. 1985), a model of BX-C gene regulation was first proposed in 1987 (Peifer et al. 1987).
The model proposes that the BX-C is divided into regulatory regions that drive Hox gene expression in a segment-specific manner. These regions are called cis-regulatory domains, and each is believed to control the expression of a single homeotic gene in a specific parasegment. During development, a Drosophila embryo is rapidly segmented into 14 metameric units called parasegments (PS) along the A/P axis. As shown in Fig. 5A, each parasegment is shifted slightly relative to the segments, being composed of the posterior part of one segment and the anterior part of the next. The pattern of expression of the BX-C Hox genes is parasegment-specific in the posterior half of the embryo, which corresponds to the posterior half of T2, T3 and the abdomen of the adult fly. During development, the abx/bx and bxd/pbx cis-regulatory domains control the expression of Ubx to determine the identity of PS5 and PS6, the iab-2, iab-3 and iab-4 domains drive the expression of abd-A in PS7 to PS9, and iab-5 to iab-8,9 drive the expression of Abd-B from PS10 to PS14. As an example shown in Fig. 5B, the Abd-B homeotic gene is expressed in PS10 to PS14. In each parasegment, the pattern of expression of Abd-B is different; this difference in the pattern of expression is believed to be the primary key to segment identity.
For example, a mutation removing the whole iab-7 domain changes the Abd-B expression pattern in PS12 to one resembling that of PS11, and causes a corresponding transformation of A7 into A6 (see Fig. 5B) (reviewed in (Maeda & Karch 2006)).
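The domain-to-parasegment logic described above can be sketched in a few lines of Python. This is a simplified illustration and not a model taken from the cited literature: the one-domain-per-parasegment table is read off the text (PS14 and the iab-8,9 region are left out for simplicity), while the function name `pattern_after_deletion` and the rule that a parasegment lacking its domain displays the pattern of the closest intact anterior domain are assumptions generalized from the iab-7 example.

```python
# Simplified sketch: each cis-regulatory domain is assigned to the parasegment
# (PS) whose identity it specifies, together with the Hox gene it drives.
# Assumption: one domain per parasegment, PS5 to PS13 only.
DOMAINS = [
    ("abx/bx", "Ubx", 5),
    ("bxd/pbx", "Ubx", 6),
    ("iab-2", "abd-A", 7),
    ("iab-3", "abd-A", 8),
    ("iab-4", "abd-A", 9),
    ("iab-5", "Abd-B", 10),
    ("iab-6", "Abd-B", 11),
    ("iab-7", "Abd-B", 12),
    ("iab-8", "Abd-B", 13),
]


def pattern_after_deletion(deleted):
    """Return {parasegment: parasegment whose pattern it displays}.

    Hypothetical rule: a parasegment whose domain is deleted displays the
    pattern set by the closest intact domain anterior to it (None if there
    is no intact anterior domain).
    """
    result = {}
    last_intact = None
    for name, _gene, ps in DOMAINS:
        if name not in deleted:
            last_intact = ps
        result[ps] = last_intact
    return result


wild_type = pattern_after_deletion(set())
iab7_deletion = pattern_after_deletion({"iab-7"})
print(iab7_deletion[12])  # 11: PS12 resembles PS11 (A7 transformed toward A6)
```

Under this toy rule, parasegments posterior to the deleted domain keep their own pattern, since their domains are intact.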
Fig. 5 BX-C Hox genes in the embryo. (A) At the top, a drawing of an embryo originally from Hartenstein (Hartenstein, 1993). During embryogenesis, the embryo is rapidly segmented into 14 distinguishable parasegments. The segments and parasegments along the A/P axis are indicated in the yellow (segments) and white (parasegments) ribbons. Below, the patterns of expression of the three BX-C Hox genes along the A/P axis are represented: Ubx in red/orange (PS5 to PS12), abd-A in blue (PS7 to PS12), and Abd-B in green (PS10 to PS14) (adapted from (Maeda & Karch 2006)). (B) Abd-B antibody staining in two dissected embryonic CNS (12-h-old embryos), WT and iab-7sz. The iab-7sz mutant carries a deletion of the whole iab-7 domain. One can appreciate the graded pattern of expression of Abd-B along the A/P axis. This difference in the pattern of expression is believed to play a role in segment identity (adapted from (Maeda & Karch 2009)).
Within each cis-regulatory domain are the individual enhancers and silencers that control homeotic gene expression. To keep each domain free from the regulatory influences of neighboring domains, and to allow each domain to control homeotic gene expression autonomously in one parasegment, each domain is flanked by boundary elements that function as chromatin insulators (reviewed in (Maeda & Karch 2011)).
Following the A/P axis, as soon as a BX-C Hox gene is expressed, it remains expressed in all the posterior segments. The exceptions to this rule are PS13 and PS14, where Ubx and abd-A are absent: the Ubx gene is expressed from PS5 to PS12 (Akam & Martinez-Arias 1985; Beachy et al. 1985; White & Wilcox 1984), abd-A is expressed from PS7 to PS12 (Karch et al. 1990; Macias et al. 1990), and Abd-B is expressed from PS10 to PS14 (Celniker et al. 1990; Delorenzi & Bienz 1990; Kuziora & McGinnis 1988). Along the A/P axis, Hox genes are thus added successively in the ectodermal and neurodermal cells; this is believed to bestow segment identity over the T2 ground state hypothesized by Lewis. However, even though several homeotic genes can be present in one parasegment, no cell in the epidermis co-expresses high levels of two immediately adjacent Hox genes at the same time. This characteristic is defined as the principle of posterior dominance: in the epidermis, a posterior Hox gene represses the expression of more anterior Hox genes (reviewed in (Gummalla et al. 2014)).
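As a rough illustration of posterior dominance, the expression ranges quoted above can be encoded and queried in Python. This is a coarse, parasegment-level caricature under stated assumptions: the ranges are those given in the text, the function names `genes_present` and `dominant_gene` are hypothetical, and the real rule applies cell by cell in the epidermis rather than to a whole parasegment.

```python
# Expression ranges (first and last parasegment) as given in the text.
EXPRESSION = {
    "Ubx": (5, 12),
    "abd-A": (7, 12),
    "Abd-B": (10, 14),
}
# Genes ordered from the most anterior-acting to the most posterior-acting.
ORDER = ["Ubx", "abd-A", "Abd-B"]


def genes_present(ps):
    """Hox genes whose expression range includes parasegment `ps`."""
    return [g for g in ORDER if EXPRESSION[g][0] <= ps <= EXPRESSION[g][1]]


def dominant_gene(ps):
    """Most posterior gene present in `ps`.

    Assumption: in an epidermal cell where several of the genes present could
    be active, the most posterior one is expected to repress the others.
    Returns None if no BX-C gene is expressed in the parasegment.
    """
    present = genes_present(ps)
    return present[-1] if present else None


for ps in range(4, 15):
    print(f"PS{ps}: present={genes_present(ps)}, dominant={dominant_gene(ps)}")
```

The sketch only captures the ordering of the three genes; it says nothing about expression levels or about the CNS, where the text later describes a departure from this rule.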
The open for business model
The 1987 model of BX-C gene regulation, called the open for business model, proposes that Hox gene expression is controlled by an on/off state of the chromatin conformation of the cis-regulatory domains (Peifer et al. 1987). This model is based on the hypothesis that, in each segment, a portion of the BX-C is in an open (accessible) chromatin conformation, in contrast to the closed (inaccessible) conformation of the rest of the complex. As shown in Fig. 6A, the chromatin of a domain is in an open state in the parasegment in which it first drives Hox gene expression and in all more posterior segments. One of the best illustrations of this sequential activation of the cis-regulatory domains is the work performed in 2000 by the laboratory of W. Bender (Bender & Hudson 2000). In this study, the Bender laboratory isolated a series of P-element lines with insertions all along the BX-C. The P-element construct contains a β-galactosidase gene driven by the P-element promoter. The construct drives β-gal expression if an active enhancer is in the vicinity of the insertion site; this type of construct is commonly referred to as an enhancer trap. In Fig. 6B, we can see that, depending on the position of the P-element insertion along the BX-C, β-gal expression is detected with an anterior border that matches the domain of activity of the cis-regulatory domain in which the P-element is inserted, reflecting the open state of the domain. Of particular note, even when some constructs landed relatively close to each other, their patterns of expression are shifted by exactly one whole parasegment. This last observation shows that distinct boundaries delimit the domains and prevent interactions between them. A more recent study (Bowman et al. 2014) analyzing chromatin marks along the BX-C provides further evidence for the chromatin state hypothesis. In this study, the authors analyzed the trimethylation of lysine 27 of histone H3 (H3K27me3) in nuclei from different parasegments. The H3K27me3 mark is associated with a closed chromatin conformation (reviewed in (Maeda & Karch 2015)). Analysis of the chromatin from nuclei isolated from PS4 to PS7 showed a significant reduction of the H3K27me3 mark in the cis-regulatory domains along the A/P axis, in agreement with the prediction of the open for business model.
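The prediction of the open for business model can be written down as a minimal Python sketch. It assumes one cis-regulatory domain per parasegment from PS5 to PS13, as listed earlier in the text; the names `FIRST_ACTIVE_PS`, `open_domains` and `enhancer_trap_border` are hypothetical and serve only to illustrate the reasoning.

```python
# First parasegment in which each domain is active, read from the text.
# Assumption: one domain per parasegment, PS5 to PS13.
FIRST_ACTIVE_PS = {
    "abx/bx": 5, "bxd/pbx": 6, "iab-2": 7, "iab-3": 8, "iab-4": 9,
    "iab-5": 10, "iab-6": 11, "iab-7": 12, "iab-8": 13,
}


def open_domains(ps):
    """Domains predicted to be in an open conformation in parasegment `ps`.

    A domain is open from its first active parasegment onward (posteriorly).
    """
    return [d for d, first in FIRST_ACTIVE_PS.items() if first <= ps]


def enhancer_trap_border(domain):
    """Predicted anterior border of reporter expression for an enhancer-trap
    insertion located inside `domain`."""
    return FIRST_ACTIVE_PS[domain]


for ps in range(4, 14):
    print(f"PS{ps}: {', '.join(open_domains(ps)) or 'all closed'}")
```

In this sketch, two insertions on either side of a boundary (for example in iab-5 and iab-6) give anterior borders that differ by exactly one parasegment, however close the insertion sites are, which mirrors the enhancer-trap observation described above.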
Fig. 6 The open for business model. (A) Representation of the chromatin state in the BX-C along the larva A/P axis. The oval diagram (left) represents a larva, with the parasegments annotated in blue (right) and the corresponding segments in black (left); anterior is to the top (H, head). The bottom graduated black line represents the DNA sequence of the BX-C locus in kilobases, with the Hox genes indicated below. Above the BX-C DNA sequence, the cis-regulatory domains are indicated (brackets). Above the BX-C, nine lines represent the chromatin states in 9 parasegments aligned to the left. In each line, the colored rectangle indicates the open conformation of the chromatin, while the black line represents the closed state. The red ovals represent the boundary elements. We can appreciate the sequential opening of the chromatin of each cis-regulatory region in 3’ (x-axis) along the A/P axis (y-axis) (adapted from (Maeda & Karch 2015)). (B) Pattern of expression of enhancer-trap lines along the BX-C in embryos. The BX-C is represented by the central multicolored bar. It is important to note that the color code of the cis-regulatory domains/Hox genes is consistent between (A) and (B), and also with Fig. 5A and Fig. 4. Above and below the BX-C, P-element insertions are indicated by triangles whose apex is aligned with their insertion position (Bender & Hudson 2000). The colored areas group the insertions and the lacZ antibody staining pictures (from right to left). The antibody stainings were performed on embryos, which were then cut along the ventral midline and flattened. The black bracket indicates the name of the most anterior parasegment in which signal is first detected (Adapted from (Maeda &
Transcription in the Bithorax complex
At the early stages of embryogenesis, the intergenic regions of the BX-C undergo an active phase of transcription. In the Ubx regulatory region, two different classes of transcripts have been characterized, early transcripts detected transiently at 3-6h of embryogenesis (during the blastoderm stage), and a larger transcription unit in the pbx/bxd region called bxd detected until adulthood (Hogness et al. 1985; Lipshitz et al. 1987; Pease et al. 2013). The bxd RNA seems to play a regulatory role during development, and it has been shown that bxd RNA expression is associated with Ubx downregulation (Petruk et al.
2006).
Transcription of the more posterior iab domains was detected by in situ hybridization (Bae et al. 2002; Rank et al. 2002; Sanchez-Herrero & Akam 1989). These transcripts are primarily detected during a transient period at the blastoderm stage. The pattern of expression of the early transcripts shows that transcripts emanating from each cis-regulatory domain can be detected in a pattern that correlates with the parasegments in which the domain is active and regulates Hox gene expression. In the case of iab-6, transcription of these RNAs has been linked to a specific element in the cis-regulatory domain called an initiator. Initiator elements seem to be present in each cis-regulatory domain and are thought to be crucial for domain activation, and thus for Hox gene regulation. Indeed, loss of an initiator leads to the loss of all domain function.
Bithorax Hox gene activation and expression are usually divided into two phases, the Initiation and the Maintenance phases. The Initiation phase is essentially the activation of the cis-regulatory domains. This step coincides with the presence of the gap and pair-rule gene products during embryogenesis (Maeda & Karch 2009). Numerous cases have been reported showing the importance of the gap and pair-rule genes in this regulation. For example, some cis-regulatory domain malfunctions coincide with mutations at predicted binding sites for the products of these genes (Busturia & Bienz 1993; Ho et al. 2009; Qian et al. 1991; Shimell et al. 1994; Starr et al. 2011), suggesting that the gap and pair-rule genes are regulators of cis-regulatory domain activity. The Initiation phase also coincides with the production of the early transcripts (Bae et al. 2002). Inducing transcription in a cis-regulatory domain, either by P-element insertion or in transgenic assays, has been shown to cause ectopic activation of a repressed domain (Bender & Fitzgerald 2002; Hogga & Karch 2002; Rank et al. 2002).
Maintenance of activity states is important for the Hox genes, as in many tissues Hox genes maintain their activity state throughout the life of the organism despite the absence of the gap and pair-rule gene products. Many experiments have shown that chromatin modifiers encoded by the Polycomb-Group (Pc-G) and trithorax-Group (trx-G) genes are involved in the maintenance of the activity state by preserving the open and closed chromatin conformations of the cis-regulatory domains (Kennison 1993; Paro 1990; Pirrotta 1997; Simon 1995; Chiang et al. 1995). Pc-G proteins usually induce a closed conformation of the chromatin, whereas trx-G proteins induce an open chromatin state; they bind to sequences called Polycomb Response Elements (PRE) and Trithorax Response Elements (TRE), respectively (Brown & Kassis 2013; Müller & Kassis 2006; Schwartz & Pirrotta 2008; Simon & Kingston 2009). In some cases, it has been reported that transcription can modify the chromatin marks induced by the Polycomb group proteins (Akam & Martinez-Arias 1985; Bender & Fitzgerald 2002; Hogga & Karch 2002; Petruk et al. 2006; Rank et al. 2002; Schmitt et al. 2005). In our laboratory, we believe that the gap and pair-rule gene products present in each parasegment induce the transcription of specific cis-regulatory domains by binding to the initiator. This transcription prevents the Pc-G products from inducing a closed conformation of the domain, and allows the cell-specific enhancers in the domain to trigger Hox gene expression.
3- The BX-C and the iab-8 long non-coding RNA
Long non-coding RNAs in the BX-C
In addition to the early transcripts, some cis-regulatory domains of the BX-C are also transcribed later in development. Besides the previously mentioned bxd transcript, the region between the abd-A and Abd-B genes contains at least two large transcription units going in opposite directions, the iab-4 ncRNA and the iab-8 ncRNA (see Fig. 7A). Both transcripts are templates for miRNAs (Stark et al. 2008; Tyler et al. 2008). The iab-4 ncRNA is transcribed from the iab-3 to the iab-4 domain and produces miR-iab-4. As shown in Fig. 7B, the iab-4 ncRNA is expressed from PS8 to PS12. The iab-8 ncRNA transcript starts in the iab-8 domain and spans the entire 94-kb-long intergenic region between abd-A and Abd-B. Interestingly, this transcript is spliced, picking up an exon in each of the cis-regulatory domains between abd-A and Abd-B (Fig. 8). After splicing, the final transcript is a 5’-capped, 3’-polyadenylated, 2.8 kb transcript (Gummalla et al. 2012). The iab-8 ncRNA is the template for the miR-iab-8 miRNA (complementary to miR-iab-4). In agreement with the location of its promoter within iab-8, the iab-8 ncRNA is expressed in PS13 and remains active in PS14. Both the iab-4 ncRNA and the iab-8 ncRNA are detected from the early stages of embryogenesis until adulthood, first in the epidermis; from the germ band elongation stage onward, the expression of both transcripts becomes restricted to the CNS (Bender 2008; Stark et al. 2008; Tyler et al. 2008).
Overexpression assays showed that miR-iab-4 downregulates Ubx (Ronshaugen et al. 2005). Analysis of the abd-A and Ubx 3’ untranslated regions (UTRs) revealed potential target sites for both miR-iab-4 and miR-iab-8 (Stark et al. 2008; Tyler et al. 2008). Clonal analysis showed that overexpression of miR-iab-8 was coupled with reduced expression of Ubx and abd-A, whereas miR-iab-4 overexpression seems to primarily target Ubx (Tyler et al. 2008). Deletion of the miRNA locus correlates with a de-repression of Ubx and abd-A in the PS13 (A8) region of the CNS during development (Bender 2008; Garaulet et al. 2014).
Specific regulation implicating the iab-8 transcript
If one examines the embryonic pattern of expression of the abd-A and Abd-B genes in the central nervous system (CNS), one realizes that some cells co-express the two Hox genes in PS12. This contradicts the posterior dominance rule. Further investigations published by our lab (Gummalla et al. 2012) showed that in the posterior CNS, Abd-B does not repress abd-A. As mentioned earlier and shown in Fig. 5A, ABD-A and UBX are absent in PS13-14. It was long thought that ABD-B repressed the expression of the two other Hox genes in these parasegments. While this is true for the epidermis, it does not seem to be the case in the CNS for abd-A. Instead, the repression seems to come from the iab-8 ncRNA through two separate mechanisms: in trans, through miR-iab-8, which directly downregulates abd-A and Ubx; and in cis, where the iab-8 ncRNA is thought to repress abd-A expression through transcriptional interference, due to transcription of the iab-8 ncRNA across the abd-A promoter (Gummalla et al. 2012; Gummalla et al. 2014).
Fig. 7 BX-C transcription profile. (A) The BX-C is represented by the blue bar. Boxes indicate delimitations of the cis-regulatory domains, and the names of the domains are noted on the BX-C diagram. Above and below are indicated the transcripts coming from the BX-C. The transcripts represented above the BX-C genomic map are transcribed from the top strand, and transcripts annotated below from the bottom strand. Colored arrows represent all non-coding transcripts.
msa start site is indicated by a star. (B) Patterns of expression of ncRNAs and Hox genes along an embryonic VNC. The VNC is represented by the white bar, with the A/P axis from left to right and the corresponding segments indicated in the boxes. The distribution of the ncRNAs is indicated above the VNC and that of the Hox proteins below (adapted from (Garaulet & Lai 2015)).
The sterility phenotype involving the iab-8 ncRNA
Flies homozygous for a deletion of the miR-iab-4/iab-8 locus (referred to as ΔmiR) display no homeotic defect but, interestingly, both male and female flies are sterile (Bender 2008). The sterility phenotype associated with the deletion of this miRNA locus is exceptional and worth investigating, since the majority of miRNA deletions give no or only weak phenotypes (Miska et al. 2007; Smibert & Lai 2008). Sterile flies of both sexes display no visible malformation of the genital tract; nevertheless, the females seem to have a defect in oviposition, while the males seem to have trouble bending their abdomen to complete mating (Bender 2008). By placing the deletion in trans to several chromosome breakpoints that stop transcription of the iab-8 transcript, W. Bender was able to attribute the sterility phenotype to the absence of miR-iab-8. Chromosome break studies also indicate that transcriptional interference by the iab-8 ncRNA at the abd-A promoter plays a role in this phenotype (Gummalla et al. 2012). Recent studies indicate that miR-iab-4/iab-8 seems to target the homeotic co-factors homothorax (hth) and extradenticle (exd) (Garaulet et al. 2014). In their study, Garaulet et al. showed that combining ΔmiR with mutations in Ubx or hth partially rescues the female sterility phenotype; according to the same study, abd-A and exd do not seem to be involved in the female phenotype.
The muscle of Lawrence
The ΔmiR sterility phenotype in males is described as the inability of the male to fully bend its abdomen to achieve copulation (Bender 2008). Flies with a similar phenotype have already been described in the literature; the defect involves the muscle of Lawrence (MOL), a crucial component of male fertility.
The MOL is a male-specific dorsal muscle located in A5 (Lawrence & Johnston 1984). The development of this muscle is not cell-autonomous and seems to be dependent on a single male-specific motoneuron (Taylor 1992). The pleiotropic genes doublesex (dsx) and the FruMC isoform of the fruitless (fru) gene are essential for the correct MOL formation (Anand et al. 2001; Nojima et al. 2014;
characterized; it is referred to as the MOL-inducing (Mind) motoneuron. The Mind neuron is glutamatergic, and blocking synaptic transmission of this neuron during development is sufficient to block MOL formation. An aminergic neuron also innervates the MOL. The MOL neurons are located in the mid-anterior part of the abdominal ganglia (Abg) in the ventral nerve cord (VNC). These two ipsilateral neurons project through the abdominal nerve trunk (AbNvTk) and reach the MOL in the abdomen (Nojima et al. 2010).
iab-8 ncRNA and CNS
Since the iab-8 ncRNA is primarily expressed in the CNS, the sterility phenotype linked to its absence is thought to result from a neuronal defect. In recent years, knowledge of the Drosophila reproductive neuronal circuitry has increased significantly. In a recent study (Garaulet et al. 2014), the Insulin-like peptide 7-positive (Ilp7+) neurons were proposed to be involved with the iab-8 ncRNA. Interestingly, ILP7 shares a certain degree of homology with Relaxin, a mammalian hormone involved in reproduction (Brogiolo et al. 2001; Hudson et al. 1981). Hyperpolarizing Ilp7+ neurons by Kir2.1 overexpression in female flies induces a sterility phenotype characterized by an absence of egg-laying, similar to the ΔmiR phenotype (Yang et al. 2008). In addition, some Ilp7+ neurons are located in the posterior region of the abdominal ganglia and innervate part of the female reproductive tract (Yang et al. 2008). An extensive analysis of the Ilp7+ neurons in the context of the ΔmiR mutation was performed. ΔmiR flies seem to exhibit a decrease in synaptic boutons on the oviduct, but the interpretation of these results is questionable, since the same innervation defect is observed in hth heterozygous females, which are fertile (Garaulet et al. 2014). In the embryo and larva, we also know that the iab-8 transcript overlaps with the ABD-B protein in the CNS. Assuming that this is also the case in the adult, one could wonder about the function of the ABD-B+ neurons. A study of the female reproductive behaviors associated with Abd-B showed, using RNAi lines, that Abd-B expression during neurogenesis is essential for maximum levels of female receptivity (Bussell et al. 2014). The same study shows Abd-B+ neurons in the abdominal ganglia of the ventral nerve cord (VNC) innervating the oviducts, uterus, and muscles near the vaginal plates and the vaginal bristles.
4- The iab-8 ncRNA, a potential source for micropeptides
The exon-8 miPep, introduction to micropeptides
One of the characterized features of the iab-8 ncRNA is the presence of a conserved short open reading frame (sORF) that could code for a 20-amino-acid-long peptide (or micropeptide, miPep). This peptide-coding sequence is conserved in many Drosophila species ((Gummalla et al. 2012); see also section 2 of the Results). In recent years, the rise of whole-transcriptome sequencing (RNA-Seq) has allowed for the identification of thousands of potential long non-coding RNAs (lncRNAs). These transcripts were annotated as “non-coding” since they lack the established protein-coding cues. During