Chapter 1 - Introduction

1.3 Hox gene regulation

1.3.1 Evolution of cis-regulatory regions

Vertebrates all develop from a single fertilized cell into a complex multicellular organism and display a wide range of morphologies. Interestingly, different vertebrate species acquire distinct forms that are, for the most part, specified during embryonic development using the same key developmental genes. Morphological variation therefore most probably stems either from mutations that affect transcription factors and signaling molecules directly (trans) or from mutations in the sequences that regulate the expression of a gene located on the same chromosome (cis) (Carroll, 2008; Wittkopp and Kalay, 2012).

Mutations in transcription factor sequences are often disadvantageous or deleterious and are therefore under stronger selective pressure. The expression of a gene, however, can be driven by a variety of tissue- or time-specific enhancers. Mutations in these regulatory sequences are less likely to be negatively selected and could in some cases represent an evolutionary advantage (Wittkopp and Kalay, 2012).

Conservation of non-coding sequences has long been used as a method to identify enhancer elements (Alonso et al., 2009). This approach has led to the discovery of highly conserved elements that have either maintained or changed the regulatory specificity compared to the homologous sequence in other species (Maeso et al., 2013; Royo et al., 2011).

However, recent advances in high-throughput technologies have made it possible to analyze gene transcription, chromatin structure and transcription factor binding genome-wide in different tissues and cell types, mostly in mammalian species. These studies have clearly demonstrated that cis-regulatory sequences evolve very rapidly and can show distinct transcription factor occupancy despite sequence conservation (Odom et al., 2007; Schmidt et al., 2010; Villar et al., 2015; Yue et al., 2014). Nevertheless, a comparison of transcription factor binding between mouse and human, complemented with an assessment of open chromatin regions, has revealed that pleiotropic regulatory regions are associated with higher conservation of transcription factor occupancy (Cheng et al., 2014). It has also been proposed that transcription factor occupancy varies less when embryonic tissues are used, perhaps because developmental enhancers are under higher selective pressure (Sakabe and Nobrega, 2013).

1.3.2 Repeat elements and the evolution of regulation

Transposable element (TE)-derived content makes up at least half of the mouse and human genomes (Lander et al., 2001; Waterston et al., 2002). These segments of DNA were first found in maize and are characterized by their ability to “jump” and replicate within genomes (McClintock, 1956). TEs are classified into different families by mode of transposition and sequence similarity. Class I TEs (retroelements) transpose through an RNA intermediate that is reverse transcribed into DNA and inserted back into the genome. Within this class are the long terminal repeat retrotransposons (LTRs) and the long and short interspersed elements (LINEs and SINEs). Class II elements, on the other hand, are DNA transposons and do not use an RNA intermediate. Instead, they transpose through a “cut and paste” mechanism or replicate from DNA to DNA into the genome using a “copy and paste” mechanism (Feschotte and Pritham, 2007; Finnegan, 1992).
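The classification described above can be summarized as a small lookup structure. The following Python sketch is purely illustrative: the names `TEClass`, `TE_CLASSES` and `classes_by_intermediate` are hypothetical, and the structure encodes only the distinctions stated in the text, not a complete TE taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TEClass:
    """One of the two TE classes described in the text (illustrative only)."""
    name: str
    uses_rna_intermediate: bool  # True for retroelements, False for DNA transposons
    mechanisms: tuple            # modes of transposition mentioned in the text
    subgroups: tuple             # subgroups named in the text (empty if none are given)


# Hypothetical encoding of the two classes; not an exhaustive classification.
TE_CLASSES = (
    TEClass(
        name="Class I (retroelements)",
        uses_rna_intermediate=True,
        mechanisms=("reverse transcription and reinsertion",),
        subgroups=("LTR retrotransposons", "LINEs", "SINEs"),
    ),
    TEClass(
        name="Class II (DNA transposons)",
        uses_rna_intermediate=False,
        mechanisms=("cut and paste", "copy and paste"),
        subgroups=(),
    ),
)


def classes_by_intermediate(uses_rna):
    """Return the names of the classes that do (or do not) use an RNA intermediate."""
    return [c.name for c in TE_CLASSES if c.uses_rna_intermediate == uses_rna]


if __name__ == "__main__":
    print(classes_by_intermediate(True))
    print(classes_by_intermediate(False))
```

Keeping the RNA intermediate as an explicit field mirrors the criterion the text uses to separate the two classes; sequence similarity, the second criterion, is not modeled here.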

TE insertion into, or excision from, the genome can in some cases have phenotypic consequences. However, TE-mediated regulatory changes can take a long time to evolve into a new type of regulation. For this reason, it is not always easy to determine whether novel regulatory mechanisms have arisen from TEs, since these elements might have degraded beyond recognition (Kidwell and Lisch, 1997).

The insertion of repeats in exons is often deleterious and thus it is normally under negative selection. In rare cases, however, these types of mutations can result in phenotypic diversity (Rubin et al., 1982). On the other hand, TE-mediated mutations in regulatory regions are more likely to affect gene expression only in particular tissues or change the timing of activation and thus have a reduced deleterious effect while creating the opportunity for genetic and phenotypic variation (Kidwell and Lisch, 1997). TEs can also disrupt the host genome by inducing ectopic recombination or chromosomal insertions and translocations (Kidwell and Lisch, 2001).

In many cases TEs are targeted for silencing through RNAi-based mechanisms and heterochromatin formation, which could result in the silencing of nearby regions as a side effect (Feschotte and Pritham, 2007). More recently, evidence has accumulated indicating that transposable elements play a role in the fast evolutionary pace of enhancer elements by providing new transcription factor binding sites (Bourque et al., 2008; Kunarso et al., 2010; Schmidt et al.).

1.3.3 Hox gene regulation in the main body axis

Initiation, establishment and maintenance of Hox gene expression

Hox gene expression is initially activated in the primitive streak during gastrulation and subsequently spreads anteriorly at specific times (Deschamps and van Nes, 2005; Forlani et al., 2003). FGF and Wnt signaling, essential for gastrulation movements, play an important role in the regulation of Hox gene expression in the forming mesoderm (Dubrulle et al., 2001; Forlani et al., 2003; Kessel and Gruss, 1991). Retinoic acid (RA), as well as segmentation genes such as components of the Notch signaling pathway, also appears to modulate Hox gene expression, ensuring that the correct rostral boundaries are established (Cordes et al., 2004; Kessel and Gruss, 1991).

Hox gene expression in neurectoderm is modulated independently from mesodermal expression. Normal Hox gene expression in the hindbrain depends on promoter sensitivity to RA signaling (Gavalas and Krumlauf, 2000). More posterior neural tissue has a distinct embryonic origin and relies on the opposing actions of RA and FGF. RA is produced by neighboring mesoderm and commits cells to a neural fate, while FGF maintains a posterior stem cell zone. This process is essential for the concerted formation of mesodermal and neurectodermal tissue along the main AP axis (del Corral and Storey, 2004). RA and FGF therefore regulate Hox gene expression in both mesoderm and neurectoderm, although the different rostral boundaries of expression in the two tissues indicate that different regulatory specificities are involved (del Corral and Storey, 2004; Deschamps and van Nes, 2005).

While FGF, Wnt and RA have been shown to modulate Hox gene expression, it is not clear whether this is achieved through direct or indirect regulation. Cdx genes, on the other hand, have been shown to directly regulate Hox gene expression in the developing AP axis in a dose-dependent manner and are likely to relay positional information provided by FGF, Wnt and RA (Deschamps and van Nes, 2005).

Once active or inactive states of Hox gene expression have been established, they are maintained by Polycomb (PcG) and trithorax (trxG) group proteins. In fruit flies, PcG mutants were shown to result in posterior homeotic transformations caused by the anteriorization of homeotic gene expression (Lewis, 1978). Later, trxG genes were identified as suppressors of PcG mutant phenotypes. The antagonistic action of these protein groups has long been recognized for its role in maintaining cell identity (Ringrose and Paro, 2004). Hox genes are kept in a silent state by the action of PcG, while trxG ensures that Hox genes remain transcriptionally active. The silencing activity of PcG is achieved by two Polycomb complexes: PRC1 and PRC2. The PRC2 complex contains a component with methyltransferase activity that catalyzes the methylation of lysine 27 (K27) of histone H3 (H3). This histone modification recruits PRC1, which has a chromatin compaction activity and contains a ubiquitin E3 ligase named RING1B. RING1B catalyzes the monoubiquitylation of histone H2A, which results in Hox gene silencing (Schuettengruber and Cavalli, 2009). On the other hand, trxG proteins catalyze the trimethylation of H3 at lysine 4, a histone mark usually associated with active gene expression (Schuettengruber et al., 2007). PcG and trxG are therefore essential for keeping Hox gene expression spatially restricted throughout development.

Regulation of spatial and temporal collinearity

In the past it was hypothesized that, in vertebrates, the ordered and nested expression along the anterior-posterior axis (spatial collinearity) was a read-out of the sequential activation of Hox genes with respect to their relative order on the chromosome (temporal collinearity) (Duboule, 1994). Evidence has since surfaced that counters this intuitive explanation and dissociates the temporal and spatial aspects of collinearity. In the first place, as previously mentioned (see section 1.1.2), although temporal collinearity occurs only in animals with a relatively organized Hox cluster, spatial collinearity is a more widespread phenomenon and has been reported even in species with scattered, atomized clusters (Seo et al., 2004). Secondly, the expression of single Hox transgenes randomly integrated in the mouse genome has been shown to resemble the endogenous spatial expression (Oosterveen et al., 2003). In addition, a series of deletions within the HoxD cluster that changed the timing of Hox gene activation was shown to have, for the most part, no effect on the final domains of expression. It was also demonstrated that, although sequences inside the cluster played a role in temporal collinearity, other sequences located in the surrounding 5’ (centromeric) and 3’ (telomeric) gene deserts provided negative and positive regulatory influences, respectively, so that activation occurs with the correct timing (Tschopp and Duboule, 2011).

The action of PcG and trxG has been shown to play an important role in establishing chromatin compartments during collinear Hox gene expression in time and space (Noordermeer et al., 2014; Noordermeer et al., 2011; Soshnikova and Duboule, 2009). At very early stages the H3K4me3 mark covers only the 3’-most Hox genes, which are active in the developing embryo. The remaining genes, which are yet to be expressed, are instead covered by H3K27me3. As the embryo develops, the activation of more 5’ genes in the tail bud is accompanied by a progressive removal of the H3K27me3 mark, which is replaced by H3K4me3 (Soshnikova and Duboule, 2009). Recently it has been shown that this change in epigenetic status is reflected at the level of chromatin 3D conformation, keeping inactive genes isolated from active genes. A similar epigenetic mechanism is associated with spatial collinearity. In the forebrain, where no Hox genes are expressed, the entire cluster is covered by H3K27me3 and compacted in one single structure. In contrast, more posterior tissues show a bimodal organization where the H3K4me3 domain is restricted to the Hox genes that are expressed, while inactive genes lie in a separate compartment covered by H3K27me3 (Noordermeer et al., 2011). Although these histone changes are tightly related to the collinear expression essential for correct axial skeleton patterning, the mechanisms behind the recruitment of PcG and trxG are not well understood (Beisel and Paro, 2011).
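The bimodal organization described above can be illustrated with a deliberately simplified toy model. In the Python sketch below, each gene carries exactly one of the two marks and the active domain is assumed to be a contiguous block starting at the 3’ end of the cluster; the function names and the `n_active` parameter are hypothetical and serve only as an illustration, not as a description of the published analyses.

```python
# Mouse HoxD genes, listed from the 3' end to the 5' end of the cluster.
HOXD_GENES = ("Hoxd1", "Hoxd3", "Hoxd4", "Hoxd8", "Hoxd9",
              "Hoxd10", "Hoxd11", "Hoxd12", "Hoxd13")


def mark_coverage(n_active):
    """Assign one histone mark per gene (toy assumption: binary, contiguous domains).

    The n_active 3'-most genes are covered by H3K4me3 (active);
    all remaining, more 5' genes are covered by H3K27me3 (inactive).
    """
    if not 0 <= n_active <= len(HOXD_GENES):
        raise ValueError("n_active must be between 0 and the number of genes")
    return {
        gene: "H3K4me3" if index < n_active else "H3K27me3"
        for index, gene in enumerate(HOXD_GENES)
    }


def compartments(coverage):
    """Split the cluster into its active and inactive compartments."""
    active = [g for g in HOXD_GENES if coverage[g] == "H3K4me3"]
    inactive = [g for g in HOXD_GENES if coverage[g] == "H3K27me3"]
    return active, inactive


if __name__ == "__main__":
    # n_active = 0 mimics the forebrain (whole cluster covered by H3K27me3);
    # increasing n_active mimics progressive activation of more 5' genes.
    for n_active in (0, 3, 6):
        active, inactive = compartments(mark_coverage(n_active))
        print(n_active, active, inactive)
```

In this sketch the forebrain corresponds to `n_active = 0`, where the inactive compartment contains the whole cluster, while larger values reproduce the two separate compartments seen in more posterior tissues.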

1.3.4 Hox gene regulation in limbs

Neofunctionalization of Hox genes is thought to have occurred by acquiring enhancers that conferred additional expression specificities. Perhaps as a way to avoid interference with more ancestral modes of regulation, these novel enhancers have often been found to lie outside of the Hox cluster, at least as far as HoxA and HoxD clusters are concerned (Berlivet et al., 2013; Duboule, 2007; Montavon et al., 2011). While both clusters are surrounded by long-range regulatory regions, the HoxD regulatory landscapes have the particularity of being flanked by two 1 megabase (Mb)-long gene deserts that appear to mostly contain regulatory sequences (Andrey et al., 2013; Montavon et al., 2011). The random integration of a human HoxD cluster devoid of the surrounding genomic context results in expression limited to the main body axis. Conversely, the deletion of the cluster yields reporter gene expression in the limb bud, showing that Hoxd gene regulation in the limb originates from sequences located outside the cluster (Spitz et al., 2001).

The two waves of Hox gene expression in developing limbs (see section 1.2.3) have been shown to rely on distinct regulatory modalities. By complementing chromosome conformation capture techniques with the disruption of Hox regulatory landscapes and the detection of histone marks, a full picture of the complex regulation behind limb patterning is starting to emerge. The regulatory mechanisms behind limb Hoxd gene expression, in particular, have been extensively studied (Andrey et al., 2013; Montavon et al., 2011).

Even though the regulation of Hox gene expression in the limb has long been attributed to sequences located outside of the cluster, it is only recently that the full extent of these regulatory landscapes has been revealed (Andrey et al., 2013; Montavon et al., 2011). Indeed, only the deletion of the entire centromeric or telomeric gene desert resulted in the complete abrogation of gene expression in either the distal or the proximal limb, respectively. Chromosome conformation capture experiments and enrichment of histone H3 lysine 27 acetylation (H3K27ac), a histone mark for putative active promoters and enhancers, revealed limb enhancer candidates spread across the length of the two landscapes. LacZ reporter experiments confirmed the limb regulatory activity of some of these sequences (Andrey et al., 2013; Montavon et al., 2011).

Hi-C techniques have shown that the two regulatory gene deserts correspond to two topologically associating domains (TADs), with the boundary located in the 5’ region of the cluster. Interestingly, it has been found that the change between early and late regulation of Hox genes in the limb bud relies on a switch between the centromeric and telomeric TADs. This switch is mainly seen for Hox9-11 genes, which are expressed in both the proximal and the distal limb. When the early limb bud starts to develop, this subset of genes interacts mostly with the telomeric TAD, while later, as the autopod forms, the pattern of contacts changes and there is a significant increase in interactions with the centromeric TAD (Andrey et al., 2013). The Hoxa genes, which together with the Hoxd genes pattern the limb, were also shown to employ a bimodal type of regulation with long-range regulatory sequences. Interestingly, zebrafish HoxA and HoxD clusters were shown to also possess a bimodal chromatin structure (Woltering et al., 2014). The Hox cluster of the invertebrate amphioxus, however, was found to be located in a single structural domain (Acemel et al., 2016). The bimodal chromatin structure found at the HoxA and HoxD loci would therefore represent a vertebrate evolutionary novelty that predates tetrapod evolution.

1.3.5 Hox gene regulation in genitals

As previously stated (see section 1.2.3), genital and digit specification is achieved by the activity of the same subset of Hox genes. In addition, regulatory elements located centromeric to the HoxD cluster were shown to drive reporter gene expression both in the genital tubercle and in limbs (Gonzalez et al., 2007; Spitz et al., 2003). Recently, it has been shown that Hoxd gene regulation in limbs and genitals is achieved by sequences located in the same TAD, centromeric to the cluster. Moreover, even though most enhancers activate Hoxd expression in both appendages, it appears that a small number of regulatory elements are either limb- or genital-specific. A similar mechanism of enhancer and topology sharing to activate gene expression in genitals and limbs was also found in the HoxA regulatory landscapes (Lonfat et al., 2014).

These experiments led to the proposal that pre-existing contacts would have facilitated the emergence of new enhancer functions through the acquisition of binding sites for tissue-specific transcription factors. Such “genomic niches” might have been in place before the divergence of the HoxA and HoxD clusters and could have provided a fertile ground for new regulatory elements to emerge (Lonfat and Duboule, 2015). The fact that such an organization is seen in teleost fishes, which have no limbs and no external genitalia, supports this evolutionary argument (Woltering et al., 2014).

Recently, a genome-wide analysis of H3K27ac coverage in different tissues revealed that enhancer sharing between genitals and limbs is not a feature exclusive to the Hox landscapes. Instead, more than 1500 regions were found to be H3K27ac-rich in both limb and genital tissues (Infante et al., 2015).

1.3.6 Hox gene regulation in kidney and caecum

Mutant strains carrying different deletions within the HoxD cluster revealed that posterior and anterior Hox genes are involved in the development of the mesenchymal and epithelial components of the early kidney, respectively. Regulation by long-range enhancers was proposed to be necessary for correct Hoxd gene expression in this tissue. Although distinct regulatory sequences are likely to be involved in activating the two groups of genes in their respective compartments, it would appear that both mesenchymal and epithelial enhancers are located in the telomeric gene desert (Di-Poï et al., 2007).

Hoxd gene expression in the caecum was also found to be controlled by several regulatory elements located throughout the telomeric gene desert. Interestingly, orthologous sequences from the hedgehog, a species that does not have a caecum, were still able to drive Hox gene expression in the caecum of the mouse in a transgenic context (Delpretti et al., 2013).