

Chapter 2 - Results

2.1 The corn snake HoxD cluster and regulatory landscapes

The modern study of gene regulation relies heavily on high-throughput sequencing-based techniques, most of which require a high-quality genome to which millions of sequenced reads are aligned. A first version of the corn snake genome was only very recently released (Ullate-Agote et al., 2014) and is, for the moment, too fragmented to be suitable for such applications. We therefore chose to sequence the HoxD cluster together with the two regulatory gene deserts that lie on either side of it. To this end, we screened a custom corn snake BAC library using probes for regions that are highly conserved from mammals to lizards and that spanned the region of interest. From the clones that scored positive, 13 overlapping BACs were selected and sequenced to obtain a high-quality 1.3 Mb (megabase) sequence (Figure 1A).

The analysis of the corn snake HoxD cluster sequence revealed that, as for other snake species whose genomes were recently sequenced (Castoe et al., 2013; Vonk et al., 2013), all Hoxd genes except Hoxd12 are present and share the same transcriptional orientation in the cluster. However, the corn snake cluster is about 1.5-fold larger than its mouse counterpart (Figure 3A), consistent with the size of the HoxD clusters of the king cobra and the python (Castoe et al., 2013; Vonk et al., 2013). Indeed, plotting cluster size against genome size for different vertebrate species, including mammals, birds, amphibians and fish as well as snakes and lizards, revealed that the squamate clusters are larger than would be expected (Figure 1B). Accordingly, when reptiles were excluded from the linear regression analysis, an R² value of 0.43 was scored, indicating significant correlation. However, when the corn snake, king cobra and Burmese python cluster sizes were taken into account, the R² value was reduced to 0.027 (Figure 1B). Interestingly, the increase in cluster size was consistent across the different snake species: Burmese python, king cobra and corn snake. Yet, even though snake clusters were larger than would be expected based on their genome size, the green anole lizard cluster showed the lowest level of correlation with genome size of all vertebrate species analyzed (R² = 0.0012), which could reflect a higher repeat content within the cluster (Figure 1B).
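The comparison of R² values with and without the squamate species can be illustrated with a minimal sketch of a simple least-squares regression. The function names and the input layout (a list of `(species, genome_size, cluster_size)` tuples) are assumptions made for illustration, and the toy values in the usage example are arbitrary placeholders, not the measurements behind Figure 1B.

```python
from statistics import mean


def r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x (squared Pearson correlation)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when x or y has zero variance")
    return (sxy * sxy) / (sxx * syy)


def r_squared_with_and_without(records, excluded):
    """Return (R^2 using all species, R^2 after dropping the excluded species).

    records  -- hypothetical list of (species, genome_size, cluster_size)
    excluded -- set of species names to leave out of the second fit
    """
    all_points = [(g, c) for _, g, c in records]
    kept_points = [(g, c) for s, g, c in records if s not in excluded]
    return r_squared(*zip(*all_points)), r_squared(*zip(*kept_points))


# Arbitrary placeholder values, chosen only to show the effect of one outlier.
toy_records = [
    ("species_a", 1.0, 10.0),
    ("species_b", 2.0, 20.0),
    ("species_c", 3.0, 30.0),
    ("species_d", 4.0, 40.0),
    ("outlier", 1.5, 60.0),
]
r2_all, r2_kept = r_squared_with_and_without(toy_records, {"outlier"})
```

In this toy set, a single species whose cluster is much larger than its genome size predicts is enough to collapse an otherwise perfect fit, which is the kind of effect the regression in Figure 1B is used to reveal.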

In agreement with this observation, although the snake HoxD cluster showed a 4-fold increase in transposable element number compared to the mouse, we found that the Anolis cluster had five times more repeats than the corn snake (Figure 1D). Although mammalian, bird and fish clusters contain few repeats, we found that the frog HoxD cluster has approximately the same number of transposable elements as the snake species, consistent with a similar deviation of cluster size relative to genome size (Figure 1B, left, asterisk).

The repeat class repertoire within the cluster was very different in mouse, snakes and anole lizard. Whereas SINEs (short interspersed elements) are the most prominent class in the mouse HoxD cluster, DNA transposons are highly represented in the lizard counterpart. King cobra and corn snake clusters contain SINEs, LINEs (long interspersed elements) and DNA transposons in almost equal amounts. The python cluster, on the other hand, has a repeat distribution similar to that of the other snakes but is richer in DNA transposons (Figure 1D).
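The two quantities compared here, the total number of repeats and the share of each repeat class, can be derived from a list of annotated repeats. The following is a minimal sketch that assumes each annotated repeat has already been reduced to a class label such as "SINE", "LINE" or "DNA". The function name and input format are hypothetical, and the example labels are placeholders rather than the annotation used for Figure 1D.

```python
from collections import Counter


def repeat_class_summary(repeat_classes):
    """Return (total repeat count, {class: percentage of all repeats}).

    repeat_classes -- iterable of class labels, one per annotated repeat
                      (hypothetical input format)
    """
    counts = Counter(repeat_classes)
    total = sum(counts.values())
    if total == 0:
        return 0, {}
    percentages = {cls: 100.0 * n / total for cls, n in counts.items()}
    return total, percentages


# Placeholder labels, for illustration only.
toy_labels = ["SINE", "SINE", "LINE", "DNA"]
total, percentages = repeat_class_summary(toy_labels)
```

Keeping the absolute total separate from the percentages mirrors the two panels of Figure 1D: two clusters can share a similar class distribution while differing several-fold in total repeat number.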

The high-quality sequence of the gene deserts that flank the HoxD cluster permitted the analysis of the size and repeat content of both the 3’ and 5’ landscapes. We found that, unlike the cluster, the gene desert sizes of the different vertebrate species all showed a significant correlation with genome size. However, the gene desert that flanks Hoxd13 appears to have a larger relative size than the one adjacent to Hoxd1 (Figure 1B). In terms of repeat content, the gene deserts of the corn snake have a transposable element repertoire similar to that of the cluster, although the latter contains a higher percentage of SINEs (Figure 1C).

Figure 1 - Corn snake HoxD cluster and surrounding regulatory landscapes

A. Description of the BACs (grey lines) used for sequencing the snake HoxD cluster and the surrounding regulatory gene deserts. The scheme reflects the relative genomic distances between genes (black boxes). B. Comparison between the sizes of the HoxD cluster (left), the 5’-located gene desert (centromeric in the mouse) and the 3’-located gene desert (telomeric in the mouse) relative to genome size. The species used are mouse, human, cow, dog, horse, opossum, chicken, zebra finch, lizard, corn snake, cobra, python, frog, zebrafish, stickleback and medaka. Snakes are shown in red and lizard in green, whereas other vertebrates are in blue. The linear regression line and 95% confidence band were calculated excluding the snake and lizard data. C. Graphs representing the repeat content in the snake and mouse HoxD clusters as well as within the flanking gene deserts. D. Graphical representation of the percentage of transposable element families (left) and absolute total number of repeats (right) present in the HoxD cluster of different vertebrate species. The various types of repeats for the C and D panels are shown at the bottom with a colour code.

The alignment of the entire sequenced corn snake HoxD genomic landscape with that of other species revealed that the very large majority of conserved non-coding regions located in the gene deserts were also found in the snake genome (Figure 2). Indeed, the pattern of conservation was almost identical to that of the chicken. These results indicate that limb loss in snakes is not easily perceived at the level of sequence conservation when compared to a closely related limb-bearing animal.

Figure 2 - Sequence conservation at the HoxD locus.

The conservation plots are shown using mouse sequences and annotations as a reference, and human, chicken, lizard, python, corn snake, cobra, frog and zebrafish sequences for comparison. Conservation is shown only when above 50% and coloured peaks represent a level of conservation higher than 75%. The 5’-located flanking gene desert is on the top and the 3’-located gene desert is on the bottom. The HoxD cluster region is in the middle. Purple is for conservation of exons, light blue for UTRs and pink for non-coding regions.

The presence of transposable elements in squamate clusters has been hypothesized to alter normal Hox gene expression through the known association between repeats and recruitment of the repressive histone modification H3K9me3 (Di-Poï et al., 2010; Kidwell and Lisch, 1997). Therefore, we performed ChIP-seq for this histone mark in snake brain tissue, where Hox genes are not expressed. As in the mouse, no H3K9me3 enrichment was scored over the corn snake HoxD cluster. Nevertheless, six significant H3K9me3 peaks were detected in the surrounding gene deserts, mostly overlapping with LTRs (long terminal repeats). The peak closest to the cluster was located in an intron of Lunapark (Figure 3C). Since H3K9me3 was not present within the cluster and did not cover regions homologous to conserved mouse enhancers, it is unlikely that the recruitment of heterochromatin has an impact on Hoxd gene expression.
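Determining which peaks overlap annotated repeats comes down to an interval-overlap test. The sketch below assumes half-open `(start, end)` coordinates on a single sequence and a list of `(start, end, label)` repeat annotations. These formats, the function names and the coordinates in the usage example are illustrative assumptions, not the pipeline or the data used for the analysis above.

```python
def overlapping_labels(peak, features):
    """Return the labels of all features that overlap a peak.

    peak     -- (start, end), half-open interval
    features -- list of (start, end, label), half-open intervals
    """
    p_start, p_end = peak
    return [
        label
        for start, end, label in features
        if start < p_end and p_start < end  # shared bases on both sides
    ]


def classify_peaks(peaks, features):
    """Map each peak to the list of feature labels it overlaps."""
    return {peak: overlapping_labels(peak, features) for peak in peaks}


# Placeholder coordinates, for illustration only.
toy_repeats = [(100, 200, "LTR"), (500, 600, "SINE")]
toy_peaks = [(150, 250), (300, 400)]
hits = classify_peaks(toy_peaks, toy_repeats)
```

With half-open intervals, two features that merely touch (one ends where the other starts) are not counted as overlapping. Whether that convention suits a given analysis depends on the coordinate system of the annotation files.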

2.1.1 HoxD cluster structure and spatial collinearity in the snake

We then asked whether, as in the mouse, the progressive and collinear activation of Hox genes along the anterior-posterior body axis was associated with changes in epigenetic status. To this end, we performed ChIP-seq for the repressive histone mark H3K27me3 in two different embryonic tissues: forebrain and a posterior part of the trunk excluding the post-cloacal region (Figure 3C). In the forebrain, a tissue where no Hox genes are expressed, the entire cluster is covered by H3K27me3. In contrast, in the posterior trunk tissue only the 5’ part of the cluster is enriched for this histone modification (Figure 3C). We assessed gene expression in this region of the snake embryo by whole-mount in situ hybridization and found an inverse correlation between H3K27me3 coverage and active Hoxd gene expression (Figure 3B).

Indeed, genes that are fully active in the posterior trunk, such as Hoxd4, are completely devoid of H3K27me3 coverage, while genes that are expressed in more posterior, post-cloacal regions are fully covered by this histone mark. Hoxd9 represents an intermediate situation because it is expressed only partially in the collected tissue (Figure 3B). As expected, although decorated with H3K27me3 in both tissues, Hoxd9 shows lower enrichment in the posterior trunk than in the embryonic brain. Therefore, both spatial collinearity and the chromatin structure dynamics of the cluster that accompany gene activation in progressively more posterior tissues appear to be conserved in the snake.

H3K27me3 coverage in the snake brain was detected in a region of the 5’ gene desert homologous to the mouse IslandII, previously described as a limb regulatory element (Montavon et al., 2011) (Figure 3C). However, the H3K27me3 coverage within IslandII does not overlap with the sequence previously demonstrated to contain the mouse limb enhancer activity (Montavon et al., 2011) (red box in Figure 3C). Instead, it corresponds to a region that displays strong constitutive contacts with the cluster in both mouse and snake tissues (Woltering et al., 2014; Lonfat et al., 2014; see section 3.3) and that is thought to be important for other, tissue-specific long-range interactions to occur. Despite the presence of this silencing histone mark in a genomic region of regulatory relevance in the brain, H3K27me3 coverage was not found there in the posterior trunk of the snake, which, unlike the forebrain, is a Hox-expressing tissue.

Figure 3 - The snake HoxD cluster.

A. Schematic representation at the same scale of the mouse (top) and corn snake (bottom) HoxD clusters. Exons are represented by black rectangles. B. Whole-mount in situ hybridization of corn snake embryos at 8.5 dpo (days post oviposition) showing expression of Hoxd4, Hoxd9, Hoxd11 and Hoxd13. Arrowheads and numbers define the anterior levels of expression. The black arrowhead points to the neural tube whereas the white arrowhead shows mesoderm. A single black arrowhead indicates that the neural and mesodermal boundaries coincide. C. Detection of both H3K9me3 and H3K27me3 histone modifications by ChIP-seq in corn snake brain (top and middle tracks) and of H3K27me3 marks in the posterior trunk of snake embryos (bottom track). The right panel shows H3K27me3 coverage on IslandII in brain and posterior trunk tissues. Blue is for brain and orange for posterior trunk, as schematized on the left. The black peaks in the top track represent artifactual signals also present in the input chromatin mapping.