Chapter 4 - Materials and methods

4.10 RNA-seq data

4.10.1 RNA-seq data generation

RNA sequencing (RNA-seq) libraries were prepared with the Illumina TruSeq Stranded mRNA protocol and sequenced on a HiSeq 2500 machine as single-end, 100 base pair (bp) reads. Prior to analysis, RNA-seq reads were trimmed to remove 3’ end adapters.

4.10.2 Mapping of RNA-seq reads to reference genome

Reads were aligned using TopHat2 (Kim et al., 2013) to the Ensembl mm10 genome for mouse samples, the king cobra genome (Vonk et al., 2013) for corn snake samples and the Burmese python genome (Castoe et al., 2013) for python samples. The UCSC genome browser and the IGV genome browser were used to visualize the mapping.

4.10.3 Global analysis of the RNA-seq mapped data

BAM files generated by TopHat were then used to generate a count table with the summarizeOverlaps function from the GenomicAlignments R package (Lawrence et al., 2013). The count table was then analysed using the DESeq2 R package (Love et al., 2014).

Data was rlog transformed, and sample distances were calculated using the dist R function and plotted as a heatmap. The rlog-transformed data was also used to generate a principal component analysis (PCA) plot with the plotPCA function.
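To illustrate the sample distance step, the following Python sketch computes pairwise distances between samples from transformed expression values. It is not the R code used for the analysis: the Euclidean metric is assumed because it is the default of the R dist function, and the function name `sample_distances` and the sample names are hypothetical.

```python
from math import dist


def sample_distances(expr):
    """Pairwise Euclidean distances between samples.

    expr: dict mapping a sample name to its list of transformed
    expression values (same gene order in every sample).
    """
    return {a: {b: dist(expr[a], expr[b]) for b in expr} for a in expr}


if __name__ == "__main__":
    toy = {"stage1": [0.0, 0.0], "stage2": [3.0, 4.0]}  # hypothetical values
    for sample, row in sample_distances(toy).items():
        print(sample, row)
```

The resulting symmetric matrix is what a heatmap of sample-to-sample distances displays.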

Differential expression analysis was run using the DESeq function, and MA-plots showing differential expression between E9.5 mouse tail RNA transcripts and other samples were generated using the plotMA function. A clustered differential expression heatmap was generated by selecting the 100 genes with the highest variability across samples (i.e. stages) and calculating, for each gene, the deviation from its row mean.
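The gene selection behind the clustered heatmap can be sketched in Python as follows. This is an illustration only: variance is assumed as the measure of variability, and the function name `top_variable_centered` and the gene names are hypothetical.

```python
from statistics import mean, variance


def top_variable_centered(expr, n_top=100):
    """Select the most variable genes and centre each on its row mean.

    expr: dict mapping a gene name to its list of transformed
    expression values, one per sample (at least two samples).
    Returns a dict gene -> deviations from the gene's mean, ordered
    from the most to the least variable selected gene.
    """
    ranked = sorted(expr, key=lambda g: variance(expr[g]), reverse=True)
    centered = {}
    for gene in ranked[:n_top]:
        row_mean = mean(expr[gene])
        centered[gene] = [v - row_mean for v in expr[gene]]
    return centered


if __name__ == "__main__":
    toy = {  # hypothetical values
        "geneA": [1.0, 5.0, 9.0],
        "geneB": [4.0, 4.0, 4.0],
        "geneC": [2.0, 3.0, 4.0],
    }
    print(top_variable_centered(toy, n_top=2))
```

Centring each row removes differences in absolute expression level, so the heatmap colours reflect only how each gene deviates from its own average across stages.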

4.10.4 De novo transcript assembly with Trinity

We used Trinity (Grabherr et al., 2011) version 2.0.6 to assemble transcripts from RNA-seq data, without relying on genome sequence or annotations. To obtain maximum sensitivity, we combined the RNA-seq data for all available samples in the transcript reconstruction procedure. To reduce memory requirements, we used the read normalization procedure implemented in Trinity, run separately for each RNA-seq sample. To minimize the occurrence of retained introns in assembled transcripts, the minimum count for k-mers was set to 2 (Haas et al., 2013). As some retained introns or unspliced transcripts can still appear, we detected splice junctions in the assembled transcripts with TopHat (Kim et al., 2013) version 2.0.10, using the Trinity output as a genome index. The predicted splice junctions supported by at least 2 reads were considered to be putative retained introns, and the spliced form of the corresponding Trinity contig was added to the set of predicted transcripts, while also keeping the original unspliced form. Whenever several introns were detected in a single Trinity contig, a single spliced form removing all introns was constructed.
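The construction of spliced forms described above can be sketched in Python. This is an illustration and not the script used here: the junction format (0-based, end-exclusive intron coordinates with a read count), the function names and the handling of overlapping introns are assumptions.

```python
def spliced_form(contig, junctions, min_reads=2):
    """Return the contig with all sufficiently supported introns removed.

    junctions: list of (start, end, n_reads) tuples giving intron
    coordinates on the contig (0-based, end-exclusive; assumed format).
    Returns None when no junction reaches the read-support threshold.
    """
    introns = sorted((s, e) for s, e, n in junctions if n >= min_reads)
    if not introns:
        return None
    pieces, pos = [], 0
    for start, end in introns:
        if start < pos:
            # Assumption: an intron overlapping one already removed is
            # skipped; the text does not say how such cases were handled.
            continue
        pieces.append(contig[pos:start])
        pos = end
    pieces.append(contig[pos:])
    return "".join(pieces)


def predicted_transcripts(contig, junctions, min_reads=2):
    """Keep the original contig and add its single spliced form, if any."""
    spliced = spliced_form(contig, junctions, min_reads)
    return [contig] if spliced is None else [contig, spliced]


if __name__ == "__main__":
    # Hypothetical contig: exons in upper case, introns in lower case.
    toy = "AAAAgtagCCCCgtcagTTTT"
    print(predicted_transcripts(toy, [(4, 8, 5), (12, 17, 3)]))
```

Removing all supported introns in one pass yields a single spliced form per contig, matching the rule stated above for contigs with several introns.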

4.10.5 Trinity transcript identification

To determine the genes from which Trinity-predicted transcripts were derived, we used an approach based on sequence similarity with annotated mouse proteins. First, we extracted mouse protein sequences from the Ensembl database, release 81. For genes that had multiple protein-coding isoforms, we kept only the longest protein. For the Hox genes, we manually verified annotated proteins and extracted the one that best fitted the canonical form.

Second, we determined all possible open reading frames (ORFs, defined as any stretch of codons starting at an ATG and ending at a stop codon) within each Trinity contig. Partial ORFs (lacking the initial ATG or the stop codon) starting or ending at the first/last base of the Trinity contig were also permitted. Third, we searched for sequence similarity between the translated ORFs and mouse proteins using blastp (Altschul et al., 1990) and we extracted the best hits between translated ORFs and proteins, using the blastp alignment score as a criterion.
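The ORF definition given above can be sketched in Python as follows. This is an illustration, not the code used here. It scans the three forward frames only (a reverse-complemented contig can be passed to cover the other strand), it treats any open stretch that runs off a contig edge as a partial ORF in every frame, which is a loose reading of "starting or ending at the first/last base", and it reports only the longest ORF ending at each stop codon. The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq):
    """List ORFs as (start, end, has_start, has_stop) tuples.

    Coordinates are 0-based and end-exclusive; the stop codon, when
    present, is included. has_start is True when the ORF begins with
    ATG, has_stop is True when it ends with a stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = frame  # open stretch at the contig edge (5' partial ORF)
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                if start is not None and i > start:
                    orfs.append((start, i + 3, seq[start:start + 3] == "ATG", True))
                start = None
            elif codon == "ATG" and start is None:
                start = i
        # ORF still open at the contig end (3' partial ORF)
        end = frame + 3 * ((len(seq) - frame) // 3)
        if start is not None and end > start:
            orfs.append((start, end, seq[start:start + 3] == "ATG", False))
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGGG"):  # hypothetical contig
        print(orf)
```

Each reported interval can then be translated and used as a blastp query against the mouse protein set.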

We discarded cases where the best hits between translated ORFs and mouse proteins were not unique. We further filtered the blastp hits to select those with a percentage sequence identity above 95% (for mouse contigs) or 40% (for corn snake), and which were aligned across at least 25% of the protein length. We were thus able to predict the identity of 14,286 Trinity contigs for the mouse and 12,702 contigs for the corn snake.
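The hit selection described above can be sketched in Python. This is an illustration under stated assumptions: a best hit is treated as non-unique when two hits share the highest alignment score, and the record fields (`orf`, `protein`, `score`, `identity`, `aligned_length`, `protein_length`) and the function name `select_hits` are hypothetical, not a blastp output format.

```python
def select_hits(hits, min_identity, min_coverage=0.25):
    """Map each ORF to its unique best-scoring protein, after filtering.

    hits: list of dicts with the keys 'orf', 'protein', 'score',
    'identity' (percent), 'aligned_length' and 'protein_length'
    (the last two in amino acids).
    min_identity: 95 for mouse contigs, 40 for corn snake contigs.
    """
    by_orf = {}
    for hit in hits:
        by_orf.setdefault(hit["orf"], []).append(hit)

    selected = {}
    for orf, orf_hits in by_orf.items():
        best_score = max(h["score"] for h in orf_hits)
        best = [h for h in orf_hits if h["score"] == best_score]
        if len(best) != 1:
            continue  # best hit is not unique: discard
        hit = best[0]
        coverage = hit["aligned_length"] / hit["protein_length"]
        if hit["identity"] > min_identity and coverage >= min_coverage:
            selected[orf] = hit["protein"]
    return selected
```

The only species-specific parameter is the identity threshold, which is lower for the corn snake because its sequences are compared with proteins of a distant species.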

4.10.6 Gene expression quantification

We used the previously identified Trinity contig ORFs to compute gene expression levels. For each contig, we extracted the part of the ORF that aligned with the mouse protein and constructed a Bowtie2 (Langmead and Salzberg, 2012) index containing exclusively these sequences. We then aligned the RNA-seq reads to this index using TopHat 2.0.10 and computed expression levels for each selected Trinity contig using Cufflinks (Trapnell et al., 2012). All mapped reads were used for gene expression estimation, with the Cufflinks correction procedure for multi-mapped reads.

4.10.7 Gene expression normalization

We normalized gene expression levels across samples using a previously described approach (Brawand et al., 2011). Briefly, we identified the set of 200 genes that vary the least across samples, in terms of expression ranks. We then linearly scaled the FPKM expression levels to bring the medians of these 200 genes to the same value for all samples.
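The normalization principle can be sketched in Python as follows. This is an illustration and not the published procedure: the variance of the ranks is assumed as the measure of rank variation, the common target value is assumed to be the mean of the per-sample medians, ties are ranked in input order, and the medians are assumed to be non-zero. The function names are hypothetical.

```python
from statistics import mean, median, pvariance


def rank_values(values):
    """Rank values from 0 (lowest) upwards; ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def normalize_fpkm(fpkm, n_stable=200):
    """Scale each sample so its stable-gene median matches the others.

    fpkm: dict mapping a sample name to its list of FPKM values
    (same gene order in every sample).
    """
    samples = list(fpkm)
    n_genes = len(fpkm[samples[0]])
    ranks = {s: rank_values(fpkm[s]) for s in samples}
    # Variation of each gene's rank across samples
    rank_var = [pvariance([ranks[s][g] for s in samples]) for g in range(n_genes)]
    stable = sorted(range(n_genes), key=lambda g: rank_var[g])[:n_stable]
    medians = {s: median([fpkm[s][g] for g in stable]) for s in samples}
    target = mean(list(medians.values()))
    return {s: [v * target / medians[s] for v in fpkm[s]] for s in samples}
```

Because the scaling is linear and uses one factor per sample, the relative expression levels of genes within a sample are unchanged.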

Acknowledgements

I would like to thank Denis Duboule for having accepted me as a PhD student in his lab. He fosters an atmosphere of high scientific level and critical thinking, while providing the freedom for our own initiative and creativity. I appreciate the opportunities that he put at my disposal to gain experience and improve on different aspects of my academic career.

I am also thankful to José-Luis Gómez Skarmeta, Ivan Rodriguez and Marcelo Sánchez for agreeing to serve on my thesis jury. I look forward to discussing my results with such accomplished researchers.

I would like to acknowledge Moisés Mallo, my former supervisor, for the great collaboration that allowed for several mouse lines to be generated in a record amount of time. It is a pleasure to keep such a good relationship even after the end of my Master's. Thanks to Ana for having done the injections so skillfully.

I thank Michel Milinkovitch, who allowed me to use the snake eggs that were essential for my project. It was a luxury to have this opportunity to work so closely with him and his lab. I would also like to thank Adrien for bringing me the eggs and taking care of the snakes. I also thank Suzanne and Sophie, who were often helpful in assisting me or giving advice on snake-related issues.

Thanks to Leo and Joost, the two postdocs with whom I overlapped the longest and who were always happy to help, discuss ideas and give interesting input. It was a privilege to share the lab with Joost, the person who pioneered Hox gene studies in snakes and with whom I had extensive Evo-Devo related discussions. Sharing the office and the lab with Leo for over 4 years resulted in a great spirit of collaboration, support and friendship. Both of them were very important during my PhD, at both a professional and a personal level.

I would also like to acknowledge Sandra, a very talented and reliable help during the last year of my thesis. Her assistance was essential for completing the last necessary experiments for my project.

I am very thankful to Anouk, an extremely talented bioinformatician who worked a lot on the corn snake transcriptome data analysis and without whom I wouldn’t have had the resources to generate the interspecies comparative study. In addition to her expertise, she is an amazing person, always ready to help and patient enough to teach R and RNA analysis tricks to us.

I would also like to acknowledge Joska, an invaluable presence in the lab. One can only benefit from discussing with such an experienced and knowledgeable researcher. I would also like to acknowledge Aurélie for helping me with the French version of the summary. She is a very talented, motivated and hard-working Master student whom I have the pleasure of co-supervising. I thank Béné as well as Hanh and Julien for keeping our mouse lines safe. I would like to thank Eddie for bringing to a new level all discussions related to 3D chromatin conformation techniques, normalizations and interpretations, and Imane for the great time I had supervising her one-month internship in this lab.

I would like to acknowledge the transgenic facilities at EPFL (Isabelle Barde) and CMU (Nicolas Steiner) as well as the genomics facility at CMU (Mylène Docquier) and the bioinformatics facility at the EPFL (Jacques Rougemont and Marion Leleu).

I am thankful to all members of the lab in Geneva or Lausanne past and present for a great atmosphere, advice and critical input.

Finally I would like to thank my family and my friends. My parents have always supported me in any decision and have from early on encouraged my curiosity for nature and science. My brother for constant challenging and companionship and my grandmothers for unconditional care. My friends for the fun spent together, essential for surviving the frustrations that come with research. I am thankful to Laurent for his unconditional support and care during this entire period.

References

transgenic mice. Mechanisms of Development 52, 291-303.

Beisel, C., and Paro, R. (2011). Silencing chromatin: comparing modes and Esophageal Defects and Vertebral Transformations. Developmental Biology 177, 232-249. of transcriptional and post-transcriptional regulation are required to define the domain of Hoxb4 expression. Development 130, 2717-2728.

Brent, A.E., and Tabin, C.J. (2002). Developmental regulation of somite derivatives: muscle, cartilage and tendon. Current Opinion in Genetics & Development 12, 548-557.

Burge, C., and Karlin, S. (1997). Prediction of complete gene structures in human genomic DNA. J Mol Biol 268, 78-94.

Burke, A.C., Nelson, C.E., Morgan, B.A., and Tabin, C. (1995). Hox genes and the evolution of vertebrate axial morphology. Development 121, 333-346.

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., et al. (2006). Unusual gene order and organization of the sea urchin Hox cluster. Journal of Experimental Zoology Part B-Molecular and Developmental Evolution 306B, 45-58.

Carapuço, M., Nóvoa, A., Bobola, N., and Mallo, M. (2005). Hox genes specify

Elements Increases Responsiveness to Positional Information. Developmental Biology 171, 294-305.

paralogous genes hoxa-3 and hoxd-3 reveal synergistic interactions. Nature 370, 304-307.

protostome evolution. Nature 399, 772-776.

del Corral, R.D., and Storey, K.G. (2004). Opposing FGF and retinoid pathways: a

Hoxd Genes in Metanephric Kidney Development. PLoS Genet 3, e232. through heterochrony. Development 1994, 135-142.

Duboule, D. (1998). Hox is in the hair: a break in colinearity? Genes & Boundary Position and Regulates Segmentation Clock Control of Spatiotemporal Hox Gene Activation. Cell 106, 219-232.

Economides, K.D., Zeltser, L., and Capecchi, M.R. (2003). Hoxb13 mutations cause overgrowth of caudal spinal cord and tail vertebrae. Developmental Biology 256, 317-330.

Featherstone, M.S., Baron, A., Gaunt, S.J., Mattei, M.G., and Duboule, D. (1988). Hox-5.1 defines a homeobox-containing gene locus on mouse chromosome 2. Proceedings of the National Academy of Sciences of the United States of America 85, 4760-4764.

Fernandez-Teran, M., and Ros, M.A. (2008). The Apical Ectodermal Ridge: morphological aspects and signaling pathways. International Journal of Developmental Biology 52, 857-871.

Ferrier, D.E.K., and Minguillon, C. (2003). Evolution of the Hox/ParaHox gene clusters. International Journal of Developmental Biology 47, 605-611.

Feschotte, C., and Pritham, E.J. (2007). DNA Transposons and the Evolution of Organization of Two Hemichordate Hox Clusters. Current Biology 22, 2053-2058.

Friedli, M., Barde, I., Arcangeli, M., Verp, S., Quazzola, A., Zakany, J., Lin-Marq, N.,

regulation during digit development. Developmental Biology 306, 847-859.

Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., functional equivalence during paralogous Hox gene evolution. Nature 403, 661-665. sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protocols 8, 1494-1512.

Held I, L.J. (2014). How the Snake Lost its Legs: Curious Tales from the Frontier of Evo-Devo (Cambridge University Press).

Herault, Y., Beckers, J., Kondo, T., Fraudeau, N., and Duboule, D. (1998). Genetic cluster: Its dispersed structure and residual colinear expression in development. Proceedings of the National Academy of Sciences of the United States of America 101, 15118-15123.

Infante, C.R., Mihala, A.G., Park, S., Wang, J.S., Johnson, K.K., Lauderdale, J.D., and Menke, D.B. (2015). Shared Enhancer Activity in the Limbs and Phallus and Functional Divergence of a Limb-Genital cis-Regulatory Element in Snakes. Developmental Cell 35, 107-119.

Kamm, K., Schierwater, B., Jakob, W., Dellaporta, S.L., and Miller, D.J. (2006). Axial

Kidwell, M.G., and Lisch, D. (1997). Transposable elements as sources of

and microarray-based analysis of protein location. Nature Protocols 1, 729-748.

Lemons, D., and McGinnis, W. (2006). Genomic Evolution of Hox Gene Clusters. International Journal of Biological Sciences 2, 95-103.

Morimoto, M., Takahashi, Y., Endo, M., and Saga, Y. (2005). The Mesp2 transcription factor establishes segmental borders by suppressing Notch activity. Nature 435, 354-359.

Mortlock, D.P., and Innis, J.W. (1997). Mutation of HOXA13 in hand-foot-genital syndrome. Nat Genet 15, 179-180.

Noordermeer, D., Leleu, M., Schorderet, P., Joye, E., Chabaud, F., and Duboule, D. transcriptional regulation has diverged significantly between human and mouse. Nat Genet 39, 730-732.

Olivera-Martinez, I., Harada, H., Halley, P.A., and Storey, K.G. (2012). Loss of FGF-Dependent Mesoderm Identity and Rise of Endogenous Retinoid Signalling Determine Cessation of Body Axis Elongation. PLoS Biol 10, e1001415.

Oosterveen, T., Niederreither, K., Dollé, P., Chambon, P., Meijlink, F., and Deschamps, J. (2003). Retinoids regulate the anterior expression boundaries of 5′ Hoxb genes in posterior hindbrain. The EMBO Journal 22, 262-269.

Hox code. Gastroenterology 117, 1339-1351.

Pizette, S., Abate-Shen, C., and Niswander, L. (2001). BMP controls proximodistal outgrowth, via induction of the apical ectodermal ridge, and dorsoventral patterning in the vertebrate limb. Development 128, 4463-4474.

Putnam, N.H., Butts, T., Ferrier, D.E.K., Furlong, R.F., Hellsten, U., Kawashima, T.,

genomics and epigenomics strategies to study enhancer evolution. Philosophical Transactions of the Royal Society of London B: Biological Sciences 368.

Sarrazin, A.F., Peel, A.D., and Averof, M. (2012). A Segmentation Clock with Two-Segment Periodicity in Insects. Science 336, 338-341.

Schmidt, D., Schwalie, P.C., Wilson, M.D., Ballester, B., Gonçalves, Â., Kutter, C., Brown, G.D., Marshall, A., Flicek, P., and Odom, D.T. Waves of Retrotransposon Expansion Remodel Genome Organization and CTCF Binding in Multiple Mammalian Lineages. Cell 148, 335-348.

Schmidt, D., Wilson, M.D., Ballester, B., Schwalie, P.C., Brown, G.D., Marshall, A., Kutter, C., Watt, S., Martinez-Jimenez, C.P., Mackay, S., et al. (2010). Five-Vertebrate ChIP-seq Reveals the Evolutionary Dynamics of Transcription Factor Binding. Science 328, 1036-1040. disintegration with persistent anteroposterior order of expression in Oikopleura dioica. Nature 431, 67-71. resolution of ancient vertebrate genome duplications. Genome Research.

Soshnikova, N., and Duboule, D. (2009). Epigenetic Temporal Control of Mouse Hox Genes in Vivo. Science 324, 1320-1323.

Spitz, F., Gonzalez, F., and Duboule, D. (2003). A Global Control Region Defines a Chromosomal Regulatory Landscape Containing the HoxD Cluster. Cell 113, 405-417.

Spitz, F., Gonzalez, F., Peichel, C., Vogt, T.F., Duboule, D., and Zákány, J. (2001). Large scale transgenic and cluster deletion analysis of the HoxD complex separate an ancestral regulatory module from evolutionary innovations. Genes & Development 15, 2209-2214.

te Welscher, P., Zuniga, A., Kuijper, S., Drenth, T., Goedemans, H.J., Meijlink, F., and transcription in the spinal cord defines two regulatory subclusters. Development 139, 929-939.

Tschopp, P., and Duboule, D. (2011). A regulatory ‘landscape effect’ over the HoxD cluster. Developmental Biology 351, 288-296.

Tschopp, P., Sherratt, E., Sanger, T.J., Groner, A.C., Aspiras, A.C., Hu, J.K., Pourquie, Evolution of Chromosomal Domain Architecture. Cell Reports 10, 1297-1309.

Villar, D., Berthelot, C., Aldridge, S., Rayner, T.F., Lukk, M., Pignatelli, M., Park, T.J., Deaville, R., Erichsen, J.T., Jasinska, A.J., et al. (2015). Enhancer Evolution across 20 Mammalian Species. Cell 160, 554-566.

Vinagre, T., Moncaut, N., Carapuço, M., Nóvoa, A., Bom, J., and Mallo, M. (2010). Evidence for a Myotomal Hox/Myf Cascade Governing Nonautonomous Control of Rib Specification within Global Vertebral Domains. Developmental Cell 18,

Wellik, D.M., and Capecchi, M.R. (2003). Hox10 and Hox11 Genes Are Required to Globally Pattern the Mammalian Skeleton. Science 301, 363-367.

Wellik, D.M., Hawkes, P.J., and Capecchi, M.R. (2002). Hox11 paralogous genes are essential for metanephric kidney induction. Genes & Development 16, 1423-1432. patterning in snakes and caecilians: Evidence for an alternative interpretation of the Hox code. Developmental Biology 332, 82-89. development. Current Opinion in Genetics & Development 17, 359-366.

Zákány, J., Fromental-Ramain, C., Warot, X., and Duboule, D. (1997). Regulation of number and size of digits by posterior Hox genes: A dose-dependent mechanism with potential evolutionary implications. Proceedings of the National Academy of Sciences 94, 13695-13700.

Zákány, J., Kmita, M., Alarcon, P., de la Pompa, J.-L., and Duboule, D. (2001). Localized and Transient Transcription of Hox Genes Suggests a Link between Patterning and the Segmentation Clock. Cell 106, 207-217.

Zákány, J., Kmita, M., and Duboule, D. (2004). A Dual Role for Hox Genes in Limb Anterior-Posterior Asymmetry. Science 304, 1669-1672.

Zappavigna, V., Renucci, A., Izpisúa-Belmonte, J.C., Urier, G., Peschle, C., and

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., et al. (2008). Model-based Analysis of ChIP-Seq (MACS). Genome Biology 9, R137-R137.

Zuniga, A., Haramis, A.-P.G., McMahon, A.P., and Zeller, R. (1999). Signal relay by BMP antagonism controls the SHH/FGF4 feedback loop in vertebrate limb buds. Nature 401, 598-602.