Thesis
Reference
Cellular biology, genetics and genomics : A powerful liaison to reveal genotype phenotype correlations
ATTAR COHEN, Homa
Abstract
This study integrates multiple recent concepts and technologies to address a major aim in medical genetics: identifying susceptibility loci for complex diseases, including Trisomy 21. In particular, we establish a category of quantitative cellular phenotypes that are closely linked to clinical manifestations and perform genome-wide linkage and association studies to detect regulatory loci. Chapter I describes the set-up of an experimental protocol to measure a cellular phenotype in large population sizes. We choose to investigate reactive oxygen species (ROS) production as a fundamental cellular signaling phenotype related to the innate immune system that may be involved in clinical manifestations of Down syndrome patients.
Chapters II and III investigate the genetics and the biological significance of natural phenotypic variation, both in cellular phenotypes and in gene expression levels. Both studies aim at identifying loci that regulate the correlations between genetic and natural phenotypic variation in human cell lines of several cohorts.
ATTAR COHEN, Homa. Cellular biology, genetics and genomics : A powerful liaison to reveal genotype phenotype correlations. Doctoral thesis: Univ. Genève, 2008, no. Sc. 4003
URN : urn:nbn:ch:unige-7151
DOI : 10.13097/archive-ouverte/unige:715
Available at:
http://archive-ouverte.unige.ch/unige:715
UNIVERSITE DE GENEVE
FACULTE DES SCIENCES, Département de Zoologie et Biologie Animale, Professeur Denis Duboule
FACULTE DE MEDECINE, Département de Médecine Génétique et Développement, Professeur Stylianos E. Antonarakis
____________________________________________________________________________
Cellular biology, genetics and genomics : A powerful liaison to reveal genotype phenotype correlations
THESIS
presented to the Faculté des sciences of the Université de Genève to obtain the degree of Docteur ès sciences, mention biologique
by
Homa ATTAR COHEN
from Ramat Gan, Israel
Thesis No. 4003
Université de Genève, Genève
Table of Contents
Acknowledgements
Résumé en français
Rationale and Contents
Introduction
Chapter I: Significant decrease in reactive oxygen species (ROS) production in B lymphoblastoid cell lines of Down syndrome individuals
Chapter II: Genome-wide genetic dissection and identification of susceptibility loci for natural variation in ROS production in multiple human cell line cohorts
Chapter III: Genome-wide association of gene expression variation levels of GENCORD individuals: Comparison of regulatory loci across 3 cell-types
Conclusions and Perspectives
Appendix I
Appendix II
___________________________________________
Acknowledgements
I want to thank first of all the NCCR graduate school for giving me the opportunity to pursue my PhD. The framework of this graduate program gave me the chance to learn from prominent scientists, take high-quality courses, and do rotations during the first year. From this first year educational program, I created a scientific network with students and PIs in several laboratories, learned new methods and technologies, and started new collaborations.
I wish to thank several people, in the chronological order in which I met them, for their suggestions, insights, and experience in helping me shape my scientific interests and expand my knowledge. The first is Prof. Denis Duboule, whom I met on the day of my arrival at the graduate school and who told me, after a short description of his work, to go back home and think for at least two weeks about what I really wanted to do with my life before joining his group. I was certainly surprised by this answer, but I appreciated the suggestion. So I went home to think about my future life. Then I met Dr. Manolis Dermitzakis. His work, his scientific background, and his view on medical genetics and what evolutionary biology could contribute to this field convinced me that this was the way I wanted to pursue my studies: applying evolutionary thinking to medical genetics.
After Prof. Stylianos Antonarakis accepted me as a PhD student in his lab, I spent two months working with Manolis before he left for the Sanger Institute, bombarding him with questions based on my struggle to find a way into this new field. One sentence Manolis said to me struck me as deeply profound, and followed me during the coming years: “One day, there is a click in your mind and suddenly all is the same, and you understand all of it”. I am waiting for this moment to happen, but I have understood the truth in this sentence.
I want to thank Prof. Stylianos Antonarakis. It has been a privilege to be one of his students in many ways. Most importantly, his intelligence, combined with his enthusiasm and intensity in the quest for answers in research, has impressed and motivated me every day. I want to thank him for giving me the chance to evolve in his group, for trusting me to conduct expensive experiments, and for giving me unlimited possibilities of setting up collaborations with his many scientific friends and colleagues. I appreciated this aspect in its full significance, as it is the fruit of a life-long effort in establishing and maintaining excellent professional and personal relationships with other scientists. I also want to thank him for keeping his office door open to me at any time, always willing to discuss and to give me advice and encouragement. Finally, I wish to thank him for encouraging me to participate in one of the best and most interesting courses, the “Short Course on Medical and Experimental Mammalian Genetics” in Bar Harbor in summer 2006, where he also taught me and other students, in a beautiful little fishermen’s restaurant by the sea, how to eat lobster.
I wish to thank the whole lab for teaching me almost everything I know today about experimental and molecular genetics, especially Samuel Deutsch, Maryline Gagnegin, and Josiane Wyniger (former technician of Prof. Amalio Telenti) for their support, advice, and teaching. I would also like to thank the administrative and secretarial team for their smiles every morning and their efforts in helping and supporting us at any time.
In addition, I want to thank the laboratories and investigators with whom I worked on specific projects: Prof. Goncalo Abecasis, who accepted me in his group as a rotational student and who integrated me as a regular student in intensive graduate courses taught by Prof. Michael Boehnke and himself. I have learned a lot from them, and the knowledge of statistics that I used during my PhD project was acquired mostly in their scientific environment. My PhD project has been a close and intense collaboration with Prof. Karl-Heinz Krause and Dr. Karen Bedard, who taught me everything about reactive oxygen species and their role in the cell. I want to thank them for their investment in this project and the welcoming atmosphere of their group. It has been a pleasure to work together. I also thank the genomics platform, especially Dr. Patrick Descombes, Celine Delucinge Vivier, Didier Chollet, Mylene Docquier, and Olivier Schadt, for their constant support and welcoming atmosphere.
I want to thank Professors Jacques Beckmann, Manolis Dermitzakis, and Denis Duboule for agreeing to serve as examiners of this thesis.
Very special thanks go to Denis Cohen who has encouraged me on a professional and personal level all the way through my PhD. I also thank my family and my friends for supporting me during these four years and for accompanying me through my life.
___________________________________________
Summary (Résumé en français)
Recent discoveries in human genetics have advanced our knowledge of the medical, biological and evolutionary aspects of the human genome. To identify disease susceptibility loci, genome-wide association analysis is considered the most powerful approach, because it yields results on novel loci and on pathophysiologies that could not have been discovered by candidate-gene approaches. The main aim of human genetics is to identify the causes of human disease at the molecular and genetic level. Association analysis serves three aspects of this effort: first, predicting risk factors in the population and recommending adapted lifestyles; second, improving knowledge of the pathophysiology of a disease to guide therapeutic approaches; third, discovering and cataloguing sub-classes of clinical phenotypes that may underlie several diseases with distinct genetic aetiologies. A fundamental prerequisite for disease mapping is a genetic map and an annotation of genomic features for interpreting the identified loci. The human DNA sequence has been known since 2001 (Venter et al, 2001; International Human Genome Sequencing Consortium, 2001).
The annotation of this sequence is based on comparative genomics approaches using the genomes of model organisms such as yeast, Drosophila, mouse and chimpanzee, as well as more than 17 mammalian genomes (The C. elegans Genome Sequencing Consortium, 1998; Adams et al, 2000; Mouse Genome Sequencing Consortium, 2002; Chimpanzee Sequencing and Analysis Consortium, 2005). This information on conservation between species gives each sequence an evolutionary trajectory through time and links the sequence to its potential function. These analyses indicate that, in addition to genes, large fractions of non-coding DNA have been highly conserved among phyla and make up 3% of the human genome (Dermitzakis et al, 2002; Thomas et al, 2003). This finding changed the previous model, which held that biologically important information was contained only in coding sequences. Evidence for that earlier hypothesis came from many findings on Mendelian diseases associated with mutations in coding regions. In addition, the stronger conservation of exons relative to introns argued for the importance of coding DNA. However, the same strength of conservation was found for non-coding sequences, suggesting important functions within non-coding regions through as yet unknown mechanisms. The maintenance of regions of low or high conservation, both at the level of DNA sequence and at the level of gene expression across multiple species, allows us to decipher a genomic landscape shaped by evolution and by biological function for coding and non-coding sequences (Bergmann et al, 2004; Margulies et al, 2007; ENCODE Project Consortium, 2007). As a consequence, medical genetics today faces an ambitious question: what is the evolutionary history and the function of every nucleotide?
The main aim of this thesis is to integrate recent concepts and technologies to address an important question in genetics: the identification of susceptibility loci for complex diseases, including Trisomy 21. Specifically, we aim to establish a category of phenotypes that is directly linked to clinical manifestations. We call it the cellular quantitative phenotype, which covers any phenotype that can be measured as a characteristic of a whole-cell manifestation, such as adhesion capacity, proliferation rate or production of reactive oxygen species. Chapter I describes the set-up of an experimental protocol to measure cellular phenotypes in large numbers. We investigate variability in the production of reactive oxygen species (ROS) as a fundamental cellular phenotype of the innate immune system that may be involved in many clinical manifestations. Chapters II and III investigate the genetics and the biological significance of natural variation in ROS production and in gene expression. Both studies aim to identify loci that regulate the correlations between natural phenotypic and genetic variation. The chapters are entitled:
Chapter I Significant decrease in reactive oxygen species (ROS) production in B lymphoblastoid cell lines of Down syndrome individuals
Chapter II Genome-wide genetic dissection and identification of susceptibility loci for natural variation in ROS production in multiple human cell line cohorts
Chapter III Genome-wide association of gene expression variation levels of GENCORD individuals: Comparison of regulatory loci across 3 cell-types
The first part of this work (Chapter I) establishes an experimental approach to investigate quantitative cellular phenotypes that might be linked to the clinical manifestations of Trisomy 21. Like all chromosomal rearrangements, Trisomy 21 is a complex disease with multiple clinical manifestations. Human chromosome 21 (HSA21) is the smallest chromosome and contains 283 known protein-coding genes (Ensembl, release 48). The leading hypothesis on the molecular mechanisms underlying the clinical manifestations of Trisomy 21 is a "gene-dosage effect". This means that the change from the diploid (two-copy) to a trisomic (three-copy) state leads, through overexpression of the affected genes, to varied phenotypes. This hypothesis has been evaluated in multiple human cell lines and tissues as well as in trisomic mouse models such as Ts1Cje and Ts65Dn. Most of these studies agreed with the gene-dosage imbalance hypothesis, finding an overall overexpression of HSA21 genes. However, it remains difficult to pinpoint which gene is associated with which specific clinical phenotype. One of the few examples of a gene directly associated with a Trisomy 21 phenotype is APP and Alzheimer disease, a condition that appears in all patients after the age of 35. DYRK1 and DSCR1 are two further genes recently identified as relevant to Trisomy 21 phenotypes. Alterations of the immune system in trisomic individuals have been described in multiple studies, and the rate of ROS production is an important phenotype in the functioning of the innate immune system. An imbalance in ROS production was found in trisomic individuals in some studies but not in others. Moreover, no direct link between a specific gene and ROS deregulation in trisomic individuals has been established.
In this study, we measure the variation in ROS production within a population of healthy individuals (N=60) and compare it with a population of individuals with Trisomy 21 (N=76). We use Amplex Red, a quantitative enzymatic assay that detects H2O2, in human B lymphoblastoid cell lines (LCLs). We found highly variable phenotypes in the healthy population, suggesting wide natural variation in ROS production. The comparison between trisomic individuals and controls revealed strongly decreased ROS production in trisomic individuals. This difference between the two populations persists beyond the variation within each population.
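A between-group difference that stands out against within-group variation can be illustrated with a two-sample statistic. The summary does not say which test was used, so the sketch below simply assumes Welch's t statistic as one reasonable choice; the function name `welch_t` and all values are hypothetical and are not data from the thesis.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic: difference in means divided by the combined standard error.

    Each group contributes its own sample variance, so the two groups are
    not assumed to share a common variance.
    """
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical ROS readouts in arbitrary units (illustration only).
controls = [10.0, 12.0, 14.0]
trisomic = [4.0, 5.0, 6.0]
t_stat = welch_t(controls, trisomic)  # positive when controls produce more ROS
```

A large absolute value of the statistic indicates that the gap between group means is large relative to the spread inside each group, which is the sense in which a population difference "persists beyond" within-population variation.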
Natural variation is the substrate of evolution. Understanding the relationship between a phenotype and its genetic regulation is essential (see Chapter II). Genetic analysis of natural variation, to estimate its contribution to overall variation within a population, is now conceptually and technically feasible for complex traits at the genome-wide level. The importance and feasibility of such an effort have been demonstrated in multiple studies that used quantitative rather than qualitative phenotypes, such as gene expression levels. These studies of natural variation in gene expression in human cell lines from healthy individuals showed substantial differences. Moreover, 30% of this variation was attributed to genetic factors (Schadt et al, 2003; Cheung and Spielman, 2002; Cheung et al, 2003; Morley et al, 2004). These results provided new understanding and revealed the complexity of the genetic architecture of the genome (Pritchard and Cox, 2002; Stamatoyannopoulos, 2004; Gibson and Weir, 2005).
Given the large amount of natural variation in gene expression levels and the genetic control of this variation, we presumed that many quantitative traits might show similar characteristics. As described in the previous chapter, we found substantial differences in ROS production in the healthy population. Our hypothesis is that part of this variation may be under genetic control. In addition, we aim to identify regulators of this variation at the genome-wide level. To this end, we apply linkage and association analyses to identify the genetic factors underlying variation in ROS production. To determine the genetic contribution to the phenotype, we assessed ROS production in 113 individuals from the CEPH repository cell line collection (10 families). We found a heritability of 45%, which suggests that almost half of the natural variation observed between individuals is of genetic origin. Linkage and association analyses on several cell line collections from healthy populations suggest two loci, on HSA12 and HSA15, near the genes ETV6 and MEIS2, which are potential regulators of ROS production. Functional analyses support these results. Both genes are transcription factors involved in the differentiation and maintenance of haematopoietic stem cells early in development.
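The summary reports a heritability estimate from family data without stating the estimator. As a minimal sketch, and assuming the classical mid-parent/offspring regression (a textbook estimator, not necessarily the one used in the thesis), heritability can be read off as the regression slope of offspring phenotype on the mean parental phenotype. The function name and the trio values below are hypothetical.

```python
from statistics import mean

def midparent_offspring_heritability(trios):
    """Estimate narrow-sense heritability as the least-squares slope of
    offspring phenotype on mid-parent phenotype.

    trios: iterable of (father, mother, offspring) phenotype values.
    """
    midparents = [(father + mother) / 2 for father, mother, _ in trios]
    offspring = [child for _, _, child in trios]
    mean_mid, mean_off = mean(midparents), mean(offspring)
    covariance = sum((x - mean_mid) * (y - mean_off) for x, y in zip(midparents, offspring))
    var_mid = sum((x - mean_mid) ** 2 for x in midparents)
    return covariance / var_mid

# Hypothetical phenotype values in arbitrary units (illustration only).
toy_trios = [(38, 42, 45), (50, 50, 50), (62, 58, 55)]
h2 = midparent_offspring_heritability(toy_trios)
```

In this toy data set the offspring deviate from the population mean by half as much as their mid-parent value does, so the slope is one half. Real pedigrees with more than two generations are usually analysed with variance-component methods instead of a simple regression.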
This study is the first analysis of natural variation in ROS production. We demonstrate that a combined cellular, genetic and genomic approach can uncover phenotype-genotype correlations that could not have been revealed by a candidate-gene approach. Moreover, we show that ROS production is under genetic control, and it is very likely that genetics contributes substantially to many cellular phenotypes.
Few data exist on the mechanisms that link genetic susceptibility to a so-called complex disease. This is due to our limited knowledge of the pathophysiology underlying disease onset, and to limited access to the tissue or cell type involved in the disease (see Chapter III). To address these limitations, it is essential to know the genetic regulatory sites that control natural cellular processes. Moreover, comparing the regulation of these sites across different cell types will advance our understanding of the mechanisms of genetic regulation as opposed to micro-environmental regulation specific to the cell type. Genetic regulatory sites can be revealed by association analyses that use gene expression levels as the phenotype in several cell types.
For this purpose, we are establishing a new cohort of newborns consisting of B and T lymphoblast cells, as well as fibroblasts, which we have derived from the umbilical cord of 350 individuals to date. This cohort is named GENCORD. For each cell type, the expression level of all genes in the genome is measured with the Illumina expression array. Each individual is genotyped at 550,000 loci with the Illumina genotyping array. The association study between phenotype and genotype in the different cell types will indicate the fraction of genes under the control of a regulator that acts on transcription levels in a single cell line, in lymphoblastoid-type cell lines, or independently of the cell's origin.
A considerable fraction of natural phenotypic variation is now thought to contribute to susceptibility to complex diseases (Risch and Merikangas, 1996; Risch, 2000; Pritchard, 2001; Reich and Lander, 2001; Clark, 2003; Lohmueller et al, 2003). Despite considerable progress in detecting loci associated with complex diseases, the underlying mechanisms remain more difficult to identify. This project has the potential to bring new insight into the regulation of gene expression in human cellular environments and to shed light on the mechanisms of genetic regulation of transcription across the genome.
References
Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF et al. The genome sequence of Drosophila melanogaster. Science. 2000 Mar 24;287(5461):2185-95
Bergmann S, Ihmels J, Barkai N. Similarities and differences in genome-wide expression data of six organisms. PLoS Biol. 2004 Jan;2(1)
Cheung VG, Spielman RS. The genetics of variation in gene expression. Nat Genet. 2002;32:522-525
Cheung VG, Conlin LK, Weber TM, Arcaro M, Kuang-Yu J, Morley M, Spielman RS. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet. 2003;33:422-425
Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005;437:69-87
Clark AG. Finding genes underlying risk of complex disease by linkage disequilibrium mapping. Curr Opin Genet Dev. 2003 Jun;13(3):296-302
Dermitzakis ET, Reymond A, Lyle R, Scamuffa N, Ucla C, Deutsch S, Stevenson BJ, Flegel V, Bucher P, Jongeneel CV et al. Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature. 2002 Dec 5;420(6915):578-82
ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007 Jun 14;447(7146):799-816
Gibson G, Weir B. The quantitative genetics of transcription. Trends Genet. 2005 Nov;21(11):616-23
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001 Feb 15;409(6822):860-921
Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003;33:177-182
Margulies EH, Cooper GM, Asimenos G, Thomas DJ, Dewey CN, Siepel A, Birney E, Keefe D, Schwartz AS, Hou M et al. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res. 2007 Jun;17(6):760-74
Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004 Aug 12;430(7001)
Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002 Dec 5;420(6915):520-62
Pritchard JK. Are rare variants responsible for susceptibility to complex disease? Am J Hum Genet. 2001;69:124-137
Pritchard JK, Cox NJ. The allelic architecture of human disease genes: common disease-common variant...or not? Hum Mol Genet. 2002 Oct 1;11(20):2417-23
Reich DE, Lander ES. On the allelic spectrum of human disease. Trends Genet. 2001;17:502-510
Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996;273:1516-1517
Risch NJ. Searching for genetic determinants in the new millennium. Nature. 2000;405:847-856
Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G et al. Genetics of gene expression surveyed in maize, mouse and man. Nature. 2003;422:297-302
Spielman RS, Bastone LA, Burdick JT, Morley M, Ewens WJ, Cheung VG. Common genetic variants account for differences in gene expression among ethnic groups. Nat Genet. 2007 Feb;39(2):226-31
Stamatoyannopoulos JA. The genomics of gene expression. Genomics. 2004 Sep;84(3):449-57
The C. elegans Genome Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282:2012-2018
Thomas JW, Touchman JW, Blakesley RW, Bouffard GG, Beckstrom-Sternberg SM, Margulies EH, Blanchette M, Siepel AC, Thomas PJ, McDowell JC et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature. 2003 Aug 14;424(6950):788-93
Venter JC et al. The sequence of the human genome. Science. 2001 Feb 16;291(5507):1304-51
___________________________________________
Rationale and Contents
Human genetics is in the midst of revealing fundamental medical, biological and evolutionary patterns of the human genome. Genome-wide association analysis, which scans the whole genome for susceptibility to disease, provides insight into novel disease loci and their pathophysiology that would not have been detected by hypothesis-driven candidate-gene approaches. The major aim of human genetics is to identify the causes of human disease at the molecular and genetic level. Association analysis holds the promise to satisfy three aspects of this endeavor: first, to predict risk factors in the population and propose adapted lifestyles; second, to improve our knowledge of the pathophysiology of disease to direct therapeutic approaches; and third, to detect and catalogue sub-classes of clinical phenotypes underlying several diseases with distinct genetic aetiology. A fundamental prerequisite in disease mapping is a reliable genetic map and annotation of genomic features for interpretation of identified loci. The human sequence is now known (Venter et al, 2001; International Human Genome Sequencing Consortium, 2001) and largely annotated, based on comparative genomics approaches using the genomes of important model organisms such as yeast, Drosophila, mouse and chimpanzee (The C. elegans Genome Sequencing Consortium, 1998; Adams et al, 2000; Mouse Genome Sequencing Consortium, 2002; Chimpanzee Sequencing and Analysis Consortium, 2005) and more than 17 mammalian species genomes. Comparative genomics compares sequence-based features of various genomes to trace the original appearance, divergence, duplication and eventual loss of a sequence. This information provides, for each sequence, a historical trajectory through evolutionary time and links sequence to potential function. Evidence from these analyses indicated that, in addition to genes, large fractions of non-coding DNA were highly conserved among phyla and made up 3% of the human genome (Dermitzakis et al, 2002; Thomas et al, 2003). This finding challenged the previous belief that biologically important information was contained only in coding sequences. Evidence for this previous assumption was based on many findings on Mendelian diseases associated with mutations in coding regions. Also, higher conservation levels in exons as compared to introns argued for the importance of coding DNA. However, the same conservation strength was detected for non-coding DNA in genome-wide comparative analysis, suggesting important function within non-coding regions with as yet unknown mechanisms. The maintenance of regions of low or high conservation, for both DNA sequence and gene expression levels across multiple species, provides insight into a genomic landscape shaped by evolutionary time and biological function across coding and non-coding DNA (Bergmann et al, 2004; Margulies et al, 2007; ENCODE Project Consortium, 2007). As a consequence, medical genetics is confronted today with an ambitious question: what is the history and the function of every single nucleotide?
The main aim of this thesis is to integrate multiple recent concepts and technologies to address a major aim in genetics: identifying susceptibility loci for complex diseases, including Trisomy 21. In particular, we aim to establish a category of phenotypes that are more closely linked to clinical manifestations. We call them cellular quantitative phenotypes: any phenotype that can be measured as a characteristic of a whole-cell manifestation, such as adhesion capacity, proliferation rate or production of reactive oxygen species.
Chapter I describes the set-up of an experimental protocol to measure a cellular phenotype in large population sizes. We choose to investigate reactive oxygen species (ROS) production as a fundamental cellular signaling phenotype related to the innate immune system that may be involved in clinical manifestations of Down syndrome patients. Chapters II and III investigate the genetics and the biological significance of natural phenotypic variation, both in cellular phenotypes and in gene expression levels. Both studies aim at identifying loci that regulate the correlations between genetic and natural phenotypic variation. The chapters are entitled:
Chapter I Significant decrease in reactive oxygen species (ROS) production in B lymphoblastoid cell lines of Down syndrome individuals
Chapter II Genome-wide genetic dissection and identification of susceptibility loci for natural variation in ROS production in multiple human cell line cohorts
Chapter III Genome-wide association of gene expression variation levels of GENCORD individuals: Comparison of regulatory loci across 3 cell-types
The first part of this work (Chapter I) consists of establishing an experimental approach to investigate quantitative cellular phenotypes that might be closely related to clinical manifestations of Down syndrome. Like all chromosomal rearrangements, Down syndrome (DS) is a complex disease with multiple clinical manifestations. Human chromosome 21 (HSA21) is the smallest chromosome and has 283 known protein-coding genes (Ensembl, release 48). The leading hypothesis on the molecular mechanisms at the basis of DS clinical manifestations is a “gene-dosage effect”: the change from two to three copies of HSA21 genes in DS leads, through up-regulation of gene expression levels, to the various DS phenotypes. This hypothesis was tested in multiple human cell lines and tissues (mainly cerebellum, cerebellar cortex, astrocytes, primary T-cells, primary fibroblasts and secondary B-cells) as well as in multiple tissues of DS mouse models such as Ts1Cje and Ts65Dn (whole brain, midbrain, cortex, cerebellum, testis, kidney, lung, liver, heart, and skeletal muscle). Methods included SAGE (Chrast et al, 2000), differential display PCR (Bahn et al, 2002), comparative hybridization cDNA array (FitzPatrick et al, 2002), microarray (Mao et al, 2003; Saran et al, 2003; Amano et al, 2004; Giannone et al, 2004; Kahlem et al, 2004; Dauphinot et al, 2005), and TaqMan real-time PCR (Lyle et al, 2004; Prandini et al, 2007). The majority of these studies supported the proposed “gene-dosage imbalance” hypothesis, finding an overall 1.5-fold up-regulation of HSA21 genes. Some analyses also suggest deregulation of gene expression levels across all chromosomes, which may be a consequence of the primary “gene-dosage imbalance” of HSA21 genes (Saran et al, 2003). However, it remains difficult to pinpoint which gene is associated with a specific clinical DS phenotype. One of the few known gene-phenotype associations in DS is that between APP (on HSA21) and Alzheimer disease, a condition common to all DS patients after the age of 35. For recent reviews of the role of APP in the pathophysiology of Alzheimer disease in DS, see Head et al (2007) and Lott et al (2006). DYRK1 and DSCR1, both located on HSA21, are two genes recently identified as relevant to DS phenotypes. Extensive mathematical and experimental approaches using several transgenic mice and 4 DS mouse models implicated them in immune-related features of DS, suggesting that a sensitive, developmental-time-dependent regulatory circuit exists between the expression levels of DYRK1 and DSCR1 and the NFAT pathway, and that mutation of this pathway produces features resembling many DS phenotypes (Arron et al, 2006). An additional example of an HSA21 gene-phenotype correlation is ETS2, a gene conferring a decreased incidence of intestinal tumors in trisomic DS mouse models (Sussan et al, 2008). That study showed that the expression level of ETS2 was directly correlated with cancer incidence. Despite these advances, most of the genes on HSA21 still remain to be characterized for their specific role in DS clinical manifestations.
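The 1.5-fold figure attached to the gene-dosage hypothesis follows from simple copy-number arithmetic: three copies against two. The sketch below works this out and shows how an observed DS/control expression ratio could be compared with that expectation. It assumes expression scales linearly with copy number; the function names and the tolerance of 0.15 are hypothetical choices for illustration and do not come from the studies cited above.

```python
import math

def expected_fold_change(copies_in_ds: int = 3, copies_in_control: int = 2) -> float:
    """Expected DS/control expression ratio if expression scales linearly with copy number."""
    return copies_in_ds / copies_in_control

def classify_ratio(observed_ratio: float, expected: float = 1.5, tolerance: float = 0.15) -> str:
    """Compare an observed DS/control ratio with the dosage expectation.

    The tolerance is an arbitrary illustrative threshold, not a published cut-off.
    """
    if abs(observed_ratio - expected) <= tolerance:
        return "dosage-proportional"
    if observed_ratio < expected:
        return "below dosage expectation"
    return "above dosage expectation"

fold = expected_fold_change()   # 3 copies / 2 copies
log2_fold = math.log2(fold)     # the same ratio on the log2 scale used for array data
```

On the log2 scale commonly used for expression arrays, the expected shift is well under one unit, which is one reason why detecting dosage effects gene by gene requires sensitive quantification.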
A complementary approach in DS research is to investigate the phenotype-genotype correlation based on phenotype mapping. Although establishing genotype-phenotype relationships for complex disorders remains challenging, genome-wide linkage and association analyses have recently succeeded in identifying loci that contribute to the aetiology of multi-factorial traits such as invasive pneumococcal disease, bacteremia, malaria and tuberculosis (Khor et al, 2007), Crohn’s disease (Rioux et al, 2007) and host control of HIV-1 (Fellay et al, 2007). As for many complex diseases, success in identifying susceptibility loci may be limited by the complexity of the disease aetiology and/or by limited sample size. To reduce the complexity of the clinical manifestation, Carlson et al (2004) proposed dividing phenotypes into sub-phenotypes, thereby enhancing the power to detect the underlying loci.
Here, we study one such sub-phenotype that may be one of several underlying causes of premature ageing and increased susceptibility to infections in DS patients (Cuadrado and Barrena, 1996; Yang et al, 2002; de Hingh et al, 2005): reactive oxygen species (ROS) production. ROS is involved in the innate immune response against certain pathogens (e.g. Staphylococcus aureus, Pseudomonas, Aspergillus). It is produced in phagocytes such as neutrophils in response to certain bacterial or fungal infections, and therefore constitutes a model phenotype to study. ROS production and the first human disease associated with it, chronic granulomatous disease (CGD), were described and characterized by Baldridge and Gerard (1933) and Quie et al (1967), respectively. CGD is the first immune deficiency for which a molecular mechanism, namely lack of ROS production, was established. Immune deficiencies, such as increased susceptibility to both viral and bacterial pathogens, are commonly observed in individuals with Down syndrome (Cuadrado and Barrena, 1996; Yang et al, 2002). Immune-related features and hematopoiesis are altered in DS, and potential underlying physiological features include
thymus involution, abnormal proportions of mature B and T-cells in the peripheral blood, cellular dysfunction and autoimmune phenomena (Cuadrado and Barrena, 1996; Douglas, 2005). In addition, a molecular link between increased ROS, DS and Alzheimer disease, involving HSA21 genes such as SOD1 and APP, was demonstrated in neuronal cells of patients (Busciglio and Yankner, 1995; Zana et al, 2007). However, many conflicting results exist for each immune-related aspect of DS, which may be explained either by limited sensitivity or specificity of the assays used, or by natural variation in ROS production, which has not been quantified so far. To date, no direct correlation between specific DS clinical phenotypes and an imbalance in ROS production has been established.
We set out to measure ROS production in B-lymphoblastoid cell lines of 60 HapMap individuals (acquired from the CEPH cell line repository) and 76 Down syndrome patients (from the US, Italy and the AnEUploidy Consortium) in order to quantify variation between individuals, and between DS and controls. We first measured ROS production for 21 unrelated CEPH individuals in Epstein-Barr virus (EBV) transformed lymphoblastoid cell lines (LCL) using AmplexRed, a quantitative enzymatic assay that detects H2O2. We found highly variable phenotypes in this population. To analyze the reproducibility of the assay and the robustness of the cellular ROS phenotype, we repeated the whole experiment for the same individuals some weeks later; the results indicated high reproducibility for both the assay and the phenotype. We concluded from this experiment that there is substantial natural variation in ROS production between healthy individuals and that this variation is reproducible.
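One simple way to quantify the reproducibility described above is the correlation between the two measurement rounds across the same cell lines. The sketch below is an illustration only: the text does not state which reproducibility measure was used, so the choice of the Pearson coefficient, the function name `pearson_r` and the readout values are assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical H2O2 readouts (arbitrary units) for five cell lines,
# measured in two independent rounds some weeks apart.
round_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
round_2 = [1.1, 1.9, 3.2, 3.9, 5.1]
r = pearson_r(round_1, round_2)
print(f"replicate correlation r = {r:.3f}")
```

A coefficient close to 1 would indicate that the ranking of individuals is preserved between rounds, which is the property needed before treating ROS production as a stable quantitative phenotype.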
Next, we assessed ROS production from B-lymphoblastoid cell lines of 60 HapMap individuals and 76 Down syndrome patients. Again, we found large inter-individual natural variation within the 60 HapMap individuals. Interestingly, the comparison between cases and controls (DS and HapMap) revealed a highly decreased production of
ROS in DS (p < 4.9e-10, two-sided Kolmogorov-Smirnov test). The molecular mechanism underlying the strongly reduced ROS production in DS still needs to be explored.
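The two-sided Kolmogorov-Smirnov comparison mentioned above can be sketched as follows. This is a minimal illustration, not the analysis code of the study: the function name `ks_two_sample` and the sample values are hypothetical, and the p-value uses the plain asymptotic Kolmogorov series, which is only an approximation for small samples.

```python
from math import exp, sqrt


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for a two-sided two-sample Kolmogorov-Smirnov test.

    D is the largest absolute gap between the two empirical CDFs.
    p comes from the asymptotic Kolmogorov distribution (approximate).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    # Walk through the pooled values; once one sample is exhausted the
    # gap between the CDFs can only shrink, so the loop may stop there.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = d * sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the p-value is ~1 anyway.
        return d, 1.0
    series = sum((-1) ** (k - 1) * exp(-2.0 * (k * lam) ** 2) for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


# Hypothetical ROS readouts (arbitrary units), not data from the study.
controls = [5.1, 6.3, 4.8, 7.0, 5.9, 6.6]
cases = [2.0, 3.1, 2.7, 4.9, 3.5, 2.2]
d_stat, p_value = ks_two_sample(controls, cases)
print(f"D = {d_stat:.3f}, approximate p = {p_value:.3g}")
```

The test compares whole distributions rather than means, which suits a phenotype with large inter-individual variation.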
Overall, this study highlights the importance of assessing natural variation between individuals, as it may be large and may dramatically influence the outcome of individual-based case-control studies. The finding of decreased ROS in B-lymphoblastoid cell lines contrasts with previous findings in neurons and astrocytes, where ROS was increased. The discrepancy between the studies may have several origins. (1) The considerable amount of natural variation might bias results when sample sizes are too small.
Most studies concentrated on comparisons of one or two cases and controls. It might be of interest to reassess ROS production in neuronal cell lines of several individuals to estimate natural variation in those cell-types. (2) Multiple molecular pathways may underlie the regulation of ROS production within the same cell-type and across cell-types.
Therefore, it is crucial to identify molecular mechanisms in various cell-types of the same individual to test whether the action of these regulators is conserved across cell-types.
The natural variation of ROS may be regulated by genetic loci, and their identification would greatly enhance our understanding of the biological pathways involved in this phenotype (see chapter II). (3) With the knowledge gained from the second point, we can compare genetic regulation across multiple cell-types to estimate what fraction of phenotypic variation and its regulation is governed by genetic factors and what fraction by cell-specific differentiation and physiological environment. Analyses of ROS production across multiple cell lines may shed light on the cell-specific regulation of ROS in DS (see chapter III).
Natural variation serves as the substrate for evolution. As such, understanding the relationship between phenotypic variation and its regulation is crucial. In Chapter II, we
aim at dissecting natural variation of ROS production to identify the genetic loci controlling it by genome-wide linkage and association studies. The importance and feasibility of such an endeavor were demonstrated in multiple successful studies that used quantitative rather than qualitative phenotypes for mapping complex traits. The first study to show the amount of gene expression level variation under genetic control was provided by Brem and colleagues (Brem et al, 2002) in budding yeast. Overall, half of the genome showed differential gene expression and the median proportion of the genetic contribution to gene expression variation was 84%.
About one third of mapped loci were in the vicinity of the gene (cis-regulators), and a few were found to regulate genes in trans (more than 10 kb away from the linked gene). These results suggested that genetic variation is characterized by a high rate of cis-acting alleles and a small number of trans-acting alleles with widespread transcriptional effects.
Subsequently, gene expression levels in several species, including in human cell lines, were assessed for inter-individual variation. Interestingly, gene expression levels varied substantially among healthy individuals, and 30% of this variation could be attributed to genetic factors (Cheung and Spielman, 2002; Cheung et al, 2003; Schadt et al, 2003; Morley et al, 2004). These results provided a framework for analyzing the genetic architecture underlying quantitative phenotypes, and the genetics and genomics of transcription, on a genome-wide level and across multiple species (Pritchard and Cox, 2002; Stamatoyannopoulos, 2004; Gibson and Weir, 2005).
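The distinction between cis- and trans-acting loci described for the yeast study can be made concrete with a distance rule. The sketch below assumes a 10 kb window around the gene, following the figure quoted in the text; the function name, the treatment of the window as symmetric around the gene boundaries and the coordinates are illustrative assumptions.

```python
def classify_regulator(marker_chrom, marker_pos, gene_chrom, gene_start, gene_end,
                       window_bp=10_000):
    """Label a linked marker as 'cis' or 'trans' relative to the regulated gene.

    A marker counts as cis when it lies on the same chromosome and within
    window_bp of the gene boundaries; everything else is trans.
    """
    if marker_chrom != gene_chrom:
        return "trans"
    if gene_start - window_bp <= marker_pos <= gene_end + window_bp:
        return "cis"
    return "trans"


# Hypothetical coordinates for illustration only.
print(classify_regulator("chrIV", 105_000, "chrIV", 100_000, 102_000))  # near the gene
print(classify_regulator("chrIV", 150_000, "chrIV", 100_000, 102_000))  # same chromosome, far away
print(classify_regulator("chrXII", 101_000, "chrIV", 100_000, 102_000))  # other chromosome
```

The window size is a parameter because the appropriate distance depends on genome size and marker density; a window that suits yeast is much too small for the human genome.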
In chapter II, we aimed to identify key players within the pathways of ROS production, which may be related to DS immune-related phenotypes. While absence of ROS production in chronic granulomatous disease is well established, little attention has hitherto been paid to the variability of the amplitude of ROS production in the general population. Observations (Krause, unpublished data) suggest that there are high and low
ROS producers among the general population, a finding which we confirm and quantify in chapter I. High ROS producers are expected to be better protected from infection, while low ROS producers are expected to be at increased risk. However, the genetic factors underlying the variability of ROS generation in the healthy population are not understood. Polymorphisms in the subunits of the phagocyte NADPH oxidase, in particular in the p22phox/CYBA gene, do not explain the ROS variability (reviewed in Bedard and Krause, 2007). Most likely, factors upstream of the enzyme itself (in particular transcriptional regulators) play an important role. To determine which fraction of the natural variation between individuals is genetic, we assessed ROS production in 113 CEPH individuals from 10 large families (acquired from the CEPH cell line repository). The measurements were performed as described in chapter I. First evidence for a genetic contribution came from heritability analysis, which suggested a heritability of 45%: of the overall natural variation in ROS production in the 10 CEPH families, almost half can be attributed to genetic factors. For the same individuals, we performed genome-wide linkage analysis to map regulatory loci of ROS variation. Based on 1000 simulations to define the nominal significance threshold, the results suggest 2 loci, on HSA12 and HSA15, with LOD scores of 4.64 (p < 1.00e-6) and 4.81 (p < 1.00e-6) respectively, as the strongest and genome-wide significant linkage hits. To refine the linkage results and replicate the 2 identified loci, we used 200 individuals from the KORA sample cohort for genome-wide association analysis. Among the top association signals, we replicated the same 2 loci on HSA12 and HSA15 that were detected in the linkage analysis. The locus on HSA12 is located 25 kb upstream of the transcription factor ETV6. This gene is required for hematopoiesis and maintenance of the developing vascular network (Wang et al, 1998).
The locus on HSA15 is located 166 kb upstream of the transcriptional start site of MEIS2, which encodes a highly conserved homeobox
protein and is associated with transcriptional regulation. MEIS2 function has also been associated with growth control and differentiation during embryogenesis and homeostasis. To follow up and validate these results, we performed phenotyping, genotyping and association analysis for an additional 100 individuals from the GENCORD sample cohort (described in chapter III) and 60 individuals of the HapMap sample collection.
Associations support the initial findings for the loci near ETV6 and MEIS2. Furthermore, we performed genome-wide correlation analysis between gene expression level variation (for all genes expressed in B-lymphoblastoid cell lines) and ROS variation for the 60 HapMap individuals. This approach aims at providing independent biological evidence for specific genes that may regulate or modify ROS production. SPRED1, the neighbouring gene of MEIS2, showed one of the strongest correlations (P < 1.00e-6).
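The genome-wide correlation analysis described above amounts to correlating, gene by gene, the expression levels across individuals with the ROS phenotype of the same individuals, and then ranking the genes. The following sketch illustrates that idea under stated assumptions: the correlation measure used in the study is not specified here, so a Pearson coefficient computed from standardized values is assumed, and the gene labels and numbers are hypothetical.

```python
from statistics import mean, pstdev


def correlation(xs, ys):
    """Pearson correlation from population standard deviations (0.0 if undefined)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no information
    cov = mean((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / (sd_x * sd_y)


def rank_genes_by_correlation(expression, phenotype):
    """Return (gene, r) pairs sorted by decreasing absolute correlation.

    expression maps a gene label to its expression levels, one value per
    individual, in the same individual order as the phenotype list.
    """
    scored = [(gene, correlation(levels, phenotype)) for gene, levels in expression.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data for four individuals; labels are placeholders.
ros = [1.0, 2.0, 3.0, 4.0]
expression = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [5.0, 5.0, 6.0, 5.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.5],
}
ranked = rank_genes_by_correlation(expression, ros)
for gene, r in ranked:
    print(f"{gene}: r = {r:+.3f}")
```

Ranking by absolute value keeps both positively and negatively correlated genes in view. In a real genome-wide scan, the resulting p-values would also need correction for the number of genes tested.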
Finally, we performed over-expression analysis of the identified candidate genes in LCLs to assess their direct effect on ROS variation. We selected one cell line with low ROS production and one cell line with high ROS production, and transfected each with a plasmid containing one of the candidate genes: ETV6 on HSA12, or MEIS2, SPRED1 or RASGRP1 (three neighbouring genes) on HSA15. For each transfected gene and cell line, we compared ROS production with that of the control cell line containing GFP. We detected a 3-fold down-regulation of ROS when overexpressing MEIS2 in the high ROS producer cell line.
This study is the first analysis of natural variation in ROS production. We show that this fundamental cellular phenotype is regulated by two major loci genome-wide, suggesting ETV6 and MEIS2 and the adjacent SPRED1 and RASGRP1 as novel candidate genes involved in the etiology of a complex phenotype. Applying genome-wide approaches to quantitative cellular phenotypes as a sub-phenotype for disease status
represents a powerful approach in advancing our understanding of the genetics of variation and associated disease.
The next project (chapter III) investigates genetic regulation of gene expression levels across multiple cell-types. For the first time, we address the question of how genetic regulation is maintained across diverse cell-types on a genome-wide scale. The significance of this question rests on the fact that few data are available on mechanisms that relate genetic susceptibility to a disease to physiological features. The physiological features underlying a disease may be regulated in a cell-type specific manner. Therefore, distinguishing overall genetic risk for common diseases from cell-type specific regulation of phenotypic expression is crucial to advance our understanding of the impact of genetic factors on physiological and molecular mechanisms.
Sequence variation is thought to underlie a considerable fraction of natural phenotypic variation between individuals, including the risk for common complex disorders (Risch and Merikangas, 1996; Risch, 2000; Pritchard, 2001; Reich and Lander, 2001; Clark, 2003; Lohmueller et al, 2003). There has been a deluge of genome-wide studies examining the genetic basis of complex diseases by exploring the effects of genetic variation such as SNPs and CNVs (discussed in Beckmann et al, 2007) on disease. Either discrete traits for disease status (especially for complex diseases) or quantitative traits were used in this regard. Examples for quantitative traits are physiological measures such as blood pressure and hypertension (Dominiczak et al, 2004;
Mein et al, 2004; Deng, 2007), lipid-levels for diabetes (Wellcome Trust Case Control Consortium, 2007) and gene expression levels in yeast, maize, mice and humans (Brem et al, 2002; Schadt et al, 2003; Cheung et al, 2003; Morley et al, 2004; Stranger et al, 2005, 2007; Spielman et al, 2007). A fraction of these genetic polymorphisms may act
on gene expression variation, either by altering the sequence of the protein (non-synonymous polymorphism), by disrupting the mRNA splicing pattern, or by affecting a regulatory element of a gene or of a whole cluster of genes. An example of the latter mechanism, linking a genetic variant to a disease outcome through an identified molecular mechanism, was reported in the recent study of De Gobbi et al (2006). DNA of individuals with the inherited blood disorder alpha thalassemia was screened for variants in the known alpha-globin gene cluster, but no mutations were detected. When upstream regions were resequenced and analyzed by chromatin immunoprecipitation and dense oligonucleotide arrays, a SNP was identified that created newly opened chromatin and a new GATA1 binding site, resulting in deregulation of the downstream alpha-globin genes.
This SNP was neither in a coding nor in a conserved non-coding region on HSA16, emphasizing the need for systematic investigation of all variants for potential regulatory mechanisms. Few such examples with identified mechanisms are known to date, and both the description and characterization of functional consequences of variants are needed.
With the deluge of genome-wide association analyses, the landscape of genetic and genomic regulation could be analyzed across the whole genome for various diseases. The emerging picture shows substantial numbers of cis-regulatory regions, distributed across the whole genome. Also, depending on the aetiology and evolutionary history of the disease, different genetic architectures were observed (Gibson and Goldstein, 2007).
Furthermore, gene expression level differences among populations were attributed to differences in allele frequencies between populations, reflecting basic genetic dynamics in human populations such as selection and drift (Spielman et al, 2007; Stranger et al, 2007).
Most of the studies on gene expression levels were performed in lymphoblastoid B cells of individuals and sample sizes were limited due to practical reasons. Even though the newest study of Stranger and colleagues (Stranger et al, 2007) assessed association for
eQTL in 270 individuals, they originated from 4 distinct human populations (HapMap Yoruban, Chinese, Japanese and Caucasian), each of which contributes 45 to 60 individuals.
It is crucial to enlarge sample sizes and the collection of cell-types and tissues for genome-wide approaches. Several “biobanks” are under way to provide larger cohorts for cell lines and DNA. We have chosen to establish a sample cohort that allows several cell lines to be created per individual. At the University Hospital in Geneva, more than 3000 babies are born each year. With the consent of the parents and the approval of the ethical committee, we sample umbilical cord and blood from these newborn children. We establish primary fibroblast cell lines, and extract lymphoblastoid cells from the blood to derive primary lymphoblastoid, primary T-cell and EBV-transformed B-cell lines. To date, the sample collection contains more than 350 individuals with 3 cell-types per individual. The collection, called GENCORD for Geneva Umbilical Cord sample collection, has several advantages compared to other cell line collections. First, it is an ongoing collection, allowing sample sizes to increase. Second, all individuals have exactly the same age, and have not been exposed to variable environmental and lifestyle conditions. Third, specific cell-types might be more closely related to the disease or phenotype under investigation. Therefore, the GENCORD sample collection may serve in future case-control studies as an ideal control population for phenotypes measured in any of the 3 cell-types. Fourth, and most importantly, the establishment of several cell-types per individual allows us to ask a new and fundamental question: what fraction of regulatory loci act similarly, and what fraction act differently, in cell-type specific biological environments within the same individual? The answer will indicate quantitatively whether a genetic risk factor measured in one cell-type can be assumed to hold in the others. The increased sample size and homogeneous age and ethnicity
(Caucasians only) will enhance statistical power to detect reliable association signals. We genotyped 226 individuals of this sample cohort using the Illumina Human Hap500K array, and whole-genome Illumina Human expression will be measured for all 3 cell-types in an initial 96 individuals. Association analysis within and among cell-types will provide a first glimpse into the number and distribution of regulatory loci that act in a cell-specific, lymphoblast-specific or non-cell-type specific manner. Cell-type specific regulation of gene expression may modulate a considerable fraction of gene expression levels in a cell. Therefore, the assessment of genetic risk factors for common disorders mediated through gene expression levels needs to be performed in the relevant cell-type or tissue, or the regulation of the gene in question needs to be known. This catalogue of gene expression regulatory loci across cell-types may help guide the scientific community in cell-type selection for diseases where the appropriate cell-type and pathophysiological mechanism are not known or where the appropriate cell-type or tissue is not available.
Most importantly, this catalogue may serve as an atlas to categories of purely genetic versus microenvironmental (cell-type specific environment due to cell signaling, differentiation and regulation) regulation of gene expression levels genome-wide.
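Once association analysis has been run separately in each cell-type, the catalogue described above can be summarized by recording, for every significant SNP-gene pair, the cell-types in which it was detected. The sketch below shows one possible bookkeeping scheme; the function name, the cell-type labels, the category wording and the SNP and gene identifiers are hypothetical, and the significance calls are assumed to have been made beforehand.

```python
from collections import defaultdict


def classify_associations(hits_by_cell_type):
    """Label each (snp, gene) pair by the cell-types in which it is significant.

    hits_by_cell_type maps a cell-type name to the set of significant
    (snp, gene) pairs found in that cell-type.
    """
    seen_in = defaultdict(set)
    for cell_type, hits in hits_by_cell_type.items():
        for pair in hits:
            seen_in[pair].add(cell_type)

    n_cell_types = len(hits_by_cell_type)
    labels = {}
    for pair, cell_types in seen_in.items():
        if len(cell_types) == n_cell_types:
            labels[pair] = "shared by all cell-types"
        elif len(cell_types) == 1:
            labels[pair] = "specific to " + next(iter(cell_types))
        else:
            labels[pair] = "shared by " + ", ".join(sorted(cell_types))
    return labels


# Hypothetical significant associations per cell-type.
hits = {
    "fibroblast": {("snp1", "geneA"), ("snp2", "geneB")},
    "T-cell": {("snp1", "geneA"), ("snp3", "geneC")},
    "LCL": {("snp1", "geneA"), ("snp3", "geneC")},
}
for pair, label in sorted(classify_associations(hits).items()):
    print(pair, "->", label)
```

A pair that is absent from one cell-type may simply have missed the significance threshold there, so in practice the comparison would need to account for statistical power and for whether the gene is expressed in that cell-type at all.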
References
Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF et al. The genome sequence of Drosophila melanogaster. Science. 2000 Mar 24;287(5461):2185-95
Amano K, Sago H, Uchikawa C, Suzuki T, Kotliarova SE, Nukina N, Epstein CJ, Yamakawa K. Dosage-dependent over-expression of genes in the trisomic region of Ts1Cje mouse model for Down syndrome. Human Molecular Genetics. 2004 13 : 1333
Arron JR, Winslow MM, Polleri A, Chang CP, Wu H, Gao X, Neilson JR, Chen L, Heit JJ, Kim SK et al. NFAT deregulation by increased dosage of DSCR1 and DYRK1A on chromosome 21. Nature. 2006 Jun 1;441(7093):595-600
Bahn S, Mimmack M, Ryan M, Caldwell MA, Jauniaux E, Starkey M, Svendsen CN, Emson P. Neuronal target genes of the neuron-restrictive silencer factor in neurospheres derived from fetuses with Down's syndrome: a gene expression study. Lancet. 2002 359 : 310
Baldridge CW and Gerard RW. The extra respiration of phagocytosis. Am J Physiol 103:
235-236, 1933
Beckmann JS, Estivill X, Antonarakis SE. Copy number variants and genetic traits:
closer to the resolution of phenotypic to genotypic variability. Nat Rev Genet. 2007 Aug;8(8):639-46
Bedard K and Krause KH. The NOX family of ROS-generating NADPH oxidases.
Physiol Reviews. 2007 87: 245-313
Bergmann S, Ihmels J, Barkai N. Similarities and differences in genome-wide expression data of six organisms. PLoS Biol. 2004 Jan;2(1)
Brem RB, Yvert G, Clinton R, Kruglyak L. Genetic dissection of transcriptional
regulation in budding yeast. Science. 2002 Apr 26;296(5568):752-5
Busciglio J, Yankner BA. Apoptosis and increased generation of reactive oxygen species in Down's syndrome neurons in vitro. Nature. 1995 Dec 21-28;378(6559):776-9
Carlson CS, Eberle MA, Kruglyak L, Nickerson DA. Mapping complex disease loci in whole-genome association studies. Nature. 2004 May 27;429(6990):446-52
Cheung VG and Spielman RS. The genetics of variation in gene expression. Nat. Genet.
2002 32: 522–525
Cheung VG, Conlin LK, Weber TM, Arcaro M, Kuang-Yu J, Morley M, and Spielman RS. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat.
Genet. 2003 33: 422–425
Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005;437:69–87
Chrast R, Scott HS, Papasavvas MP, Rossier C, Antonarakis ES, Barras C, Davisson MT, Schmidt C, Estivill X et al. The mouse brain transcriptome by SAGE: Differences in gene expression between P30 brains of the partial trisomy 16 mouse model of Down syndrome (Ts65Dn) and normals. Genome Research. 2000 Dec;10(12):2006-21
Clark AG. Finding genes underlying risk of complex disease by linkage disequilibrium mapping. Curr Opin Genet Dev. 2003 Jun;13(3):296-302
Cuadrado E, Barrena MJ. Immune dysfunction in Down's syndrome: primary immune deficiency or early senescence of the immune system? Clin Immunol Immunopathol.
1996 Mar;78(3):209-14
Dauphinot L, Lyle R, Rivals I, Dang MT, Moldrich RX, Golfier G, Ettwiller L, Toyama K, Rossier J, Personnaz L et al. The cerebellar transcriptome during postnatal development of the Ts1Cje mouse, a segmental trisomy model for Down syndrome.
Human Molecular Genetics 2005 14 : 373
de Hingh YC, van der Vossen PW, Gemen EF, Mulder AB, Hop WC, Brus F, de Vries E. Intrinsic abnormalities of lymphocyte counts in children with Down syndrome. J Pediatr. 2005 Dec;147(6):744-7
Dermitzakis ET, Reymond A, Lyle R, Scamuffa N, Ucla C, Deutsch S, Stevenson BJ, Flegel V, Bucher P, Jongeneel CV et al. Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature 2002 Dec 5;420(6915):578-82
Deng AY. Genetic basis of polygenic hypertension. Hum Mol Genet. 2007 Oct 15;16 Spec No. 2:R195-202
Dominiczak AF, Brain N, Charchar F, McBride M, Hanlon N, Lee WK. Genetics of hypertension: lessons learnt from mendelian and polygenic syndromes. Clin Exp Hypertens. 2004 Oct-Nov;26(7-8):611-20
Douglas SD. Down syndrome: immunologic and epidemiologic associations-enigmas remain. J Pediatr. 2005 Dec;147(6):723-5
ENCODE Project Consortium, Identification and analysis of functional elements in 1%
of the human genome by the ENCODE pilot project. Nature. 2007 Jun 14;447(7146):799- 816
Fellay J, Shianna KV, Ge D, Colombo S, Ledergerber B, Weale M, Zhang K, Gumbs C, Castagna A, Cossarizza A et al. A whole-genome association study of major determinants for host control of HIV-1. Science. 2007 Aug 17;317(5840):944-7
FitzPatrick DR, Ramsay J, McGill NI, Shade M, Carothers AD, Hastie ND Transcriptome analysis of human autosomal trisomy. Human Molecular Genetics 2002 11 : 3249
Giannone S, Strippoli P, Vitale L, Casadei R, Canaider S, Lenzi L, D'Addabbo R,
Frabetti F, Facchin F, Farina A et al. Gene expression profile analysis in human T lymphocytes from patients with Down syndrome. Annals of Human Genetics. 2004 68 : 546
Gibson G, Goldstein DB. Human genetics: the hidden text of genome-wide associations. Curr Biol. 2007 Nov 6;17(21):R929-32
Gibson G, Weir B. The quantitative genetics of transcription. Trends Genet. 2005 Nov;21(11):616-23
De Gobbi M, Viprakasit V, Hughes JR, Fisher C, Buckle VJ, Ayyub H, Gibbons RJ, Vernimmen D, Yoshinaga Y, de Jong P et al. A regulatory SNP causes a human genetic disease by creating a new transcriptional promoter. Science. 2006 May 26;312(5777):1215-7
Head E, Lott IT, Patterson D, Doran E, Haier RJ. Possible compensatory events in adult Down syndrome brain prior to the development of Alzheimer disease neuropathology:
targets for nonpharmacological intervention. J Alzheimers Dis. 2007 Mar;11(1):61-76
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001 Feb 15;409(6822):860-921
Kahlem P, Sultan M, Herwig R, Steinfath M, Balzereit D, Eppens B, Saran NG, Pletcher MT, South ST, Stetten G et al. Transcript level alterations reflect gene dosage effects across multiple tissues in a mouse model of Down syndrome. Genome Research. 2004 14 : 1258
Khor CC, Chapman SJ, Vannberg FO, Dunne A, Murphy C, Ling EY, Frodsham AJ, Walley AJ, Kyrieleis O, Khan A et al. A Mal functional variant is associated with protection against invasive pneumococcal disease, bacteremia, malaria and tuberculosis.
Nat Genet. 2007 Apr;39(4):523-8
Lohmueller KE, Pearce CL, Pike M, Lander ES and Hirschhorn JN. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat. Genet. 2003 33 , pp. 177–182
Lott IT, Head E, Doran E, Busciglio J. Beta-amyloid, oxidative stress and Down syndrome. Curr Alzheimer Res. 2006 Dec;3(5):521-8
Lyle R, Gehrig C, Neergaard-Henrichsen C, Deutsch S, Antonarakis SE. Gene expression from the aneuploid chromosome in a trisomy mouse model of Down syndrome. Genome Research. 2004 14 : 1268
Mao R, Zielke CL, Zielke HR, Pevsner J. Global up-regulation of chromosome 21 gene expression in the developing Down syndrome brain. Genomics. 2003 81 : 457
Margulies EH, Cooper GM, Asimenos G, Thomas DJ, Dewey CN, Siepel A, Birney E, Keefe D, Schwartz AS, Hou M et al. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res. 2007 Jun;17(6):760-74
Mein CA, Caulfield MJ, Dobson RJ, Munroe PB. Genetics of essential hypertension.
Hum Mol Genet. 2004 Apr 1;13 Spec No 1:R169-75
Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG.
Genetic analysis of genome-wide variation in human gene expression. Nature. 2004 Aug 12;430(7001)
Mouse Genome Sequencing Consortium, Initial sequencing and comparative analysis of the mouse genome. Nature. 2002 Dec 5;420(6915):520-62
Prandini P, Deutsch S, Lyle R, Gagnebin M, Delucinge Vivier C, Delorenzi M, Gehrig C, Descombes P, Sherman S, Dagna Bricarelli F et al. Natural gene-expression variation in Down syndrome modulates the outcome of gene-dosage imbalance. Am J Hum Genet.
2007 Aug;81(2):252-63
Pritchard JK. Are rare variants responsible for susceptibility to complex disease? Am. J.
Hum. Genet. 2001 69, pp. 124–137
Pritchard JK, Cox NJ. The allelic architecture of human disease genes: common disease-common variant...or not? Hum Mol Genet. 2002 Oct 1;11(20):2417-23
Quie PG, White JG, Holmes B, and Good RA. In vitro bactericidal capacity of human polymorphonuclear leukocytes: diminished activity in chronic granulomatous disease of childhood. J Clin Invest. 1967 46: 668-679
Reich DE and Lander ES. On the allelic spectrum of human disease. Trends Genet. 2001 17, pp. 502–510
Rioux JD, Xavier RJ, Taylor KD, Silverberg MS, Goyette P, Huett A, Green T, Kuballa P, Barmada MM, Datta LW et al. Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis.
Nat Genet. 2007 May;39(5):596-604
Risch N and Merikangas K. The future of genetic studies of complex human diseases.
Science. 1996 273, pp. 1516–1517
Risch NJ. Searching for genetic determinants in the new millenium. Nature. 2000 405, pp. 847–856
Saran NG, Pletcher MT, Natale JE, Cheng Y, Reeves RH. Global disruption of the cerebellar transcriptome in a Down syndrome mouse model. Human Molecular Genetics 2003 12 : 2013
Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G et al. Genetics of gene expression surveyed in maize, mouse and man.
Nature. 2003 422:297-302
Spielman RS, Bastone LA, Burdick JT, Morley M, Ewens WJ, Cheung VG. Common genetic variants account for differences in gene expression among ethnic groups. Nat Genet. 2007 Feb;39(2):226-31
Stamatoyannopoulos JA. The genomics of gene expression. Genomics. 2004 Sep;84(3):449-57
Stranger BE, Forrest MS, Clark AG, Minichiello MJ, Deutsch S, Lyle R, Hunt S, Kahl B, Antonarakis SE, Tavaré S et al. Genome-wide associations of gene expression variation in humans. PLoS Genet. 2005 Dec;1(6)
Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007 Feb 9;315(5813):848-53
Sussan TE, Yang A, Li F, Ostrowski MC, Reeves RH. Trisomy represses Apc(Min)- mediated tumours in mouse models of Down's syndrome. Nature. 2008 Jan 3;451(7174):73-5
The C. elegans Genome Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282:2012–2018
Thomas JW, Touchman JW, Blakesley RW, Bouffard GG, Beckstrom-Sternberg SM, Margulies EH, Blanchette M, Siepel AC, Thomas PJ, McDowell JC et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 2003 Aug 14;424(6950):788-93
Venter JC et al, The sequence of the human genome. Science. 2001 Feb 16;291(5507):1304-51
Wang LC, Swat W, Fujiwara Y, Davidson L, Visvader J, Kuo F, Alt FW, Gilliland DG, Golub TR, Orkin SH. The TEL/ETV6 gene is required specifically for hematopoiesis in
the bone marrow. Genes Dev. 1998 Aug 1;12(15):2392-402
Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007 Jun 7;447(7145):661-78
Yang Q, Rasmussen SA, Friedman JM. Mortality associated with Down's syndrome in the USA from 1983 to 1997: a population-based study. Lancet. 2002 Mar 23;359(9311):1019-25
Zana M, Janka Z, Kálmán J. Oxidative stress: a bridge between Down's syndrome and Alzheimer's disease. Neurobiol Aging. 2007 May;28(5):648-76
___________________________________________
Introduction