
Genomic profiling of B-cell lymphoid neoplasms


Thesis

Reference

Genomic profiling of B-cell lymphoid neoplasms

PORETTI, Giulia

Abstract

Cancer is a genetic disease of somatic cells characterized by the accumulation of genomic abnormalities. We performed comparative genomic hybridization and gene expression studies with microarrays on about 200 B-cell lymphomas to identify candidate genes with increased DNA levels associated with overexpression. In multiple myeloma, the integrative analysis of genome and transcriptome profiles identified relevant candidates, to be evaluated in further studies with more samples.

The analysis of the genomic profiles of several lymphomas revealed a recurrent amplification at 11q23.1. The amplification was validated by qPCR and by fluorescence in situ hybridization, which showed complex chromosomal rearrangements. Tiling arrays detected increased transcription in the region, but the targets of the amplification were not identified. We hypothesized either a certain fragility of the region or the existence of novel transcription activity overlapping the known genes and interfering with their function.

PORETTI, Giulia. Genomic profiling of B-cell lymphoid neoplasms. Doctoral thesis: Univ. Genève, 2009, no. Sc. 4054

URN : urn:nbn:ch:unige-50961

DOI : 10.13097/archive-ouverte/unige:5096

Available at:

http://archive-ouverte.unige.ch/unige:5096

Disclaimer: layout of this document may differ from the published version.


UNIVERSITÉ DE GENÈVE
FACULTÉ DES SCIENCES
Section of Pharmaceutical Sciences
Professor Leonardo Scapozza

ONCOLOGY INSTITUTE OF SOUTHERN SWITZERLAND (IOSI)
Laboratory of Experimental Oncology
Francesco Bertoni, MD

Genomic Profiling of B-cell Lymphoid Neoplasms

THESIS

submitted to the Faculty of Science of the University of Geneva

for the degree of Doctor of Science (Docteur ès sciences), specialization in pharmaceutical sciences

by GIULIA PORETTI

of Chiasso (TI)

Thesis No. 4054

LUGANO

Fondazione OTAF Sorengo

2009


To my grandmother Martina, who taught me that

“Chi la dura la vince.”

“He who perseveres wins at last.”

and to all smart, great people

I was lucky enough to meet.


“La natura non fa nulla di inutile.”

Aristotele


RÉSUMÉ

Cancer is a highly complex genetic disease of somatic cells. It is characterized by the accumulation of numerous genomic abnormalities that destabilize the transcriptome and the proteome and alter cell biology. Genetic aberrations affect tumor cells through changes at the nucleotide level, such as point mutations and microsatellite instability, and at the chromosome level, with aneuploidy, translocations, insertions, duplications/amplifications, inversions and deletions. Establishing a comprehensive inventory of the causative mutations in tumorigenesis is not straightforward, because these mutations are often accompanied by secondary genetic events. The latter are acquired at random during tumor progression and are not directly responsible for the evolution of the disease. Nevertheless, the study of recurrent genetic aberrations has led to the discovery of cancer-related genes, which have been classified into oncogenes and tumor suppressor genes. In addition, high-throughput, high-resolution techniques such as microarrays now allow genome-wide analyses of DNA copy number abnormalities and of transcriptome alterations in tumor samples. This global approach is an appropriate response to the complexity of human cancers. Genomic profiling studies have thus allowed the identification of diagnostic and prognostic factors, highlighted new therapeutic targets and, for some tumors, proposed a classification system based on biological characteristics.

We performed genomic profiling studies on B-cell lymphomas to discover gene dosage alterations and the genes targeted by copy number variations. The aim of our research was to identify candidate genes with a probable role in tumor development and to characterize them. To improve the selection of candidate genes, we hypothesized that causative mutations are recurrent, and we selected candidate genes with alterations at both the genome and the transcriptome level. We therefore looked for frequent DNA dosage changes associated with altered gene expression.

We were particularly interested in genes with DNA gain or amplification associated with overexpression, two features typically observed as a mechanism of oncogenic activation and highly relevant for potential therapeutic applications.

We started with investigations based on array comparative genomic hybridization (aCGH). We used high-resolution microarrays interrogating single nucleotide polymorphisms (SNP arrays) to analyze the genome profiles of about 200 B-cell tumors. The project presented here concerns the molecular characterization of multiple myeloma (MM) and the subsequent investigations based on the results obtained.

MM is an incurable plasma cell malignancy characterized by several genomic abnormalities (aneuploidy, DNA copy number abnormalities, and translocations mainly involving the IgH gene at 14q32). We analyzed MM samples (patients and cell lines) with microarrays to obtain genome-wide profiles at the genome and gene expression levels. The main aim of this analysis was to identify new candidate genes for therapeutic approaches. To represent the genome-wide profiles of gains, losses and loss of heterozygosity (LOH), we used heat maps, whereas the frequencies of genomic aberrations were presented graphically. Among the main features observed, we noted a gain of chromosome arm 1q, frequently increased DNA levels on chromosomes 3, 5, 7, 9, 11, 15 and 19, and recurrent DNA losses accompanied by LOH at 1p, 13q and 17p. These findings were consistent with previous knowledge, which confirmed the reliability of the method used. To find DNA copy number changes affecting the expression of relevant genes, we integrated the genome-wide and transcriptome profiling results using two methods: a combined filter applied to both DNA and gene expression data, or the detection of genes with outlier expression according to the COPA algorithm together with an increased DNA level. We were interested in selecting candidate genes with an increased DNA level that were simultaneously overexpressed.

The integrative analysis identified strong, recurrent candidates, including transcripts involved in MM pathogenesis, genes implicated in other tumor types, known oncogenes not yet associated with MM, and candidates concordant across several MM data sets.

The analysis of the genomic profiles of numerous B-cell tumors carried out in our laboratory with 10K SNP arrays, comprising the MM samples already mentioned as well as mantle cell lymphoma (MCL) and diffuse large B-cell lymphoma (DLBCL) samples, identified three 10K profiles with a similar amplification at 11q23.1. The 11q23.1 DNA amplification was found in three cell lines: JJN3 (MM), KARPAS422 (DLBCL) and JEKO1 (MCL). This recurrent amplification was located in a region often rearranged in hematological malignancies and was selected for further analyses. We used the more recent, higher-resolution 250K SNP arrays to define the amplified region more precisely. The results confirmed an overlapping amplification at 11q23.1 in JJN3 and KARPAS422, whereas JEKO1 showed a DNA gain flanked by amplified regions. Another DLBCL cell line, U2932, was analyzed with 250K arrays and showed an 11q23.1 amplification resembling that of JJN3 and KARPAS422. The minimal region of amplification common to JJN3, KARPAS422 and U2932 was delimited by the JJN3 amplicon and covered 330 kb. This amplification was validated by qPCR. Fluorescence in situ hybridization (FISH) analyses performed on the four cell lines with bacterial artificial chromosome clones specific for the amplified region showed four different amplification patterns, with complex chromosomal rearrangements.

To identify the targets of the DNA amplification, we analyzed the transcriptome of the cell lines by RT-PCR. The aim was to identify overexpressed transcripts, including annotated transcripts, predicted transcripts and microRNAs (miR-34b and miR-34c). The transcription level was apparently not influenced by the DNA amplification. Transcriptome mapping with tiling arrays revealed increased transcription activity only near the genes PPP2R1B (JJN3 and KARPAS422), POU2AF1 (KARPAS422) and SNF1LK2 (KARPAS422), and ruled out any transcription activity outside annotated genes.

We selected the genes POU2AF1 and PPP2R1B as putative targets of the amplification and characterized them further through loss-of-function studies. A functional evaluation of their biological effect was not possible because we did not reach a satisfactory level of mRNA knockdown.

The gene expression profiles obtained with tiling arrays during the characterization of the 11q23.1 region also provided a detailed map of the transcription activity of chromosomes 8, 11 and 12 in the cell lines JJN3, JEKO1 and KARPAS422. We catalogued the putative transcription units according to the underlying DNA level. We defined genomic regions where increased transcription activity was combined with an increased DNA level (up genomic intervals, UGIs), and others where decreased transcription activity was combined with a decreased DNA level (down genomic intervals, DGIs). Comparing the UGIs with current genomic annotations led to the identification of regions with entirely novel transcription activity. In particular, we studied the newly detected transcription activity at 11p12 in the cell line JJN3. The putative 11p12 transcript was distal (>10 kb) to annotated genes and consisted of 15 consecutive un-annotated UGIs, entirely outside any pre-existing genomic annotation. The JJN3 UGIs at 11p12 covered almost 128 kb, and we hypothesized that they were potentially part of a single novel transcript. Of the 15 UGIs, two were validated by RT-PCR. Others overlapped recently described novel RNA transcripts and/or aligned perfectly with regions highly conserved among mammals. These results are interesting, but further studies are needed to understand the true nature of the genomic regions with novel transcription activity.

The use of microarrays to perform comparative genomic hybridization and gene expression studies on B-cell lymphomas allowed us to select candidate genes in MM, to detect a recurrent amplification at 11q23.1, and to discover unexpected transcription activity. Regarding the candidate genes in MM, further studies with a larger number of MM samples are needed to understand the possible importance of the selected transcripts in the pathogenesis of the disease and the possibility of exploiting them as new therapeutic targets. Regarding the amplified region at 11q23.1, the analyses performed allowed a precise characterization of the region at the DNA and RNA levels, but the results did not clearly reveal the targets of the amplification. On the one hand, we hypothesized the existence of novel transcription activity overlapping the annotated genes and interfering with their function. On the other hand, a certain fragility of the region could be responsible for the complex chromosomal rearrangements, as confirmed by the FISH results. In conclusion, the tiling arrays allowed us to appreciate the architectural complexity of the human genome. The molecular and functional characterization of the putative novel transcript at 11p12 and of other un-annotated UGIs could reveal interesting, unknown aspects of the organization of the human genome.


TABLE OF CONTENTS

ABBREVIATIONS
SUMMARY
1 GENERAL INTRODUCTION
1.1 Genomic profiling in cancer
1.1.1 Cancer and genetic aberrations
1.1.1.1 Cancer
1.1.1.2 Genetic instability in cancer
1.1.1.3 Genetic aberrations in cancer
1.1.2 Genomic profiling
1.1.2.1 Array comparative genomic hybridization (aCGH)
1.1.2.2 Gene expression profiling
1.2 B-cell lymphoid neoplasms
1.2.1 Normal B-cell ontogeny
1.2.2 B-cell tumors
2 GENERAL AIM AND EXPERIMENTAL PLAN
3 GENOMIC PROFILING OF MULTIPLE MYELOMA
3.1 Introduction
3.2 Aim
3.3 Materials and methods
3.4 Results
3.4.1 Genome profiling
3.4.2 Integrative analysis of expression and genomic profiles
3.5 Discussion
4 MOLECULAR AND FUNCTIONAL CHARACTERIZATION OF A RECURRENT DNA AMPLIFICATION AT 11q23.1
4.1 Introduction
4.2 Aim
4.3 Materials and Methods
4.4 Results
4.4.1 Genome profiling
4.4.2 Transcriptome profiling
4.4.3 FISH experiments
4.5 Discussion
5 TILING GENE EXPRESSION PROFILES ANALYSIS
5.1 Introduction
5.2 Aim
5.3 Materials and methods
5.4 Results
5.5 Discussion
6 GENERAL DISCUSSION
7 BIBLIOGRAPHY
ACKNOWLEDGEMENTS
SUPPLEMENTARY MATERIAL
APPENDIX
CURRICULUM VITAE
POSTERS AND PUBLICATIONS


ABBREVIATIONS

AB antibody
(a)CGH (array) comparative genomic hybridization
AG antigen
ASCT autologous (hematopoietic) stem cell transplantation
BAC bacterial artificial chromosome
BCR B-cell receptor
BM bone marrow
BMSC bone marrow stromal cell
CIN chromosomal instability
CN (DNA) copy number
CNAT GeneChip Chromosome Copy Number Analysis Tool (Affymetrix)
COPA Cancer Outlier Profile Analysis
CSR class switch recombination
DGI down genomic interval (see chapter 5 for definition)
DLBCL diffuse large B-cell lymphoma
ds double-stranded
DSB double-stranded break
EST expressed sequence tag
FISH fluorescence in situ hybridization
GC germinal centre
GCOS GeneChip Operating Software (Affymetrix)
GDAS GeneChip DNA Analysis Software (Affymetrix)
GEP gene expression profile
HMCL human multiple myeloma cell line
HMM Hidden Markov Model
HSR homogeneously staining region
Ig immunoglobulin
IgH immunoglobulin heavy chain
LOH loss of heterozygosity
MAS Affymetrix MicroArray Software
MCL mantle cell lymphoma
MGUS monoclonal gammopathy of undetermined significance
miRNA microRNA
MM multiple myeloma
mmix master mix
ncRNA non-coding RNA
NTC no template control (negative reaction)
PC plasma cell
(p)UPD (partial) uniparental disomy
RACE rapid amplification of cDNA ends
RMA robust multi-array average
SHM somatic hypermutation
SNP single nucleotide polymorphism
ssDNA single-stranded DNA
TUF transcript of unknown function
UGI up genomic interval (see chapter 5 for definition)
WGSA whole genome sampling assay
WT whole transcript


SUMMARY

Cancer is a very complex genetic disease of somatic cells characterized by the accumulation of multiple lesions at genome level, that cause deregulation at transcriptome and proteome levels and alter cell biology. Genetic aberrations affect cancer cells with changes at the nucleotide level, including point mutations and microsatellite instability, and at chromosome level, comprising aneuploidy, translocations, insertions, duplications/amplifications, inversions, and deletions. The comprehensive knowledge of causative mutations involved in tumorigenesis is challenging, since these are usually accompanied by secondary genetic events, acquired at random during disease progression and irrelevant for the development of malignancy. However, the study of recurrent genetic alterations led to the discovery of genes associated with cancer, further classified into oncogenes and tumor suppressor genes. In addition, the advent of high-throughput and high-resolution techniques like microarray platforms enabled genome-wide analyses of DNA copy number (CN) abnormalities and gene expression alterations in tumor samples, permitting an adequate approach to face the complexity of the disease. Genomic profiling studies have been successfully used for the identification of diagnostic and prognostic factors, for the detection of new therapeutic targets, and for the development of tumor classification systems reflecting the underlying biology.

We performed genomic profiling studies in B-cell lymphoid neoplasms to discover gene dosage alterations and to detect the genes targeted by CN changes. Our aim was to identify tumor-specific candidate genes with a putative role in tumor development, and to further characterize them. To improve the selection of candidate cancer genes, we followed the assumption that causative mutations should be recurrent, and we selected candidate genes with changes of both DNA and RNA levels. We therefore looked for frequent CN changes associated with deregulated gene expression. In particular, we were interested in genes with DNA gain or amplification associated with upregulated gene expression, strong indicators of oncogenic behaviour and interesting targets for the development of new therapeutic approaches.
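The selection principle described above (recurrent DNA gain coupled with upregulated expression) can be illustrated with a minimal sketch. The gene names, values, thresholds and the function `select_candidates` are all hypothetical placeholders chosen for illustration; they are not the thresholds or data used in this work.

```python
from typing import Dict, List


def select_candidates(
    copy_number: Dict[str, List[float]],
    expression: Dict[str, List[float]],
    cn_gain: float = 2.5,      # assumed copy number cut-off for "gain"
    expr_high: float = 1.0,    # assumed log-ratio cut-off for "overexpressed"
    min_samples: int = 2,      # assumed recurrence requirement
) -> List[str]:
    """Return genes that are gained AND overexpressed in the same sample,
    in at least `min_samples` samples."""
    selected = []
    for gene, cn_values in copy_number.items():
        expr_values = expression.get(gene)
        if expr_values is None or len(expr_values) != len(cn_values):
            continue  # skip genes without matched DNA/RNA measurements
        hits = sum(
            1
            for cn, ex in zip(cn_values, expr_values)
            if cn >= cn_gain and ex >= expr_high
        )
        if hits >= min_samples:
            selected.append(gene)
    return sorted(selected)


# Invented toy data: one value per sample, same sample order in both tables.
copy_number = {
    "GENE_A": [3.1, 2.9, 2.0, 4.2],
    "GENE_B": [2.0, 2.1, 1.9, 3.0],
    "GENE_C": [3.0, 3.2, 3.1, 2.8],
}
expression = {
    "GENE_A": [1.8, 1.2, 0.1, 2.5],
    "GENE_B": [0.0, 0.2, -0.1, 1.5],
    "GENE_C": [0.1, -0.3, 0.2, 0.0],
}

candidates = select_candidates(copy_number, expression)
print(candidates)
```

In this toy example only the gene that is both gained and overexpressed in several samples survives; a gene gained without overexpression, or altered in a single sample, is filtered out.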

We performed aCGH (array comparative genomic hybridization) studies with high-resolution SNP (single nucleotide polymorphism) arrays to investigate the genome-wide DNA profiles of almost 200 B-cell tumors. Here we present the molecular characterization of multiple myeloma (MM) and the subsequent investigations prompted by observations made in the course of this study.

MM is an incurable plasma cell malignancy linked to heterogeneous genomic abnormalities, such as ploidy status alterations, chromosomal copy number changes and non-random chromosomal translocations mainly involving the IgH (immunoglobulin heavy chain) gene at 14q32. To identify novel candidate cancer genes exploitable as therapeutic targets, we obtained, with DNA microarrays, the genome-wide DNA profiles and gene expression profiles of both MM clinical samples and human MM cell lines (HMCLs). We created heat maps to visualize genome-wide DNA CN and loss of heterozygosity (LOH), and we presented the frequencies of genomic aberrations in frequency plots. As major characteristics we observed 1q gain and frequently increased DNA levels of odd-numbered chromosomes (3, 5, 7, 9, 11, 15, 19), and recurrent DNA losses and LOH events at 1p, 13q and 17p, in concordance with previous knowledge, thus confirming the reliability of the adopted method. To identify DNA CN changes influencing the expression of relevant genes, we integrated global gene expression and genomic profiling data by applying a matched filter on both DNA and expression data, or by the COPA-based detection of genes with outlier expression and underlying increased DNA CN. We were particularly interested in the selection of candidate oncogenes presenting increased DNA CN coupled with overexpression. The integrative analysis allowed the identification of strong, recurrent candidates, as evidenced by the presence of transcripts with a proven role in MM pathogenesis, but also genes targeted in other tumor types, known oncogenes not yet implicated in MM, and candidates concordant across various MM data sets.
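As an illustration of the outlier detection mentioned above, the following sketch follows the commonly published form of Cancer Outlier Profile Analysis (COPA): each gene is median-centered, scaled by its median absolute deviation (MAD), and ranked by a high percentile of the transformed values. The expression values, the choice of the 90th percentile and all function names are assumptions made for illustration; the exact settings used in this work are not stated here.

```python
import statistics
from typing import Dict, List


def copa_transform(values: List[float]) -> List[float]:
    """Median-center one gene's expression values and scale them by the MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0 for _ in values]  # no spread: no outliers can be called
    return [(v - med) / mad for v in values]


def percentile(values: List[float], q: float) -> float:
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def copa_scores(expression: Dict[str, List[float]], q: float = 90.0) -> Dict[str, float]:
    """COPA score per gene: the q-th percentile of its transformed values."""
    return {gene: percentile(copa_transform(vals), q) for gene, vals in expression.items()}


# Invented toy data: GENE_X is strongly overexpressed in 2 of 8 samples,
# GENE_Y varies smoothly across all samples.
expression = {
    "GENE_X": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 6.0, 7.0],
    "GENE_Y": [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5],
}

scores = copa_scores(expression)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the median and MAD are insensitive to a few extreme samples, a gene overexpressed in only a subset of tumors keeps a high score; the outlier samples could then be checked for an underlying increase in DNA copy number.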

The observation of the genome-wide DNA profiles of a series of B-cell tumors collected at that time in our laboratory by means of 10K SNP arrays, including the previously mentioned MM, but also mantle cell lymphoma (MCL) and diffuse large B-cell lymphoma (DLBCL) samples, led us to the identification of three 10K profiles with a similar 11q23.1 amplification, belonging to the cell lines JJN3 (MM), KARPAS422 (DLBCL) and JEKO1 (MCL). The recurrent amplification was localized in a region often rearranged in haematological malignancies and was therefore selected for further characterization. The updated 250K SNP arrays were used to better define the amplified region, and we confirmed an overlapping amplicon at 11q23.1 in JJN3 and KARPAS422, whereas JEKO1 showed a DNA gain flanked by amplified regions. A fourth DLBCL cell line, U2932, was analyzed with 250K arrays and exhibited an amplification similar to JJN3 and KARPAS422. The minimal common region of amplification was delimited by the amplicon of JJN3, and covered 330 kb. 11q23.1 amplification was validated by qPCR.
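The notion of a minimal common region of amplification can be sketched as the intersection of the amplified intervals of each cell line. The coordinates below are invented offsets (not genomic positions from this study), chosen only so that the shortest amplicon delimits a 330 kb overlap as in the text; `minimal_common_region` is a hypothetical helper.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in base pairs, end exclusive


def minimal_common_region(amplicons: Dict[str, Interval]) -> Optional[Interval]:
    """Intersect all amplicons; return None if they share no common segment."""
    if not amplicons:
        return None
    start = max(s for s, _ in amplicons.values())
    end = min(e for _, e in amplicons.values())
    return (start, end) if start < end else None


# Invented placeholder coordinates, relative to an arbitrary origin.
amplicons = {
    "JJN3": (500_000, 830_000),
    "KARPAS422": (200_000, 1_400_000),
    "U2932": (350_000, 1_100_000),
}

mcr = minimal_common_region(amplicons)
print(mcr, mcr[1] - mcr[0])
```

When one amplicon lies entirely inside the others, as assumed here for JJN3, the intersection is that amplicon itself.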

Fluorescence in situ hybridization (FISH) analyses were performed on the four cell lines using BAC clones overlapping the amplified region, and showed four different amplification patterns of complex chromosome rearrangements.

To identify the amplification target(s), we analyzed the transcriptome of the cell lines by RT-PCR, looking for the overexpression of known and predicted transcripts and microRNAs (miR-34b and miR-34c). The transcription level was apparently not influenced by the underlying DNA amplification. Transcriptome mapping with tiling arrays was performed on JJN3, JEKO1 and KARPAS422, and revealed increased transcription activity only at the genes PPP2R1B (in JJN3 and KARPAS422), POU2AF1 (in KARPAS422), and SNF1LK2 (in KARPAS422), ruling out transcription activity outside annotated transcripts.

The putative amplification targets POU2AF1 and PPP2R1B were selected for further characterization by loss-of-function studies. Unfortunately, functional evaluation of their biological effect was not possible due to the inability to reach satisfactory mRNA knockdown, and we could not indicate a clear amplification target.

Besides the characterization of the transcription activity at 11q23.1, transcriptome mapping with tiling arrays provided a detailed map of the gene expression profiles on chromosomes 8, 11, and 12 of the cell lines JJN3, JEKO1 and KARPAS422. We catalogued the putative transcribed units with respect to the cell line-specific DNA CN changes and we defined genomic regions where increased transcription activity was matching with increased DNA level (up genomic intervals, UGIs), and other regions where decreased transcription activity was matching with decreased DNA level (down genomic intervals, DGIs). A precise comparison of the UGIs with currently known annotation tracks led to the identification of UGIs located outside any annotation, thus representing novel transcription activity. Among the various regions with novel transcription activity, we concentrated on the one detected at 11p12 in the cell line JJN3. The 11p12 putative novel transcript was located distal (>10kb) to previously annotated genes and was constituted by 15 un-annotated, consecutively mapped UGIs, not interrupted by any known transcript. JJN3 UGIs at 11p12 covered about 128 kb and we assumed that they were part of a unique, long putative novel transcript. Two UGIs were validated by RT-PCR, others were overlapping with previously described novel RNA transcripts, and/or perfectly aligned with high mammalian conservation scores. These results are promising, but further investigations are needed to reveal the real nature of the genomic regions presenting novel transcription activity.
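The UGI/DGI definitions above can be summarized in a short sketch: an interval is called a UGI when transcription and DNA level both increase, a DGI when both decrease, and a UGI counts as novel when it overlaps no known annotation. The intervals, signal values and zero thresholds are invented for illustration; the actual criteria are those defined in chapter 5.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in base pairs, end exclusive


def classify_interval(transcription_change: float, dna_change: float) -> Optional[str]:
    """Label an interval by the concordance of RNA and DNA changes."""
    if transcription_change > 0 and dna_change > 0:
        return "UGI"  # up genomic interval
    if transcription_change < 0 and dna_change < 0:
        return "DGI"  # down genomic interval
    return None       # discordant or unchanged


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def unannotated(intervals: List[Interval], annotations: List[Interval]) -> List[Interval]:
    """Keep only intervals that overlap no annotated transcript."""
    return [iv for iv in intervals if not any(overlaps(iv, ann) for ann in annotations)]


# Invented toy data: (interval, transcription change, DNA change).
measurements = [
    ((1_000, 2_000), 1.4, 0.8),
    ((5_000, 6_000), -1.1, -0.6),
    ((8_000, 9_000), 1.2, -0.3),
    ((12_000, 13_000), 0.9, 0.5),
]
annotations = [(900, 2_500), (7_000, 9_500)]  # hypothetical known transcripts

ugis = [iv for iv, tx, dna in measurements if classify_interval(tx, dna) == "UGI"]
dgis = [iv for iv, tx, dna in measurements if classify_interval(tx, dna) == "DGI"]
novel_ugis = unannotated(ugis, annotations)
print(ugis, dgis, novel_ugis)
```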

In conclusion, the use of microarrays to perform aCGH and gene expression analyses in B-cell lymphoid neoplasms allowed us to select candidate genes in MM, to detect a recurrent amplification at 11q23.1, and to discover unexpected transcription activity. Concerning the candidate genes in MM, further studies involving larger series of MM samples are needed to elucidate whether they can be considered relevant candidates in the pathogenesis of the disease, and whether they may contribute toward the identification of novel therapeutic targets. In the case of the amplified region at 11q23.1, all the analyses carried out provided a precise characterization of the region both at DNA and RNA level, but they were still not successful in indicating strong amplification targets. We concluded that this could be due either to the existence of novel transcription activity overlapping annotated genes and interfering with their normal function, or to the relative fragility of the region causing complex chromosome rearrangements, as observed in FISH results. Finally, we could appreciate the complex architecture of the human genome by the analysis of the tiling gene expression profiles. Molecular and functional characterization of the putative novel transcript at 11p12 and of other un-annotated UGIs could reveal interesting, unknown aspects of the genomic organization.


1 GENERAL INTRODUCTION

1.1 Genomic profiling in cancer

1.1.1 Cancer and genetic aberrations

1.1.1.1 Cancer

Cancer is defined by the World Health Organization as a large group of diseases that can affect any part of the body and are characterized by uncontrolled growth and spread of abnormal cells beyond their usual boundaries⁷. Cancer is a very complex genetic disease of somatic cells, characterized by the accumulation of multiple alterations at the genome level. The concept that neoplasia originates from genetic changes was first proposed by Boveri⁸ in 1914 and is generally referred to as the somatic mutation theory of tumorigenesis. It was first supported experimentally in 1927 by Muller’s discovery that ionizing radiation, already considered a potent carcinogen, had mutagenic activity⁹, and it is supported today by a variety of experimental data. Genetic abnormalities may cause deregulation at the transcriptome and proteome levels and affect cell proliferation, differentiation and survival.

The genetic aberrations found in cancer are due to the influence of environmental factors, to mistakes in cell replication and to aberrant DNA repair mechanisms. Environmental factors include physical carcinogens (e.g. UV and ionizing radiation), chemical carcinogens (e.g. asbestos, tobacco smoke components) and biological carcinogens, such as infections with oncogenic viruses (e.g. DNA viruses like HPV, SV40; RNA viruses like HIV) or certain bacteria (e.g. Helicobacter pylori). It is possible to distinguish between germline mutations or polymorphisms and somatic mutations. Germline mutations are mainly recessive (i.e. both alleles need to be mutated to have an impact on the phenotype) and predispose to cancer. Somatic mutations represent the majority of cancer mutations and are acquired during pathogenesis (de novo mutations)¹⁰. Recurrent somatic mutations, defined as those found in at least two cases of the same morphologic entity¹¹ (URL: http://cgap.nci.nih.gov/Chromosomes/Mitelman), play an important role in diagnosis and prognosis, and represent interesting therapeutic targets.
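The recurrence criterion quoted above (an aberration found in at least two cases of the same morphologic entity) can be expressed as a simple count. The case identifiers, entity labels and aberration names below are invented placeholders, and `recurrent_aberrations` is a hypothetical helper, not part of any cited database.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

Case = Tuple[str, str, Set[str]]  # (case id, morphologic entity, aberrations)


def recurrent_aberrations(cases: Iterable[Case], min_cases: int = 2) -> List[Tuple[str, str]]:
    """Return (entity, aberration) pairs seen in at least `min_cases` distinct cases."""
    carriers = defaultdict(set)
    for case_id, entity, aberrations in cases:
        for aberration in aberrations:
            carriers[(entity, aberration)].add(case_id)
    return sorted(key for key, ids in carriers.items() if len(ids) >= min_cases)


# Invented toy data.
cases = [
    ("case1", "entity_A", {"aberration_x", "aberration_y"}),
    ("case2", "entity_A", {"aberration_x"}),
    ("case3", "entity_B", {"aberration_y"}),
    ("case4", "entity_B", {"aberration_z"}),
]

recurrent = recurrent_aberrations(cases)
print(recurrent)
```

Note that an aberration seen once in each of two different entities is not counted as recurrent under this definition, since the two cases must belong to the same entity.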

The study of mutational events linked to tumor development faces a major challenge: the distinction between “driver” or causative mutations, i.e. those conferring functional advantages to tumor cells, and “passenger” or random mutations, which represent secondary genetic events acquired during disease progression and without biologic significance¹⁰. Causative mutations are expected to be recurrent within a sample batch but also among different data sets on the same pathology¹². Moreover, the type of genome aberration also has to be considered, with high-level DNA copy number changes probably having a greater impact than low-level ones¹².

Even if every cancer type presents specific alterations, it is possible to establish some common biological principles leading to cell transformation¹³. Essentially, the following acquired capabilities are hallmarks of cancer cells: self-sufficiency in growth signals, insensitivity to growth-inhibitory signals, evasion of apoptosis, limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis¹³. An additional, typical attribute of cancer cells is genetic instability. The role of genetic instability as a driving force in the development of sporadic cancers is still debated. On the one hand, a so-called “mutator phenotype”, i.e. a cell with increased genomic instability, is considered necessary for carcinogenesis¹⁴,¹⁵. The “mutator phenotype” model proposes that the initiating event is the occurrence of mutations in DNA polymerases that render them error-prone, and/or mutations in enzymes involved in DNA repair mechanisms, or more generally in genes that control genomic integrity¹⁶,¹⁷. This phenotype has no direct selective advantage but increases the mutation rate of other genes. It corresponds to the situation observed in inherited human cancer syndromes with germline mutations in genes involved in maintaining genome integrity (e.g. xeroderma pigmentosum, ataxia telangiectasia, hereditary non-polyposis colorectal cancer). On the other hand, in a model comparing tumorigenesis to a form of somatic evolution, the selection of advantageous mutations, rather than an increase in the mutation rate, is considered the driving force of sporadic tumor development¹⁸. According to this model, tumor growth is initiated by one or more mutations giving the cell a growth advantage. The clone acquires successive mutations and, under the selective pressure of the tumor microenvironment, the cells bearing advantageous mutations undergo successive waves of clonal expansion. This was described, for example, in sporadic colorectal cancer development¹⁹. In this context, the existence of many genetic alterations in a tumor does not necessarily mean that it has a “mutator phenotype”. In fact, instability is a matter of rate, and the presence of a mutated state provides no information about the rate of its occurrence²⁰.

Although an inherited tendency to genomic instability clearly drives tumorigenesis, and even if genetic instability is often seen in cancer cells, the development of sporadic cancers probably occurs on the basis of a normal mutation rate and is mainly driven by the selection of advantageous mutations²¹,²². Genetic instability might arise as a secondary effect of mutations, but the direct selective effect of such mutations overrides it, and genetic instability contributes only indirectly to the somatic evolution of cancer²².

Recently, another hypothesis has been made for cancer development, based on an oncogene-induced DNA damage model (reviewed in Halazonetis et al.²³). The authors suggest that activated oncogenes induce perturbations in the DNA replication machinery, which in turn lead to the formation of DNA breaks. According to this model, the presence of


transformation is given by the inactivation of p53 or, less often, other DNA damage checkpoint proteins, that impairs the DNA damage response pathway normally leading to cell cycle arrest, apoptosis or senescence. This model is therefore based on two key features of most cancers, namely genomic instability and p53 mutations.

1.1.1.2 Genetic instability in cancer

Even if the causative role of genetic instability in tumor development can be discussed, it is widely accepted that numerous genetic aberrations are accumulated in neoplastic cells, providing evidence for the genetic basis of human cancer.

The consequences of genomic instability are genetic aberrations, ranging from point mutations to complex chromosome rearrangements. The existence of two levels of genetic instability affecting the vast majority of cancers was proposed20. In a small subset of cases the instability is observed as subtle sequence alterations at nucleotide level, whereas in most other cancers it is found at chromosome level. Sequence errors at nucleotide level can be generated by polymerases during DNA replication or by mutagens. In normal cells they are avoided by DNA repair mechanisms, such as nucleotide-excision repair (NER), base-excision repair (BER) and mismatch repair (MMR) mechanisms. NER is mainly involved in repairing covalent modifications of DNA caused by exogenous mutagens. NER defects result for example in high susceptibility to skin tumors due to the exposure to a widely diffused mutagen like ultraviolet light20. The inactivation of MMR genes gives rise to microsatellite instability (MIN). Microsatellites are short sequences of DNA repeats scattered throughout the genome and MIN leads to repetitive DNA expansions and contractions. Another typical consequence of genetic instability at nucleotide level is given by point mutations, involving substitution, deletion or insertion of a few nucleotides.

At chromosome level, genetic instability involves both structural and numerical chromosome abnormalities and is the result of both chromosomal instability and chromosomal rearrangements. Chromosomal instability (CIN) involves changes in chromosome number through nondisjunction, leading to chromosome gains or losses, a phenomenon known as aneuploidy. CIN is caused by failures in mitotic chromosome segregation, normally controlled both by the entire mitotic machinery and by a spindle checkpoint24. The mitotic spindle checkpoint is a quality-control mechanism preceding anaphase entry that ensures that all pairs of sister chromatids have achieved bipolar attachment to the mitotic spindle for correct segregation. Besides mutations in the spindle checkpoint, defects in structural components of the mitotic spindle, in particular in the regulation of centrosome number, seem to play a key role in CIN. In contrast, instability leading to chromosomal rearrangements refers to events changing the genetic linkage of two DNA fragments, resulting in translocations, insertions, duplications, inversions or deletions25. The common feature of these instability events is that they are generated by DNA breaks and are mainly

phase of the cell cycle, DNA is most vulnerable owing to the unwinding of the parental duplex to allow access to the replication machinery. Stalling of the replication process causes the formation of single-stranded DNA (ssDNA) gaps and double-stranded breaks (DSBs). Many checkpoint functions, such as replication, repair and S-phase checkpoints, are involved in monitoring genome integrity during replication and in preventing DNA damage. If the controlled process is disrupted by replication stress, DNA breaks accumulate and chromosomal rearrangements are generated by error-prone DNA DSB repair mechanisms. The latter typically utilize homologous recombination and/or non-homologous end joining pathways to resolve DSBs (reviewed in Jackson et al.26). Replication stress can be induced by replication inhibition and/or S-phase checkpoint inactivation but, as previously mentioned, can also be the consequence of oncogene activation23.

Nevertheless, not all DSBs arise as a result of replication stress. Another mechanism involves telomere erosion. Telomeres are complex nucleoprotein structures located at the ends of linear chromosomes, critical for maintaining genome integrity. Telomere erosion is associated with massive genomic instability during the “cellular crisis” state, i.e. the period of massive cell death due to telomere dysfunction before the acquisition of telomerase activity by the surviving, malignant cells27.

Two other elements that play a role in genetic instability leading to rearrangements are fragile sites and highly transcribed DNA sequences25. Fragile sites are DNA sequences spanning approximately 50 kb to 1 Mb that show gaps and breaks due to partial inhibition of DNA synthesis, particularly if exposed to physiological stress28. Their instability is probably linked to their DNA structure, which usually presents trinucleotide repeats. They are frequently associated with rearrangements like translocations, amplifications, and integrations of exogenous DNA.

Transcription-associated instability is based on the fact that transcription of a DNA sequence involves the formation of ssDNA intermediates, that are chemically more unstable than dsDNA.

1.1.1.3 Genetic aberrations in cancer

As previously discussed, a key feature of cancer is the presence of non-random genetic aberrations (Fig. 1). Genomic changes typically observed in cancer cells can be numerical or structural, and can coexist in complex karyotypes29. Numerical aberrations comprise aneuploidy, DNA copy number (CN) changes, and unbalanced translocations, whereas structural aberrations include balanced translocations and inversions. However, in situations like extra-chromosomal amplifications, unbalanced translocations, insertions and deletions, the distinction between numerical and structural aberrations is not clear-cut. Structural changes are defined as balanced if they involve an equal exchange of material between two chromosomal regions, whereas the term unbalanced refers to a non-reciprocal exchange leading to gain or loss of genome portions.

As already anticipated, aneuploidy corresponds to the phenomenon of an abnormal chromosome number with respect to the typical number of 46 human chromosomes. This is due to gains or losses of whole chromosomes by CIN.

Translocations arise from the exchange of chromosomal segments between non-homologous chromosomes, and can be balanced or unbalanced. Spatial proximity and sequence homology are factors that probably increase the propensity of translocated chromosome partners to rearrange29.

DNA copy number changes comprise deletions, gains or amplifications of genomic material. Amplification is a neoplasia-associated mechanism we were particularly interested in, and is therefore reviewed in a separate chapter (see next sub-chapter). CN aberrations are also referred to as gene dosage alterations, i.e. alterations in the number of copies of a given sequence found in a cell.

[Figure 1 panel labels: (A) failed cytokinesis; (B) segregation errors at anaphase; (C) high-level amplification (C1, C2); (D) translocations (D1, D2); (E) loss of heterozygosity, including UPD]

Fig. 1: Acquisition of genetic aberrations in cancer cells (modified from Bayani et al.29 and Albertson et al.56). A normal cell (left) is represented with three chromosome pairs (yellow, fuchsia, green). (A) Polyploidy: failure to undergo cytokinesis after the diploid chromosome set has doubled leads to polyploidy (here a tetraploid cell for chromosomes yellow, fuchsia, green); (B) aneuploidy: segregation errors at anaphase result in two aneuploid daughter cells: one monosomic for the green chromosome and the other trisomic; (C) amplification: gene amplification by double minutes (C1) or homogeneously staining regions (C2); (D) translocation: unbalanced (D1) or balanced (D2); (E) loss of heterozygosity (LOH): denotes the loss of one allele in a heterozygous pair due to deletion or mutation (black spot), or, more rarely, to uniparental disomy (UPD).

Further genetic anomalies involved in cancer are somatic point mutations, loss of heterozygosity (LOH) and epigenetic modifications, all reviewed in the following paragraphs.

The result of somatic point mutations, where one or a few nucleotides are altered by substitution, deletion or insertion, can be nonsense mutations, missense mutations or frameshift mutations. In nonsense mutations the new codon causes the protein to terminate prematurely, leading to a shorter and usually non-functional product. Missense mutations introduce an incorrect amino acid into the protein sequence. The effect on protein function depends on the site of mutation and the nature of the amino acid replacement. In general, mutations that do not affect the protein sequence or function are called silent or synonymous mutations. Frameshift mutations cause the affected codon and all the following codons to be misread, leading to a very different and often non-functional product.
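The categories above can be illustrated with a minimal sketch that compares a reference and a mutated coding sequence. The function names and the toy 12-nucleotide sequences are hypothetical, and the classification is deliberately simplified: any length change that is not a multiple of three is called a frameshift without translating it, and a substitution that removes a stop codon would be reported as missense.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def classify_mutation(ref_cds, alt_cds):
    """Assign a mutated coding sequence to one of the classes described in the text."""
    if (len(alt_cds) - len(ref_cds)) % 3 != 0:
        return "frameshift"            # reading frame is shifted downstream of the indel
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if alt_protein == ref_protein:
        return "silent"                # protein sequence unchanged
    if len(alt_cds) != len(ref_cds):
        return "in-frame indel"        # whole codons gained or lost
    if len(alt_protein) < len(ref_protein):
        return "nonsense"              # premature stop codon
    return "missense"                  # amino acid replacement


reference = "ATGGAAGGCTGG"                               # toy sequence encoding M-E-G-W
print(classify_mutation(reference, "ATGGAGGGCTGG"))      # GAA -> GAG
print(classify_mutation(reference, "ATGGACGGCTGG"))      # GAA -> GAC
print(classify_mutation(reference, "ATGTAAGGCTGG"))      # GAA -> TAA
print(classify_mutation(reference, "ATGGAGGCTGG"))       # one nucleotide deleted
```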

Classic LOH is defined as the loss of one allele of a heterozygous locus. The genetic mechanisms leading to LOH are highly varied. An LOH event can be caused by deletion, which can involve the whole chromosome or smaller regions, by mitotic nondisjunction, or by mitotic recombination between two homologous chromosomes, including break-induced replication and gene conversion (interstitial mitotic recombination event)33-35 (Figs. 2-3).

Interchromosomal recombinations like translocations are sometimes cited as a possible cause of LOH, since they can lead to deletions as consequences of DSBs. As reported in figure 2, there may be a series of events leading to the final, selected clone with the observed LOH pattern35.

The result of LOH is the loss of the wild-type allele with the consequent functional alteration (Fig. 3). As discussed later (chapter 1.1.2.1), LOH was also observed without an accompanying copy number change, the so-called copy-neutral LOH, and was associated with uniparental disomy (UPD)35-37. UPD refers to the situation in which both copies of a chromosome pair have originated from one parent, either as isodisomy, in which two identical segments from one parent homologue are present, or heterodisomy, where sequences from both homologues of the transmitting parent are present38. This can occur during transmission of chromosomes from parents to gametes or during early cell divisions in the zygote (germline recombination resulting in constitutive UPD), but also as a consequence of somatic recombination during mitotic cell divisions (mitotic or somatic recombination).
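The distinction between classic and copy-neutral LOH can be made concrete with a small sketch that combines the genotype of a locus in matched normal and tumour DNA with the tumour copy number. The function name, the allele encoding as letters and the category labels are illustrative assumptions, not a published calling algorithm.

```python
def classify_loh(normal_alleles, tumour_alleles, tumour_copy_number):
    """Classify one locus from matched normal/tumour genotypes and tumour copy number.

    Alleles are given as strings of letters, e.g. "AB" for a heterozygous locus.
    """
    if len(set(normal_alleles)) < 2:
        # A locus that is homozygous in normal DNA cannot reveal LOH.
        return "non-informative"
    if len(set(tumour_alleles)) >= 2:
        return "heterozygosity retained"
    if tumour_copy_number < 2:
        return "LOH with copy number loss"      # e.g. deletion of one allele
    if tumour_copy_number == 2:
        return "copy-neutral LOH"               # pattern associated with UPD
    return "LOH with copy number gain"


print(classify_loh("AB", "A", 1))
print(classify_loh("AB", "AA", 2))
print(classify_loh("AB", "AB", 2))
```

In this simplified scheme a copy-neutral call is only compatible with UPD; it does not demonstrate the underlying mechanism.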

Constitutive UPD is a quite rare phenomenon, whereas UPD occurring in adult cells by somatic recombination is probably more common, with an incidence increasing with cell division rate. Two mechanisms are supposed to be involved (Fig. 4): in UPD of whole chromosomes, mitotic nondisjunction is followed by duplication of the remaining homologue in the monosomic cell or loss of one chromosome in the trisomic cell; partial uniparental disomy (pUPD) involves mitotic recombination events between chromatids, such as reciprocal exchange of chromosome material between two homologues. pUPD causing interstitial regions of LOH by multiple recombination events is also called gene conversion.

[Figure 3 panel labels: deletion; gene conversion; recombination; translocation; nondisjunction; chromosome loss and duplication]

Fig. 3: Mechanisms leading to LOH (modified from de Nooij-van Dalen et al.33). In cells carrying one wild-type allele (T) and one mutant allele (t) of a tumor suppressor gene, LOH events may lead to the expression of the recessive mutation.

Fig. 4: Mechanisms of uniparental disomy (UPD) (modified from Walker et al.125). In a normal cell (i) two homologous chromosomes with both parental alleles are present. During mitosis they are duplicated, so that each chromosome is made by two sister chromatids (iv). Mitotic nondisjunction can occur (ii), resulting in either monosomic or trisomic cells for a given chromosome. UPD of a whole chromosome can then occur if the remaining chromosome in the monosomic cell is duplicated or if the outnumbered allele in the trisomic cell is deleted. Alternatively, mitotic recombination between homologous chromatids can occur, resulting in partial UPD involving a chromosome arm (v), a telomeric region (vi), or interstitial regions (vii, viii).

The most common epigenetic modifications found in cancer cells are DNA hyper-/hypomethylation and alterations in the histone acetylation/deacetylation balance (reviewed in39).

DNA methylation involves the addition of a methyl group to the cytosine ring (carbon 5 position) by DNA methyltransferases. In the human genome, methylation is primarily found in repetitive DNA elements to protect them against recombination. Hypomethylation of the genome results in an increased mutation rate and activation of otherwise silenced genes. Conversely, hypermethylation of gene promoter regions causes their transcriptional repression. In tumors, hypomethylation can result in increased expression of oncogenes and hypermethylation in silencing of tumor suppressor genes40,41. Another relevant epigenetic modification involved in chromatin remodelling is histone acetylation, which consists of the addition of acetyl groups to lysines in the amino-terminal tails of core histones. Lysine acetylation and deacetylation reactions are catalyzed by histone acetyltransferases (HAT) and histone deacetylases (HDAC), respectively. Acetylated histones enhance chromatin decondensation and DNA accessibility; the acetylation state therefore correlates with transcriptional activation. The disruption of the acetylation/deacetylation balance due to alterations of HATs or HDACs has been detected in many cancers. Moreover, malignant upregulation of histone deacetylation processes has been successfully targeted by therapeutic interventions with HDAC inhibitors.

The identification of recurrent genetic alterations by cytogenetics has led to the discovery of previously unknown genes associated with cancer, also named cancer genes10. The first spectacular success of cancer cytogenetics goes back to the 1960s, when the Philadelphia chromosome (Ph) was discovered in patients with chronic myeloid leukaemia (CML)42. With the advent of molecular genetic techniques during the 1980s11, the breakpoints of genetic

discovered. The cancer genes located at frequently rearranged genomic sites were divided into two functional classes: the “dominant” oncogenes (i.e. a single mutated allele is sufficient to contribute to tumorigenesis) and the “recessive” tumor suppressor genes (i.e. both alleles need to be mutated). Chromosomal rearrangements were recognized as mechanisms of cancer gene deregulation, resulting in abnormal gene expression profiles and aberrant growth and proliferation of cancer cells. The first molecular consequence of a genetic aberration to be explained was the activation of the MYC oncogene by translocations involving the MYC locus on chromosome 8 and immunoglobulin (Ig) gene loci (chromosome 2: IgK locus; chromosome 14: IgH locus; chromosome 22: IgL locus)43. Due to the translocation, MYC expression is increased by the Ig transcriptional enhancer. Another oncogenic deregulation mechanism is active in the above-mentioned Ph-positive CML, where the t(9;22) leads to the formation of a BCR/ABL fusion transcript that activates the ABL tyrosine kinase inappropriately44. The functional consequences of recurrent chromosomal rearrangements at the molecular level are of two major types: they can create a chimeric gene coding for a tumorigenic fusion transcript, or juxtapose a gene to the regulatory elements of a partner gene, such as gene promoters or transcriptional enhancers, resulting in deregulated expression. The two main groups of genes deregulated (i.e. activated but also inactivated) by the formation of a fusion gene are those encoding tyrosine kinases (mainly activated) and transcription factors (transcriptional activity can be enhanced, aberrant, or repressed)45. Inactivating mechanisms of tumor suppressor genes were discovered later, mainly by studying familial cancers and in particular childhood tumors, where losses of tumor suppressor loci were demonstrated46. The widely described, simplified alteration scheme of tumor suppressor genes involves first a mutation of one allele (inactivating mutation) and then the loss of the wild-type allele (LOH, Fig. 3). According to this model, two successive mutational events are required in the same cell in order to inactivate tumor suppressor genes, as proposed by Knudson with the two-hit hypothesis in retinoblastoma47. Briefly, Knudson’s hypothesis states that inactivation of a tumor suppressor gene follows either inheritance or occurrence of a spontaneous mutation, with subsequent loss of the other wild-type allele. As previously reported, the genetic mechanisms leading to LOH are more varied than this simplistic representation (Figs. 2-3).

In general, cancer genes are characterized by a redundancy of deregulation mechanisms. Oncogenes can be activated by gene dosage alterations, point mutations, translocations, promoter hypomethylation, constitutively active transcription factors, or absence of control by a tumor suppressor gene, if the latter is inactivated. Tumor suppressor genes can be inactivated by gene dosage alterations, LOH events, point mutations, or epigenetic silencing by promoter hypermethylation. While the functional consequences of genetic aberrations like somatic point mutations, epigenetic changes, LOH, CN alterations of small genomic regions or chromosomal rearrangements with breakpoints are fairly clear, the pathogenetic significance of aneuploidy or CN imbalances affecting large genomic regions containing multiple genes remains difficult to explain.

A recent version (January 22, 2007) of the Cancer Gene Census of the Cancer Genome Project at the Sanger Institute, originally described by Futreal et al.10, contains 363 cancer genes whose aberrations are considered causal in the development of specific cancers. Of these, 70 are tumor suppressor genes, 292 are oncogenes and one can act as both. Only seven (2%) of these oncogenes were shown to be predominantly activated by amplification, compared to 268 (92%), which are activated mainly by chromosomal translocation. However, a recent publication suggests that amplification is a more common mechanism of oncogene activation than previously believed48.

In recent years, advanced high-throughput technologies have provided an overview of the wide variety of somatic mutations49 and copy-number alterations50-52 found in human cancers. In addition, large-scale association studies have uncovered genome variations that determine the genetic susceptibility of individuals to various types of cancer53,54.

In conclusion, the identification of genetic alterations can lead to the discovery of genes, transcripts and pathways that play a relevant role in cancer. Moreover, neoplasia-associated genomic abnormalities are clinically useful, since they can be used for diagnosis and tumor classification, and can be helpful in the selection of appropriate treatment modalities, and in some cases, they can be exploited as therapeutic targets.

DNA amplification

Amplification refers to an increase of at least four copies of an intrachromosomal DNA segment that is less than 20 Mb in length20. Amplified DNA can be organized as repeated units at a single genomic locus or as units scattered throughout various chromosomes, but also as extra-chromosomal elements55.

DNA amplifications can be detected at cytogenetic level by means of standard microscopic techniques, and at molecular level by fluorescence in situ hybridization (FISH), comparative genomic hybridization (CGH), and array CGH (aCGH). Cytogenetic manifestations of DNA amplification are homogeneously staining regions (HSRs) and extrachromosomal acentric DNA fragments, such as double minutes (DMs) and episomes56. HSRs are made of inverted tandem repeats within a chromosome, whereas DMs are circular, extra-chromosomal elements, a few Mb in size, that replicate autonomously. Episomes are only ~250 kb in length and can be identified only with molecular methods.

Frequently amplified chromosomal loci have been detected using computational modelling on published CGH data57. Mainly, they co-localize with well known oncogenes such as BCL2, typical for Non-Hodgkin’s lymphoma (NHL), EGFR in glioma and non-small cell lung cancer, MYCN in neuroblastoma, ERBB2 in breast and ovarian cancer, and MYC in various cancers. In fact, DNA amplification is one of the mechanisms by which oncogenes are activated, as

DNA amplification is the increased expression of the targeted genes. However, its impact on gene expression may vary among different cancer types30,31,58.

Gene amplification is a mechanism of oncogene deregulation mainly observed in solid tumors. In contrast, amplifications occur with lower frequencies in haematological malignancies, where translocation events are thought to be prevalent57. In a recent survey of amplifications in 104 cell lines of various tissues of origin, leukaemias and lymphomas had on average 10 amplicons, whereas epithelial cancers had an average of 36 amplicons48.

Amplifications probably arise through DNA DSBs at both ends of the amplified region, as predicted by models explaining their formation, such as the breakage–fusion–bridge cycle and the excision and unequal segregation of extrachromosomal DNA fragments59. According to the breakage–fusion–bridge (BFB) model59 leading to the formation of HSRs, the double-stranded DNA break is induced by telomere erosion/dysfunction or fragile sites. The two uncapped sister chromatids then fuse, apparently as a consequence of aberrant DNA repair mechanisms. During mitosis, at anaphase, the dicentric, fused chromatids form a bridge, where DNA is arranged in head-to-head position. If this structure breaks asymmetrically, the daughter cells will receive either a chromatid with duplicated genetic material, or a chromatid from which DNA was deleted. BFB cycles could result in amplified inverted copies of a genomic region. On the other hand, the excision-relocation model59 describes the selection of circular extrachromosomal chromatin bodies (DMs and episomes) at cell division as a gene amplification mechanism. These circular elements are probably formed by excision following loop formation, in some cases due to replication bubbles, or by circularization of fragments derived from HSR breakdown59. They can persist extrachromosomally, or can be transposed to another chromosome as HSRs or distributed insertions.
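The BFB model lends itself to a toy simulation. In the minimal sketch below, a chromatid that has lost its telomere is written as a list of (segment, orientation) pairs read from the centromere to the broken end; the segment names, the function names and the chosen break positions are purely illustrative assumptions. Fusion of the two sister chromatids is modelled as joining the chromatid to its own inverted copy, and the anaphase break as a cut of the resulting dicentric at a given index.

```python
from collections import Counter


def invert(chromatid):
    """Return the chromatid read from the opposite end (order reversed, strands flipped)."""
    return [(segment, -strand) for segment, strand in reversed(chromatid)]


def bfb_cycle(chromatid, break_index):
    """Simulate one breakage-fusion-bridge cycle on an uncapped chromatid.

    The sister chromatids fuse head-to-head at their broken ends, giving a
    dicentric structure that is broken at `break_index` during anaphase.
    Both daughters are returned read from their own centromere.
    """
    dicentric = chromatid + invert(chromatid)
    if not 0 < break_index < len(dicentric):
        raise ValueError("the break must leave one centromere on each side")
    daughter_1 = dicentric[:break_index]
    daughter_2 = invert(dicentric[break_index:])
    return daughter_1, daughter_2


def copy_counts(chromatid):
    """Count how many copies of each segment a chromatid carries."""
    return Counter(segment for segment, _ in chromatid)


arm = [("cen", 1), ("A", 1), ("B", 1), ("C", 1)]      # telomere already lost
gain, loss = bfb_cycle(arm, break_index=6)             # asymmetric break
print(gain)    # B and C now present twice, the second copy inverted
print(loss)    # B and C deleted
gain_2, _ = bfb_cycle(gain, break_index=10)            # a second cycle on the gained chromatid
print(copy_counts(gain_2))
```

A break in the middle of the dicentric (index 4 in the first cycle) returns two unchanged chromatids, which is why only asymmetric breaks generate the inverted duplications and deletions described above.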

DNA amplifications might be promoted by the presence of specific genome sequences (e.g. fragile sites, repetitive elements), defects in DNA replication or telomere dysfunction55, as well as by environmental factors59. Fragile sites are chromosomal regions prone to breakage under conditions of replication stress, and it was shown that common fragile sites co-localize with amplifications60. Hellman et al. demonstrated that the breakpoints of the intrachromosomal amplification of the MET oncogene in a human gastric carcinoma are located within the common fragile site FRA7G region. Since their discovery, fragile sites have been thought to facilitate genetic recombination events giving rise to cancer. For example, the FHIT gene, a particularly large gene discovered at the most active common fragile region FRA3B, plays a role in tumor development as an inactivated tumor suppressor gene, and the WWOX gene at 16q23.1 has a similar role61. We recently observed that FHIT, WWOX and other genes at fragile sites are preferentially targeted by genomic deletions in HIV-NHL4 (manuscript in preparation). A statistical study evaluating the influence of fragile sites, large genes and cancer genes on the preferential localization of DNA amplifications co-localized the amplification hot spots with known fragile sites, cancer genes, and virus integration sites, but

DNA amplifications is mainly directed by the presence of oncogenes and of large genes, while fragile sites might be involved in the DNA amplification mechanism without affecting its precise localization. Similar results are reported in a study involving 104 cancer cell lines48. No statistically significant association was shown between amplifications and known fragile sites in the human genome, but amplifications co-localized at some hot spots with known cancer genes and fragile sites.

DNA amplification usually appears in advanced cancers, where p53-mediated maintenance of genomic integrity is lost62, but this is not a universal rule. In fact, amplification was also observed in early or pre-malignant stages, demonstrating that it is not simply a feature of the highly rearranged genomes of advanced tumors55.

The biological specificity and significance of gene amplifications make them attractive targets for clinical applications, such as prognostic evaluation63,64, diagnosis and therapy65, as also shown by us1. In fact, the gain-of-function effect of gene amplifications makes them ideal targets for therapeutic intervention, due to the direct nature of their activation and the fact that a tumor can become addicted to the enhanced expression of the affected gene65. There is increasing interest in the genome-wide study of amplified regions in cancer samples as putative loci that harbour genes important for tumor development. The study of gene amplification has the potential to lead to the identification of novel oncogenes, and to characterize tumors from a prognostic and diagnostic point of view. The main difficulty in the study of amplified regions is the identification of the driver gene, because several genes might map in the investigated region. One possibility to improve the prediction of candidate oncogenes is based on the simultaneous evaluation of their expression level: overexpression linked to increased DNA level is a strong indicator of oncogene behaviour.
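The idea of ranking the genes of an amplified region by the agreement between their DNA level and their expression can be sketched as follows. This is only an illustration under stated assumptions: the gene names and values are invented, both data types are taken to be available for the same samples in the same order, and a plain Pearson correlation stands in for whatever statistic an actual integrative analysis would use.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0                      # no variation, correlation undefined
    return covariance / (spread_x * spread_y)


def rank_candidates(copy_number, expression):
    """Rank genes by the correlation between DNA copy number and expression."""
    scores = {gene: pearson(copy_number[gene], expression[gene])
              for gene in copy_number}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log2 copy number ratios for two genes of one amplicon in five samples.
copy_number = {
    "GENE_A": [0.0, 0.1, 1.5, 2.0, 0.0],
    "GENE_B": [0.0, 0.1, 1.5, 2.0, 0.0],
}
# Hypothetical log2 expression values in the same five samples.
expression = {
    "GENE_A": [5.0, 5.2, 8.1, 9.0, 4.9],   # follows the DNA level
    "GENE_B": [6.0, 5.8, 6.1, 5.9, 6.0],   # unaffected by the DNA level
}

for gene, score in rank_candidates(copy_number, expression):
    print(gene, round(score, 2))
```

In this toy data set both genes are co-amplified, but only the first one shows the dosage-dependent overexpression that would make it a candidate driver.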

Genetic aberrations in non-coding regions

Besides the deregulation of cancer genes, genetic aberrations may also perturb the function of non-coding RNAs (ncRNAs). This RNA class seems to be mainly involved in translational regulation, and its altered expression has been associated with cancer and other complex diseases (e.g. Alzheimer’s disease, mental retardation, psychiatric disorders)66. Well-characterized ncRNAs with known function include ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs), which are involved in mRNA translation, small nuclear RNAs (snRNAs), which are involved in splicing, small nucleolar RNAs (snoRNAs), which are involved in the modification of rRNAs, and microRNAs (miRNAs). Recently, another type of ncRNAs has been described, the ultraconserved genes (UCGs)67.

Two ncRNAs are known to play a role in tumorigenesis: miRNAs and the recently discovered UCGs. Both miRNAs68,69 and UCGs67 are frequently located at rearranged regions involved in cancer, and the expression level of miRNAs can be influenced by the underlying DNA copy number69.

MicroRNA genes encode small, non-coding RNA molecules. Once mature they are about 22 nt long, and are involved in transcriptional and translational regulation of gene expression. Many miRNAs have conserved sequences between distantly related organisms, suggesting an essential role in physiologic processes, which has been demonstrated for some pathways linked to development, differentiation, cell cycle regulation, apoptosis and metabolism70. Approximately 50% of all miRNAs are embedded within introns of protein-coding genes or non-coding RNA transcripts71. miRNAs bind to target mRNAs by base pairing at partially complementary sites (in plants, perfect base pairing), mainly located in the 3’-untranslated region (UTR)72,73. Each miRNA has the potential to regulate a large number of genes through their UTRs, and several miRNAs can also have the same target, giving rise to a complex regulation network of miRNAs and miRNA targets. miRNAs are very similar to siRNAs in their function. The main difference lies in the biogenesis process (reviewed in Kim et al.74). Both siRNAs and miRNAs are double-stranded before incorporation into the RNA-induced silencing complex (RISC), but prior to insertion one strand is removed. The strand of the miRNA duplex stably associated with RISC represents the mature miRNA, which determines the fate of the miRNA and its target mRNA. In case of perfect complementarity between mature miRNA and target mRNA (as in plants), the latter is cleaved and degraded. In contrast, the regulatory mechanism observed in case of imperfect base pairing is mainly translational silencing, but a reduction of target mRNA level was also observed75.
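The relation between complementarity and outcome described above can be illustrated with a short sketch. The 22-nt sequence is arbitrary and does not correspond to any specific miRNA, the function names are hypothetical, only Watson-Crick pairs are counted (G:U wobble pairs are ignored), and no attempt is made to decide whether a partially complementary site would be bound at all.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Substitution guaranteed to change each base, used only to build a mismatched site.
SWAP = {"A": "C", "C": "A", "G": "U", "U": "G"}


def reverse_complement(rna):
    """Return the perfectly complementary target site, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def paired_fraction(mirna, site):
    """Fraction of miRNA positions forming a Watson-Crick pair with the target site.

    Both sequences are written 5' to 3'; the duplex is antiparallel, so the
    site is read backwards when it is aligned to the miRNA.
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and target site must have the same length")
    paired = sum(PAIR[m] == s for m, s in zip(mirna, reversed(site)))
    return paired / len(mirna)


def predicted_outcome(fraction):
    """Map complementarity to the mechanism described in the text."""
    if fraction == 1.0:
        return "target cleavage and degradation"
    return "mainly translational silencing"


mirna = "UGAGCUAACGUUCAGGCUAAUC"                     # arbitrary 22-nt sequence
perfect_site = reverse_complement(mirna)
imperfect_site = (perfect_site[:8]
                  + "".join(SWAP[b] for b in perfect_site[8:11])
                  + perfect_site[11:])                # three central mismatches

for site in (perfect_site, imperfect_site):
    fraction = paired_fraction(mirna, site)
    print(round(fraction, 2), predicted_outcome(fraction))
```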

Growing interest is dedicated to the role of specific microRNAs in cancer. In fact, evidence has shown that losses of miRNAs with tumor-suppressive activity may contribute to tumorigenesis. For example, loss of mir-15a and mir-16-1 has been associated with the 13q14.3 deletion in chronic lymphocytic leukaemia76. mir-15a and mir-16-1 are located in a region without known coding genes, which is also frequently deleted in mantle cell lymphoma, multiple myeloma and prostate cancer. They were found to negatively regulate the expression of the anti-apoptotic BCL2 protein77, therefore playing a role in B-cell survival, but these results have not been widely confirmed78. An oncogenic role of overexpressed miRNAs in cancer has also been established. For example, this is the case for mir-155, both in solid and in haematological malignancies79,80, or for the mir-17-92 cluster, located at 13q31-32, a region frequently amplified in malignant B-cell lymphoma81. In our laboratory we detected, in the human mantle cell lymphoma cell line JEKO1, a high copy number amplification at 13q31 targeting both MYC and the mir-17-92 cluster82. To conclude on the role of miRNAs in cancer, they can potentially act both as oncogenes and as tumor suppressor genes, depending on the specific set of conditions they are subjected to. Moreover, miRNA expression profiles in human cancers can be significantly informative and can potentially help in tumor classification, diagnosis and prognosis prediction83.

Ultraconserved regions (UCRs) are a subset of conserved sequences located both in intra- and intergenic regions, that are strictly conserved among orthologous regions of the human,

human cancers (leukaemia and carcinomas) and their location at known regions involved in cancer67. Moreover, they proved in a cancer model that a differentially expressed UCG had oncogenic properties, as decreasing its overexpression induced apoptosis in colon cancer cells. UCGs are a relatively new field of research, and new investigations are needed to better define the functional significance of their alterations in human cancer.

In conclusion, genetic aberrations in tumors of these two ncRNA classes, miRNAs and UCGs, support a model of carcinogenesis in which both coding and non-coding genes are playing a role67,85.

1.1.2 Genomic profiling

One of the central aims of cancer research is the comprehensive knowledge of genomic alterations driving oncogenesis. Due to the complexity of the disease, this implies the use of global approaches and the integration of genome-wide analyses at DNA, RNA and functional levels86. With the available human genome sequence87,88 and the advent of high-throughput techniques like microarray platforms, it is now possible to survey genome-wide DNA copy number abnormalities and gene expression alterations at high resolution89.

Genomic profiling plays a very important role, both for the identification of diagnostic and prognostic factors and new therapeutic targets, and for the development of tumor classification systems that reflects the underlying biology. Basically, genomic profiling enables high-throughput screening of the genomic and gene expression alterations in tumor samples.

Once candidate genes or transcripts are identified, other independent techniques are applied for confirming the results and studying the pathological mechanisms linked to the observed alterations.
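The integration of DNA copy number and expression data described above can be sketched as follows. This is only a minimal illustration, not the analysis pipeline used in this work: the function names, the cutoffs (`gain_cutoff`, `min_r`, `min_gained`), the gene labels and all values are hypothetical assumptions. For each gene, the sketch checks whether the locus is gained in several samples and whether copy number and expression are positively correlated across samples.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: no measurable correlation
    return sxy / (sxx * syy) ** 0.5


def candidate_genes(copy_number, expression,
                    gain_cutoff=0.3, min_r=0.7, min_gained=2):
    """Return genes gained in >= min_gained samples whose expression
    correlates with copy number (r >= min_r).

    copy_number, expression: dicts mapping gene -> list of per-sample
    values (same sample order). All cutoffs are illustrative assumptions.
    """
    selected = []
    for gene, cn in copy_number.items():
        expr = expression.get(gene)
        if expr is None:
            continue  # gene not measured on the expression platform
        n_gained = sum(1 for value in cn if value >= gain_cutoff)
        if n_gained >= min_gained and pearson(cn, expr) >= min_r:
            selected.append(gene)
    return sorted(selected)


# Hypothetical toy data: log2 copy number ratios and expression values
# for four samples; the gene names are placeholders.
toy_copy_number = {
    "GENE_A": [0.0, 0.1, 0.8, 0.9],
    "GENE_B": [0.0, 0.5, 0.6, 0.0],
    "GENE_C": [0.0, 0.0, 0.1, 0.0],
}
toy_expression = {
    "GENE_A": [5.0, 5.2, 7.9, 8.1],
    "GENE_B": [6.0, 5.0, 5.1, 6.2],
    "GENE_C": [4.0, 4.1, 4.0, 4.2],
}
print(candidate_genes(toy_copy_number, toy_expression))
```

Genes selected by such a screen would only be candidates; as stated above, independent techniques remain necessary to confirm them.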

1.1.2.1 Array comparative genomic hybridization (aCGH)

Genome-wide DNA profiling, sometimes simply referred to as genomic profiling in the context of DNA characterization studies, is a powerful initial approach for cancer research. It allows the simultaneous identification of multiple genomic alterations in various specimens, in an unbiased manner and in a short time.

Array comparative genomic hybridization (aCGH) is based upon conventional CGH. The latter appeared as a valid approach to detect DNA copy number changes due to duplications/amplifications, deletions and unbalanced translocations. In CGH, the DNA to be tested and a reference DNA are differentially labelled and hybridized on a glass slide with normal metaphase chromosome spreads90. The ratio of the hybridization signal intensity of test and reference samples at any interrogated locus is detected. The assumption is that the relative amount of test and reference DNA bound to a given chromosomal locus is

measurement of the ratio of the intensities of the two different dyes along the target chromosomes, regions of impaired DNA content can be detected.
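The ratio-based principle can be illustrated with a short sketch. The text only states that the ratio of test to reference signal intensity is measured at each locus; the log2 transformation and the cutoff values used here are common conventions taken as assumptions, and the function names and intensities are hypothetical.

```python
import math


def log2_ratio(test_intensity, reference_intensity):
    """Log2 ratio of test to reference hybridization signal at one locus."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def call_locus(ratio, gain_cutoff=0.3, loss_cutoff=-0.3):
    """Classify a locus from its log2 ratio (cutoffs are illustrative)."""
    if ratio >= gain_cutoff:
        return "gain"
    if ratio <= loss_cutoff:
        return "loss"
    return "balanced"


def profile(loci):
    """loci: iterable of (name, test_intensity, reference_intensity)."""
    return [(name, call_locus(log2_ratio(test, ref)))
            for name, test, ref in loci]


# Hypothetical intensities for three loci.
for name, status in profile([("locus_1", 150.0, 100.0),
                             ("locus_2", 100.0, 100.0),
                             ("locus_3", 50.0, 100.0)]):
    print(name, status)
```

Equal amounts of test and reference DNA give a log2 ratio of 0, whereas a relative excess or deficit of test DNA shifts the ratio above or below 0.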

CGH studies provided many useful data on genetic events occurring in lymphoid neoplasms91-94. A drawback of CGH is that the DNA of the chromosomal spreads on the slide is still highly condensed and supercoiled, thus the resolution of the technique is limited: approximately 10-12 Mb for deleted regions and approximately 2 Mb only for high level (> 5-10 fold) amplifications. CGH is also not able to identify small intragenic alterations, ploidy changes, alterations affecting pericentromeric and heterochromatic regions, and balanced translocations.

To overcome some of these limits, microarray-based CGH, or aCGH, was implemented. Solinas-Toldo et al. performed matrix-based CGH on glass slides with spotted probe DNAs of different types: cosmids, chromosome-specific DNA libraries, P1 or PAC clones of genomic fragments (75-130 kb)95. They demonstrated that matrix-based CGH was at least as efficient as CGH on metaphase chromosomes in detecting high-copy-number amplifications and low-copy-number changes. Moreover, PAC probes greatly enhanced the resolution and allowed the detection of imbalances not identified by chromosomal CGH. The possibility to spot many DNA probes in an ordered manner on a substrate made of glass or nylon opened this technology to automation and high-throughput screening, thus rendering aCGH a simpler and more efficient procedure to detect DNA copy number changes.

Two examples are a very recent work on diffuse large B-cell lymphoma (DLBCL)96 and another one on acute lymphoblastic leukaemia (ALL)50. The study of 203 DLBCL samples by high-resolution, genome-wide copy number analysis identified 272 recurrent chromosomal aberrations that were associated with gene expression alterations96. With both gene expression and genomic profiles available, the authors demonstrated that 30 chromosomal alterations were differentially distributed among DLBCL subtypes previously defined by different gene expression profiles, termed germinal center B-cell-like (GCB) DLBCL, activated B-cell-like (ABC) DLBCL, and primary mediastinal B-cell lymphoma (PMBL)94. For example, an amplicon on chromosome 19 was detected in 26% of ABC DLBCLs but only in 3% of GCB DLBCLs and PMBLs. A highly upregulated gene in this amplicon was SPIB, which encodes an ETS family transcription factor and is probably involved in the pathogenesis of ABC DLBCL. The data presented in that work provide genetic evidence that the DLBCL subtypes are distinct diseases that use different oncogenic pathways.
Another recent genome-wide analysis of genetic alterations in ALL reaffirms the power of high-resolution genome-wide approaches as an initial step to detect new oncogenic lesions50. Two hundred and forty-two pediatric ALL patients were analyzed by means of SNP (single nucleotide polymorphism) arrays and genomic DNA sequencing. The work revealed the involvement of principal regulators of B-cell development and differentiation in ALL pathogenesis. PAX5 was identified as the most frequent target of somatic mutations (deletions, translocations, point mutations) leading to
