
SCNAs profiling in large datasets

1. Cancer and genomics

1.4. Comprehensive analysis of large datasets

1.4.1. SCNAs profiling in large datasets

Figure 3. Methylation status of the MGMT promoter and its relationship with treatment and mutational contexts in glioblastoma. a. The y axis corresponds to the number of mutations. The x axis corresponds to the treatment status of a patient (+ treated, - non-treated), the methylation status of the MGMT promoter (Meth = methylated, - non-methylated), and the mutational status of MMR genes (Mut = at least one MMR gene mutated, - non-mutated). The numbers under the bars represent the number of samples in each group. b. Mutational spectrum of the MMR genes as a function of treatment and methylation status of the MGMT promoter. Color codes for both graphs are at the bottom. Reproduced from Cancer Genome Atlas Research, 2008.

Using SNP arrays, Zack and collaborators studied SCNAs across 4,934 tumor samples of 11 different cancer types (Zack et al., 2013) as part of the Pan-Cancer analysis project (Cancer Genome Atlas Research et al., 2013), with the objective of distinguishing driver from passenger events and identifying the mechanisms of SCNA acquisition in cancer. They also aimed to pinpoint the key genes within an SCNA that ultimately drive the cancer phenotype. After inferring the SCNA profiles that best explained the ploidy determined for each tumor, the authors called 202,244 SCNAs (median of 39 per tumor sample) and classified them into 6 categories (Table 2).

| SCNA category | Median per tumor |
| --- | --- |
| Focal copy gain, smaller than chromosome arm | 11 |
| Focal copy loss, smaller than chromosome arm | 12 |
| Arm-level copy gain, full arm-length or longer | 3 |
| Arm-level copy loss, full arm-length or longer | 5 |
| cnLOH | 1 |
| Whole genome duplication | 37% of cancers |
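The focal versus arm-level distinction used in these categories can be illustrated with a minimal sketch. It assumes that each event is summarized by its length, the length of the chromosome arm it lies on, and the sign of its copy-number change; the `Scna` type and the `classify` function are hypothetical names introduced here and do not reproduce the inference procedure of Zack et al.

```python
from dataclasses import dataclass


@dataclass
class Scna:
    """Hypothetical summary of one copy-number event (illustration only)."""
    length_bp: int      # length of the altered segment
    arm_length_bp: int  # length of the chromosome arm it lies on
    copy_change: int    # > 0 for a gain, < 0 for a loss


def classify(event: Scna) -> str:
    """Assign an event to one of the four gain/loss categories."""
    if event.copy_change == 0:
        # Copy-neutral events such as cnLOH need allelic information,
        # which this simplified record does not carry.
        raise ValueError("copy-neutral events are not handled in this sketch")
    # Arm-level: full arm length or longer; focal: shorter than the arm.
    scale = "arm-level" if event.length_bp >= event.arm_length_bp else "focal"
    direction = "gain" if event.copy_change > 0 else "loss"
    return f"{scale} copy {direction}"
```

For example, `classify(Scna(5_000_000, 60_000_000, 1))` falls in the focal copy gain category, while an event spanning the whole arm with a negative copy change is an arm-level copy loss. Whole genome duplication and cnLOH are deliberately left out, since they cannot be decided from a single segment described this way.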

They observed that cancers with whole genome duplication (WGD) had twice the rate of SCNAs of tumors without it. Consistent with this, WGD tumors had an average ploidy of 3.31 rather than the 4 expected after a doubling, while tumors without WGD had an average ploidy of 1.99 (close to the expected 2). WGD occurred early in the history of SCNA events in tumors, while other types of SCNAs arose after the WGD event. The average copy number profile for these 11 cancer types in the WGD or near-euploid state can be seen in Figure 4a.
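To make the ploidy figures concrete, ploidy can be read as the average copy number over the genome. The sketch below assumes a length-weighted mean over copy-number segments; the `ploidy` function and the segment values are hypothetical and are not the estimation method used in the study.

```python
def ploidy(segments):
    """Length-weighted mean copy number.

    segments: iterable of (length, total_copy_number) pairs.
    """
    segments = list(segments)
    total_length = sum(length for length, _ in segments)
    if total_length == 0:
        raise ValueError("at least one segment with non-zero length is required")
    weighted = sum(length * copy_number for length, copy_number in segments)
    return weighted / total_length


# Hypothetical doubled genome: 31% of its length still at 4 copies,
# 69% reduced to 3 copies by losses that followed the doubling.
doubled_with_losses = [(31, 4), (69, 3)]
wgd_ploidy = ploidy(doubled_with_losses)
```

With these made-up proportions the mean is (31 x 4 + 69 x 3) / 100 = 3.31, which shows how a doubled genome that later loses copies over much of its length can end up well below a ploidy of 4.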

Focal SCNAs that extended to the telomeres were longer than intrachromosomal SCNAs. These internal SCNAs had frequencies inversely proportional to their length, while telomeric SCNAs had a roughly uniform length distribution (Figure 4b) and were more frequent than expected assuming random positions for SCNAs (P<0.0001). SCNAs in general tended to end at the centromeres.

Table 2. Types of SCNAs across 11 tumor types. Events were assessed from SNP array data of 4,934 tumors (Zack et al., 2013).


Figure 4. Characteristics of different types of SCNAs. a. Number of amplifications (red) or deletions (blue) in 10 cancer types from an arm-level or a focal perspective (top and bottom, respectively). In each cancer type, samples with WGD events are on the right and samples without WGD on the left; SCNAs in samples with WGD are resolved according to their timing relative to the WGD event. b. Distribution of lengths of SCNAs originating at telomeres compared to intrachromosomal SCNAs. c. Rates of chromothripsis across different cancer types. BLCA = bladder, BRCA = breast, COAD = colon and rectal carcinoma, GBM = glioblastoma multiforme, HNSC = head and neck squamous cell, KIRC = kidney renal clear cell, LUAD = lung adenocarcinoma, LUSC = lung squamous cell, OV = ovary, UCEC = uterine corpus endometrial carcinoma. All three panels reproduced from Zack et al., 2013.

Chromothripsis was detected in 5% of samples, with frequencies that varied by tumor type (Figure 4c) but were unrelated to the overall rate of SCNAs per sample. It tended to occur in specific regions and was associated with particular driver events.

Across all cancer types, 70 recurrent amplifications and the same number of recurrent deletions were identified. The authors identified “peak” regions within these SCNAs that were more likely to contain oncogenes or tumor suppressor genes. SCNAs within the peak regions were shorter than events occurring elsewhere in the chromosome (P<0.0001) and were also more often high-amplitude events (P<0.0001). The frequency of events in these peak regions was stable across tumors of the same lineage. Of the 70 peak regions of amplification, 24 contained an oncogene known to be activated by amplification (such as CCND1, EGFR, MYC, ERBB2, and CCNE1) or other genes directly involved in carcinogenesis, such as TERC, which encodes the RNA template used by TERT, a known oncogene. Of the peak regions of deletion, 12 contained tumor suppressor genes (such as ATM, NOTCH, FOXK2 and PPP2R2A) and two other regions had tumor suppressor gene candidates (ERRFI1 and FOXC1).

The peaks that contained no obvious cancer gene candidates were subjected to literature citation searching algorithms, and enrichment for topics related to epigenetic and mitochondrial regulation was observed. This finding stresses the relevance of epigenetic alterations in cancer progression, in concordance with previous observations (Berman et al., 2012; Fullgrabe et al., 2011). When significantly mutated genes (SMG) were called within the peak regions, the authors identified all genes known to be tumor suppressors as well as a significant fraction of genes known to act as oncogenes through amplification. Deleted regions are probably enriched in tumor suppressor genes, as they had more truncating and frameshift deletions than expected (P=0.0002). Furthermore, of the 770 peak regions identified across specific cancer lineages, 84% occurred in at least two lineages and 65% were inside peak regions from the pan-cancer analysis.

This study was the first high-resolution analysis of SCNAs and ploidy across several cancer types. It reports areas of the genome that are enriched in SCNAs in cancer and that probably contain genes or regulatory elements acting as drivers of tumorigenesis, and it confirms the relevance of several previously identified tumor suppressor genes and oncogenes. It stresses the advantages of unbiased approaches in large datasets for the identification of events and genes important in carcinogenesis.