Article
Reference
Prospective validation in epithelial tumors of a gene expression predictor of liver metastasis derived from uveal melanoma
TSANTOULIS, Petros, et al.
Abstract
Predicting the risk of liver metastasis can have important prognostic and therapeutic implications, given the availability of liver-directed therapy. Uveal melanoma has a striking predisposition for liver metastasis despite the absence of anatomical proximity. Understanding its biology may uncover factors promoting liver metastasis in other malignancies. We quantified gene expression by RNAseq in 76 uveal melanomas and combined with public data in a meta-analysis of 196 patients. The meta-analysis of uveal melanoma gene expression identified 63 genes which remained prognostic after adjustment for chromosome 3 status.
Two genes, PTP4A3 and JPH1, were selected by L1-penalized regression and combined in a prognostic score. The score predicted liver-specific relapse in a public pan-cancer dataset and in two public colorectal cancer datasets. The score varied between colorectal consensus molecular subtypes (CMS), as did the risk of liver relapse, which was lowest in CMS1.
Additional prospective validation was done by real-time PCR in 463 breast cancer patients.
The score was significantly correlated with liver relapse in [...]
TSANTOULIS, Petros, et al. Prospective validation in epithelial tumors of a gene expression predictor of liver metastasis derived from uveal melanoma. Scientific Reports, 2019, vol. 9, no.
1, p. 17178
DOI : 10.1038/s41598-019-52841-y PMID : 31748560
Available at:
http://archive-ouverte.unige.ch/unige:144609
Disclaimer: layout of this document may differ from the published version.
1
Prospective validation in epithelial tumors of a gene expression
1
predictor of liver metastasis derived from uveal melanoma.
2
Petros Tsantoulis,1,2,3* Mauro Delorenzi,3,4,5 Ivan Bièche,6 Sophie Vacher,6 Pascale 3
Mariani,7 Nathalie Cassoux,7 Alexandre Houy,6,8 Marc-Henri Stern,6,8 Sergio Roman- 4
Roman,9 Pierre-Yves Dietrich,1,2 Arnaud Roth,1,2 Wulfran Cacheux10,1,6 5
1. Hôpitaux Universitaires de Genève, Service d’Oncologie, Geneva, Switzerland 6
2. University of Geneva, Geneva, Switzerland 7
3. SIB Swiss Institute of Bioinformatics, Bioinformatics Core Facility, Lausanne, 8
Switzerland 9
4. University Lausanne, Department of Fundamental Oncology, Lausanne, Switzerland 10
5. Ludwig Institute for Cancer Research, Epalinges, Switzerland 11
6. Institut Curie, Département de génétique, Paris, France 12
7. Institut Curie, Département de chirurgie, Paris, France 13
8. Institut Curie, PSL Research University, INSERM U830, Paris, France 14
9. Institut Curie, PSL Research University, Translational Research Department, Paris, 15
France 16
10. Hôpital Privé – Pays de Savoie, Oncology department, Annemasse, France 17
18
19
Supplementary methods
20
1. Processing of array expression data 21
Both processed and raw data, when available, were imported from the GEO and 22
ArrayExpress. Raw Affymetrix data were treated with the RMA algorithm from the “affy”
23
BioConductor (v3.4) package using the default settings (1).
24
Scaling of different datasets was performed with the scale() function from base R to a mean 25
of zero and a variance of one. This was done in order to make distributions from different 26
technologies (microarray, RNAseq) comparable, so that a single score equation can be used 27
across datasets without modification. Otherwise the coefficients of the gene score would 28
have to be re-calculated (and validated) separately for each dataset.
29
2. Molecular classes 30
Based on the molecular classification of uveal melanoma into two classes by Onken et al.
31
(2), we used their published list of 60 class-specific genes to assign the most likely class to 32
each tumor. The distribution of the difference between the means of the expression of class 33
1 and the mean of class 2 genes, was clearly bimodal (supplementary figure 7) and was used 34
to classify the tumors into the two classes. Visual inspection of the gene expression 35
heatmap (supplementary figure 8) further supported this classification.
36
3. Univariable and bivariable survival models 37
For the purposes of the meta-analysis in uveal melanoma, the Cox survival models were 38
specified as follows:
39
Univariable: coxph(Surv(interval, event) ~ exp[x,]) 40
Bivariable: coxph(Surv(interval, event) ~ chr3 + exp[x,]) 41
where chr3 is a factor (categorical variable) with two levels (loss/no loss) and exp[x,] is the 42
standardized expression of gene x, within each dataset.
43
4. Consensus Molecular Subtypes (CMS) of colorectal cancer 44
The molecular classification of colorectal cancer has been published previously(3) and an 45
open-source reference classifier is available.1 This classifier was used to classify PETACC and 46
GSE14095 data based on both the RandomForest and SSP methods. Whenever the 47
RandomForest failed to assign a CMS class, the SSP algorithm was used. The proportion of 48
CMS groups was similar to previously published results.
49
5. Derivation of a prognostic linear model 50
The pooled data of the model-fitting dataset (training, N=196 patients) were used to train a 51
prognostic model with the penalized likelihood algorithm of the GLMnet package (coxnet 52
method). The following parameters were used for the derivation of a linear predictor:
53
nfolds=16, to increase the number of cross-validation folds, and maxit=200000, to increase 54
the iterations that are allowed until convergence can be achieved. The random generator 55
1 https://github.com/Sage-Bionetworks/CMSclassifier
3 seed was fixed in order to ensure reproducibility. To improve the chances of convergence 56
on a parsimonious model, the algorithm was applied only on candidate genes, which were 57
associated with survival after adjustment for chromosome 3 status at FDR<0.1 in the meta- 58
analysis (N=119, S. Table 2).
59
The linear predictor was trained to correlate with the risk of relapse. The predictor is a 60
linear combination of two variables p and j, corresponding to the standardized expression of 61
PTP4A3 and JPH respectively. The score S is calculated with the following formula:
62
S = 0.249 * p + 0.147 * j 63
64
The resulting values are mean-centered and standardized within each dataset.
65
In order to verify the robustness of the model-generating procedure, we also performed 66
100 consecutive runs without fixing the random number generator seed. Genes PTP4A3 and 67
JPH1 were included in all the resulting predictive models.
68 69
References 70
1. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of 71
Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003 Feb;31(4):e15.
72
2. Onken MD, Worley LA, Ehlers JP, Harbour JW. Gene expression profiling in uveal 73
melanoma reveals two molecular classes and predicts metastatic death. Cancer Res.
74
2004 Oct;64(20):7205–9.
75
3. Guinney J, Dienstmann R, Wang X, de Reyniès A, Schlicker A, Soneson C, et al. The 76
consensus molecular subtypes of colorectal cancer. Nat Med. 2015 Oct;21(11):1350–
77
6.
78 79
Figure S1 80
Kaplan-Meier plot of subgroups from figure 1B. The differences between a-vs-b and a-vs-c 81
are significant (Cox regression, P-values shown below).
82
83 84
5 Figure S2
85
Hierarchical clustering with Pearson's correlation similarity and average linkage. Heatmap of 86
all available genes. The vertical color bars to the left show patients with RFS of less than 3 87
years, chromosome 3 monosomy and molecular class (all in black). The most significantly 88
prognostic genes in the meta-analysis are shown for chromosome 8 (twenty genes in light 89 green) and other chromosomes (twenty genes in blue). The horizontal color bar below the 90
heatmap is grey for genes that are not significantly prognostic and black for FDR<0.05.
91
It can be seen that genes PTP4A3 and JPH1 belong to distinct clusters, probably 92
corresponding to associated pathways. BAP1 (in red) is also distinct and centered on a 93
narrow gene cluster.
94
95 96
Figure S3 97
Risk score and relapse-free survival in uveal melanoma (validation data). Pooled data from 98
GSE39717 and TCGA. The score was separated at a median cutoff for plotting.
99
100
A B
7 Figure S4
101
Boxplot of the risk score in tumors from the pan-cancer dataset. A. Comparison of tumors 102
without liver metastasis (including non-metastatic tumors and tumors with other 103
metastases) with tumors that presented liver metastases. B. Comparison of tumors with 104
liver metastases with metastatic tumors that presented metastases in other sites.
105
106 107
A B
Figure S5 108
Distribution of the risk score in colorectal cancer. A bimodal distribution of the risk score is 109
observed in both datasets (PETACC and GSE14095). Significant differences can be seen 110
between the four CMS groups in the lower part of the figure (Wilcoxon’s test, with Holm- 111
Bonferroni P-value adjustment for multiple hypothesis testing).
112
113 114
9 Figure S6
115
Data flow chart. A summary of the different datasets that have been used and their role in 116
the training and validation process.
117 118
119
Figure S7 120
Distribution of the molecular class gene expression. The values visibly separate the tumors 121
into two distinct peaks, corresponding to the two molecular classes.
122 123 124
125
Class 1 Class 2
11 Figure S8
126
Heatmap of genes that have been associated with Class 1 and Class 2 tumors. The color 127
bar to the left corresponds to predicted molecular class (red=class 1, deep blue=class 2).
128 129
130 131