• Aucun résultat trouvé

Automated exploration of gene ontology term and pathway networks with ClueGO-REST

N/A
N/A
Protected

Academic year: 2021

Partager "Automated exploration of gene ontology term and pathway networks with ClueGO-REST"

Copied!
4
0
0

Texte intégral

(1)

HAL Id: hal-02414247

https://hal.sorbonne-universite.fr/hal-02414247

Submitted on 16 Dec 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Automated exploration of gene ontology term and pathway networks with ClueGO-REST

Bernhard Mlecnik, Jérôme Galon, Gabriela Bindea

To cite this version:

Bernhard Mlecnik, Jérôme Galon, Gabriela Bindea. Automated exploration of gene ontology term

and pathway networks with ClueGO-REST. Bioinformatics, Oxford University Press (OUP), 2019, 35

(19), pp.3864-3866. �10.1093/bioinformatics/btz163�. �hal-02414247�

(2)

Systems biology

Automated exploration of gene ontology term and pathway networks with ClueGO-REST

Bernhard Mlecnik 1,2 , Je´roˆme Galon 1 and Gabriela Bindea 1, *

1

INSERM, Laboratory of Integrative Cancer Immunology, Equipe Labellise´e Ligue Contre le Cancer, Sorbonne Universite´, Universite´ Sorbonne Paris Cite´, Universite´ Paris Descartes, Universite´ Paris Diderot, Centre de Recherche des Cordeliers, Paris F-75006, France and

2

Inovarion, Paris 75013, France

*To whom correspondence should be addressed.

Associate Editor: Jonathan Wren

Received on November 28, 2018; revised on February 28, 2019; editorial decision on March 1, 2019; accepted on March 5, 2019

Abstract

Summary: Large scale technologies produce massive amounts of experimental data that need to be investigated. To improve their biological interpretation we have developed ClueGO, a Cytoscape App that selects representative Gene Onology terms and pathways for one or multiple lists of genes/proteins and visualizes them into functionally organized networks. Because of its reli- ability, userfriendliness and support of many species ClueGO gained a large community of users.

To further allow scientists programmatic access to ClueGO with R, Python, JavaScript etc., we implemented the cyREST API into ClueGO. In this article we describe this novel, complementary way of accessing ClueGO via REST, and provide R and Phyton examples to demonstrate how ClueGO workflows can be integrated into bioinformatic analysis pipelines.

Availability and implementation: ClueGO is available in the Cytoscape App Store (http://apps.cyto scape.org/apps/cluego).

Contact: gabriela.bindea@crc.jussieu.fr

Supplementary information: Supplementary data are available at Bioinformatics online.

1 Introduction

High-throughput technologies produce large amounts of experimen- tal data that need to be investigated to gain insights into biological processes. Systemic approaches for data integration, analysis and visualization were developed to reflect not only individual biological components, but also their interactions in pathways and networks.

Software tools that perform such type of analyses in an automatic way are more and more needed.

Cytoscape (Shannon et al., 2003) is a major computational plat- form to visualize and analyze networks. Cytoscape Automation (https://github.com/cytoscape/cytoscape-automation) enables scien- tific workflows written in many languages and scales Cytoscape to large datasets and pipelines. The Cytoscape programmatic interface for this is CyREST (Ono et al., 2015).

We have contributed to the Cytoscape App collection (Saito et al., 2012) with ClueGO (Bindea et al., 2009) and CluePedia (Bindea et al., 2013) apps that are broadly used by the scientific

community to enhance the interpretation of biological data (Mlecnik et al., 2018). Within ClueGO, representative gene ontol- ogy (GO) terms (Ashburner et al., 2000) as well as KEGG (Kanehisa et al., 2002), WikiPathways (Pico et al., 2008) and Reactome (Croft et al., 2011) pathways are integrated into a functionally organized network. Furthermore, ClueGO can compare the biological role of several lists of genes/proteins.

To perform multiple analyses in complex workflows can be time consuming and prone to errors. We have thus enabled the cyREST Application Programming Interface (API) in ClueGO, to allow sci- entists programmatic access to functional analyses. We describe here this new functionality of the ClueGO App.

2 ClueGO functional analysis via cyREST

The Cytoscape App Manager (http://cytoscape.org/) allows the auto- matic download of the latest version of ClueGO from the App Store

V

C

The Author(s) 2019. Published by Oxford University Press. 3864

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Bioinformatics, 35(19), 2019, 3864–3866 doi: 10.1093/bioinformatics/btz163 Advance Access Publication Date: 7 March 2019 Applications Note

Downloaded from https://academic.oup.com/bioinformatics/article-abstract/35/19/3864/5371065 by BIUS Jussieu user on 16 December 2019

(3)

(Lotia et al., 2013). The yFiles Layout Algorithms App should be installed as well. ClueGO is written in Java programing language implementing the OSGi interface of Cytoscape. Starting with version 2.5.0 ClueGO implements the cyREST core plugin API and provides programmatic access to its functionality (Fig. 1A). ClueGO features REST enabled can be explored in the cyREST API Swagger (Fig. 1B), accessible via the Cytoscape menu, Help ! Automation

! CyREST API.

ClueGO can be hence accessed with both the graphical user inter- face of Cytoscape as well as programmatical through cyREST (Fig. 1 and Supplementary Material). These two ways of performing analyses complement each other and answer to different types of user requests.

2.1 ClueGO analysis steps

The four main steps of a typical ClueGO analysis and additional op- tional REST enabled features are shown in Supplementary Table S1.

The ClueGO-REST enabled functions are accessed by a URL that is built up by the host address (e.g. localhost), the port (e.g. 1234) and the feature requested by the user. The HTTP request return types are then encoded as JSON, tab delimited text or binary data.

The selection of an organism is the first step. Human and mouse data sources are included by default in ClueGO and more than 200 other organisms are available for download. The organism to ana- lyze has to be set e.g. human:/v1/apps/cluego/cluego-manager/organ- isms/set-organism/‘Homo Sapiens’.

The second step requires the upload of one or several lists of genes/proteins to analyze. ClueGO automatically recognizes mul- tiple identifier types based on information from NCBI (NCBI Resource Coordinators, 2018), UniProtKB (The UniProt Consortium, 2017) and Ensembl (Aken et al., 2017) databases.

Different colors and shapes are automatically attributed to the clus- ters, to visualize them on the network.

In the third step representative GO terms and pathways are selected using predefined filters based on the number of associated genes found from the uploaded list, their percentage from the total number of genes of the term or the GO tree level. Additionally, GO terms can be selected based on particular evidence codes of the gene- term associations. The significance of the pathways and their simi- larity in terms of associated genes are automatically mapped on the network and illustrated in different interchangeable visual styles.

After running the enrichment analysis through the last Step 4 the functionally grouped network with the terms connected and grouped based on kappa score is created (Supplementary Fig. S1).

The kappa score is calculated by taking into account how many genes are shared among two terms and is also used to define the functional groups of terms and pathways. The results of the statistic- al analysis, the network as well as other graphical representations of the results can be downloaded through cyREST functions. The ana- lysis and visualization can be customized. Large networks can be refined by applying the fusion of similar terms or by visualizing only significant pathways (optional steps).

Debug messages appear during the workflow if mandatory ana- lysis steps are skipped, if the term selection is too restrictive/permis- sive or when other exceptions occur.

2.2 Use cases

R and Python ClueGO analysis examples with one or two lists of genes are provided as Supplementary Material. B and NK cell genes (Critchley-Thorne et al., 2007; Edgar et al., 2002) were analysed, and results are illustrated as a network of pathways showing either functional groups (Supplementary Fig. S1) or the origin of the genes

in the two lists (Supplementary Fig. S2). All available ClueGO- REST endpoints are illustrated, including optional features.

3 Summary

In this article, we describe the REST enabled ClueGO functionality and show how scientists can integrate ClueGO with other Cytoscape apps and non-Cytoscape libraries to perform functional analyses in a programmatic way. Detailed information on ClueGO can be found at http://www.ici.upmc.fr/cluego.

Acknowledgements

The authors thank the Cytoscape team for their great work and support.

Funding

This work was supported by grants from INSERM (‘Integrative Cancer Immunology’, ‘Cancer et Environnement’, ‘Heterogeneity of Colorectal and Liver Cancer’ (HETCOLI), the National Cancer Institute of France (INCa), Canceropole Ile de France, MedImmune, AstraZeneca, the Transcan ERAnet European Project [Grant number TRANS201401218], La Ligue contre le Cancer, the Qatar National Research Fund (QNRF), Cancer research for per- sonalized medicine (CARPEM), Paris Alliance of Cancer Research Institutes (PACRI) and LabEx Immuno-Oncology.

A

B

Fig. 1. (A) ClueGO can be accessed through the Cytoscape Graphical User Interface (GUI) and the cyREST Programmatic User Interface (UI). (B) ClueGO functions REST enabled in the CyREST API Swagger

ClueGO-REST 3865

Downloaded from https://academic.oup.com/bioinformatics/article-abstract/35/19/3864/5371065 by BIUS Jussieu user on 16 December 2019

(4)

Conflict of Interest: none declared.

References

Aken,B.L. et al. (2017) Ensembl 2017. Nucleic Acids Res., 45, D635–D642.

Ashburner,M. et al. (2000) Gene ontology: tool for the unification of biology The Gene Ontology Consortium. Nat. Genet., 25, 25–29.

Bindea,G. et al. (2009) ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics, 25, 1091–1093.

Bindea,G. et al. (2013) CluePedia Cytoscape plugin: pathway insights using integrated experimental and in silico data. Bioinformatics, 29, 661–663.

Critchley-Thorne,R.J. et al. (2007) Down-regulation of the interferon signal- ing pathway in T lymphocytes from patients with metastatic melanoma.

PLoS Med., 4, e176.

Croft,D. et al. (2011) Reactome: a database of reactions, pathways and bio- logical processes. Nucleic Acids Res., 39, D691–D697.

Edgar,R. et al. (2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res., 30, 207–210.

Kanehisa,M. et al. (2002) The KEGG databases at GenomeNet. Nucleic Acids Res., 30, 42–46.

Lotia,S. et al. (2013) Cytoscape app store. Bioinformatics, 29, 1350–1351.

Mlecnik,B. et al. (2018) Comprehensive functional analysis of large lists of genes and proteins. J. Proteomics, 171, 2–10.

NCBI Resource Coordinators (2018) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 46, D8–D13.

Ono,K. et al. (2015) CyREST: turbocharging Cytoscape access for external tools via a RESTful API. F1000Res., 4, 478.

Pico,A. et al. (2008) WikiPathways: pathway editing for the people. PLoS Biol., 6, e184.

Saito,R. et al. (2012) A travel guide to Cytoscape plugins. Nat. Methods, 9, 1069–1076.

Shannon,P. et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498–2504.

The UniProt Consortium (2017) UniProt: the universal protein knowledge- base. Nucleic Acids Res., 45, D158–D169.

3866 B.Mlecnik et al.

Downloaded from https://academic.oup.com/bioinformatics/article-abstract/35/19/3864/5371065 by BIUS Jussieu user on 16 December 2019

Références

Documents relatifs

a Upset plot representing overlaps between lists of enriched GO terms, b Clustering heatmap plot combining a dendrogram based on Wang ’ s semantic similarity distance and

The weighted-kappa score for consensus clustering (solid line) calculated by comparing consensus clusters to the corresponding individual clustering algorithm is shown relative

When the transcription rates were slowed by diluting both UTP and rNTPs, there was still a slight preference towards the WRC hotspot motifs but activity also peaked at

A comparison of the annotations included when using the old (2012) slim, or the new (2016), slim-o-matic generated one, on a sample from the EBI Metagenomics dataset.. Using the old

Solution of (1) then corresponds to solving a spin coupling problem, ana- logous to L. S coupling of atomic spectroscopy. An external magnetic field can be used to

The biological interpretation of gene lists was performed by three approaches: study of GO enrichment terms with GOTM (Gene Ontology Tree Machine) and two pathway software with

GenFam* is dedicated to the manual and precise analysis of gene families and includes specific workflows running under a Galaxy platform, allowing to gather

GenesPath: a tool using GenFam, and focused on metabolic pathways.. GenesPath: launch workflow through