HAL Id: hal-00832028
https://hal-upec-upem.archives-ouvertes.fr/hal-00832028v2
Submitted on 30 Jun 2013
Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License
Clust&See: A Cytoscape plugin for the identification,
visualization and manipulation of network clusters
Lionel Spinelli, Philippe Gambette, Charles Chapple, Benoît Robisson, Anaïs
Baudot, Henri Garreta, Laurent Tichit, Alain Guénoche, Christine Brun
To cite this version:
Lionel Spinelli, Philippe Gambette, Charles Chapple, Benoît Robisson, Anaïs Baudot, et al..
Clust&See: A Cytoscape plugin for the identification, visualization and manipulation of network
clusters. BioSystems, Elsevier, 2013, 113 (2), pp.91-93. ⟨10.1016/j.biosystems.2013.05.010⟩.
⟨hal-00832028v2⟩
BioSystems 113 (2013) 91–95
Clust&See: A Cytoscape plugin for the identification, visualization and manipulation of network clusters ☆,☆☆
Lionel Spinelli a,d, Philippe Gambette a,d,1, Charles E. Chapple b,d, Benoît Robisson b,d, Anaïs Baudot a,d, Henri Garreta c,d, Laurent Tichit a,d, Alain Guénoche a,d, Christine Brun b,d,e,∗
a Institut de Mathématiques de Luminy, CNRS – FRE3529, Avenue de Luminy, 13288 Marseille Cedex 9, France
b Techniques Avancées pour le Génome et Clinique, INSERM – U1090, Avenue de Luminy, 13288 Marseille Cedex 9, France
c Laboratoire d'Informatique Fondamentale de Marseille, CNRS – UMR7279, Avenue de Luminy, 13288 Marseille Cedex 9, France
d Aix-Marseille Université, Campus de Luminy, 13288 Marseille Cedex 9, France
e CNRS, France
Article info
Article history: Received 3 April 2013; Received in revised form 22 May 2013; Accepted 22 May 2013
Keywords: Interaction networks; Graph partitioning; Clustering; Visualization
Abstract
Background and scope: Large networks, such as protein interaction networks, are extremely difficult to analyze as a whole. We developed Clust&See, a Cytoscape plugin dedicated to the identification, visualization and analysis of clusters extracted from such networks.
Implementation and performance: Clust&See provides the ability to apply three different, recently developed graph clustering algorithms to networks and to visualize: (i) the obtained partition as a quotient graph in which nodes correspond to clusters and (ii) the obtained clusters as their corresponding subnetworks. Importantly, tools for investigating the relationships between clusters and vertices as well as their organization within the whole graph are supplied.
© 2013 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
1. Introduction
The field of functional genomics is producing a large amount of data, often represented as interaction networks, or undirected graphs. These graphs typically contain thousands of vertices, rendering the extraction of pertinent biological information a daunting task. Graph partitioning or clustering methods have been used to highlight groups of densely connected vertices (Aittokallio and Schwikowski, 2006) which, in the field of protein interactions, often correspond to clusters of proteins involved in the same cellular process(es).
Cytoscape is a popular and versatile software platform (Shannon et al., 2003) for network visualization and analysis. While a number of Cytoscape plugins such as ClusterMaker (Morris et al., 2011) or ClusterOne (Nepusz et al., 2012) can identify clusters from graphs, they mainly focus on visualizing the obtained clusters individually and independently as subnetworks to further investigate their node composition. However, exploring the relationships between clusters in detail is as important as studying their internal composition. Indeed, while proteins involved in the same process(es) interact within clusters, links between clusters correspond to crosstalk between processes. Communication between processes can also be performed by proteins belonging to several clusters. Consequently, considering the links between network clusters permits a better understanding of the modularity of biological networks and the functional transitions imposed by the integrative organization levels, from proteins to functional modules to entire systems.
To fill this gap, we have developed Clust&See, a truly interactive tool that can (i) automatically decompose a network into clusters; (ii) visualize those clusters as metanodes linked by several types of edges/relationships; (iii) manipulate the clusters for further detailed visualization, analyses and comparisons.
☆ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-No Derivative Works License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.
☆☆ Availability: http://tagc.univ-mrs.fr/tagc/index.php/clustnsee
∗ Corresponding author at: TAGC, U1090 Inserm-AMU, Parc Scientifique de Luminy case 928, 163, Avenue de Luminy, 13288 Marseille Cedex 09, France. Tel.: +33491828712.
E-mail address: brun@tagc.univ-mrs.fr (C. Brun).
1 Current address: Laboratoire d'Informatique Gaspard-Monge, CNRS – UMR8049, Cité Descartes, Bât Copernic – 5, bd Descartes, Champs sur Marne, 77454 Marne-la-Vallée Cedex 2, France.
2. Software description
Clust&See is a Cytoscape plugin developed for Cytoscape version 2.8. Some GUI elements have been reused from code from the MCODE (Bader and Hogue, 2003) and ClusterViz (http://apps.cytoscape.org/apps/clusterviz) plugins.
Fig. 1. Quotient graphs obtained when clustering the PI3K network with the 3 implemented algorithms.
2.1. Implemented graph clustering algorithms
To date, three clustering algorithms based on the optimization of Newman's modularity (Newman, 2004) have been implemented in Clust&See. While TFit and FT lead to disjoint clusters, OCG leads to overlapping ones:
(1) FT (for Fusion-Transfer) (Guénoche, 2011) is an ascending hierarchical method fusing two clusters iteratively if the fusion results in a modularity gain. The algorithm starts with singletons and stops when further fusions lead to a loss in modularity. Modularity is then further optimized by transferring vertices from one cluster to another.
(2) TFit (for iterated Transfer-Fusion) (Gambette and Guénoche, 2012) is a multi-level algorithm in which a vertex transfer procedure is performed at every level. Level one corresponds to the network. While modularity increases, each node is associated to its best adjacent cluster. Classical transfers are then performed and a quotient graph is computed; clusters then become the nodes of the next level to be further optimized.
(3) OCG (for Overlapping Cluster Generator) (Becker et al., 2012) is an ascending hierarchical method fusing two clusters at each step. Initially, an overlapping class system formed by either (i) maximal cliques, or (ii) edges or (iii) centered cliques is built. These classes are then merged, while modularity increases, resulting in overlapping clusters.
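To make the modularity criterion shared by the three algorithms more concrete, the following Python sketch computes Newman's modularity for disjoint clusters of an undirected, unweighted graph and applies greedy fusions starting from singletons, in the spirit of the fusion phase described for FT. It is only an illustration under stated assumptions: it is not the plugin's Java implementation, it omits the vertex transfer phase, and the function names (`modularity`, `best_fusion`, `greedy_fusion`) are hypothetical.

```python
from itertools import combinations


def modularity(edges, clusters):
    """Newman's modularity Q for disjoint clusters of an undirected, unweighted graph."""
    m = len(edges)
    label = {v: i for i, c in enumerate(clusters) for v in c}
    intra = [0] * len(clusters)   # edges with both endpoints in cluster i
    degree = [0] * len(clusters)  # sum of vertex degrees in cluster i
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(intra[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(clusters)))


def best_fusion(edges, clusters):
    """Apply the single fusion with the largest modularity gain.

    Returns the input list unchanged if no fusion increases modularity.
    """
    best_q, best = modularity(edges, clusters), clusters
    for i, j in combinations(range(len(clusters)), 2):
        merged = [c for k, c in enumerate(clusters) if k not in (i, j)]
        merged.append(clusters[i] | clusters[j])
        q = modularity(edges, merged)
        if q > best_q:
            best_q, best = q, merged
    return best


def greedy_fusion(edges):
    """Start from singletons and fuse clusters until no fusion improves modularity."""
    vertices = {v for e in edges for v in e}
    clusters = [{v} for v in sorted(vertices)]
    while True:
        fused = best_fusion(edges, clusters)
        if len(fused) == len(clusters):  # no improving fusion was found
            return clusters
        clusters = fused
```

On a toy graph made of two triangles joined by a single edge, this procedure ends with the two triangles as clusters, with modularity 5/14. Recomputing the modularity from scratch for every candidate fusion keeps the sketch short but is far less efficient than the incremental updates a real implementation would use.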
Performance values in terms of time and memory of the three algorithms are provided as Supplementary Material. Clustering results produced by Clust&See can be exported as text files, and subsequently re-imported and re-mapped to the original network, avoiding repetitive computation. Importantly, results from external clustering tools can be analyzed with Clust&See. Currently, the R package "Linkcomm" (Kalinka and Tomancak, 2011), in which the LinkCommunities (Ahn et al., 2010) and OCG (Becker et al., 2012) algorithms are implemented, provides output files that are compatible with Clust&See (for further information on supported formats, see the online Documentation, http://tagc.univ-mrs.fr/tagc/index.php/software/clustnsee/clustnseedocumentation). Finally, the modular structure of Clust&See makes it easy to implement other clustering algorithms directly in Java.
2.2. Visualization and analysis
The clustering results can be visualized in Clust&See as a quotient graph in which clusters are represented as metanodes whose width is proportional to the number of their constituent vertices (Fig. 1). Metanodes can be linked by two types of "metaedges", one (black) whose width is proportional to the number of interactions between their vertices and, most importantly, one (green) whose width is proportional to the number of vertices shared by overlapping clusters computed by algorithms such as OCG. A docked Result Panel provides a sortable list of the clusters in which each cluster's subnetwork is displayed along with its relevant features, such as size or edge density.
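The quantities behind such a quotient graph can be sketched in a few lines of Python. The example below derives, from an edge list and a possibly overlapping clustering, the metanode sizes, the interaction counts behind the black metaedges and the shared-vertex counts behind the green metaedges. It is a minimal sketch and not the plugin's code: the function name `quotient_graph` is hypothetical, and the convention used for overlapping clusters (an interaction is counted only when its two endpoints each belong to just one of the two clusters) is an assumption, since the text does not specify it.

```python
from itertools import combinations


def quotient_graph(edges, clusters):
    """Summarize a (possibly overlapping) clustering as metanodes and metaedges.

    edges: list of (u, v) pairs of an undirected graph.
    clusters: dict mapping a cluster id to the set of its vertices.
    Returns (sizes, interactions, shared).
    """
    sizes = {cid: len(members) for cid, members in clusters.items()}
    interactions = {}  # "black" metaedges: interactions between two clusters
    shared = {}        # "green" metaedges: vertices common to two clusters
    for a, b in combinations(sorted(clusters), 2):
        common = clusters[a] & clusters[b]
        if common:
            shared[(a, b)] = len(common)
        # Assumed convention: ignore edges that touch a shared vertex.
        only_a = clusters[a] - clusters[b]
        only_b = clusters[b] - clusters[a]
        count = sum(
            1 for u, v in edges
            if (u in only_a and v in only_b) or (u in only_b and v in only_a)
        )
        if count:
            interactions[(a, b)] = count
    return sizes, interactions, shared
```

A drawing layer would then scale metanode and metaedge widths in proportion to these counts.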
Novel views, into which clusters of interest can be successively loaded, can be created on demand. An "Expand/collapse nodes" function allows the user to switch from the cluster/metanode representation to the corresponding subnetwork of vertices and vice versa (Fig. 2). Details provided in the Data Panel upon selection of the different objects (vertices, edges) facilitate the study of the relationships between clusters. The composition of each metanode is provided as well as, importantly, the composition of the metaedges representing the shared objects (nodes or edges) between cluster pairs.
Fig. 2. A 'New Cluster View' showing the clusters 5, 12 and 25 generated by the OCG algorithm as metanodes and the two edge types linking them. Black metaedges connect vertices of one cluster to those of another. Green metaedges represent the nodes shared between the clusters. Details are shown in the Data Panel when selecting a green or a black metaedge. An example of an expanded metanode is also given.
When two different partitions are computed on the same network (using different algorithms or different parameters), they are compared using the Jaccard index (Jaccard, 1901) which provides a measure of the partitions' similarity. A contingency table listing the number of shared nodes between the clusters of each partition is provided for further analysis.
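As a hedged illustration of this comparison, the Python sketch below computes a Jaccard index between two clusterings together with their contingency table. The text does not state which form of the index is used; the sketch assumes the common pair-counting variant, in which the index is the number of vertex pairs joined in both clusterings divided by the number of pairs joined in at least one. The function names are hypothetical.

```python
from itertools import combinations


def joined_pairs(partition):
    """Set of unordered vertex pairs that share at least one cluster."""
    return {
        frozenset(pair)
        for cluster in partition.values()
        for pair in combinations(sorted(cluster), 2)
    }


def jaccard_index(p1, p2):
    """Pair-counting Jaccard index between two clusterings (assumed variant)."""
    pairs1, pairs2 = joined_pairs(p1), joined_pairs(p2)
    union = pairs1 | pairs2
    # Two clusterings made only of singletons join no pair; treat them as identical.
    return len(pairs1 & pairs2) / len(union) if union else 1.0


def contingency_table(p1, p2):
    """Number of vertices shared by each cluster of p1 and each cluster of p2."""
    return {(a, b): len(p1[a] & p2[b]) for a in p1 for b in p2}
```

For example, the partitions {1, 2, 3}, {4, 5} and {1, 2}, {3, 4, 5} join two common pairs out of six distinct pairs, giving an index of 1/3.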
In addition, because launching an analysis on a very large network or selection may lead, at best, to unmanageable results (too many clusters) and, at worst, to memory issues and very long computation times, Clust&See offers the user the choice of extracting sub-networks of interest on which to continue the analysis by using the "Build neighborhood network" functionality.
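The idea of restricting an analysis to a neighborhood can be sketched as follows. The exact behavior of the "Build neighborhood network" function is not described here, so this Python example simply assumes that the sub-network consists of a set of seed nodes, every node within a given number of steps of a seed, and all edges among the retained nodes. The function name and the `radius` parameter are hypothetical.

```python
from collections import defaultdict, deque


def neighborhood_network(edges, seeds, radius=1):
    """Return (nodes, edges) of the sub-network induced by the seeds
    and all nodes at most `radius` steps away from a seed."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    distance = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:  # breadth-first search, cut off at `radius`
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    kept = set(distance)
    return kept, [(u, v) for u, v in edges if u in kept and v in kept]
```

The resulting edge list can then be clustered in place of the full network.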
Finally, the provided search function can identify a specific node among the clusters of all partitions under investigation.
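Conceptually, such a search is a lookup across all stored partitions. A minimal Python sketch, assuming each partition is kept as a mapping from cluster identifier to vertex set (a hypothetical data layout, not the plugin's internal one), could be:

```python
def find_node(node, partitions):
    """For each partition, list the ids of the clusters containing `node`.

    partitions: dict mapping a partition name to a dict {cluster id: set of vertices}.
    """
    return {
        name: sorted(cid for cid, members in clusters.items() if node in members)
        for name, clusters in partitions.items()
    }
```

With overlapping clusterings, a node may be reported in several clusters of the same partition; with disjoint ones, in at most one.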
3. Application
3.1. From clusters to nodes
Fig. 1 shows the results obtained when applying the 3 algorithms currently implemented in Clust&See to the PI3K interactome network (Pilot-Storck et al., 2010). A global view of each partition as a quotient graph, in which the obtained clusters are represented as metanodes, is given. Note that this view is shown by default when the partition contains no more than 15 clusters/metanodes, but can always be displayed on demand for larger partitions. These views are one of the original features provided by the plug-in.
The PI3K pathway transmits signals from receptors located at the cell surface to transcription factors in the nucleus, via an intracellular signaling cascade involving several kinases. To illustrate the value of a local analysis using Clust&See, we have chosen to explore the connections between 3 overlapping clusters generated by the OCG algorithm, containing a majority of receptor-binding proteins (Cluster 12), serine/threonine kinases (Cluster 25) and nucleic acid binding proteins (Cluster 5) respectively. The 3 clusters are represented as metanodes in Fig. 2, which shows a "New Cluster View" created on the fly. In addition, Cluster 25, formed by 13 nodes linked by 16 intra-cluster edges, is shown in the Cluster Browser of the Results Panel. Details on the mono/multi-clustered status of vertices are given in the Data Panel. Clusters sharing vertices (like Clusters 5 and 25) are easily identifiable since they are linked by a green metaedge whose details are shown in the Data Panel upon selection: two kinases, KS6B1 and PK3CA, belong to both clusters, suggesting a possible functional link between those proteins. Interestingly, particular variants of the genes encoding these proteins have been found to interact genetically in a case-control study for colorectal cancers (Slattery et al., 2011). A visual exploration of the organization of the clusters with Clust&See can therefore help build hypotheses and point toward relevant functional objects (represented as nodes, metanodes, edges or metaedges) and their relationships. Finally, metanodes can be expanded (and subsequently collapsed) in order to visualize the underlying subnetwork. The combination of the details shown in the Data Panel and the visualization of the composite subnetwork greatly facilitates the study of the identified clusters and their connections.
Fig. 3. The result of a search for PARP1 in the clusters obtained from the different algorithms is shown. The two clusters containing PARP1 according to OCG are provided as a 'New Cluster View' and the expanded metanodes are shown below them.
3.2. From nodes to clusters
Fig. 3 illustrates the search for a particular node, PARP1, across the different partitions obtained when using all three algorithms on the same network. PARP1 belongs to two OCG clusters (P3 partition in the Results Panel). Interestingly, when both clusters are loaded in a "New Cluster View" for further analysis, the Data Panel shows that PDK1 is also shared by the same clusters, suggesting a possible functional link between PARP1 and PDK1. This is further confirmed when the expanded cluster view is generated, showing that both proteins interact directly. This interaction could in part explain the fact that co-targeting the PI3K pathway improves the response of cancer cells to PARP1 inhibition (Kimbung et al., 2012).
4. Conclusion
The Cytoscape plug-in Clust&See aims to facilitate network clustering and analysis for biologists not only by providing several original functionalities but also by providing them within a single analysis framework. While ClusterMaker (Morris et al., 2011) can represent clusters as metanodes, it provides neither intra/extra-edge visualization nor the possibility to expand/collapse the metanodes. Similarly, while ClusterOne (Nepusz et al., 2012) can identify overlapping clusters, studying the relationships between these clusters is not possible because their combined representation is not supported. In addition, Clust&See enables (i) better evaluation of the biological meaning of network clustering, (ii) better understanding of the underlying reasons for a particular node classification, (iii) better estimation of the quality of the network under scrutiny and (iv) adjusting the clustering algorithm choice to the studied network. In summary, the originality of Clust&See lies in its providing users with a complete tool for the creation and analysis of network clusters and the relationships between them.
Acknowledgements
We thank Marine Veyssière for the update of the PI3K interactome. This work is supported by the "Agence Nationale de la Recherche" as a Piribio grant (09-PIRI-0028, Moonlight project to C.B.) and as a partner of the ERASysbio+ initiative supported under the EU ERANET Plus scheme in FP7 (ModHeart Project). B.R. is a PhD fellow of the AXA Research Fund. C.C. is a postdoctoral fellow of the "Fondation pour la Recherche Médicale".
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.biosystems.2013.05.010.
References
Ahn, Y.-Y., Bagrow, J.P., Lehmann, S., 2010. Link communities reveal multiscale complexity in networks. Nature 466, 761–764.
Aittokallio, T., Schwikowski, B., 2006. Graph-based methods for analysing networks in cell biology. Brief Bioinform. 7, 243–255.
Bader, G.D., Hogue, C.W., 2003. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 4, 2.
Becker, E., Robisson, B., Chapple, C.E., Guénoche, A., Brun, C., 2012. Multifunctional proteins revealed by overlapping clustering in protein interaction network. Bioinformatics 28, 84–90.
Gambette, P., Guénoche, A., 2012. Bootstrap clustering for graph partitioning. RAIRO – Oper. Res. 45, 339–352.
Guénoche, A., 2011. Consensus of partitions: a constructive approach. Adv. Data Anal. Classification 5, 215–229.
Jaccard, P., 1901. Distribution de la flore alpine dans le bassin des Dranses et de quelques régions voisines. Bull. Soc. Vaud. Sci. Nat. 37, 241–272.
Kalinka, A.T., Tomancak, P., 2011. linkcomm: an R package for the generation, visualization, and analysis of link communities in networks of arbitrary size and type. Bioinformatics 27, 2011–2012.
Kimbung, S., Biskup, E., Johansson, I., Aaltonen, K., Ottosson-Wadlund, A., Gruvberger-Saal, S., Cunliffe, H., Fadeel, B., Loman, N., Berglund, P., et al., 2012. Co-targeting of the PI3K pathway improves the response of BRCA1 deficient breast cancer cells to PARP1 inhibition. Cancer Lett. 319, 232–241.
Morris, J.H., Apeltsin, L., Newman, A.M., Baumbach, J., Wittkop, T., Su, G., Bader, G.D., Ferrin, T.E., 2011. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinform. 12, 436.
Nepusz, T., Yu, H., Paccanaro, A., 2012. Detecting overlapping protein complexes in protein–protein interaction networks. Nat. Methods 9, 471–472.
Newman, M.E., 2004. Fast algorithm for detecting community structure in networks. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 69, 066133.
Pilot-Storck, F., Chopin, E., Rual, J.-F., Baudot, A., Dobrokhotov, P., Robinson-Rechavi, M., Brun, C., Cusick, M.E., Hill, D.E., Schaeffer, L., et al., 2010. Interactome mapping of the phosphatidylinositol 3-kinase-mammalian target of rapamycin pathway identifies deformed epidermal autoregulatory factor-1 as a new glycogen synthase kinase-3 interactor. Mol. Cell Proteomics 9, 1578–1593.
Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., Ideker, T., 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504.
Slattery, M.L., Lundgreen, A., Herrick, J.S., Wolff, R.K., 2011. Genetic variation in RPS6KA1, RPS6KA2, RPS6KB1, RPS6KB2, and PDK1 and risk of colon or rectal cancer. Mutat. Res. 706, 13–20.