orthologous genes and proteins
J.A. Miñarro-Gimenez 1, Marisa Madrid 2, J.T. Fernandez-Breis 1
1 Departamento de Informatica y Sistemas, Universidad de Murcia, Spain, emails:
{jose.minyarro,jfernand}@um.es
2 Cell Division Group, Paterson Institute for Cancer Research, University of
Manchester, UK, email: [email protected]
Abstract
The integration of biomedical resources is necessary due to the amount of
heterogeneous information continuously being generated. In this work, we address
the integration of existing information about orthologous genes and proteins.
The process followed in this work consists of three main steps: (1) construction
of the ontology representing the biological knowledge of the resources; (2)
definition of the mappings between the ontologies and the resources, which are
needed to support the data integration process; and (3) integration, through
which an ontological knowledge base is created. The ontology was implemented
in OWL and contains 52 concepts, 9 object properties and 2 datatype properties,
together with cardinality and disjointness constraints. As a result, a semantic resource,
which integrates the resources used by YOGY [1], has been obtained, containing
approximately 1,168,000 orthologous genes, 956,000 proteins connected to genes,
and 114,000 orthologous clusters. The development of this integrated ontological
repository provides a series of advantages: (1) more powerful and efficient usage
of the information; (2) the integration process allows for removing redundancy,
so the information improves in precision and quality, and researchers save time and
effort; (3) the repository can be integrated with different tools
using the same ontology, and the knowledge can easily be enriched and extended
by reusing other bio-ontologies.
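As an illustration only, steps (2) and (3) above can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in this work: the resource names, field names, property names and sample records are all hypothetical, and the knowledge base is reduced to a plain set of triples instead of an OWL repository.

```python
# Hypothetical mappings (step 2): resource name -> {source field: ontology property}.
# None of these names come from the actual resources or ontology.
MAPPINGS = {
    "resource_a": {"gene_id": "hasIdentifier", "cluster": "belongsToCluster"},
    "resource_b": {"id": "hasIdentifier", "group": "belongsToCluster"},
}


def to_triples(resource, record, key_field):
    """Translate one source record into (subject, property, value) triples."""
    mapping = MAPPINGS[resource]
    subject = "Gene:" + record[key_field]
    return {
        (subject, prop, record[field])
        for field, prop in mapping.items()
        if field in record
    }


def integrate(sources):
    """Step 3: merge all resources; the set discards redundant statements."""
    knowledge_base = set()
    for resource, key_field, records in sources:
        for record in records:
            knowledge_base |= to_triples(resource, record, key_field)
    return knowledge_base


# Two hypothetical resources that describe gene g1 redundantly.
sources = [
    ("resource_a", "gene_id", [{"gene_id": "g1", "cluster": "c1"}]),
    ("resource_b", "id", [{"id": "g1", "group": "c1"}, {"id": "g2", "group": "c1"}]),
]
kb = integrate(sources)
```

Because both hypothetical resources are translated into the same ontology vocabulary before merging, the duplicated statements about g1 collapse into one copy, which is the kind of redundancy removal named in advantage (2).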
ACKNOWLEDGEMENTS
Jose Antonio Miñarro is supported by the Fundacion Seneca. Marisa Madrid is
supported by an EMBO Long-Term Fellowship. Thanks to the Spanish Ministry
for Science and Education through the project TSI2007-66575-C02-02, and to
Valerie Wood and Jurg Bähler for their support and their valuable comments.
References
1. Penkett, C., Morris, J., Wood, V., Bähler, J.: YOGY: a web-based, integrated database
to retrieve protein orthologs and associated gene ontology terms. Nucleic Acids Res