• Aucun résultat trouvé

Approaching OBDA Evolution through Mapping Repair

N/A
N/A
Protected

Academic year: 2022

Partager "Approaching OBDA Evolution through Mapping Repair"

Copied!
4
0
0

Texte intégral

(1)

Approaching OBDA Evolution through Mapping Repairs

?

Domenico Lembo1, Riccardo Rosati1, Valerio Santarelli1, Domenico Fabio Savo1, Evgenij Thorstensen2

1Sapienza Universit`a di Roma [email protected]

2University of Oslo [email protected]

Ontology-based data access (OBDA) is a new paradigm for accessing source databases through mediation of a conceptual domain view, given in terms of an ontol- ogy [9]. A major issue in OBDA is the design of an OBDA specification and the man- agement of its evolution. AnOBDA specificationis constituted by an ontology, usually a Description Logic (DL) TBox, a schema of the source databases, and a declarative mapping specifying the semantic relationship between the data at the sources and the elements of the ontology. In the following we denote it byJ =hT,S,Mi, whereT is the TBox (a set of DL axioms),Sthe source schema, i.e., a relational signature possibly with integrity constraints (ICs), and the mappingMis a set of mapping assertions of the formφ(x);ψ(x), whereφ(x)andψ(x)are queries overSandT, respectively, both with free variablesx(such an assertion is called a GLAV mapping assertion if both φ(x)andψ(x)are conjunctive queries (CQs), while it is a GAV mapping assertion if it is GLAV andψ(x)is an atom without occurrences of non-free variables).

The design of the specification is normally conducted in an iterative fashion, and changes are continuously implemented to its various components. Also, the entire spec- ification is often a lively artifact, continuously modified due to, e.g., changes in the re- quirements. Due to these characteristics, OBDA design and maintenance must be sup- ported by adequate tools and methodologies. The mapping is certainly the component of the specification which has received so far less attention, and thus consolidated tools supporting its design are currently not available. Mapping design is a time-consuming and complex operation, which typically (and especially in complex scenarios) has to be conducted manually [1]. Of course, modifying the mapping due to changes in the other components of the specification is tedious and time-consuming as well.

In this paper we study the evolution of OBDA specifications. We start our investi- gation by observing that many approaches exist for bothontology evolution [12] and database schema evolution[11]. However, to the best of our knowledge, no previous study has analyzed evolution in the presence of mappings connecting an ontology to a database schema. In this sense, a problem that is close to OBDA isontology matching and alignment, which is based on the use of a notion ofmappingto integrate different ontologies. Several works have studied the problem of repairing inconsistent mappings in this context (e.g., [3, 7, 8, 10]). However, the framework of ontology matching, and in particular the notion of mapping, is very different from OBDA.

We adopt amapping-centerednotion of OBDA evolution: given an OBDA specifi- cationJ =hT,S,Mi, we want to repair the mappingMgiven a modification of the

?The present paper is an extended abstract of [6].

(2)

TBoxT and/or of the source schemaS. We think that, at least for a first analysis of evolution in OBDA, this is a natural assumption: indeed, the mapping is an information that depends on both the TBox and the source schema, while the TBox and the schema are (at least in principle) semantically independent entities.

Following the classical approaches to belief revision, we look for a notion of repair of a mapping that is based on two general principles: (i) preservingconsistencyof the OBDA specification; (ii) expressingminimal changewith respect to the initial OBDA specification. With respect to consistency preservation, we adopt a non-classical notion of inconsistency for an OBDA specification, calledglobal mapping inconsistency, re- cently introduced in [4, 5]. According to this notion, a mappingMis inconsistent with respect to a TBoxT and a source schemaSif there exists no instanceDforSsuch that Dis consistent withJ andMis activeonD, i.e., every query over the source schema appearing inMhas a non-empty answer inD.

Definition 1 (Global mapping inconsistency [5]).LetJ =hT,S,Mibe an OBDA specification. We say thatMis globally inconsistent forhT,Siif there does not exist a source instanceDsatisfying the ICs inSsuch that (i)Mis active onD; and (ii) there exists an interpretation that agrees withDon the predicates ofSand satisfiesJ.

Global mapping inconsistency provides a more meaningful notion of inconsistency than the classical one in the context of OBDA: for instance, in all the cases when the source schema is a relational database schema with standard integrity constraints, the OBDA specification is inconsistent according to the classical semantics if and only if its TBox is inconsistent (which in turn implies that this notion is trivial for many DLs).

With respect to minimal change, we propose two different notions of repair. The first one is calleddeletion-based mapping repairand reflects the simple idea of repairing a mapping through a (subset-)minimal deletion of assertions from the initial mapping.

Definition 2 (Deletion-based mapping repair).Let J = hT,S,Mibe an OBDA specification such thatMis globally consistent forhT,Si,T0 a consistent TBox,S0 a consistent source schema, andM0a mapping such thatM0 ⊆ M. We say thatM0 is a deletion-based mapping repair (DMR) forJ under updatehT0,S0iif: (i) M0 is globally consistent forhT0,S0i, and (ii) there exists no mappingM00such thatM00is globally consistent forhT0,S0i, andM0⊂ M00⊆ M.

The second notion of repair, calledentailment-based mapping repair, relies on the notion ofmapping entailment set (MES): the mapping entailment set of an OBDA spec- ificationJ for a mapping languageL is the set of mapping assertions inL that are logical consequences ofJ. Then, the repairs are the globally consistent subsets of the MES that are selected according to a preference criterion (the notion offewer changes [2]) that formalizes the intuitive principle of preferring insertions over deletions. In practice, a mappingM1has fewer deletions (resp., fewer insertions) than a mapping M2with respect to a mappingMifM\M1⊂ M\M2(resp.,M1\M ⊂ M2\M), andM1has fewer changes thanM2with respect toMif eitherM1 has fewer dele- tions thanM2with respect toM, orM1andM2have the same deletions with respect toM, andM1has fewer insertions thanM2with respect toM.

Definition 3 (Entailment-based mapping repair).LetJ =hT,S,Mibe an OBDA specification such thatMis globally consistent forhT,Si,T0a consistent TBox,S0a

(3)

consistent source schema,La mapping language, andM0anL-mapping. We say that M0 is an entailment-basedL-mapping repair (L-EMR) forJ under updatehT0,S0i if: (i)M0 is globally consistent forhT0,S0i; and (ii) there exists noL-mappingM00 such thatM00is globally consistent forhT0,S0iandMESL(hT0,S0,M00i)has fewer changes thanMESL(hT0,S0,M0i)with respect toMESL(J).

Example 1. Consider the OBDA specificationJ = hT,S,Miwhere:T = {C v F, C v A, ∃R v B, A v B}, S = {T1/2}, M = {T1(x, y) ; R(x, y), T1(x, y) ; C(x)}, and consider the TBoxT0 = T ∪ {A v ¬∃R}. It is easy to see thatMis not globally consistent forhT0,Si, since for every database that activatesMthe mapping produces two facts of the formR(x, y)andC(x), which vio- late the TBox assertionCv ¬∃Rinferred byT0. Then, the DMRs ofJ under update hT0,SiareM01 = {T1(x, y) ; R(x, y)}andM02 = {T1(x, y) ; C(x)}. On the other hand, one can easily verify that the following are GAV-EMRs ofJ under update hT0,Si:M001 ={T1(x, y) ;R(x, y), T1(x, y) ; F(x)} andM002 = {T1(x, y) ; C(x)}. Such repairs are contained in all GAV-EMRs ofJ under updatehT0,Si. ut Then, we define query entailment under mapping repairs, which corresponds to a form of skeptical reasoning over all the mapping repairs.

Definition 4 (Query Entailment).LetJ =hT,S,Mibe an OBDA specification such thatMis globally consistent forhT,Si,T0 a consistent TBox,S0a consistent source schema,D a source instance satisfying the ICs inS0, andqa Boolean CQ over over the signature ofT0. We say that qis entailed under DMR (resp.L-EMR where Lis a mapping language) byJ,T0,S0, andD if(hT0,S0,M0i, D) |= qfor every DMR (respectively, for everyL-EMR) forJ under updatehT0,S0i.

LetJ = hT,S,MiandT0 be as in Example 1, and letD ={T1(a, b)}. Given the DMRsM01,M02and the GAV-EMRsM001,M002above described, it follows that the queryB(a)is entailed under DMR byJ,T0,S0, andD; moreover, the query∃x.F(x) is not entailed under DMR byJ,T0,S0, andD, while it is entailed under GAV-EMR.

We finally provide initial results on the complexity of query entailment, focusing on DL-LiteR TBoxes,simplesource schemas (i.e., without ICs), and on GAV and GLAV mappings. We first establish an exact bound for query entailment under deletion-based mapping repairs, which holds for both GAV and GLAV mappings.

Theorem 1. LetJ =hT,S,Mibe an OBDA specification, withT a DL-LiteRTBox, S a simple source schema, andMa GLAV mapping. LetT0 be DL-LiteRTBox,S0a simple source schema,Dan instance forS, andqa Boolean CQ over the signature of T0. Deciding whetherqis entailed under DMR byJ,T0,S0, andDisΠ2p-complete.

Then, we study the same problem under entailment-based mapping repairs and GAV mappings, and prove that theΠ2pexact bound holds also in this case.

Theorem 2. LetJ = hT,S,Mibe an OBDA specification, whereT is a DL-LiteR

TBox, S is a simple source schema, andMis a GAV mapping. LetT0 be DL-LiteR

TBox,S0a simple source schema,Dan instance forS0, andqa Boolean CQ over the signature ofT0. Deciding whetherqis entailed under GAV-EMR byJ,T0,S0, andD isΠ2p-complete.

(4)

References

1. N. Antonioli, F. Castan`o, S. Coletta, S. Grossi, D. Lembo, M. Lenzerini, A. Poggi, E. Virardi, and P. Castracane. Ontology-based data management for the italian public debt. InProc. of the 8th Int. Conf. on Formal Ontology in Information Systems (FOIS), pages 372–385, 2014.

2. R. Fagin, G. M. Kuper, J. D. Ullman, and M. Y. Vardi. Updating logical databases.Advances in Computing Research, 3:1–18, 1986.

3. E. Jim´enez-Ruiz, C. Meilicke, B. C. Grau, and I. Horrocks. Evaluating mapping repair systems with large biomedical ontologies. InProc. of the 26th Int. Workshop on Description Logic (DL), CEUR Electronic Workshop Proceedings,http://ceur-ws.org/, pages 246–257, 2013.

4. D. Lembo, J. Mora, R. Rosati, D. F. Savo, and E. Thorstensen. Towards mapping analysis in ontology-based data access. InProc. of the 8th Int. Conf. on Web Reasoning and Rule Systems (RR), volume 8741 ofLecture Notes in Computer Science, pages 108–123. Springer, 2014.

5. D. Lembo, J. Mora, R. Rosati, D. F. Savo, and E. Thorstensen. Mapping analysis in ontology- based data access: Algorithms and complexity. InProc. of the 14th Int. Semantic Web Conf.

(ISWC), pages 217–234, 2015.

6. D. Lembo, R. Rosati, V. Santarelli, D. F. Savo, and E. Thorstensen. Approaching OBDA evolution through mapping repairs. Manuscript, 2016.

7. C. Meilicke, H. Stuckenschmidt, and A. Tamilin. Reasoning support for mapping revision.

J. Log. Comput., 19(5):807–829, 2009.

8. ¨O. L. ¨Ozc¸ep. Towards principles for ontology integration. InProc. of the 5th Int. Conf. on Formal Ontology in Information Systems (FOIS), pages 137–150, 2008.

9. A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati. Linking data to ontologies.J. on Data Semantics, X:133–173, 2008.

10. G. Qi, Q. Ji, and P. Haase. A conflict-based operator for mapping revision. InProc. of the 22nd Int. Workshop on Description Logic (DL), CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, 2009.

11. E. Rahm and P. A. Bernstein. An online bibliography on schema evolution.SIGMOD Record, 35(4):30–31, 2006.

12. F. Zablith, G. Antoniou, M. d’Aquin, G. Flouris, H. Kondylakis, E. Motta, D. Plexousakis, and M. Sabou. Ontology evolution: a process-centric survey. Knowledge Eng. Review, 30(1):45–75, 2015.

Références

Documents relatifs

SPARQL. To properly evaluate the performance of OBDA systems, benchmarks tailored towards the requirements in this setting are needed. OWL benchmarks, which have been developed to

Consider the two data sources in Figure 3 (left): Source 1 (or S1 for short) contains a unary table about production wells with the schema PWell and says that the well ‘w123’ is

Finally, we propose a notion of incoherency-tolerant semantics for query answering in Datalog ± , and present a particu- lar one based on the transformation of classic Datalog

The database schema of our Phenomis database considered for mapping contains plant phenotyping information and environmental information.. The ontologies considered for

So the functions (p above represent stochastic versions of the harmonic morphisms, and we find it natural to call them stochastic harmonic morphisms. In Corollary 3 we prove that

Thus, a configuration of a OBDA system, named iOBDA, aiming at providing access to the data layer, through the orchestration of all of the tools involved in the OBDA process

In this work, we present the Mastro plug-in for the standard ontology editor Prot´eg´e, which allows users to create a full OBDA specification and pose SPARQL queries over the

The server-side modules consist of Data Source and DBMS Manager which manages the user defined data sources and provides JDBC-based access, Ontol- ogy Handler for OWL