• Aucun résultat trouvé

Source Information Disclosure in Ontology-based Data Integration (Extended Abstract)

N/A
N/A
Protected

Academic year: 2022

Partager "Source Information Disclosure in Ontology-based Data Integration (Extended Abstract)"

Copied!
1
0
0

Texte intégral

(1)

Source Information Disclosure in Ontology-based Data Integration (Extended Abstract)

?

Michael Benedikt, Bernardo Cuenca Grau, and Egor V. Kostylev Department of Computer Science, University of Oxford, UK

Ontology-based data integration systems allow users to access data sitting in multi- ple sources by means of queries over a global schema described by an ontology. User queries are formulated against the vocabulary of the ontology and the relationships be- tween the datasources and the ontology terms are specified declaratively by mappings.

In practice, datasources often contain sensitive information that data owners want to keep inaccessible to users. In the setting of ontology-based data integration, the risks of unauthorized information disclosure quickly become apparent since the information ex- posed to users depends on a complex combination of schema reconciliation, reasoning over the ontology, and access to data in the sources via the mappings.

In this paper, we formalize and study the privacy requirements that a data integration system should satisfy before it is made available to users for querying, as well as on the computational complexity of checking whether such requirements are fulfilled.

Our logical framework for information disclosure builds on work in the database community. In line of existing approaches, we assume that sensitive information is rep- resented by a query (thepolicy) over the source schema, and also that schema-level information (ontology, mappings, source schemas, and policy specification) is publicly available. In contrast, the actual data is only made available as a result of user queries over the ontology. Disclosure in our framework occurs when users are able to uncover an answer to the policy query by querying the system and exploiting the availability of schema-level information. We consider disclosure for a particular dataset, and also whether a schema admits a dataset on which disclosure occurs.

We provide lower and upper bounds on disclosure analysis, in the process intro- ducing a number of techniques for analyzing logical privacy issues in ontology-based data integration. In our analysis, we consider different ontology, mapping, and policy languages. In all cases, we put special emphasis on the results most relevant to standard OBDA, where the ontology is expressed in DL-LiteRand the mappings are GAV.

Our results have implications on related work. In particular, they imply new lower bounds for theinstance-based determinacyproblem in databases, which is at the core ofdata pricing—the problem of automatically assigning a fair price to a chunk of data given the price of a given set of views.

?This extended abstract is based on the the paper “Source Information Disclosure in Ontology- based Data Integration”, published in the proceedings of AAAI-2017.

Références

Documents relatifs

As we noted above, the reasoning rules to support ontology for the data in- tegration concept are based on the OPENMath formalism and the algebra of integrable data.. Let the

Ontology-based data access and integration (OBDA/I) are by now well-established paradigms in which users are provided access to one or more data sources through a conceptual view

In the context of OBDA, the conceptual schema provides end-users with a vocabulary they are familiar with, at the same time masking how data are concretely stored, and enriching

Users access the data by means of queries formulated using the vocabulary of the ontology; query answering amounts to computing the certain answers of the query over the union

In these runs, topic field is composed by modified title like in ADJUSTEN and body field is composed by the original query text. Our system is not dependent of the query

Linked Data Approach We propose to represent enterprise taxonomies in RDF em- ploying the standardized and widely used SKOS [12] vocabulary as well as publishing term definitions

(1) The configuration of source databases was completely independent, except for the Federation Ontology, which lists the URL of each endpoint and the concepts each imple- ments;

 We present the architecture of a data integration system, named Evolving Data Integration (EDI) system, that allows the evolution of the ontology used as global