• Aucun résultat trouvé

Ontology-Mediated Queries for Probabilistic Databases (Extended Abstract)

N/A
N/A
Protected

Academic year: 2022

Partager "Ontology-Mediated Queries for Probabilistic Databases (Extended Abstract)"

Copied!
1
0
0

Texte intégral

(1)

Ontology-Mediated Queries for Probabilistic Databases (Extended Abstract)

?

Stefan Borgwardt1, İsmail İlkan Ceylan1, and Thomas Lukasiewicz2

1 Faculty of Computer Science, Technische Universität Dresden, Germany firstname.lastname@tu-dresden.de

2 Department of Computer Science, University of Oxford, UK thomas.lukasiewicz@cs.ox.ac.uk

The semantics of large-scale knowledge bases like NELL and Google’s Knowledge Vault is founded on (tuple-independent) probabilistic databases (PDBs) [3]. As for ordinary databases, they employ the closed-world assumption, i.e., missing facts are treated as being false (having the probability0), which leads to unintu- itive results when querying PDBs. Recently,open-world probabilistic databases (OpenPDBs) were proposed to address this issue by allowing probabilities of facts that are not in the database, calledopen tuples, to take any value from a fixed probability interval [1]. In the resulting framework of OpenPDBs, query probabilities are given in terms of upper and lower probability values, which is more in line with an incomplete view of the world. However, OpenPDBs are still limited in the sense that they do not allow to distinguish queries that are intuitively different with respect to our commonsense knowledge. In order to model such knowledge, we extend OpenPDBs by ontology-mediated queries using Datalog± ontologies (which cover some Horn Description Logics). We show that the data complexity dichotomy betweenP andPPin (Open)PDBs [1, 2]

can be lifted to the case of first-order rewritable positive ontologies (without negative constraints); and that the problem can becomeNPPP-complete once negative constraints are allowed. This result demonstrates the difference between OpenPDBs and PDBs, as in the latter reasoning with ontologies remains inPP. We also propose an approximating semantics that circumvents the increase in complexity caused by negative constraints. Finally, we provide complexity results beyond the data complexity for ontology-mediated query evaluation relative to (tuple-independent) PDBs and OpenPDBs.

References

1. Ceylan, İ.İ., Darwiche, A., Van den Broeck, G.: Open-world probabilistic databases.

In: Proc. of KR. pp. 339–348 (2016)

2. Dalvi, N., Suciu, D.: The dichotomy of probabilistic inference for unions of conjunctive queries. J. ACM 59(6), 1–87 (2012)

3. Suciu, D., Olteanu, D., Ré, C., Koch, C.: Probabilistic Databases. Morgan & Claypool (2011)

?Full paper accepted at AAAI’17. This work was supported by DFG in the CRC 912 (HAEC) and the GRK 1907 (RoSI), and by the EPSRC grants EP/J008346/1, EP/L012138/1, EP/M025268/1, and EP/N510129/1.

Références

Documents relatifs

They showed that for queries under bag semantics and for minimal queries under set semantics, weakest precon- ditions for query completeness can be expressed in terms of

incomplete information, numerical data, missing data, query an- swering, measure of certainty, first-order queries, approximate query answers, asymptotic behavior.. ACM

More restricted instances: Words, trees and bounded treewidth (1 slide) More restricted instances: Unweighted instances (1 slide)... Uncertain

For the purposes of answering standard CQs without negation, the minimal canonical model behaves as a usual canonical model, which means that our semantics generalizes the

Inspired by the maximal posterior probability computations in Probabilistic Graphical Models (PGMs) [3], we investigate the problem of finding most probable explanations for

3 To the best of our knowledge, and to our surprise, the fact that safe PDB queries have linear-time data complexity, and that the dichotomy of [5] is between linear time (not

In AI and machine learning, however, one typically learns highly symmetric distributions, where large numbers of symmetric databases get assigned identical probability. This

O log times, which in many cases may be undesirable.. To remedy these problems we resort to a more suitable data structure, the interval tree [5,6,7]. The interval tree uses