A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs
Antoine Amarilli1and İsmail İlkan Ceylan2 September 16, 2020
1Télécom Paris
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data
2/7
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data
2/7
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data
2/7
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data
Uncertain data management
In this talk, we managedatarepresented as alabeled graph
WorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
Antoine
Benny
İsmail
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we are notcertainabout the true state of the data 2/7
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
80% 10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW: Pr(W) = YF∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW: Pr(W) = YF∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
3/7
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW: Pr(W) = YF∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW: Pr(W) = YF∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
3/7
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
80% 10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW: Pr(W) = YF∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW:Pr(W) = Y
F∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
3/7
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Each fact (edge) carries aprobability•
Each fact exists with its givenprobability•
All facts areindependent•
Possible worldW:subset of facts•
ProbabilityofW: Pr(W) = YF∈W
Pr(F)
!
× Y
F∈W/
1−Pr(F)
!
Homomorphism-closed queries
•
Query:maps a graph (without probabilities) to YES/NO•
Conjunctive query(CQ): can I find a match of apattern? e.g., x y z•
Union of conjunctive queries(UCQ): doesone of the CQsmatch?→ Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQ
They generalizeCQsandUCQs, but alsoregular path queries(RPQs),Datalog, etc.
4/7
Homomorphism-closed queries
•
Query:maps a graph (without probabilities) to YES/NO•
Conjunctive query(CQ): can I find a match of apattern? e.g., x y z•
Union of conjunctive queries(UCQ): doesone of the CQsmatch?→ Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQ
They generalizeCQsandUCQs, but alsoregular path queries(RPQs),Datalog, etc.
Homomorphism-closed queries
•
Query:maps a graph (without probabilities) to YES/NO•
Conjunctive query(CQ): can I find a match of apattern? e.g., x y z•
Union of conjunctive queries(UCQ): doesone of the CQsmatch?→ Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQ
They generalizeCQsandUCQs, but alsoregular path queries(RPQs),Datalog, etc.
4/7
Homomorphism-closed queries
•
Query:maps a graph (without probabilities) to YES/NO•
Conjunctive query(CQ): can I find a match of apattern? e.g., x y z•
Union of conjunctive queries(UCQ): doesone of the CQsmatch?→ Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQ
They generalizeCQsandUCQs, but alsoregular path queries(RPQs),Datalog, etc.
Homomorphism-closed queries
•
Query:maps a graph (without probabilities) to YES/NO•
Conjunctive query(CQ): can I find a match of apattern? e.g., x y z•
Union of conjunctive queries(UCQ): doesone of the CQsmatch?→ Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQ
They generalizeCQsandUCQs, but alsoregular path queries(RPQs),Datalog, etc.
4/7
Problem statement: Probabilistic query evaluation (PQE) Here is the problemPQE(Q):
•
Wefixa queryQ: x y z•
Theinputis a TIDD: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Theoutputis theprobabilitythat the query is true→ Question: What is the complexity ofPQE(Q)depending on the queryQ?
Problem statement: Probabilistic query evaluation (PQE) Here is the problemPQE(Q):
•
Wefixa queryQ: x y z•
Theinputis a TIDD: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis theprobabilitythat the query is true→ Question: What is the complexity ofPQE(Q)depending on the queryQ?
5/7
Problem statement: Probabilistic query evaluation (PQE) Here is the problemPQE(Q):
•
Wefixa queryQ: x y z•
Theinputis a TIDD: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis theprobabilitythat the query is true→ Question: What is the complexity ofPQE(Q)depending on the queryQ?
Problem statement: Probabilistic query evaluation (PQE) Here is the problemPQE(Q):
•
Wefixa queryQ: x y z•
Theinputis a TIDD: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis theprobabilitythat the query is true→ Question: What is the complexity ofPQE(Q)depending on the queryQ?
5/7
Results on PQE
Existing dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hardWe study PQE forhomomorphism-closed queriesand show: Theorem [Amarilli and Ceylan, 2020]
For anyquery Q closed under homomorphisms:
•
EitherQis equivalent to asafe UCQandPQE(Q)is inPTIME•
In all other cases,PQE(Q)is#P-hardSobad news: all homomorphism-closed queries arehardexcept safe UCQs
Results on PQE
Existing dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hardWe study PQE forhomomorphism-closed queriesand show:
Theorem [Amarilli and Ceylan, 2020]
For anyquery Q closed under homomorphisms:
•
EitherQis equivalent to asafe UCQandPQE(Q)is inPTIME•
In all other cases,PQE(Q)is#P-hardSobad news: all homomorphism-closed queries arehardexcept safe UCQs
6/7
Results on PQE
Existing dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hardWe study PQE forhomomorphism-closed queriesand show:
Theorem [Amarilli and Ceylan, 2020]
For anyquery Q closed under homomorphisms:
•
EitherQis equivalent to asafe UCQandPQE(Q)is inPTIME•
In all other cases,PQE(Q)is#P-hardSobad news: all homomorphism-closed queries arehardexcept safe UCQs
Results on PQE
Existing dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hardWe study PQE forhomomorphism-closed queriesand show:
Theorem [Amarilli and Ceylan, 2020]
For anyquery Q closed under homomorphisms:
•
EitherQis equivalent to asafe UCQandPQE(Q)is inPTIME•
In all other cases,PQE(Q)is#P-hardSobad news: all homomorphism-closed queries arehardexcept safe UCQs
6/7
Results on PQE
Existing dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hardWe study PQE forhomomorphism-closed queriesand show:
Theorem [Amarilli and Ceylan, 2020]
For anyquery Q closed under homomorphisms:
•
EitherQis equivalent to asafe UCQandPQE(Q)is inPTIME•
In all other cases,PQE(Q)is#P-hardWhat’s next?
•
The result only applies tographs, not higher-arity databases• Weconjecturethat it holds for arbitrary arity
•
Adapting tounweightedPQE, where all probabilities are1/2?• We have a recent result onnon-hierarchical self-join-free CQs [Amarilli and Kimelfeld, 2020]
• Recent paper by Suciu and Kenig [Kenig and Suciu, 2020]
Thanks for your attention!
7/7
What’s next?
•
The result only applies tographs, not higher-arity databases• Weconjecturethat it holds for arbitrary arity
•
Adapting tounweightedPQE, where all probabilities are1/2?• We have a recent result onnon-hierarchical self-join-free CQs [Amarilli and Kimelfeld, 2020]
• Recent paper by Suciu and Kenig [Kenig and Suciu, 2020]
Thanks for your attention!
What’s next?
•
The result only applies tographs, not higher-arity databases• Weconjecturethat it holds for arbitrary arity
•
Adapting tounweightedPQE, where all probabilities are1/2?• We have a recent result onnon-hierarchical self-join-free CQs [Amarilli and Kimelfeld, 2020]
• Recent paper by Suciu and Kenig [Kenig and Suciu, 2020]
Thanks for your attention!
7/7
References i
Amarilli, A. and Ceylan, I. I. (2020).
A dichotomy for homomorphism-closed queries on probabilistic graphs.
InICDT.
Amarilli, A. and Kimelfeld, B. (2020).
Uniform reliability of self-join-free conjunctive queries.
arXiv preprint arXiv:1908.07093.
Dalvi, N. and Suciu, D. (2012).
The dichotomy of probabilistic inference for unions of conjunctive queries.
J. ACM, 59(6).
References ii
Kenig, B. and Suciu, D. (2020).
A dichotomy for the generalized model counting problem for unions of conjunctive queries.
arXiv preprint arXiv:2008.00896.