Query Evaluation on Probabilistic Data:
New Hard Cases
Antoine Amarilli1, joint work with Benny Kimelfeld2, İsmail İlkan Ceylan3 October 10, 2019
1Télécom Paris
2Technion
3University of Oxford
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graph WorksAtAntoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graphWorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graphWorksAt
Antoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graph WorksAtAntoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graph WorksAtAntoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graph WorksAtAntoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graph WorksAtAntoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay
A.
B.
Télécom Paris
Paris Sud
Technion
ParisTech
IP Paris
Paris-Saclay
→ Problem:we may beuncertainabout the data
Uncertain data management
•
Databases:managedataand answerqueriesover it•
In this talk,datais simply a labeled graph WorksAtAntoine Télécom Paris Antoine Paris Sud
Benny Paris Sud Benny Technion İsmail U. Oxford
MemberOf
Télécom Paris ParisTech Télécom Paris IP Paris
Paris Sud IP Paris Paris Sud Paris-Saclay Technion CESAER
A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
→ Problem:we may beuncertainabout the data 2/17
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
80% 10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
80% 10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER
80% 10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%
→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%
→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%→ This model issimplistic, but already challenging to understand
Uncertain data model A.
B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Uncertain data model:TID, for tuple-independent database•
Every fact carries aprobability•
Every fact exists with the indicatedprobability•
All facts areindependent•
Possible world:subset of facts•
What isprobabilityof this possible world?0.03%Queries
•
Query:maps anon-probabilisticgraph to YES/NO•
Conjunctive query:can I find an occurrence of apattern?x y z
• We want ahomomorphismfrom the pattern to the graph
• Not necessarilyinjective!
•
Union of conjunctive queries: does one of the patterns match?•
Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQQueries
•
Query:maps anon-probabilisticgraph to YES/NO•
Conjunctive query:can I find an occurrence of apattern?x y z
• We want ahomomorphismfrom the pattern to the graph
• Not necessarilyinjective!
•
Union of conjunctive queries: does one of the patterns match?•
Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQQueries
•
Query:maps anon-probabilisticgraph to YES/NO•
Conjunctive query:can I find an occurrence of apattern?x y z
• We want ahomomorphismfrom the pattern to the graph
• Not necessarilyinjective!
•
Union of conjunctive queries: does one of the patterns match?•
Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQQueries
•
Query:maps anon-probabilisticgraph to YES/NO•
Conjunctive query:can I find an occurrence of apattern?x y z
• We want ahomomorphismfrom the pattern to the graph
• Not necessarilyinjective!
•
Union of conjunctive queries: does one of the patterns match?•
Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQQueries
•
Query:maps anon-probabilisticgraph to YES/NO•
Conjunctive query:can I find an occurrence of apattern?x y z
• We want ahomomorphismfrom the pattern to the graph
• Not necessarilyinjective!
•
Union of conjunctive queries: does one of the patterns match?•
Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQQueries
•
Query:maps anon-probabilisticgraph to YES/NO•
Conjunctive query:can I find an occurrence of apattern?x y z
• We want ahomomorphismfrom the pattern to the graph
• Not necessarilyinjective!
•
Union of conjunctive queries: does one of the patterns match?•
Homomorphism-closed queryQ: ifGsatisfiesQandGhas a homomorphism toG0 thenG0 also satisfiesQProblem statement: Probabilistic query evaluation (PQE)
•
Wefixa queryQ, for instance: x y z•
Theinputis a TID: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40% 80%
100%
90% 90% 50%
90%
100%
•
Theoutputis thetotal probabilityof the worlds which satisfy the query→ Intuition:theprobabilitythat the query is true
→ What is thecomplexityof the problemPQE(Q), depending on the queryQ?
Problem statement: Probabilistic query evaluation (PQE)
•
Wefixa queryQ, for instance: x y z•
Theinputis a TID: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis thetotal probabilityof the worlds which satisfy the query→ Intuition:theprobabilitythat the query is true
→ What is thecomplexityof the problemPQE(Q), depending on the queryQ?
Problem statement: Probabilistic query evaluation (PQE)
•
Wefixa queryQ, for instance: x y z•
Theinputis a TID: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis thetotal probabilityof the worlds which satisfy the query→ Intuition:theprobabilitythat the query is true
→ What is thecomplexityof the problemPQE(Q), depending on the queryQ?
Problem statement: Probabilistic query evaluation (PQE)
•
Wefixa queryQ, for instance: x y z•
Theinputis a TID: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis thetotal probabilityof the worlds which satisfy the query→ Intuition:theprobabilitythat the query is true
→ What is thecomplexityof the problemPQE(Q), depending on the queryQ?
Problem statement: Probabilistic query evaluation (PQE)
•
Wefixa queryQ, for instance: x y z•
Theinputis a TID: A.B.
İ.
Télécom Paris
Paris Sud
Technion
U. Oxford
ParisTech
IP Paris
Paris-Saclay
CESAER 80%
10%
40%
80%
100%
90%
90%
50%
90%
100%
•
Theoutputis thetotal probabilityof the worlds which satisfy the query→ Intuition:theprobabilitythat the query is true
→ What is thecomplexityof the problemPQE(Q), depending on the queryQ?
Existing results
Dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hard•
Our example query x y z is...safe Also: dichotomy on theinstance families:Theorem [Amarilli et al., 2015, Amarilli et al., 2016]
•
For any queryQinmonadic second-order logic,PQE(Q)is in PTIMEif the input TIDs havebounded treewidth•
There is a queryQsuch thatPQE(Q)is#P-hardon any TID family ofunbounded treewidth(with several technical assumptions)Existing results
Dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hard•
Our example query x y z is...safe
Also: dichotomy on theinstance families:
Theorem [Amarilli et al., 2015, Amarilli et al., 2016]
•
For any queryQinmonadic second-order logic,PQE(Q)is in PTIMEif the input TIDs havebounded treewidth•
There is a queryQsuch thatPQE(Q)is#P-hardon any TID family ofunbounded treewidth(with several technical assumptions)Existing results
Dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hard•
Our example query x y z is...safeAlso: dichotomy on theinstance families:
Theorem [Amarilli et al., 2015, Amarilli et al., 2016]
•
For any queryQinmonadic second-order logic,PQE(Q)is in PTIMEif the input TIDs havebounded treewidth•
There is a queryQsuch thatPQE(Q)is#P-hardon any TID family ofunbounded treewidth(with several technical assumptions)Existing results
Dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hard•
Our example query x y z is...safe Also: dichotomy on theinstance families:Theorem [Amarilli et al., 2015, Amarilli et al., 2016]
•
For any queryQinmonadic second-order logic,PQE(Q)is in PTIMEif the input TIDs havebounded treewidth•
There is a queryQsuch thatPQE(Q)is#P-hardon any TID family ofunbounded treewidth(with several technical assumptions)Existing results
Dichotomy on theunions of conjunctive queries(UCQs):
Theorem [Dalvi and Suciu, 2012]
•
Some UCQsQaresafeandPQE(Q)is inPTIME•
All others areunsafeandPQE(Q)is#P-hard•
Our example query x y z is...safe Also: dichotomy on theinstance families:Theorem [Amarilli et al., 2015, Amarilli et al., 2016]
•
For any queryQinmonadic second-order logic,PQE(Q)is in PTIMEif the input TIDs havebounded treewidth•
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2 1
1 1
1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b02 1/2
1/2
1 1
1 1
1
Why are some queries unsafe?
This query isunsafe: x y z w
•
#SAT: counting satisfying valuations of a Boolean formula•
Specifically, reduce from#PP2DNF:• Partitionedvariables:X1, . . . ,XnandY1, . . . ,Ym
• Positive: no negation
• 2-DNF: disjunction of clauses likeXi∧Yj
→ #SATis already#P-hard
•
Example:(X1∧Y1)∨(X1∧Y2)∨(X2∧Y2)∨(X3∧Y1)∨(X3∧Y2)a01
a02
a03
a1
a2
a3 1/2 1/2 1/2
b1
b2
b01
b0 1/2
1/2 1
1 1
1
1 7/17
New results in this talk
We presentmore caseswhere PQE is#P-hard:
•
With İsmail İlkan Ceylan, forexpressive queries: Theorem [Amarilli and Ceylan, 2019]For anyquery Q closed under homomorphisms,PQE(Q)is#P-hard unlessQis equivalent to asafe UCQ
•
With Benny Kimelfeld, in theunweighted case: Theorem [Amarilli and Kimelfeld, 2019]For anyCQ Q without self-joins(every edge has a different color), if Qis unsafe thenPQE(Q)is#P-hardeven if all probabilities are1/2
New results in this talk
We presentmore caseswhere PQE is#P-hard:
•
With İsmail İlkan Ceylan, forexpressive queries: Theorem [Amarilli and Ceylan, 2019]For anyquery Q closed under homomorphisms,PQE(Q)is#P-hard unlessQis equivalent to asafe UCQ
•
With Benny Kimelfeld, in theunweighted case: Theorem [Amarilli and Kimelfeld, 2019]For anyCQ Q without self-joins(every edge has a different color), if Qis unsafe thenPQE(Q)is#P-hardeven if all probabilities are1/2
Hardness for queries closed under
homomorphisms
Result statement
We consider queriesclosed under homomorphisms:
•
GeneralizesCQsandUCQs,Datalog,Regular path queries...• Example:WorksAt/MemberOf+
•
Equivalent phrasing:infiniteunion of CQs• Example:WA/MO,WA/MO/MO,WA/MO/MO/MO, ... Theorem
•
Some queries closed under homomorphisms areUCQs: the previous dichotomy applies, PQE isPTIMEor#P-hard•
Forall other queries closed under homomorphisms, PQE is#P-hard•
The queryWA/MO+ isequivalenttoWA/MOwhich is asafe UCQ•
The queryWA/MO+/INisnot equivalent to a UCQ so PQE is#P-hardResult statement
We consider queriesclosed under homomorphisms:
•
GeneralizesCQsandUCQs,Datalog,Regular path queries...• Example:WorksAt/MemberOf+
•
Equivalent phrasing:infiniteunion of CQs• Example:WA/MO,WA/MO/MO,WA/MO/MO/MO, ... Theorem
•
Some queries closed under homomorphisms areUCQs: the previous dichotomy applies, PQE isPTIMEor#P-hard•
Forall other queries closed under homomorphisms, PQE is#P-hard•
The queryWA/MO+ isequivalenttoWA/MOwhich is asafe UCQ•
The queryWA/MO+/INisnot equivalent to a UCQ so PQE is#P-hardResult statement
We consider queriesclosed under homomorphisms:
•
GeneralizesCQsandUCQs,Datalog,Regular path queries...• Example:WorksAt/MemberOf+
•
Equivalent phrasing:infiniteunion of CQs• Example:WA/MO,WA/MO/MO,WA/MO/MO/MO, ...
Theorem
•
Some queries closed under homomorphisms areUCQs: the previous dichotomy applies, PQE isPTIMEor#P-hard•
Forall other queries closed under homomorphisms, PQE is#P-hard•
The queryWA/MO+ isequivalenttoWA/MOwhich is asafe UCQ•
The queryWA/MO+/INisnot equivalent to a UCQ so PQE is#P-hardResult statement
We consider queriesclosed under homomorphisms:
•
GeneralizesCQsandUCQs,Datalog,Regular path queries...• Example:WorksAt/MemberOf+
•
Equivalent phrasing:infiniteunion of CQs• Example:WA/MO,WA/MO/MO,WA/MO/MO/MO, ...
Theorem
•
Some queries closed under homomorphisms areUCQs:the previous dichotomy applies, PQE isPTIMEor#P-hard
•
Forall other queries closed under homomorphisms, PQE is#P-hard•
The queryWA/MO+ isequivalenttoWA/MOwhich is asafe UCQ•
The queryWA/MO+/INisnot equivalent to a UCQ so PQE is#P-hardResult statement
We consider queriesclosed under homomorphisms:
•
GeneralizesCQsandUCQs,Datalog,Regular path queries...• Example:WorksAt/MemberOf+
•
Equivalent phrasing:infiniteunion of CQs• Example:WA/MO,WA/MO/MO,WA/MO/MO/MO, ...
Theorem
•
Some queries closed under homomorphisms areUCQs:the previous dichotomy applies, PQE isPTIMEor#P-hard
•
Forall other queries closed under homomorphisms, PQE is#P-hard•
The queryWA/MO+isequivalenttoWA/MOwhich is asafe UCQ•
The queryWA/MO+/INisnot equivalent to a UCQ so PQE is#P-hardResult statement
We consider queriesclosed under homomorphisms:
•
GeneralizesCQsandUCQs,Datalog,Regular path queries...• Example:WorksAt/MemberOf+
•
Equivalent phrasing:infiniteunion of CQs• Example:WA/MO,WA/MO/MO,WA/MO/MO/MO, ...
Theorem
•
Some queries closed under homomorphisms areUCQs:the previous dichotomy applies, PQE isPTIMEor#P-hard
•
Forall other queries closed under homomorphisms, PQE is#P-hard•
The queryWA/MO+isequivalenttoWA/MOwhich is asafe UCQProof idea: finding hard patterns
•
Fix the queryQand find atight pattern, i.e,. a graph such that:•
• •
• satisfiesQ
but •
•
•
•
•
• violatesQ
•
If the query isunbounded, we can find atight pattern• Unbounded queries havearbitrarily largeminimal models
• Take one such model anddisconnect edgesas much as possible
• Unbounded queries havearbitrarily largeminimal models
•
•
• to
•
•
•
•
• to
•
•
•
•
•
•
• to •
•
•
•
•
•
•
•
•
• IfQis still true then the model is “explained” by aunion of stars
Proof idea: finding hard patterns
•
Fix the queryQand find atight pattern, i.e,. a graph such that:•
• •
• satisfiesQ
but •
•
•
•
•
• violatesQ
•
If the query isunbounded, we can find atight pattern• Unbounded queries havearbitrarily largeminimal models
• Take one such model anddisconnect edgesas much as possible
• Unbounded queries havearbitrarily largeminimal models
•
•
• to
•
•
•
•
• to
•
•
•
•
•
•
• to •
•
•
•
•
•
•
•
•
• IfQis still true then the model is “explained” by aunion of stars
Proof idea: finding hard patterns
•
Fix the queryQand find atight pattern, i.e,. a graph such that:•
• •
• satisfiesQ
but •
•
•
•
•
• violatesQ
•
If the query isunbounded, we can find atight pattern• Unbounded queries havearbitrarily largeminimal models
• Take one such model anddisconnect edgesas much as possible
• Unbounded queries havearbitrarily largeminimal models
•
•
• to
•
•
•
•
• to
•
•
•
•
•
•
• to •
•
•
•
•
•
•
•
•
• IfQis still true then the model is “explained” by aunion of stars
Proof idea: finding hard patterns
•
Fix the queryQand find atight pattern, i.e,. a graph such that:•
• •
• satisfiesQ
but •
•
•
•
•
• violatesQ
•
If the query isunbounded, we can find atight pattern• Unbounded queries havearbitrarily largeminimal models
• Take one such model anddisconnect edgesas much as possible
• Unbounded queries havearbitrarily largeminimal models
•
•
• to
•
•
•
•
• to
•
•
•
•
•
•
• to •
•
•
•
•
•
•
•
•
• IfQis still true then the model is “explained” by aunion of stars
Proof idea: finding hard patterns
•
Fix the queryQand find atight pattern, i.e,. a graph such that:•
• •
• satisfiesQ
but •
•
•
•
•
• violatesQ
•
If the query isunbounded, we can find atight pattern• Unbounded queries havearbitrarily largeminimal models
• Take one such model anddisconnect edgesas much as possible
• Unbounded queries havearbitrarily largeminimal models
•
•
•
to
•
•
•
•
• to
•
•
•
•
•
•
• to •
•
•
•
•
•
•
•
•
• IfQis still true then the model is “explained” by aunion of stars
Proof idea: finding hard patterns
•
Fix the queryQand find atight pattern, i.e,. a graph such that:•
• •
• satisfiesQ
but •
•
•
•
•
• violatesQ
•
If the query isunbounded, we can find atight pattern• Unbounded queries havearbitrarily largeminimal models
• Take one such model anddisconnect edgesas much as possible
• Unbounded queries havearbitrarily largeminimal models
•
to
•
to
•
•
•
•
•
•
• to •
•
•
•
•
•
•
•
•
• IfQis still true then the model is “explained” by aunion of stars