Probabilities and Provenance on Trees and Treelike Instances
Antoine Amarilli1, Pierre Bourhis2, Pierre Senellart1,3,4 September 7th, 2016
1Télécom ParisTech
2CNRS CRIStAL
3National University of Singapore
4École normale supérieure de Paris 1/7
How to travel to Highlights from Paris?
How to travel to Highlights from Paris?
2/7
How to travel to Highlights from Paris?
How to travel to Highlights from Paris?
2/7
How to travel to Highlights from Paris?
How to travel to Highlights from Paris?
(Metro|RER)*|(Bus|Tram)*
2/7
How to travel to Highlights from Paris?
How to travel to Highlights from Paris?
50%
(Metro|RER)*|(Bus|Tram)*
2/7
How to travel to Highlights from Paris?
50%
How to travel to Highlights from Paris?
50%
90%
(Metro|RER)*|(Bus|Tram)*
2/7
How to travel to Highlights from Paris?
50%
42%
37%
78% 90%
72%
How to travel to Highlights from Paris?
(Metro|RER)*|(Bus|Tram)*
50%
90%
42%
37%
90%
83%
78%
72%
What is the probability
that I can attend Highlights 2016?
2/7
Problem statement
Input:
? QueryQ (Metro|RER)*|(Bus|Tram)*
DatabaseDor graph
% Probabilitieson facts or edges 50%
Output: theprobabilitythat the query is true under the distribution (assuming independence of all probabilistic events)
Complexity: already#P-hardin the input database! (from #MONOTONE-SAT)
Problem statement
Input:
? QueryQ (Metro|RER)*|(Bus|Tram)*
DatabaseDor graph
% Probabilitieson facts or edges 50%
Output: theprobabilitythat the query is true under the distribution (assuming independence of all probabilistic events)
Complexity: already#P-hardin the input database! (from #MONOTONE-SAT)
3/7
Problem statement
Input:
? QueryQ (Metro|RER)*|(Bus|Tram)*
DatabaseDor graph
% Probabilitieson facts or edges 50%
Output: theprobabilitythat the query is true under the distribution (assuming independence of all probabilistic events)
Complexity: already#P-hardin the input database! (from #MONOTONE-SAT)
Problem statement
Input:
? QueryQ (Metro|RER)*|(Bus|Tram)*
DatabaseDor graph
% Probabilitieson facts or edges 50%
Output:theprobabilitythat the query is true under the distribution (assuming independence of all probabilistic events)
Complexity: already#P-hardin the input database! (from #MONOTONE-SAT)
3/7
Problem statement
Input:
? QueryQ (Metro|RER)*|(Bus|Tram)*
DatabaseDor graph
% Probabilitieson facts or edges 50%
Output:theprobabilitythat the query is true under the distribution (assuming independence of all probabilistic events)
Complexity: already#P-hardin the input database!
(from #MONOTONE-SAT)
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
Using treewidth to make the problem tractable
Treewidth by example:
• Treeshave treewidth1
• Cycleshave treewidth2
• k-cliquesand(k−1)-gridshave treewidthk−1
→ Treelike: thetreewidthis bounded by aconstant
4/7
Tractability on treelike instances
(RER|metro)*
|(bus|tram)*
MSO query Treelike data
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
Tractability on treelike instances
Tree automaton (RER|metro)*
|(bus|tram)*
MSO query Treelike data
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
5/7
Tractability on treelike instances
Tree automaton Tree encoding
(RER|metro)*
|(bus|tram)*
MSO query Treelike data
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
Tractability on treelike instances
Tree automaton Tree encoding
(RER|metro)*
|(bus|tram)*
MSO query
linear [Courcelle]
Treelike data
Query answer
TRUE
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
5/7
Tractability on treelike instances
Tree automaton Tree encoding
(RER|metro)*
|(bus|tram)*
MSO query Treelike data
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
Tractability on treelike instances
Tree automaton Tree encoding
(RER|metro)*
|(bus|tram)*
MSO query
Provenance circuit
∧
linear Treelike data
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
5/7
Tractability on treelike instances
Tree automaton Tree encoding
(RER|metro)*
|(bus|tram)*
MSO query
Provenance circuit
∧
linear Treelike data
Probability
95%
linear
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
Tractability on treelike instances
Tree automaton Tree encoding
(RER|metro)*
|(bus|tram)*
MSO query
Provenance circuit
∧
linear Treelike data
Probability
95%
linear
Theorem
For anyfixedBoolean MSO queryqandk∈N,
given a databaseDoftreewidth≤kwithindependent probabilities, we can compute inlinear timethe probability thatDsatisfiesq
5/7
Lower bound
What can we do for unbounded-treewidth instances?
... not much. Theorem
For any graph signatureσ, there is afirst-orderqueryqsuch that for any constructibleunbounded-treewidthclassI,
probability evaluation ofqonI is#P-hardunder RP reductions
3 1
2
4 3 4
2 1
maps vertices to vertices maps edges to vertex-disjoint paths
Proof idea:extract instances of a hard problem astopological minors using recentpolynomial bounds[Chekuri and Chuzhoy, 2014]
Lower bound
What can we do for unbounded-treewidth instances?... not much.
Theorem
For any graph signatureσ, there is afirst-orderqueryqsuch that for any constructibleunbounded-treewidthclassI,
probability evaluation ofqonI is#P-hardunder RP reductions
Proof idea:extract instances of a hard problem astopological minors using recentpolynomial bounds[Chekuri and Chuzhoy, 2014]
6/7
Lower bound
Theorem
For any graph signatureσ, there is afirst-orderqueryqsuch that for any constructibleunbounded-treewidthclassI,
probability evaluation ofqonI is#P-hardunder RP reductions
Proof idea:extract instances of a hard problem astopological minors using recentpolynomial bounds[Chekuri and Chuzhoy, 2014]
Lower bound
Theorem
For any graph signatureσ, there is afirst-orderqueryqsuch that for any constructibleunbounded-treewidthclassI,
probability evaluation ofqonI is#P-hardunder RP reductions
Proof idea:extract instances of a hard problem astopological minors using recentpolynomial bounds[Chekuri and Chuzhoy, 2014]
6/7
Lower bound
Theorem
For any graph signatureσ, there is afirst-orderqueryqsuch that for any constructibleunbounded-treewidthclassI,
probability evaluation ofqonI is#P-hardunder RP reductions
3 1
2
4 3 4
2 1
maps vertices to vertices maps edges to vertex-disjoint paths
Proof idea:extract instances of a hard problem astopological minors
Future and ongoing work
• Improving thelower bound:
• Fromgraphstoarbitrary aritydatabases
• FromFOdown tounions of conjunctive querieswith6=
• Complexity inqueryanddatabase— currentlyΩ
22. ..2
|Q|
× |D|
→ Which queries canefficientlybe compiled to automata?
• Non-Booleanqueries: efficientenumerationof query results?
• Other tasks: probabilisticconditioning
“Knowing that I’m here, what’s the probability that RER B is up?” Thanks for your attention!
7/7
Future and ongoing work
• Improving thelower bound:
• Fromgraphstoarbitrary aritydatabases
• FromFOdown tounions of conjunctive querieswith6=
• Complexity inqueryanddatabase— currentlyΩ
22. ..2
|Q|
× |D|
→ Which queries canefficientlybe compiled to automata?
• Non-Booleanqueries: efficientenumerationof query results?
• Other tasks: probabilisticconditioning
“Knowing that I’m here, what’s the probability that RER B is up?” Thanks for your attention!
Future and ongoing work
• Improving thelower bound:
• Fromgraphstoarbitrary aritydatabases
• FromFOdown tounions of conjunctive querieswith6=
• Complexity inqueryanddatabase— currentlyΩ
22. ..2
|Q|
× |D|
→ Which queries canefficientlybe compiled to automata?
• Non-Booleanqueries: efficientenumerationof query results?
• Other tasks: probabilisticconditioning
“Knowing that I’m here, what’s the probability that RER B is up?” Thanks for your attention!
7/7
Future and ongoing work
• Improving thelower bound:
• Fromgraphstoarbitrary aritydatabases
• FromFOdown tounions of conjunctive querieswith6=
• Complexity inqueryanddatabase— currentlyΩ
22. ..2
|Q|
× |D|
→ Which queries canefficientlybe compiled to automata?
• Non-Booleanqueries: efficientenumerationof query results?
• Other tasks: probabilisticconditioning
“Knowing that I’m here, what’s the probability that RER B is up?”
Thanks for your attention!
Future and ongoing work
• Improving thelower bound:
• Fromgraphstoarbitrary aritydatabases
• FromFOdown tounions of conjunctive querieswith6=
• Complexity inqueryanddatabase— currentlyΩ
22. ..2
|Q|
× |D|
→ Which queries canefficientlybe compiled to automata?
• Non-Booleanqueries: efficientenumerationof query results?
• Other tasks: probabilisticconditioning
“Knowing that I’m here, what’s the probability that RER B is up?”
Thanks for your attention!
7/7
References I
Chekuri, C. and Chuzhoy, J. (2014).
Polynomial bounds for the grid-minor theorem.
InSTOC.
Courcelle, B. (1990).
The monadic second-order logic of graphs. I. Recognizable sets of finite graphs.
Inf. Comput., 85(1).
Image credits
• Slide 2:
• https://commons.wikimedia.org/wiki/File:
Paris_Metro_map.svg(cropped), user Umx on Wikimedia Commons, public domain
• http://www.parisvoyage.com/images/cartoon18.jpg, ParisVoyage, fair use
• http://www.vianavigo.com/fileadmin/galerie/pdf/CGU_t_.pdf (cropped), RATP, fair use
• Slides 4 and 5: https://commons.wikimedia.org/wiki/File:
Carte_Transilien_RER_sch%C3%A9matique.svg(modified), user Benjamin Smith on Wikimedia Commons, license CC BY-SA 4.0 international.