Antoine Amarilli May 16, 2018
Example application: Subway routing
(Metro|RER)*|(Bus|Tram)*
Database theory and query evaluation
Database
• (Hyper)graph
• Collection of ground facts
G(aa1, ab2), G(ab2, ac3), S(aa1, m4), S(ab2, rB), ...
Query
• Regular path
(Metro|RER)*|(Bus|Tram)*
• Logic formula
∀X(rm ∈ X ∧ ∀xy (x ∈ X ∧ G(x, y) → y ∈ X)) → gn ∈ X
Result
• TRUE/FALSE
↔ Model checking
Probabilistic query evaluation
Probabilistic database
• (Hyper)graph
• Collection of ground facts
+ independent probabilities
Query
• Regular path
(Metro|RER)*|(Bus|Tram)*
• Logic formula
∀X(rm ∈ X ∧ ∀xy (x ∈ X ∧ G(x, y) → y ∈ X)) → gn ∈ X
Probabilistic result
• Probability according to the input distribution
[Figure: subway example annotated with probabilities (95% / 5%, 98% / 2%)]
Probability to be on time: 98%
Computational complexity
• Computing paths on a large graph:
→ Well-studied problem, efficient algorithms
• Computing paths on a large probabilistic graph:
[Figure: graph with a probability on each edge]
→ Exponential number of possibilities
→ #P-hard computational complexity in the database
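To make the "exponential number of possibilities" concrete, here is a minimal Python sketch of the naive approach: enumerate every possible world of a graph whose edges are present independently, and add up the probability of the worlds where a path exists. The graph `toy`, the node names and the function names are hypothetical and chosen only for illustration; the loop visits 2^n worlds for n uncertain edges, which is why this does not scale.

```python
from itertools import product

def reachable(edges, source, target):
    """Return True if target can be reached from source along directed edges."""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for u, v in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def path_probability(prob_edges, source, target):
    """Sum the probabilities of all possible worlds where target is reachable."""
    edges = list(prob_edges)
    total = 0.0
    # One Boolean choice per edge: 2^n possible worlds.
    for kept in product([False, True], repeat=len(edges)):
        weight = 1.0
        world = []
        for edge, keep in zip(edges, kept):
            p = prob_edges[edge]
            weight *= p if keep else 1.0 - p
            if keep:
                world.append(edge)
        if reachable(world, source, target):
            total += weight
    return total

# Hypothetical toy graph: two edge-disjoint routes from "a" to "d".
toy = {("a", "b"): 0.5, ("b", "d"): 0.9, ("a", "c"): 0.9, ("c", "d"): 0.5}
print(path_probability(toy, "a", "d"))
```

Each route of the toy graph works with probability 0.45 and the routes share no edge, so the result should agree with 1 − 0.55² = 0.6975 up to floating-point error.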
Idea: use the structure of data
→ Shortest path: very easy on a large tree
Leveraging the structure of uncertain data
Does query evaluation on probabilistic data have lower complexity when the structure of the data is close to a tree?
In this talk:
• Existing results on non-probabilistic data:
  • Tree automata, to evaluate queries on trees
  • Treewidth, formalizes the notion of being “close to a tree”
  • Courcelle’s theorem
• Introduce new tools and results:
  • Provenance circuits of tree automata on uncertain trees
  • Applications to probabilistic query evaluation
• Other applications: Counting, enumeration, provenance...
Table of contents
Introduction
Existing tools
Provenance circuits and probabilistic query evaluation
Other applications
Query evaluation on words
Database: a word w where nodes have a color from an alphabet
Query Q: a sentence (yes/no question) in monadic second-order logic (MSO)
“Is there both a pink and a blue node?”
Result: TRUE/FALSE indicating if the word w satisfies the query Q
Computational complexity as a function of w (the query Q is fixed)
Monadic second-order logic (MSO)
• P_blue(x) means “x is blue”; also P_pink(x), and likewise for each other color
• x → y means “x is the predecessor of y”
• Propositional logic: formulas with AND ∧, OR ∨, NOT ¬
  • P_pink(x) ∧ P_blue(y) means “Node x is pink and node y is blue”
• First-order logic: adds existential quantifier ∃ and universal quantifier ∀
  • ∃x y P_pink(x) ∧ P_blue(y) means “There is both a pink and a blue node”
• Monadic second-order logic (MSO): adds quantifiers over sets
  • ∃S ∀x S(x) means “there is a set S containing every element x”
  • Can express transitive closure x →* y, i.e., “x is before y”
  • ∀x P_pink(x) ⇒ ∃y P_blue(y) ∧ x →* y means “There is a blue node after every pink node”
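As an illustration of how set quantification yields transitive closure, one standard way to write x →* y is sketched below. It follows the same pattern as the subway formula earlier in the talk; the variable names S, u, v are chosen here for illustration. The formula says that every set containing x and closed under successors also contains y. Note that this encoding is reflexive, so it also holds when x = y.

```latex
x \to^{*} y \;\equiv\;
\forall S\, \Bigl( \bigl( x \in S \;\wedge\;
  \forall u\, \forall v\, ( u \in S \wedge u \to v \Rightarrow v \in S ) \bigr)
  \Rightarrow y \in S \Bigr)
```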
Word automata
Translate the query Q to a deterministic word automaton
Alphabet: [colored nodes]
w: [Figure: word of colored nodes, annotated with the run ⊥ P P ⊤ ⊤]
Q: ∃x y P_pink(x) ∧ P_blue(y)
• States: {⊥, B, P, ⊤}
• Final states: {⊤}
• Initial function: [colored nodes mapped to the states ⊥, P and B]
• Transitions (examples): [Figure: example transitions among ⊥, P and ⊤]
Theorem (Büchi, 1960)
MSO and word automata have the same expressive power on words
Corollary
Query evaluation of MSO on words is in linear time.
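The automaton for “Is there both a pink and a blue node?” can be sketched in a few lines of Python. This is an illustrative reconstruction, not the exact transition table of the slide: the color names, the third color "grey" and the example word are assumptions, and the run starts in ⊥ before reading the first letter, which gives the same states as an initial function mapping pink to P and blue to B. The run takes one step per letter, hence linear time in the word.

```python
# States of the automaton, written in ASCII: bot = ⊥, top = ⊤.
BOT, B, P, TOP = "bot", "B", "P", "top"

def step(state, color):
    """Deterministic transition: remember which of pink/blue was seen so far."""
    seen_pink = state in (P, TOP) or color == "pink"
    seen_blue = state in (B, TOP) or color == "blue"
    if seen_pink and seen_blue:
        return TOP
    if seen_pink:
        return P
    if seen_blue:
        return B
    return BOT

def run(word):
    """Return the list of states reached after each letter of the word."""
    state, trace = BOT, []
    for color in word:
        state = step(state, color)
        trace.append(state)
    return trace

def accepts(word):
    """The word is accepted if the run ends in the final state."""
    trace = run(word)
    return bool(trace) and trace[-1] == TOP

# Hypothetical word, chosen so that the run reads bot, P, P, top, top.
word = ["grey", "pink", "pink", "blue", "grey"]
print(run(word), accepts(word))
```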
Query evaluation on trees
Database: a tree T where nodes have a color from an alphabet
Query Q: a sentence in monadic second-order logic (MSO)
• P_blue(x) means “x is blue”
• x → y means “x is the parent of y”
“Is there both a pink and a blue node?”
∃x y P_pink(x) ∧ P_blue(y)
Result: TRUE/FALSE indicating if the tree T satisfies the query Q
Computational complexity as a function of T (the query Q is fixed)
Tree automata
Tree alphabet: [colored nodes]
[Figure: example tree annotated bottom-up with automaton states; the root is labeled ⊤]
• Bottom-up deterministic tree automaton
• “Is there both a pink and a blue node?”
• States: {⊥, B, P, ⊤}
• Final states: {⊤}
• Initial function: [colored nodes mapped to the states ⊥, P and B]
• Transitions (examples): [Figure: example transitions combining the states of two children]
Theorem [Thatcher and Wright, 1968]
MSO and tree automata have the same expressive power on trees
Corollary
Query evaluation of MSO on trees is in linear time.
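A bottom-up deterministic tree automaton for the same query can be sketched as follows. This is a minimal illustration under assumptions: the `Node` class, the color names and the example tree are hypothetical, and the transition rule is one natural choice that matches the four states of the slide. Each node is visited once, which is where the linear-time bound comes from.

```python
from dataclasses import dataclass
from typing import Optional

# States of the automaton, written in ASCII: bot = ⊥, top = ⊤.
BOT, B, P, TOP = "bot", "B", "P", "top"

@dataclass
class Node:
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def state_of(node):
    """Compute the state of a node from its color and the states of its children."""
    child_states = [state_of(c) for c in (node.left, node.right) if c is not None]
    seen_pink = node.color == "pink" or any(s in (P, TOP) for s in child_states)
    seen_blue = node.color == "blue" or any(s in (B, TOP) for s in child_states)
    if seen_pink and seen_blue:
        return TOP
    if seen_pink:
        return P
    if seen_blue:
        return B
    return BOT

def accepts(root):
    """The tree is accepted if the state reached at the root is final."""
    return state_of(root) == TOP

# Hypothetical tree: a grey root with a pink leaf and a grey node above a blue leaf.
example = Node("grey", Node("pink"), Node("grey", Node("blue")))
print(accepts(example))
```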
Query evaluation on treelike data
Database: a treelike database T
???
Query Q: a sentence in monadic second-order logic (MSO)
• x → y means “x is the parent of y”
(Metro|RER)*|(Bus|Tram)*
Result: TRUE/FALSE indicating if T satisfies the query Q
Computational complexity as a function of T (the query Q is fixed)
Treewidth
Treewidth by example:
• Trees have treewidth 1
• Cycles have treewidth 2
• k-cliques and (k−1)-grids have treewidth k−1
→ Treelike: the treewidth is bounded by a constant
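The slide gives treewidth by example only. As a hedged illustration of the standard definition behind it, the Python sketch below checks that a family of bags arranged in a tree is a valid tree decomposition of a graph and reports its width (largest bag size minus one). The function names and the 5-cycle example are assumptions made for illustration. A valid decomposition only shows an upper bound on the treewidth, here 2 for the cycle.

```python
def connected(nodes, links):
    """True if `nodes` is non-empty and connected using links between its members."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst in nodes and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
    return seen == nodes

def is_tree_decomposition(vertices, edges, bags, links):
    """Check the conditions of a tree decomposition (bags: name -> set of vertices)."""
    # The bags must form a tree: connected, with exactly |bags| - 1 links.
    if len(links) != len(bags) - 1 or not connected(bags, links):
        return False
    # Every edge of the graph must be contained in some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # The bags containing a given vertex must be non-empty and connected.
    for v in vertices:
        holding = [name for name, bag in bags.items() if v in bag]
        if not connected(holding, links):
            return False
    return True

def width(bags):
    """Width of a decomposition: size of the largest bag, minus one."""
    return max(len(bag) for bag in bags.values()) - 1

# Hypothetical example: the cycle 0-1-2-3-4-0 with three bags arranged in a path.
cycle_vertices = [0, 1, 2, 3, 4]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
bags = {"A": {0, 1, 2}, "B": {0, 2, 3}, "C": {0, 3, 4}}
links = [("A", "B"), ("B", "C")]
print(is_tree_decomposition(cycle_vertices, cycle_edges, bags, links), width(bags))
```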
Courcelle’s theorem
MSO query: (RER|metro)*|(bus|tram)* → Tree automaton
Treelike data → Tree encoding (linear)
Tree automaton + Tree encoding → Query answer: TRUE (linear)
Theorem [Courcelle, 1990]
For any fixed Boolean MSO query Q and k ∈ ℕ, given a database D of treewidth ≤ k, we can compute in linear time in D whether D satisfies Q
Probabilistic query evaluation on treelike data
• Database D with treewidth ≤ k for some constant k
• Probability of each fact of D to be actually present in the data (independently from other facts)
Query Q: a sentence in monadic second-order logic (MSO)
(Metro|RER)*|(Bus|Tram)*
Result: Probability that the database D satisfies query Q
Computational complexity as a function of the database D (the query Q is fixed)
Roadmap
MSO query: (RER|metro)*|(bus|tram)* → Tree automaton
Probabilistic treelike data → Uncertain tree encoding + probabilities (linear)
Tree automaton + Uncertain tree encoding → Provenance circuit (linear)
Provenance circuit + probabilities → Probability: 95% (linear)
Uncertain trees
[Figure: tree with nodes numbered 1 to 7]
A valuation of a tree decides whether to keep (1) or discard (0) node labels
A: “Is there both a pink and a blue node?”
• Valuation {2, 3, 7 ↦ 1, ∗ ↦ 0}: the tree automaton A accepts
• Valuation {2 ↦ 1, ∗ ↦ 0}: the tree automaton A rejects
• Valuation {2, 7 ↦ 1, ∗ ↦ 0}: the tree automaton A accepts
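When each node label is kept independently with some probability, the probability that the automaton accepts can be computed without listing all valuations. The sketch below is not the provenance-circuit construction of the talk; it is a direct dynamic program that propagates, bottom-up, a probability distribution over automaton states. The tree encoding as `(color, keep_probability, children)` tuples, the color names and the example are assumptions. For trees of bounded arity the work per node is bounded, so the whole computation is linear in the tree.

```python
from itertools import product

# States of the automaton, written in ASCII: bot = ⊥, top = ⊤.
BOT, B, P, TOP = "bot", "B", "P", "top"

def combine(color, child_states):
    """Transition of the automaton; color is None when the label is discarded."""
    seen_pink = color == "pink" or any(s in (P, TOP) for s in child_states)
    seen_blue = color == "blue" or any(s in (B, TOP) for s in child_states)
    if seen_pink and seen_blue:
        return TOP
    if seen_pink:
        return P
    if seen_blue:
        return B
    return BOT

def state_distribution(tree):
    """Return a dict state -> probability for tree = (color, keep_prob, children)."""
    color, keep, children = tree
    child_dists = [state_distribution(child) for child in children]
    dist = {}
    # The label is either kept (probability keep) or discarded (1 - keep).
    for label, p_label in ((color, keep), (None, 1.0 - keep)):
        # Subtrees are independent, so their state probabilities multiply.
        for combo in product(*(d.items() for d in child_dists)):
            weight = p_label
            for _, p in combo:
                weight *= p
            state = combine(label, [s for s, _ in combo])
            dist[state] = dist.get(state, 0.0) + weight
    return dist

# Hypothetical uncertain tree: a grey root with an uncertain pink and blue leaf.
uncertain = ("grey", 1.0, [("pink", 0.5, []), ("blue", 0.5, [])])
print(state_distribution(uncertain).get(TOP, 0.0))
```

In this toy example the automaton accepts exactly when both leaves keep their labels, so the acceptance probability is 0.5 × 0.5 = 0.25.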
Boolean circuit
[Figure: circuit with gates ∨, ¬, ∧ over the variables x and y]
• Directed acyclic graph of gates
• Output gate: [marked in the figure]
• Variable gates: x
• Internal gates: ∨ ∧ ¬
• Valuation: function from variables to {0, 1}
Example: ν = {x ↦ 0, y ↦ 1}... mapped to 1
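Evaluating a Boolean circuit under a valuation can be sketched as a memoized traversal of the gate graph. The dictionary encoding, the gate names and the wiring (¬x) ∨ (x ∧ y) below are assumptions: the exact wiring of the slide's figure is not recoverable from the text, and this one was chosen because it uses the same gates and also maps ν = {x ↦ 0, y ↦ 1} to 1.

```python
def evaluate(circuit, output, valuation):
    """Evaluate the gate `output` of a circuit given as a dict gate -> description.

    A description is ("var", name), ("not", g), ("and", g1, g2, ...) or
    ("or", g1, g2, ...), where g, g1, g2 are gate names.
    """
    cache = {}  # each gate is evaluated at most once, as in a DAG traversal

    def value(gate):
        if gate not in cache:
            kind, *inputs = circuit[gate]
            if kind == "var":
                cache[gate] = bool(valuation[inputs[0]])
            elif kind == "not":
                cache[gate] = not value(inputs[0])
            elif kind == "and":
                cache[gate] = all(value(g) for g in inputs)
            elif kind == "or":
                cache[gate] = any(value(g) for g in inputs)
            else:
                raise ValueError(f"unknown gate kind: {kind}")
        return cache[gate]

    return value(output)

# Hypothetical wiring: (NOT x) OR (x AND y).
circuit = {
    "gx": ("var", "x"),
    "gy": ("var", "y"),
    "g_not": ("not", "gx"),
    "g_and": ("and", "gx", "gy"),
    "g_or": ("or", "g_not", "g_and"),
}
print(evaluate(circuit, "g_or", {"x": 0, "y": 1}))
```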