Pattern Matching on Trees
• The data T is no longer text but is now a tree:
[Figure: tree with nodes <body> 1, <div> 2, <section> 3, <h2> 4, <p> 5, <img> 6, <img> 7]
• The pattern P asks about the structure of the tree: Is there α: an h2 header and β: an image in the same section?
• Results: ⟨α:4, β:6⟩, ⟨α:4, β:7⟩
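To make the example concrete, here is a minimal brute-force sketch in Python. The node numbers and tags come from the slide, but the parent/child edges are an assumption (the figure's layout is not recoverable from the text), and the helper names `descendants` and `match_h2_and_img` are hypothetical. It is a naive matcher for this one pattern, not the enumeration algorithm discussed later.

```python
from itertools import product

# Assumed tree shape: node id -> (tag, list of child ids).
# Only the ids and tags are from the slide; the edges are a guess.
TREE = {
    1: ("body", [2, 3]),
    2: ("div", []),
    3: ("section", [4, 5]),
    4: ("h2", []),
    5: ("p", [6, 7]),
    6: ("img", []),
    7: ("img", []),
}


def descendants(tree, node):
    """Yield every strict descendant of `node`, in document order."""
    for child in tree[node][1]:
        yield child
        yield from descendants(tree, child)


def match_h2_and_img(tree):
    """Return all (alpha, beta) pairs: an h2 and an img below the same section."""
    results = set()
    for s, (tag, _children) in tree.items():
        if tag != "section":
            continue
        below = list(descendants(tree, s))
        h2s = [n for n in below if tree[n][0] == "h2"]
        imgs = [n for n in below if tree[n][0] == "img"]
        results.update(product(h2s, imgs))
    return sorted(results)


print(match_h2_and_img(TREE))  # [(4, 6), (4, 7)]
```

Under the assumed edges, the two pairs match the results listed on the slide. This brute-force approach recomputes descendants for every section, so it says nothing about the preprocessing and delay bounds that the talk is about.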
Definitions and Results on Trees
• Tree patterns P can be written as a kind of tree automaton...
• Existing work has studied this problem and shown:
Theorem [Bagan, 2006]
We can find all matches on a tree T of a tree pattern P (with constantly many capture variables) with:
• Preprocessing linear in T and exponential in P
• Delay constant in T and exponential in P
• Again, this only measures the complexity in T!
→ We are working on proving the following:
Conjecture
• Preprocessing in O(|T| × Poly(P))
• Delay polynomial in P and independent from T
Proof Idea for Trees: Structure
Similar structure to the previous proof, but with a circuit:
• Preprocessing: Compute a circuit representation of the answers
• Enumeration: Apply a generic algorithm on the circuit
[Figure: the Tree and the Pattern (∃s section(s) ∧ s ⇝ α ∧ s ⇝ β ∧ h2(α) ∧ img(β)) go through Phase 1: Preprocessing, which yields a Data structure; Phase 2: Enumeration then produces the Results ⟨α:4, β:6⟩, ⟨α:4, β:7⟩]
Proof Idea for Trees: Set Circuits
A set circuit represents a set of answers to a pattern P(α, β)
• Singleton α:6 → "the variable α is mapped to node 6"
• Tuple ⟨α:4, β:6⟩: tuple of singletons
• The circuit captures a set of tuples, e.g., ⟨α:4, β:6⟩, ⟨α:4, β:7⟩
Three kinds of set-valued gates:
• Variable gate α:4 → captures ⟨α:4⟩
• Union gate ∪ → union of sets of tuples
• Product gate × → relational product
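The three gate kinds can be illustrated with a small Python sketch. The class names `VarGate`, `UnionGate`, `ProductGate` and the function `captured` are hypothetical, and representing a tuple as a frozenset of (variable, node) singletons is an assumption made for illustration. The function materializes the whole captured set, so it only shows the semantics of the gates, not an efficient enumeration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarGate:
    """Variable gate: captures the single tuple <name:node>."""
    name: str
    node: int


@dataclass(frozen=True)
class UnionGate:
    """Union gate: union of the sets of tuples captured by its inputs."""
    left: object
    right: object


@dataclass(frozen=True)
class ProductGate:
    """Product gate: relational product of the sets captured by its inputs."""
    left: object
    right: object


def captured(gate):
    """Return the set of tuples captured by `gate`.

    A tuple is modelled as a frozenset of (variable, node) singletons.
    """
    if isinstance(gate, VarGate):
        return {frozenset({(gate.name, gate.node)})}
    if isinstance(gate, UnionGate):
        return captured(gate.left) | captured(gate.right)
    if isinstance(gate, ProductGate):
        # Combine every tuple on the left with every tuple on the right.
        return {a | b for a in captured(gate.left) for b in captured(gate.right)}
    raise TypeError(f"unknown gate: {gate!r}")


# One possible circuit for the running example: alpha:4 x (beta:6 U beta:7)
circuit = ProductGate(VarGate("α", 4), UnionGate(VarGate("β", 6), VarGate("β", 7)))
answers = captured(circuit)
```

Here `answers` contains exactly the two tuples ⟨α:4, β:6⟩ and ⟨α:4, β:7⟩. The circuit shape is one choice among several that capture the same set.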
Proof Idea for Trees: Results
Given a tree T, we can build in O(|T| × |A|) a set circuit capturing exactly the set of tuples {⟨α1:n1, ..., αk:nk⟩ in the output of A on T}, where A is the tree automaton for the pattern
Given a set circuit satisfying some conditions, we can enumerate all tuples that it captures with linear preprocessing and constant delay
E.g., for ⟨α:4, β:6⟩, ⟨α:4, β:7⟩: enumerate ⟨α:4, β:6⟩ then ⟨α:4, β:7⟩
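The enumeration step can be sketched as a lazy Python generator that yields tuples one at a time instead of building the full set. This is an illustration only: the slides do not spell out the required conditions, so the sketch assumes that unions are over disjoint sets and that products combine disjoint variables, and it does not achieve the constant-delay guarantee stated above. The nested-tuple encoding of gates and the name `enumerate_tuples` are hypothetical.

```python
def enumerate_tuples(gate):
    """Lazily yield the tuples captured by a set circuit, as dicts.

    Gates are encoded as nested Python tuples (an assumed encoding):
      ("var", name, node), ("union", g1, g2), ("product", g1, g2)
    """
    kind = gate[0]
    if kind == "var":
        _, name, node = gate
        yield {name: node}
    elif kind == "union":
        # Assumption: the inputs capture disjoint sets, so no duplicates arise.
        for child in gate[1:]:
            yield from enumerate_tuples(child)
    elif kind == "product":
        # Assumption: the two inputs mention disjoint variables.
        _, left, right = gate
        for a in enumerate_tuples(left):
            for b in enumerate_tuples(right):
                yield {**a, **b}
    else:
        raise ValueError(f"unknown gate kind: {kind!r}")


# One possible circuit for the running example: alpha:4 x (beta:6 U beta:7)
circuit = ("product", ("var", "α", 4), ("union", ("var", "β", 6), ("var", "β", 7)))

for answer in enumerate_tuples(circuit):
    print(answer)
# {'α': 4, 'β': 6}
# {'α': 4, 'β': 7}
```

The generator restarts the right-hand enumeration for every left-hand tuple, and the time between two outputs can grow with the depth of the circuit. Avoiding that overhead is precisely what the conditions on the circuit and the preprocessing phase are for.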