Introduction Trees Treelike instances Future work

Structurally Tractable Uncertain Data

Antoine Amarilli

Supervisor: Pierre Senellart

Expected graduation: August 2016

Télécom ParisTech, France

May 31st, 2015

Uncertain data management

Is data always complete and certain?

Unreliable sources

Crowdsourcing

Massive collaborations: Wikidata, etc.

Error-prone processing

Unsupervised information extraction

OCR, speech recognition, etc.

Outdated or stale data

We need uncertain data management

Example model: TID

Consider a relational instance

Date     Animal           Probability
Wed 3rd  Kangaroo         5%
Wed 3rd  Tasmanian devil  0%
Thu 4th  Kangaroo         6%
Thu 4th  Tasmanian devil  2%
Fri 5th  Kangaroo         20%
Fri 5th  Tasmanian devil  15%

Add probabilities to facts

Assume independence between facts

Semantics: a probability distribution on regular instances

What about queries? (Boolean CQs)

Semantics: compute the probability that the query holds
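The possible-world semantics of this slide can be illustrated with a small brute-force sketch. It enumerates every subinstance of the table, weights it by the product of the fact probabilities (using independence), and sums the weights of the worlds where the query holds. The query `some_kangaroo` and the function names are assumptions chosen for illustration; they do not appear in the talk, and the enumeration is exponential in the number of facts.

```python
from itertools import product

# Facts and probabilities copied from the table on this slide.
tid = {
    ("Wed 3rd", "Kangaroo"): 0.05,
    ("Wed 3rd", "Tasmanian devil"): 0.00,
    ("Thu 4th", "Kangaroo"): 0.06,
    ("Thu 4th", "Tasmanian devil"): 0.02,
    ("Fri 5th", "Kangaroo"): 0.20,
    ("Fri 5th", "Tasmanian devil"): 0.15,
}

def query_probability(tid, query):
    """Sum the probabilities of the possible worlds in which `query` holds."""
    facts = list(tid)
    total = 0.0
    for kept in product([False, True], repeat=len(facts)):
        world = {f for f, k in zip(facts, kept) if k}
        weight = 1.0
        for f, k in zip(facts, kept):
            # Independence: each fact is kept or dropped on its own.
            weight *= tid[f] if k else 1.0 - tid[f]
        if query(world):
            total += weight
    return total

# Hypothetical Boolean query: "some kangaroo fact is present".
def some_kangaroo(world):
    return any(animal == "Kangaroo" for _, animal in world)

p = query_probability(tid, some_kangaroo)
print(round(p, 4))  # 1 - 0.95 * 0.94 * 0.80 = 0.2856
```

The exponential loop is only meant to make the semantics concrete; it is exactly the cost that the rest of the talk aims to avoid.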

Big problem: Tractability

Evaluate the fixed Boolean CQ: ∃x y R(x) S(x, y) T(y)

Measure data complexity, i.e., as a function of the instance

#P-hard [Dalvi and Suciu, 2007] (instead of AC0)

Existing approaches:

Avoid hard queries [Dalvi and Suciu, 2012]

Use sampling to get approximate answers
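As a rough illustration of the sampling approach, the sketch below estimates the probability of the query ∃x y R(x) S(x, y) T(y) by drawing random possible worlds, and compares the estimate with the exact value obtained by exhaustive enumeration. The instance, its probabilities, and all function names are hypothetical; this is plain Monte Carlo estimation, not a method taken from the cited papers.

```python
import random
from itertools import product

# Hypothetical TID instance over the relations R, S, T of the query.
tid = {
    ("R", ("a",)): 0.5,
    ("R", ("b",)): 0.7,
    ("S", ("a", "c")): 0.4,
    ("S", ("b", "c")): 0.9,
    ("S", ("b", "d")): 0.3,
    ("T", ("c",)): 0.6,
    ("T", ("d",)): 0.8,
}

def holds(world):
    """Boolean CQ: some x, y with R(x), S(x, y) and T(y) all in the world."""
    for rel, args in world:
        if rel == "S":
            x, y = args
            if ("R", (x,)) in world and ("T", (y,)) in world:
                return True
    return False

def exact(tid):
    """Exact probability by enumerating all 2^n possible worlds."""
    facts = list(tid)
    total = 0.0
    for kept in product([False, True], repeat=len(facts)):
        weight = 1.0
        for f, k in zip(facts, kept):
            weight *= tid[f] if k else 1.0 - tid[f]
        if holds({f for f, k in zip(facts, kept) if k}):
            total += weight
    return total

def estimate(tid, samples, rng):
    """Monte Carlo estimate: fraction of sampled worlds satisfying the query."""
    hits = 0
    for _ in range(samples):
        world = {f for f, p in tid.items() if rng.random() < p}
        hits += holds(world)
    return hits / samples

rng = random.Random(0)  # fixed seed so the run is reproducible
p_exact = exact(tid)
p_approx = estimate(tid, 20000, rng)
print(p_exact, p_approx)
```

The estimate only comes with a probabilistic error bound, which is the trade-off this slide contrasts with exact evaluation.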

The general idea

Input instances are not arbitrary!

Impose structural restrictions on instances

Prove fixed-parameter tractability results

This talk

Parameter: instance treewidth

Bound it by a constant

MSO queries have linear data complexity [Courcelle, 1990]

Also holds on TID instances (with unit-cost arithmetic)

(joint work with Pierre Bourhis and Pierre Senellart)

Table of contents

1 Introduction

2 Trees

3 Treelike instances

4 Future work

Uncertain tree example

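A possible PrXML tree, from Wikidata facts:

[Figure: PrXML tree rooted at Q298423. ind nodes, with edge probabilities 0.4, 0.7 and 0.7, attach the facts "occupation: musician", "place of birth: Crescent" and "surname: Manning"; a "given name" node has a mux child choosing between Chelsea and Bradley, with probabilities 0.4 and 0.6.]

Probabilities reflect contributor trustworthiness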

Formalizing uncertain trees

[Figure: a tree with seven nodes, numbered 1 to 7, some of them colored red or green.]

A valuation of a tree decides whether to keep or discard node labels.

Example query:

“Is there both a red and green node?”

Valuation {1,2,3,4,5,6,7}: the query is true

Valuation {1,2,5,6}: the query is false

Valuation {2,7}: the query is true

Provenance formulae and circuits

[Figure: the example tree, with nodes numbered 1 to 7.]

Which valuations satisfy the query?

Provenance formula of a query q on an uncertain tree T:

Boolean formula ϕ on variables x1 ... x7

ν(T) satisfies q iff ν(ϕ) is true

Provenance circuit of q on T [Deutch et al., 2014]:

Boolean circuit C with input gates g1 ... g7

ν(T) satisfies q iff ν(C) is true

Example

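[Figure: the example tree, with nodes numbered 1 to 7.]

Is there both a red and a green node?

Provenance formula: (x2 ∨ x3) ∧ x7

Provenance circuit: [Figure: Boolean circuit over the input gates g2, g3 and g7.]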
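To make the role of the provenance formula concrete, the sketch below evaluates (x2 ∨ x3) ∧ x7 on the three valuations shown on the "Formalizing uncertain trees" slide, and then computes a query probability from the formula by enumeration. The keep probability of 0.5 for every node is an assumption made purely for illustration.

```python
from itertools import product

def provenance(kept):
    """Evaluate (x2 or x3) and x7, where x_i is true iff node i is kept."""
    x = {i: (i in kept) for i in range(1, 8)}
    return (x[2] or x[3]) and x[7]

def probability(prob):
    """Probability that the formula holds when node i is kept with prob[i]."""
    total = 0.0
    for bits in product([False, True], repeat=7):
        kept = {i + 1 for i, b in enumerate(bits) if b}
        if provenance(kept):
            weight = 1.0
            for i, b in enumerate(bits):
                weight *= prob[i + 1] if b else 1.0 - prob[i + 1]
            total += weight
    return total

# The three valuations from the earlier slide: true, false, true.
for valuation in [{1, 2, 3, 4, 5, 6, 7}, {1, 2, 5, 6}, {2, 7}]:
    print(sorted(valuation), provenance(valuation))

# Hypothetical probabilities: every node is kept with probability 0.5.
uniform = {i: 0.5 for i in range(1, 8)}
print(probability(uniform))  # (1 - 0.5 * 0.5) * 0.5 = 0.375
```

Once the formula (or circuit) is known, the tree and the query are no longer needed: the probability depends only on the formula and the node probabilities.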

Our main result on trees

Theorem

For any fixed MSO query q (first-order logic plus quantification over sets) we can compute a provenance circuit C for any input tree T in linear time in the input T.

Key ideas:

Compile q to a tree automaton [Thatcher and Wright, 1968]

Write the possible transitions of the automaton on T in C

Corollary

If tree nodes have a probability of being independently kept, we can compute the query probability in linear time.

Relates to message passing [Lauritzen and Spiegelhalter, 1988]

Already known [Cohen et al., 2009]
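The corollary can be illustrated with a bottom-up dynamic program, in the spirit of message passing, for the running query "Is there both a red and green node?". Each node receives a probability distribution over automaton states (here, the set of colors seen among the kept nodes of its subtree), computed from the distributions of its children. The tree shape, the node colors (chosen to agree with the provenance formula (x2 ∨ x3) ∧ x7) and the keep probabilities are all assumptions; this is a sketch for one query, not the general construction of the talk.

```python
from collections import defaultdict
from itertools import product

# Assumed tree shape and colors; the slide's figure is not recoverable.
children = {1: [5, 2], 5: [7, 6], 2: [4, 3], 7: [], 6: [], 4: [], 3: []}
color = {2: "red", 3: "red", 7: "green"}  # other nodes are uncolored
keep = {n: 0.5 for n in children}         # hypothetical keep probabilities

def state_distribution(node):
    """Map each state (set of colors seen below `node`) to its probability."""
    dist = {frozenset(): 1.0}
    for child in children[node]:
        merged = defaultdict(float)
        child_dist = state_distribution(child)
        # Subtrees are independent, so their state probabilities multiply.
        for (s1, p1), (s2, p2) in product(dist.items(), child_dist.items()):
            merged[s1 | s2] += p1 * p2
        dist = merged
    own = frozenset({color[node]}) if node in color else frozenset()
    result = defaultdict(float)
    for s, p in dist.items():
        result[s | own] += p * keep[node]      # the node's label is kept
        result[s] += p * (1.0 - keep[node])    # the node's label is discarded
    return dict(result)

def query_probability(root):
    """Probability of reaching the accepting state at the root."""
    accepting = frozenset({"red", "green"})
    return state_distribution(root).get(accepting, 0.0)

print(query_probability(1))  # 0.375 with the assumptions above
```

Each node is visited once and the number of states is fixed by the query, so the running time is linear in the tree if arithmetic operations have unit cost.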

Treewidth intuition

Generalize from trees to treelike instances:

Treewidth: measure on instances

Trees have treewidth 1

Cycles have treewidth 2

k-cliques and k-grids have treewidth k − 1

Treelike: the treewidth is bounded by a constant

Treelike instances can be encoded to trees

Treewidth formal definition

Instance:

N: (a, b), (b, c), (c, d), (d, e), (e, f)

S: (a, c), (b, e)

Gaifman graph: [Figure: graph on the elements a, b, c, d, e, f.]

Tree decomp.: bags {a, b, c}, {b, c, e}, {c, d, e}, {e, f}

Tree encoding: N(a1,a2), N(a2,a3), S(a1,a3), S(a2,a4), N(a3,a1), N(a1,a4), N(a4,a1)

Treelike: constant bound on the maximal bag size
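The definition can be checked mechanically on the example. The sketch below builds the Gaifman graph of the instance and verifies the two standard conditions of a tree decomposition for the four bags of the slide: every fact is covered by some bag, and the bags containing a given element are connected in the tree. The shape of the decomposition tree is lost in the extraction, so the path 0 - 1 - 2 - 3 used here is an assumption, as are all function names.

```python
from itertools import combinations

facts = {
    "N": [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")],
    "S": [("a", "c"), ("b", "e")],
}
bags = [{"a", "b", "c"}, {"b", "c", "e"}, {"c", "d", "e"}, {"e", "f"}]
# Assumption: the bags are arranged on a path 0 - 1 - 2 - 3.
tree_edges = [(0, 1), (1, 2), (2, 3)]

def gaifman_edges(facts):
    """Two elements are adjacent iff they co-occur in some fact."""
    edges = set()
    for tuples in facts.values():
        for t in tuples:
            edges.update(frozenset(pair) for pair in combinations(set(t), 2))
    return edges

def is_tree_decomposition(facts, bags, tree_edges):
    """Check coverage and connectivity (tree_edges is assumed to be a tree)."""
    # 1. Every fact, hence every Gaifman edge, is contained in some bag.
    for tuples in facts.values():
        for t in tuples:
            if not any(set(t) <= bag for bag in bags):
                return False
    # 2. For each element, the bags containing it form a connected subtree.
    for x in set().union(*bags):
        nodes = {i for i, bag in enumerate(bags) if x in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for a, b in tree_edges:
                for v, w in ((a, b), (b, a)):
                    if v == u and w in nodes and w not in seen:
                        seen.add(w)
                        stack.append(w)
        if seen != nodes:
            return False
    return True

# Width of a decomposition: maximal bag size minus one.
width = max(len(bag) for bag in bags) - 1
print(is_tree_decomposition(facts, bags, tree_edges), width)  # True 2
```

The width printed here is that of this particular decomposition; the treewidth of the instance is the minimum width over all of its tree decompositions.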

Our main result on treelike instances

Theorem

For any fixed MSO query q and bound k ∈ ℕ, for any input instance I of treewidth ≤ k, we can compute in linear time a provenance circuit of q on I.

Key ideas:

Compute tree decomposition and tree encoding in linear time

Compile q to an automaton on encodings [Flum et al., 2002]

Use the previous construction

Possible subinstances are possible valuations of the encoding

Corollary

MSO queries have linear data complexity on treelike TID instances.

Further results

Support other models with dependencies between facts:

Block-independent disjoint (BID): mutually exclusive facts

pc-tables: events and Boolean annotations

Support other tasks:

Counting query results encodes to probabilistic evaluation

General connection to semiring provenance [Green et al., 2007]

Extending the provenance connection

Negation:

Semiring provenance is usually defined for positive queries

Yet our provenance circuits work fine with negation

Relate this to provenance for queries with negation?

Multiplicities:

Our work connects to the universal semiring N[X]...

... but only for UCQs, not arbitrary MSO

Missing: notion of multiplicity for MSO (multisets?)

Structural restrictions:

Are real-world instances tree-like?

Are there other possible restrictions?

Experiments?

Connect to other frameworks

Compiling to automata has high combined complexity

Investigate Monadic Datalog approaches [Gottlob et al., 2010]

Uncertainty on facts, not values

Connect to work on nulls [Libkin, 2014]

What about reasoning on uncertain data and its implications?

Connect to tractable languages (e.g., guarded Datalog)

What about incorporating new evidence?

Connect to work on conditioning [Tang et al., 2012]

Other projects and directions

Open-world query answering (with Michael Benedikt)

Certainty of a Boolean CQ when completing under constraints

Which constraint languages are decidable?

What is the impact of assuming finiteness?

Uncertain ordered data

Bag semantics for the relational algebra (with M. Lamine Ba, Daniel Deutch, Pierre Senellart)

Interpolation schemes for partially ordered numerical values (with Yael Amsterdamer, Tova Milo, Pierre Senellart)

Problem of instance possibility

On uncertain orders (labeled posets)

On probabilistic XML

Thanks for your attention!

References

Cohen, S., Kimelfeld, B., and Sagiv, Y. (2009). Running tree automata on probabilistic XML. In PODS.

Courcelle, B. (1990). The monadic second-order logic of graphs. I. Recognizable sets of finite graphs. Inf. Comput., 85(1).

Dalvi, N. and Suciu, D. (2007). Efficient query evaluation on probabilistic databases. VLDBJ, 16(4):523–544.

Dalvi, N. and Suciu, D. (2012). The dichotomy of probabilistic inference for unions of conjunctive queries. JACM, 59(6):30.

Deutch, D., Milo, T., Roy, S., and Tannen, V. (2014). Circuits for datalog provenance. In ICDT.

Flum, J., Frick, M., and Grohe, M. (2002). Query evaluation via tree-decompositions. JACM, 49(6):716–752.

Green, T. J., Karvounarakis, G., and Tannen, V. (2007). Provenance semirings. In PODS.

Lauritzen, S. L. and Spiegelhalter, D. J. (1988). Local computations with probabilities on graphical structures and their application to expert systems. J. Royal Statistical Society. Series B.

Libkin, L. (2014). Incomplete data: what went wrong, and how to fix it. In Proc. SIGMOD.

Tang, R., Cheng, R., Wu, H., and Bressan, S. (2012). A framework for conditioning uncertain relational data. In Database and Expert Systems Applications. Springer.

Thatcher, J. W. and Wright, J. B. (1968). Generalized finite automata theory with an application to a decision problem of second-order logic. Mathematical systems theory, 2(1):57–81.

Semiring provenance [Green et al., 2007]

Semiring (K, ⊕, ⊗, 0, 1)

(K, ⊕) commutative monoid with identity 0

(K, ⊗) commutative monoid with identity 1

⊗ distributes over ⊕

0 absorptive for ⊗

Idea: Maintain annotations on tuples while evaluating:

Union: annotation is the sum of union tuples

Select: select as usual

Project: annotation is the sum of projected tuples

Product: annotation is the product
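The four evaluation rules can be sketched over annotated relations, represented here as dictionaries from tuples to annotations in a semiring K. The `Semiring` class, the two example semirings and the operator names are assumptions made for this illustration; selection is implemented as a plain filter, following "select as usual".

```python
from collections import defaultdict

class Semiring:
    """A semiring (K, plus, times, zero, one), passed around explicitly."""
    def __init__(self, zero, one, plus, times):
        self.zero, self.one, self.plus, self.times = zero, one, plus, times

# Two standard instances: natural numbers (bag semantics) and Booleans.
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)

def union(K, r, s):
    """Union: the annotation is the sum over both inputs."""
    out = defaultdict(lambda: K.zero)
    for rel in (r, s):
        for t, a in rel.items():
            out[t] = K.plus(out[t], a)
    return dict(out)

def select(r, predicate):
    """Select: keep matching tuples with their annotation unchanged."""
    return {t: a for t, a in r.items() if predicate(t)}

def project(K, r, positions):
    """Project: sum the annotations of tuples with the same projection."""
    out = defaultdict(lambda: K.zero)
    for t, a in r.items():
        key = tuple(t[i] for i in positions)
        out[key] = K.plus(out[key], a)
    return dict(out)

def cartesian(K, r, s):
    """Product: the annotation is the product of the two annotations."""
    return {t + u: K.times(a, b) for t, a in r.items() for u, b in s.items()}

R = {("a", "b"): 2, ("a", "c"): 3}
S = {("x",): 4}
print(project(COUNTING, R, [0]))   # {('a',): 5}
print(cartesian(COUNTING, R, S))   # annotations 8 and 12
```

Swapping `COUNTING` for `BOOLEAN` (with Boolean annotations) turns the same operators into ordinary set-semantics evaluation, which is the point of parameterizing by the semiring.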

Tree automata

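Tree alphabet: [Figure: node symbols; not recoverable from the extraction.]

bNTA: bottom-up nondeterministic tree automaton

“Is there both a red and green node?”

States: {⊥, G, R, ⊤}

Final states: {⊤}

Initial function: [Figure: mapping of leaf symbols to states; the states R and G appear.]

Transitions (examples): [Figure: example transitions over the states R and G.]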
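Since the transition figures did not survive extraction, the following sketch shows one way such an automaton could be written down and run. It uses the states and final state listed on the slide, but the meaning given to each state (which colors have been seen), the label names, and the transition table built from them are assumptions, not the slide's actual automaton. Trees are full binary trees encoded as nested tuples.

```python
# States as on the slide; SEEN records which colors each state stands for
# (this reading of the state names is an assumption).
SEEN = {
    "⊥": frozenset(),
    "G": frozenset({"green"}),
    "R": frozenset({"red"}),
    "⊤": frozenset({"green", "red"}),
}
FINAL = {"⊤"}
# Hypothetical tree alphabet: uncolored, green and red nodes.
LABELS = {
    "none": frozenset(),
    "green": frozenset({"green"}),
    "red": frozenset({"red"}),
}

def state_of(colors):
    """Return the state that stands for exactly this set of colors."""
    return next(q for q, seen in SEEN.items() if seen == colors)

# Initial function (for leaves) and transitions (for internal nodes) both
# return sets of states, as in a nondeterministic automaton; every set
# happens to be a singleton here.
initial = {label: {state_of(c)} for label, c in LABELS.items()}
delta = {
    (label, q1, q2): {state_of(SEEN[q1] | SEEN[q2] | c)}
    for label, c in LABELS.items() for q1 in SEEN for q2 in SEEN
}

def reachable(tree):
    """Set of states the automaton can reach at the root, bottom-up."""
    label, *kids = tree
    if not kids:
        return set(initial[label])
    left, right = kids
    return {
        q
        for q1 in reachable(left)
        for q2 in reachable(right)
        for q in delta[(label, q1, q2)]
    }

def accepts(tree):
    return bool(reachable(tree) & FINAL)

both = ("none", ("none", ("green",), ("none",)), ("red", ("none",), ("red",)))
only_red = ("none", ("none",), ("red",))
print(accepts(both), accepts(only_red))  # True False
```

A genuinely nondeterministic automaton would simply put several states in some of the sets of `initial` or `delta`; `reachable` handles that case unchanged.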

Constructing the provenance circuit

Construct a Boolean provenance circuit bottom-up

[Figure: circuit gadget for one tree node. It combines the gates for the states q1 and q2 of the node's children with the node's input gate (in), using ∧, ∨ and ¬ gates; an ∧ gate is added whenever the automaton has a transition from q1 and q2 to a state q, and the ∧ gates for the same q are joined by an ∨ gate.]

Example: block-independent disjoint (BID) instances

name   city       iso  p
pods   melbourne  au   0.8
pods   sydney     au   0.2
icalp  tokyo      jp   0.1
icalp  kyoto      jp   0.9

Evaluating a fixed CQ is #P-hard in general

For a treelike instance, linear time!
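The BID semantics can be illustrated by brute force on this table: rows in the same block are mutually exclusive, and different blocks are independent. The sketch assumes that the block key is the name column, which the slide does not state explicitly, and the query `sydney_or_kyoto` is a hypothetical example. Like any possible-world enumeration it is exponential in the number of blocks.

```python
from itertools import product

# Rows copied from the table on this slide: (name, city, iso, p).
rows = [
    ("pods", "melbourne", "au", 0.8),
    ("pods", "sydney", "au", 0.2),
    ("icalp", "tokyo", "jp", 0.1),
    ("icalp", "kyoto", "jp", 0.9),
]

def blocks(rows):
    """Group rows by block key (assumed here to be the name column)."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[0], []).append(row)
    return list(grouped.values())

def query_probability(rows, query):
    """Sum the weights of the possible worlds in which `query` holds."""
    choices_per_block = []
    for block in blocks(rows):
        # Pick exactly one row of the block...
        choices = [([row[:3]], row[3]) for row in block]
        # ...or no row at all, if the probabilities sum to less than 1.
        leftover = 1.0 - sum(row[3] for row in block)
        if leftover > 1e-12:
            choices.append(([], leftover))
        choices_per_block.append(choices)
    total = 0.0
    for combo in product(*choices_per_block):
        world = [fact for chosen, _ in combo for fact in chosen]
        weight = 1.0
        for _, p in combo:
            weight *= p  # blocks are independent of each other
        if query(world):
            total += weight
    return total

# Hypothetical Boolean query: some conference is in Sydney or in Kyoto.
def sydney_or_kyoto(world):
    return any(city in ("sydney", "kyoto") for _, city, _ in world)

print(query_probability(rows, sydney_or_kyoto))  # 1 - 0.8 * 0.1 = 0.92
```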

Supporting coefficients

In the world of trees

The same valuation can be accepted multiple times

Number of accepting runs of the bNTA

In the world of treelike instances

The same match can be the image of multiple homomorphisms

Add assignment facts to represent possible assignments

Encode to a bNTA that guesses them

Supporting exponents

In the world of trees

The same fact can be used multiple times

Annotate nodes with a multiplicity

The bNTA is monotone for that multiplicity

Use each input gate as many times as we read its fact

In the world of treelike instances

The same fact can be the image of multiple atoms

Maximal multiplicity is query-dependent but instance-independent

Encode CQs to bNTAs that read multiplicities

Consider all possible CQ self-homomorphisms

Count the multiplicities of identical atoms

Rewrite relations to add multiplicities

Usual compilation on the modified signature
