• Aucun résultat trouvé

June5,2014 AntoineAmarilli ThePossibilityProblemforProbabilisticXML

N/A
N/A
Protected

Academic year: 2022

Partager "June5,2014 AntoineAmarilli ThePossibilityProblemforProbabilisticXML"

Copied!
25
0
0

Texte intégral

(1)

The Possibility Problem for Probabilistic XML

Antoine Amarilli

Télécom ParisTech, Paris, France

June 5, 2014

(2)

Probabilistic XML

We areunsureabout the exact contents of an XML document.

directory

person

address 42 Foo St.

x name

mux

Jean Doe Jane Doe

0.8 0.2

person

ind

mobile 1234 0.6 address 42 Foo St.

x name

John Doe

(3)

Introduction Known results Unordered documents Unambiguous labels Conclusion

Local formalisms: possible worlds semantics

r ind

b a

α β

(1−α)(1−β) α(1−β) (1−α)β αβ

r r

a

r b

r b a

r mux

b a

α β

1−α−β α β

r r

a

r b r

det b a

1 r b a

Caution: we impose α <1,β <1 in ind.

(4)

Introduction Known results Unordered documents Unambiguous labels Conclusion

Local formalisms: possible worlds semantics

r ind

b a

α β

(1−α)(1−β) α(1−β) (1−α)β αβ

r r

a

r b

r b a r

mux b a

α β

1−α−β α β

r r

a

r b

det b a

r

b a

Caution: we impose α <1,β <1 in ind.

(5)

Local formalisms: possible worlds semantics

r ind

b a

α β

(1−α)(1−β) α(1−β) (1−α)β αβ

r r

a

r b

r b a r

mux b a

α β

1−α−β α β

r r

a

r b r

det b a

1 r b a

Caution: we impose α <1,β <1 in ind.

(6)

Event formalisms

x 0.7

y 0.4

r

b a

x ¬x∧y

Probability distributionon events Draw events independently

Edges annotated withformulae on the events Edges with false formulae areremoved

mie: multivalued events (see later)

cie: conjunctions of Boolean events

fie: formulae of Boolean events

(7)

Possibility problem (Poss)

Given:

aprobabilistic documentD adeterministic documentW Is Wa possible worldof D?

If yes, with whichprobability?

Diverse probabilisticformalisms, ordered and unordered Likequery evaluation but:

Needinequality: “don’t collapse nodes”

Neednegation: “no additional things”

Querydepends on inputW

Specific boundsfor this Possproblem?

(8)

Table of contents

1 Introduction

2 Known results

3 Unordered documents

4 Unambiguous labels

5 Conclusion

(9)

In NP, in FP

#P

Guess a valuation of the events Guess a match of W inD

Check that the match isrealized by the valuation

Likewise, probability computation is in FP#P

Of coursePossis NP-hard forfie

(10)

Tractable for ordered local documents

Localchoices and ordereddocuments

Possibility decisionandcomputationare in PTIME Intuitively:

match each possiblesubsequences of siblings dynamic algorithmfor match at each level

Implied bydetermininstic tree automata on probabilistic XML:

Cohen, Kimelfeld, and Sagiv 2009

Assumption of orderis crucial

(11)

Table of contents

1 Introduction

2 Known results

3 Unordered documents

4 Unambiguous labels

5 Conclusion

(12)

Computation is #P-hard for ind or mux

1 2 3

1 2 3

D W

r

a3 a2

1/2 1/2

a3 1/2

a2 a1

1/2 1/2

r

a3

a2

a1

(13)

Decision is in PTIME for ind or mux

Compute bottom-up if a node has theempty possible world Check dynamicallybetween all nodes ofD andW

Buildbipartite graphbased on child compatibility

Adddummy nodesfor deletions of nodes that can be deleted

Check in PTIME if graph has aperfect matching

D W G

r

ind c

1/2 X2

ind b

1/2 X1

ind b a 1/2 1/2

r X4

b X3

a

X3 X4 del X1 X2 c

(14)

Decision is NP-hard for any two of ind, mux , det

Withdet, reduction fromexact cover S={Si},Si={sij}

Is thereTS such that T=

Swith no dupes?

D W

S={{a,b}, {a,c}, {b}}

r

det det

det

1/2 1/2 1/2

r

c b a

(15)

Decision is NP-hard for any two of ind, mux , det (cont’d)

Withind andmux, reduction from SAT F= (a∨b∨ ¬c)∧(a∨c)∧(¬a)

D W

r

mux ind

c3 1/2 ind

c2 1/2 1/2 1/2 mux

ind ind

c1 1/2 1/2 1/2 mux

ind c3 1/2 ind

c2 c1 1/2 1/2

1/2 1/2

r

c3 c2 c1

(16)

Table of contents

1 Introduction

2 Known results

3 Unordered documents

4 Unambiguous labels

5 Conclusion

(17)

Unambiguity

D isunambiguousif node labels are unique Possible refinements (unique among siblings, etc.)

There is at most one way to matchW!

Alllocal modelstractable (can impose order)

Can we have correlations?

(18)

Still NP-hard for cie

F=∧

i

j±xij in CNF Equivalently: ∧

i¬

j∓xij

D W

r

. . . an a1

j∓x1j

j∓x1j

r

W is a possible world ofDiff F is satisfiable

Decision for PossisNP-hard

(19)

The mie class

Var Val Prob

x 1 0.6

x 2 0.2

x 3 0.1

x 4 0.1

y 1 0.5

y 2 0.5

mie: Multivalued independent events Noconjunctions allowed

Captures mux

Doesn’t capture detor indhierarchies Intractable ifambiguous

Ifnon-ambiguous, do we have tractability?

(20)

mie tractable on non-ambiguous documents

Var Val Prob

x 1 0.6

x 2 0.2

x 3 0.1

x 4 0.1

y 1 0.5

y 2 0.5

D W

r

c g

y= 2 b

f e

x= 2 y= 1 a

d

y= 2 x= 1

r b a

d

= 2,= 1,y= 2,= 1 x∈ {3,4},y∈ {2}.

(21)

Table of contents

1 Introduction

2 Known results

3 Unordered documents

4 Unambiguous labels

5 Conclusion

(22)

Introduction Known results Unordered documents Unambiguous labels Conclusion

Conclusion

Ordered local modelsare tractable Unordered local models are tractable

Fordecisiononly, and

With onlymux or onlyind

mie is tractable onunambiguousdocuments Other cases are hard

Height does not matter

Probabilities do not matter

Can we refine mie, unambiguity,mux–ind interaction?

What if Dis partially ordered?

Thanks for your attention!

(23)

Introduction Known results Unordered documents Unambiguous labels Conclusion

Conclusion

Ordered local modelsare tractable Unordered local models are tractable

Fordecisiononly, and

With onlymux or onlyind

mie is tractable onunambiguousdocuments Other cases are hard

Height does not matter

Probabilities do not matter

Can we refine mie, unambiguity,mux–ind interaction?

What if Dis partially ordered?

Thanks for your attention!

(24)

Conclusion

Ordered local modelsare tractable Unordered local models are tractable

Fordecisiononly, and

With onlymux or onlyind

mie is tractable onunambiguousdocuments Other cases are hard

Height does not matter

Probabilities do not matter

Can we refine mie, unambiguity,mux–ind interaction?

What if Dis partially ordered?

(25)

References

Cohen, Sara, Benny Kimelfeld, and Yehoshua Sagiv (2009).

“Running tree automata on probabilistic XML”. In:Proc.

PODS. ACM, pp. 227–236.

Références

Documents relatifs

Introduction Résultats connus Documents non ordonnés Non-ambiguïté Conclusion.. Déterminer la possibilité en

S’effectue avec un financement?. S’accompagne parfois d’une

⇒ We can identify examples of hard TPQJ query classes (e.g., the image of classes of non-hierarchical CQs) MSO evaluation over PrXML {ind,mux} is linear time. ⇒ Read-once

Finite model theory which is the math behind relational databases. Natural language processing for

Généralités Structure Texte, listes et tableaux Multimédia Formulaires.. Technologies du Web Master COMASIC Le

Possibilité de faire tourner des applications Java dans une zone de taille fixe dans une page HTML, depuis 1995. Bac à sable : par défaut, l’application ne peut pas faire

Historical approach: index multimedia content using text clues: surrounding text, file names, alt text, etc. → http://google.com/search?tbm=isch&amp;q=241543903 Better: index

GovUK Societal Wellbeing Deprivation imd Employment Rank La 2010. GNU Licenses