A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs


Antoine Amarilli

October 29, 2020

Télécom Paris

Table of contents

Introduction and problem statement

Existing results

Main result: Dichotomy on homomorphism-closed queries

More restricted instances: Words, trees and bounded treewidth (1 slide)

More restricted instances: Unweighted instances (1 slide)

Uncertain data management

Relational databases manage data, represented here as a labeled graph

WorksAt
Antoine          Télécom Paris
Antoine          Paris Sud
Benny            Paris Sud
Benny            Technion
İsmail           U. Oxford

MemberOf
Télécom Paris    ParisTech
Télécom Paris    IP Paris
Paris Sud        IP Paris
Paris Sud        Paris-Saclay
Technion         CESAER

[Figure: the same facts drawn as a labeled graph, with WorksAt edges from the people A., B., İ. to the institutions Télécom Paris, Paris Sud, Technion, U. Oxford, and MemberOf edges from the institutions to the consortia ParisTech, IP Paris, Paris-Saclay, CESAER]

→ Problem: we are not certain about the true state of the data


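Uncertain data model

[Figure: the graph of people, institutions and consortia, each edge annotated with its probability]

WorksAt
A.               Télécom Paris    80%
A.               Paris Sud        10%
B.               Paris Sud        40%
B.               Technion         80%
İ.               U. Oxford        100%

MemberOf
Télécom Paris    ParisTech        90%
Télécom Paris    IP Paris         90%
Paris Sud        IP Paris         50%
Paris Sud        Paris-Saclay     90%
Technion         CESAER           100%

Uncertain data model: TID, for tuple-independent database

Each fact (edge) carries a probability

Each fact exists with its given probability

All facts are independent

Possible world W: subset of facts

What is the probability of this possible world? 0.03%

$\Pr(W) = \Bigl(\prod_{F \in W} \Pr(F)\Bigr) \times \Bigl(\prod_{F \notin W} \bigl(1 - \Pr(F)\bigr)\Bigr)$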
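The possible-world formula can be checked mechanically. The sketch below is only an illustration: the `TID` dictionary and the function name `world_probability` are hypothetical, and the pairing of each edge with its probability is an assumption reconstructed from the slide's figure.

```python
from math import prod

# Facts of the TID as (relation, source, target) -> probability.
# Assumption: this edge/probability pairing is read off the slide's figure.
TID = {
    ("WorksAt", "A.", "Télécom Paris"): 0.8,
    ("WorksAt", "A.", "Paris Sud"): 0.1,
    ("WorksAt", "B.", "Paris Sud"): 0.4,
    ("WorksAt", "B.", "Technion"): 0.8,
    ("WorksAt", "İ.", "U. Oxford"): 1.0,
    ("MemberOf", "Télécom Paris", "ParisTech"): 0.9,
    ("MemberOf", "Télécom Paris", "IP Paris"): 0.9,
    ("MemberOf", "Paris Sud", "IP Paris"): 0.5,
    ("MemberOf", "Paris Sud", "Paris-Saclay"): 0.9,
    ("MemberOf", "Technion", "CESAER"): 1.0,
}


def world_probability(tid, world):
    """Pr(W): product of Pr(F) over kept facts times product of 1 - Pr(F) over dropped facts."""
    kept = prod(p for fact, p in tid.items() if fact in world)
    dropped = prod(1 - p for fact, p in tid.items() if fact not in world)
    return kept * dropped


# The world that keeps every fact: the second product is empty, so only the first matters.
all_facts = set(TID)
print(world_probability(TID, all_facts))
```

Under these assumed probabilities, the world that keeps every fact has probability 0.8 × 0.1 × 0.4 × 0.8 × 1 × 0.9 × 0.9 × 0.5 × 0.9 × 1, about 0.93%, while any world that drops a 100% fact has probability 0.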


Queries

[Figure: the probabilistic graph of the TID, with the edge probabilities 80%, 10%, 40%, 80%, 100% on WorksAt and 90%, 90%, 50%, 90%, 100% on MemberOf]

Central database task: evaluate queries

“Is there some person x employed in an institution that is part of a consortium z?”

Q(x,z): ∃y  x -[WorksAt]-> y -[MemberOf]-> z

Result on this graph:

x      z               probability
A.     ParisTech       72%
A.     IP Paris        73.4%
A.     Paris-Saclay    9%
B.     IP Paris        20%
B.     Paris-Saclay    36%
B.     CESAER          80%
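The result table can be recomputed with a few lines of code. This is a sketch under assumptions: the dictionaries `WORKS_AT` and `MEMBER_OF` and the function `answer_probabilities` are hypothetical names, and the edge probabilities are the ones assumed to be shown in the figure. For a fixed pair (x, z), the paths through different institutions y use disjoint facts, so they are independent and the probability is 1 minus the product of the per-path failure probabilities.

```python
# Assumption: edge probabilities as read off the slide's figure.
WORKS_AT = {
    ("A.", "Télécom Paris"): 0.8,
    ("A.", "Paris Sud"): 0.1,
    ("B.", "Paris Sud"): 0.4,
    ("B.", "Technion"): 0.8,
    ("İ.", "U. Oxford"): 1.0,
}
MEMBER_OF = {
    ("Télécom Paris", "ParisTech"): 0.9,
    ("Télécom Paris", "IP Paris"): 0.9,
    ("Paris Sud", "IP Paris"): 0.5,
    ("Paris Sud", "Paris-Saclay"): 0.9,
    ("Technion", "CESAER"): 1.0,
}


def answer_probabilities(works_at, member_of):
    """Probability of each answer (x, z) of Q(x,z): exists y with x -WorksAt-> y -MemberOf-> z."""
    no_match = {}
    for (x, y), p in works_at.items():
        for (y2, z), q in member_of.items():
            if y == y2:
                # The path through y exists with probability p * q; paths through
                # different y share no fact, so their failures multiply.
                no_match[(x, z)] = no_match.get((x, z), 1.0) * (1 - p * q)
    return {pair: 1 - miss for pair, miss in no_match.items()}


for (x, z), pr in sorted(answer_probabilities(WORKS_AT, MEMBER_OF).items()):
    print(f"{x:4} {z:14} {pr:.1%}")
```

For instance, the pair (B., Paris-Saclay) has a single path, through Paris Sud, so its probability is 40% × 90% = 36%. Pairs with no connecting path, such as any pair involving İ., do not appear in the result at all.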


Restricting to YES/NO queries

To make the problem simpler to study, we will restrict to YES/NO queries:

Query: maps a graph to YES/NO

Why can we get away with that?

Consider a query: Q(x,z): ∃y  x -[WorksAt]-> y -[MemberOf]-> z

Consider each possible choice of (x,z), e.g., (A., CESAER)

The query Q(A., CESAER) is a YES/NO query:

Q(A., CESAER): ∃y  A. -[WorksAt]-> y -[MemberOf]-> CESAER

The number of choices for (x,z) is polynomial in the input graph

→ From now on, all queries are YES/NO queries, so we have just one YES/NO answer to compute, or just one probability


Query languages

Which kinds of queries do we want to express?

Conjunctive query (CQ): can I find a match of a pattern?

• e.g., ∃x y z  x → y → z

We want a homomorphism from the pattern to the graph (not necessarily injective)

Formally: an existentially quantified conjunction of atoms (edges)

Union of conjunctive queries (UCQ): can I find a match of some pattern?

• e.g., (∃x y z  x → y → z) ∨ (∃x y z w  [graph pattern over x, y, z, w])

Formally: a finite disjunction of CQs

Regular path queries (RPQ): can I find a match of a regular path?

• e.g., ∃x y  x -[regular expression]-> y
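To make the homomorphism semantics of CQs and UCQs concrete, here is a minimal brute-force sketch. The function names `has_homomorphism` and `satisfies_ucq`, the edge label `"E"` and the pattern `PATH` are hypothetical and chosen only for illustration. The code tries every assignment of pattern variables to graph nodes, which is exponential in the size of the pattern but polynomial in the graph for a fixed pattern.

```python
from itertools import product


def has_homomorphism(pattern, graph):
    """True if some map from pattern variables to graph nodes sends every pattern edge to a graph edge.

    Both arguments are collections of (label, source, target) edges.
    The map need not be injective: two variables may land on the same node.
    """
    variables = sorted({v for _, s, t in pattern for v in (s, t)})
    nodes = sorted({n for _, s, t in graph for n in (s, t)})
    for image in product(nodes, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((label, h[s], h[t]) in graph for label, s, t in pattern):
            return True
    return False


def satisfies_ucq(disjuncts, graph):
    """A UCQ holds if at least one of its CQs has a match."""
    return any(has_homomorphism(cq, graph) for cq in disjuncts)


# A path of two E-edges over the variables x, y, z.
PATH = [("E", "x", "y"), ("E", "y", "z")]

print(has_homomorphism(PATH, {("E", "a", "a")}))  # a single self-loop suffices
print(has_homomorphism(PATH, {("E", "a", "b")}))  # one plain edge does not
```

The first call succeeds because x, y and z can all be mapped to the node a, which is exactly the non-injective case mentioned on the slide.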


Problem statement: Probabilistic query evaluation (PQE)

We fix a query Q, for instance the CQ: ∃x y z  x -[WorksAt]-> y -[MemberOf]-> z

The input is a TID D:

[Figure: the probabilistic graph of the TID, with the edge probabilities 80%, 10%, 40%, 80%, 100% on WorksAt and 90%, 90%, 50%, 90%, 100% on MemberOf]

The output is the total probability of the worlds which satisfy the query:

• Formally: $\sum_{W \subseteq D,\ W \models Q} \Pr(W)$

Intuition: the probability that the query is true

We can always compute the probability in exponential time (go over all possibilities)
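The exponential-time baseline can be sketched directly from the definition: enumerate all subsets of facts, and add up the probability of those that satisfy the query. The names `TID`, `pqe_brute_force` and `two_step_path` are hypothetical, and the edge probabilities are assumed from the figure. With 10 facts this visits only 1024 worlds, but the loop doubles with every additional fact.

```python
from itertools import product

# Assumption: facts and probabilities as read off the slide's figure.
TID = {
    ("WorksAt", "A.", "Télécom Paris"): 0.8,
    ("WorksAt", "A.", "Paris Sud"): 0.1,
    ("WorksAt", "B.", "Paris Sud"): 0.4,
    ("WorksAt", "B.", "Technion"): 0.8,
    ("WorksAt", "İ.", "U. Oxford"): 1.0,
    ("MemberOf", "Télécom Paris", "ParisTech"): 0.9,
    ("MemberOf", "Télécom Paris", "IP Paris"): 0.9,
    ("MemberOf", "Paris Sud", "IP Paris"): 0.5,
    ("MemberOf", "Paris Sud", "Paris-Saclay"): 0.9,
    ("MemberOf", "Technion", "CESAER"): 1.0,
}


def pqe_brute_force(tid, query):
    """Sum Pr(W) over all worlds W (subsets of the facts) on which the Boolean query holds."""
    facts = list(tid)
    total = 0.0
    for keep in product([False, True], repeat=len(facts)):
        pr = 1.0
        for fact, kept in zip(facts, keep):
            pr *= tid[fact] if kept else 1 - tid[fact]
        if pr == 0.0:
            continue  # impossible world, contributes nothing
        world = {fact for fact, kept in zip(facts, keep) if kept}
        if query(world):
            total += pr
    return total


def two_step_path(world):
    """The CQ: exists x, y, z with x -WorksAt-> y -MemberOf-> z."""
    return any(
        r1 == "WorksAt" and r2 == "MemberOf" and y1 == y2
        for (r1, _, y1) in world
        for (r2, y2, _) in world
    )


print(pqe_brute_force(TID, two_step_path))
```

Any Boolean query given as a Python predicate on worlds can be plugged in for `query`, which is what makes this a universal, if impractical, baseline.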


PQE: a simple example

[Figure: the probabilistic graph of the TID, with the WorksAt edges (80%, 10%, 40%, 80%, 100%) drawn in red]

Find the probability of: ∃x y  x -[WorksAt]-> y

It's easier to compute the probability x that there is no match of the query

The probability we want is 1 − x

There is no match of the query iff no red edge is kept

These choices are independent, so x is: (1−80%)×(1−10%)×(1−40%)×(1−80%)×(1−100%)

This gives x = 0%, so the query has probability 100%

This process is in polynomial time
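The computation on this slide is a one-liner once the edge probabilities are listed. The sketch below uses a hypothetical function name, `prob_some_edge`, and takes the five WorksAt probabilities from the slide.

```python
from math import prod


def prob_some_edge(probabilities):
    """Probability that at least one of several independent edges is kept."""
    # The query fails only if every edge is dropped; the drops are independent.
    return 1 - prod(1 - p for p in probabilities)


# Probabilities of the five WorksAt (red) edges from the slide.
works_at = [0.8, 0.1, 0.4, 0.8, 1.0]
print(prob_some_edge(works_at))  # 1.0, because one edge is certain
```

The running time is linear in the number of edges, which is the polynomial-time behavior the slide points out.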


PQE: a more complicated example

[Figure: the probabilistic graph of the TID, with the edge probabilities 80%, 10%, 40%, 80%, 100% on WorksAt and 90%, 90%, 50%, 90%, 100% on MemberOf]

How to compute the probability of the query from the problem statement?

∃x y z  x -[WorksAt]-> y -[MemberOf]-> z

Key insight: consider all possible choices for the middle variable y

1 − (1−80%) × (1 − (1 − (1−10%)×(1−40%)) × (1 − (1−50%)×(1−90%))) × (1 − 80%×(1 − (1−90%)×(1−90%))), i.e., 97.65792%

This is scary but polynomial time
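The decomposition by the middle variable y can be written as a short loop. This is a sketch under assumptions: `WORKS_AT`, `MEMBER_OF` and `pqe_two_step` are hypothetical names, and the edge probabilities are the ones assumed from the figure. For each institution y, the query matches through y iff some WorksAt edge enters y and some MemberOf edge leaves y; different choices of y involve disjoint facts, so the per-institution failures multiply.

```python
from math import prod

# Assumption: edge probabilities as read off the slide's figure.
WORKS_AT = {
    ("A.", "Télécom Paris"): 0.8,
    ("A.", "Paris Sud"): 0.1,
    ("B.", "Paris Sud"): 0.4,
    ("B.", "Technion"): 0.8,
    ("İ.", "U. Oxford"): 1.0,
}
MEMBER_OF = {
    ("Télécom Paris", "ParisTech"): 0.9,
    ("Télécom Paris", "IP Paris"): 0.9,
    ("Paris Sud", "IP Paris"): 0.5,
    ("Paris Sud", "Paris-Saclay"): 0.9,
    ("Technion", "CESAER"): 1.0,
}


def pqe_two_step(works_at, member_of):
    """Probability of: exists x, y, z with x -WorksAt-> y -MemberOf-> z."""
    middles = {y for _, y in works_at} & {y for y, _ in member_of}
    no_match = 1.0
    for y in middles:
        # Some WorksAt edge into y is kept.
        some_in = 1 - prod(1 - p for (_, y1), p in works_at.items() if y1 == y)
        # Some MemberOf edge out of y is kept.
        some_out = 1 - prod(1 - q for (y2, _), q in member_of.items() if y2 == y)
        # The two events concern different facts, hence are independent.
        no_match *= 1 - some_in * some_out
    return 1 - no_match


print(f"{pqe_two_step(WORKS_AT, MEMBER_OF):.7%}")
```

The three factors of the loop correspond to the three factors in the slide's formula, for Technion, Paris Sud and Télécom Paris; U. Oxford contributes nothing because it has no outgoing MemberOf edge.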


Research goal: Understanding the complexity of PQE

What is the complexity of PQE(Q) depending on the query Q?

→ Note that we study data complexity, i.e., Q is fixed and the input is the data

My work: several dichotomies on the PQE problem:

Existing results on UCQs:

PQE(Q) is in #P for any UCQ Q and is #P-hard for some CQs

Dichotomy by Dalvi and Suciu: PQE(Q) for a UCQ Q is either #P-hard or in PTIME

This talk: dichotomy on homomorphism-closed queries

PQE(Q) is #P-hard for all homomorphism-closed queries not equivalent to a safe UCQ

I'll also mention some of my work on restricted graph classes


Table of contents

Introduction and problem statement

Existing results

Main result: Dichotomy on homomorphism-closed queries

More restricted instances: Words, trees and bounded treewidth (1 slide)

More restricted instances: Unweighted instances (1 slide)

(69)

Basic complexity results

Whenever we can evaluate Q in PTIME, then PQE(Q) is in #P

#P: class of counting problems expressible as the number of accepting paths of a nondeterministic polynomial-time Turing machine

Nondeterministically guess a possible world, then test the query

In particular, PQE(Q) is in #P for any UCQ Q

For some queries Q, the task PQE(Q) is in PTIME

e.g., ∃x y x y or ∃x y z x y z (the edges between the variables were drawn on the slide)
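The "guess a possible world, then test the query" argument can be made concrete with a brute-force evaluator. This is only a minimal sketch, not the method of the talk: it assumes a probabilistic graph given as a dictionary from edges to independent probabilities, and a Boolean query given as a Python predicate on a set of edges. The names `pqe_brute_force`, `has_some_edge` and `graph` are hypothetical. The running time is exponential in the number of edges, which is exactly why the tractable cases are interesting.

```python
from fractions import Fraction
from itertools import product


def pqe_brute_force(prob_edges, query):
    """Sum the probabilities of the possible worlds on which `query` holds.

    prob_edges: dict mapping an edge (u, v) to its probability (assumed
    independent across edges); query: predicate on a set of edges.
    """
    edges = list(prob_edges)
    total = Fraction(0)
    # Each tuple `keep` describes one possible world: which edges are kept.
    for keep in product([False, True], repeat=len(edges)):
        world = {e for e, k in zip(edges, keep) if k}
        weight = Fraction(1)
        for e, k in zip(edges, keep):
            p = prob_edges[e]
            weight *= p if k else 1 - p
        if query(world):
            total += weight
    return total


def has_some_edge(world):
    """Hypothetical query: the world contains at least one edge."""
    return len(world) > 0


graph = {("a", "b"): Fraction(1, 2), ("b", "c"): Fraction(1, 3)}
print(pqe_brute_force(graph, has_some_edge))  # 1 - (1/2)(2/3) = 2/3
```

Exact rationals (`Fraction`) are used so that the result is not affected by floating-point rounding.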

(73)

PQE is sometimes #P-hard

Let us show that PQE(Q) is #P-hard for the CQ Q:

x y z w (query drawn as a graph on the slide)

Reduce from the problem of counting satisfying valuations of a Boolean formula

• e.g., given (x ∨ y) ∧ z, compute that it has 3 satisfying valuations

This problem is already #P-hard for so-called PP2DNF formulas:

Positive (no negation) and Partitioned variables: X1, . . . , Xn and Y1, . . . , Ym

2-DNF: disjunction of clauses like Xi ∧ Yj

Example: φ: (X1 ∧ Y1) ∨ (X1 ∧ Y2) ∨ (X2 ∧ Y2) ∨ (X3 ∧ Y1) ∨ (X3 ∧ Y2)

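[Figure: probabilistic graph built from φ, with vertices a′1, a′2, a′3, a1, a2, a3, b1, b2, b′1, b′2. The edges between a′i and ai and between bj and b′j have probability 1/2 (one per variable); the edges between ai and bj have probability 1 (one per clause).]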

Idea: Satisfying valuations of φ correspond to possible worlds with a match of Q
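The correspondence can be checked by brute force on the example formula. The sketch below rests on assumptions that the extracted slide does not confirm: the query is taken to be a directed path of three unlabeled edges, and the edges are oriented a′i → ai → bj → b′j. All function names are hypothetical. Under these assumptions, a world contains a match exactly when some clause Xi ∧ Yj has both of its variable edges present, so the probability of Q multiplied by 2^(n+m) should equal the number of satisfying valuations.

```python
from fractions import Fraction
from itertools import product

# Clauses (i, j) of the example formula, each standing for X_i AND Y_j.
CLAUSES = [(1, 1), (1, 2), (2, 2), (3, 1), (3, 2)]
N, M = 3, 2  # number of X variables and of Y variables


def count_satisfying(clauses, n, m):
    """Count the valuations of X_1..X_n, Y_1..Y_m satisfying the PP2DNF."""
    count = 0
    for bits in product([False, True], repeat=n + m):
        xs, ys = bits[:n], bits[n:]
        if any(xs[i - 1] and ys[j - 1] for i, j in clauses):
            count += 1
    return count


def build_graph(clauses, n, m):
    """Assumed encoding: variable edges have probability 1/2, clause edges 1."""
    graph = {}
    for i in range(1, n + 1):
        graph[(f"a'{i}", f"a{i}")] = Fraction(1, 2)
    for j in range(1, m + 1):
        graph[(f"b{j}", f"b'{j}")] = Fraction(1, 2)
    for i, j in clauses:
        graph[(f"a{i}", f"b{j}")] = Fraction(1)
    return graph


def has_three_edge_path(world):
    """Assumed query: some directed path u -> v -> w -> t exists in the world."""
    for (_, v) in world:
        for (v2, w) in world:
            if v2 == v and any(w2 == w for (w2, _) in world):
                return True
    return False


def probability(graph, query):
    """Total probability of the possible worlds that satisfy the query."""
    edges = list(graph)
    total = Fraction(0)
    for keep in product([False, True], repeat=len(edges)):
        weight = Fraction(1)
        for e, k in zip(edges, keep):
            weight *= graph[e] if k else 1 - graph[e]
        # Worlds of probability 0 (a certain edge is missing) are skipped.
        if weight and query({e for e, k in zip(edges, keep) if k}):
            total += weight
    return total


sat_count = count_satisfying(CLAUSES, N, M)
prob = probability(build_graph(CLAUSES, N, M), has_three_edge_path)
print(sat_count, prob, prob * 2 ** (N + M) == sat_count)
```

Because every valuation has probability 1/2^(n+m), recovering the count from the probability only needs one multiplication, which is what makes this a reduction from counting to PQE.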
