When Can We Answer Queries
Using Result-Bounded Data Interfaces?
Antoine Amarilli1, Michael Benedikt2 December 10th, 2018
1Télécom ParisTech
2Oxford University
Problem: Answering Queries Using Web Services
Directory service DBLP service
Query
Findresearchers from adepartment
Findpapers from aresearcher
Find allpapers written byresearchers from mydepartment?
•
We have severalWeb servicesthat expose data•
We want to answer aqueryusing the Web services→ How can werephrasethe query against the Web services?
2/16
Problem: Answering Queries Using Web Services
Directory service DBLP service Query
?
Findresearchers from adepartment
Findpapers from aresearcher
Find allpapers written byresearchers from mydepartment?
•
We have severalWeb servicesthat expose data•
We want to answer aqueryusing the Web services→ How can werephrasethe query against the Web services?
Problem: Answering Queries Using Web Services
Directory service DBLP service Query
?
Findresearchers from adepartment
Findpapers from aresearcher
Find allpapers written byresearchers from mydepartment?
•
We have severalWeb servicesthat expose data•
We want to answer aqueryusing the Web services→ How can werephrasethe query against the Web services?
2/16
Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? Michael Benedikt OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department? Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLP Σ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y)Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? Michael Benedikt OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department? Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLP Σ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y)3/16
Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department? Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLP Σ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y)Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? Michael Benedikt OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department? Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLP Σ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y)3/16
Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? Michael Benedikt OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department? Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLP Σ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y)Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? Michael Benedikt OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department?Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)
Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLP Σ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y)3/16
Formalizing the Problem
Service schema:
DBLP(author,title,year)
Input author? Michael Benedikt OK
author title year
Michael Benedikt Goal-Driven Query Answering ... 2018 Michael Benedikt Form Filling Based on ... 2018 Michael Benedikt How Can Reasoners Simplify ... 2018 Michael Benedikt When Can We Answer Queries ... 2018
. . . . . . . . .
•
Model each service as arelation•
Some attributes areinputs•
Access the service by giving abindingfor the input attributes•
The service returnsall tuples that match thebindingQuery:conjunctive query over the relations
?
Find all papers written by researchers from my department?Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)
Constraints:express logical relationships between the services
!
Every researcher from the directory is in DBLPΣ :∀d aDirectory(d,a)→ ∃t y DBLP(a,t,y) 3/16
Existing Solutions
Given theservice schemaS, thequeryQand theconstraintsΣ we want to find amonotone planforQusing the services ofS
Example:Find all papers written by researchers from my department? withDirectory(department,person) andDBLP(author,title,year) What is amonotone plan?
•
Access theservicesby givingbindings•
Evaluate monotonerelational algebra•
The plan iscorrectif it returnsQ(D) on any databaseDthat satisfiesΣExample:
T1⇐Directory⇐MyDept; T2⇐DBLP⇐πperson(T1); T3⇐πtitle(T2);
ReturnT3
Extensive literatureabout how to reformulate queries to monotone plans and about thecomplexity
depending on the constraint language
4/16
Existing Solutions
Given theservice schemaS, thequeryQand theconstraintsΣ we want to find amonotone planforQusing the services ofS
Example:Find all papers written by researchers from my department?
withDirectory(department,person) andDBLP(author,title,year)
What is amonotone plan?
•
Access theservicesby givingbindings•
Evaluate monotonerelational algebra•
The plan iscorrectif it returnsQ(D) on any databaseDthat satisfiesΣExample:
T1⇐Directory⇐MyDept; T2⇐DBLP⇐πperson(T1); T3⇐πtitle(T2);
ReturnT3
Extensive literatureabout how to reformulate queries to monotone plans and about thecomplexity
depending on the constraint language
Existing Solutions
Given theservice schemaS, thequeryQand theconstraintsΣ we want to find amonotone planforQusing the services ofS
Example:Find all papers written by researchers from my department?
withDirectory(department,person) andDBLP(author,title,year) What is amonotone plan?
•
Access theservicesby givingbindings•
Evaluate monotonerelational algebra•
The plan iscorrectif it returnsQ(D) on any databaseDthat satisfiesΣExample:
T1⇐Directory⇐MyDept;
T2⇐DBLP⇐πperson(T1); T3⇐πtitle(T2);
ReturnT3
Extensive literatureabout how to reformulate queries to monotone plans and about thecomplexity
depending on the constraint language
4/16
Existing Solutions
Given theservice schemaS, thequeryQand theconstraintsΣ we want to find amonotone planforQusing the services ofS
Example:Find all papers written by researchers from my department?
withDirectory(department,person) andDBLP(author,title,year) What is amonotone plan?
•
Access theservicesby givingbindings•
Evaluate monotonerelational algebra•
The plan iscorrectif it returnsQ(D) on any databaseDthat satisfiesΣExample:
T1⇐Directory⇐MyDept;
T2⇐DBLP⇐πperson(T1); T3⇐πtitle(T2);
ReturnT3
Extensive literatureabout how to reformulate queries to monotone plans and about thecomplexity
depending on the constraint language
New Challenge: Result Bounds
•
Real Web services do not returnall matching tuplesfor accesses!→ Plan results arenondeterministicand may not becorrect! Formalization:DBLP(author,title,year) has aresult boundof 1000
•
If an access matches≤1000tuples then they areallreturned•
If it matches>1000tuples then we get1000of them (random)→How to reformulate queries using result-bounded services?
5/16
New Challenge: Result Bounds
•
Real Web services do not returnall matching tuplesfor accesses!→ Plan results arenondeterministicand may not becorrect! Formalization:DBLP(author,title,year) has aresult boundof 1000
•
If an access matches≤1000tuples then they areallreturned•
If it matches>1000tuples then we get1000of them (random)→How to reformulate queries using result-bounded services?
New Challenge: Result Bounds
•
Real Web services do not returnall matching tuplesfor accesses!→ Plan results arenondeterministicand may not becorrect!
Formalization:DBLP(author,title,year) has aresult boundof 1000
•
If an access matches≤1000tuples then they areallreturned•
If it matches>1000tuples then we get1000of them (random)→How to reformulate queries using result-bounded services?
5/16
New Challenge: Result Bounds
•
Real Web services do not returnall matching tuplesfor accesses!→ Plan results arenondeterministicand may not becorrect! Formalization:DBLP(author,title,year) has aresult boundof 1000
•
If an access matches≤1000tuples then they areallreturned•
If it matches>1000tuples then we get1000of them (random)→How to reformulate queries using result-bounded services?
Formal Problem Statement Input:
•
Service schemaSof relation names and attributes withinput attributesand optionally aresult bound→ Directory(department,person)
→ DBLP(author,title,year) with bound 1000
•
Conjunctive queryQ→ Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)
•
ConstraintsΣ→ ∀d aDirectory(d,a)→ ∃t yDBLP(a,t,y)
Output:
•
Is there a monotone plan forQonSunderΣ? We study:→ What is thecomplexityof deciding plan existence, depending on the constraint language?
→ Inwhich waysare result-bounded services useful?
6/16
Formal Problem Statement Input:
•
Service schemaSof relation names and attributes withinput attributesand optionally aresult bound→ Directory(department,person)
→ DBLP(author,title,year) with bound 1000
•
Conjunctive queryQ→ Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)
•
ConstraintsΣ→ ∀d aDirectory(d,a)→ ∃t yDBLP(a,t,y) Output:
•
Is there a monotone plan forQonSunderΣ?We study:
→ What is thecomplexityof deciding plan existence, depending on the constraint language?
→ Inwhich waysare result-bounded services useful?
Formal Problem Statement Input:
•
Service schemaSof relation names and attributes withinput attributesand optionally aresult bound→ Directory(department,person)
→ DBLP(author,title,year) with bound 1000
•
Conjunctive queryQ→ Q(t) :∃a yDirectory(MyDept,a)∧DBLP(a,t,y)
•
ConstraintsΣ→ ∀d aDirectory(d,a)→ ∃t yDBLP(a,t,y) Output:
•
Is there a monotone plan forQonSunderΣ? We study:→ What is thecomplexityof deciding plan existence, depending on the constraint language?
→ Inwhich waysare result-bounded services useful? 6/16
Summary of Results
→ We showschema simplificationresults that describe when result bounds on services can beremoved
→ We showcomplexity resultsfor many constraint languages on deciding the existence of monotone plans
Fragment Simplification Complexity
Inclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→Let’s see theschema simplification resultsand proof techniques
Summary of Results
→ We showschema simplificationresults that describe when result bounds on services can beremoved
→ We showcomplexity resultsfor many constraint languages on deciding the existence of monotone plans
Fragment Simplification Complexity
Inclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→Let’s see theschema simplification resultsand proof techniques
7/16
Summary of Results
→ We showschema simplificationresults that describe when result bounds on services can beremoved
→ We showcomplexity resultsfor many constraint languages on deciding the existence of monotone plans
Fragment Simplification Complexity
Inclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→Let’s see theschema simplification resultsand proof techniques
Summary of Results
→ We showschema simplificationresults that describe when result bounds on services can beremoved
→ We showcomplexity resultsfor many constraint languages on deciding the existence of monotone plans
Fragment Simplification Complexity
Inclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→Let’s see theschema simplification resultsand proof techniques7/16
Existence-Check Simplification
Idea:use result-bounded services tocheck the existenceof tuples
•
Schema: DBLP(author,title,year) with bound 1000•
QueryQ:Has Michael Benedikt published something?•
Plan:accessDBLPwith “Michael Benedikt” and check if emptyA schemaSwith constraintsΣisexistence-check simplifiableif any query that has a plan onSunderΣstill has a plan
on theexistence-check approximation:
•
For each relationDBLP(author,title,year) with a result bound, create a new relationDBLPcheck(author)•
Add two IDs inΣto relateDBLPcheck andDBLP:∀aDBLPcheck(a)↔ ∃t yDBLP(a,t,y)
•
Forbid direct accesses toDBLP(so the result bound is irrelevant)Existence-Check Simplification
Idea:use result-bounded services tocheck the existenceof tuples
•
Schema: DBLP(author,title,year) with bound 1000•
QueryQ:Has Michael Benedikt published something?•
Plan:accessDBLPwith “Michael Benedikt” and check if empty A schemaSwith constraintsΣisexistence-check simplifiableif any query that has a plan onSunderΣstill has a planon theexistence-check approximation:
•
For each relationDBLP(author,title,year) with a result bound, create a new relationDBLPcheck(author)•
Add two IDs inΣto relateDBLPcheck andDBLP:∀aDBLPcheck(a)↔ ∃t yDBLP(a,t,y)
•
Forbid direct accesses toDBLP(so the result bound is irrelevant)8/16
Existence-Check Simplification
Idea:use result-bounded services tocheck the existenceof tuples
•
Schema: DBLP(author,title,year) with bound 1000•
QueryQ:Has Michael Benedikt published something?•
Plan:accessDBLPwith “Michael Benedikt” and check if empty A schemaSwith constraintsΣisexistence-check simplifiableif any query that has a plan onSunderΣstill has a planon theexistence-check approximation:
•
For each relationDBLP(author,title,year) with a result bound, create a new relationDBLPcheck(author)•
Add two IDs inΣto relateDBLPcheck andDBLP:∀aDBLPcheck(a)↔ ∃t yDBLP(a,t,y)
•
Forbid direct accesses toDBLP(so the result bound is irrelevant)Existence-Check Simplification
Idea:use result-bounded services tocheck the existenceof tuples
•
Schema: DBLP(author,title,year) with bound 1000•
QueryQ:Has Michael Benedikt published something?•
Plan:accessDBLPwith “Michael Benedikt” and check if empty A schemaSwith constraintsΣisexistence-check simplifiableif any query that has a plan onSunderΣstill has a planon theexistence-check approximation:
•
For each relationDBLP(author,title,year) with a result bound, create a new relationDBLPcheck(author)•
Add two IDs inΣto relateDBLPcheck andDBLP:∀aDBLPcheck(a)↔ ∃t yDBLP(a,t,y)
•
Forbid direct accesses toDBLP(so the result bound is irrelevant)8/16
Existence-Check Simplification
Idea:use result-bounded services tocheck the existenceof tuples
•
Schema: DBLP(author,title,year) with bound 1000•
QueryQ:Has Michael Benedikt published something?•
Plan:accessDBLPwith “Michael Benedikt” and check if empty A schemaSwith constraintsΣisexistence-check simplifiableif any query that has a plan onSunderΣstill has a planon theexistence-check approximation:
•
For each relationDBLP(author,title,year) with a result bound, create a new relationDBLPcheck(author)•
Add two IDs inΣto relateDBLPcheck andDBLP:∀aDBLPcheck(a)↔ ∃t yDBLP(a,t,y)
•
Forbid direct accesses toDBLP(so the result bound is irrelevant)Existence-Check Simplification Results
Theorem
Any schemaSwith constraintsΣinInclusion dependencies(IDs) isexistence-check simplifiable
→Under IDs, result-bounded services only serve asexistence checks
As the existence-check approximation hasno result bounds, we can reduce to the classical setting and deduce:
Corollary
For any schemaSwith result bounds,IDconstraintsΣ, and queryQ, deciding the existence of a monotone plan isEXPTIME-complete
9/16
Existence-Check Simplification Results
Theorem
Any schemaSwith constraintsΣinInclusion dependencies(IDs) isexistence-check simplifiable
→Under IDs, result-bounded services only serve asexistence checks As the existence-check approximation hasno result bounds,
we can reduce to the classical setting and deduce:
Corollary
For any schemaSwith result bounds,IDconstraintsΣ, and queryQ, deciding the existence of a monotone plan isEXPTIME-complete
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan: accessDir2, return theaddressof any obtained tuple We callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
10/16
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan: accessDir2, return theaddressof any obtained tuple We callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan: accessDir2, return theaddressof any obtained tuple We callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
10/16
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan:accessDir2, return theaddressof any obtained tupleWe callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:
•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan:accessDir2, return theaddressof any obtained tuple We callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
10/16
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan:accessDir2, return theaddressof any obtained tuple We callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
FD Simplification
Result-bounded services cando morethan existence checks:
•
SchemaS: directory that returnsaddressesandphone numbers→ Dir2(name,address,phone) with bound 1000
•
Constraints:aFunctional dependency(FD)φ:name→address→ Each person has at most one address
•
Query:Find the address of “Michael Benedikt”•
Plan:accessDir2, return theaddressof any obtained tuple We callSandΣFD-simplifiableif any query that has a plan onSandΣstill has one on theFD approximation:•
For each relationDir2(name,address,phone) with result bound, create a new relationDir2FD(name,address) that outputsthe attributesdeterminedinΣby the input attributes
•
Forbid accesses onDir2and add IDs withDir2FDlike before∀n aDir2FD(n,a)↔ ∃pDir2(n,a,p)
10/16
FD Simplification Results
Theorem
Any schemaSwith constraintsΣinFunctional dependencies(FDs) isFD simplifiable
→ Under FDs, result-bounded services are only useful to access outputs that arefunctionally determined (i.e., we are guaranteed to have only one result)
Again, there areno result boundsleft in the FD approximation, so we can use this result to show:
Corollary
For any schemaSwith result bounds,FDconstraintsΣ, and queryQ, deciding the existence of a monotone plan isNP-complete
FD Simplification Results
Theorem
Any schemaSwith constraintsΣinFunctional dependencies(FDs) isFD simplifiable
→ Under FDs, result-bounded services are only useful to access outputs that arefunctionally determined (i.e., we are guaranteed to have only one result)
Again, there areno result boundsleft in the FD approximation, so we can use this result to show:
Corollary
For any schemaSwith result bounds,FDconstraintsΣ, and queryQ, deciding the existence of a monotone plan isNP-complete
11/16
Choice Simplification
With expressive constraints, theFD approximationisnot enough:
Lemma
There is a service schemaS, queryQ, andTGDsΣsuch that Qisnot FD-simplifiable(hence, notexistence-check-simplifiable)
A less drastic simplification is thechoice simplification:
•
For every service with a result bound, change the bound to be1→Intuition:It’s important to getsome tupleif one exists
Choice Simplification
With expressive constraints, theFD approximationisnot enough:
Lemma
There is a service schemaS, queryQ, andTGDsΣsuch that Qisnot FD-simplifiable(hence, notexistence-check-simplifiable) A less drastic simplification is thechoice simplification:
•
For every service with a result bound, change the bound to be1→Intuition:It’s important to getsome tupleif one exists
12/16
Choice Simplification Results
Theorem
Any schemaSwith constraintsΣinequality-free first-order logic (e.g., TGDs) ischoice simplifiable
Thus, plan existence is decidable fordecidable FO fragments, e.g., frontier-guarded TGDs(FGTGDs):
Corollary
For any schemaSwith result bounds, queryQ, andFGTGDsΣ deciding the existence of a monotone plan is2EXPTIME-complete We can also show choice approximability for another fragment:
Theorem
Any schemaSwith constraintsΣthat areFDsandunary IDs(UIDs) ischoice simplifiable
→This implies that plan existence isdecidablefor FDs and UIDs
Choice Simplification Results
Theorem
Any schemaSwith constraintsΣinequality-free first-order logic (e.g., TGDs) ischoice simplifiable
Thus, plan existence is decidable fordecidable FO fragments, e.g., frontier-guarded TGDs(FGTGDs):
Corollary
For any schemaSwith result bounds, queryQ, andFGTGDsΣ deciding the existence of a monotone plan is2EXPTIME-complete
We can also show choice approximability for another fragment: Theorem
Any schemaSwith constraintsΣthat areFDsandunary IDs(UIDs) ischoice simplifiable
→This implies that plan existence isdecidablefor FDs and UIDs
13/16
Choice Simplification Results
Theorem
Any schemaSwith constraintsΣinequality-free first-order logic (e.g., TGDs) ischoice simplifiable
Thus, plan existence is decidable fordecidable FO fragments, e.g., frontier-guarded TGDs(FGTGDs):
Corollary
For any schemaSwith result bounds, queryQ, andFGTGDsΣ deciding the existence of a monotone plan is2EXPTIME-complete We can also show choice approximability for another fragment:
Theorem
Any schemaSwith constraintsΣthat areFDsandunary IDs(UIDs) ischoice simplifiable
→This implies that plan existence isdecidablefor FDs and UIDs
Overview of Proof Techniques
•
Show that result bounds can be axiomatized insimpler ways:• Ensure that doing thesame accesstwice returns thesame result
• Only write thelower bound:“ifiresults exist theniare returned”
•
Show that plan existence can be rephrased asanswerability:→ If a databaseIsatisfiesQandI0 hasmore accessible datathanI thenI0should satisfyQas well
•
Show simplification results using ablowup technique:• Start with acounterexample to answerabilityon the simplification
• Blow it upto a counterexample on the original schema
•
Reduce toquery containment under constraints→ Study the result of the translation to show complexity bounds
14/16
Overview of Proof Techniques
•
Show that result bounds can be axiomatized insimpler ways:• Ensure that doing thesame accesstwice returns thesame result
• Only write thelower bound:“ifiresults exist theniare returned”
•
Show that plan existence can be rephrased asanswerability:→ If a databaseIsatisfiesQandI0 hasmore accessible datathanI thenI0 should satisfyQas well
•
Show simplification results using ablowup technique:• Start with acounterexample to answerabilityon the simplification
• Blow it upto a counterexample on the original schema
•
Reduce toquery containment under constraints→ Study the result of the translation to show complexity bounds
Overview of Proof Techniques
•
Show that result bounds can be axiomatized insimpler ways:• Ensure that doing thesame accesstwice returns thesame result
• Only write thelower bound:“ifiresults exist theniare returned”
•
Show that plan existence can be rephrased asanswerability:→ If a databaseIsatisfiesQandI0 hasmore accessible datathanI thenI0 should satisfyQas well
•
Show simplification results using ablowup technique:• Start with acounterexample to answerabilityon the simplification
• Blow it upto a counterexample on the original schema
•
Reduce toquery containment under constraints→ Study the result of the translation to show complexity bounds
14/16
Overview of Proof Techniques
•
Show that result bounds can be axiomatized insimpler ways:• Ensure that doing thesame accesstwice returns thesame result
• Only write thelower bound:“ifiresults exist theniare returned”
•
Show that plan existence can be rephrased asanswerability:→ If a databaseIsatisfiesQandI0 hasmore accessible datathanI thenI0 should satisfyQas well
•
Show simplification results using ablowup technique:• Start with acounterexample to answerabilityon the simplification
• Blow it upto a counterexample on the original schema
•
Reduce toquery containment under constraints→ Study the result of the translation to show complexity bounds
Other Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side informationTheorem
Plan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiable15/16
Other Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side information TheoremPlan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiableOther Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side information TheoremPlan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiable15/16
Other Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side information TheoremPlan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiableOther Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side information TheoremPlan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiable15/16
Other Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side information TheoremPlan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiableOther Results in the Paper
•
Better complexity bounds using alinearization technique for query containment underIDs + side information TheoremPlan existence underbounded-width IDsisNP-complete
Theorem
Plan existence underUIDs and FDsisNP-hardandinEXPTIME
•
Result forarity-2 constraints TheoremPlan existence isdecidableforguarded two-variable FO + counting
•
Results when the database is assumed to befinite•
Results fornon-monotone plans(= with relational difference)•
Example of FO constraints that arenot choice simplifiable 15/16Summary and future work
•
Problem:Given a schema of services withresult bounds, logical constraints, and a query, is there a plan to answer it?•
Simplification resultsto remove bounds, andcomplexity resultsFragment Simplification Complexity
Inclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→ What are other possible uses of thelinearization technique?
→ Do the bounds matter inpractice? (approximate plans?) Thanks for your attention!
Summary and future work
•
Problem:Given a schema of services withresult bounds, logical constraints, and a query, is there a plan to answer it?•
Simplification resultsto remove bounds, andcomplexity results Fragment Simplification ComplexityInclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→ What are other possible uses of thelinearization technique?
→ Do the bounds matter inpractice? (approximate plans?) Thanks for your attention!
16/16
Summary and future work
•
Problem:Given a schema of services withresult bounds, logical constraints, and a query, is there a plan to answer it?•
Simplification resultsto remove bounds, andcomplexity results Fragment Simplification ComplexityInclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→ What are other possible uses of thelinearization technique?
→ Do the bounds matter inpractice? (approximate plans?)
Thanks for your attention!
Summary and future work
•
Problem:Given a schema of services withresult bounds, logical constraints, and a query, is there a plan to answer it?•
Simplification resultsto remove bounds, andcomplexity results Fragment Simplification ComplexityInclusion dependencies(IDs) Existence-check EXPTIME-complete Bounded-width IDs Existence-check NP-complete Functional dependencies(FDs) FD NP-complete
FDs and UIDs Choice NP-hard, inEXPTIME
Equality-free FO Choice Undecidable
Frontier-guarded TGDs Choice 2EXPTIME-complete
→ What are other possible uses of thelinearization technique?
→ Do the bounds matter inpractice? (approximate plans?)
Thanks for your attention! 16/16
References
Benedikt, M., Leblay, J., Cate, B. t., and Tsamoura, E. (2016).
Generating plans from proofs: the interpolation-based approach to query reformulation.
Morgan & Claypool.