• Aucun résultat trouvé

Schema Mapping Discovery from Data Instances

N/A
N/A
Protected

Academic year: 2022

Partager "Schema Mapping Discovery from Data Instances"

Copied!
55
0
0

Texte intégral

(1)

Free University of Bozen–Bolzano, 2 February 2011

Schema Mapping Discovery from Data Instances

Georg Gottlob, Pierre Senellart

(2)

2 / 25 U. Oxford & Télécom PT Georg Gottlob, Pierre Senellart

Different sources organize the same data differently

Jeffrey D. Ullman

List of publications from the DBLP Bibliography Server FAQ

Coauthor Index Ask others: ACM DL/Guide CiteSeer CSB Google MSN Yahoo

Home Page

2007

240 EE Foto N. Afrati, Chen Li, Jeffrey D. Ullman: Using views to generate efficient evaluation plans for queries. J. Comput. Syst. Sci. 73(5): 703724 (2007)

2005

239 EE Jeffrey D. Ullman: Gradiance OnLine Accelerated Learning. ACSC 2005: 36

238 EE Serge Abiteboul, Rakesh Agrawal, Philip A. Bernstein, Michael J. Carey, Stefano Ceri, W. Bruce Croft, David J. DeWitt, Michael J. Franklin, Hector GarciaMolina, Dieter Gawlick, Jim Gray, Laura M. Haas, Alon Y. Halevy, Joseph M. Hellerstein, Yannis E. Ioannidis, Martin L. Kersten, Michael J. Pazzani, Michael Lesk, David Maier, Jeffrey F. Naughton, HansJörg Schek, Timos K. Sellis, Avi Silberschatz, Michael Stonebraker, Richard T. Snodgrass, Jeffrey D. Ullman, Gerhard Weikum, Jennifer Widom, Stanley B.

Zdonik: The Lowell database research selfassessment. Commun. ACM 48(5): 111118 (2005) 237 EE Serge Abiteboul, Richard Hull, Victor Vianu, Sheila A. Greibach, Michael A. Harrison, Ellis Horowitz,

Daniel J. Rosenkrantz, Jeffrey D. Ullman, Moshe Y. Vardi: In memory of Seymour Ginsburg 1928 2004.

SIGMOD Record 34(1): 512 (2005)

2003

236 EE Jeffrey D. Ullman: A Survey of New Directions in Database System. DASFAA 2003: 3

235 EE Jeffrey D. Ullman: Improving the Efficiency of DatabaseSystem Teaching. SIGMOD Conference 2003:

13

234 EE Jim Gray, HansJörg Schek, Michael Stonebraker, Jeffrey D. Ullman: The Lowell Report. SIGMOD Conference 2003: 680

233 EE Serge Abiteboul, Rakesh Agrawal, Philip A. Bernstein, Michael J. Carey, Stefano Ceri, W. Bruce Croft, David J. DeWitt, Michael J. Franklin, Hector GarciaMolina, Dieter Gawlick, Jim Gray, Laura M. Haas, Alon Y. Halevy, Joseph M. Hellerstein, Yannis E. Ioannidis, Martin L. Kersten, Michael J. Pazzani, Michael Lesk, David Maier, Jeffrey F. Naughton, HansJörg Schek, Timos K. Sellis, Avi Silberschatz, Michael Stonebraker, Richard T. Snodgrass, Jeffrey D. Ullman, Gerhard Weikum, Jennifer Widom, Stanley B.

Zdonik: The Lowell Database Research Self Assessment CoRR cs.DB/0310006: (2003)

232 EE Anand Rajaraman, Jeffrey D. Ullman: Querying websites using compact skeletons. J. Comput. Syst. Sci.

66(4): 809851 (2003)

2001

231 EE Chen Li, Mayank Bawa, Jeffrey D. Ullman: Minimizing View Sets without Losing QueryAnswering Power.

ICDT 2001: 99113

230 EE Anand Rajaraman, Jeffrey D. Ullman: Querying Websites Using Compact Skeletons. PODS 2001 229 EE Foto N. Afrati, Chen Li, Jeffrey D. Ullman: Generating Efficient Plans for Queries Using Views. SIGMOD

Conference 2001: 319330

228 EE Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, Jeffrey D.

Ullman, Cheng Yang: Finding Interesting Associations without Support Pruning. IEEE Trans. Knowl.

Data Eng. 13(1): 6478 (2001)

2000

227 Hector GarciaMolina, Jeffrey D. Ullman, Jennifer Widom: Database System Implementation PrenticeHall 2000

226 EE Jeffrey D. Ullman: A Survey of AssociationRule Mining. Discovery Science 2000: 114 225 EE Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, Jeffrey D.

Ullman, Cheng Yang: Finding Interesting Associations without Support Pruning. ICDE 2000: 489499

(3)

2 / 25 U. Oxford & Télécom PT Georg Gottlob, Pierre Senellart

Different sources organize the same data differently

Advanced Scholar Search

Scholar Preferences Scholar Help

Scholar All articles Recent articles Results 1 10 of about 12 for author:"jd ullman". (0.07 seconds)

jd ullman J Ullman J Hopcroft A Rajaraman B Konikow ska A Aho

Querying websites using compact skeletons all 11 versions »

A Rajaraman, JD Ullman Journal of Computer and System Sciences, 2003 Elsevier Several commercial applications, such as online comparison shopping and process automation, require integrating information that is scattered across multiple w ebsites or XML documents. Much research has been devoted to this problem, ...

Cited by 13 Related Articles Web Search

[BOOK] Wprowadzenie do teorii automatów, jezyków i obliczen JE Hopcroft, JD Ullman, B Konikow ska 2003 Wydaw . Naukow e PWN Cited by 15 Related Articles Web Search

Improving the efficiency of databasesystem teaching all 3 versions »

JD Ullman Proceedings of the 2003 ACM SIGMOD international conference …, 2003 portal.acm.org ABSTRACT The education industry has a very poor record of produc tivity gains.

In this brief article, I outline some of the w ays the teaching of a college course in database systems could be made more ecient, and sta time used ...

Cited by 4 Related Articles Web Search

A survey of new directions in database systems all 5 versions »

JD Ullman Database Systems for Advanced Applications, 2003.(DASFAA …, 2003 ieeexplore.ieee.org A survey of new directions in database systems. Ullman, JD Stanford University;

This paper appears in: Database Systems for Advanced Applications, 2003.

(DASFAA 2003). Proceedings. Eighth International ...

Cited by 3 Related Articles Web Search [CITATION] ????

AV Aho, R Sethi, JD Ullman 2003 ??: ???????

Cited by 6 Related Articles Web Search [BOOK] Automi, linguaggi e calcolabilità

… Hopcroft, R Motw ani, JD Ullman, L Bernardinello, L … 2003 Pearson Education Italia Cited by 5 Related Articles Web Search

[CITATION] ???????

H GarciaMolina, JD Ullman, J Widom 2003 ??: ???????

Cited by 4 Related Articles Web Search [BOOK] Implementacja systemów baz danych

H GarciaMolina, J Widom, M Jurkiew icz, JD Ullman 2003 Wydaw nictw a Naukow oTechniczne Cited by 3 Related Articles Web Search

[BOOK] Projektowanie i analiza algorytmów: klasyczna praca z teorii algorytmów komputerowych AV Aho, JE Hopcroft, JD Ullman, W Derechow ski 2003 Helion

Cited by 2 Related Articles Web Search [CITATION] ???????

AV AHO, JE HOPCROFT, JD ULLMAN 2003 ??: ???????

Cited by 1 Related Articles Web Search

Result Page: 1 2 Next

Google Home About Google About Google Scholar

(4)

Motivation

Context

Multiple data sources containing information about similar entities, with someredundancy (e.g., sources of the deep Web).

Several different ways to present this information, i.e., several different schemata.

No a prioriinformation about (some of) these schemata.

How to know the relationshipsbetween these schemata, by just looking at the instances?

Other way to see this problem: Match operator on schema mappings, in the setting ofdata exchange.

(5)

Motivation

Context

Multiple data sources containing information about similar entities, with someredundancy (e.g., sources of the deep Web).

Several different ways to present this information, i.e., several different schemata.

No a prioriinformation about (some of) these schemata.

How to know the relationshipsbetween these schemata, by just looking at the instances?

Other way to see this problem: Match operator on schema mappings, in the setting ofdata exchange.

(6)

Motivation

Context

Multiple data sources containing information about similar entities, with someredundancy (e.g., sources of the deep Web).

Several different ways to present this information, i.e., several different schemata.

No a prioriinformation about (some of) these schemata.

How to know the relationshipsbetween these schemata, by just looking at the instances?

Other way to see this problem: Match operator on schema mappings, in the setting ofdata exchange.

(7)

Problem definition

Problem

Given two (relational) database instances I and J with different schemata, what is the optimal description Σ of J knowing I (with Σ a finite set of formulas in some logical language)?

What does optimal implies:

Conciseness of description.

Validity of facts predicted byI and Σ. All facts ofJ explainedby I and Σ.

(Note the asymmetry between I and J; context ofdata exchange where J is computed fromI and Σ).

(8)

Problem definition

Problem

Given two (relational) database instances I and J with different schemata, what is the optimal description Σ of J knowing I (with Σ a finite set of formulas in some logical language)?

What does optimal implies:

Conciseness of description.

Validity of facts predicted byI and Σ. All facts ofJ explainedby I and Σ.

(Note the asymmetry between I and J; context ofdata exchange where J is computed fromI and Σ).

(9)

Outline

Introduction

TGDs, Cost, Optimality TGDs

Cost and Optimality Results

Extensions, Variants Conclusion

(10)

Source-to-target tuple-generating dependencies

Definition (Source-to-target tgd) First-order formula of the form:

∀x𝜙(x) → ∃y𝜓(x,y) with:

𝜙conjunctionof source relation atoms;

𝜓 conjunctionof target relation atoms;

all variables ofx bound in𝜙. Example

∀x1∀x2R1(x1,x2) ∧R2(x2) → ∃y R(x1,y)

(11)

Particular tgds

Two ways of having simpler tgds:

Disallow existential quantifiers on the right hand-side: full tgds.

Disallow cycles on both left- and right-hand sides: acyclic tgds.

(Classical notion of acyclicity on hypergraphs extending the basic notion of acyclicity on graphs.)

Examples

x1x2x3R1(x1,x2) ∧R2(x2,x3) ∧R3(x3,x1) →R(x1)iscyclic(and full).

x1x2x3R1(x1,x2) ∧R2(x2,x3) →R(x1)isacyclic(and full).

tgd: arbitrary source-to-target tgds;

full: full tgds;

acyc: acyclic tgds;

facyc: full and acyclic tgds.

(12)

Particular tgds

Two ways of having simpler tgds:

Disallow existential quantifiers on the right hand-side: full tgds.

Disallow cycles on both left- and right-hand sides: acyclic tgds.

(Classical notion of acyclicity on hypergraphs extending the basic notion of acyclicity on graphs.)

Examples

x1x2x3R1(x1,x2) ∧R2(x2,x3) ∧R3(x3,x1) →R(x1)iscyclic(and full).

x1x2x3R1(x1,x2) ∧R2(x2,x3) →R(x1)isacyclic(and full).

tgd: arbitrary source-to-target tgds;

full: full tgds;

acyc: acyclic tgds;

facyc: full and acyclic tgds.

(13)

Particular tgds

Two ways of having simpler tgds:

Disallow existential quantifiers on the right hand-side: full tgds.

Disallow cycles on both left- and right-hand sides: acyclic tgds.

(Classical notion of acyclicity on hypergraphs extending the basic notion of acyclicity on graphs.)

Examples

x1x2x3R1(x1,x2) ∧R2(x2,x3) ∧R3(x3,x1) →R(x1)iscyclic(and full).

x1x2x3R1(x1,x2) ∧R2(x2,x3) →R(x1)isacyclic(and full).

tgd: arbitrary source-to-target tgds;

full: full tgds;

acyc: acyclic tgds;

facyc: full and acyclic tgds.

(14)

Pertinence of a set of tgds

Example

R R

a b c d

a a b b c a d d g h Σ0 = ?

Σ1 = {∀x R(x) →R(x,x)}

Σ2 = {∀x R(x) → ∃y R(x,y)}

Σ3 = {∀x1∀x2R(x1) ∧R(x2) →R(x1,x2)}

Σ4 = {∃y1∃y2 R(y1,y2)}

(15)

Pertinence of a set of tgds

Example

R R

a b c d

a a b b c a d d g h Σ0 = ?

Σ1 = {∀x R(x) →R(x,x)}

Σ2 = {∀x R(x) → ∃y R(x,y)}

Σ3 = {∀x1∀x2R(x1) ∧R(x2) →R(x1,x2)}

Σ4 = {∃y1∃y2 R(y1,y2)}

(16)

Idea

Size of a formula: number of occurrences of variables and constants.

Costof a schema mapping Σ: Size of theminimum repairof Σ that is valid and explains all facts of J.

Types of repairs considered:

“fix” a universal quantifier by adding conditions (x =a or x ̸=a);

“fix” an existential quantifier by giving corresponding constants (𝜏(x) →y=a with𝜏 a conjunction of conditions on universally quantified variables);

addground factsto the target instance.

The problem is then to find a schema mapping ofminimal cost.

(17)

Example cost computation

Example

R R

a b c d

a a b b c a d d g h

∀x R(x) →R(x,x)

Cost: 17

PredictedR a a b b c c d d

(18)

Example cost computation

Example

R R

a b c d

a a b b c a d d g h

∀x R(x)∧x ̸=c →R(x,x)

Cost: 17

PredictedR a a b b d d

(19)

Example cost computation

Example

R R

a b c d

a a b b c a d d g h

∀x R(x)∧x ̸=c →R(x,x) R(c,a)

R(g,h) Cost: 17

PredictedR a a b b c a d d

(20)

Example cost computation

Example

R R

a b c d

a a b b c a d d g h

∀x R(x)∧x ̸=c →R(x,x) R(c,a)

R(g,h)

Cost: 17

PredictedR a a b b c c d d g h

(21)

Example cost computation

Example

R R

a b c d

a a b b c a d d g h

∀x R(x)∧x ̸=c →R(x,x)

∃y1∃y2R(y1,y2) ∧y1 =c∧y2 =a

∃y1∃y2R(y1,y2) ∧y1 =g∧y2 =h

Cost: 17

PredictedR a a b b c c d d g h

(22)

Example cost computation

Example

R R

a b c d

a a b b c a d d g h

∀x R(x) ∧x ̸=c →R(x,x)

∃y1∃y2R(y1,y2) ∧y1 =c∧y2 =a

∃y1∃y2R(y1,y2) ∧y1 =g∧y2 =h Cost: 17

PredictedR a a b b c c d d g h

(23)

Problems considered

Decision problemsof interest:

Cost: Is the cost of a given schema mapping less thanK? Optimality: Is a given schema mapping optimal?

Complexity? Algorithms?

(24)

Problems considered

Decision problemsof interest:

Cost: Is the cost of a given schema mapping less thanK? Optimality: Is a given schema mapping optimal?

Complexity? Algorithms?

(25)

Outline

Introduction

TGDs, Cost, Optimality Results

Justification

Complexity Analysis Extensions, Variants Conclusion

(26)

Behavior for simple operators

Consider the elementary operatorsof the relational algebra:

Projection Intersection

Selection (conjunction of atomic conditions) Cross Product

Join (on a given attribute) Theorem

For any elementary operator 𝛾, the tgd naturally associated with 𝛾 is optimal with respect to (I, 𝛾(I)) (or (𝛾(J),J)), under some basic assumptions.

(27)

Behavior for simple operators

Consider the elementary operatorsof the relational algebra:

Projection Intersection

Selection (conjunction of atomic conditions) Cross Product

Join (on a given attribute) Theorem

For any elementary operator 𝛾, the tgd naturally associated with 𝛾 is optimal with respect to (I, 𝛾(I)) (or (𝛾(J),J)), under some basic assumptions.

(28)

Examples of naturally associated tgds

Examples

Condition I andJ Optimal tgd

Projection

I ̸= ? J= 𝜋1(I) R(x,y) →R(x) 𝜋1(J) ∩ 𝜋2(J) = ?,

|𝜋1(J)| >2 I = 𝜋1(J) R(x) → ∃y R(x,y) Selection |𝜎𝜙(I)| >size(𝜙)+3 2 J= 𝜎𝜙(I) R(x) →R(x)

𝜎𝜙(J) ̸= ? I = 𝜎𝜙(J) R(x) →R(x) Product R1I̸= ?,RI2̸= ? J=RI1×RI2 R1(x) ∧R2(y) →R(x,y)

R1J̸= ?,R2J ̸= ? I =R1J×R2J R(x,y) →R1(x) ∧R2(y)

(29)

The Polynomial Hierarchy

P polynomial deterministic algorithm NP

P1

polynomial non-deterministic algorithm coNP

P1

complement NP

DP problems expressible as the conjunction of a NPand a coNPproblem

ΣP2 polynomial non-deterministic withΣP1 oracle

ΠP2 complement ΣP2

ΣPn+1 polynomial non-deterministic withΣPn oracle ΠPn+1 complement ΣPn+1

Union of all these classes: PH⊆PSPACE, thepolynomial hierarchy.

(30)

The Polynomial Hierarchy

P polynomial deterministic algorithm NP=ΣP1 polynomial non-deterministic algorithm coNP=ΠP1 complement NP

DP problems expressible as the conjunction of a NPand a coNPproblem

ΣP2 polynomial non-deterministic withΣP1 oracle

ΠP2 complement ΣP2

ΣPn+1 polynomial non-deterministic withΣPn oracle ΠPn+1 complement ΣPn+1

Union of all these classes: PH⊆PSPACE, thepolynomial hierarchy.

(31)

The Polynomial Hierarchy

P polynomial deterministic algorithm NP=ΣP1 polynomial non-deterministic algorithm coNP=ΠP1 complement NP

DP problems expressible as the conjunction of a NPand a coNPproblem

ΣP2 polynomial non-deterministic withΣP1 oracle ΠP2 complement ΣP2

ΣPn+1 polynomial non-deterministic withΣPn oracle ΠPn+1 complement ΣPn+1

Union of all these classes: PH⊆PSPACE, thepolynomial hierarchy.

(32)

The Polynomial Hierarchy

P polynomial deterministic algorithm NP=ΣP1 polynomial non-deterministic algorithm coNP=ΠP1 complement NP

DP problems expressible as the conjunction of a NPand a coNPproblem

ΣP2 polynomial non-deterministic withΣP1 oracle ΠP2 complement ΣP2

ΣPn+1 polynomial non-deterministic withΣPn oracle ΠPn+1 complement ΣPn+1

Union of all these classes: PH⊆PSPACE, thepolynomial hierarchy.

(33)

The Polynomial Hierarchy

P polynomial deterministic algorithm NP=ΣP1 polynomial non-deterministic algorithm coNP=ΠP1 complement NP

DP problems expressible as the conjunction of a NPand a coNPproblem

ΣP2 polynomial non-deterministic withΣP1 oracle ΠP2 complement ΣP2

ΣPn+1 polynomial non-deterministic withΣPn oracle ΠPn+1 complement ΣPn+1

Union of all these classes: PH⊆PSPACE, thepolynomial hierarchy.

(34)

(Combined) Complexity Results

tgdfull

Cost ΣP3P2-hard ΣP2,DP-hard Optimality ΠP4,DP-hard ΠP3,DP-hard

acycfacyc

Cost ΣP2, (co)NP-hard NP-complete Optimality ΠP3,DP-hard ΠP2,DP-hard

(35)

(Combined) Complexity Results

tgdfull

Cost ΣP3P2-hard ΣP2,DP-hard Optimality ΠP4,DP-hard ΠP3,DP-hard

acycfacyc

Cost ΣP2, (co)NP-hard NP-complete Optimality ΠP3,DP-hard ΠP2,DP-hard

(36)

(Combined) Complexity Results

tgdfull

Cost ΣP3P2-hard ΣP2,DP-hard Optimality ΠP4,DP-hard ΠP3,DP-hard

acycfacyc

Cost ΣP2, (co)NP-hard NP-complete Optimality ΠP3,DP-hard ΠP2,DP-hard

(37)

(Combined) Complexity Results

tgdfull

Cost ΣP3P2-hard ΣP2,DP-hard Optimality ΠP4,DP-hard ΠP3,DP-hard

acycfacyc

Cost ΣP2, (co)NP-hard NP-complete Optimality ΠP3,DP-hard ΠP2,DP-hard

(38)

(Combined) Complexity Results

tgdfull

Cost ΣP3P2-hard ΣP2,DP-hard Optimality ΠP4,DP-hard ΠP3,DP-hard

acycfacyc

Cost ΣP2, (co)NP-hard NP-complete Optimality ΠP3,DP-hard ΠP2,DP-hard

(39)

Vertex-Cover in r -partite r -uniform hypergraph

Vertex-Cover: find a set of vertices of minimal size that cover all (hyper)edges in a (hyper)graph.

NP-complete for general (hyper)graphs.

PTIME for bipartite graphs (Kőnig’s theorem).

Lemma

Vertex-Cover is NP-complete for r -partite r -uniform hypergraphs for r >3.

r-partite: partition of the set of vertices intor sets, with no hyperedge spanning two vertices of the same set.

r-uniform: every hyperedge spansr vertices.

(40)

Vertex-Cover in r -partite r -uniform hypergraph

Vertex-Cover: find a set of vertices of minimal size that cover all (hyper)edges in a (hyper)graph.

NP-complete for general (hyper)graphs.

PTIME for bipartite graphs (Kőnig’s theorem).

Lemma

Vertex-Cover is NP-complete for r -partite r -uniform hypergraphs for r >3.

r-partite: partition of the set of vertices intor sets, with no hyperedge spanning two vertices of the same set.

r-uniform: every hyperedge spansr vertices.

(41)

Encoding of 3-SAT

∙ ∙

x1

¯ x1

x2

¯ x2 x3

¯ x3

∙ ∙

y1

¯ y1

y2

¯ y2 y3

¯ y3

∙ ∙

∙ z1

¯ z1

z2

¯ z2 z3

¯ z3

¬z ∨x ∨y

(42)

Cost is NP-hard for

facyc

Reduction from Vertex-Cover in 3-partite 3-uniform hypergraphs.

Withoutx =a repairs on the left-hand side of a tgd:

R(x1,x2,x3) →R(x1) Source instance: hypergraph Target instance: empty

Cost: size of the tgd plus twice the minimum size of a vertex cover.

With x =a repairs: a little more difficult, but feasible!

(43)

Cost is NP-hard for

facyc

Reduction from Vertex-Cover in 3-partite 3-uniform hypergraphs.

Withoutx =a repairs on the left-hand side of a tgd:

R(x1,x2,x3) →R(x1) Source instance: hypergraph Target instance: empty

Cost: size of the tgd plus twice the minimum size of a vertex cover.

With x =a repairs: a little more difficult, but feasible!

(44)

Cost is NP-hard for

facyc

Reduction from Vertex-Cover in 3-partite 3-uniform hypergraphs.

Withoutx =a repairs on the left-hand side of a tgd:

R(x1,x2,x3) →R(x1) Source instance: hypergraph Target instance: empty

Cost: size of the tgd plus twice the minimum size of a vertex cover.

With x =a repairs: a little more difficult, but feasible!

(45)

Cost is NP-hard for

facyc

Reduction from Vertex-Cover in 3-partite 3-uniform hypergraphs.

Withoutx =a repairs on the left-hand side of a tgd:

R(x1,x2,x3) →R(x1) Source instance: hypergraph Target instance: empty

Cost: size of the tgd plus twice the minimum size of a vertex cover.

With x =a repairs: a little more difficult, but feasible!

(46)

Outline

Introduction

TGDs, Cost, Optimality Results

Extensions, Variants Relational Calculus Other Cost Functions Conclusion

(47)

Extension to Relational Calculus

Definition of repairs can be extended to relational calculus.

Same definition of cost, optimality.

Costis not recursive (but co-r.e.).

Computability ofOptimality: open(!).

(48)

Other Cost Functions

Why not counting the number of tuples to add or remove in J? . . . because it can beexponential in the size of the schema mapping!

Why not counting the number of tuples to add or remove in I orJ? . . . because selections are not captured!

(49)

Other Cost Functions

Why not counting the number of tuples to add or remove in J? . . . because it can beexponential in the size of the schema mapping!

Why not counting the number of tuples to add or remove in I orJ? . . . because selections are not captured!

(50)

Other Cost Functions

Why not counting the number of tuples to add or remove in J? . . . because it can beexponential in the size of the schema mapping!

Why not counting the number of tuples to add or remove in I orJ? . . . because selections are not captured!

(51)

Other Cost Functions

Why not counting the number of tuples to add or remove in J? . . . because it can beexponential in the size of the schema mapping!

Why not counting the number of tuples to add or remove in I orJ? . . . because selections are not captured!

(52)

Outline

Introduction

TGDs, Cost, Optimality Results

Extensions, Variants Conclusion

Summary

(53)

In summary...

Formal framework for the discovery ofsymbolic relations between two data sources.

High complexity(up to fourth level ofPH).

Link with Inductive Logic Programming?

Heuristics?

Approximation algorithms? Generalization of acyclicity?

(54)

In summary...

Formal framework for the discovery ofsymbolic relations between two data sources.

High complexity(up to fourth level ofPH).

Link with Inductive Logic Programming?

Heuristics?

Approximation algorithms?

Generalization of acyclicity?

(55)

Merci.

Références

Documents relatifs

be further noted that, unlike all other approaches, paris did not even know that the relations and classes were identical, but discovered the class and relation equivalences by

It has been explained that Subject orientation, data integration, non-volatility of data, and time variations are the key issues under consideration that can give base to

This guides the developer to carry out in a structured, traceable, and repeatable manner the tasks to (i) develop a basic ontology in OWL 2 QL or DL-Lite A , (ii) get the

This work presents: (1) the logical characterization of the capability of bridge rules and individual correspondences to propagate concept membership assertions across mapped

Avant d’être utilisé, un conductimètre doit être étalonné, avec une solution étalon dont la conductivité est connue.. La sonde du conductimètre est un objet fragile à

There are a number of open theoretical issues, especially on the computability and precise complexity of Optimality, but the most obvious direction for future work would be to

Actually, we show a stronger result: unless RP = NP, there is no polynomial-time algorithm that, given a ground data example (I, J ), computes a GLAV schema mapping whose cost

Ces deux tableaux représentent les notes de deux classes A et B obtenues après un devoir de mathématiques.. a) Citez les éventualités de cette expérience aléatoire. 2)