Inverting Schema Mappings (presenting works by Fagin, Kolaitis, Popa, Tan)

Academic year: 2022

Dmitri Akatov¹, Pierre Senellart²,³

Oxford Computing Laboratory

Outline

1 Data Exchange
    Schema Mappings
    Dependencies
    Solutions of Schema Mappings

2 Composing Schema Mappings

3 Inverses

4 Quasi-Inverses

5 Conclusion

Data Exchange Schema Mappings

Motivation

Data exchange deals with the problem of transforming a database (the source instance) into another database (the target instance) under a possibly different schema, while adhering to a set of conditions (the dependencies) between the source and target instances.

The target instance should reflect the information given in the source instance as accurately as possible.

The target instance should be as general as possible (not introducing any more information than the source contained).

The “oldest” problem in Database Theory. Still open to a lot of research.

This talk will concentrate on relational databases.


Definitions and Example

Definition

A schema mapping is a triple

M = (S, T, Σ)

where S is the source schema, T is the target schema (with disjoint sets of relational symbols) and Σ is a set of formulae in some logical formalism over ⟨S, T⟩ defining the schema mapping M.

Example

Let S = ⟨EDL⟩ and T = ⟨ED, DL⟩. Let

Σ = {∀x,y,z (EDL(x,y,z) → ED(x,y) ∧ DL(x,y))}.


Data Exchange Dependencies

Source-Target Conditions vs Target Conditions

We are interested in Σ being a finite set of tuple-generating dependencies (tgds) and equality-generating dependencies (egds), in particular:

Source-to-target (s-t) tgds: FO formulae of the form

∀x (φ(x) → ∃y ψ(x,y))

where φ is a conjunction of atoms over S and ψ is a conjunction of atoms over T.

Target tgds: FO formulae of the same form where both φ and ψ are conjunctions of atoms over T.

Target egds: FO formulae of the form ∀x (φ(x) → xi = xj) where φ is a conjunction of atoms over T and xi, xj are in x.

Full tgds: s-t tgds without existentially quantified variables.

Notation: Universal quantifiers are usually omitted. Existential quantifiers should always be written out!


Global-as-View and Local-as-View Contexts

There are special forms of tgds that come up particularly in data integration:

Local-as-View (LAV) mappings are schema mappings in which Σ is a set of s-t tgds in each of which φ (the left-hand side of the tgd) is a single atom.

Global-as-View (GAV) mappings are schema mappings in which Σ is a set of s-t tgds in each of which ψ (the right-hand side of the tgd) is a single atom.

LAV mappings have some nice invertibility properties.


Data Exchange Solutions of Schema Mappings

Definition

Definition

Given a schema mapping M = (S, T, Σ) and a source instance I of S, we define a solution for I to be an instance J of T such that ⟨I, J⟩ ⊨ Σ. We define Sol(I) to be the set of all solutions for I.

We are interested in the most general solution:

Definition

A universal solution J for I is a solution such that for any other solution J′ there exists a homomorphism h: J → J′.

Theorem

If Σ is a finite set of s-t tgds, then there is always a universal solution for any source instance.
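The definition of a solution can be made concrete with a small satisfaction check. The sketch below is illustrative only: the encoding of atoms as (relation, arguments) pairs, the function names `matches` and `is_solution`, and the two example tgds Emp(e,d) → Dept(d) and Emp(e,d) → ∃m Mgr(d,m) are assumptions made for this example and do not come from the slides.

```python
def matches(atoms, facts, binding=None):
    """Yield every variable assignment that maps all atoms onto facts."""
    binding = dict(binding or {})
    if not atoms:
        yield binding
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for fact_rel, fact_args in facts:
        if fact_rel != rel or len(fact_args) != len(args):
            continue
        candidate = dict(binding)
        # setdefault binds a new variable or returns its existing value.
        if all(candidate.setdefault(v, c) == c for v, c in zip(args, fact_args)):
            yield from matches(rest, facts, candidate)


def is_solution(source, target, tgds):
    """Check <source, target> |= tgds for s-t tgds given as (body, head) pairs.

    Head variables that do not occur in the body are read as existential.
    """
    for body, head in tgds:
        for binding in matches(body, source):
            if next(matches(head, target, binding), None) is None:
                return False  # no witness in the target for this body match
    return True


# Hypothetical mapping: Emp(e, d) -> Dept(d) and Emp(e, d) -> exists m. Mgr(d, m)
SIGMA = [
    ([("Emp", ("e", "d"))], [("Dept", ("d",))]),
    ([("Emp", ("e", "d"))], [("Mgr", ("d", "m"))]),
]
I = {("Emp", ("alice", "sales"))}
J_good = {("Dept", ("sales",)), ("Mgr", ("sales", "bob"))}
J_bad = {("Dept", ("sales",))}

print(is_solution(I, J_good, SIGMA))  # True
print(is_solution(I, J_bad, SIGMA))   # False
```

J_good is a solution but not a universal one, since it commits to the constant "bob" where a labelled null would be more general.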


Computing Universal Solutions (Outline)

To compute a universal solution we chase ⟨I, ∅⟩ with Σ. Let K denote the instance being chased (initially ⟨I, ∅⟩). In each chase step we:

Select a dependency in Σ (a tgd or an egd).

Find a homomorphism from φ(x) to K.

For a tgd: extend the homomorphism by defining a fresh labelled null for each variable in y, and take the image under ψ.

For an egd: assume xi, xj have distinct images in K. If both are constants, then the chase ends in failure. Otherwise we either replace a labelled null by a constant or one labelled null by the other.

Repeat until no dependency in Σ can be used to extend the homomorphism.

Theorem

For tgds and egds, if ⟨I, J⟩ is the result of a successful finite chase, then J is a universal solution. If there exists a failing finite chase, then there is no solution.
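The chase steps can be sketched for the simplest case, a set of s-t tgds with no egds and no target tgds. In that case one pass over the tgds suffices, because bodies mention only source relations, which never change. The names `Null`, `matches` and `chase_st`, and the tuple encoding of atoms, are assumptions of this sketch, which skips a match whenever the head is already satisfied in the target.

```python
import itertools


class Null:
    """A labelled null: a placeholder distinct from every constant."""
    _counter = itertools.count(1)

    def __init__(self):
        self.label = f"N{next(Null._counter)}"

    def __repr__(self):
        return self.label


def matches(atoms, facts, binding=None):
    """Yield every variable assignment that maps all atoms onto facts."""
    binding = dict(binding or {})
    if not atoms:
        yield binding
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for fact_rel, fact_args in facts:
        if fact_rel != rel or len(fact_args) != len(args):
            continue
        candidate = dict(binding)
        if all(candidate.setdefault(v, c) == c for v, c in zip(args, fact_args)):
            yield from matches(rest, facts, candidate)


def chase_st(source, tgds):
    """Chase <source, {}> with s-t tgds, applied in the given order."""
    target = set()
    for body, head in tgds:
        for binding in matches(body, source):
            if next(matches(head, target, binding), None) is not None:
                continue  # this match is already satisfied in the target
            extended = dict(binding)
            for _, args in head:
                for var in args:
                    if var not in extended:
                        extended[var] = Null()  # fresh null per existential variable
            target |= {(rel, tuple(extended[v] for v in args)) for rel, args in head}
    return target


T_P = ([("P", ("x",))], [("R", ("x",))])  # P(x) -> R(x)
T_Q = ([("Q", ("x",))], [("R", ("y",))])  # Q(x) -> exists y. R(y)
I = {("P", ("a",)), ("Q", ("a",))}

print(chase_st(I, [T_P, T_Q]))  # a single fact R(a)
print(chase_st(I, [T_Q, T_P]))  # two facts: R(a) and R applied to a fresh null
```

The two calls differ only in the order in which the dependencies are selected, which is enough to change the result of the chase.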


Canonical Universal Solutions

Definition

We call a J obtained as the result of a successful finite chase (if it exists) a canonical universal solution. If it is unique (up to isomorphism), we call it the canonical universal solution.

However...

Example

Canonical solutions are not unique in general: let S = ⟨P, Q⟩, T = ⟨R⟩. Let Σ = {P(x) → R(x), Q(x) → ∃Y R(Y)}. Let I = {P(a), Q(a)}. The result of a chase is either {R(a)} or {R(Y), R(a)}.

Canonical solutions may also not exist, in case there is no finite chase (cyclic dependencies).
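Although the two chase results in the example differ, each maps homomorphically into the other, which is what one expects of two universal solutions. A minimal backtracking search makes this checkable; the function names and the convention that labelled nulls are strings starting with "?" are assumptions of this sketch.

```python
def is_null(value):
    """Assumed convention: labelled nulls are strings starting with '?'."""
    return isinstance(value, str) and value.startswith("?")


def find_homomorphism(src, dst):
    """Return a dict mapping the nulls of src so that every fact lands in dst.

    Constants must be mapped to themselves. Returns None if no homomorphism
    exists; note that an empty dict is a valid (successful) result.
    """
    facts = sorted(src)

    def extend(index, mapping):
        if index == len(facts):
            return mapping
        rel, args = facts[index]
        for dst_rel, dst_args in dst:
            if dst_rel != rel or len(dst_args) != len(args):
                continue
            candidate = dict(mapping)
            consistent = all(
                candidate.setdefault(a, d) == d if is_null(a) else a == d
                for a, d in zip(args, dst_args)
            )
            if consistent:
                result = extend(index + 1, candidate)
                if result is not None:
                    return result
        return None

    return extend(0, {})


J1 = {("R", ("a",))}
J2 = {("R", ("?Y",)), ("R", ("a",))}

print(find_homomorphism(J1, J2))  # {} (identity on constants)
print(find_homomorphism(J2, J1))  # {'?Y': 'a'}
print(find_homomorphism(J1, {("R", ("b",))}))  # None
```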


Weakly acyclic tgds and polynomial-length chase

We want to know when computing canonical universal solutions is feasible.

Using only full tgds ensures that the chase is always finite and always has the same result. However, full tgds are too restricted in practice.

We define special sets of tgds called weakly acyclic sets of tgds, which strictly include sets of full tgds as well as acyclic sets of inclusion dependencies.

Weakly acyclic sets of tgds are sets of tgds such that there is no cycle in the dependency graph going through a special edge (special edges represent existentially quantified variables).

Theorem

Let Σ be the union of a weakly acyclic set of tgds and a set of egds. Then there exists a polynomial in the size of K that bounds the length of every chase of K with Σ.
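The slide only summarises the dependency graph, so the following sketch assumes the usual construction from the data exchange literature: nodes are relation positions; for each body occurrence of a variable x that also appears in the head, a regular edge goes to every head position of x and a special edge goes to every head position of an existential variable. The function names and the tuple encoding of tgds are hypothetical.

```python
def dependency_graph(tgds):
    """Return (regular_edges, special_edges) between positions (relation, index)."""
    regular, special = set(), set()
    for body, head in tgds:
        body_vars = {v for _, args in body for v in args}
        head_pos = {}
        for rel, args in head:
            for j, var in enumerate(args):
                head_pos.setdefault(var, []).append((rel, j))
        exist_pos = [p for v, ps in head_pos.items() if v not in body_vars for p in ps]
        for rel, args in body:
            for i, var in enumerate(args):
                if var not in head_pos:
                    continue  # variables absent from the head produce no edges
                regular |= {((rel, i), p) for p in head_pos[var]}
                special |= {((rel, i), p) for p in exist_pos}
    return regular, special


def is_weakly_acyclic(tgds):
    """True iff no cycle of the dependency graph goes through a special edge."""
    regular, special = dependency_graph(tgds)
    successors = {}
    for u, v in regular | special:
        successors.setdefault(u, set()).add(v)

    def reaches(start, goal):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        return False

    # A special edge u -> v lies on a cycle exactly when v can reach u.
    return not any(reaches(v, u) for u, v in special)


T_CYCLIC = ([("E", ("x", "y"))], [("E", ("y", "z"))])  # E(x,y) -> exists z. E(y,z)
T_FULL = ([("E", ("x", "y")), ("E", ("y", "z"))], [("E", ("x", "z"))])  # transitivity
T_LAV = ([("D", ("e", "d"))], [("M", ("d", "m"))])  # D(e,d) -> exists m. M(d,m)

print(is_weakly_acyclic([T_CYCLIC]))  # False
print(is_weakly_acyclic([T_FULL]))    # True
print(is_weakly_acyclic([T_LAV]))     # True
```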


Composing Schema Mappings

Outline

1 Data Exchange

2 Composing Schema Mappings
    The Composition Operator
    Language for Expressing Composition
    Second-Order tgds and Data Exchange

3 Inverses

4 Quasi-Inverses

5 Conclusion

Composing Schema Mappings The Composition Operator


Motivation

Natural operators on schema mappings:

Composition
Inverse
...

High-level management of schema mappings.

Need for well-defined semantics.

In practice, useful for describing and maintaining successive evolutions of a schema.

[Figure: schema evolution diagram over the schemas S1, T1, S′1 and T′1, with the mappings M1, M2 and M′1; the derived mappings shown are M′1⁻¹ and M′1⁻¹ ∘ M1 ∘ M2.]


Definition

Unambiguous and query-independent semantics of schema mapping composition.

Based on the natural composition of binary relations.

Definition

⟨I, K⟩ is an instance of M1 ∘ M2 ⇔ there exists J such that:

⟨I, J⟩ is an instance of M1, and
⟨J, K⟩ is an instance of M2.


Example

Example

M12 = (S1, S2, Σ12), M23 = (S2, S3, Σ23).

S1 = {Takes(·,·)}

S2 = {Takes′(·,·), Student(·,·)}

S3 = {Enrollment(·,·)}

Σ12 = {∀n∀c (Takes(n,c) → Takes′(n,c)),
       ∀n∀c (Takes(n,c) → ∃s Student(n,s))}

Σ23 = {∀n∀s∀c (Student(n,s) ∧ Takes′(n,c) → Enrollment(s,c))}

M12 ∘ M23 = (S1, S3, Σ) with:

Σ = {∀n∃s∀c (Takes(n,c) → Enrollment(s,c))}
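The quantifier order in the composed formula matters: a single s must work for all courses c of a given n. The check below spells this out directly on small relations. It is a sketch under stated assumptions: relations are Python sets of pairs, the function name and sample data are hypothetical, and n ranges only over names occurring in Takes (for any other n the implication holds vacuously, given a non-empty domain).

```python
def satisfies_composition(takes, enrollment):
    """Check forall n exists s forall c (Takes(n, c) -> Enrollment(s, c))."""
    names = {n for n, _ in takes}
    ids = {s for s, _ in enrollment}
    return all(
        any(all((s, c) in enrollment for m, c in takes if m == n) for s in ids)
        for n in names
    )


takes = {("alice", "db"), ("alice", "logic"), ("bob", "db")}

# One identifier per student covers all of that student's courses.
enrollment_ok = {("s1", "db"), ("s1", "logic"), ("s2", "db")}
# Alice's two courses are split across different identifiers.
enrollment_bad = {("s1", "db"), ("s2", "logic"), ("s3", "db")}

print(satisfies_composition(takes, enrollment_ok))   # True
print(satisfies_composition(takes, enrollment_bad))  # False
```

The second instance would satisfy the weaker formula ∀n∀c (Takes(n,c) → ∃s Enrollment(s,c)), which is why the composition cannot be written with s quantified inside.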


Composing Schema Mappings Language for Expressing Composition

A Positive Result. . .

Proposition

If M1 is a mapping expressed as a finite set of full source-to-target tgds and M2 is a mapping expressed as a finite set of source-to-target tgds, then M1 ∘ M2 can be expressed as a mapping with source-to-target tgds.

Example

A(x,y) → B(x)   ∘   B(x) → ∃z C(x,z)

=

A(x,y) → ∃z C(x,z)

Remark

If the tgds of M2 are full, so will be the tgds of M1 ∘ M2.
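The example can be sanity-checked by brute force on a tiny domain, using the relational definition of composition: a pair (A, C) belongs to the composition when some intermediate B satisfies both mappings. This is only a bounded check on a two-element domain, not a proof, and all names below are hypothetical.

```python
from itertools import combinations, product

DOM = (0, 1)


def subsets(items):
    """All subsets of items, as a list of sets."""
    items = list(items)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def sat_first(a_rel, b_rel):
    """A(x, y) -> B(x)"""
    return all(x in b_rel for x, _ in a_rel)


def sat_second(b_rel, c_rel):
    """B(x) -> exists z. C(x, z)"""
    return all(any(cx == x for cx, _ in c_rel) for x in b_rel)


def sat_composed(a_rel, c_rel):
    """A(x, y) -> exists z. C(x, z)"""
    return all(any(cx == x for cx, _ in c_rel) for x, _ in a_rel)


def composition_agrees():
    """Compare 'some intermediate B exists' with the composed tgd on all instances."""
    pairs = list(product(DOM, DOM))
    for a_rel in subsets(pairs):
        for c_rel in subsets(pairs):
            via_middle = any(
                sat_first(a_rel, b_rel) and sat_second(b_rel, c_rel)
                for b_rel in subsets(DOM)
            )
            if via_middle != sat_composed(a_rel, c_rel):
                return False
    return True


print(composition_agrees())  # True
```

The witness behind the agreement is the projection of A on its first column: taking B to be exactly that set satisfies the first tgd and asks no more of C than the composed tgd does.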


. . . and a Negative Result

Proposition

There exist two mappings M1 and M2, each defined with a finite set of source-to-target tgds, such that M1 ∘ M2 cannot be expressed in first-order logic (even with least fixed point).

Proof.

Composition query problem: given I and J, is ⟨I, J⟩ an instance of the composed schema mapping?

Reduction from 3-colorability.

NP-complete problem.

Q.E.D. (descriptive complexity theory)

Remark

Fagin's theorem implies that this can be defined as an existential second-order formula. More precise characterization of the language?


A Natural Language for Composition

Definition

A second-order tgd is of the form:

∃f1 … fm ((∀x1 (φ1 → ψ1)) ∧ … ∧ (∀xn (φn → ψn)))

where:

the fi are function symbols;
the φi are conjunctions of source relation atoms and equalities;
the ψi are conjunctions of target relation atoms

(additional safety conditions omitted).

Example

∀n∃s∀c (Takes(n,c) → Enrollment(s,c))

is equivalent to

∃f (∀n∀c (Takes(n,c) → Enrollment(f(n),c)))
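One way to read the function symbol f is as a lookup table that assigns each name n a single identifier, reused for every course of that name. The sketch below picks one such interpretation and materialises the Enrollment facts it forces. The function name, the generated labels "s1", "s2", ... and the sample data are all assumptions of this example.

```python
import itertools


def apply_so_tgd(takes):
    """Choose an interpretation of f and build {Enrollment(f(n), c) | Takes(n, c)}."""
    fresh = itertools.count(1)
    f = {}
    enrollment = set()
    for name, course in sorted(takes):
        if name not in f:
            f[name] = f"s{next(fresh)}"  # hypothetical identifier for f(name)
        enrollment.add((f[name], course))
    return f, enrollment


takes = {("alice", "db"), ("alice", "logic"), ("bob", "db")}
f, enrollment = apply_so_tgd(takes)

print(f)                   # {'alice': 's1', 'bob': 's2'}
print(sorted(enrollment))  # [('s1', 'db'), ('s1', 'logic'), ('s2', 'db')]
```

Any other interpretation of f, including one that maps both names to the same identifier, would satisfy the formula equally well; the existential quantifier over f only asks that some interpretation exists.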


Composition Theorem

Theorem

Second-order tgds are closed under composition.

Constructive proof.

All features of the language (conjunction, equalities, second-order quantifiers) are required.

Remark

Any finite set of s-t tgds can be represented as a single second-order tgd.


Composing Schema Mappings Second-Order tgds and Data Exchange

Properties of the Composition Operator

Two important properties for using second-order tgds in data exchange:

Polynomial-time chase for schema mappings defined by a second-order tgd.

PTIME computation of certain answers to unions of conjunctive queries.

Remark

Contrast with the fact that computing certain answers with arbitrary first-order mappings is undecidable.

Second-order tgds:

Powerful enough to include ordinary s-t tgds and to be closed under composition.

Restricted enough to get a PTIME algorithm for answering queries.


Inverses

Outline

1 Data Exchange

2 Composing Schema Mappings

3 Inverses

Motivation and Definition

Conditions

Computing Inverses

4 Quasi-Inverses

5 Conclusion


Inverses Motivation and Definition

Semantics of Inverses

Example

Let M12 be EDL(x, y, z) → ED(x, y) ∧ DL(y, z). Let M21 be ED(x, y) ∧ DL(y, z) → EDL(x, y, z).

Under what conditions do we obtain the original EDL relation?

Define Γ as EDL(x, y, z′) ∧ EDL(x′, y, z) → EDL(x, y, z).

We want M21 to be the inverse of M12 for precisely those instances I which satisfy Γ.
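A minimal sketch of the round trip in this example, assuming relations are encoded as Python sets of tuples. The function names and the sample tuples are hypothetical; only the three dependencies M12, M21 and Γ come from the slide. It illustrates that applying M12 and then M21 gives back the original EDL relation on an instance that satisfies Γ, and adds spurious tuples on one that does not.

```python
def apply_m12(edl):
    """Chase with M12: EDL(x, y, z) → ED(x, y) ∧ DL(y, z)."""
    ed = {(x, y) for (x, y, _z) in edl}
    dl = {(y, z) for (_x, y, z) in edl}
    return ed, dl


def apply_m21(ed, dl):
    """Chase with M21: ED(x, y) ∧ DL(y, z) → EDL(x, y, z)."""
    return {(x, y, z) for (x, y) in ed for (y2, z) in dl if y == y2}


def satisfies_gamma(edl):
    """Γ: EDL(x, y, z′) ∧ EDL(x′, y, z) → EDL(x, y, z)."""
    return all(
        (x, y, z) in edl
        for (x, y, _z1) in edl
        for (_x1, y2, z) in edl
        if y == y2
    )


def round_trip(edl):
    """Apply M12, then M21, to an EDL relation."""
    return apply_m21(*apply_m12(edl))


# Hypothetical instances: one satisfies Γ, the other does not.
good = {("ann", "sales", "paris"), ("bob", "sales", "paris")}
bad = {("ann", "sales", "paris"), ("bob", "sales", "rome")}

print(satisfies_gamma(good), round_trip(good) == good)
print(satisfies_gamma(bad), round_trip(bad) == bad)
```

On the second instance the join recombines every employee of a department with every location of that department, so the result is a strict superset of the original relation.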


Obvious Solution fails

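Let M12 be defined by Σ12.

Let S12 = {⟨I, J⟩ : ⟨I, J⟩ ⊨ Σ12}. Let S21 = {⟨J, I⟩ : ⟨I, J⟩ ∈ S12}.

Can we define the inverse as the mapping associated with the set S21?

No: if ⟨I, J⟩ ⊨ Σ12, then for all I′ ⊆ I and J′ ⊇ J, we also have ⟨I′, J′⟩ ⊨ Σ12. This closure property (shrink the source, grow the target) will not hold in general for the pairs in S21, where the roles of source and target are swapped.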
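The closure argument can be checked on a toy mapping. The sketch below assumes a single hypothetical s-t tgd P(x) → Q(x), which is not taken from the slides, with the source instance I and the target instance J encoded as plain sets of values. The names `in_s12` and `in_s21` are illustrative.

```python
def in_s12(i, j):
    """⟨I, J⟩ ⊨ Σ12 for the single tgd P(x) → Q(x)."""
    return i <= j  # every P-fact must have a matching Q-fact


def in_s21(j, i):
    """⟨J, I⟩ ∈ S21 exactly when ⟨I, J⟩ ∈ S12."""
    return in_s12(i, j)


i, j = {1, 2}, {1, 2}
assert in_s12(i, j)

# S12: shrinking the source and growing the target preserves membership.
shrunk_source_s12 = in_s12({1}, {1, 2, 3})

# S21: J is now the source. Shrinking it while keeping the target I
# should preserve membership if S21 were given by s-t tgds, but it does not.
shrunk_source_s21 = in_s21({1}, i)

print(shrunk_source_s12, shrunk_source_s21)
```

The pair ⟨{1, 2}, {1, 2}⟩ belongs to S21, yet ⟨{1}, {1, 2}⟩ does not, so S21 lacks the closure property that every mapping defined by s-t tgds has.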


Definition

Definition

We say two schema mappings are equivalent on I, if they have the same solutions for I.

Definition

Let MId = (S1, Ŝ1, ΣId) be the identity mapping, where Ŝ1 is a replica of S1. Let M12 = (S1, S2, Σ12) and M21 = (S2, Ŝ1, Σ21) be schema mappings. Let σ be the composition formula Σ12 ∘ Σ21 and let M11 = (S1, Ŝ1, σ). Let I be an instance of S1. Then M21 is an inverse of M12 for I if M11 and MId are equivalent on I.


Local and Global Inverses

We call such an inverse a local inverse.

If S is a class of instances such that M21 is an inverse of M12 for each I in S, then we call it an S-inverse.

If S is the class of all instances, we call it a global inverse.


Inverses Conditions

Unique Solutions Property

We would like to know when global inverses exist.

If M21 is an inverse of M12, and I1 and I2 are distinct source instances, then the solutions of I1 and I2 are different under M12.

Unique solutions property: a mapping M12 has it if, whenever I1 and I2 are distinct source instances, their solution sets are distinct.

This is necessary for global inverses to exist.

For LAV schema mappings it is also sufficient.
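The unique solutions property can be explored by brute force on small cases. The sketch below makes two simplifying assumptions that are not in the slides: the mapping uses only full tgds (no existential variables), so the solutions of I are exactly the supersets of the chase result and comparing chase results is enough; and only instances over a tiny finite domain are enumerated. A collision therefore refutes the property, while the absence of one is only evidence for it. All function names are hypothetical.

```python
from itertools import chain, combinations, product


def all_instances(domain, arity):
    """Every instance of one relation of the given arity over a finite domain."""
    tuples = list(product(domain, repeat=arity))
    subsets = chain.from_iterable(
        combinations(tuples, k) for k in range(len(tuples) + 1)
    )
    return [frozenset(s) for s in subsets]


def has_unique_solutions(chase, domain, arity):
    """True if no two distinct instances over `domain` share a chase result."""
    seen = set()
    for instance in all_instances(domain, arity):
        image = chase(instance)
        if image in seen:
            return False
        seen.add(image)
    return True


def copy_mapping(p):
    """P(x, y) → Q(x, y)."""
    return frozenset(p)


def project_mapping(p):
    """P(x, y) → Q(x)."""
    return frozenset(x for (x, _y) in p)


def decompose_mapping(edl):
    """EDL(x, y, z) → ED(x, y) ∧ DL(y, z), the mapping M12 from the earlier example."""
    ed = frozenset((x, y) for (x, y, _z) in edl)
    dl = frozenset((y, z) for (_x, y, z) in edl)
    return (ed, dl)


print(has_unique_solutions(copy_mapping, [0, 1], 2))
print(has_unique_solutions(project_mapping, [0, 1], 2))
print(has_unique_solutions(decompose_mapping, [0, 1], 3))
```

The projection loses the second column, and the decomposition cannot distinguish an EDL instance from its join closure, so in both cases two distinct source instances share a solution set and no global inverse can exist.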

