• Aucun résultat trouvé

slides (pdf)

N/A
N/A
Protected

Academic year: 2022

Partager "slides (pdf)"

Copied!
44
0
0

Texte intégral

(1)

Fuzzy Constraint-based Schema Matching Formulation

Alsayed Algergawy, Eike Schallehn, Gunter Saake {alshahat|eike|saake}@ovgu.de

Department of Computer Science University of Magdeburg

Magdeburg, Germany

The First Workshop in Advanced Deep Web 2008 ADW2008

(2)

Road Map

1. Motivations

2. Preliminaries 3. Schema Graphs

4. Schema Matching as an FCOP 5. Summary and Future Work

(3)

Motivations

Schema matchingis defined as the task of identifying the semantic correspondences from heterogeneous data sources

Current Approaches

Lack of formulation

Discovering simple mappings

Matching Performance

Matching Scalability

Uncertainty

Therefore, we need a formalization framework that enables us to cope with:

Discovering complex mappings as well as simple mappings

Trading-off between two performance aspects—matching effectiveness and matching efficiency

Dealing with schema matching uncertainty

(4)

Motivations

Schema matchingis defined as the task of identifying the semantic correspondences from heterogeneous data sources

Current Approaches

Lack of formulation

Discovering simple mappings

Matching Performance

Matching Scalability

Uncertainty

Therefore, we need a formalization framework that enables us to cope with:

Discovering complex mappings as well as simple mappings

Trading-off between two performance aspects—matching effectiveness and matching efficiency

Dealing with schema matching uncertainty

(5)

Road Map

1. Motivations 2. Preliminaries

3. Schema Graphs

4. Schema Matching as an FCOP 5. Summary and Future Work

(6)

Preliminaries

Our fuzzy constraint optimization framework is based on:

Rooted labeled graphs

Constraint programming

(7)

Rooted Labeled Graphs

Schemas to ba matched can be modeled as rooted labeled graphs called schema graphs SG

G= (NG,EG,LabG,src,tar,l)

NG={nroot,n2, ...,nn}Va finite set of nodes

EG={(ni,nj)|ni,nj NG}Va finite set of edges,

LabG={LabNG,LabEG}Va finite set of node labelsLabNG, and a finite set of edge labelsLabEG

srcandtar:EG7→NGVtwo mappings source and target,

l:NGEG 7→LabGVa mapping label assigning

(8)

Rooted Labeled Graphs

Schemas to ba matched can be modeled as rooted labeled graphs called schema graphs SG

G= (NG,EG,LabG,src,tar,l)

NG={nroot,n2, ...,nn}Va finite set of nodes

EG={(ni,nj)|ni,nj NG}Va finite set of edges,

LabG={LabNG,LabEG}Va finite set of node labelsLabNG, and a finite set of edge labelsLabEG

srcandtar:EG7→NGVtwo mappings source and target,

l:NGEG 7→LabGVa mapping label assigning

(9)

Rooted Labeled Graphs

Schemas to ba matched can be modeled as rooted labeled graphs called schema graphs SG

G= (NG,EG,LabG,src,tar,l)

NG={nroot,n2, ...,nn}Va finite set of nodes

EG={(ni,nj)|ni,nj NG}Va finite set of edges,

LabG={LabNG,LabEG}Va finite set of node labelsLabNG, and a finite set of edge labelsLabEG

srcandtar:EG7→NGVtwo mappings source and target,

l:NGEG 7→LabGVa mapping label assigning

(10)

Rooted Labeled Graphs

Schemas to ba matched can be modeled as rooted labeled graphs called schema graphs SG

G= (NG,EG,LabG,src,tar,l)

NG={nroot,n2, ...,nn}Va finite set of nodes

EG={(ni,nj)|ni,nj NG}Va finite set of edges,

LabG={LabNG,LabEG}Va finite set of node labelsLabNG, and a finite set of edge labelsLabEG

srcandtar:EG7→NGVtwo mappings source and target,

l:NGEG 7→LabGVa mapping label assigning

(11)

Rooted Labeled Graphs

Schemas to ba matched can be modeled as rooted labeled graphs called schema graphs SG

G= (NG,EG,LabG,src,tar,l)

NG={nroot,n2, ...,nn}Va finite set of nodes

EG={(ni,nj)|ni,nj NG}Va finite set of edges,

LabG={LabNG,LabEG}Va finite set of node labelsLabNG, and a finite set of edge labelsLabEG

srcandtar:EG7→NGVtwo mappings source and target,

l:NGEG7→LabGVa mapping label assigning

(12)

Constraint Programming I

A lot of problems in computer science, most notably in AI, can be interpreted as special cases of constraint programming.

Semantic schema matching is an intelligent process

Therefore, constraint programming is a suitable framework for interpreting and understanding the schema matching problem

Types of constraint problems

Constraint Satisfaction ProblemCSP

Constraint Optimization ProblemCOP

Fuzzy Constraint Optimization ProblemFCOP

(13)

Constraint Programming I

A lot of problems in computer science, most notably in AI, can be interpreted as special cases of constraint programming.

Semantic schema matching is an intelligent process

Therefore, constraint programming is a suitable framework for interpreting and understanding the schema matching problem

Types of constraint problems

Constraint Satisfaction ProblemCSP

Constraint Optimization ProblemCOP

Fuzzy Constraint Optimization ProblemFCOP

(14)

Constraint Programming II

CSP P is a 3-tuple,

P= (X,D,C)

X is a finite set of variables

Dis a collection of finite domains

Cis a set of constraints

Constraint

Cs ⊆D1×...×Dr → {0,1} S={x1,x2, ...xr}

Solution of a CSP

An assignmentΛis a solution of aCSPif it satisfies all the constraints of the problem.

(15)

Constraint Programming II

CSP P is a 3-tuple,

P= (X,D,C)

X is a finite set of variables

Dis a collection of finite domains

Cis a set of constraints

Constraint

Cs ⊆D1×...×Dr → {0,1}

S={x1,x2, ...xr}

Solution of a CSP

An assignmentΛis a solution of aCSPif it satisfies all the constraints of the problem.

(16)

Constraint Programming II

CSP P is a 3-tuple,

P= (X,D,C)

X is a finite set of variables

Dis a collection of finite domains

Cis a set of constraints

Constraint

Cs ⊆D1×...×Dr → {0,1}

S={x1,x2, ...xr}

Solution of a CSP

An assignmentΛis a solution of aCSPif it satisfies all the constraints of the problem.

(17)

Constraint Programming III

COP COP Q is a 2-tuple,Q= (P,g)

Pis a CSP

gis an objective function

While powerful, bothCSP andCOP present some limitations

ALL constraints are mandatory (CRISP CONSTRAINTS)

Fuzzy Constraints: A fuzzy constraintCµis represented by the fuzzy relationRf, defined by

µR: Y

xi∈var(C)

Di →[0,1]

Fuzzy Constraint Optimization Problem FCOPQµis a 4-tuple Qµ= (X,D,Cµ,g)

(18)

Constraint Programming III

COP COP Q is a 2-tuple,Q= (P,g)

Pis a CSP

gis an objective function

While powerful, bothCSP andCOP present some limitations

ALL constraints are mandatory (CRISP CONSTRAINTS)

Fuzzy Constraints: A fuzzy constraintCµis represented by the fuzzy relationRf, defined by

µR: Y

xi∈var(C)

Di →[0,1]

Fuzzy Constraint Optimization Problem FCOPQµis a 4-tuple Qµ= (X,D,Cµ,g)

(19)

Constraint Programming III

COP COP Q is a 2-tuple,Q= (P,g)

Pis a CSP

gis an objective function

While powerful, bothCSP andCOP present some limitations

ALL constraints are mandatory (CRISP CONSTRAINTS)

Fuzzy Constraints: A fuzzy constraintCµis represented by the fuzzy relationRf, defined by

µR: Y

xi∈var(C)

Di→[0,1]

Fuzzy Constraint Optimization Problem FCOPQµis a 4-tuple Qµ= (X,D,Cµ,g)

(20)

Constraint Programming III

COP COP Q is a 2-tuple,Q= (P,g)

Pis a CSP

gis an objective function

While powerful, bothCSP andCOP present some limitations

ALL constraints are mandatory (CRISP CONSTRAINTS)

Fuzzy Constraints: A fuzzy constraintCµis represented by the fuzzy relationRf, defined by

µR: Y

xi∈var(C)

Di→[0,1]

Fuzzy Constraint Optimization Problem FCOPQµis a 4-tuple Qµ= (X,D,Cµ,g)

(21)

Road Map

1. Motivations 2. Preliminaries 3. Schema Graphs

4. Schema Matching as an FCOP 5. Summary and Future Work

(22)

A Unified Schema Matching Framework

(23)

Transformation Rules

Everyprepared matching objectin a schema such as schema, relations, elements, attributes etc. is represented bya nodein the schema graph

Thefeaturesof the prepared matching object are represented by node labels LabNG

Therelationshipbetween two prepared matching objects is represented byan edge of the schema graph

Thefeaturesof the relationship between prepared objects are represented byedge labels LabEG

(24)

Schema Graph Example I

Relational Schema Schema S

create table Personnel(

Pno int primary key, Pname string,

Dept string, Born date);

Schema Graph

(25)

Schema Graph Example I

Relational Schema Schema S

create table Personnel(

Pno int primary key, Pname string,

Dept string, Born date);

Schema Graph

(26)

Schema Graph Example II

Relational Schema Schema T

create table Employee(

EmpNo int primary key, EmpName varchar(20),

DeptNo int REFERENCES Department, Salary int,

BirthDate date);

create table Department(

DeptNo int primary key, DeptName varchar(30));

Schema Graph

(27)

Schema Graph Example II

Relational Schema Schema T

create table Employee(

EmpNo int primary key, EmpName varchar(20),

DeptNo int REFERENCES Department, Salary int,

BirthDate date);

create table Department(

DeptNo int primary key, DeptName varchar(30));

Schema Graph

(28)

Road Map

1. Motivations 2. Preliminaries 3. Schema Graphs

4. Schema Matching as an FCOP 5. Summary and Future Work

(29)

Schema Matching as Graph Matching I

The schema matching problem is converted into graph matching

Graph Morphism;N16=N2(schema matching)

Graph Homomorphism;N1=N2

Graph Morphism

φ:SG1→SG2

SG1= (NGS,EGS,LabGS,srcS,tarS,lS) SG2= (NGT,EGT,LabGT,srcT,tarT,lT)

φ= (φN, φE)such thatφN:NGS →NGTE :EGS →EGT 1. ∀nNGSlS(n) =lTN(n))(node label preserving) 2. ∀eEGSlS(e) =lTE(e))(edge label preserving) 3. ∀eEGSa pathp0 NGT×EGTsuch thatp0=φE(e)and

φN(srcS(e)) =srcTE(e))φN(tarS(e)) =tarTE(e)). (graph structure preserving)

(30)

Schema Matching as Graph Matching I

The schema matching problem is converted into graph matching

Graph Morphism;N16=N2(schema matching)

Graph Homomorphism;N1=N2

Graph Morphism

φ:SG1→SG2

SG1= (NGS,EGS,LabGS,srcS,tarS,lS) SG2= (NGT,EGT,LabGT,srcT,tarT,lT)

φ= (φN, φE)such thatφN :NGS →NGTE :EGS →EGT

1. ∀nNGSlS(n) =lTN(n))(node label preserving) 2. ∀eEGSlS(e) =lTE(e))(edge label preserving) 3. ∀eEGSa pathp0 NGT×EGTsuch thatp0=φE(e)and

φN(srcS(e)) =srcTE(e))φN(tarS(e)) =tarTE(e)). (graph structure preserving)

(31)

Schema Matching as Graph Matching I

The schema matching problem is converted into graph matching

Graph Morphism;N16=N2(schema matching)

Graph Homomorphism;N1=N2

Graph Morphism

φ:SG1→SG2

SG1= (NGS,EGS,LabGS,srcS,tarS,lS) SG2= (NGT,EGT,LabGT,srcT,tarT,lT)

φ= (φN, φE)such thatφN :NGS →NGTE :EGS →EGT 1. ∀nNGSlS(n) =lTN(n))(node label preserving) 2. ∀eEGSlS(e) =lTE(e))(edge label preserving) 3. ∀eEGSa pathp0 NGT×EGT such thatp0=φE(e)and

φN(srcS(e)) =srcTE(e))φN(tarS(e)) =tarTE(e)). (graph

(32)

Schema Matching as Graph Matching II

Graph matching is considered to be one of the most complex problems in computer science. Its complexity is due to two major problems:-

The time complexity

The fact that all of the algorithms for graph matching found so far can only be applied to two graphs at a time.

To tackle these challenges, as well as the mentioned motivations, we decide to extend graph matching into an FCOP

(33)

Schema Matching as Graph Matching II

Graph matching is considered to be one of the most complex problems in computer science. Its complexity is due to two major problems:-

The time complexity

The fact that all of the algorithms for graph matching found so far can only be applied to two graphs at a time.

To tackle these challenges, as well as the mentioned motivations, we decide to extend graph matching into an FCOP

(34)

Graph Matching as an FCOP

Graph matching→an FCOP using the following rules:

take theobjects of one schema graphto be matched as theCPs set of variables,

take theobjects of the other schema graphto be matched as the variables domain

find a proper translation of theconditions that apply to a schema matchinginto aset of constraints, and

form theobjective functionsto be optimized.

(35)

Schema Matching as an FCOP: Example

The set of variables X:

X=XNXE

={xn1,xn2,xn3,xn4,xn5,xn6} ∪ {xe12,xe23,xe24,xe25,xe26}

={xn1,xn2,xn3,xn4,xn5,xn6,xe12,xe23,xe24,xe25,xe26}

The set of domain D:

D=NGT EGT

={Dn1,Dn2,Dn3,Dn4,Dn5,Dn6} ∪ {De12,De23,De24,De25,De26}

={Dn1,Dn2,Dn3,Dn4,Dn5,Dn6,De12,De23,De24,De25,De26} Dn1=Dn2=Dn3=Dn4=Dn5=Dn6= {n1T,n2T,n3T,n4T,n5T,n6T,n7T,n8T,n9T,n10T}

(36)

Schema Matching as an FCOP: Example

The set of variables X:

X=XNXE

={xn1,xn2,xn3,xn4,xn5,xn6} ∪ {xe12,xe23,xe24,xe25,xe26}

={xn1,xn2,xn3,xn4,xn5,xn6,xe12,xe23,xe24,xe25,xe26}

The set of domain D:

D=NGT EGT

={Dn1,Dn2,Dn3,Dn4,Dn5,Dn6} ∪ {De12,De23,De24,De25,De26}

={Dn1,Dn2,Dn3,Dn4,Dn5,Dn6,De12,De23,De24,De25,De26}

Dn1=Dn2=Dn3=Dn4=Dn5=Dn6= {n1T,n2T,n3T,n4T,n5T,n6T,n7T,n8T,n9T,n10T}

(37)

Schema Matching as an FCOP: Example

The set of variables X:

X=XNXE

={xn1,xn2,xn3,xn4,xn5,xn6} ∪ {xe12,xe23,xe24,xe25,xe26}

={xn1,xn2,xn3,xn4,xn5,xn6,xe12,xe23,xe24,xe25,xe26}

The set of domain D:

D=NGT EGT

={Dn1,Dn2,Dn3,Dn4,Dn5,Dn6} ∪ {De12,De23,De24,De25,De26}

={Dn1,Dn2,Dn3,Dn4,Dn5,Dn6,De12,De23,De24,De25,De26} Dn1=Dn2=Dn3=Dn4=Dn5=Dn6=

(38)

Constraint Construction

Syntactic constraints

Domain Constraint

Cµ(xdom

ni)={di DNi}

Cµ(xdom

ei)={di DEi}

Structural Constraints

Parent Constraint Cparentµ(x

ni,xnj)={(di,dj)DN×DN| ∃e (di, dj) s.t. src(e)=di}

Child Constraint Cchildµ(x

ni,xnj)={(di,dj)DN×DN| ∃e (di, dj) s.t. tar(e)=dj}

Semantic constraints

Labeled Constraints Cµ(xLab

i)={dj DN|lsim(lS(xi),lT(dj))≥t } Cµ(xLab

i)={dj DE|lsim(lS(xi),lT(dj))t}

(39)

Constraint Construction

Syntactic constraints

Domain Constraint

Cµ(xdom

ni)={di DNi}

Cµ(xdom

ei)={di DEi}

Structural Constraints

Parent Constraint Cparentµ(x

ni,xnj)={(di,dj)DN×DN| ∃e (di, dj) s.t. src(e)=di}

Child Constraint Cchildµ(x

ni,xnj)={(di,dj)DN×DN| ∃e (di, dj) s.t. tar(e)=dj}

Semantic constraints

Labeled Constraints Cµ(xLab

i)={dj DN|lsim(lS(xi),lT(dj))≥t } Cµ(xLab

i)={dj DE|lsim(lS(xi),lT(dj))t}

(40)

Objective Function Construction

is the function associated with the optimization process

constitutes the implementation of the problem to be solved.

The input parameters are the object parameters

The output is the objective value representing the evaluation/quality of the individual

g =min|max( X

setofconstraint

fcost + X

setofassignment

fenergy)

(41)

Road Map

1. Motivations 2. Preliminaries 3. Schema Graphs

4. Schema Matching as an FCOP 5. Summary and Future Work

(42)

Summary and Future Work

Building a conceptual connection between the schema matching problem and fuzzy constraint optimization problem

Developing a formal framework for the SMP, which

generic framework; model and domain independent

able to handle uncertainty

able to cope with complex mappings

Benefits behind formulation:

Increase our understanding of the problem

Help mapping of the problem into another well-known problem

Open a path to adopt of different existing algorithms

Guide the initial design of the schema matching prototype

Future work?? Implementation, evaluation, and comparison with other mainstream systems

(43)

Thank You

Questions??

(44)

Thank You

Questions??

Références

Documents relatifs

A Declarative Framework for Security: Secure Concurrent Constraint Programming..

on Principles and Practice of Con- straint Programming (CP’95), volume 976 of Lecture Notes in Computer Science, pages 362–379. Abstracting synchronization in con- current

Let us consider for example the extended graph edit distance (EGED) defined in [14]. This distance extends GE D by adding two edit operations for splitting and merging nodes.

Introduction (You are here) Video iPhone Slides Handout First Steps - Hello World Video iPhone Slides Handout Application Overview Video iPhone Slides Handout Basic Constraint

Now it seemed to us that a constraint programming (cp) approach should be able to limit, indeed to avoid, these disadvantages. Moreover, it constitutes a new solving strategy for

L’hématite est un catalyseur qui a montré une grande efficacité lors de la dégradation par UV, cependant nous avons testé son efficacité dans le traitement de méthyl violet 2B par

Section 3 is devoted to the Queens n 2 problem, including preliminary results obtained by a solution approach based on the previously described method.. Finally, Section 4

Given a sequential branching algorithm such as that described in Section 3.2, every node defines one partial transfer plan φ t of size t (t being the number of transfers that have