
7.3 Alignment combination operators

7.3.1 Alignment composition

To run some alignment methods or processes on heterogeneous resources, a starting set of correspondences is sometimes needed as a parameter for building more sophisticated alignments [Shvaiko & Euzenat 2013].

The starting set of correspondences is either created manually or built by composing other alignments. The composition of alignment resources is an operator that creates a new alignment from two existing alignments sharing a common resource. To define this operator, we first define composition at the correspondence level and then at the alignment level.

7.3.1.1 Composing correspondences

Given two correspondences $c_1 = \langle e_x, e_y, R_1 \rangle$ and $c_2 = \langle e_t, e_z, R_2 \rangle$, where $R_1 = \{(r_1, w_1), \dots, (r_n, w_n)\}$, $R_2 = \{(s_1, b_1), \dots, (s_m, b_m)\}$ and $e_y = e_t$, the composition of $c_1$ and $c_2$ is a correspondence:

$$c_{comp} = Comp_\zeta(c_1, c_2) = \langle e_x, e_z, R_{comp} \rangle$$

where

$$R_{comp} = \{(r_i * s_j,\ w_i \,\zeta\, w_j) \mid i = 1, \dots, n;\ j = 1, \dots, m\}$$

Here $*$ is the composition operator for the considered relation algebra (e.g. Table 7.1), and $\zeta$ is an associative operator for combining the confidence levels (in our case max-min).

In addition to the alignment relations defined by the A5 algebra, our approach considers the semantic relations used in large alignment repositories (such as Bioportal [Noy et al. 2008]). We therefore created a similar table for composing semantic relations (such as “exactMatch”, “narrowMatch”, etc.), using the SKOS entailment rules to define it. For composing heterogeneous alignment relations (logical with semantic), we created transformation tables from one type of relation to the other.

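| ∘ | ⊏ | ⊐ | ≡ | ⊥ | ≬ |
|---|---|---|---|---|---|
| ⊏ | ⊏ | ⊤ | ⊏ | ⊥ | ⊏, ⊥, ≬ |
| ⊐ | ⊏, ⊐, ≡, ≬ | ⊐ | ⊐ | ⊐, ⊥, ≬ | ⊐, ≬ |
| ≡ | ⊏ | ⊐ | ≡ | ⊥ | ≬ |
| ⊥ | ⊏, ⊥, ≬ | ⊥ | ⊥ | ⊤ | ⊏, ⊥, ≬ |
| ≬ | ⊏, ≬ | ⊐, ⊥, ≬ | ≬ | ⊐, ⊥, ≬ | ⊤ |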

Table 7.1: Composition table for logical relations as defined by [Euzenat 2008]

Condition 2 Let $\Gamma$ be the set of individual alignment relations of a certain category (logical, semantic or other). Let $\Theta$ be the power set of $\Gamma$ ($\Theta = 2^\Gamma$).

The set of alignment relations $\Theta$ should be closed under the composition operator ($*$): $* : \Theta \times \Theta \to \Theta$
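The closure condition can be illustrated with a minimal Python sketch. It encodes the logical composition table (Table 7.1) and lifts it to sets of base relations by taking the union of the pairwise results; this lifting is the usual one in relation algebras and is an assumption here, since the text does not spell it out. The short names `lt`, `gt`, `eq`, `dj`, `ov` (subsumed by, subsumes, equivalent, disjoint, overlaps) and the helpers `rel` and `compose` are hypothetical.

```python
from itertools import product

# Hypothetical short names for the five base relations of Table 7.1:
# "lt" subsumed by, "gt" subsumes, "eq" equivalent, "dj" disjoint, "ov" overlaps.
BASE = ("lt", "gt", "eq", "dj", "ov")
TOP = frozenset(BASE)  # the "any relation" entry of the table


def rel(*names):
    """Build an element of Theta, i.e. a set of base relations."""
    return frozenset(names)


# Rows are the left operand, columns the right operand, in BASE order.
_ROWS = {
    "lt": [rel("lt"), TOP, rel("lt"), rel("dj"), rel("lt", "dj", "ov")],
    "gt": [rel("lt", "gt", "eq", "ov"), rel("gt"), rel("gt"),
           rel("gt", "dj", "ov"), rel("gt", "ov")],
    "eq": [rel("lt"), rel("gt"), rel("eq"), rel("dj"), rel("ov")],
    "dj": [rel("lt", "dj", "ov"), rel("dj"), rel("dj"), TOP,
           rel("lt", "dj", "ov")],
    "ov": [rel("lt", "ov"), rel("gt", "dj", "ov"), rel("ov"),
           rel("gt", "dj", "ov"), TOP],
}
TABLE = {(r, s): cell for r, row in _ROWS.items() for s, cell in zip(BASE, row)}


def compose(R, S):
    """Lift the table to sets of base relations (assumed: union over all pairs)."""
    out = set()
    for r, s in product(R, S):
        out |= TABLE[(r, s)]
    return frozenset(out)
```

Every result of `compose` is a subset of `BASE`, hence again an element of the power set, which is what the closure condition asks for.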

Many fuzzy composition operators are defined in the literature [Portilla et al. 2000]. We define a composition operator that uses fuzzy relation composition and implements different methods for combining confidence measures, such as the max-min and max-product compositions [Loetamonphong & Fang 2001].

To calculate the new confidence attributed to each composed alignment relation, we use the fuzzy relation interpretation (see section 7.2.2.1) and the “max-min” composition of fuzzy sets [Zadeh 1971, Abbasbandy et al. 2006]. This composition is associative by definition; consequently, alignment composition is associative as well.

The confidence measure attributed to the relation $r_i * s_j$, which we denoted as $w_i \,\zeta\, w_j$, is calculated as follows:

$$w_i \,\zeta\, w_j = \mu^{r_i * s_j}_{A_1 \circ A_2}(e_x, e_z) = \max\bigl(\min\bigl(\mu^{r_i}_{A_1}(e_x, e_y),\ \mu^{s_j}_{A_2}(e_y, e_z)\bigr)\bigr)$$

where $\mu^{r_i}_{A_1}(e_x, e_y)$ is the membership function of the fuzzy relation between $e_x$ and $e_y$ for the specific alignment relation $r_i$, and $\mu^{R}_{A_1 \circ A_2}$ is the membership function that calculates the confidence measure of the resulting fuzzy relation $R$.
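A minimal Python sketch of this computation follows. It assumes the standard max-min composition of fuzzy relations, in which the maximum is taken over all shared intermediate entities; the formula above writes only "max", so that reading is an assumption. The function name `compose_fuzzy`, the dictionary layout and the sample degrees are hypothetical.

```python
import operator


def compose_fuzzy(mu1, mu2, combine=min):
    """Compose two fuzzy relations given as dicts.

    mu1 maps (x, y) to a membership degree, mu2 maps (y, z) to one.
    With combine=min this is the max-min composition; passing
    operator.mul gives the max-product variant.
    """
    result = {}
    for (x, y1), a in mu1.items():
        for (y2, z), b in mu2.items():
            if y1 != y2:
                continue  # only chain through a shared middle entity
            degree = combine(a, b)
            # keep the best degree over all middle entities
            result[(x, z)] = max(result.get((x, z), 0.0), degree)
    return result


# Hypothetical membership degrees for one fixed pair of relations.
mu_a1 = {("City", "Town"): 0.6, ("City", "Village"): 0.9}
mu_a2 = {("Town", "Ciudad"): 0.5, ("Village", "Ciudad"): 0.2}

maxmin = compose_fuzzy(mu_a1, mu_a2)                         # max(min(.6,.5), min(.9,.2))
maxprod = compose_fuzzy(mu_a1, mu_a2, combine=operator.mul)  # max(.6*.5, .9*.2)
```

Passing the combination function as a parameter mirrors the role of ζ: the same traversal serves both the max-min and the max-product variants.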

For example, the composition of

⟨City, Town, {({=}, 0.2), ({⊐}, 0.6)}⟩ and

⟨Town, Ciudad, {({=}, 0.3), ({⊏}, 0.6)}⟩ is

⟨City, Ciudad, {({=} ∗ {=}, 0.2), ({=} ∗ {⊏}, 0.2), ({⊐} ∗ {=}, 0.3), ({⊐} ∗ {⊏}, 0.6)}⟩

= ⟨City, Ciudad, {({=}, 0.2), ({⊏}, 0.2), ({⊐}, 0.3), ({⊐, ⊏, =, ≬}, 0.6)}⟩
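The example can be replayed with a short Python sketch. It is an illustration under stated assumptions, not the implementation used in this work: the table holds only the four entries of the composition table that this example needs, the ASCII names `=`, `<`, `>`, `o` (equivalent, subsumed by, subsumes, overlaps) are hypothetical, and confidences are combined with `min`.

```python
from itertools import product

# Excerpt of the composition table: only the four entries used below.
# Hypothetical ASCII names: "=" equivalent, "<" subsumed by,
# ">" subsumes, "o" overlaps.
TABLE = {
    ("=", "="): frozenset({"="}),
    ("=", "<"): frozenset({"<"}),
    (">", "="): frozenset({">"}),
    (">", "<"): frozenset({"<", ">", "=", "o"}),
}


def compose_correspondences(c1, c2, zeta=min):
    """Compose <ex, ey, R1> with <et, ez, R2>; requires ey == et."""
    (ex, ey, R1), (et, ez, R2) = c1, c2
    if ey != et:
        raise ValueError("correspondences do not share a middle entity")
    # One composed relation per pair (r_i, s_j), with combined confidence.
    r_comp = [(TABLE[(r, s)], zeta(w, b)) for (r, w), (s, b) in product(R1, R2)]
    return (ex, ez, r_comp)


c1 = ("City", "Town", [("=", 0.2), (">", 0.6)])
c2 = ("Town", "Ciudad", [("=", 0.3), ("<", 0.6)])
composed = compose_correspondences(c1, c2)
```

`composed` holds the same four relation and confidence pairs as the result of the example, in the same order.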

[Diagram; labels: A_i, A_j, Composition, Normalize_Bel, Compute Mass and Plausibility, Aggregate Correspondences, Aggregation method, A_k]

Figure 7.3: Composition of two alignments

Normalizing resulting confidence measures for fuzzy relations

[Diagram; node labels: e_y, e_t, e_s; edge labels: (⊂, α₁), (≡, β₂), (≡, β₁), (≡, γ₂), (⊂, γ₁), (≡, α₂)]

Figure 7.4: Multiple paths for alignment composition

A correspondence may carry several alignment relations (the alignment representation model does not exclude conjunctions within alignment relations). In addition, several pairs of correspondences may lead to the same composition when there are multiple common entities, as illustrated in Figure 7.4, and the composition of each pair of alignment relations may yield the same relation (cf. composition table). For instance, the compositions ⊂ ∗ ≡, ⊂ ∗ ⊂ and ≡ ∗ ⊂ all give the same composed relation, ⊂. A confidence therefore has to be calculated for each composed relation.

In this case, we apply a simple aggregation that groups all the computed confidence measures using the fuzzy “max” aggregator. Thus, if multiple composition paths or multiple relation compositions have led to the same composed relation r ∈ Θ, we group the results by r and aggregate the membership functions using the “max” aggregator.

To aggregate the membership functions that have been computed for each relation between the two entities $e_i$ and $e_j$, we use the fuzzy set interpretation (see section 7.2.2.2). This means that each relation $r_k$ of $R_{comp}$, which denotes the set of true relations between both entities, has a membership function $\mu_{A_{comp}}(r_k)$.

Let $c_1 = \langle e_x, e_y, R_1 \rangle$ and $c_2 = \langle e_y, e_z, R_2 \rangle$ be two alignment correspondences. The composition of $c_1$ and $c_2$ is a correspondence $c_{comp} = \langle e_x, e_z, R_{comp} \rangle$ where

$$R_{comp} = \{(r_k, \mu_{A_{comp}}(r_k)) \mid r_k = r_i * s_j,\ r_i \in R_1,\ s_j \in R_2,\ i = 1, \dots, n,\ j = 1, \dots, m\}.$$

Two normalizations are applied:

1. If $R_{comp}$ contains pairs of fuzzy relations $(r_k, \mu_{A_{comp}}(r_k))$, $(r_s, \mu_{A_{comp}}(r_s))$ where $r_k = r_s = r$, then we group these relations in a subset $R$, apply a max triangular co-norm to aggregate their membership values, and remove all duplicates from $R_{comp}$:

$$Norm_{max}(R_{comp}) = \{(r, \max_{r_i \in R} \mu_{A_{comp}}(r_i))\}$$

2. If $R_{comp}$ contains pairs of fuzzy relations $(r_k, \mu_{A_{comp}}(r_k))$, $(r_s, \mu_{A_{comp}}(r_s))$ where $r_k \subseteq r_s$ and $\mu_{A_{comp}}(r_k) \geq \mu_{A_{comp}}(r_s)$, then $r_s$ is removed from the list of relations, since the first one entails it.
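The two normalizations can be sketched in Python as follows. This is an illustrative reading of the rules above, assuming each relation is represented as a `frozenset` of base relation names so that the subset test expresses entailment; the function name `normalize`, the ASCII relation names and the sample degrees are hypothetical.

```python
def normalize(pairs):
    """Apply both normalizations to an iterable of (relation, degree) pairs.

    Each relation is a frozenset of base relation names.
    """
    # Normalization 1: group identical relations and keep the max degree.
    best = {}
    for r, mu in pairs:
        best[r] = max(mu, best.get(r, 0.0))

    # Normalization 2: drop r_s when a different relation r_k, with
    # r_k a subset of r_s, has a degree at least as high.
    return {
        rs: mu_s
        for rs, mu_s in best.items()
        if not any(
            rk != rs and rk <= rs and mu_k >= mu_s
            for rk, mu_k in best.items()
        )
    }


# Hypothetical input: "<" appears twice, and {"<", "="} is entailed by "<".
pairs = [
    (frozenset({"<"}), 0.4),
    (frozenset({"<"}), 0.7),
    (frozenset({"<", "="}), 0.5),
    (frozenset({">", "o"}), 0.3),
]
normalized = normalize(pairs)
```

The second step compares every relation against the grouped result of the first step rather than against a progressively shrinking list, so the outcome does not depend on the order of the input pairs.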

7.3.1.2 Composing Alignments

Let $A_1$ and $A_2$ be two alignments. We define:

• $Paths(e_x, e_z, A_1, A_2) = \{(c_1, c_2) \mid \exists e_y\, \exists R\, \exists S : c_1 = \langle e_x, e_y, R \rangle \in A_1 \text{ and } c_2 = \langle e_y, e_z, S \rangle \in A_2\}$ as the set of pairs of correspondences associating $e_x$ to $e_z$;

• $Corr_\zeta(e_x, e_z) = \{Comp_\zeta(c_i, c_j) \mid (c_i, c_j) \in Paths(e_x, e_z, A_1, A_2)\}$ as the set of conflicting correspondences resulting from composing each pair of correspondences from $Paths(e_x, e_z, A_1, A_2)$, where $\zeta$ is the confidence combination function (max-min);

• if $Paths(e_x, e_z, A_1, A_2)$ is empty, then $Corr_\zeta(e_x, e_z) = \emptyset$.

Finally, we define the composition of two alignments $A_1$ and $A_2$, aligning respectively $R_1$ to $R_2$ and $R_2$ to $R_3$, as:

$$Comp_{\phi\zeta}(A_1, A_2) = Aggr_\phi(\{Corr_\zeta(e_x, e_z) \mid e_x \in R_1,\ e_z \in R_3\})$$

where $Aggr_\phi$ is an operator that normalizes conflicting correspondences, as defined in the following section.
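The alignment-level definitions can be sketched in Python as below. This is a sketch under several assumptions: an alignment is a list of `(source, target, [(relation, confidence), ...])` triples, the relation composition function is passed in as `compose_rel`, and the aggregation step simply keeps the maximum confidence per composed relation as a stand-in for the aggregation operator, whose actual definition is given in a later section. The names `compose_alignments`, `TOY_TABLE` and the sample data are hypothetical.

```python
from collections import defaultdict


def compose_alignments(A1, A2, compose_rel, zeta=min):
    """Compose two alignments through their shared middle entities."""
    # Index A2 by source entity so that paths through e_y are easy to find.
    by_source = defaultdict(list)
    for c2 in A2:
        by_source[c2[0]].append(c2)

    # Corr: for each (e_x, e_z), collect the pairs produced by every path.
    corr = defaultdict(list)
    for ex, ey, R1 in A1:
        for _, ez, R2 in by_source.get(ey, []):
            for r, w in R1:
                for s, b in R2:
                    corr[(ex, ez)].append((compose_rel(r, s), zeta(w, b)))

    # Stand-in aggregation: keep, per composed relation, the max confidence.
    result = []
    for (ex, ez), pairs in corr.items():
        best = {}
        for r, mu in pairs:
            best[r] = max(mu, best.get(r, 0.0))
        result.append((ex, ez, sorted(best.items())))
    return result


# Toy single-valued table over "=" (equivalent) and "<" (subsumed by).
TOY_TABLE = {("=", "="): "=", ("=", "<"): "<", ("<", "="): "<", ("<", "<"): "<"}

A1 = [("a", "b", [("<", 0.8)]), ("a", "c", [("=", 0.5)])]
A2 = [("b", "d", [("=", 0.6)]), ("c", "d", [("<", 0.9)])]
composed = compose_alignments(A1, A2, lambda r, s: TOY_TABLE[(r, s)])
```

In this toy input, two paths lead from `a` to `d` and both yield `<`, with confidences 0.6 and 0.5; the stand-in aggregation keeps 0.6. Entity pairs with no path never appear in the result, which corresponds to the empty case in the definition.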

Alignment composition is a complex operation, especially for ontological resources, with regard to consistency issues. At the current stage of our research, we define a composition operator and then apply an aggregation function to avoid inconsistency (only for treating conflicting correspondences). For further consistency checking, we assume that an external tool can be used.

From the document: An ontology-based repository for combining heterogeneous knowledge resources (pages 148-152)