On Estimating the Cardinality of Aggregate Views

Paolo Ciaccia, Matteo Golfarelli, Stefano Rizzi

DEIS - University of Bologna

40136 Bologna, Italy

{pciaccia, mgolfarelli, srizzi}@deis.unibo.it

Abstract

Accurately estimating the cardinality of aggregate views is crucial for logical and physical design of data warehouses. While the warehouse is under development and data are not available yet, the approaches based on accessing data cannot be adopted. This paper proposes an approach to estimate the cardinality of views based on a-priori information derived from the application domain.

We face the problem by first computing satisfactory bounds for the cardinality, then by capitalizing on these bounds to determine a good probabilistic estimate for it. Bounds are determined by using, besides the functional dependencies expressed by the multidimensional scheme, additional domain-derived information in the form of cardinality constraints which may bound either the cardinality of a given view or the ratio between the cardinalities of two given views. In particular, we propose a bounding strategy which achieves an effective trade-off between the tightness of the bounds produced and the computational complexity.

1 Introduction and Motivation

The multidimensional model is the foundation for data representation and querying in multidimensional databases and data warehouses [AGS97]. It represents facts of interest for the decision process into cubes in which each cell contains numerical measures which quantify the fact from different points of view, while each axis represents an interesting dimension for analysis. For instance, within a 4-dimensional cube modeling the phone calls supported by a telecommunication company, the dimensions might be the calling number, the number called, the date, and the time segment in which the call is placed; each cube cell could be associated to a measure of the total duration of the calls made from a given number to another number on a given time segment and date.

This work has been partially supported by the D2I MURST project.

The copyright of this paper belongs to the paper’s authors. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage.

Proceedings of the International Workshop on Design and Management of Data Warehouses (DMDW’2001)

Interlaken, Switzerland, June 4, 2001

(D. Theodoratos, J. Hammer, M. Jeusfeld, M. Staudt, eds.) http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-39/

The basic mechanism to extract significant information from the huge quantity of fine-grained data stored in base cubes is aggregation according to hierarchies of attributes rooted in dimensions [GL97]. In most application cases, cubes are significantly sparse (for instance, most pairs of telephone numbers are never connected by a call on a given date), and so are the aggregate views.

Accurately estimating the actual cardinality of each view is crucial for logical and physical design as well as for query processing and optimization [Vas00]. As a relevant case, consider the view materialization problem, where the aggregate views which are the most useful in answering the workload queries have to be selected for materialization (see [TB00] for a survey). Since the number of possible views which can be derived by aggregating a cube is exponential in the number of attributes, most approaches assume that a constraint on the total disk space occupied by materialization is posed, and attempt to find the subset of views which simultaneously satisfies this constraint and minimizes the workload cost [GR00, Gup97, HRU96]. Another case where estimation of view cardinalities is relevant is index selection [GHRU97].

If the data warehouse has already been loaded, view cardinalities can be quite accurately estimated by using statistical techniques based, say, on histograms [MD88] or sampling [HO91]. However, such techniques cannot be applied at all if the data warehouse is still under development, and the estimation of view cardinalities is needed for design purposes. To obviate this, current approaches are based on estimation models that only exploit the cardinality of the base cube and that of the single attribute domains [RS97, SDNR96], which however leads to significant overestimation.

In this paper we propose a novel approach to estimate the cardinality of views based on a-priori information derived from the application domain. Similarly to what is done when estimating the cardinality of projections in relational databases [CM95], we face the problem by first computing satisfactory bounds for the cardinality, then by capitalizing on these bounds to determine a good probabilistic estimate for it. Besides the functional dependencies expressed by the multidimensional scheme, the bounds we determine also take into account additional domain-derived information expressed in the form of cardinality constraints, namely, bounds of the cardinality of some views and bounds (called k-dependencies) on the ratio between the cardinalities of two views. The computation of bounds is based on a bounding strategy, which is aimed at achieving an effective trade-off between the tightness of the bounds produced and the computational complexity.

The paper is organized as follows. After providing some basic definitions in Section 2, in Section 3 we introduce k-dependencies. In Section 4 we outline our overall approach to estimation and show its benefits with an example. Section 5 introduces the basic properties of bounds, proposes an efficient bounding strategy, and sketches a branch-and-bound approach to determine the upper bound of the cardinality of a given view when the cardinality constraints in input do not contain k-dependencies; besides, it discusses how the strategy introduced can be improved. Section 6 shows how the bounds derived may be used to improve the cardinality estimates. Finally, Section 7 discusses the most interesting open issues.

2 Background and Working Example

In this section we formalize the concept of view, define a partial ordering on the set of views, and present the application domain we will use as an example.

Definition 1 (Dimensional Scheme) We call dimensional scheme $D$ a couple $(U, F)$ where $U$ is a set of attributes and $F = \{A_i \to A_j \mid A_i, A_j \in U\}$ is a set of functional dependencies (FD’s) which relate the attributes of $U$ into a set of pairwise disjoint directed trees. We call dimensions the attributes $A_k \in U$ in which the trees are rooted, i.e., such that $\forall A_i \in U$, $(A_i \to A_k) \notin F$; let $dim(D) \subseteq U$ denote the set of dimensions of $D$.

Definition 2 (View) Let $D = (U, F)$ be a dimensional scheme. We call view on $D$ any subset of attributes $V \subseteq U$ such that $\forall A_i, A_j \in V$, $(A_i \to A_j) \notin F^+$, where $F^+$ denotes the set of all functional dependencies logically implied by $F$.

It should be noted that we are using the term view to denote the set of grouping attributes used for aggregation, while the “actual” views will typically include also one or more measures. This slight abuse in terminology is possible since we are interested in determining the cardinality of views, which only depends on the grouping attributes.

Definition 3 (Roll-up) Given the set $V_D$ of all possible views on $D$, we define on $V_D$ the roll-up partial order $\leq$ as follows: $V \leq W$ iff $\forall A_i \in V\ \exists A_j \in W \mid (A_j \to A_i) \in F^+$, i.e., iff $W \to V$. We call multidimensional lattice for $D$ the corresponding lattice, whose top and bottom elements are $dim(D)$ and the empty view $\{\}$, respectively.

We will denote with $V \oplus W$ the view that is the least upper bound of $V$ and $W$ in the lattice; given a set of views $S$, we will briefly denote with $\oplus(S)$ the view that is their least upper bound.

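Example 1 Consider an enterprise with branches in different cities. A simple dimensional scheme Transfers modeling the transfers of employees between offices might include:

$U = \{\mathit{date}, \mathit{month}, \mathit{year}, \mathit{fromOffice}, \mathit{fromDept}, \mathit{fromCity}, \mathit{toOffice}, \mathit{toDept}, \mathit{toCity}, \mathit{employee}\}$

$F = \{\mathit{date} \to \mathit{month}, \mathit{month} \to \mathit{year}, \mathit{fromOffice} \to \mathit{fromDept}, \mathit{fromOffice} \to \mathit{fromCity}, \mathit{toOffice} \to \mathit{toDept}, \mathit{toOffice} \to \mathit{toCity}\}$

thus $dim(D) = \{\mathit{date}, \mathit{fromOffice}, \mathit{toOffice}, \mathit{employee}\}$. Examples of views on the Transfers scheme are

$V = \{\mathit{month}, \mathit{fromOffice}, \mathit{toCity}, \mathit{employee}\}$

$W = \{\mathit{month}, \mathit{fromCity}, \mathit{fromDept}\}$

$Z = \{\mathit{year}, \mathit{fromOffice}, \mathit{toCity}\}$

It is $W \oplus Z = \{\mathit{month}, \mathit{fromOffice}, \mathit{toCity}\}$, with $(W \oplus Z) \leq V$. □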
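The roll-up order and the least upper bound of Example 1 can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: the helper names `closure`, `rolls_up_to` and `lub` are hypothetical, and the code assumes, as in Definition 1, that every functional dependency has a single attribute on each side.

```python
# Functional dependencies of the Transfers scheme (Example 1), as (determinant, dependent) pairs.
FDS = {
    ("date", "month"), ("month", "year"),
    ("fromOffice", "fromDept"), ("fromOffice", "fromCity"),
    ("toOffice", "toDept"), ("toOffice", "toCity"),
}

def closure(attrs):
    """All attributes functionally determined by `attrs`, including `attrs` themselves."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if lhs in result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def rolls_up_to(v, w):
    """True iff V precedes W in the roll-up order of Definition 3, i.e. W determines V."""
    return set(v).issubset(closure(w))

def lub(*views):
    """Least upper bound: the union of the views minus every attribute
    that is determined by another attribute of the union."""
    union = set().union(*views)
    return {a for a in union
            if not any(a in closure({b}) for b in union if b != a)}

V = {"month", "fromOffice", "toCity", "employee"}
W = {"month", "fromCity", "fromDept"}
Z = {"year", "fromOffice", "toCity"}

print(sorted(lub(W, Z)))          # ['fromOffice', 'month', 'toCity']
print(rolls_up_to(lub(W, Z), V))  # True
```

Representing a view as a plain set of attribute names keeps the lattice operations close to the definitions; a production implementation would probably precompute the closures.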

The following notation is used throughout the rest of the paper. Uppercase letters from the beginning of the alphabet ($A, B, \ldots$) denote dimensions. Attributes which are functionally determined by another attribute, i.e. attributes other than dimensions, are denoted by the corresponding primed letters (e.g., $A \to A'$, $A \to A''$). Sets of attributes are represented by omitting braces, thus writing $ABC$ for $\{A, B, C\}$. $V$ is the view whose cardinality is to be estimated, while $W$, $X$, $Y$, and $Z$, possibly with subscripts ($W_1, W_2, \ldots$), denote generic views in $V_D$. Finally, lowercase letters are used for the cardinalities of views and attributes (e.g., $w$ is the cardinality of view $W$, $abc$ is the cardinality of the view with attributes $ABC$, and so on).

3 The k-dependencies

A k-dependency is a relevant case of cardinality constraint which naturally generalizes a functional dependency. In the authors’ experience, k-dependencies are particularly useful to characterize the knowledge of the business domain held by the experts in the field. For instance, in the transfer domain, we might have some information concerning the number of destination cities for an employee, or on the number of distinct areas moved to from each area. If such information is in the form of bounds, it can be effectively used to improve the bounds of view cardinality.

Definition 4 (k-dependency) Let $X$ and $Y$ be two views on $D$. We say that a k-dependency (kD) holds between $X$ and $Y$, and denote it with $X \xrightarrow{k} Y$, when $k$ ($k \geq 1$) is an upper bound of the number of distinct tuples of $Y$ which correspond to each distinct tuple of $X$ within view $X \oplus Y$.

Example 2 In the Transfers scheme, assume the domain expert provides the following information: The maximum number of inter-department transfers of an employee during one year is 2. This constraint can be formalized by the following kD: $X \xrightarrow{2} Y$, where $X = \{\mathit{year}, \mathit{employee}\}$, $Y = \{\mathit{toDept}\}$. Intuitively, from this we can derive that the cardinality of the view $\{\mathit{year}, \mathit{employee}, \mathit{toDept}\}$ cannot exceed twice the cardinality of $X$. □

The kD’s have been studied in the context of relational database theory, where they are also known as numerical dependencies. Grant and Minker [GM83] proved that kD’s are not finitely axiomatizable, thus no fixed set of inference rules can be used to determine whether or not a given kD is logically implied by a set of kD’s. Nonetheless, a basic set of rules, which naturally extend those for FD’s, was proposed in [GM83]. The rules we use, generalized to the multidimensional lattice, are:

R1: $X \xrightarrow{k} Y \vdash X \oplus Z \xrightarrow{k} Y \oplus Z$

R2: $X \xrightarrow{k} Y \wedge Y \xrightarrow{l} Z \vdash X \xrightarrow{k \cdot l} Y \oplus Z$

R3: $X \xrightarrow{k} Y \oplus Z \vdash X \xrightarrow{k} Y$

R4: $X \xrightarrow{k} Y \wedge X \xrightarrow{l} Z \vdash X \xrightarrow{k \cdot l} Y \oplus Z$

Note that the “union” rule R4 is not strictly needed, since it can be derived from rules R1 (“extension”), R2 (“transitivity”), and R3 (“decomposition”).
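To make Definition 4 concrete, the sketch below computes, on a small made-up instance, the smallest $k$ for which a kD holds and checks the inequality stated in Example 2. The function name `max_fanout` and the sample rows are hypothetical; the paper itself works without data, so this only illustrates what a kD asserts about an instance.

```python
from collections import defaultdict

def max_fanout(rows, x_attrs, y_attrs):
    """Smallest k such that X ->k Y holds on `rows` (a list of dicts):
    the largest number of distinct Y-tuples seen together with one X-tuple.
    Returns 0 on an empty instance (Definition 4 itself requires k of at least 1)."""
    seen = defaultdict(set)
    for row in rows:
        x = tuple(row[a] for a in x_attrs)
        y = tuple(row[a] for a in y_attrs)
        seen[x].add(y)
    return max((len(ys) for ys in seen.values()), default=0)

# Hypothetical toy instance of the view {year, employee, toDept}.
rows = [
    {"year": 2000, "employee": "e1", "toDept": "d1"},
    {"year": 2000, "employee": "e1", "toDept": "d2"},
    {"year": 2000, "employee": "e2", "toDept": "d1"},
    {"year": 2001, "employee": "e1", "toDept": "d1"},
]
X = ("year", "employee")
Y = ("toDept",)

k = max_fanout(rows, X, Y)
x_card = len({tuple(r[a] for a in X) for r in rows})
xy_card = len({tuple(r[a] for a in X + Y) for r in rows})

print(k, x_card, xy_card)        # 2 3 4
assert k * x_card >= xy_card     # the cardinality of X joined with Y is at most k times that of X
```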

4 A Framework for Estimation

The framework for this work is the logical design of multidimensional databases carried out off-line, i.e. assuming that the source data cannot be directly queried to estimate the cardinality of multidimensional views. Without loss of generality, in the following we consider that estimates are needed for the purpose of view materialization, thus reliable information on the size of the candidate views has to be supplied to the materialization algorithm.

As sketched in Figure 1, whenever the materialization algorithm requires information about a candidate view $V$, our approach works in two steps. First, the bounder uses the set $I$ of cardinality constraints supplied by the user to determine effective bounds for the cardinalities of a proper set of views; then, the estimator uses these bounds to derive a probabilistic estimate for the cardinality of $V$. Note that this two-step approach generalizes well-known parametric models for the estimation of the cardinality of relational queries [MCS88], and in particular those for projection size estimation [CM95], for which bounds are typically given as input parameters.

Figure 1: Overall architecture for logical design. [Diagram labels: dimensional scheme, cardinality constraints, bounder, bounds, estimator, candidate view, cardinality estimate, view materializer, workload, logical scheme.]

The different forms of cardinality constraints we will consider are:

1. a lower ($w^-$) and/or an upper ($w^+$) bound of the cardinality $w$ of a view $W$;

2. a k-dependency ($X \xrightarrow{k} Y$) expressing an upper bound of the ratio between the cardinalities of two views $X$ and $Y$.

We will assume that at least the upper bounds of the cardinalities of all the single attributes in the dimensional scheme are known. This assumption, which is perfectly reasonable in all application domains, is necessary in order to guarantee that at least one upper bound can be determined for each view.

The set $I$, together with the dimensional scheme $D$, univocally determines two bounds for the cardinality of $V$, which are called the greatest lower bound and the least upper bound, denoted as $v^-$ and $v^+$, respectively.¹ The interpretation of such bounds is as follows:

1. in each instance of $D$ that does not violate any constraint in $I$, the cardinality $v$ of $V$ is such that $v \in [v^-, v^+]$; and

2. there exist two instances, both compatible with $I$, where $v$ equals $v^-$ and $v^+$, respectively.

We say a constraint $c \in I$ is redundant iff all the greatest lower bounds and the least upper bounds determined by $I$ are equal to those determined by $I - \{c\}$.

¹ For simplicity of notation, in denoting bounds we omit the dependence on $D$ and $I$.

Definition 5 (Sound and Minimal Input) Let $I$ be a set of cardinality constraints on dimensional scheme $D$. We say $I$ is sound iff there exists at least one non-empty instance of $D$ which satisfies all the constraints in $I$. We say $I$ is minimal iff no constraint in $I$ is redundant.

In this paper we will assume that the input $I$ is sound and minimal. It is straightforward to derive that, in this case, all the bounds in $I$ are either greatest lower bounds or least upper bounds (whereas the opposite is not necessarily true).

Computing the bounds implied by $I$ turns out to be a challenging combinatorial problem, even for “simple” forms of cardinality constraints. For instance, it is known that the problem is NP-hard for arbitrary patterns of functional dependencies [CM92]. Furthermore, the actual computational effort needed to compute these bounds might limit applicability in real-world cases. For this reason, the bounder is built around the concept of bounding strategy. A bounding strategy $s$ is characterized by a couple of bounding functions that, given $I$, $D$, and $V$, compute bounds $v_s^-$ and $v_s^+$ such that $v_s^- \leq v^-$ and $v^+ \leq v_s^+$ both hold. In other terms, a bounding strategy never computes bounds which are more restrictive than the ones logically implied by the input constraints, trading off accuracy for speed of evaluation. We say that a strategy $s$ is decoupled iff computing $v_s^+$ for an arbitrary view $V$ only requires the knowledge of upper bounds $w_s^+$ of other views $W$, but no knowledge of lower bounds $w_s^-$, and vice versa. Thus, for a decoupled bounding strategy, the two bounding functions can be defined independently of each other.

Turning to the estimator, our framework supports different probabilistic models. A probabilistic model is a function that, given $I$, $D$, $V$, as well as bounds computed by the bounder, provides an estimate, $v$, for the cardinality of $V$. In general, this step can use further information from the application domain that is not suitable to derive bounds. Typically this is the case with information concerning average values (e.g., the number of transfers of each employee in each year is 1.5, on average).

Example 3 Let $10^4$ be the number of employees who have been transferred at least once, and let the enterprise consist of $10^3$ offices distributed over 10 cities and belonging to one of 10 departments; let $10^3$ days be the observation period. Let $V = \{\mathit{date}, \mathit{fromOffice}, \mathit{toOffice}\}$. Since each office is involved in transfers at most with every other office on each date, the first trivial upper bound of $v$ is $10^3 \cdot 10^3 \cdot 10^3 = 10^9$. If the maximum number of transfers for an employee during one year is 2, and since we consider 3 years, it is derived that the cardinality of the base cube is at most $2 \cdot 3 = 6$ times the number of transferred employees, i.e. $6 \cdot 10^4$. Thus, the upper bound of $v$ can be improved to $6 \cdot 10^4$ as well (the cardinality of a view cannot exceed that of its base cube). On the other hand, if we assume that each office is involved in at least one transfer, it is $v \geq 10^3$. Finally, by using the model in Section 6, the cardinality of $V$ is estimated as $v = 3.8 \cdot 10^4$. □
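The arithmetic of Example 3 can be replayed in a few lines. This is only a sketch of the bound computation: the variable names are illustrative, and the probabilistic estimate of Section 6 is not reproduced here.

```python
# Figures quoted in Example 3.
employees = 10**4            # employees transferred at least once
offices = 10**3
days = 10**3                 # observation period, roughly 3 years
years = 3
max_transfers_per_year = 2   # the k-dependency on transfers per employee and year

# Trivial bound: product of the attribute cardinalities of V = {date, fromOffice, toOffice}.
trivial_upper = days * offices * offices

# Bound inherited from the base cube: at most 2 * 3 transfers per transferred employee.
base_cube_upper = max_transfers_per_year * years * employees

upper = min(trivial_upper, base_cube_upper)
lower = offices              # assumes every office appears in at least one transfer

print(trivial_upper)  # 1000000000
print(upper)          # 60000
print(lower)          # 1000
```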

5 The Bounder

The basic observation to determine bounds for view cardinalities using bounds of the cardinalities of other views is that the multidimensional lattice induces an isomorphic structure over such cardinalities. In fact, from Definition 3 it follows that $W \leq Z$ implies $w \leq z$ in each instance of $D$, since $Z \to W$ holds. This inequality also applies to bounds.

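Lemma 1 If $W \leq Z$, then $w^- \leq z^-$ and $w^+ \leq z^+$.

Proof: ($w^- \leq z^-$) Assume $w^- > z^-$. Then, since the greatest lower bound $z^-$ is attained by some instance, there is an instance of $D$ in which $w \geq w^- > z \geq z^-$, thus $w > z$, which is a contradiction. Similarly for $w^+ \leq z^+$. □

As to k-dependencies, their influence on the determination of bounds is summarized by the following lemma.

Lemma 2 Let $Z = X \oplus Y$. If $X \xrightarrow{k} Y$, then $x^- \geq z^-/k$ and $z^+ \leq k \cdot x^+$.

Proof: From Definition 4 it follows immediately that, if $X \xrightarrow{k} Y$, the cardinality $z$ of $Z$ is related to the cardinality $x$ of $X$ by inequality $z \leq k \cdot x$. The inequalities on bounds follow immediately. □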

In the rest of this section we first propose a decoupled strategy to compute upper bounds (Section 5.1), then we discuss some issues related to coupled strategies (Section 5.2).

5.1 A Decoupled Upper Bounding Strategy

The bounding strategy we propose in this section, called cover-based, relies on the concept of cover of a view to compute upper bounds. The following are two preliminary definitions whose aim is to precisely characterize how sets of views and kD’s can be synergistically combined.

Definition 6 (Graph of a set of kD’s) Let $K = \{X_1 \xrightarrow{k_1} Y_1, \ldots, X_p \xrightarrow{k_p} Y_p\}$ be a set of kD’s. The (labelled oriented) graph of $K$ is $G(K) = (N, E)$, with set of nodes $N = \bigcup_i \{X_i, Y_i\}$, set of edges $E = \{e_i = (X_i, Y_i),\ i = 1, \ldots, p\}$, and labeling function $\lambda$ such that $\lambda(e_i) = k_i$.²

Definition 7 (K-set of views) Let $S = \{W_1, \ldots, W_m\}$ be a non-empty set of views, and let $K = \{X_1 \xrightarrow{k_1} Y_1, \ldots, X_p \xrightarrow{k_p} Y_p\}$ be a set of kD’s. The couple $C = (S, K)$ is called a k-set of views iff $\forall i = 1, \ldots, p$ it is $Y_i \in S$ and there exists a set of kD’s, $K' = \{W_{j_1} \xrightarrow{k_1} Y_1, \ldots, W_{j_p} \xrightarrow{k_p} Y_p\}$, such that: 1) $\forall i = 1, \ldots, p$ it is $X_i \leq W_{j_i}$ with $W_{j_i} \in S$, and 2) $G(K') = (N', E')$ is a forest, i.e., a set of disjoint directed trees. We call $S$-compliant a set $K'$ with such properties.

² Technically, $G(K)$ is a multi-graph, since two edges may share the same couple of nodes. This, however, does not influence the following arguments.

Each kD in an $S$-compliant set $K'$ is derived from a corresponding kD in $K$ by applying rules R1 and R3 (since, by hypothesis, it is $X_i \oplus W_{j_i} = W_{j_i}$). Note that $N' \subseteq S$ always holds and that, in general, multiple $S$-compliant $K'$ sets can be derived from the same $C$, depending on how each $W_{j_i} \in S$ is chosen.

Example 4 $C_1 = (\{A'B, C, D\}, K)$, with $K = \{A'B \xrightarrow{k_1} C,\ C \xrightarrow{k_2} D\}$, is a k-set of views, since $K$ is $S$-compliant. The same is true for $C_2 = (\{AB, C, D\}, K)$, since $K' = \{AB \xrightarrow{k_1} C,\ C \xrightarrow{k_2} D\}$ is $S$-compliant (in fact, $A'B \leq AB$). On the other hand, $C_3 = (\{B, C, D\}, K)$ is not a k-set since no $S$-compliant set of kD’s can be found.

It is important to remark that Definition 7 requires $K'$, and not necessarily $K$, to be a forest. For instance, the couple $(\{A\}, \{A' \xrightarrow{k} A\})$ is not a k-set, though $G(\{A' \xrightarrow{k} A\})$ is a forest, since $G(\{A \xrightarrow{k} A\})$ is cyclic. On the other hand, $C_4 = (\{A, A', B\}, \{A' \xrightarrow{k_1} B,\ B \xrightarrow{k_2} A'\})$ is a k-set (after deriving $A \xrightarrow{k_1} B$ from $A' \xrightarrow{k_1} B$) even if $G(\{A' \xrightarrow{k_1} B,\ B \xrightarrow{k_2} A'\})$ is cyclic.

Finally, for the k-set $C_5 = (\{A, A'B, A'C\}, \{A' \xrightarrow{k} A\})$, two $S$-compliant sets, $K'_1 = \{A'B \xrightarrow{k} A\}$ and $K'_2 = \{A'C \xrightarrow{k} A\}$, can be derived. □

Definition 8 (Cover) Let $V \in V_D$ be a view on $D$ and $C = (S, K)$ be a k-set of views. $C$ is called a $V$-cover iff $V \leq \oplus(S)$.

As the following example suggests, a $V$-cover can be used to bound from above the cardinality of $V$ by generalizing Lemma 1 to the case of multiple views (since $V \leq \oplus(S)$ holds). When also kD’s are present, Lemma 2 can be exploited to improve the bound. Since a cover must be a k-set, we are guaranteed that the cardinalities of some views in $S$ can be safely “replaced” by the $k_i$’s of the kD’s in $K'$.

Example 5 Let $V = ABC$. Below we consider some notable examples of $V$-covers and show how each of them can be used to derive an upper bound for $v$. In order to help the reader, Figure 2 depicts the roll-up relationships between the views involved.

Figure 2: Roll-up relationships of views in Example 5. [Views shown: ABCD, ABC, AB, BC, CD, A'B, A, B, C, D, A'.]

$C_1 = (\{ABCD\}, \emptyset)$ is a $V$-cover since $V \leq \oplus(S_1) = ABCD$. From Lemma 1 it is derived $abc \leq abcd^+$.

$C_2 = (\{AB, BC\}, \emptyset)$ is a $V$-cover since $V \leq \oplus(S_2) = ABC$. Since the natural join between two views is a subset of their Cartesian product, it is $abc \leq ab^+ \cdot bc^+$.

$C_3 = (\{AB, C\}, \{AB \xrightarrow{k} C\})$. From Lemma 2 it immediately follows $abc \leq ab^+ \cdot k$.

$C_4 = (\{A, B, C\}, \{A \xrightarrow{k_1} B,\ B \xrightarrow{k_2} C\})$. By applying rule R2, we derive $A \xrightarrow{k_1 k_2} BC$, thus $abc \leq a^+ \cdot k_1 \cdot k_2$.

$C_5 = (\{A, B, C\}, \{A \xrightarrow{k_1} B,\ A \xrightarrow{k_2} C\})$. Rule R4 is now used to derive $A \xrightarrow{k_1 k_2} BC$, thus $abc \leq a^+ \cdot k_1 \cdot k_2$.

$C_6 = (\{A, A'B, C\}, \{A' \xrightarrow{k} A\})$. According to rule R1 it is $A'B \xrightarrow{k} AB$, and from Lemma 2 $ab^+ \leq k \cdot a'b^+$. On the other hand, $abc \leq ab^+ \cdot c^+$, thus $abc \leq k \cdot a'b^+ \cdot c^+$. □

The following theorem precisely characterizes how bounds are related to the graph of $K'$.

Theorem 1 (Cover-based bounding) Let $V$ be a view and $C = (S, K)$ be a $V$-cover, with $S = \{W_1, \ldots, W_m\}$ and $K = \{X_1 \xrightarrow{k_1} Y_1, \ldots, X_p \xrightarrow{k_p} Y_p\}$. Let $K'$ be an $S$-compliant set, $R(G(K'))$ be the set of root nodes of the forest $G(K') = (N', E')$ associated to $K'$, and let $S' = S - N'$ stand for the set of views which are not nodes in $G(K')$. Then:

$$v \leq u(C, K') \stackrel{\mathrm{def}}{=} \prod_{i=1}^{p} k_i \cdot \prod_{W_j \in R(G(K')) \cup S'} w_j^+ \qquad (1)$$

Proof: The intuition behind the proof is that each tree $G_t = (N'_t, E'_t)$ of $G(K')$ contributes to $u(C, K')$ with the upper bound of the cardinality of its root $W_t$ times all the $k_i$’s which label the edges in $E'_t$.

Since by Definition 8 it is $V \leq \oplus(S)$, it is sufficient to prove that $u(C, K')$ is an upper bound of the cardinality of $\oplus(S)$. Since the size of the natural join of a set of views can never exceed that of their Cartesian product, it is

$$\mathrm{card}(\oplus(S)) \leq \mathrm{card}((\oplus(S')) \oplus (\oplus(N'))) \leq \mathrm{card}(\oplus(S')) \cdot \mathrm{card}(\oplus(N')) \leq \prod_{W_j \in S'} w_j^+ \cdot \prod_{t} \mathrm{card}(\oplus(N'_t))$$

Since the $k_i$’s are partitioned over the trees, it is enough to prove that $w_t^+ \cdot \prod_{i \in \Lambda'_t} k_i$, where $W_t$ is the root of $G_t$ and $\Lambda'_t$ is the set of labels in $G_t$, is an upper bound of $\mathrm{card}(\oplus(N'_t))$. This is proved by induction on the number $L$ of levels in $G_t$.

Base step ($L = 2$).³ In this case $G_t$ corresponds to the set of kD’s $\{W_t \xrightarrow{k_{2,1}} W_{2,1}, \ldots, W_t \xrightarrow{k_{2,q_2}} W_{2,q_2}\}$. From the union rule R4 it immediately follows that $\mathrm{card}(\oplus(N'_t)) \leq w_t^+ \cdot \prod_{i=1}^{q_2} k_{2,i}$.

Inductive step ($L - 1 \Rightarrow L$). Let $N'_t(L-1)$ be the set of nodes in the first $L - 1$ levels. By inductive hypothesis it is $\mathrm{card}(\oplus(N'_t(L-1))) \leq w_t^+ \cdot \prod_{l=2}^{L-1} \prod_{i=1}^{q_l} k_{l,i}$. Adding the $L$-th level introduces new edges with labels $k_{L,1}, \ldots, k_{L,q_L}$ and corresponding terminal nodes $W_{L,1}, \ldots, W_{L,q_L}$. From the $i$-th of the corresponding kD’s we can derive (using rules R1 and R3) the kD $\oplus(N'_t(L-1)) \xrightarrow{k_{L,i}} W_{L,i}$. From the union rule R4 it is derived:

$$\oplus(N'_t(L-1)) \xrightarrow{\prod_{i=1}^{q_L} k_{L,i}} \oplus(\{W_{L,1}, \ldots, W_{L,q_L}\})$$

which, due to Lemma 2, leads to:

$$\mathrm{card}((\oplus(N'_t(L-1))) \oplus (\oplus(\{W_{L,1}, \ldots, W_{L,q_L}\}))) = \mathrm{card}(\oplus(N'_t(L))) \leq \prod_{i=1}^{q_L} k_{L,i} \cdot w_t^+ \cdot \prod_{l=2}^{L-1} \prod_{i=1}^{q_l} k_{l,i} = w_t^+ \cdot \prod_{l=2}^{L} \prod_{i=1}^{q_l} k_{l,i}$$

□
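Formula (1) is simple to evaluate once an $S$-compliant set $K'$ is given. The sketch below is one possible reading of it in Python, under stated assumptions: views are identified by plain strings, $K'$ is passed as (parent, child, k) triples, and all numeric upper bounds are made up for illustration. The function `cover_bound` is hypothetical and does not check that $S$ actually covers $V$; it only verifies that $G(K')$ is a forest over $S$.

```python
from math import prod

def cover_bound(upper, views, kds):
    """Evaluate formula (1): product of all labels k_i times the upper bounds
    of the roots of G(K') and of the views of S that are not nodes of G(K').

    upper: dict mapping a view name to its upper bound w+
    views: the set S of the cover
    kds:   an S-compliant set K' as (parent, child, k) triples
    """
    views = set(views)
    parent_of = {}
    for parent, child, _ in kds:
        if child in parent_of:
            raise ValueError(f"{child} has two incoming edges: not a forest")
        parent_of[child] = parent
    nodes = set(parent_of) | set(parent_of.values())      # N'
    if not nodes.issubset(views):
        raise ValueError("every node of G(K') must be a view in S")
    for node in nodes:                                     # reject cycles
        seen = set()
        while node in parent_of:
            if node in seen:
                raise ValueError("G(K') contains a cycle: not a forest")
            seen.add(node)
            node = parent_of[node]
    roots = nodes - set(parent_of)                         # R(G(K'))
    outside = views - nodes                                # S' = S - N'
    return prod(k for _, _, k in kds) * prod(upper[w] for w in roots | outside)

# Hypothetical upper bounds, for illustration only.
upper = {"A": 100, "A'B": 50, "A'C": 40, "C": 10}

# Cover C6 of Example 5 with k = 3: k * a'b+ * c+.
c6 = cover_bound(upper, {"A", "A'B", "C"}, [("A'B", "A", 3)])

# Cover C5 of Example 4: both S-compliant sets give k * a'b+ * a'c+.
c5_first = cover_bound(upper, {"A", "A'B", "A'C"}, [("A'B", "A", 3)])
c5_second = cover_bound(upper, {"A", "A'B", "A'C"}, [("A'C", "A", 3)])

print(c6, c5_first, c5_second)  # 1500 6000 6000
```

A graph in which one node is the target of two kD's, such as $\{A \xrightarrow{k_1} C, B \xrightarrow{k_2} C\}$, is rejected by this sketch because it is not a forest.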

It is possible to prove that (1) is valid even if $G(K')$ is not a forest, provided that $R(G(K'))$ contains (at least) a set of nodes from which every other node in $G(K')$ can be reached through a directed path. On the other hand, the bounds determined by such “non-forest” $V$-covers are always redundant, meaning that a proper $V$-cover yielding a better bound for $v$ can always be found.

Example 6 Let $V = ABC$, and consider the couple $(\{A, B, C\}, K)$ with $K = \{A \xrightarrow{k_1} C,\ B \xrightarrow{k_2} C\}$, which is not a k-set since the graph of $K' = K$ has two roots ($A$ and $B$). The bound returned by (1) is $v \leq k_1 \cdot k_2 \cdot a^+ \cdot b^+$ which is redundant, since a better bound is obviously obtained through the $V$-cover $(\{A, B, C\}, \{A \xrightarrow{k_1} C\})$. □

The following lemma shows that, when multiple $S$-compliant sets exist for a given cover, the bound returned by (1) is actually independent of the one chosen. For instance, the reader may immediately verify that, in Example 4, it is $u(C_5, K'_1) = u(C_5, K'_2) = k \cdot a'b^+ \cdot a'c^+$.

³ The case $L = 1$ cannot arise, since each $G_t$ has at least one edge.

Lemma 3 Let $C = (S, K)$ be a $V$-cover, and let $K'_1$ and $K'_2$ be two arbitrary $S$-compliant sets. It is $u(C, K'_1) = u(C, K'_2) \stackrel{\mathrm{def}}{=} u(C)$.

Coherently with Theorem 1 and Lemma 3, the cover-based bounding strategy $cb$ computes $v_{cb}^+$ as:

$$v_{cb}^+ = \begin{cases} v^+ & \text{if } v^+ \in I, \\ \min\{u_{cb}(C) \mid C \text{ is a } V\text{-cover}\} & \text{if } v^+ \notin I, \end{cases} \qquad (2)$$

where $u_{cb}(C)$ is obtained by replacing $w_j^+$ with $w_{j,cb}^+$ in $u(C)$. In general, evaluating the cover-based bound leads to a recursive computational flow; note that the “case-0” of recursion, $v_{cb}^+ = v^+$, is correctly defined since we assumed the input $I$ to be minimal.

The space of the $V$-covers to be analyzed in order to determine $v_{cb}^+$ has exponential size. On the other hand, the following theorem shows that, under some circumstances, a $V$-cover $C_2$ can be discarded from the search space without even computing $u_{cb}(C_2)$.

Theorem 2 Let $C_1 = (S_1, K_1)$ and $C_2 = (S_2, K_2)$ be two $V$-covers. If $S_1 \subset S_2$ and $K_1 = K_2$, or $S_1 = S_2$ and $K_2 \subset K_1$, then $u_{cb}(C_1) \leq u_{cb}(C_2)$.

5.1.1 Reasoning without k-dependencies

When no k-dependencies are included among the input constraints $I$, covers degenerate into sets of views, which allows us to precisely characterize the set of $V$-covers that can provide useful (non-redundant) bounds. To see how such covers are determined, two orthogonal aspects are considered: a domination relationship between sets of views and the input information, $I$. While the former induces a partial order on the bounds obtainable from $V$-covers, regardless of the specific input $I$, the latter can be used to restrict the set of useful $V$-covers to those including only views in $I$.

In this section, since we assume $K = \emptyset$, we will work only with the $S$ part of $V$-covers. Consequently, in (2), $u_{cb}(C)$ can be replaced by

$$u_{cb}(S) = \prod_{W_i \in S} w_i^+ . \qquad (3)$$
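When no kD's are present, the cover-based upper bound reduces to minimizing the product (3) over the sets of views that cover $V$. The brute-force Python sketch below illustrates this on a hypothetical scheme with dimensions A, B, C and one dependency A → A'. It enumerates only sets of views whose upper bound is given in the input, as suggested by the text above, and it makes no attempt at the pruning or branch-and-bound search the paper refers to; all names and numbers are illustrative.

```python
from itertools import combinations
from math import prod

# Hypothetical scheme: A determines A'; B and C are further dimensions.
FDS = {("A", "A'")}

def closure(attrs):
    """All attributes functionally determined by `attrs`, including `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if lhs in result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def covers(view, views):
    """True iff `view` rolls up from the least upper bound of `views`."""
    return set(view).issubset(closure(set().union(*views)))

def cb_upper_bound(view, known):
    """Minimum of formula (3) over every set S of input views that covers `view`.

    known: dict mapping frozenset(attributes) to an upper bound taken from the input.
    Returns None when no set of input views covers `view`.
    The enumeration is exponential in the number of input views.
    """
    best = None
    candidates = list(known)
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if covers(view, subset):
                bound = prod(known[w] for w in subset)
                best = bound if best is None else min(best, bound)
    return best

def view(*attrs):
    return frozenset(attrs)

# Made-up input upper bounds.
known = {
    view("A"): 100, view("B"): 20, view("C"): 30,
    view("A", "B"): 500, view("B", "C"): 200,
}

print(cb_upper_bound(view("A", "B", "C"), known))  # 15000, from the cover {AB, C}
print(cb_upper_bound(view("A'", "B"), known))      # 500, from the cover {AB}
```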

Definition 9 (Domination between sets of views) Let $S_1 = \{W_{1,1}, \ldots, W_{1,i}, \ldots, W_{1,m}\}$ and $S_2 = \{W_{2,1}, \ldots, W_{2,j}, \ldots, W_{2,n}\}$ be two sets of views. We say that $S_1$ dominates $S_2$, written $S_1 \sqsubseteq S_2$, iff $S_2$ can be partitioned into $m$ subsets $S_{2,1}, \ldots, S_{2,m}$ such that $W_{1,i} \leq \oplus(S_{2,i})$ $\forall i = 1, \ldots, m$.
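Definition 9 can be tested by exhaustively trying every way of distributing the views of $S_2$ over $m$ blocks. The following Python sketch does exactly that on the same hypothetical scheme (A → A', with B and C as further dimensions). It assumes that every block of the partition must be non-empty, which the definition does not state explicitly, and the function names are illustrative.

```python
from itertools import product

# Hypothetical scheme: A determines A'; B and C are further dimensions.
FDS = {("A", "A'")}

def closure(attrs):
    """All attributes functionally determined by `attrs`, including `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if lhs in result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def rolls_up_to(view, views):
    """True iff `view` rolls up from the least upper bound of `views`."""
    return set(view).issubset(closure(set().union(*views)))

def dominates(s1, s2):
    """True iff s2 can be split into len(s1) non-empty blocks such that
    the i-th view of s1 rolls up from the least upper bound of the i-th block."""
    s1, s2 = list(s1), list(s2)
    m = len(s1)
    # Each assignment maps every view of s2 to the index of a block.
    for assignment in product(range(m), repeat=len(s2)):
        blocks = [[w for w, b in zip(s2, assignment) if b == i] for i in range(m)]
        if all(blocks[i] and rolls_up_to(s1[i], blocks[i]) for i in range(m)):
            return True
    return False

AB, A, B, C = {"A", "B"}, {"A"}, {"B"}, {"C"}

print(dominates([AB, C], [A, B, C]))   # True: AB rolls up from {A, B}, C from {C}
print(dominates([A, B, C], [AB, C]))   # False: two views cannot form three blocks
```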
