
The Concept of Granularity in Rules

In the document Data Mining (pages 97-101)

Knowledge Representation

10. The Concept of Granularity in Rules

Rules are generic constructs of knowledge representation whose components are information granules. By their very nature, they tend to capture general relationships between variables (attributes). Furthermore, to make such dependencies meaningful, they have to be expressed at the level of abstract entities, that is, information granules. For instance, a rule of the form “if high temperature then a substantial sale of cold drinks” is quite obvious. The relationship “if temperature of 28°C then sale of 12,566 bottles of cold drinks” does not tell us much and might not be a meaningful rule. The bottom line is that rules come hand in hand with information granules.

To emphasize the relevance of granular information in rule-based computing, let us consider a rule of the form “If A then B” where A and B are represented as numeric intervals in the space of real numbers. In this context, the idea of information granularity comes with a clearly articulated practical relevance. A low level of granularity of the condition associated with a high level of granularity of the conclusion describes a rule of high relevance: it applies to a wide range of situations (as its condition is not very detailed) while offering a very specific conclusion. On the other hand, if we encounter a rule containing a very specific (detailed) condition with quite limited applicability while the conclusion is quite general, we may view the rule’s relevance as quite limited. In general, increasing the granularity (specificity) of the condition and decreasing the granularity of the conclusion decrease the quality of the rule. We can offer some qualitative assessment of rules by distinguishing between those rules that are still acceptable and those whose usefulness (given the specificity of conditions and conclusions) could be questioned. A hypothetical boundary between these two categories of rules is illustrated in Figure 5.17. Obviously, the detailed shape of the boundary could be different; our primary intent is to illustrate the main character of such a relationship.

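[Figure: plot with axes “granularity of condition” and “granularity of conclusion”; a boundary separates the region of acceptable rules from the region of unacceptable rules, and an arrow marks the direction of increased usefulness of the rule.]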

Figure 5.17. Identifying relevant rules with reference to the granularity of their conditions and conclusions.

The boundary is intended to reflect the main tendency by showing the reduced usefulness of the rule when associated with an increased level of granularity of the condition and a decreased level of granularity of the conclusion.
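The tendency described above can be illustrated with a minimal sketch for interval-valued rules. The `specificity` measure (one minus the interval's width relative to its domain), the `rule_usefulness` index, and all domains and intervals below are assumptions made for illustration; they are not definitions given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def width(self) -> float:
        return self.hi - self.lo


def specificity(granule: Interval, domain: Interval) -> float:
    """Assumed granularity measure: 1 for a single point, 0 for the whole domain."""
    return 1.0 - granule.width() / domain.width()


def rule_usefulness(cond_spec: float, concl_spec: float) -> float:
    """Hypothetical index: high for a general condition and a specific conclusion."""
    return (1.0 - cond_spec) * concl_spec


# Assumed domains (illustrative only)
temperature = Interval(-10.0, 40.0)   # condition, degrees Celsius
sales = Interval(0.0, 20000.0)        # conclusion, bottles sold

# General condition with a specific conclusion
broad_rule = (Interval(25.0, 40.0), Interval(10000.0, 14000.0))
# Very specific condition with a vague conclusion
narrow_rule = (Interval(27.9, 28.1), Interval(0.0, 18000.0))

for name, (cond, concl) in [("broad", broad_rule), ("narrow", narrow_rule)]:
    score = rule_usefulness(specificity(cond, temperature),
                            specificity(concl, sales))
    print(f"{name}: usefulness = {score:.4f}")
```

Under these assumptions the broad rule scores far higher than the narrow one, which matches the direction of the boundary in Figure 5.17. Any index that decreases with the specificity of the condition and increases with the specificity of the conclusion would show the same tendency.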

11. Summary and Bibliographical Notes

In this chapter, we covered the essentials of knowledge representation by presenting the most commonly encountered schemes such as rules, graphs, networks, and their generalizations. The fundamental issue of abstraction of information, captured by information granulation and the resulting information granules, was discussed in detail. We stressed that the formalisms existing within the realm of granular computing offer a wealth of possibilities when it comes to the representation of generic entities of information. Similarly, by choosing a certain level of granularity (specificity), one can easily tailor the results to the needs of the user. In this way, information granules offer the important feature of customization of data mining activities. We showed that the choice of a certain formalism of information granulation depends upon a number of essential factors spelled out in Sec. 5.8. Given that in data mining we are faced with an enormous diversity of data, one has to make a prudent decision about the underlying schemes of knowledge representation.

An excellent coverage of generic knowledge-based structures, including rules, is offered in [4, 6, 9, 21]. Graph structures are discussed in depth in [5, 14]. A general overview of the fundamentals and realization of various schemes of knowledge representation is included in [23]. Granular computing, regarded as a unified paradigm of forming and processing information granules, is discussed in [1, 19, 24]. The role of information granulation is stressed in [26].

Set-theoretic structures (in particular, interval analysis) are presented in [8, 11, 24]. Fuzzy sets, introduced by Zadeh [25], are studied in depth in [16, 27–29, 31]. An in-depth coverage of rough sets, developed by Pawlak, is provided in [13, 20, 22]. Shadowed sets are presented in [15, 17, 18].

Some generalizations of fuzzy sets are discussed in [10, 12].

References

1. Bargiela, A., and Pedrycz, W. 2003. Granular Computing: An Introduction, Kluwer Academic Publishers

2. Bubnicki, Z. 2002. Uncertain Logics, Variables and Systems. Lecture Notes in Control and Information Sciences, no. 276, Springer-Verlag

3. Butnariu, D., and Klement, E.P. 1983. Triangular Norm Based Measures and Games with Fuzzy Coalitions, Kluwer Academic Publishers

4. Cowell, R.G., Dawid, A.P., Lauritzen, S.L., and Spiegelhalter, D.J. 1999. Probabilistic Networks and Expert Systems, Springer-Verlag

5. Edwards, D. 1995. Introduction to Graphical Modelling, Springer-Verlag

6. Giarratano, J., and Riley, G. 1994. Expert Systems: Principles and Programming, 2nd edition, PWS Publishing

7. Hansen, E. 1975. A generalized interval arithmetic, Lecture Notes in Computer Science, Springer-Verlag, 29: 7–18

8. Jaulin, L., Kieffer, M., Didrit, O., and Walter, E. 2001. Applied Interval Analysis, Springer-Verlag

9. Lenat, D.B., and Guha, R.V. 1990. Building Large Knowledge-Based Systems, Addison Wesley

10. Mendel, J.M., and John, R.I.B. 2002. Type-2 fuzzy sets made simple, IEEE Transactions on Fuzzy Systems, 10: 117–127

11. Moore, R. 1966. Interval Analysis, Prentice Hall

12. Pal, S.K., and Skowron, A. (Eds.). 1999. Rough Fuzzy Hybridization. A New Trend in Decision-Making, Springer-Verlag

13. Pawlak, Z. 1991. Rough Sets. Theoretical Aspects of Reasoning About Data, Kluwer Academic Publishers

14. Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann

15. Pedrycz, W. 1998. Shadowed sets: representing and processing fuzzy sets, IEEE Transactions on Systems, Man, and Cybernetics, Part B, 28: 103–109


16. Pedrycz, W., and Gomide, F. 1998. An Introduction to Fuzzy Sets: Analysis and Design, MIT Press

17. Pedrycz, W. 1999. Shadowed sets: bridging fuzzy and rough sets, In: Rough Fuzzy Hybridization. A New Trend in Decision-Making, Pal, S.K., and Skowron, A. (Eds.), Springer-Verlag, 179–199

18. Pedrycz, W., and Vukovich, G. 2000. Investigating a relevance of fuzzy mappings, IEEE Transactions on Systems, Man, and Cybernetics, 30: 249–262

19. Pedrycz, W. (Ed.). 2001. Granular Computing: An Emerging Paradigm, Physica-Verlag

20. Polkowski, L., and Skowron, A. (Eds.). 1998. Rough Sets in Knowledge Discovery, Physica-Verlag

21. Russell, S., and Norvig, P. 1995. Artificial Intelligence: A Modern Approach, Prentice-Hall

22. Skowron, A. 1989. Rough decision problems in information systems, Bulletin de l’Academie Polonaise des Sciences (Tech), 37: 59–66

23. Sowa, J. 2000. Knowledge Representation, Brooks/Cole

24. Warmus, M. 1956. Calculus of approximations, Bulletin de l’Academie Polonaise des Sciences, 4(5): 253–259

25. Zadeh, L.A. 1965. Fuzzy sets, Information & Control, 8: 338–353

26. Zadeh, L.A. 1979. Fuzzy sets and information granularity, In: Gupta, M.M., Ragade, R.K., and Yager, R.R. (Eds.), Advances in Fuzzy Set Theory and Applications, North Holland, 3–18

27. Zadeh, L.A. 1996. Fuzzy logic = Computing with words, IEEE Transactions on Fuzzy Systems, 4: 103–111

28. Zadeh, L.A. 1997. Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems, 90: 111–117

29. Zadeh, L.A. 1999. From computing with numbers to computing with words - from manipulation of measurements to manipulation of perceptions, IEEE Transactions on Circuits and Systems, 45: 105–119

30. Zadeh, L.A., and Kacprzyk, J. (Eds.). 1999. Computing with Words in Information/Intelligent Systems, Physica-Verlag

31. Zimmermann, H.J. 2001. Fuzzy Set Theory and Its Applications, 4th edition, Kluwer Academic Publishers

12. Exercises

1. Offer some examples of quantitative and qualitative variables.

2. What would be a shadowed set induced by the membership function A(x) = cos(x) defined over [0, π/2]?

3. What rules would you suggest to describe the following input-output relationship?

[The x-y relationship given in the original is not reproduced here.]

4. Suggest a rule-based description of the problem of buying a car. What attributes (variables) would you consider? Give a granular description of these attributes. Think of a possible quantification of relevance of the rules.

5. Derive a collection of rules from the decision tree shown below; order the rules with respect to their length (number of conditions).

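[Figure: decision tree with attribute nodes A, B, C, and D; branch labels a1, a2 (for A), b1, b2 (for B), c1, c2 (for C), and d1, d2, d3 (for D); leaves labeled with the classes ω1, ω2, and ω3.]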

6. The “curse of dimensionality” is present in rule-based systems. Consider that we are given n variables and each of them assumes p values. What is the number of rules in this case? To get a better sense of the growth of this number, take p = 5 and vary n from 5 to 20. Plot your findings, treating the number of rules as a function of n. How could you avoid this curse?

7. Obtain the rules from the following network:

[Figure: logic network with inputs a, b, c, and d, output y, and processing nodes labeled and, or, xor, and or.]

8. Suggest a membership function for the concept of fast speed. Discuss its semantics. Specify conditions under which such a fuzzy set could be effectively used.

9. Given are two fuzzy sets A and B with the following membership functions:

A = [0.7 0.6 0.2 0.0 0.9 1.0], B = [0.9 0.7 0.5 0.2 0.1 0.0]. Compute their union, intersection, and the expression C = A ∩ B̄.

10. You are given a two-dimensional grid in the x-y space where the size of the grid in each coordinate is one unit. How could you describe the concept of a circle (x − 10)^2 + (y − 5)^2 = 4 in terms of the components of this grid?
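As a starting point for Exercise 9, the following minimal sketch shows the set operations on discrete fuzzy sets, assuming the standard max (union), min (intersection), and 1 − x (complement) operators. The sets P and Q hold made-up membership grades, deliberately different from those in the exercise, and the function names are illustrative.

```python
def fuzzy_union(a, b):
    """Element-wise maximum of two membership vectors."""
    return [max(x, y) for x, y in zip(a, b)]


def fuzzy_intersection(a, b):
    """Element-wise minimum of two membership vectors."""
    return [min(x, y) for x, y in zip(a, b)]


def fuzzy_complement(a):
    """Standard complement: one minus each membership grade."""
    return [1.0 - x for x in a]


# Made-up membership grades over the same four-element universe
P = [0.2, 1.0, 0.5, 0.0]
Q = [0.6, 0.3, 0.5, 1.0]

print("union:        ", fuzzy_union(P, Q))
print("intersection: ", fuzzy_intersection(P, Q))
print("P and not Q:  ", fuzzy_intersection(P, fuzzy_complement(Q)))
```

Other t-norms and t-conorms could replace min and max; the choice here is only the most common default.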

Part 3
