Evolution-based Learning of Ontological Knowledge for a Large-scale Multi-agent

7.6 Decision Trees and Rough Set Rules

Previously, Reynolds [12] computed decision trees from the Tierras Largas phase through Monte Alban IIIa for all regions in the valley using Utgoff’s decision tree algorithm. The goal was to differentiate between the sites that are targets for attack and those that are not. Three variables were used to compute the decision: the presence of burnt daub at the site, other evidence of burning, and the presence of defensive walls. The variables used to predict these decisions from positive and negative examples in the training set were: Environmental Zone, Slope, Hilltop or Ridge Top, Soil Character, On the Boundary between the Loam and the Swampy Region, Water Source, Depth of Water Table, Type of Irrigation, and Land Use Type, among others.

In section 7.2 we presented a decision tree (Figure 7.1) and a corresponding decision system (Table 7.2) for the Rosario phase (700-500 B.C.) generated by the decision tree approach. It is the fourth phase of occupation in the study, and at that time population size and warfare increased substantially [6]. For example, it was observed that chunks of burnt daub appear on the surface of the villages seven times more frequently than in the previous phases. There are 36 sites in the Rosario phase. The archaic state emerged in the period following this phase of increased warfare.

First, we performed feature selection using the rough set approach guided by a Genetic Algorithm with the variables above. The rough set approach extracted the same five variables as did the decision tree approach. They are: Environmental Zone, Slope, Hilltop or Ridge Top, Water Source, Type of Irrigation, and Land Use. We then computed the reducts, and the corresponding decision system is given in Table 7.3. This table represents the exhaustive set of rules produced. While it is clear that several of the rules are so simple that they can be easily combined to produce a smaller set of rules overall, it is sufficient for comparative purposes here.
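To make the notion of a reduct concrete, the following is a minimal sketch that finds reducts by exhaustive search on a small decision table. The table is hypothetical and is not the survey data, and the brute-force search is only a stand-in for the Genetic Algorithm-guided search used in this chapter. A subset of attributes is accepted here if it preserves, for every object, the set of decisions observed among the objects it cannot be told apart from; the helper names are illustrative.

```python
from itertools import combinations

# Hypothetical toy decision table (NOT the Oaxaca survey data):
# each row is (condition attribute values, decision).
TABLE = [
    ({"slope": 1, "water": 3, "land": 2}, 1),
    ({"slope": 2, "water": 3, "land": 2}, 0),
    ({"slope": 1, "water": 2, "land": 2}, 0),
    ({"slope": 2, "water": 1, "land": 2}, 0),
    ({"slope": 2, "water": 1, "land": 2}, 1),  # conflicts with the row above
]


def generalized_decision(table, attrs):
    """For each row, the set of decisions among rows indiscernible on attrs."""
    classes = {}
    for cond, dec in table:
        key = tuple(cond[a] for a in attrs)
        classes.setdefault(key, set()).add(dec)
    return [frozenset(classes[tuple(cond[a] for a in attrs)])
            for cond, _ in table]


def reducts(table):
    """Minimal attribute subsets that preserve the generalized decision."""
    all_attrs = sorted(table[0][0])
    target = generalized_decision(table, all_attrs)
    found = []
    # Search by increasing size so that supersets of a reduct can be skipped.
    for size in range(1, len(all_attrs) + 1):
        for subset in combinations(all_attrs, size):
            if any(set(r) <= set(subset) for r in found):
                continue
            if generalized_decision(table, subset) == target:
                found.append(subset)
    return found


print(reducts(TABLE))
```

In this toy table the attribute land never varies, so it carries no information and drops out, while the two conflicting rows keep a split decision no matter which attributes are retained.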

Our focus here is on the impact of the use of a technique, such as rough sets, that is explicitly able to deal with uncertainty in the recognition decision. From this standpoint there are two basic points of comparison. First, how many of the rules identify a site for attack unambiguously, and what percentage of the rules that select sites for attack do they comprise? Second, in those cases in which the rule produces a split decision we will need to resolve the tie using other means. The question is, how much effort do we need to spend in order to find out that we must contact another source to resolve the question?

In answer to the first question, explicitly dealing with uncertainty using the rough set representation produced four rules that identify sites for attack as opposed to just three rules in the decision tree approach. Of these four rules, two (11 and 16) result in unambiguous decisions. That is, 50% of the rules that can conclude that a site can be attacked are unambiguous, whereas the other two need further clarification. The decision tree approach produces 3 rules that can conclude that a site can be attacked, with only one of them (rule 3) being conclusive. Thus, only 33% of the rules that identify a site for attack are conclusive as opposed to 50% for the rough set approach. By taking data uncertainty into account, the rough set approach not only produced more rules for the identification of the target concept, but also a higher percentage of unambiguous ones.

Table 7.3 Exhaustive Decision System for the Rosario Phase in Etla Region

Rules

1 env_zone(2) => decision(0)
2 water(1) AND land(2) => decision(0) OR decision(1)
3 env_zone(3) AND slope(2) AND water(2) AND irrig_type(0) AND land(2) => decision(0) OR decision(1)
4 slope(1) AND water(2) => decision(0)
5 water(4) => decision(0)
6 land(1) => decision(0)
7 land(4) => decision(0)
8 env_zone(4) => decision(0)
9 irrig_type(3) => decision(0)
10 irrig_type(4) => decision(0)
11 slope(1) AND water(3) => decision(1)
12 slope(2) AND water(3) => decision(0)
13 water(0) => decision(0)
14 irrig_type(2) => decision(0)
15 irrig_type(1) => decision(0)
16 water(3) AND irrig_type(0) => decision(1)
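The count of conclusive and inconclusive attack rules can be checked directly against Table 7.3. The sketch below encodes the sixteen rules as plain Python data and counts them. It assumes that decision(1) marks a site as a target for attack, which is inferred from the rule counts given in the text rather than stated in the table, and the function names are illustrative.

```python
# Rules of Table 7.3 as {rule number: (conditions, possible decisions)}.
RULES = {
    1: ({"env_zone": 2}, {0}),
    2: ({"water": 1, "land": 2}, {0, 1}),
    3: ({"env_zone": 3, "slope": 2, "water": 2,
         "irrig_type": 0, "land": 2}, {0, 1}),
    4: ({"slope": 1, "water": 2}, {0}),
    5: ({"water": 4}, {0}),
    6: ({"land": 1}, {0}),
    7: ({"land": 4}, {0}),
    8: ({"env_zone": 4}, {0}),
    9: ({"irrig_type": 3}, {0}),
    10: ({"irrig_type": 4}, {0}),
    11: ({"slope": 1, "water": 3}, {1}),
    12: ({"slope": 2, "water": 3}, {0}),
    13: ({"water": 0}, {0}),
    14: ({"irrig_type": 2}, {0}),
    15: ({"irrig_type": 1}, {0}),
    16: ({"water": 3, "irrig_type": 0}, {1}),
}

ATTACK = 1  # assumption: decision(1) means "target for attack"


def attack_rules(rules):
    """Rule numbers whose conclusion includes the attack decision."""
    return sorted(n for n, (_, decs) in rules.items() if ATTACK in decs)


def conclusive_attack_rules(rules):
    """Rule numbers that conclude attack and nothing else."""
    return sorted(n for n, (_, decs) in rules.items() if decs == {ATTACK})


candidates = attack_rules(RULES)
conclusive = conclusive_attack_rules(RULES)
share = len(conclusive) / len(candidates)
print(candidates, conclusive, share)
```

Under that assumption the four rules that can select a site for attack are 2, 3, 11 and 16, of which 11 and 16 are conclusive, matching the 50% figure quoted above.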

The other question concerns the relative amount of effort expended to produce an uncertain conclusion. In the decision system produced using Rough Sets, the inconclusive rules have fewer conditions to be checked than those from the decision tree approach. Specifically, the inconclusive rough set rules have 2 and 5 conditions respectively for a total of 7 conditions, one of which is shared between them (land type = 2). In the decision tree system 8 conditions must be checked in each of the 2 inconclusive rules for a total of 16. However, each shares the same 8, so that the total number of unique conditions to be tested is 8 as opposed to 6 for the Rough Set approach. More effort must then be expended in order to check the inconclusive rules in the decision tree approach as opposed to that for Rough Sets.
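The condition counts for the rough set side of this comparison follow from the two inconclusive rules of Table 7.3 (rules 2 and 3). A short sketch of the arithmetic, with each condition represented as an (attribute, value) pair; the variable names are illustrative:

```python
# The two inconclusive rough set rules from Table 7.3 (rules 2 and 3),
# each written as a set of (attribute, value) conditions.
RULE_2 = {("water", 1), ("land", 2)}
RULE_3 = {("env_zone", 3), ("slope", 2), ("water", 2),
          ("irrig_type", 0), ("land", 2)}

total_conditions = len(RULE_2) + len(RULE_3)   # counted once per rule
shared_conditions = RULE_2 & RULE_3            # tested by both rules
unique_conditions = RULE_2 | RULE_3            # distinct tests needed

print(total_conditions, len(unique_conditions), shared_conditions)
```

Note that water(1) and water(2) count as different conditions because they test different values of the same variable; only land(2) appears in both rules.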

Since both approaches extracted the same set of condition variables, the differences are likely to reflect the impact that noise in the data had on the relative performance of the approaches. By allowing for the presence of noise in the system the number and percentage of conclusive rules have been increased and the amount of effort spent on evaluating inconclusive rules decreased. This region and phase combination reflects an increased complexity in the warfare patterning when compared to previous periods. While the complexity isn’t nearly as great as in the subsequent periods when the state emerges, even in this case specific efficiencies accrue to the use of approaches that take uncertainty explicitly into account. In the next section we will investigate this hypothesis by comparing the performance of the two approaches over all periods of interest in the valley.

7.7 Results

The two representational approaches were compared over the seven periods that chronicle the emergence of social complexity in the valley. The decision tree results are based on the average of the best solution for each of 20 runs with Utgoff’s Decision Tree algorithm. The Rough Set approach describes the best solution produced by the Genetic Algorithm-guided Rough Set algorithm using the performance function described earlier.

DT-#c and RS-#c refer to the number of conditions in an average rule for the best decision tree and rough set rule sets, respectively. For each of the 7 periods, the number of conditions in the rough set approach is never greater than that for the decision tree approach. In fact, aside from period II, the number of terms in the rough set representation is less than that for decision trees. Rosario through Monte Alban Ic correspond to periods of escalating warfare associated with the emergence of a state centered at the site of Monte Alban in the valley. Period II corresponds to a period in which the entire valley is under the control of the state and the focus of warfare moves outside of the valley as the Oaxacan state attempted to subdue neighboring areas. Thus, the amount of warfare present in the valley at that time is markedly reduced, and the simplicity of its patterns is equally characterized by both approaches.

During the periods in which warfare patterns were the most complex (Rosario, Monte Alban Ia, Monte Alban Ic), the rough set representation produced rules with a total of 143 conditions as opposed to a total of 223 conditions for decision trees, a significant reduction in complexity. In terms of the simulation, if these rules need to be checked every time step for each of several thousand sites and several thousand agents per site, the computational time saved can be significant.
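As a rough illustration of the scale of the saving, the sketch below computes the relative reduction from 223 to 143 conditions and an upper bound on the number of condition checks saved per time step. The site and agent counts are placeholder assumptions standing in for "several thousand", and the bound assumes every condition is evaluated for every agent at every step, which a real simulation would likely avoid by stopping at the first failed condition.

```python
RS_CONDITIONS = 143  # rough set total over Rosario, Monte Alban Ia and Ic
DT_CONDITIONS = 223  # decision tree total over the same periods


def relative_reduction(baseline, reduced):
    """Fraction of the baseline removed by the reduced representation."""
    return (baseline - reduced) / baseline


def checks_per_step(conditions, sites, agents_per_site):
    """Worst case: every condition is evaluated for every agent."""
    return conditions * sites * agents_per_site


# Placeholder scale, assumed for illustration only.
SITES = 1000
AGENTS_PER_SITE = 1000

saved = (checks_per_step(DT_CONDITIONS, SITES, AGENTS_PER_SITE)
         - checks_per_step(RS_CONDITIONS, SITES, AGENTS_PER_SITE))
print(round(relative_reduction(DT_CONDITIONS, RS_CONDITIONS), 3), saved)
```

The 80 fewer conditions amount to a reduction of roughly 36% relative to the decision tree rule sets.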

DT-depth and RS-depth correspond to the maximum number of conditions in a rule in the best rule set. In this case the rough set representation always has fewer conditions on average than the decision tree representation. The increased number of conditions in the decision tree representation corresponds to the fact that explicit sources of noise are included as terms in the rules as opposed to being removed in the rough set representation. Variation can be produced by different surveyors and different landscapes, and if their characterization is not part of our goal, an approach such as Rough Sets that works to exclude these terms will be more successful.

DT-#var and RS-#var correspond to the number of unique variables used as terms in the best rule set of each. What is interesting here is that although the rough set approach produces a rule set with fewer rules and fewer conditions per rule, the number of variables never differs by more than one between the two approaches. They are both using the same information, but to a different effect in each.

This is, in fact, what we observed in the previous section where both approaches used the same subset of variables for their rule conditions. But, we observed that the actual behavior of the rules that were produced was different in terms of identifying the target concept.

Table 7.4 A comparison of the rules produced by using strict (DT) and rough set (RS) constraint representations. # is the average number of conditions in each rule of a rule set. Depth is the maximum length of the rules in the rule set. #var corresponds to the number of different variables used in each of the rules in the rule set.

7.8 Conclusions

In this chapter, the goal was to employ evolution-based techniques to mine a large-scale spatial data set describing the interactions of agents over several occupational periods in the ancient valley of Oaxaca, Mexico. Specifically, we want to extract from the data set spatial constraints on the interaction of agents in each temporal period. These constraints will be used to mediate the interactions of agents in a large-scale social simulation for each period and will need to be checked many times during the course of the simulation.

One of the major questions was how to represent the constraint knowledge. Popular data mining methods such as decision trees work well with data collected in a quantitative manner. However, the conditions under which the surface survey data was collected here introduced some uncertainty into the data. Would a representation that explicitly incorporated uncertainty into its structure produce a more efficient representation of the constraints here than one that did not? This is important since the complexity of the constraint set will impact the complexity of the simulation that uses those rules.

Here, we use Genetic Algorithms to guide the search for a collection of rough set rules to describe constraints on the location of particular types of warfare in the valley. Since warfare was a major factor in the social evolution in the valley, the constraints reflecting its spatial and temporal patterning are important ingredients in the model. The rules generated are compared with those produced by Utgoff’s Decision Tree algorithm. In each of the phases examined, the best rule set that used the rough set representation always had fewer conditions in it, and the average rule length was less than that for the decision tree approach in every case but one. In that case they were equal. The differences were most marked in those periods where the warfare patterns were most complex. It was suggested that the differences reflect the inclusion of noise factors as explicit terms in the decision tree representation and their exclusion in the rough set approach.

A comparison of two decision systems from the first period where the two approaches begin to show larger differences in rule and condition number, Rosario, demonstrates that the rough set approach has a smaller percentage of inconclusive rules and a larger percentage of conclusive ones than the decision tree approach. In addition, the rough set approach needs to evaluate fewer conditions for the inconclusive rules than the decision tree approach. These differences, it is argued, result from the explicit consideration of uncertainty in a period that is more complex and more prone to the introduction of such uncertainty than previous periods.

The focus of the comparisons here was on the syntactic or structural differences in the decision systems produced. In future work a comparison of the semantic differences will be accomplished by using the approaches to produce alternative ontologies in the agent-based simulation and assessing the differences that are produced. In other words, do the syntactic differences reflect semantic differences in simulation model performance? And what impact does the use of uncertainty to represent ontological knowledge of the agents have on the basic simulation results?

References

1. Weiss SM, Indurkhya N (1998). Predictive Data Mining A Practical Guide. Morgan Kaufmann Publishers, Inc., San Francisco, Ca.

2. Russell SJ & Norvig P (1995). Artificial Intelligence a Modern Approach. Prentice Hall, Upper Saddle River, NJ.

3. Fox M, Barbuceanu M, Gruninger M & Lin J (1998). An Organizational Ontology for Enterprise Modeling. Simulating Organizations: Computational Models of Institutions and Groups. Prietula MJ, Carley KM, Gasser L, eds. AAAI Press/MIT Press, Menlo Park, Ca., Cambridge, Ma.

4. Reynolds RG (1984). A computational model of hierarchical decision systems, Journal of Anthropological Archaeology, No. 3, pp. 159-189.

5. Sandoe K (1998). Organizational Mnemonics: Exploring the Role of Information Technology in Collective Remembering and Forgetting. Simulating Organizations: Computational Models of Institutions and Groups. Prietula MJ, Carley KM, Gasser L, eds. AAAI Press/MIT Press, Menlo Park, Ca., Cambridge, Ma.

6. Marcus J, Flannery KV, (1996) Zapotec Civilization – How Urban Societies Evolved in Mexico’s Oaxaca Valley, Thames and Hudson Ltd, London.

7. Blanton RE (1989). Monte Albán Settlement Patterns at the Ancient Zapotec Capital. Academic Press.

8. Blanton RE, Kowalewski S, Feinman G, Appel J (1982). Monte Albán's Hinterland, Part I, the Prehispanic Settlement Patterns of the Central and Southern Parts of the Valley of Oaxaca, Mexico. The Regents of the University of Michigan, The Museum of Anthropology.

9. Kowalewski SA, Feinman GM, Finsten L, Blanton RE & Nicholas LM (1989). Monte Albán's Hinterland, Part II, Prehispanic Settlement Patterns in Tlacolula, Etla, and Ocotlan, the Valley of Oaxaca, Mexico. Vol. 1. The Regents of the University of Michigan, The Museum of Anthropology.

10. Utgoff PE (1989). Incremental Induction of Decision Trees, in Machine Learning, P. Langley (ed.), pp. 161-186, Boston, Kluwer.

11. Reynolds RG, Al-Shehri H (1998). Data Mining of Large-Scale Spatio-Temporal Databases Using Cultural Algorithms and Decision Trees, in Proceedings of 1998 IEEE World Congress on Computational Intelligence, Anchorage, Ak.

12. Reynolds RG (1999). The Impact of Raiding on Settlement Patterns in the Northern Valley of Oaxaca: An Approach Using Decision Trees. In Dynamics in Human and Primate Societies, T. Kohler and G. Gumerman (eds.), Oxford University Press.

13. Goldberg DE (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Company, Inc.

14. Holland JH (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI.

15. Pawlak Z (1991). Rough Sets - Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers.

16. Øhrn A, Komorowski J, Skowron A, Synak P (1998). The Design and Implementation of a Knowledge Discovery Toolkit Based on Rough Sets: The Rosetta System. Rough Sets in Knowledge Discovery, L. Polkovski & A. Skowron (eds.), Physica Verlag, Heidelberg, Germany.

17. Øhrn A (2000). Rosetta Technical Reference Manual. Tech. Rep., Department of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway.

18. Lazar A, Sethi IK (1999). Decision Rule Extraction from Trained Neural Networks Using Rough Sets. Intelligent Engineering Systems Through Artificial Neural Networks, CH Dagli, AL Buczak & J Ghosh (eds.), Vol. 9, pp. 493-498, ASME Press, New York, NY.

19. Ågotnes T (1999, February). Filtering Large Propositional Rule Sets While Retaining Classifier Performance, Master's thesis, Norwegian University of Science and Technology.

20. Vinterbo S & Øhrn A (1999). Approximate Minimal Hitting Sets and Rule Templates.

21. Vinterbo S (1999, December). Predictive Models in Medicine: Some Methods for Construction and Adaptation. Ph.D. Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science.

22. Øhrn A & Komorowski J (1997, March). Rosetta - a Rough Set Toolkit for Analysis of Data. Proceedings of Third International Joint Conference on Information Sciences, Durham, NC, Vol. 3, pp. 403-407.

23. Komorowski J & Øhrn A (1999) Modelling prognostic power of cardiac tests using Rough Sets. Artificial Intelligence in Medicine, Vol. 15, No. 2, pp. 167-191.

24. Fogel DB (1995). Evolutionary Computation - Toward a New Philosophy of Machine Learning. IEEE Press.

25. Ågotnes T, Komorowski J and Øhrn A (1999). Finding High Performance Subsets of Induced Rule Sets: Extended Summary. Proceedings Seventh European Congress on Intelligent Techniques and Soft Computing (EUFIT'99), Aachen, Germany, H. J. Zimmermann and K. Lieven (eds.).

26. Crowston K (1994). Evolving Novel Organizational Forms. Computational Organization Theory. Carley KM, Prietula MJ, eds. Lawrence Erlbaum Associates Publisher, Hillsdale, NJ.


From the document Advanced Information and Knowledge Processing (pages 108-114)