Automatic learning strategies and their application to electrophoresis analysis


ROCH, Christian Maurice, et al.


ROCH, Christian Maurice, et al. Automatic learning strategies and their application to electrophoresis analysis. Computerized Medical Imaging and Graphics, 1989, vol. 13, no. 5, p. 383-391.

DOI: 10.1016/0895-6111(89)90225-5

Available at: http://archive-ouverte.unige.ch/unige:47497

Disclaimer: layout of this document may differ from the published version.


Computerized Medical Imaging and Graphics, Vol. 13, No. 5, pp. 383-391, 1989. Printed in the U.S.A. All rights reserved. 0895-6111/89 $3.00 + .00. Copyright © 1989 Pergamon Press plc.

AUTOMATIC LEARNING STRATEGIES AND THEIR APPLICATION TO ELECTROPHORESIS ANALYSIS

Christian Roth’, Thierry Pun’, Denis F. Hochstrasses and Christian Pellegrini’

¹Computer Science Center, University of Geneva, 12, rue du Lac, CH-1207 Geneva, Switzerland, and ²Digital Imaging Group, University Hospital, CH-1211 Geneva 4, Switzerland

(Received 30 November 1988; Revised 10 March 1989)

Correspondence to Christian Roch; electronic mail address: (UUCP) mcvax!cernvax!cui!roch, (BITNET) roch@cgeuge51, (EAN) roch@cui.unige.ch.

Abstract: Automatic learning plays an important role in image analysis and pattern recognition. A taxonomy of automatic learning strategies is presented; this categorization is based on the amount of inferences the learning element must perform to bridge the gap between environmental and system knowledge representation level. Four main categories are identified and described: rote learning, learning by deduction, learning by induction, and learning by analogy. An application of learning by induction to medical image analysis is then presented. It consists in the classification of two-dimensional gel electrophoretograms into meaningful distinct classes, as well as in their conceptual description.

Key Words: Image analysis, Pattern recognition, Automatic learning, Artificial intelligence, Expert system, Conceptual clustering, Two-dimensional gel electrophoresis

1. INTRODUCTION

Why present a review about learning, or more precisely about automatic learning strategies, in a medically oriented journal?

In pattern recognition, after having extracted an object, for instance from a medical image, we must be able to classify it into one of many predetermined categories. The classification is usually performed using discriminant function(s) or decision trees (1, 2, 3, 4). Automatic learning strategies may be used to construct such discriminant function(s) by examining a large collection of objects and learning what classes may be developed for their categorization; this process is sometimes called the inverse pattern recognition problem. In artificial intelligence these discriminant functions are used to formulate expert system rules that could help to establish a diagnosis.

It is worth saying that classification is not learning: learning uses classification results from a set of given objects in order to construct (new) rules. These rules will be used later to interpret (classify, for instance) a new object.

Most of the literature about automatic learning begins with philosophical arguments about the "genuine" meaning of what is called learning, in a cognitive (psychological) sense as well as in a "computer" sense (1, 5, 6, 7). We do not have sufficient room here to elaborate on this; we will therefore just give a definition from Herbert Simon in (1):

Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task or tasks drawn from the same population more efficiently the next time.

This article is concerned with establishing a taxonomy of automatic learning strategies belonging to "artificial intelligence" methods (as opposed to statistical ones). By artificial intelligence methods one must understand not only methods which are incorporated in an expert system, but also, and mostly, methods which deal with high-level concepts and try to reason in a human-like fashion (8, 9, 10).

People have always tried to create machines mimicking human behavior by giving them capabilities of learning. If we examine the computer era since 1945, we can distinguish three different stages in the evolution of automatic learning. The first centered on building general-purpose learning systems that start with little or no initial knowledge. These systems were generally referred to as neural nets or self-organizing systems. Theoretical as well as hardware limitations were discovered that dampened the optimism of these early AI researchers (11, 12, 13).

The second stage began with the realization that a system cannot learn any high-level concepts when starting without any knowledge at all (14). Some researchers incorporated large amounts of domain knowledge into learning systems so that they could discover high-level concepts (Meta-Dendral [15], AM [16]).


The last stage finally started when people building expert systems looked for new ways of acquiring knowledge for and by their systems. While the first two stages focused on rote learning and learning from examples only, a large spectrum of learning methods were then investigated (learning from instruction, learning by analogy, learning by observation, etc.).

In Section 2 automatic learning strategies are presented, whose taxonomy is based on the amount of inferences the learning element performs. Section 3 describes the learning algorithm we implemented in the MELANIE system. Some concluding remarks are finally given.

Throughout the article, bibliographical references are mentioned as a guide to further reading. Very good bibliographies about automatic learning can be found in (1) (up to 1983) and in (6) (up to 1986). These bibliographies have categorical and individual entries. Categories are learning strategies, domains of application, research methodologies, etc. Individual entries are indexed by pointers to the category or categories they belong to.

2. LEARNING STRATEGIES

A learning element receives information mostly from the environment (knowledge from an external source), and sometimes from the system itself, since new information gained during attempts to perform a task may be fed back to the learning element. A learning system must indeed have some way of reusing its own conclusions (5). To do so, the system must evaluate these conclusions in order to decide whether it can use them to refine rules and/or learn new ones. This is analogous to the concept of meta-rules, that is, rules that control other rules. The most important task of the learning element is therefore to transform information provided by the environment into some representation, and to add this new information to the knowledge base.

This translation is a two-faceted problem: syntactic and semantic. Syntactic, because environmental information is provided in an "analogical" way (e.g., a medical image) while system information is stored "digitally" using description mechanisms such as production rules, semantic nets, etc. (8, 9, 10, 17, 18, 19, 20).

Semantic as well, because the information level can be very high (abstract information) or very low (detailed information): the learning element must therefore, respectively, specialize or generalize the information in order to allow its easy interpretation. This article deals only with the semantic translation problem.

The taxonomy of learning strategies presented below is based on the following criterion: how many inferences (hypotheses) does the learning element perform to bridge the gap between environmental and system knowledge level? For instance, a medical image is stored digitally pixel by pixel (a pixel is a picture element); a more abstract representation could be the pixel group; more and more abstract representations may use attributes such as the size of the image, the number of rows and columns, the number of spots, etc.
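To make these levels of description concrete, the following minimal Python sketch (our own illustration, not part of the original system) builds the three representations mentioned above for a toy image: the raw pixel array, a thresholded group of pixels, and a handful of global attributes.

```python
import numpy as np

image = np.zeros((8, 10), dtype=np.uint8)       # lowest level: pixel by pixel
image[2:4, 3:5] = 200                           # one bright "spot"
image[6, 7] = 180                               # another bright pixel

pixel_groups = image > 100                      # intermediate level: mask of bright pixels
attributes = {                                  # highest level: global attributes
    "rows": image.shape[0],
    "columns": image.shape[1],
    "number_of_bright_pixels": int(pixel_groups.sum()),
}
print(attributes)   # {'rows': 8, 'columns': 10, 'number_of_bright_pixels': 5}
```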

This criterion is the most frequently used in the literature (1, 5, 6); other classification criteria may depend on the type and formalism of the knowledge acquired (decision trees, production rules, etc.) or on the domain of application.

2.1 Rote learning

In this strategy the information supplied by the environment (or the teacher) is directly usable by the learning element without any inference: rote learning is memorization (21). Other denominations for rote learning are learning by being programmed and learning by memorization.

This strategy is therefore more important than it may appear at first sight, since learning systems must have a rote-learning process to store, maintain, and retrieve new information stemming from either external or internal sources.

In fact the learner does nothing but store information: all the work is done by the teacher, who forms, organizes, and sorts knowledge in such a way that the learner can understand it without transformation.

2.2 Learning by deduction

In the learning by deduction strategy the information given by the teacher is too abstract, too general to be understood by the system. The learning element must therefore transform those abstract data into more specific ones by forming hypotheses to fill in missing details. This is the reason why we speak of deductive learning, since deduction is the process of inferring specific facts from general data. From the background knowledge that "all persons born in the USA are American" and from the sentence "Ronald was born in the USA," we can deduce that "Ronald is an American."
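A minimal Python sketch of this deductive step (the rule and attribute names are simply those of the example above, chosen for illustration): the general background rule is applied to a specific fact in order to fill in the missing detail.

```python
# Illustrative sketch: deduction goes from general knowledge to a specific fact.
def deduce_nationality(person):
    # Background knowledge: "all persons born in the USA are American."
    if person.get("born_in") == "USA":
        return {**person, "nationality": "American"}   # the deduced, specific fact
    return person

ronald = {"name": "Ronald", "born_in": "USA"}
print(deduce_nationality(ronald))
# {'name': 'Ronald', 'born_in': 'USA', 'nationality': 'American'}
```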

In (6) the authors have subdivided the learning by deduction strategy into two subcategories: learning by being told and learning by reformulation.

2.2.1 Learning by being told

This strategy was, until 1986, the only type of learning by deduction (1, 5). Learning by taking advice, learning from instruction, and operationalization are other names used in the literature.

The role of the teacher is somewhat less important than in rote learning, since the teacher does not need to transform the information into an internally usable representation. Nevertheless, in order for the learning element to quickly acquire new knowledge, the teacher must present and organize his advice in the same way a school-teacher would do it. In that sense, "learning from instruction parallels most formal education methods" (1, p. 8).

Hayes-Roth, Klahr and Mostow (5) have defined the steps an expert system has to perform when it receives advice from a teacher:

request (advice from the expert);
interpret (transform into an internal representation);
operationalize (convert into a usable form);
integrate (into the knowledge base);
evaluate (the resulting actions of the performance element).

We cannot speak about learning by being told without mentioning Mostow's program FOO (First Operational Operationalizer) (5; 1, chap. 12). FOO needs human assistance to perform both the interpretation step and the operationalization step. FOO provides, nevertheless, an in-depth analysis of the techniques required to perform operationalization. A well-known application used by FOO is the card game of Hearts. The game is played as a sequence of tricks. In each trick, one player, who is said to have the lead, starts the trick by playing a card, and each of the other players continues the trick by playing a card during his (or her) turn. If he can, each player must follow suit, that is, play a card of the same suit as the suit led. The player who played the highest valued card in the suit takes the trick and any point cards contained in it. Every heart counts as one point, and the queen of spades is worth 13 points. The goal of the game is to avoid taking points (5). Typical instructions could be: "do not lead a high card in a suit in which an opponent is void," "if an opponent has the queen of spades, try to flush it," or even "avoid taking points."

2.2.2 Learning by reformulation

This strategy was recently identified (6). The learner reformulates and restructures already-available knowledge in order to learn. This process takes place mainly at the syntactic level. Other key words are problem reformulation, knowledge compilation, and reconstructive memory.

It is not a strategy per se; it is often combined with another learning strategy which learns the new information on which the reformulation is performed. For instance, UNIMEM is a program that can accept a large quantity of relatively unstructured facts about a domain, use generalization techniques to determine important concepts, and then use these concepts to organize the information in a fashion that allows further generalization and intelligent question answering (6, p. 194) (22).

2.3 Learning by induction

Among all learning strategies, learning by induction is the one which has generated the largest number of research works, particularly in the learning from examples category (1, 5, 6, 23).

Unlike the learning by deduction strategy where environmental information was too abstract, the teacher in inductive learning provides specific and detailed information to the learning element whose task is therefore to hypothesize more general rules.

The inductive paradigm can be stated as: given facts and background knowledge, find or "induce" a hypothesis rule H which, together with the background knowledge BK, implies or explains the given facts F:

H ∧ BK ⇒ F.

For example: from the facts "Ronald, republican, is in favour of the SDI project" and "George, republican, is in favour of the SDI project," and from the background knowledge that "all republicans share common political ideas," we can induce that "all republicans are in favour of the SDI project."

As we can see from the inductive paradigm, the information given by the teacher (facts or examples) is used to improve the expert system by finding or refining its rules for manipulating knowledge, while in deductive learning we mostly add new facts into the knowledge base.

Since, in the inductive learning process, we want to acquire new or better rules through examples in order for the system to act properly (when new facts arrive), the teacher must provide examples and counter-examples. The learning element may therefore find one or more rules to separate examples from counter-examples.

Depending on whether the learning element knows or does not know which facts are examples or counter-examples, we can further subdivide the inductive learning strategy into three categories.

2.3.1 Learning from examples

Most research work studying the learning process has been done in the learning from examples strategy (1, 5, 6, 14, 15, 16, 24). The teacher provides examples and counter-examples illustrating a concept. The learning element, knowing whether a fact is an example or not, induces a general concept description that describes all positive examples and none of the counter-examples. In that sense we speak also of concept acquisition, generalization, etc. We can notice that the amount of inference performed by the learner is much greater than in learning by deduction, since no general concepts are provided by a teacher.

Michalski (6, p. 15) has made two subdivisions within this form of learning: instance-to-class generalization and part-to-whole generalization. In the former, the goal is to induce a general description of a class, given a set of examples (instances) of this class. In the latter, the aim is to hypothesize a description of a whole object (scene, situation, process) given selected parts of it.
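The sketch below (the attribute names are invented for illustration) performs instance-to-class generalization in the simplest possible way: each attribute is generalized to the range of values seen in the positive examples, and the resulting description is checked against the counter-examples.

```python
positives = [{"size": 4, "spots": 610}, {"size": 5, "spots": 590}]   # examples
negatives = [{"size": 9, "spots": 300}]                              # counter-examples

def generalize(examples):
    """Instance-to-class generalization: per-attribute value ranges over the examples."""
    return {k: (min(e[k] for e in examples), max(e[k] for e in examples))
            for k in examples[0]}

def covers(description, instance):
    return all(lo <= instance[k] <= hi for k, (lo, hi) in description.items())

concept = generalize(positives)
assert all(covers(concept, p) for p in positives)        # describes all examples
assert not any(covers(concept, n) for n in negatives)    # and none of the counter-examples
print(concept)   # {'size': (4, 5), 'spots': (590, 610)}
```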

2.3.2 Learning from observation

The learner receives unclassified examples and must classify them by observing their similarities and differences. The learner must therefore perform more inferences than in the other learning strategies in order to find, for instance, concept descriptions of multiple classes (1, 6). Other names for this approach are taxonomic classification, conceptual clustering, unsupervised learning, etc.

We will discuss this strategy in more detail in Section 3, where we present the conceptual clustering algorithm from Michalski and Stepp (1, chap. 11). It has been implemented in order for our image analysis expert system, MELANIE, to learn concept descriptions of patient classes from a given set of medical images.

2.3.3 Learning by discovery

The learner investigates a domain in an unaided, exploratory fashion and discovers new concepts and relationships among them (1, 6, 25). Theory formation is another name for this strategy.

AM (16) and BACON (1) are two well-known systems that partly use a learning by discovery strategy; they aim at discovering concepts in, respectively, elementary mathematics and set theory, and classical scientific laws.

Before leaving the inductive learning strategy, let us say a few words about statistical learning. Most inductive learning techniques have as an objective to classify objects, something the statistical approach does easily as well. In addition, statistical learning is faster than the approaches mentioned here. However, it neglects the problem of explicitly describing the results that are obtained. "Statistical methods provide only a class description of a pattern. They do not describe a pattern so as to allow its generation given its class, nor do they describe aspects of a pattern which make it ineligible for assignment to another class" (26, p. 337). This point is developed in Section 3, where results obtained using both statistical learning and the inductive approach are compared.

2.4 Learning by analogy

The learning by analogy strategy, the last of our enumeration, is a combination of deductive and inductive learning (1, 6, 27). Other key words are concept learning by analogy, analogical problem solving, reminding, etc.

Learning by analogy is inductive learning, for in order to find common structures between two problems the learning element must first generalize (make an induction from) each problem: two detailed facts cannot have common features unless they are identical. It is deductive as well, when we deduce, from the rule explaining the already-solved problem, a new rule explaining the new problem.

Figure 1 is a simple sketch of the learning by analogy paradigm. The Base is an already-solved object; that is, dependence relations between A and B are known and stored in the knowledge base. The Target is the object the learning element tries to compare with the Base (find analogies with) in order to learn new rules β′ or new facts A′ or B′. Once common structures between A and A′ (labeled α), and B and B′ (α′), have been found, new rules β′, explaining why A′ implies B′, can be deduced from the rule β (explaining why A implies B) (11). On the other hand, if there are some similarities between A and A′ and if β′ is known, for instance β′ = β, new facts B′ can be discovered (27).

Fig. 1. A theoretical model of the learning by analogy strategy (α, α′: resemblance/difference relations, i.e., similarity; β, β′: dependence relations, i.e., causality).
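A toy Python reading of Figure 1 (the objects and attributes are invented for illustration): when the target A′ shares attribute values with the base A (the α similarity), the base dependence β is transferred unchanged, which hypothesizes the new fact B′.

```python
base_A   = {"shape": "round", "material": "rubber"}   # A: already-solved object
base_B   = {"bounces": True}                          # B: known consequence (via beta)
target_A = {"shape": "round", "material": "plastic"}  # A': new object to explain

def analogize(base_a, base_b, target_a):
    alpha = {k for k in base_a if target_a.get(k) == base_a[k]}   # similarities
    if alpha:                       # some common structure was found
        return dict(base_b)         # beta' = beta: predict B' for the target
    return None                     # no analogy applies

print(analogize(base_A, base_B, target_A))   # {'bounces': True}
```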

3. LEARNING IN MELANIE

3.1 The domain of application

Two-dimensional electrophoresis is a fairly new and very powerful biochemical technique that allows separation of the proteins contained in a biological sample (28). The process yields gelatinous bidimensional protein maps (2D gels). Since biological material contains many proteins, a 2D gel is typically characterized by a thousand or more spots of various sizes. When 2D gels are used as a diagnosis tool, medical conclusions have to be drawn based on the presence or absence of some of these spots. A typical occurrence of this is when each gel comes from a different patient; the analysis to be performed aims at discriminating between normal and pathological samples (29, 30). Due to the high complexity of these images, "human" comparison of gels is almost impossible; recourse to computerized analysis is therefore mandatory (Fig. 2).

Fig. 2. A bidimensional electrophoretogram obtained from human plasma: original size is approximately 16 × 20 cm. The number of spots is about 600.

Computerized classification of gels proceeds in two phases: learning, which is accomplished only once using a set of gels representative of the disease, and routine exploitation, when a diagnosis must be provided for new gels. During the first phase, gels stemming from the same type of sample (liver, serum, plasma, etc.) but from different patients are analyzed.

Spots which characterize them best are determined and used to construct new rule(s). The expert system has thereby learned how to discriminate pathological from healthy patients' gels and is then able to propose a diagnosis when a new gel is considered (routine exploitation phase).

The MELANIE system (Medical ELectrophoresis ANalysis-Interactive Expert system) implements a learning process in order to find rules discriminating healthy from sick patients (31, 32). Into which learning strategy does this learning process fall, according to the taxonomy given above? Obviously, we are in the learning by induction strategy, since we have specific facts (gels) that we want to classify in order to be able to infer a general rule. As we do not know beforehand which gels are from healthy or from sick patients, we chose an algorithm in the learning from observation strategy, more precisely a conceptual clustering algorithm.

3.2 Conceptual clustering

Conceptual clustering (1, 2, 3, 6, 33, 34, 35, 36, 37, 38) means that we classify objects using concept-sensitive measures of similarity, that is, measures depending not only on the properties of individual objects, but also on any external concept(s) which might be useful to characterize object configurations. The aim here is to avoid cluster characterizations that are hard for humans to interpret, such as would be obtained with statistical measures.


Fisher and Langley (33) have given, in an excellent paper, two views of conceptual clustering:

Methods of conceptual clustering are viewed as extensions or analogs of techniques of numerical taxonomy, a collection of methods developed by natural and social scientists to form classification schemes over data sets.

Already alluded to is that conceptual clustering is a form of concept formation or learning by observation, as opposed to learning from examples.

They have also divided conceptual clustering into two problems: the aggregation problem (creation of a set of classes) and the characterization problem (determining conceptual description(s) for a given class). We can note that the second problem is nothing else but the learning from examples paradigm.

The conceptual clustering algorithm described by Michalski and Stepp in (1, chap. 11) and implemented in the MELANIE system has the following general structure:

Let g1, g2, . . . , gn be n 2D gels and let k be the number of classes to constitute. k gels are selected (called the starting gels), each of which is assumed to belong to a different class. Thus k classes will be defined, each one with one representative gel. The two steps to be carried out are the following:

1. Heuristic search. Insert each of the n - k gels not yet selected into one of the k classes, then give a conceptual description of each class.

2. Iterative phase. Select one gel per class, in order to form k new classes, each one with one representative gel, then repeat step 1.

This process is repeated until the classifications converge (i.e., until no better classification can be obtained). The best classification and the descriptions of the corresponding classes form the result. Clustering quality is evaluated according to some measures (which may or may not depend on the domain of application) called by Michalski lexicographical evaluation functionals (1). For instance, a classification is better than another if the classes that compose it are "more" disjoint than the classes constituting the other classification. For a given classification, such a "being disjoint" property can be numerically evaluated as follows: the sum, over all characteristic spots, of the intervals between the ranges of values is computed. For instance (Table 1), this value is:

4 (spot 10: 4 - 0) + 3 (spot 25: 3 - 0) + . . . + 7 (spot 477: 13 - 6),

that is, 43. The classification yielding the highest value is chosen.

These two steps form the clustering module, the inner part of the whole algorithm. The surrounding part is the hierarchy-building module, which uses the former module to determine a hierarchy of clusters. First, it tries to find the best number k of classes for a given set of objects (remember that the clustering module needs the number k of classes as a parameter). Secondly, it subdivides each class into subclasses in order to construct a hierarchy.
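The overall loop can be summarized by the following simplified sketch (our own reading of the scheme, not the MELANIE implementation; the similarity measure and the stopping rule are deliberately naive). Gels are represented as dictionaries mapping spot numbers to densities; classes are described by per-spot density ranges and scored with the same disjointness idea as above.

```python
import random

def similarity(gel_a, gel_b):
    """Negative sum of absolute density differences over all spots involved."""
    spots = set(gel_a) | set(gel_b)
    return -sum(abs(gel_a.get(s, 0) - gel_b.get(s, 0)) for s in spots)

def describe(cls, spots):
    """Conceptual description of a class: (min, max) density range per spot."""
    return {s: (min(g.get(s, 0) for g in cls), max(g.get(s, 0) for g in cls))
            for s in spots}

def disjointness(descriptions, spots):
    """Sum of gaps between consecutive class ranges (larger is better)."""
    score = 0
    for s in spots:
        ranges = sorted(d[s] for d in descriptions)
        for (_, hi), (lo, _) in zip(ranges, ranges[1:]):
            score += max(0, lo - hi)
    return score

def cluster(gels, k, iterations=20, seed=0):
    rng = random.Random(seed)
    spots = set().union(*gels)
    seeds, best = rng.sample(gels, k), None
    for _ in range(iterations):
        # step 1: heuristic search -- attach every gel to its most similar seed
        classes = [[s] for s in seeds]
        for gel in gels:
            if gel in seeds:
                continue
            classes[max(range(k), key=lambda i: similarity(gel, seeds[i]))].append(gel)
        descriptions = [describe(c, spots) for c in classes]
        score = disjointness(descriptions, spots)
        if best is None or score > best[0]:
            best = (score, classes, descriptions)
        # step 2: iterative phase -- pick new representative gels and repeat
        seeds = [rng.choice(c) for c in classes]
    return best

gels = [{10: 5, 477: 14}, {10: 4, 477: 13}, {10: 0, 477: 4}, {10: 0, 477: 6}]
score, classes, descriptions = cluster(gels, k=2)
print(score, descriptions)
```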

Let us study an experiment. Eight 2D gels of plasma are to be classified into two classes: four gels belong to one class and the remaining four to the other class. Together with the classification, we are interested in knowing which are the characteristic spots, that is, the spots which contribute the most to the classification. Both conceptual clustering and statistical methods (correspondence analysis and ascendant hierarchical classification [30, 39]) have found the correct classification. Nevertheless each provides different characteristic spots.

Table 1 indicates the characteristic spots given by the conceptual clustering algorithm. The numbers appearing in the "class 1" and "class 2" columns are spot integrated densities that were measured on the image. They range from 0 (nonexistent spot) to 15 (highest density).

Table 1. Listing of the characteristic spots and their corresponding values in each class, as a result of the conceptual clustering algorithm. Spots are numbered from 1 to 578 (first column). Columns 2 and 3 give spot densities, ranging from 0 to 15.

Spot number    Class 1            Class 2
10             4, 5, 6            0
25             3, 5, 6, 7         0
45             6, 7, 8, 9, 10     0
110            3, 4, 6            0
124            8, 11, 12, 13      0
154            3, 5, 6, 7         0
155            3, 4, 5, 6, 7      0
159            6, 7, 8, 9, 10     0
477            13, 14, 15         1, 4, 5, 6


This table shows that a gel with a density for spot 10 between 4 and 6 belongs to the first class; gels belonging to the second class have no spot corresponding to spot 10. For spot 477, a value in (13, 14, 15) indicates a gel in class 1, and a value in (1, 4, 5, 6) indicates class 2.
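Once such a description is available, classifying a new gel is straightforward. A minimal sketch (helper names are ours; the ranges are taken from the two spots discussed above):

```python
class_descriptions = {
    "class 1": {10: (4, 6), 477: (13, 15)},
    "class 2": {10: (0, 0), 477: (1, 6)},
}

def classify(gel, descriptions):
    """Assign the gel to the class whose characteristic-spot ranges it matches best."""
    def matches(desc):
        return sum(lo <= gel.get(spot, 0) <= hi for spot, (lo, hi) in desc.items())
    return max(descriptions, key=lambda name: matches(descriptions[name]))

new_gel = {10: 5, 477: 14}                       # densities measured on a new image
print(classify(new_gel, class_descriptions))     # 'class 1'
```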

Correspondence analysis gives results which can be represented in the form of a graph in a factorial space (Fig. 3). Gels and spots that are found close to each other correspond to the same class; this indicates which are the essential characteristics for constituting the partition. The factorial analysis approach may therefore indicate which are the characteristic spots, but it does not provide the range of densities within which they must lie to specify a class.

Fig. 3. Simultaneous projection of the 8 gels and the 10 most significant spots into the factorial space (defined by the two largest nontrivial eigenvalues); the axes are the first and second factors.

4. CONCLUSION

The purpose of this paper is to show that automatic learning is currently becoming a very well-defined domain: extensive studies have been done, as well as unification and integration works that have led to taxonomies such as the one presented here. This research area is a very exciting and promising field, and we hope this review paper can help new researchers approach the automatic learning research domain.

A taxonomy of automatic learning strategies was presented; several categories were defined according to the amount of work the learning element must perform to semantically digest new information. These categories are nevertheless not disjoint: rote learning participates in each category, learning from examples is a step of the conceptual clustering strategy, etc.

A conceptual clustering algorithm implemented in a medical image analysis system, MELANIE, has been described: belonging to the learning by observation strategy, it classifies 2D gels into several classes (sick or healthy patients) according to spot densities and gives conceptual class descriptions.

SUMMARY

Automatic learning plays an important role in image analysis and pattern recognition. A taxonomy of automatic learning strategies is presented; this categorization is based on the amount of inference the learning element must perform to bridge the gap between environmental and system knowledge representation level. The information supplied by the teacher can either be too detailed, the learning element having therefore to induce more general hypotheses, or too abstract, the learning element having to deduce some specific facts.


Four main categories are identified:

Rote learning: no inferences are performed; the information is directly usable by the expert system.

Learning by deduction: the information is too abstract (high-level information) and hypotheses must be stated in order to fill in missing details. Learning by being told and learning by reformulation are two subcategories of learning by deduction: in the former, the teacher gives advice or instructions to the system, which tries, using its background knowledge and the characteristics of its knowledge representation, to form a rule expressing the given advice; in the latter, the learner reformulates and restructures already-available knowledge in order to learn.

Learning by induction: the information is too detailed and most of the work consists in inferring (forming hypotheses about) less specific information by generalizing. Three subclasses are identified within this category: learning from examples, learning from observation, and learning by discovery. In the learning from examples category, the teacher provides examples and counter-examples; the learning element, knowing which events are positive or negative, induces a general concept description that explains all of the positive examples and none of the negative ones. The learning from observation strategy also tries to give descriptions of object classes, but without knowing to which class an object belongs (the learning element must first classify the objects). In the last subcategory, learning by discovery, the learner investigates a domain in an unaided, exploratory fashion and discovers new concepts and relationships among them.

Learning by analogy: this strategy combines inductive and deductive learning. The learning element searches its knowledge base for a case similar to the one presented by the teacher; if it finds such a case, it tries to act on the new case in a similar manner.

A conceptual clustering algorithm is then presented as an illustration of the learning from observation strategy. Conceptual clustering methods try, after classification, to give conceptual descriptions of classes; conceptual descriptions differ from numerical descriptions in the sense that they explicitly provide class attributes. Numerical methods tend to present results that are more difficult to interpret. This algorithm has been implemented in order to classify two-dimensional medical images (of electrophoresis gels) that human experts have difficulty interpreting, and to explain conceptually why it has found these classes.

Acknowledgments: The authors would like to thank all researchers contributing to the MELANIE project. They are particularly indebted to Drs. R. Appel and M. Funk from the Computer Science Department of the Faculty of Sciences of the University of Geneva, as well as to V. Villars-Augsburger and Professor A. F. Müller from the University Hospital of Geneva. They also want to thank J.-M. Bost, D. Brunet, C. Coiteux-Rosu, and M. Miller (National Institutes of Health, Bethesda, Maryland, USA) for their contributions. This research is supported in part by the FNRS (Swiss National Fund for Scientific Research), under grant No. 2.448-0.87.

REFERENCES

1. Michalski, R.S.; Carbonell, J.G.; Mitchell, T.M., eds. Machine learning: an artificial intelligence approach, Vol. I. Palo Alto, CA: Tioga Publishing Company; 1983.
2. Michalski, R.S. Knowledge acquisition through conceptual clustering: a theoretical framework and an algorithm for partitioning data into conjunctive concepts. International Journal of Policy Analysis and Information Systems 4:219-244; 1980.
3. Michalski, R.S.; Stepp, R.E.; Diday, E. A recent advance in data analysis: clustering objects into classes characterized by conjunctive concepts. In: Kanal, L.N.; Rosenfeld, A., eds. Progress in Pattern Recognition. Amsterdam: North-Holland Publishing Company; 1981: 33-56.
4. Quinlan, J.R. Discovering rules from large collections of examples: a case study. In: Michie, D., ed. Expert Systems in the Micro Electronic Age. Edinburgh: Edinburgh University Press; 1979.
5. Cohen, P.R.; Feigenbaum, E.A., eds. The handbook of artificial intelligence, Vol. 3, Chapter XIV. William Kaufmann, Inc.; 1982.
6. Michalski, R.S.; Carbonell, J.G.; Mitchell, T.M., eds. Machine learning: an artificial intelligence approach, Vol. II. Los Altos, CA: Morgan Kaufmann Publishers, Inc.; 1986.
7. Minsky, M.L. The society of mind. New York: Simon and Schuster; 1985-1986.
8. Lenat, D.B. The nature of heuristics. Artificial Intelligence 19:189-249; 1982.
9. Lenat, D.B. Theory formation by heuristic search: the nature of heuristics II; background and examples. Artificial Intelligence 21:31-59; 1983.
10. Lenat, D.B. EURISKO: a program that learns new heuristics and domain concepts; the nature of heuristics III: program design and results. Artificial Intelligence 21:61-98; 1983.
11. McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophysics 5:115-133; 1943.
12. Minsky, M.L.; Papert, S. Perceptrons: an introduction to computational geometry. Cambridge, MA: MIT Press; 1969.
13. Rosenblatt, F. The Perceptron: a perceiving and recognizing automaton. Rep. No. 85-460-1, Project PARA, Cornell Aeronautical Laboratory; 1957.
14. Winston, P.H. Learning structural descriptions from examples. In: Winston, P.H., ed. The Psychology of Computer Vision. New York: McGraw-Hill; 1975.
15. Buchanan, B.G.; Feigenbaum, E.A. DENDRAL and META-DENDRAL: their applications dimension. Artificial Intelligence 11:5-24; 1978.
16. Lenat, D.B. AM: an artificial intelligence approach to discovery in mathematics as heuristic search. Doctoral dissertation, reprinted in: Davis, R.; Lenat, D.B. Knowledge-based systems in artificial intelligence. New York: McGraw-Hill; 1980.
17. Charniak, E.; McDermott, D. Introduction to artificial intelligence. Reading, MA: Addison-Wesley; 1985.
18. Nilsson, N.J. Principles of artificial intelligence. Palo Alto, CA: Tioga Publishing Company; 1980.
19. Rich, E. Artificial intelligence. New York: McGraw-Hill; 1983.
20. Winston, P.H. Artificial intelligence, 2nd edition. Reading, MA: Addison-Wesley; 1984.
21. Samuel, A.L. Some studies in machine learning using the game of checkers. In: Feigenbaum, E.A.; Feldman, J., eds. Computers and Thought. New York: McGraw-Hill; 1963: 71-105.
22. Lebowitz, M. Experiments with incremental concept formation: UNIMEM. Machine Learning 2:103-138; 1987.
23. Michalski, R.S. A theory and methodology of inductive learning. Artificial Intelligence 20:111-161; 1983.
24. Mitchell, T.M.; Keller, R.M.; Kedar-Cabelli, S.T. Explanation-based generalization: a unifying view. Machine Learning 1(1):47-80; 1986.
25. Langley, P.; Michalski, R.S. Machine learning and discovery. Editorial of Machine Learning 1(4):363-366; 1986.
26. Bunke, H. Hybrid approaches. In: Ferraté, G.; Pavlidis, T.; Sanfeliu, A.; Bunke, H., eds. Syntactic and Structural Pattern Recognition. Berlin: Springer Verlag; 1988: 335-361.
27. Winston, P.H. Learning and reasoning by analogy. Communications of the ACM 23(12):689-702; 1980.
28. O'Farrell, P.H. High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem. 250:4007-4021; 1975.
29. Hochstrasser, D.; Funk, M.; Appel, R.D.; Pellegrini, C.; Müller, A.F. Clinical application of automatic computer comparison of 2D gel electrophoresis images: diagnosis of Waldenstroem macroglobulinemia? In: Galteau, M.-M.; Siest, G., eds. Recent Progresses in Two-Dimensional Electrophoresis. Presses Universitaires de Nancy; 1986: 117-120.
30. Pun, T.; Hochstrasser, D.F.; Appel, R.D.; Funk, M.; Villars-Augsburger, V.; Pellegrini, C. Computerized classification of two-dimensional gel electrophoretograms by correspondence analysis and ascendant hierarchical clustering. Journal of Applied and Theoretical Electrophoresis 1:3-9; 1988.
31. Appel, R.D. MELANIE: un système d'analyse et d'interprétation automatique d'images de gels d'électrophorèse bidimensionnelle; systèmes experts et apprentissage automatique. Geneva: Le Concept Moderne; 1987.
32. Funk, M. MELANIE: un système d'analyse et d'interprétation automatique d'images de gels d'électrophorèse bidimensionnelle; traitement de l'image et systèmes experts. Geneva: Le Concept Moderne; 1987.
33. Fisher, D.H.; Langley, P. Approaches to conceptual clustering. Proc. 9th IJCAI, Los Angeles; 1985: 691-697.
34. Fisher, D.H.; Langley, P. Conceptual clustering and its relation to numerical taxonomy. In: Gale, W., ed. Artificial Intelligence and Statistics. Reading, MA: Addison-Wesley; 1986: 77-116.
35. Fisher, D.H. Knowledge acquisition via incremental conceptual clustering. Machine Learning 2:139-172; 1987.
36. Langley, P. Machine learning and concept formation. Editorial of Machine Learning 2(2):99-102; 1987.
37. Michalski, R.S.; Stepp, R.E. Automated construction of classifications: conceptual clustering versus numerical taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-5(4); July 1983.
38. Stepp, R.E.; Michalski, R.S. Conceptual clustering of structured objects: a goal-oriented approach. Artificial Intelligence 28:43-69; 1986.
39. Benzécri, J.P. L'Analyse des Données. Tome 2: L'Analyse des Correspondances. Paris: Dunod; 1973.

About the Author: CHRISTIAN ROCH received his M.Sc. degree in Computer Science at the University of Geneva in 1986. Since then he has worked at the Computer Science Center, University of Geneva, on a research project on Artificial Intelligence and Medicine. His interests lie mostly in machine learning.

About the Author: THIERRY PUN received his electrical engineering diploma in 1979, and his doctoral degree in image processing in 1982, both from the Swiss Federal Institute of Technology in Lausanne. From 1982 to 1985, he was a visiting fellow at the National Institutes of Health (Bethesda, MD, USA), where he developed several methods and software packages for medical image analysis. From 1985 to 1986, he was at the European Laboratory for Particle Physics in Geneva (CERN), where he worked on signal and image analysis problems. Since then, he has been with the Computer Science Center, University of Geneva, where he currently holds a position of associate professor of computer science. He teaches computer graphics, image processing and analysis, and computer vision. His research interests deal with medical image analysis and computer vision. Dr. T. Pun has authored or co-authored numerous publications in image processing and analysis and computer vision.

About the Author: DR. DENIS HOCHSTRASSER obtained his M.D. degree at Geneva University Medical School, did his internship and residency at U.N.C. at Chapel Hill (NC), did a fellowship in Internal Medicine at Geneva University Hospital, was a guest researcher at the N.I.H. in Bethesda in biochemical genetics for one year, and now attends on the ward of Medicine at Geneva University Hospital, where he leads a numeric imaging group.

About the Author: CHRISTIAN R. PELLEGRINI was born in Geneva (Switzerland) in 1945. He attended the University of Geneva and received the M.Sc. degree in physics in 1970 and the Ph.D. degree in computer science in 1975. Dr. Pellegrini worked as a research associate in the newly created department of computer science. From 1977 to 1978 he joined the IBM T.J. Watson Research Center in Yorktown Heights (NY) as a post-doctoral fellow, where he worked on laboratory automation and distributed systems. From 1978 to 1980 he was lecturer and research associate at the University of Geneva. Professor of computer science since 1980, Dr. Pellegrini is at present chairman of the computer science department. His research activities are in artificial intelligence, with special interest in knowledge representation, reasoning, machine learning, and neural networks. He is a member of ACM, Eurographics, and the Swiss Association for Informatics (SI).
