• Aucun résultat trouvé

Construction of optimal sequential decision trees,

N/A
N/A
Protected

Academic year: 2021

Partager "Construction of optimal sequential decision trees,"

Copied!
24
0
0

Texte intégral

(1)
(2)

LIBRARY OFTHE MASSACHUSETTS INSTITUTE

(3)
(4)
(5)

CONSTRUCTION OF OPTIMAL SEQUENTIAL DECISION TREES

by CZrfhu^r WilliamA. Martin April 1971

MASSACHUSETTS

^UTE

OF

TECHNOLOGY

50

MEMORIAL

DRIVE

[BRIDGE.

MASSACHUSETTS

(

(6)
(7)

I

MASS.INST.TECH.

JUL

12

1971

LIBRARY

DEWEr

CONSTRUCTION OF OPTIMAL SEQUENTIAL DECISIONTREES

by Cirthu.r WilliamA. Martin

(8)
(9)

ABSTRACT

The goal of a sequential decision problem is to identify one object selected from a knovm set of objects by applying aseries of tests to the object. The outcomes of tests already performed can be used to select the next test. Each test may be applied at most once. Given the a priori probabi-lities of the objects, the cost of each test, and the probabilities of the outcomes of each test given the object to vjhlch it is applied, an algorithm

is given for constructing the tree of tests v;hich minimizes the average

cost of identifying an object. It is assumed that the test results do not depend on the order inwhich the tests are applied. The algorithm uses a

branch and bound procedure to construct partial decision trees. A lov7er bound on the cost of completing a partial tree is taken as the minimum

cost of a set of tests V7hich will provide information about the objects

equal to their entropy, under the assum.ption that the tests are independent.

(10)
(11)

A Lower Bound

We are given a set X of objects and a series of K tests. The k test has outcomes in the set Y . We are also given the cost C of each test and

the joint probability distribution p(>:, y , ...,y, ) that object X occurs

and tests 1 to k produce outcomes y to y •

The entropy H(X) of the set of objects before any tests are performed is defined to be

Since each term in the sum is > 0, it is clear that the entropy is zero if and only if the probability of some object is equal to 1.

There-fore, if the entropy is not zero after several tests have been applied it

is not possible that the object has been identified. Similarly, if the

average decrease in entropy produced from several tests is not at least equal to the initial entropy then it is not possible for the objects to be identified in every case.

The decrease in entropy from applying testY, is

1(X;Y ) = H(X) -H(xIy ) =

-V

p(>:,y,,•••,y^) log p(x)

^'^1

\

X.Y-..,Y^

S

^ p^'^-'^'i

V

^°2 p(^'Iyi)

S-^ PCxIy )

(12)
(13)

- 2

Similarly, the decrease in entropy from applying tv.'O tests Y and Y is

H(X) -HCXJY.Y ) =

-Z2

p(x,y^,...,y^,) log p(x)

1 J YY Y

"*'2_/ p(x,y ,...,y ) log p(x|y y )

2^

^ pU.yi>--.>yK)

1^2-:^—

+

2.

. p(>-'»yi.---.yK>

^^^-YuWir

= I(X;YJ + I(X;Y^|y^)

But I(X;Y.|y.) < I(X;Y.)

since

EpCy.)p(y.)

V7hich using the inequality logV7 < (u - 1) becomes

Y^

pCy^pCyJ

l(X;Y.|Y^) - l(X;Y.) < (

2.

p(x,y,....y,)

(14)
(15)

(Note that equality occurs when pCy.jy.) = p(y.!)p(y.).)

We use the above vrell-knovm result to compute a lower bound on the cost of testing the objects. The result says that each test yields at most an

entropy equal to that Which results when it is applied first. We give the

tests the benefit of the doubt and compute these entropies. Using the costs of the tests vje then compute the cost per unit of entropy hopefully to be received from each test and rank the tests in the order of increasing

cost per unit. Finally, using this ranking, we form the lox-?est cost set of tests v.-'hich could possibly yield entropy equal to that of the set of objects. A fractional part of one test may be included in this set in order

to get the exact amount of entropy needed. VJe then include the sair.e fraction of its cost in the cost of this set, which is the lower bound in question.

Branch and Bound

ArmedV7ith this lower boundwe can employ a branch and bound algorithm

like that described by Reinwald and Soland (2). We form a tree of partial

decision trees. The root node of the tree of partial decision trees corresponds to no tests applied. Anode is a complete node of a decision

tree if no tests remain which have not been applied in reaching that node, if one object has probability one at that node, or if branches extend from

that node. Anode of the tree of partial decision trees is a terminal node if the decision tree at that node contains no incomplete nodes. The

ave-rage cost of the decision tree of each non-terminal node is bounded by the costs of the tests applied so far plus the lovjer bounds on the cost of

(16)
(17)

- 4

each of the sub-decision trees v^hich has xiotbeen completed.

When the cost of a terminal node is less.than the lower bounds on all thenon-terninal nodes, this terminal node represents the optimum tree.

When this is not the case the non-terminal nodeV7hich contains the partial decision tree V7ith the lovrest cost bound is extended by adding abranch foreach test not yet applied.

It is possible that some uncertainty vjill remain after all of the tests

are applied. In this case, more than one order of applying the tests vill yield the same results. This situation can be detected by evaluatlns the result of applying all of the tests in a given order. If the reinaining un-certainty is small it may be desirable to find the best tree which reduces

(18)
(19)

References

1. Gallager, H.G., "InformationTlieory and Reliable Communication", Wiley (1968).

2. Reinwaldj L_. and Soland R., "Conversion of Limited-Entry Decision

Tables to Opt' ;! Computer Programs I: MinimumAverage

ProcessingTj . JACM (July 1966)

.

2. Reinv7ald, L. and Soland R., "Conversion of Limited-Entr>' Decision

Tables to Optimum Computer Programs II: Minimum Storage

(20)
(21)
(22)

Date

Due

(23)

^2-?-3 TDfiO

0D3

fl75

3b3

5A^-3 TOflO

0D3 TOb

40fi

||||||||ll|||||||((||||||n HD28 M.I.T. AlfredP. Sloan lllllillllllllllllllll 5'z5,ff

Mklk

school of Management

3T0flDD03TDb3fl5

Nos.523-71-Working Papers.

Nos. 532-71 Nos. 523-71 to Nos. 532-71 3 TOflO

DD3

fl7S

3T7

I 3 TOflD

DD3 TDb 3T0

3 TDflD

0D3 TOb 3bb

3 TDfiD

D03

fl7S

413

3 ^OfiD

DD3 TDb

fi4b 3

S060 003

TOb

ttbJ. 3 TDfiO

003 TOb 671

1527-7^

BASFMENF

^ J, SIGN

thiscardandpresentitwithbook

atthe

CIRCULATION

DESK

5>9'7| ^M-71 531

MASSACHUSETTS

INSTITUTE

OF

TECHNOLOGY

LIBRARIES

DEWEY

LIBRARY

f?2-7'

(24)

Références

Documents relatifs

For this reason, and in order to provide an attractive hook for users, we have included support for personal information management and gamification elements into

Indeed, the Situational Method Engineering [4, 5] is grounded on the assimilation approach: constructing ad-hoc software engineering processes by reusing fragments of existing

Suppose R is a right noetherian, left P-injective, and left min-CS ring such that every nonzero complement left ideal is not small (or not singular).. Then R

Facebook founder Mark Zuckerberg famously claimed that “By giving people the power to share, we’re making the world more transparent.” Lanier is interested in a type of

In summary, the absence of lipomatous, sclerosing or ®brous features in this lesion is inconsistent with a diagnosis of lipo- sclerosing myxo®brous tumour as described by Ragsdale

The Canadian Task Force on Preventive Health Care recommends screening adults 60 to 74 years of age for colorectal cancer (strong recommendation), but given the severe

Ce but est atteint selon la presente invention grace a un detecteur de pression caracterise par le fait que la jante de la roue est realisee en un materiau amagnetique, le capteur

Le nouveau produit comporte de la sphaigne blanchie en combinaison avec de la pulpe de bois mecanique flnement dlvisee ayant one valeur CSF (Canadian Standart Freeness est