SCHEMA THEORY - Mathematical Models of Genetic Algorithms

Mathematical Models of Genetic Algorithms

4.1 SCHEMA THEORY

Consider the simple problem max_x f(x), where f(x) = x². Suppose we encode x as a 5-bit integer, where the bit string 00000 represents decimal 0, and the bit string 11111 represents decimal 31. The maximum of f(x) occurs when x = 11111. Not only that, but any bit string that begins with a 1 is better than every bit string that begins with a 0. This leads to the concept of a schema. A schema is a bit pattern that describes a set of individuals, where an * is used to represent a "don't care" bit. For example, the bit strings 11000 and 10011 both belong to the schema 1****.

This schema is a very high-fitness schema for the function x². Any bit string that belongs to this schema is better than every bit string that does not belong to it.
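The claim about 1**** can be checked by brute force over all 32 strings. The following is a minimal sketch; the helper names `matches` and `fitness` are illustrative and not taken from the text.

```python
def matches(bits, schema):
    """Return True if the bit string is an instance of the schema ('*' = don't care)."""
    return len(bits) == len(schema) and all(s in ("*", b) for b, s in zip(bits, schema))

def fitness(bits):
    """f(x) = x^2, with x decoded as an unsigned integer."""
    return int(bits, 2) ** 2

# All 5-bit strings, split by membership in the schema 1****.
all_strings = [format(i, "05b") for i in range(32)]
inside = [b for b in all_strings if matches(b, "1****")]
outside = [b for b in all_strings if not matches(b, "1****")]

worst_inside = min(fitness(b) for b in inside)    # 10000 -> 16**2 = 256
best_outside = max(fitness(b) for b in outside)   # 01111 -> 15**2 = 225
print(worst_inside > best_outside)
```

The worst instance of the schema still beats the best non-instance, which is what makes 1**** a high-fitness schema for this problem.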

GAs combine schemata in a way that results in highly fit individuals.

Consider bit strings of length two. The schemata (plural of schema) with length two are **, 0*, 1*, *0, *1, 00, 01, 10, and 11. There are a total of nine unique schemata of length two. In general, there are a total of 3^l schemata of length l.

Now consider the number of schemata to which a bit string belongs. As an example, notice that 01 belongs to four schemata: 01, *1, 0*, and **. In general, a bit string of length l belongs to 2^l schemata.
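Both counts can be reproduced by enumeration. This sketch assumes the three-symbol alphabet {0, 1, *} described above; the function names are illustrative only.

```python
from itertools import product

def all_schemata(l):
    """Every schema of length l: each position is 0, 1, or '*'."""
    return ["".join(p) for p in product("01*", repeat=l)]

def schemata_of(bits):
    """Every schema the bit string belongs to: each position keeps its bit or becomes '*'."""
    return ["".join(p) for p in product(*[(b, "*") for b in bits])]

print(len(all_schemata(2)))        # 3**2 schemata of length two
print(sorted(schemata_of("01")))   # the four schemata that contain 01
```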

Now consider a population of N bit strings, each of length l. Each bit string in the population belongs to a certain set of schemata. We say that the union of these N sets of schemata is the set to which the entire population belongs. If all the bit strings are identical, then each bit string belongs to the same 2^l schemata, and the entire population belongs to 2^l schemata. At the other extreme, all the bit strings may be unique and not belong to any of the same schemata except for the universal schema **···*. In this case the entire population belongs to N·2^l − (N − 1) schemata. We see that a population of N bit strings, each of length l, belongs to somewhere between 2^l and N(2^l − 1) + 1 schemata.
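The two extremes can be illustrated by taking the union of the per-string schema sets. This is a small sketch with hypothetical populations of length-3 strings; the pair 101 and 010 is chosen because the two strings differ in every position and therefore share only the universal schema.

```python
from itertools import product

def schemata_of(bits):
    """Set of schemata the bit string belongs to."""
    return {"".join(p) for p in product(*[(b, "*") for b in bits])}

def population_schemata(population):
    """Union of the schema sets of every string in the population."""
    result = set()
    for bits in population:
        result |= schemata_of(bits)
    return result

l = 3
identical = ["101", "101", "101"]    # lower extreme: 2**l schemata
complementary = ["101", "010"]       # upper extreme for N = 2: N*(2**l - 1) + 1
mixed = ["101", "100", "001"]        # somewhere in between

for pop in (identical, complementary, mixed):
    n = len(pop)
    print(pop, len(population_schemata(pop)), "bounds:", 2 ** l, n * (2 ** l - 1) + 1)
```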

The number of defined bits (that is, non-asterisks) in a schema is called the order o of the schema. For example, o(1***0) = 2, and o(0*11*) = 3.

The number of bits from the left-most defined bit to the right-most defined bit in a schema is called its defining length δ. For example, δ(1***0) = 4, δ(0*11*) = 3, and δ(1****) = 0.
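The two definitions translate directly into code. The sketch below uses illustrative function names and returns a defining length of 0 for a schema with no defined bits, which is an assumption since the text does not cover that case.

```python
def order(schema):
    """Number of defined (non-asterisk) positions in the schema."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """Distance from the left-most to the right-most defined position."""
    defined = [i for i, c in enumerate(schema) if c != "*"]
    # Assumption: a schema with no defined bits has defining length 0.
    return defined[-1] - defined[0] if defined else 0

for h in ("1***0", "0*11*", "1****"):
    print(h, order(h), defining_length(h))
```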

A bit string that belongs to a schema is called an instance of the schema. For example, the schema 0*11* has four instances: 00110, 00111, 01110, and 01111.

In general, the number of instances that a schema has is equal to 2^A, where A is the number of asterisks in the schema. Note that A = l − o.
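The instance count can be confirmed by expanding each asterisk into both bit values. This is a minimal sketch; `instances` is an illustrative name.

```python
from itertools import product

def instances(schema):
    """All bit strings that belong to the schema."""
    choices = [("0", "1") if c == "*" else (c,) for c in schema]
    return ["".join(p) for p in product(*choices)]

h = "0*11*"
found = instances(h)
asterisks = h.count("*")            # A = l - o
print(found, len(found) == 2 ** asterisks)
```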

We use the notation m(h, t) to represent the number of instances of schema h at generation t in a GA. We use f(x) to denote the fitness of the bit string x. We use f(h, t) to denote the average fitness of the instances of schema h in the population at generation t:

$$ f(h,t) = \frac{\sum_{x \in h} f(x)}{m(h,t)} \qquad (4.1) $$

We use f(t) to denote the average fitness of the entire population at generation t.

If we use roulette-wheel selection to choose the parents of the next generation, we see that the expected number of instances of h after selection is

$$ E[m(h,t+1)] = \frac{\sum_{x \in h} f(x)}{f(t)} = \frac{f(h,t)\, m(h,t)}{f(t)} \qquad (4.2) $$
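As a numerical illustration of the expected count after roulette-wheel selection, the sketch below evaluates f(h, t) m(h, t) / f(t) for a small hypothetical population under f(x) = x². The population and the helper names are assumptions made for the example, and fitness values are assumed to be positive so that the population average is nonzero.

```python
def matches(bits, schema):
    """Return True if the bit string is an instance of the schema."""
    return all(s in ("*", b) for b, s in zip(bits, schema))

def fitness(bits):
    """f(x) = x^2 for an unsigned binary encoding."""
    return int(bits, 2) ** 2

def expected_after_selection(population, schema):
    """Expected number of instances of the schema after roulette-wheel selection."""
    members = [b for b in population if matches(b, schema)]
    m = len(members)
    if m == 0:
        return 0.0
    f_h = sum(fitness(b) for b in members) / m                      # average fitness of instances
    f_bar = sum(fitness(b) for b in population) / len(population)   # population average fitness
    return f_h * m / f_bar

population = ["01101", "11000", "01000", "10011"]   # hypothetical population, N = 4
print(expected_after_selection(population, "1****"))
print(expected_after_selection(population, "0****"))
```

Because selection draws N parents, the expected counts of 1**** and 0**** add up to N = 4 here, with the fitter schema taking the larger share.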

Next we perform crossover with probability p_c. We assume that the crossover point is between bits, and never at the end of a bit string. We obtain two children from each pair of parents. What is the probability that crossover will destroy a schema?

Let us look at a few examples.

• Consider the schema h = 1****. Crossover will never destroy this schema. If an instance of this schema crosses with another bit string, at least one child will be an instance of h.

• Consider the schema h = 11***. If an instance of this schema crosses with bit string x, the crossover point could be one of four places. If the crossover point is between the two most significant bits, then the schema might be destroyed, depending on the value of x. However, if the crossover point is to the right of that point (three other possible crossover points), then the schema will never be destroyed; at least one child will be an instance of h. We see that the probability of destroying the schema h is less than or equal to 1/4, depending on where crossover occurs.

• Consider the schema h = 1*1**. If an instance of this schema crosses with bit string x, the crossover point could be one of four places. If the crossover point is between the two 1 bits (two possible crossover points), then the schema might be destroyed, depending on the value of x. However, if the crossover point is to the right of the rightmost 1 bit (two other possible crossover points), then the schema will never be destroyed; at least one child will be an instance of h. We see that the probability of destroying the schema h is less than or equal to 1/2, depending on where crossover occurs.
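The three examples above can be verified exhaustively. The sketch below assumes single-point crossover that produces two children, tries every possible partner x at every crossover point, and records the points at which some partner leaves neither child in the schema. The function names are illustrative.

```python
from itertools import product

def matches(bits, schema):
    """Return True if the bit string is an instance of the schema."""
    return all(s in ("*", b) for b, s in zip(bits, schema))

def risky_points(schema, parent):
    """Crossover points (1 .. l-1) at which at least one partner destroys the schema."""
    l = len(schema)
    partners = ["".join(p) for p in product("01", repeat=l)]
    risky = []
    for c in range(1, l):                     # cut between position c and position c + 1
        for x in partners:
            child1 = parent[:c] + x[c:]
            child2 = x[:c] + parent[c:]
            if not matches(child1, schema) and not matches(child2, schema):
                risky.append(c)               # neither child is an instance of the schema
                break
    return risky

for schema, parent in (("1****", "10000"), ("11***", "11000"), ("1*1**", "10100")):
    points = risky_points(schema, parent)
    print(schema, points, len(points) / (len(schema) - 1))
```

In each case the fraction of risky crossover points is 0, 1/4, and 1/2 respectively, matching the bounds stated in the examples.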

Generalizing the above, we see that the probability that crossover will destroy a schema, if it occurs, is less than or equal to δ/(l − 1). The probability that crossover occurs at all is p_c, so the total probability that crossover destroys a schema is less than or equal to p_c δ/(l − 1). Therefore, the probability that a schema will survive crossover is

$$ p_s \ge 1 - p_c \left( \frac{\delta}{l-1} \right) \qquad (4.3) $$
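To see how the survival bound depends on defining length, the following sketch evaluates 1 − p_c δ/(l − 1) for the three example schemata. The crossover probability 0.7 is an arbitrary illustrative value, and the function names are not from the text.

```python
def defining_length(schema):
    """Distance from the left-most to the right-most defined position."""
    defined = [i for i, c in enumerate(schema) if c != "*"]
    return defined[-1] - defined[0] if defined else 0

def crossover_survival_bound(schema, p_c):
    """Lower bound on the probability that the schema survives crossover (assumes l >= 2)."""
    return 1 - p_c * defining_length(schema) / (len(schema) - 1)

p_c = 0.7   # hypothetical crossover probability
for h in ("1****", "11***", "1*1**", "1***0"):
    print(h, crossover_survival_bound(h, p_c))
```

Schemata with a short defining length keep a bound close to 1, while 1***0, whose defined bits span the whole string, has the weakest guarantee.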

Next we perform mutation with a probability of p_m per bit. The number of defined (non-asterisk) bits in h is the order of h and is denoted as o(h). The probability that a defined bit mutates is p_m, and the probability that it does not mutate is 1 − p_m. Therefore, the probability that none of the defined bits mutate is (1 − p_m)^{o(h)}.

This probability is of the form g(x) = (1 − x)^y. The Taylor series expansion of g(x) around x_0 is

$$ g(x) = \sum_{n=0}^{\infty} g^{(n)}(x_0) \frac{(x - x_0)^n}{n!} \qquad (4.4) $$

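Setting x_0 = 0 gives

$$ g(x) = \sum_{n=0}^{\infty} g^{(n)}(0) \frac{x^n}{n!} = 1 - xy + \frac{x^2 y(y-1)}{2!} - \frac{x^3 y(y-1)(y-2)}{3!} + \cdots \approx 1 - xy \quad \text{for } xy \ll 1 \qquad (4.5) $$

So if p_m o(h) ≪ 1, then (1 − p_m)^{o(h)} ≈ 1 − p_m o(h). Combining this with Equations (4.2) and (4.3) gives

$$ E[m(h,t+1)] \ge \frac{f(h,t)}{f(t)}\, m(h,t) \left( 1 - p_c \frac{\delta}{l-1} \right) \left( 1 - p_m o(h) \right) \qquad (4.6) $$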
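The combined bound can be evaluated numerically. The sketch below multiplies the selection ratio by the crossover and mutation survival factors, and also compares the exact mutation survival probability (1 − p_m)^{o(h)} with its first-order approximation. All numeric values (m, the fitness averages, p_c, p_m) are hypothetical, and the function names are illustrative.

```python
def order(schema):
    """Number of defined (non-asterisk) positions."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """Distance from the left-most to the right-most defined position."""
    defined = [i for i, c in enumerate(schema) if c != "*"]
    return defined[-1] - defined[0] if defined else 0

def schema_bound(m, f_h, f_bar, schema, p_c, p_m):
    """Lower bound on E[m(h, t+1)] after selection, crossover, and mutation (assumes l >= 2)."""
    survive_crossover = 1 - p_c * defining_length(schema) / (len(schema) - 1)
    # First-order approximation of (1 - p_m)**o(h); by Bernoulli's inequality
    # it never exceeds the exact value, so the result remains a lower bound.
    survive_mutation = 1 - p_m * order(schema)
    return (f_h / f_bar) * m * survive_crossover * survive_mutation

# Hypothetical values: two instances, schema average 468.5, population average 292.5.
m, f_h, f_bar, p_c, p_m = 2, 468.5, 292.5, 0.7, 0.01
short_bound = schema_bound(m, f_h, f_bar, "1****", p_c, p_m)
long_bound = schema_bound(m, f_h, f_bar, "1***0", p_c, p_m)
print(short_bound, long_bound)

exact = (1 - p_m) ** 3      # exact survival for a schema of order 3
approx = 1 - p_m * 3        # Taylor approximation
print(exact, approx)
```

With the same selection advantage, the short, low-order schema 1**** keeps a bound well above its current count of 2, while the long schema 1***0 does not.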

Suppose that a schema is short; that is, its defining length δ is small. Then δ/(l − 1) ≪ 1. Suppose that we use a low mutation rate, and a schema is of low order; that is, there are not many defined bits. Then p_m o(h) ≪ 1. Suppose that a schema has above-average fitness; that is, f(h, t)/f(t) = k > 1, where k is some constant. Finally, suppose that we have a large population so that E[m(h, t + 1)] ≈ m(h, t + 1). Then we can approximately write

$$ m(h, t+1) \ge k\, m(h,t) \ge k^{t+1} m(h,0) \qquad (4.7) $$

where the second inequality follows from applying the first one recursively. This results in the following theorem, which is called the schema theorem.


Theorem 4.1 Short, low-order schemata with above-average fitness values receive

From the document EVOLUTIONARY OPTIMIZATION ALGORITHMS (pages 98-101)