

From the document EVOLUTIONARY OPTIMIZATION ALGORITHMS (pages 89-97)

Genetic Algorithms

3.5 A SIMPLE CONTINUOUS GENETIC ALGORITHM

Figure 3.6 outlines a simple binary GA, and that is exactly the algorithm that we used in Examples 3.2 and 3.3. However, the problems in those examples are defined on a continuous domain, and so we had to discretize the domain to apply the binary GA. It would be simpler and more natural if we could apply a GA directly to the continuous domain of the problems. We use the term continuous GAs, or real-coded GAs, to refer to GAs that operate directly on continuous variables.

The extension of GAs from binary domains to continuous domains is pretty straightforward. In fact, we can still use the algorithm of Figure 3.6 - we just need to modify some of the steps in that algorithm. Look at the operations in Figure 3.6 and consider how they might work on an optimization problem with a continuous domain.

1. In Figure 3.6, we first generate a random initial population. We can easily do this on a continuous domain. Suppose that we want to generate N individuals in our GA. Then we denote the i-th individual as x_i for i ∈ [1, N]. Also suppose that we want to minimize an n-dimensional function on a continuous domain. Then we use x_i(k) to denote the k-th element of x_i:

x_i = [ x_i(1)  x_i(2)  ···  x_i(n) ].   (3.15)

Suppose that the search domain of the k-th dimension is [x_min(k), x_max(k)]:

x_i(k) ∈ [ x_min(k), x_max(k) ]   (3.16)

for i ∈ [1, N] and k ∈ [1, n]. We can generate a random initial population, as in the first line of Figure 3.6, as follows:

For i = 1 to N
    For k = 1 to n
        x_i(k) ← U[ x_min(k), x_max(k) ]
    Next k
Next i

That is, we simply set each x_i(k) equal to a realization of a random variable that is uniformly distributed between x_min(k) and x_max(k).
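As a sketch in Python (the function and variable names here are our own, not from the text), the initialization loop above follows directly from Equation (3.16):

```python
import random

def initialize_population(N, x_min, x_max):
    """Generate N individuals uniformly at random within the search domain.

    x_min and x_max are per-dimension bound lists, as in Equation (3.16).
    """
    n = len(x_min)
    return [[random.uniform(x_min[k], x_max[k]) for k in range(n)]
            for _ in range(N)]

# Example: 10 individuals on the two-dimensional domain [-1, 1] x [-1, 1]
population = initialize_population(10, [-1.0, -1.0], [1.0, 1.0])
```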

2. Next, we begin the "while not (termination criterion)" loop in Figure 3.6. The first step in that loop is to calculate the fitness of each individual. If we are trying to maximize f(x), then we calculate the fitness of each x_i by computing f(x_i). If we are trying to minimize f(x), then we calculate the fitness of each x_i by computing the negative of f(x_i).
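This fitness convention can be expressed in a one-line helper (a sketch; the helper name is ours): negate the objective when minimizing, so that higher fitness is always better.

```python
def fitness(f, x, minimize=True):
    """Fitness of individual x under objective f.

    For maximization the fitness is f(x); for minimization it is -f(x).
    Note that negated fitness values can be negative, so they may need to
    be shifted before roulette-wheel selection (see Problem 3.5).
    """
    value = f(x)
    return -value if minimize else value
```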

3. Next, we begin the "while |Children| < |Parents|" loop in Figure 3.6. The first step in that loop is to "use fitnesses to probabilistically select a pair of parents for mating." We perform this step using roulette-wheel selection, as we discussed in Section 3.4.2. We discuss other options for this step in Section 8.7, but for now we simply use roulette-wheel selection.
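Roulette-wheel selection from Section 3.4.2 can be sketched as follows (assuming, as that section does, that all fitness values are positive; the function name is ours):

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Select one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    r = random.uniform(0.0, total)  # spin the wheel
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit
        if r <= cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Selecting a pair of parents for mating is then simply two independent spins of the wheel.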

4. Next, we perform the "Mate the parents" step in Figure 3.6 to create two children. We perform this step using single-point crossover as illustrated in Figure 3.1. The only difference is that we combine continuous-domain individuals rather than binary-domain individuals. We illustrate single-point crossover for continuous-domain individuals in Figure 3.9. We discuss other types of crossover for continuous GAs in Section 8.8.

[Figure 3.9 graphic: two six-element parent vectors; the elements after the randomly chosen crossover point are swapped to produce the two children.]

Figure 3.9 Illustration of crossover in a continuous-domain GA. The crossover point is randomly chosen. The two parents produce two children.
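The continuous single-point crossover of Figure 3.9 can be sketched as follows (the function name is ours):

```python
import random

def single_point_crossover(parent1, parent2):
    """Swap the tails of two parents at a random crossover point.

    The point is chosen so that each child receives at least one element
    from each parent, as in Figure 3.9.
    """
    n = len(parent1)
    c = random.randint(1, n - 1)  # crossover point
    child1 = parent1[:c] + parent2[c:]
    child2 = parent2[:c] + parent1[c:]
    return child1, child2
```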

5. Next, we perform the "Randomly mutate" step in Figure 3.6. In binary EAs, mutation is a straightforward operation, as shown in Equation (3.6). In a continuous GA, we mutate x_i(k) by assigning it a random number that is generated from a uniform distribution on the search domain:

x_i(k) ← x_i(k)                        if r > p
x_i(k) ← U[ x_min(k), x_max(k) ]       if r < p   (3.17)

for i ∈ [1, N] and k ∈ [1, n], where p is the mutation rate and r is a realization of a U[0, 1] random variable that is generated independently for each i and k. We discuss other possibilities for mutation in continuous-domain GAs in Section 8.9.
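Equation (3.17) translates directly into code (a sketch; the names are ours):

```python
import random

def mutate(individual, x_min, x_max, p):
    """Uniform mutation per Equation (3.17).

    Each element is independently replaced, with probability p, by a new
    sample drawn uniformly from its search interval.
    """
    return [random.uniform(x_min[k], x_max[k]) if random.random() < p
            else individual[k]
            for k in range(len(individual))]
```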

Mutation in Continuous GAs

Note that a given mutation rate has a different effect in a binary GA than in a continuous GA. If we have a continuous-domain problem with n dimensions and a mutation rate of p_c, then each solution feature of each child has a probability of p_c of being mutated. For example, in Figure 3.9, each of the six components of both children has a mutation probability of p_c. Also, mutation in a continuous GA results in the solution feature being taken from a uniform distribution between its minimum and maximum possible values, as shown in Equation (3.17).⁶

However, in a binary GA, we discretize each dimension of each individual. If we discretize a continuous dimension into m bits and use a mutation rate of p_b, then each bit has a probability of p_b of being mutated. That means that each bit has a probability of 1 − p_b of not being mutated. Therefore, the probability of each dimension not being mutated is equal to the probability that all m of its bits are not mutated, which is equal to (1 − p_b)^m. Therefore, the probability of a given dimension being mutated is 1 − (1 − p_b)^m. Furthermore, if mutation does occur, then the mutated dimension is not uniformly distributed between its minimum and maximum values; its distribution instead depends on which bit is mutated.

We can obtain the mutation rate p_c for a continuous-domain problem whose effect is approximately equal to that of the mutation rate p_b for a discrete problem.

As we discussed above, if a binary GA for a discrete problem with m bits per dimension has a mutation rate of p_b, then the probability that any given dimension is not mutated is equal to (1 − p_b)^m. This can be approximated with a first-order Taylor series:

Pr(no mutation in a binary GA) = (1 − p_b)^m
                               ≈ 1 − m p_b,   (3.18)

where the approximation is valid for small p_b. If a GA for a continuous problem has a mutation rate of p_c, then the probability that any given dimension is not mutated is equal to 1 − p_c. Equating this probability with Equation (3.18) gives

1 − p_c = 1 − m p_b
    p_c = m p_b.   (3.19)

Therefore, the mutation process in a binary GA with m bits per dimension and a mutation rate of p_b is approximately equivalent to the mutation process in a continuous GA with a mutation rate of m p_b. We stress the word approximately in the previous sentence because it is not clear that equivalent mutation rates in binary and continuous GAs give equivalent results. This is because the distribution of the magnitude of a binary GA mutation is different than that of a continuous GA mutation. An interesting topic for further work would be a thorough study of the equivalence of binary and continuous GA mutations.
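As a quick numerical check of the approximation, take the m = 4 bits per dimension and p_b = 2% that appear in Example 3.4:

```python
m, p_b = 4, 0.02
exact = 1 - (1 - p_b) ** m   # exact probability that a dimension is mutated in the binary GA
approx = m * p_b             # first-order approximation, Equation (3.19)
# exact is about 0.0776, while the approximation gives 0.08
```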

■ EXAMPLE 3.4

Consider the minimization problem of Example 3.3:

min_{x,y} f(x, y), where   (3.20)

f(x, y) = 20 + e − 20 exp( −0.2 √((x² + y²)/2) ) − exp( (cos 2πx + cos 2πy)/2 ).

Suppose that x and y can both range from −1 to +1. In Example 3.3, we discretized the search domain so that we could apply a binary GA. However,

⁶Uniform mutation is probably the most classic type of mutation in continuous GAs. However, we can also choose from many other types of mutation, as described in Section 8.9.

since the problem is defined on a continuous domain, it is more natural to use a continuous GA. In this example we run both the binary GA and the continuous GA for 20 generations with a population size of 10. For the binary GA, we use four bits per dimension and a mutation rate of 2% per bit, as in Example 3.3. To keep the effect of mutation approximately the same for the continuous GA as for the binary GA, we use a mutation rate of 8% in the continuous GA. We also use an elitism factor of 1, which means that we keep the best individual in the population from one generation to the next (see Section 8.4).

Figure 3.10 shows the best individual found at each generation, averaged over 50 simulations. We see that the continuous GA is significantly better than the binary GA. For continuous-domain problems, we generally (but not always) get better performance with a binary GA as we use more bits, and we get the best performance if we use a continuous GA.
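To make the example concrete, here is one possible minimal continuous GA for this problem, assembled from the steps above. This is a sketch under our own naming; the positive-fitness shift for roulette-wheel selection is one reasonable reading, not necessarily the authors' exact implementation.

```python
import math
import random

def ackley(x, y):
    """Two-dimensional Ackley function of Equation (3.20)."""
    return (20 + math.e
            - 20 * math.exp(-0.2 * math.sqrt((x * x + y * y) / 2))
            - math.exp((math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y)) / 2))

def continuous_ga(pop_size=10, generations=20, p_mut=0.08, lo=-1.0, hi=1.0):
    """Minimal continuous GA: roulette-wheel selection, single-point
    crossover, uniform mutation, and an elitism factor of 1."""
    pop = [[random.uniform(lo, hi) for _ in range(2)] for _ in range(pop_size)]
    best = min(pop, key=lambda ind: ackley(*ind))
    for _ in range(generations):
        costs = [ackley(*ind) for ind in pop]
        # Shift costs into positive fitness values for roulette-wheel selection.
        worst = max(costs)
        fits = [worst - c + 1e-6 for c in costs]
        total = sum(fits)

        def select():
            r = random.uniform(0.0, total)
            cum = 0.0
            for ind, f in zip(pop, fits):
                cum += f
                if r <= cum:
                    return ind
            return pop[-1]

        children = [best[:]]  # elitism: carry the best individual forward
        while len(children) < pop_size:
            p1, p2 = select(), select()
            c = 1  # the only interior crossover point in two dimensions
            for kid in (p1[:c] + p2[c:], p2[:c] + p1[c:]):
                # Uniform mutation, Equation (3.17)
                children.append([random.uniform(lo, hi) if random.random() < p_mut
                                 else g for g in kid])
        pop = children[:pop_size]
        best = min(pop + [best], key=lambda ind: ackley(*ind))
    return best, ackley(*best)
```

With elitism, the best cost is non-increasing from one generation to the next, which is what produces the monotone curves in Figure 3.10.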

[Figure 3.10 graphic: cost of the best individual (vertical axis, roughly 0 to 2.5) versus generation number (horizontal axis, 0 to 20) for the binary GA and the continuous GA.]

Figure 3.10 Example 3.4: Binary GA vs. continuous GA performance for the two-dimensional Ackley function. The plot shows the cost of the best individual at each generation, averaged over 50 simulations.

It is interesting to note that continuous GAs have a somewhat controversial history. Since GAs were originally developed for binary representations, and since all of the early GA theory was geared towards binary GAs, researchers were skeptical about the rise of continuous GAs in the 1980s [Goldberg, 1991]. However, it is difficult to argue with the success of continuous GAs, their ease of use, and their relatively recent theoretical support.



3.6 CONCLUSION

The genetic algorithm was one of the first evolutionary algorithms, and today it is probably the most popular. Recent years have seen the introduction of many competing EAs, but GAs remain popular because of their familiarity, their ease of implementation, their intuitive appeal, and their good performance on a variety of problems.

Many books and survey papers have been written about GAs over the years.

David Goldberg's book [Goldberg, 1989a] was one of the first books about GAs, but unlike early books in many other subjects, it has aged well and is still popular because of its clear exposition. There are many other good books about GAs, including [Mitchell, 1998], [Michalewicz, 1996], [Haupt and Haupt, 2004], and [Reeves and Rowe, 2003], which is notable because of its strong emphasis on theory. Some popular tutorial papers include [Back and Schwefel, 1993], [Whitley, 1994], and [Whitley, 2001].

In view of the huge number of books and papers about GAs, this chapter is a necessarily brief introduction to the topic. We have neglected many GA-related issues in this chapter - not because we believe that they are unimportant, but simply because our perspective is limited. Some of these issues include messy GAs, which have variable-length chromosomes [Goldberg, 1989b], [Mitchell, 1998]; gender-based GAs, which simulate multiple genders in the GA population and are often used for multi-objective optimization (Chapter 20) [Lis and Eiben, 1997]; island GAs, which include subpopulations [Whitley et al., 1998]; cellular GAs, which impose a specific spatial relationship among the individuals in the population [Whitley, 1994]; and covariance matrix adaptation, which is a local search strategy that can augment any EA [Hansen et al., 2003].

There are also many variations on the basic GA that we presented in this chapter. Some of those variations are extremely important, and can make the difference between success and failure in a GA application. Chapter 8 discusses many variations that apply to GAs and other EAs.

PROBLEMS

Written Exercises

3.1 Section 3.4.1 gave a simple example for how we could represent robot design parameters in a GA. Suppose that we have a GA individual given by the bit string 110010.

a) What is the chromosome for this GA individual?

b) What are the genotypes and phenotypes for this individual?

3.2 We want to use a binary GA to find x to a resolution of 0.1 to minimize the two-dimensional Rastrigin function (see Section C.1.11) on the domain [−5, 5].

a) How many genes do we need for each chromosome?

b) How many bits do we need in each gene?

c) Given your answer to part (b), what is the resolution of each element of x?

3.3 We have a GA with 10 individuals {x_i}, and the fitness of x_i is f(x_i) = i for i ∈ [1, 10]. We use roulette wheel selection to select 10 parents for crossover. The first two parents mate to create two children, and the next two mate to create two more children, and so forth.

a) What is the probability that the most fit individual will mate with itself at least once to create two cloned children?

b) Repeat part (a) for the least fit individual.

3.4 We have a GA with 10 individuals {x_i}, and the fitness of x_i is f(x_i) = i for i ∈ [1, 10]. We use roulette wheel selection to select 10 parents for crossover.

a) What is the probability that x_10 is not selected at all after 10 spins of the roulette wheel?

b) What is the probability that x_10 is selected exactly once after 10 spins of the roulette wheel?

c) What is the probability that x_10 is selected more than once after 10 spins of the roulette wheel?

3.5 Roulette wheel selection assumes that the fitness values of the population satisfy f(x_i) > 0 for i ∈ [1, N]. Suppose you have a population with fitness values {−10, −5, 0, 2, 3}. How would you propose modifying those fitness values so that you could use roulette wheel selection?

3.6 Roulette wheel selection assumes that the population is characterized by fitness values {f(x_i)}, where higher fitness values are better than lower fitness values.

Suppose we have a problem whose population is characterized by cost values {c(x_i)}, where lower cost values are better than higher cost values, and c(x_i) > 0 for all i.

How could you modify the cost values to use roulette wheel selection?

3.7 We have two parents in a binary GA, each with n bits. The i-th bit in parent 1 is different than the i-th bit in parent 2 for i ∈ [1, n]. We randomly select a crossover point c ∈ [1, n]. What is the probability that the children are clones of (that is, identical to) the parents?

3.8 Suppose we have N randomly initialized individuals in a binary GA, where each individual is comprised of n bits.

a) What is the probability that the i-th bit of each individual is the same for a given i?

b) What is the probability that the i-th bit of each individual is not the same for all i ∈ [1, n]?

c) Recall that exp(−am) ≈ 1 − am for small am, and (1 − a)^m ≈ 1 − am for small values of a. Use these facts to approximate your answer to part (b) as an exponential.

d) Use your answer to part (c) to find the population size N that is required to obtain a probability p that both alleles occur at each bit position of a randomly initialized population.

e) Suppose we want to randomly initialize a population of individuals, each with 100 bits, so that there is a 99.9% or greater chance that both alleles occur at each bit position. Use your answer to part (d) to obtain the minimum population size.

3.9 We have a binary GA with a population size of N and a mutation rate of p, and each individual is comprised of n bits.

a) What is the probability that we will not mutate any bits in the entire population for one generation?

b) Use your answer to part (a) to find the minimum mutation rate p for a given population size N and bit length n such that the probability of no mutations during each generation is no greater than P_none.

c) Use your answer to part (b) to find the minimum mutation rate p such that the probability of not mutating any bits is 0.01% when N = 100 and n = 100.

Computer Exercises

3.10 Write a computer simulation to confirm your answers to Problem 3.3.

3.11 Write a computer simulation to confirm your answer to Problem 3.8.

3.12 The one-max problem is the search for a string of n bits with as many 1's as possible. The fitness of a bit string is the number of 1's. Of course, we can easily solve this by simply writing n consecutive 1's, but in this problem we are interested in seeing if a GA can solve the one-max problem. Write a GA to solve the one-max problem. Use n = 30, generation limit = 100, population size = 20, and mutation rate = 1%.

a) Plot the fitness of the best individual, and the average fitness of the population, as a function of generation number.

b) Run 50 Monte Carlo simulations of your GA. This will result in 50 plots of the fitness of the best individual as a function of generation number. Plot the average of those 50 plots. Denote the average of the 50 best fitness values at the 100th generation as f(x*). What is f(x*)?

c) Repeat part (b) with a population size of 40. How does f(x*) change compared to your answer from part (b)? Why?

d) Set the population size back to 20 and change the mutation rate to 5%. How does f(x*) change compared to your answer from part (b)? Why?

e) Set the mutation rate to 0%. How does f(x*) change compared to your answer from part (b)? Why?

f) Instead of setting the fitness equal to the number of 1's, set the fitness equal to the number of 1's plus 50. Now repeat part (b). How does f(x*) change compared to your answer from part (b)? Why?

g) As in part (b), set fitness equal to the number of 1's; but then for all individuals with fitness less than average, set fitness to 0. How does f(x*) change compared to your answer from part (b)? Why?

3.13 Write a continuous GA to minimize the sphere function (see Section C.1.1). Set the search domain in each dimension to [−5, +5], the problem dimension to 20, the generation limit to 100, the population size to 20, and the mutation rate to 1%.

For roulette wheel selection, we need to map the cost values c(x_i) to fitness values f(x_i). Do this as follows: f(x_i) = 1/c(x_i).

a) Plot the cost of the best individual, and the average cost of the population, as a function of generation number.

b) Run 50 Monte Carlo simulations of your GA. This will result in 50 plots of the cost of the best individual as a function of generation number. Plot the average of those 50 plots. Denote the average of the 50 best cost values at the 100th generation as c(x*). What is c(x*)?

c) Repeat part (b) with a mutation rate of 2%. How does c(x*) change compared to your answer from part (b)? Repeat with a mutation rate of 5%.
