
In the document EVOLUTIONARY OPTIMIZATION ALGORITHMS (Pages 192-197)

7.3 GENETIC PROGRAMMING FOR MINIMUM TIME CONTROL

In this section, which is motivated by [Koza, 1992, Section 7.1], we demonstrate the use of GP for minimum time control of a second-order Newtonian system. A second-order Newtonian system is a simple position-velocity-acceleration system that satisfies the equations

ẋ = v

v̇ = u    (7.5)

where x is position, v is velocity, and u is the commanded acceleration. That is, the derivative of position is velocity, and the derivative of velocity is acceleration.
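A quick way to get a feel for these dynamics is to simulate them. The following sketch (our own code, not from the text) integrates Equation (7.5) with a forward Euler step; the 0.02-second step size matches the value used later in this section.

```python
def step(x, v, u, tau=0.02):
    """Advance position x and velocity v by one time step of length tau
    under commanded acceleration u, using forward Euler integration."""
    v_next = v + tau * u
    x_next = x + tau * v
    return x_next, v_next

# Example: accelerate from rest at u = +1 for 1 second (50 steps of 0.02 s).
x, v = 0.0, 0.0
for _ in range(50):
    x, v = step(x, v, u=1.0)
# v is now approximately 1.0, and x approximately 0.49
# (Euler slightly undershoots the exact continuous-time value of 0.5).
```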

We consider motion only in one dimension. The problem is to find the acceleration profile u(t) to drive the system from some initial position x(0) and velocity v(0), to x(tf) = 0 and v(tf) = 0, in the minimum time tf. Intuition tells us approximately how to accomplish this: we accelerate as fast as we can in one direction until we reach a certain position, and then we accelerate as fast as we can in the opposite direction until we reach x(tf) = v(tf) = 0.³

We assume for the sake of simplicity, and without loss of generality, that the maximum acceleration magnitude is 1, and that we can accelerate in either direction. The minimum time control problem is illustrated at the top of Figure 7.9. We accelerate in the positive direction (toward the right) until reaching the strategic point labeled "Switch." Then we accelerate in the negative direction (toward the left) until reaching the goal. Note that the vehicle's velocity is toward the right for the entire time period. If the timing is right, we will reach the goal with zero velocity.


Figure 7.9 Illustration of minimum time control. In the top figure, we accelerate to the right to the switching point, then accelerate to the left, and reach the goal with zero velocity. In the bottom figure, our initial velocity is so high that we will inevitably overshoot the goal. In this case we accelerate to the left, overshoot the goal, switch the acceleration to the right at the switching point, and reach the goal with zero velocity.

It may be that we have such a high initial velocity that we cannot stop before the goal. In this case we will inevitably overshoot the goal, and so we must return

³In fact, this is what we observe in teenage male drivers at every stoplight: accelerate as fast as possible when the light turns green, and then at a carefully chosen point before the next stoplight, slam on the brakes. If the teenager's timing is right, the car will stop precisely at the next red light, and the travel time between stoplights will be minimized.


to the goal, reaching it with zero velocity. This situation is a little less intuitive, and is shown at the bottom of Figure 7.9. The minimum time solution is to first accelerate as much as possible in the negative direction (toward the left). The vehicle overshoots the goal. Eventually the vehicle will reach zero velocity, at which point the vehicle begins moving toward the left. We continue accelerating in the negative direction until reaching the strategic point labeled "Switch," at which time we begin accelerating in the positive direction (toward the right) until returning to the goal. Again, if the timing is right, we will reach the goal with zero velocity.

The minimum-time control problem is a classic optimal control problem with many aerospace applications, and is studied in detail in many optimal control books [Kirk, 2004]. The solution is called bang-bang control, because for any initial condition x(0) and v(0), the solution consists of one time period of maximum acceleration in one direction, followed by a time period of maximum acceleration in the other direction. The minimum time control problem can be represented in graphical form with a phase plane diagram as shown in Figure 7.10. We assume for simplicity that the vehicle mass is 2. In this case, the curve drawn in Figure 7.10, which is called the switching curve, is given by

x = -v|v|/2.    (7.6)

The goal is to reach the origin x = 0 and v = 0 in minimum time from any initial point in the phase plane. If the position and velocity are above the switching curve, then we should apply maximum acceleration in the negative direction. If the position and velocity are below the switching curve, then we should apply maximum acceleration in the positive direction. This will take us on a trajectory that reaches the switching curve, at which point we will reverse the direction of the acceleration.

Then we will follow the switching curve to the origin of the phase plane.
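The rule just described fits in a few lines of code. The following is our own sketch (not from the text) of the bang-bang law, with the tie exactly on the curve broken arbitrarily:

```python
def bang_bang(x, v):
    """Bang-bang control for the double integrator: accelerate at -1 for
    states above the switching curve x = -v|v|/2 of Eq. (7.6), +1 below it."""
    x_switch = -v * abs(v) / 2.0  # switching curve evaluated at this velocity
    return -1.0 if x > x_switch else 1.0
```

For example, `bang_bang(-0.5, 1.5)` returns -1: this is the scenario at the bottom of Figure 7.9, where the vehicle must first brake hard even though it will overshoot the goal.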


Figure 7.10 Switching curve for minimum time control. If the position and velocity lie above the switching curve, then the acceleration should be maximum in the negative direction. If the position and velocity lie below the curve, then the acceleration should be maximum in the positive direction.

Figure 7.11 illustrates the optimal trajectory for the initial condition x(0) = -0.5 and v(0) = 1.5. This corresponds to the bottom picture in Figure 7.9. The vehicle is moving too fast to stop before reaching x = 0. Therefore, we apply the maximum acceleration in the negative direction until reaching the switching curve; note that the vehicle passes through v = 0 during the time of maximum negative acceleration.

When the vehicle reaches the switching curve, we apply the maximum acceleration in the forward direction. The trajectory reaches the origin of the phase plane (x = 0 and v = 0) in the minimum possible time.


Figure 7.11 Minimum time trajectory for initial condition x(0) = -0.5 and v(0) = 1.5. The acceleration is -1 above the switching curve, and +1 after the trajectory reaches the switching curve.

Now we use GP to try to evolve a minimum time control program for this problem. We define two special Lisp functions for this problem. The first is the protected division operator shown in Equation (7.4), and the second is the greater-than operator:

(defun GT (x y)
  (if (> x y) (return-from GT 1) (return-from GT -1)))    (7.7)

The GT function returns 1 if x > y, and returns -1 otherwise.
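For readers who prefer to experiment outside Lisp, the two primitives translate directly. This sketch (our code) assumes the usual protected-division convention for Equation (7.4), namely that a zero denominator returns 1:

```python
def DIV(x, y):
    """Protected division: ordinary division, except that a zero
    denominator returns 1 (the convention assumed for Eq. (7.4))."""
    return x / y if y != 0 else 1.0

def GT(x, y):
    """The greater-than operator of Eq. (7.7): +1 if x > y, else -1."""
    return 1.0 if x > y else -1.0
```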

To evaluate the cost of a program, we take 20 random initial points in the (x, v) phase plane, with |x| < 0.75 and |v| < 0.75, and see if the program can bring each of the (x, v) pairs to the origin within 10 seconds. If the program is successful for an initial condition, then the cost contribution of that simulation is the time required to bring (x, v) to the origin. If the program is not successful within 10 seconds, then the cost contribution of that simulation is 10. The total cost of a computer program is the average of all 20 cost contributions. Table 7.3 summarizes the GP parameters for this problem, which are mainly based on [Koza, 1992, Section 7.1].
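The cost evaluation just described can be sketched as follows. The code is our own; the 0.01 success tolerance and the 0.02-second step size are taken from the discussion later in this section, and the integration follows Equations (7.9)-(7.10).

```python
import random

def simulate_cost(controller, tau=0.02, t_max=10.0, tol=0.01,
                  n_trials=20, seed=0):
    """Average time for controller(x, v) to bring n_trials random starts in
    [-0.75, 0.75] x [-0.75, 0.75] to the origin, capped at t_max seconds."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x = rng.uniform(-0.75, 0.75)
        v = rng.uniform(-0.75, 0.75)
        t = 0.0
        while t < t_max:
            if abs(x) < tol and abs(v) < tol:
                break  # close enough to the origin: success
            u = max(-1.0, min(1.0, controller(x, v)))  # saturate acceleration
            v_next = v + tau * u                 # rectangular rule, Eq. (7.9)
            x = x + tau * (v + v_next) / 2.0     # trapezoidal rule, Eq. (7.10)
            v = v_next
            t += tau
        total += min(t, t_max)
    return total / n_trials
```

A do-nothing controller scores near the cap of 10, while a bang-bang controller based on the switching curve scores far lower.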


GP Option                   Setting
Objective                   Find the minimum time vehicle control program
Terminal set                x (position), v (velocity), -1
Function set                +, -, *, DIV, GT, ABS
Cost                        Time to bring the vehicle to the phase plane origin,
                            averaged over 20 random initial conditions
Generation limit            50
Population size             500
Maximum initial tree depth  6
Initialization method       Ramped half-and-half
Maximum tree depth          17
Probability of crossover    0.9
Probability of mutation     0
Number of elites            2
Selection method            Tournament (see Section 8.7.6)

Table 7.3 GP parameters for the minimum time vehicle control problem.

Figure 7.12 shows the cost of the best GP solution as a function of generation number. The best computer program is found after less than 10 generations for this particular run, but the average cost of the entire population continues to decrease during the entire 50 generations. For most GP problems, it takes much longer than 10 generations to find the best solution. The reason this particular run was quicker than the average GP run might be that the problem is relatively easy, or it might simply be a statistical fluke. The best solution obtained by the GP is

u = (* (GT (- (DIV x v) (- -1 v)) (GT (+ v x) (DIV x v)))
       (DIV (GT (+ x v) (+ v x)) (GT (+ v x) x)))    (7.8)

The switching curve for this control is plotted in Figure 7.13, along with the theoretically optimal switching curve. For v < 0 the two curves are very similar. For v > 0 there is more of a difference between the curves, but the general shape is still similar.
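Transcribing Equation (7.8) into Python (our transcription, reusing the protected-division and greater-than conventions defined earlier in this section) makes the evolved program easier to inspect. Notice that GT(x+v, v+x) compares two equal values, so it always returns -1; as a result the whole expression only ever outputs +1 or -1, i.e., the evolved program is itself a bang-bang controller.

```python
def evolved_u(x, v):
    """Transcription of the evolved Lisp program of Eq. (7.8)."""
    def DIV(a, b):  # protected division: zero denominator returns 1
        return a / b if b != 0 else 1.0
    def GT(a, b):   # +1 if a > b, else -1
        return 1.0 if a > b else -1.0
    return (GT(DIV(x, v) - (-1 - v), GT(v + x, DIV(x, v)))
            * DIV(GT(x + v, v + x), GT(v + x, x)))
```

For the initial condition of Figure 7.11, `evolved_u(-0.5, 1.5)` returns -1, matching the optimal controller's first move.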

The time that it takes the vehicle to reach the origin of the phase plane, averaged over 10,000 random initial conditions in the state space x ∈ [-0.75, +0.75] and v ∈ [-0.75, +0.75], is about 1.53 seconds for the optimal switching curve and 1.50 seconds for the GP switching curve. Interestingly, the GP switching curve actually performs slightly better than the optimal switching curve! This is not possible theoretically, but practice and theory do not always match.⁴ In practice, there are implementation issues that make it possible to perform better than the theoretically optimal strategy. For example, we terminated our simulation when |x| < 0.01 and |v| < 0.01, and considered such small values a complete success. In theory, we can reach the origin with exactly zero error, but in practice, we cannot. Also, we used a step size of τ = 0.02 seconds to simulate the dynamic system. Rather than

⁴In theory, practice and theory should match. In practice, they do not.


Figure 7.12 GP performance for the minimum time control problem. The best solution is found after less than 10 generations for this particular run.


Figure 7.13 The best switching curve obtained by the GP for the minimum time control problem, along with the theoretically optimal switching curve.

computing the exact continuous-time solution to v̇ = u and ẋ = v, we instead approximated the solution as

v_{k+1} = v_k + τu_k    (7.9)

x_{k+1} = x_k + τ(v_k + v_{k+1})/2    (7.10)


where k is the time index, which ranges from 0 to 500 (that is, 0 to 10 seconds).

That is, we used rectangular integration for the solution of velocity, and trapezoidal integration for the solution of position [Simon, 2006, Chapter 1]. These differences between theory and practice may result in more control chattering along the optimal switching curve than is present along the GP-generated switching curve.⁵ The reader can replicate the results in this section by following the steps described in Problem 7.13.
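One step of this mixed integration scheme looks as follows in code (our own sketch of Equations (7.9)-(7.10)):

```python
def integrate_step(x_k, v_k, u_k, tau=0.02):
    """One step of Eqs. (7.9)-(7.10): rectangular (forward Euler)
    integration for velocity, trapezoidal integration for position."""
    v_next = v_k + tau * u_k
    x_next = x_k + tau * (v_k + v_next) / 2.0
    return x_next, v_next
```

For constant acceleration the trapezoidal position update is exact: 500 steps at u = 1 give v = 10 and x = 50 (up to floating-point roundoff), matching the continuous-time values v = t and x = t²/2 at t = 10 seconds.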

Theory versus Practice

The superiority of the GP switching curve over the theoretically optimal solution raises an important point regarding the difference between theory and practice. Engineering solutions are often generated on the basis of theory, but as any practicing engineer knows, theoretical results need to be modified to take real-world considerations into account. This example shows that a GP may be able to take real-world considerations into account to find a solution that is better than the theoretically optimal solution to a problem.

It may be easier to learn optimal control theory and solve the minimum time control problem in a more traditional way, rather than learning how to use a GP.

But it may not. This example shows us that GP might be able to find solutions to problems that we lack the expertise to solve on our own. It further shows the possibility of finding "better-than-optimal" solutions when practical considerations are taken into account.
