Simulation Modeling

This chapter reviews the main idea behind Monte Carlo (MC) simulation, and discusses important issues in its application to business problems, such as the number of scenarios to generate, interpretation of output, and efficient ways to simulate random numbers. Depending on which software you decide to use for simulation, you may read Appendix B for an introduction to @RISK or Appendix C for an introduction to MATLAB, both found on the companion web site.

4.1 MONTE CARLO SIMULATION: A SIMPLE EXAMPLE

As explained in Chapter 3, the analysis of risk is based on modeling uncertainty, and uncertainty can be represented mathematically by probability distributions. These probability distributions are the building blocks for simulation models. Namely, simulation models take probability distribution assumptions on the uncertainties as inputs, and generate scenarios (often referred to as trials) that happen with probabilities described by the probability distributions. They then record what happens to variables of interest (called output variables) over these scenarios, and let us analyze the characteristics of the output probability distributions (see Exhibit 4.1).

Let us start with a simple example. Suppose you want to invest $1,000 in the U.S. stock market for one year. To do so, you decide that you want to invest in a stock index that represents the performance of the stock market. Specifically, you invest in a mutual fund whose investment objective is to reproduce the return performance on the S&P 500. A mutual fund with such an objective is referred to as an index fund. We will denote the initial investment, or capital, invested in the index fund as C0 (i.e., C0 = $1,000).

EXHIBIT 4.1 Typical Monte Carlo simulation system.

How much money do you expect to have at the end of the year? Let us label the amount of capital at the end of the year by ˜C1.¹ Note that ˜C1 will be a random variable because it will depend on how the market (i.e., S&P 500) performs over the year. In fact, if we let ˜r0,1 denote the market return over the time period [0,1), then ˜C1 will equal

\[
C_0 + \tilde{r}_{0,1} C_0
\]

or, equivalently,

\[
(1 + \tilde{r}_{0,1})\, C_0
\]

As explained in Chapter 2, the return rt,t+1 over a time period [t, t+1) can be computed as

\[
r_{t,t+1} = \frac{P_{t+1} - P_t + D_t}{P_t}
\]

where Pt and Pt+1 are the values of the S&P 500 at times t and t+1, respectively, and Dt is the amount of dividends paid over that time period. In this case, we can think of Pt and Pt+1 as the S&P 500 index levels at the beginning (t=0) and at the end of the year (t=1), respectively, and assume that Dt is 0.

To estimate the end-of-year capital you would have, you can guess the return on the market, and compute the resulting value for ˜C1. However, this would give you only a point estimate of the possible values for your investment. A more sophisticated approach is to generate scenarios for the market return over the year, and compute ˜C1 in each of these scenarios. In other words, you can represent future returns by a probability distribution,² generate scenarios that are representative of this probability distribution, and then analyze the resulting distribution of your end-of-year capital. The resulting probability distribution of ˜C1 will be a set of scenarios itself. You can create a histogram of the outcomes, that is, collect the outcomes of the scenarios into nonoverlapping bins and draw bars above all bins with heights corresponding to the percentage of times outcomes in each bin were obtained in the simulation. This will allow you to visualize the approximate probability distribution of ˜C1, and analyze it with the statistical measures described in section 3.6 (central tendency, skew, variability, etc.) in the previous chapter. The distribution for ˜C1 from the simulation will be only an approximation because it will depend both on the number of scenarios and on the set of scenarios you generated for ˜r0,1. Intuitively, if you generate 1,000 scenarios that cover the possible values for ˜r0,1 well, you would expect to obtain a better representation of the distribution of ˜C1 than if you generated only two scenarios.

4.1.1 Selecting Probability Distributions for the Inputs

The first question you need to ask yourself when creating the simulation model about the future values of your funds is what distribution is appropriate for modeling the future market returns. One possible starting point is to look at a historical distribution of past returns, and assume that the future will behave in the same way. When creating scenarios for future realizations, then, you can draw randomly from historical scenarios. This is a very simple approach, which is based on the bootstrapping technique described in section 3.11.3.
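As a minimal sketch of this resampling idea, the Python snippet below draws scenarios with replacement from a short list of past annual returns. The values in `historical_returns` are hypothetical placeholders rather than actual S&P 500 data, and the function name `bootstrap_scenarios` is introduced here only for illustration.

```python
import random

# Hypothetical annual returns; placeholders, not actual S&P 500 data.
historical_returns = [0.12, -0.05, 0.21, 0.08, -0.15, 0.30, 0.02, 0.10]


def bootstrap_scenarios(history, num_scenarios, seed=0):
    """Draw scenarios with replacement from the observed history."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [rng.choice(history) for _ in range(num_scenarios)]


initial_capital = 1000.0
scenarios = bootstrap_scenarios(historical_returns, num_scenarios=1000)

# End-of-year capital in each scenario: (1 + r) * C0
end_capital = [(1.0 + r) * initial_capital for r in scenarios]
print(sum(end_capital) / len(end_capital))
```

Because every scenario is one of the observed values, this approach can never produce a return outside the historical range, which is one reason to consider the alternatives that follow.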

Another possibility is to assume a particular probability distribution for future returns, and use historical data to estimate the parameters of this distribution, that is, the parameters that determine the specific shape of the distribution, such as the expected value (µ) and standard deviation (σ) for a normal distribution (see section 3.4), λ for a Poisson distribution (see section 3.7.1), or α and β for a beta distribution (see section 3.7.2). For example, if you assume a normal distribution for returns, then you can use the historical variability of returns as a measure of the standard deviation σ of this normal distribution, and the historical average (mean) as the expected return µ of the normal distribution.

A third approach is not to start out with a particular distribution, but to use historical data to find a distribution for returns that provides the best fit to the data. As we mentioned in sections 3.7.2 and 3.11.4, the chi-square hypothesis test is one possible fit test. Other goodness-of-fit tests include the Kolmogorov-Smirnov (K-S) test, the Anderson-Darling (A-D) test, and root-mean-squared-error (RMSE).³ Most simulation software packages, including MATLAB and @RISK, have commands that can test the goodness of fit for different probability distributions.⁴
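The second and third approaches can be sketched together in Python using only the standard library: estimate µ and σ from a sample, then compute the K-S statistic, which is the largest gap between the empirical and the fitted cumulative distribution functions. The data here are synthetic and generated only for illustration, and `ks_statistic` is a hypothetical helper, not the routine that MATLAB or @RISK uses. The sketch computes the test statistic only, not a p-value.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, dist):
    """Largest gap between the empirical CDF of sample and dist.cdf."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        gap = max(gap, i / n - f, f - (i - 1) / n)
    return gap


# Synthetic "historical" returns (an assumption, for illustration only).
rng = random.Random(42)
returns = [rng.gauss(0.0879, 0.1465) for _ in range(500)]

mu_hat = mean(returns)      # estimate of the expected return
sigma_hat = stdev(returns)  # sample standard deviation

fitted = NormalDist(mu_hat, sigma_hat)
mismatched = NormalDist(mu_hat + 0.20, sigma_hat)  # deliberately shifted

ks_fitted = ks_statistic(returns, fitted)
ks_mismatched = ks_statistic(returns, mismatched)
print(ks_fitted, ks_mismatched)
```

A smaller statistic indicates a closer fit, so the fitted distribution should score well below the deliberately shifted one.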

Yet a fourth way is to ignore the past and look forward, constructing a probability distribution based on your subjective guess about how the uncertain variable in your model will behave. For example, using the beta distribution from Exhibit 3.18(A) to model the future market return will express a more pessimistic view about the market than using the beta distribution in Exhibit 3.18(B) or a normal distribution because most of the probability mass in the distribution in 3.18(A) is to the left, so low values for return will happen more often when scenarios are generated.

It is important to realize that none of these approaches will provide the answer. Simulation is a great tool for modeling uncertainty, but the outcome is only as good as the inputs we provide to our models. We discuss ways for defining input distributions in specific applications in the book. The art of simulation modeling is in providing good inputs and interpreting the results carefully.

4.1.2 Interpreting Monte Carlo Simulation Output

For purposes of our example, let us assume that the return on the market over the next year will follow a normal distribution. (This is a widely used assumption in practice, despite the fact that few empirical studies find evidence to support it.) Between 1977 and 2007, the S&P 500 returned 8.79% per annum on average, with a standard deviation of 14.65%. We will use these numbers as approximations for the average return and the standard deviation of the return on your investment in the stock market over the next year. Relying on historical data is flawed, but is a reasonable starting point.

This chapter’s Software Hints section goes step by step through the actual implementation of our simple example with only Microsoft Excel functions, @RISK functions, and MATLAB functions. Here, we discuss the output one would obtain after generating 100 scenarios for the market return over the next year. Note that to generate these scenarios, we simply need to draw 100 numbers from a normal distribution with mean 8.79% and standard deviation 14.65%. The input to the simulation would then be a sequence of 100 numbers such as

0.0245 -0.1561 0.1063 0.1300 -0.0801 0.2624 0.2621 0.0824 0.1358 0.1135 0.0605 and so on.

The output graph would look like Exhibit 4.2. Summary statistics obtained based on the 100 values of the distribution are provided to the right of the graph.⁵

If historical trends hold, you would expect to have $1,087.90 on average at the end of the first year. The standard deviation of the end-of-year capital you would expect is $146.15; that is, roughly speaking, you would expect a typical outcome to be about $146.15 away from the mean value. With 5% probability, you will not be able to make more than $837.00 (the 5th percentile of the distribution), and with 95% probability you will make less than $1,324.00 (the 95th percentile of the distribution). The skewness is close to 0, and the kurtosis is close to 3, which means that the simulated distribution is close to normal. (In fact, the output distribution is normal because the input distribution we provided for the simulation of this simple relationship was normal, but the estimate from the simulation will never be perfectly accurate.)

EXHIBIT 4.2 Histogram and summary statistics for the end-of-year distribution of 100 simulated values for $1,000 invested at the beginning of the year. [Figure: histogram titled "Money at the End of Year 1"; horizontal axis in thousands of dollars, from 0.7 to 1.5; the 5th and 95th percentiles are marked at 0.837 and 1.324, with 90.0% of the simulated values between them.]
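The one-year experiment can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the Excel, @RISK, or MATLAB implementation from the Software Hints; the seed is an arbitrary assumption, so the statistics it prints will be close to, but not the same as, the values quoted from Exhibit 4.2.

```python
import random
from statistics import mean, stdev, quantiles

C0 = 1000.0                 # initial capital
MU, SIGMA = 0.0879, 0.1465  # historical mean and standard deviation
N = 100                     # number of scenarios

rng = random.Random(1)  # arbitrary seed (an assumption), for reproducibility
returns = [rng.gauss(MU, SIGMA) for _ in range(N)]
capital = [(1.0 + r) * C0 for r in returns]

m, s = mean(capital), stdev(capital)

# quantiles(..., n=20) returns 19 cut points; the first and the last
# are the 5th and 95th percentiles.
cuts = quantiles(capital, n=20)
p5, p95 = cuts[0], cuts[-1]

# Sample skewness and (non-excess) kurtosis; about 0 and 3 for a normal.
z = [(c - m) / s for c in capital]
skewness = mean([v ** 3 for v in z])
kurtosis = mean([v ** 4 for v in z])

print(f"mean={m:.2f} sd={s:.2f} p5={p5:.2f} p95={p95:.2f}")
print(f"skewness={skewness:.2f} kurtosis={kurtosis:.2f}")
```

Changing the seed changes every statistic slightly, which is a direct way to see that the simulated summary is only an estimate.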

Be careful with the interpretation of minima and maxima in a simulation. Theoretically, the minimum and maximum we could have obtained in this simulation are negative and positive infinity because the probability distribution for the return (the normal distribution) has an infinite range. We did not obtain a particularly small minimum or a particularly large maximum because we only simulated 100 values. An event in the tail of the distribution with probability of occurring of less than 1/100 would be unlikely to appear in this set of simulated values. The minimum and the maximum are highly sensitive to the number of simulated values and whether the simulated values in the tails of the distribution provide good representation for the tails of the distribution. There are smart ways to simulate scenarios so that the tails are well represented (we will talk about such methods later in this chapter), but the minimum and the maximum values obtained in a simulation should nevertheless be interpreted with care.
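To see how strongly the extremes depend on the number of scenarios, the following Python sketch reports the minimum and maximum end-of-year capital over the first 100, the first 10,000, and all 100,000 values of a single simulated sequence. The seed and the sample sizes are arbitrary assumptions chosen for illustration.

```python
import random

MU, SIGMA, C0 = 0.0879, 0.1465, 1000.0
rng = random.Random(7)  # arbitrary seed (an assumption)

# One long sequence of simulated end-of-year capital values.
draws = [(1.0 + rng.gauss(MU, SIGMA)) * C0 for _ in range(100_000)]

# Extremes over growing prefixes of the same sequence.
extremes = {}
for n in (100, 10_000, 100_000):
    subset = draws[:n]
    extremes[n] = (min(subset), max(subset))
    print(n, round(extremes[n][0], 2), round(extremes[n][1], 2))
```

Because each larger set contains the smaller one, the reported range can only widen as scenarios are added, while statistics such as the mean settle down.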

In section 3.11.2 in the previous chapter, we explained the statistical concept of confidence interval (CI) estimates. The main idea was the following: In statistics, when we want to estimate a specific parameter of a distribution, such as the mean, we take a sample and observe what the value of the parameter is in the sample (in technical terms, we record the value of the sample statistic for the mean). Instead of reporting a single value for our estimate for the mean, however, we could report an interval whose length is related to the probability that the true distribution parameter indeed lies in that interval.

Simulation is very similar to statistical sampling in that we try to represent the uncertainty by generating scenarios, that is, “sampling” values for the output parameter of interest from an underlying probability distribution. When we estimate the average (or any other parameter of interest) of the sample of scenarios, we run into the same issue statisticians do: we need to worry about the accuracy of our estimate. To compute a 95% CI estimate for the average end-of-year capital, we use the 95% CI formula from section 3.11.2, and substitute the values obtained from the simulation statistics: N = 100, X̄ = 1,087.90, and s = 146.15. The value for t(100−α/2)%,N−1 for a 95% CI is the value of the 97.5th percentile of the standard t-distribution with 99 degrees of freedom, which is 1.98. The 95% CI is therefore

\[
\left( \bar{X} - t_{(100-\alpha/2)\%,\,N-1}\,\frac{s}{\sqrt{N}},\ \ \bar{X} + t_{(100-\alpha/2)\%,\,N-1}\,\frac{s}{\sqrt{N}} \right)
= \left( 1{,}087.90 - 1.98 \cdot \frac{146.15}{\sqrt{100}},\ \ 1{,}087.90 + 1.98 \cdot \frac{146.15}{\sqrt{100}} \right)
\approx (1{,}058.90,\ 1{,}116.90)
\]

(The endpoints are computed with the unrounded percentile, which is approximately 1.984.)

Therefore, if the 100 scenarios were independent when generated, we can be 95% confident that the true average end-of-year capital will be between $1,058.90 and $1,116.90. It just happens that because of the simplicity of the example, we know exactly what the true mean is. It is (1+0.0879)·1,000 = $1,087.90 because 8.79% was assumed to be the true mean of the distribution of returns (see section 3.9 for calculating means of functions of random variables), and it is indeed contained inside the 95% CI. In 5% of all possible collections of 100 scenarios, however, we will be unlucky enough to draw a very extreme sample of scenarios, and the true mean will not be contained in the confidence interval we obtain. Note that if we had calculated a 99% confidence interval, then this would happen in only 1% of the cases. If we generated 4N (instead of N) scenarios, then the 95% confidence interval’s length would be half of the current length. (This is because the square root of the number of scenarios is contained in the denominator of the expression that determines the length of the confidence interval.) We will revisit the issue of confidence interval estimation and the implications for accuracy again later in this chapter when we talk about the number of scenarios needed in a simulation.
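The arithmetic behind the interval, and the effect of quadrupling the number of scenarios, can be checked with a short Python sketch. The helper `mean_confidence_interval` is a hypothetical name introduced here; the t-value is entered by hand as 1.984 (the percentile the text rounds to 1.98), and for the 4N case the sample statistics and the t-value are held fixed purely for illustration.

```python
import math


def mean_confidence_interval(sample_mean, sample_sd, n, t_value):
    """CI for the mean: sample_mean +/- t_value * sample_sd / sqrt(n)."""
    half_width = t_value * sample_sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


T_975_99 = 1.984  # 97.5th percentile of t with 99 d.f. (1.98 in the text)

low, high = mean_confidence_interval(1087.90, 146.15, 100, T_975_99)
print(f"95% CI with N=100: ({low:.2f}, {high:.2f})")

# With 4N scenarios (same sample statistics and t-value, an assumption
# made for illustration), the interval is half as long.
low4, high4 = mean_confidence_interval(1087.90, 146.15, 400, T_975_99)
print((high - low) / (high4 - low4))
```

In practice the t-value for 399 degrees of freedom is slightly smaller than the one for 99, so the interval with 4N scenarios would be marginally shorter than half.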

Also later in this chapter, we will explain how random numbers are actually generated. We will see that drawing “independent” samples from distributions is not the most efficient way to simulate random numbers that provide good representation of the underlying probability distribution. Most simulation engines nowadays (including the simulation engines in @RISK and MATLAB) use sophisticated methodology that estimates parameters from the distribution of output variables of interest a lot more accurately.

The previous CI formula we used is in fact a conservative bound, rather than an exact estimate, for the actual accuracy in estimating the mean. Still, it is a useful benchmark to have.

4.2 WHY USE SIMULATION?

The example in the previous section illustrated a very basic MC simulation system. We started out with a deterministic model that involved a relationship between an input variable (market return ˜r0,1) and an output variable of interest (capital at the end of one year ˜C1). We modeled the input variable as a realization of a probability distribution (we assumed a normal distribution), generated scenarios for that input variable, and tracked what the value of the output variable was in every scenario by computing it through the formula that defines the relationship between ˜C1 and ˜r0,1. This is the general form of simulation models illustrated in Exhibit 4.1.

Despite its simplicity, this example allows us to point out one of the advantages of simulation modeling over pure mathematical modeling. Simulation enables us to evaluate (approximately) a function of a random variable. In this case, the function is very simple: the end-of-year capital, ˜C1, is dependent on the realization of the returns through the equation (1+˜r0,1)C0. If we are given a probability distribution for ˜r0,1, in some cases we can compute the probability distribution for ˜C1 in closed form. For example, if ˜r0,1 followed a normal distribution with mean µ0,1 = E[˜r0,1] and standard deviation σ0,1, then ˜C1 would follow a normal distribution too, with mean (1+µ0,1)C0 and standard deviation σ0,1C0.

However, if ˜r0,1 did not follow a normal distribution, or if the output variable ˜C1 were a more complex function of the input variable ˜r0,1, it would be difficult, and often practically impossible, to derive the probability distribution of ˜C1 from the probability distribution of ˜r0,1. Using simulation simplifies matters substantially.
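For the normal case, the closed-form mean and standard deviation can be compared directly with their simulated counterparts. The Python sketch below does this with the parameter values from section 4.1.2; the seed and the number of scenarios are arbitrary assumptions.

```python
import random
from statistics import mean, stdev

C0, MU, SIGMA = 1000.0, 0.0879, 0.1465

# Closed form when the return is normally distributed.
exact_mean = (1.0 + MU) * C0
exact_sd = SIGMA * C0

# Simulation estimate of the same two quantities.
rng = random.Random(3)  # arbitrary seed (an assumption)
capital = [(1.0 + rng.gauss(MU, SIGMA)) * C0 for _ in range(50_000)]
sim_mean, sim_sd = mean(capital), stdev(capital)

print(f"mean: exact={exact_mean:.2f} simulated={sim_mean:.2f}")
print(f"sd:   exact={exact_sd:.2f} simulated={sim_sd:.2f}")
```

The same loop keeps working if the line that computes the capital is replaced by a more complicated function of the return, whereas the closed-form lines would have to be rederived.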

There are three other important advantages of simulation that can only be appreciated in more complex situations. The first one is that simulation enables us to visualize the probability distribution resulting from compounding probability distributions for multiple input variables. The second is that it allows us to incorporate correlations between input variables. The third is that simulation is a low-cost tool for checking the effect of changing a strategy on an output variable of interest. Next, we extend the investment example to provide illustrations of such situations.

4.2.1 Multiple Input Variables and Compounding Distributions

Suppose now that you are planning for retirement and decide to invest in the market for the next 30 years (instead of only the next year). Suppose that your initial capital is still $1,000. You are interested in the return (and, ultimately, in the end-of-year capital, ˜C30) you will have after 30 years.

Let us assume that every year, your investment returns from investing in the S&P 500 will follow a normal distribution with the mean and standard deviation from the example in section 4.1.2. The final capital you have will depend on the realizations of 30 random variables, one for each year you are invested in the market.⁶ We found through simulation in section 4.1.2 that the probability distribution of the capital at the end of the first year will be normal. What do you think the probability distributions for the total return and the capital at the end of the 30th year will look like? Will they be normal?

As explained in Chapter 2, an investment of $1 at time 0 will grow to (1+˜r0,1)(1+˜r1,2)...(1+˜rt−1,t) dollars at the end of year t, and the total return ˜r0,t from time 0 to time t equals

\[
\tilde{r}_{0,t} = (1+\tilde{r}_{0,1})(1+\tilde{r}_{1,2}) \cdots (1+\tilde{r}_{t-1,t}) - 1.
\]

Interestingly, the probability distribution of ˜r0,t is not normal, and neither is the distribution of the capital at the end of 30 years. (The latter is basically a scaled version of the distribution of total return, since it can be obtained as ˜Ct = (1+˜r0,t)·C0, and the initial capital C0 is a constant [nonrandom] number.) In general, here are some useful facts to keep in mind when dealing with multiple input probability distributions:

When a constant is added to a random variable, as in 1 added to the random variable ˜r0,1, the distribution of (1+˜r0,1) has the same shape as the distribution of ˜r0,1; however, it is shifted to the right by 1.

As we saw in section 3.9, when a random variable is added to another random variable (e.g., ˜r0,1+˜r1,2), we cannot simply “add” the two probability distributions. In fact, even in cases when the two distributions have the same shape, the probability distribution of the sum of the random variables does not necessarily have the same shape. There are some exceptions; for instance, if we add two independent normal random variables, the probability distribution of the sum is normal. However, holding aside this case, this is not true in general.

In our example, we are multiplying two random variables, (1+˜r0,1) and (1+˜r1,2), in order to obtain the total return. Products of random variables are even more difficult to visualize than sums of random variables. Again, it virtually never happens that a product of several random variables, even if the random variables all follow the same probability distributions, results in a random variable with that same probability distribution. The lognormal distribution, which we introduced in section 3.7.2 in the previous chapter, is a rare exception, and this is one of the reasons that the lognormal distribution is used so often in financial modeling. (Specific applications of the lognormal distribution are discussed in Chapter 12.)
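The first two facts above can be checked numerically. In the Python sketch below, the uniform distribution is our own choice of example (it is not used elsewhere in this chapter): a uniform variable on [0, 2] would place half of its probability in [0.5, 1.5], whereas the sum of two independent uniform variables on [0, 1] places about three quarters there, so the sum is not uniform. The seed and sample size are arbitrary assumptions.

```python
import random
from statistics import mean, stdev

rng = random.Random(11)  # arbitrary seed (an assumption)
n = 50_000

# Fact 1: adding a constant shifts the distribution; the spread is unchanged.
r = [rng.gauss(0.0879, 0.1465) for _ in range(n)]
shifted = [1.0 + v for v in r]
mean_shift = mean(shifted) - mean(r)
sd_change = stdev(shifted) - stdev(r)

# Fact 2: the sum of two independent uniforms on [0, 1] is not uniform.
u_sum = [rng.random() + rng.random() for _ in range(n)]
middle_share = sum(0.5 <= v <= 1.5 for v in u_sum) / n

print(mean_shift, sd_change, middle_share)
```

The share near 0.75 rather than 0.50 reflects the triangular shape of the sum, which peaks in the middle of its range.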

Fortunately, simulation makes visualizing the probability distribution of the product easy. Exhibit 4.3 presents the output distributions for total return and capital at the end of 30 years. (For instructions on how to create the model, see Appendix B or C online, and then the summaries of instructions in this chapter’s Software Hints.) We can observe (both from the graph and from the statistics for skewness and kurtosis) that the distribution is very skewed, even though the distributions for individual returns in each of the 30 years were symmetric.

EXHIBIT 4.3 Output distribution for amount of capital after 30 years. [Figure: histogram titled "Total Amount of Money in Account"; horizontal axis in thousands of dollars, from 0 to 200; vertical axis in units of 10^-5, from 0 to 6.]
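The 30-year experiment can be sketched in Python as follows. The sketch assumes that the 30 annual returns are independent draws from the normal distribution of section 4.1.2; the seed and the number of scenarios are arbitrary, and the code is an illustration rather than the @RISK or MATLAB model behind Exhibit 4.3.

```python
import random
from statistics import mean, median

C0, MU, SIGMA = 1000.0, 0.0879, 0.1465
YEARS, N = 30, 5000

rng = random.Random(5)  # arbitrary seed (an assumption)


def final_capital():
    """Compound YEARS independent normal annual returns on C0."""
    capital = C0
    for _ in range(YEARS):
        capital *= 1.0 + rng.gauss(MU, SIGMA)
    return capital


outcomes = [final_capital() for _ in range(N)]

m = mean(outcomes)
med = median(outcomes)
sd = (sum((x - m) ** 2 for x in outcomes) / (N - 1)) ** 0.5
skewness = sum(((x - m) / sd) ** 3 for x in outcomes) / N

print(f"mean={m:.0f} median={med:.0f} skewness={skewness:.2f}")
```

A mean well above the median, together with a clearly positive skewness, is the numerical counterpart of the long right tail in the histogram.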

4.2.2 Incorporating Correlations

Let us now complicate the situation more. Suppose that you have the opportunity to invest in stocks and Treasury bonds over the next 30 years. Suppose that today you allocate 50% of your capital to the stock market by investing in the index fund, and 50% in bonds. Furthermore, suppose over the 30 years you never rebalance your portfolio (i.e., you do not change the allocation between stocks and bonds). What will be the portfolio return
