Semi-parametric implied volatility surface models and forecasts based on a regression tree-boosting algorithm

Dominik Colangelo

Submitted for the degree of Ph.D. in Economics at Swiss Finance Institute

Faculty of Economics

Università della Svizzera italiana, USI Lugano, Switzerland

Thesis Committee:

Prof. F. Audrino, advisor, Universität St. Gallen
Prof. F. Trojani, Università della Svizzera italiana
Prof. W. Härdle, Humboldt-Universität zu Berlin

November 2009

Acknowledgments

I would like to thank my supervisor Prof. Francesco Audrino for his guidance throughout my doctoral studies. He taught me a lot by pushing me to my limits and beyond. This thesis is an offspring of my collaboration in his research project ‘Multivariate FGD techniques for implied volatility surfaces estimation and term structure forecasting’ that was funded by the Foundation for Research and Development of USI.

During the time spent in Lugano working at the Institute of Finance, I had the chance to make a lot of great friends, discuss my research with colleagues on numerous occasions and follow a top PhD program provided by the Swiss Finance Institute. I also discovered my own italianità and found the love of my life.

Special thanks and gratitude go to my wife, my parents, my sister and my extended family for their love, support and hospitality. A substantial part of the thesis has been written in the land down under.

Gabriela and Kathy take credit for proofreading; remaining typos are my sole responsibility.

Abstract

A new methodology for semi-parametric modelling of implied volatility surfaces (IVS) is presented. It relies on a feasible estimation strategy in a statistical learning framework. Given a reasonable starting model, a boosting algorithm based on regression trees sequentially minimizes generalized residuals computed as differences between observed and estimated implied volatilities. To overcome the poor predictive power of existing models, a grid is included in the region of interest and a cross-validation strategy is implemented to find an optimal stopping value for the boosting procedure. Backtesting the out-of-sample performance on a large data set of implied volatilities from S&P 500 options provides empirical evidence of the strong predictive power of the model.

Accurate IVS forecasts, also for single equity options, help to obtain reliable trading signals for very profitable pure option trading strategies.

Chapter 1

Introduction

The liquidity of option markets has steadily grown since the seminal work of Black and Scholes (1973) and Merton (1973). They showed that the price of an option is the initial cost of a self-financing replicating strategy and derived the well-known analytical Black-Scholes (BS) formula for European options. At the current time $t$, the expiry date $T$, the underlying stock price $S_t$ and the constant risk-free interest rate $r$ are directly observable. However, the instantaneous volatility of the underlying stock return process is unknown. Using the market price of an option, it is possible to numerically solve the BS formula for the unknown volatility parameter. The resulting number is called implied volatility (IV). It is a well-known empirical fact that the IV is not constant, as is assumed when deriving the BS formula. Instead, it varies over time, strike and expiry date. The concept of implied volatility surface (IVS) specifies IV as a function of moneyness $m$ and time to maturity $\tau$, where the former quantifies the degree of intrinsic value in the option price and the latter the time value. Moneyness $m$ is an increasing function of the strike $K$ and, in general, may also depend on $t$, $T$, $S_t$ and $r$.

IV is regarded as a state variable that reflects current market conditions and expectations about future states. Hence it makes sense to model the IVS directly, although the degenerate structure of option data makes this task difficult: only options with a few distinct maturities, but many different strikes, are traded.

Certain regions of the IVS exhibit a strong dynamic that is hard to capture. A thoughtfully constructed estimation strategy needs to be considered to avoid all sorts of pitfalls (smoothness, no-arbitrage conditions, computational feasibility, overfitting, etc.).

Recently, a great deal of effort has been put into modelling the IVS directly. Gonçalves and Guidolin (2006) combined a cross-sectional approach similar to that of Dumas, Fleming, and Whaley (1998) with vector autoregressive models. They tried, and partially succeeded depending on transaction costs, to exploit single- and multi-step ahead volatility predictions produced by their model to form profitable volatility-based trading strategies. Semi- and nonparametric smoothing methods as well as dimension-reduction techniques have also been introduced. Skiadopoulos, Hodges, and Clewlow (2000) popularized principal components analysis (PCA) in the IVS literature. They applied PCA on a multivariate time series of IV differences for a given moneyness level and within a certain expiry range. For a surface analysis, they only used three ‘expiry buckets’ with 10 to 90, 90 to 180 and 180 to 270 days to expiry.

Cont and da Fonseca (2002) presented a functional data analysis approach based on the Karhunen-Loève decomposition, an extension of the PCA method for random surfaces. Fengler, Härdle, and Villa (2003) argued that IVs of different maturity groups have a common eigenstructure and defined a common principal component (CPC) framework. Fengler, Härdle, and Mammen (2007) combined methods from functional PCA and backfitting techniques for additive models in their dynamic semiparametric factor model (DSFM). By taking the degenerate option data structure explicitly into account, they overcame some of the difficulties that the models based on PCA had encountered. They fitted their functional model directly on the aggregated data, without the need to estimate IV with a nonparametric smoothing estimator on a fixed grid or to sort IV into moneyness/time-to-expiry buckets in order to obtain a high-dimensional time series of IV classes as an approximation of the IVS. In a comparison of the one-day out-of-sample prediction error, the DSFM performs only 10% better on DAX option data than a simple sticky-moneyness model, where IV is taken to be constant over time at a fixed moneyness.

1.1 Goals

The first goal of this thesis is to set up a statistical learning framework that improves any given starting model for the IVS with an extended predictor space. The classical predictor space consisting of only $m$ and $\tau$ is enhanced to higher dimensions by including a call/put dummy variable, exogenous factors, and time-lagged as well as forecasted time-leading versions of these factors. Supervised learning is achieved by iteratively applying a tree-boosting algorithm.

Tree-boosting is a simple version of an optimization technique in function space called functional gradient descent (FGD), using regression trees (Breiman, Friedman, Stone, and Olshen, 1984) as base learners and a quadratic loss function. Audrino and Bühlmann (2003) developed this machine learning technique for financial time series. FGD has shown its power in improving volatility forecasts in high-dimensional GARCH models for risk management purposes (Audrino and Barone-Adesi, 2005), modelling interest rates (Audrino, Barone-Adesi, and Mira, 2005) and expected bond returns (Audrino and Barone-Adesi, 2006). It also helps to improve the filtered historical simulation method, for example to compute reliable out-of-sample yield curve scenarios and confidence intervals (Audrino and Trojani, 2007).
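To make the idea of tree-boosting with a quadratic loss concrete, the following is a minimal sketch and not the algorithm developed in this thesis. It assumes the simplest possible base learner (a regression stump, i.e. a tree with a single split), a fixed number of iterations instead of a cross-validated stopping value, a hypothetical shrinkage factor, and synthetic data in place of market IVs. All function names are illustrative.

```python
# Minimal sketch of tree-boosting (functional gradient descent, quadratic loss).
# Assumptions: stumps as base learners, fixed n_iter, synthetic toy data.

def fit_stump(X, r):
    """Least-squares stump: best single split over all predictors and thresholds."""
    n, d = len(X), len(X[0])
    best = None
    for j in range(d):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)  # midpoint, so both sides are non-empty
            left = [r[i] for i in range(n) if X[i][j] <= thr]
            right = [r[i] for i in range(n) if X[i][j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    if best is None:  # every predictor is constant: fall back to the mean
        m = sum(r) / n
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    j, thr, ml, mr = stump
    return ml if x[j] <= thr else mr

def boost(X, y, start, n_iter=50, shrink=0.5):
    """Improve the starting model `start` by repeatedly fitting stumps to residuals."""
    fitted = [start(x) for x in X]
    stumps = []
    for _ in range(n_iter):
        # For the loss 0.5 * (y - f)^2 the negative gradient is the residual y - f.
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = fit_stump(X, resid)
        stumps.append(s)
        fitted = [fi + shrink * predict_stump(s, x) for fi, x in zip(fitted, X)]
    return lambda x: start(x) + shrink * sum(predict_stump(s, x) for s in stumps)

def mse(model, X, y):
    return sum((yi - model(x)) ** 2 for x, yi in zip(X, y)) / len(y)

# Toy data: a smile in moneyness m plus a slope in tau (synthetic, not market data).
X = [(0.8 + 0.05 * i, 0.1 * k) for i in range(9) for k in range(1, 6)]
y = [0.2 + 0.5 * (m - 1.0) ** 2 + 0.02 * tau for m, tau in X]
start = lambda x: 0.2  # flat (constant-volatility) starting model
model = boost(X, y, start)
```

On this toy surface, `mse(model, X, y)` is smaller than `mse(start, X, y)`, because every boosting step fits the current residuals. In an out-of-sample setting the number of iterations would have to be chosen by validation rather than fixed in advance.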

The second goal is to focus on out-of-sample predictions of the IVS. For certain regions in the $(m, \tau)$ domain, the prediction errors shall be controlled such that the performance of any reasonable starting model in forecasting IV is also improved under possible structural breaks in the time series.

The third goal is to investigate the practical use of the proposed IVS methodology. Only a few studies link option trading with IV analysis (Ahoniemi, 2006; Goyal and Saretto, 2009). This thesis defines option trading strategies and analyzes their performances, also in the context of dispersion trades (Driessen, Maenhout, and Vilkov, 2009).

1.2 Outline

After thoroughly revisiting the Black-Scholes framework in Chapter 2, the related concept of implied volatility is compared in Chapter 3 to other volatility concepts that emerge from generalizing the dynamics of the underlying security. It is possible to analyze the shape of the IVS for any local volatility or stochastic volatility model¹, but the opposite direction is more promising as IV provides an exact link to them. Modelling the IVS directly (as a random field) raises a lot of questions about possible predictors of IVS.

Chapter 4 introduces supervised learning methods that perform automatic variable selection. Chapter 5, based on a forthcoming article in Statistics and Computing², defines the new methodology for modelling the IVS in a statistical learning framework.

The two following chapters are empirical. In Chapter 6, the out-of-sample (OS) performance of IVS predictions for the S&P 500 index is analyzed, also for a possible application with dispersion trading. In Chapter 7, single equity option returns (the constituents of the S&P 100 index) are forecasted 10 days OS. A pure option trading strategy is defined based on that signal, relying on stability of the moneyness state during the last 20 calendar days until maturity. Conclusions are presented in Chapter 8.

¹ All such models generate an IVS with similar shape (Gatheral, 2006, Chapter 7).

² Audrino, F. and D. Colangelo (2009). Semi-parametric forecasts of the implied volatility surface using regression trees. Forthcoming in Statistics and Computing. DOI: 10.1007/s11222-009-9134-y.

Chapter 2

The Black-Scholes model revisited

The model of Black and Scholes (1973) is set in a continuous-time financial market. Assume there are two securities in a frictionless market³, a risky asset $S_t$ and a risk-free security $B_t$ that acts as a numéraire, i.e. a savings account paying a risk-free interest rate $r$, here assumed to be constant and equal for borrowing and lending. The dynamics of the two securities are given by

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t \tag{2.1}$$

$$dB_t = r B_t\,dt \tag{2.2}$$

where $W_t$ is an $\mathbb{F}$-adapted standard Wiener process (a.k.a. Brownian motion) defined on a probability space $(\Omega, \mathcal{F}, P)$. The filtration $\mathbb{F}$ is an increasing sequence of $\sigma$-algebras on $(\Omega, \mathcal{F})$, consisting of $\mathcal{F}_t = \sigma(W_s : s \le t)$, the smallest $\sigma$-algebra such that all $\{W_s, s \le t\}$ are $\mathcal{F}_t$-measurable, for $t \in [0, T]$. Furthermore, all $P$-null sets are included in $\mathcal{F}_0$. In other words, the investors know the history of $S$ from time 0 up to the present time $t$, but they have no information about later values.

³ Assets are perfectly (infinitesimally) divisible, there are no short sale restrictions and no transaction costs occur either for buying or selling.

2.1 Geometric Brownian motion as a process for stock prices

The solution of the ordinary differential equation for the numéraire (2.2) with boundary condition $B_0 = 1$ is straightforward and given by $B_t = e^{rt}$. Dividing both sides of Eq. (2.1) by $S_t > 0$ reveals that $\mu$ is the instantaneous drift and $\sigma$ the instantaneous volatility of $dS_t/S_t$, the percentage change process of $S_t$ over an infinitesimally small period $dt$. Both $\mu$ and $\sigma$ are assumed to be constant in the Black-Scholes (BS) framework. A process following such a stochastic differential equation (SDE) is called geometric Brownian motion (GBM). The solution to Eq. (2.1) is analytically given by

$$S_t = S_0 \exp\left\{\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t\right\} \tag{2.3}$$

for any initial value $S_0 > 0$. This can be checked with the help of Itô's lemma.

For an Itô process of the form

$$X_t = X_0 + \int_0^t a_s\,ds + \int_0^t b_s\,dW_s \tag{2.4}$$

with $a$ predictable and Lebesgue integrable and $b$ a predictable $W$-integrable process, Itô's lemma states that a twice continuously differentiable function $f$ of $X_t$ is itself an Itô process with dynamics given by

$$df(X_t) = f'(X_t)\,dX_t + \frac{1}{2} f''(X_t)\,d\langle X\rangle_t, \tag{2.5}$$

adding half of the second derivative of $f$ times the differential of the quadratic variation process to the standard chain rule part⁴. For a partition of the interval $[0, t]$, $0 = t_0 < t_1 < \dots < t_n = t$, the quadratic variation $\sum_{k=1}^{n} (X_{t_k} - X_{t_{k-1}})^2$ converges in probability to $\langle X\rangle_t = \int_0^t b_s^2\,ds$ as the mesh of the partition tends to 0. Therefore, in differential notation, we have $d\langle X\rangle_t = b_t^2\,dt$.

⁴ More generally, if $f(t, X_t)$ is continuously differentiable in $t$ and twice continuously differentiable in $X_t$, then $df(t, X_t) = \frac{\partial f(t, X_t)}{\partial t}\,dt + \frac{\partial f(t, X_t)}{\partial X_t}\,dX_t + \frac{1}{2}\frac{\partial^2 f(t, X_t)}{\partial X_t^2}\,d\langle X\rangle_t$.

Applying Itô's lemma (2.5) to $f(S_t) = \log S_t$ helps find the solution of the SDE for a geometric Brownian motion.

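$$\begin{aligned}
d\log S_t &= \frac{1}{S_t}\,dS_t + \frac{1}{2}\left(-\frac{1}{S_t^2}\right)\sigma^2 S_t^2\,dt \\
&= \frac{1}{S_t}\left(\mu S_t\,dt + \sigma S_t\,dW_t\right) - \frac{1}{2}\sigma^2\,dt \\
&= \left(\mu - \frac{1}{2}\sigma^2\right)dt + \sigma\,dW_t,
\end{aligned} \tag{2.6}$$

the right-hand side being independent of $S_t$. It follows that $\log S_t = \log S_0 + (\mu - \frac{1}{2}\sigma^2)t + \sigma W_t$, and solving for $S_t$ leads to expression (2.3). The defining properties of a standard Wiener process⁵ together with the derived results imply that the log return process of $S_t$ has a normal distribution,

$$\log\frac{S_t}{S_s} \overset{d}{\sim} N\left(\left(\mu - \frac{1}{2}\sigma^2\right)(t-s),\ \sigma^2(t-s)\right), \qquad 0 \le s < t \le T. \tag{2.7}$$

Hence, $S_t \mid S_s$ is log-normally distributed with probability density function (PDF)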

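$$p_{s,t}(x) = \frac{1}{x\,b\,\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{\log x - a}{b}\right)^2\right\} \tag{2.8}$$

$$a := a(s, t, \mu, \sigma, S_s) = \left(\mu - \frac{1}{2}\sigma^2\right)(t-s) + \log S_s, \qquad b := b(s, t, \sigma) = \sigma\sqrt{t-s}$$

and cumulative distribution function (CDF)

\begin{align}
P[S_t \le x \mid S_s] = P[S_t \le x \mid \mathcal{F}_s] &= \int_0^x p_{s,t}(y)\,dy \tag{2.9}\\
&= \int_{-\infty}^{\frac{\log x - a}{b}} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right) dz \tag{2.10}\\
&= \int_{-\infty}^{\frac{\log x - a}{b}} \varphi(z)\,dz = \Phi\left(\frac{\log x - a}{b}\right). \tag{2.11}
\end{align}

A change of variable takes place in (2.10), $z := \frac{\log y - a}{b}$. $\varphi(\cdot)$ and $\Phi(\cdot)$ denote the PDF and CDF of a standard normal random variable.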

⁵ A standard Wiener process $W_t$ on $[0, T]$ is defined by the following properties: $W_0 = 0$, $W_t$ is almost surely continuous, has independent increments and $W_t - W_s \overset{d}{\sim} N(0, t - s)$ for $0 \le s < t \le T$.

The conditional expectation and variance of $S_t \mid S_s$ under the physical probability measure $P$ are

\begin{align}
E_P[S_t \mid S_s] &= e^{a + \frac{1}{2}b^2} = e^{\mu(t-s)} S_s \tag{2.12}\\
\mathrm{Var}_P(S_t \mid S_s) &= e^{2a + b^2}\left(e^{b^2} - 1\right) = e^{2\mu(t-s)} S_s^2 \left\{e^{\sigma^2(t-s)} - 1\right\}. \tag{2.13}
\end{align}

Remark 2.1 Note that the instantaneous drift $\mu$ is the expected percentage change in the stock price per infinitesimally small period $dt$, $E_P[dS_t/S_t]/dt = \mu$, but the expected continuously compounded return over the period $[0, T]$ is $E_P\left[\frac{1}{T}\log\frac{S_T}{S_0}\right] = \mu - \frac{1}{2}\sigma^2$.
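The closed-form moments (2.12) and (2.13) can be cross-checked numerically against the log-normal density (2.8). The sketch below is only an illustration: the parameter values are arbitrary assumptions, the function names are hypothetical, and the expectation is computed with a plain midpoint rule after the change of variable $z = (\log x - a)/b$.

```python
import math

def lognormal_params(s, t, mu, sigma, S_s):
    """a and b as defined below Eq. (2.8)."""
    a = (mu - 0.5 * sigma ** 2) * (t - s) + math.log(S_s)
    b = sigma * math.sqrt(t - s)
    return a, b

def pdf(x, a, b):
    """Log-normal density p_{s,t}(x) of Eq. (2.8)."""
    z = (math.log(x) - a) / b
    return math.exp(-0.5 * z * z) / (x * b * math.sqrt(2.0 * math.pi))

def cdf(x, a, b):
    """Eq. (2.11): Phi((log x - a) / b), with Phi expressed via the error function."""
    return 0.5 * (1.0 + math.erf((math.log(x) - a) / (b * math.sqrt(2.0))))

def moment(k, a, b, n=20000, width=10.0):
    """E[S_t^k | S_s] by the midpoint rule in z = (log x - a) / b on [-width, width]."""
    h = 2.0 * width / n
    total = 0.0
    for i in range(n):
        z = -width + (i + 0.5) * h
        total += math.exp(k * (a + b * z)) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

# Illustrative values (assumptions, not taken from the text).
s, t, mu, sigma, S_s = 0.0, 1.0, 0.08, 0.2, 100.0
a, b = lognormal_params(s, t, mu, sigma, S_s)

mean_num = moment(1, a, b)
var_num = moment(2, a, b) - mean_num ** 2
mean_closed = math.exp(mu * (t - s)) * S_s                      # Eq. (2.12)
var_closed = (math.exp(2 * mu * (t - s)) * S_s ** 2
              * (math.exp(sigma ** 2 * (t - s)) - 1.0))         # Eq. (2.13)
```

The numerical and closed-form values agree to high relative accuracy, and differentiating `cdf` numerically recovers `pdf`, which mirrors the step from (2.9) to (2.11).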

2.2 Pricing European plain vanilla options

The term “plain vanilla option” describes the standard version of an option that does not have any special component. This is unlike an exotic option which is more complex and non-standard.

Definition 2.2 A stock option is a contract between a buyer (holder) and a seller (writer) that guarantees the buyer the right, but not the obligation, to buy (call option) or sell (put option) a share of the underlying stock at a fixed strike price $K$ in the future at (European-style) or up to (American-style) a fixed maturity date $T$ (a.k.a. expiry date). In financial jargon, the holder is said to be long and the writer short an option. If the option is exercised, the writer is obliged to fulfill the terms of the contract.

The frictionless BS financial market consisting of a risk-free security $B_t = e^{rt}$ with constant $r$ and a (non-dividend paying) risky stock $S_t$ that follows a geometric Brownian motion with constant $\mu$ and $\sigma$ is complete and does not allow for arbitrage opportunities (Hafner, 2004, p. 24). A complete market is one in which any contingent claim is attainable, i.e. for any contingent claim, there exists a self-financing strategy investing in the given securities such that it replicates the final value of that contingent claim. Therefore, by the fundamental theorem of asset pricing, a unique risk-neutral measure to price contingent claims exists (Schachermayer, 2009). The principles of contingent claim pricing are explained in Appendix B.

2.2.1 Ingredients of the BS framework

Equations (2.1), (2.3) and (2.7) specify the stock price process and Eq. (2.8) its PDF; the pricing kernel is given by the following change of measure

$$\frac{dQ}{dP} = \exp\left\{-\int_0^t \frac{\mu - r}{\sigma}\,dW_s - \frac{1}{2}\int_0^t \left(\frac{\mu - r}{\sigma}\right)^2 ds\right\} \tag{2.14}$$

and Girsanov's theorem states that

$$\tilde{W}_t = \frac{\mu - r}{\sigma}\,t + W_t \tag{2.15}$$

is a standard Brownian motion under the new measure $Q$, which together with Eq. (2.1) implies that the stock price process satisfies

$$dS_t = r S_t\,dt + \sigma S_t\,d\tilde{W}_t. \tag{2.16}$$

The density of $Q$ is called risk-neutral PDF or state-price density (SPD),

$$q_{s,t}(x) = dQ[S_t \le x \mid S_s] = \frac{1}{x\sigma\sqrt{2\pi(t-s)}} \exp\left\{-\frac{1}{2}\left(\frac{\log\left(\frac{x}{S_s}\right) - \left(r - \frac{1}{2}\sigma^2\right)(t-s)}{\sigma\sqrt{t-s}}\right)^2\right\}. \tag{2.17}$$

The risk-neutral PDF is log-normal like the physical PDF in Eq. (2.8), but with $r$ instead of $\mu$.

The discounted stock price process $\tilde{S}_t = e^{-rt}S_t$ is a martingale under $Q$; to prove this, we have $d\tilde{S}_t = \tilde{S}_t\sigma\,d\tilde{W}_t$ by virtue of Itô's lemma, and an Itô integral is a martingale (Elliott and Kopp, 2005, Theorem 6.3.3)⁶. Alternatively, the martingale property is directly checked by

$$E_Q[\tilde{S}_t \mid \mathcal{F}_s] = S_0\,E_Q\left[e^{\sigma\tilde{W}_t - \frac{\sigma^2}{2}t} \,\Big|\, \mathcal{F}_s\right] \overset{(*)}{=} S_0\,e^{\sigma\tilde{W}_s - \frac{\sigma^2}{2}s} = \tilde{S}_s \tag{2.18}$$

for all $0 \le s < t \le T$. $(*)$ follows from the defining properties of a Wiener process (Elliott and Kopp, 2005, Theorem 6.2.5).

⁶ The martingale representation theorem proves the converse statement. Any almost surely continuous martingale can be expressed as an Itô integral with unique integrand process w.r.t. a standard Brownian motion (Elliott and Kopp, 2005, Theorem 7.3.9).

According to Aït-Sahalia and Lo, “SPDs are ‘sufficient statistics’ in an economic sense – they summarize all relevant information about preferences and business conditions for purposes of pricing financial securities” (1998, p. 503). Detlefsen, Härdle, and Moro (2007) show how to recover the market utility function $U(s)$ implicit in the BS framework by equating the stochastic discount factor $M_{t,T} = \beta U'(S_T)/U'(S_t)$ obtained in a preference-based equilibrium model, where $\beta$ is a fixed discount factor, with the state price density per unit probability $e^{-r(T-t)}\,q_{t,T}(S_T)/p_{t,T}(S_T)$ that appears in the context of risk-neutral pricing. The implicit utility is a power utility of the form

$$U(S_T) = \left(1 - \frac{\mu - r}{\sigma^2}\right)^{-1} S_T^{\,1 - \frac{\mu - r}{\sigma^2}}. \tag{2.19}$$

The contract specifications of a European plain vanilla stock option determine its payoff function: $\psi_T(S_T) = \max(S_T - K, 0)$ for a call and $\psi_T(S_T) = \max(K - S_T, 0)$ for a put. All relevant quantities to price these contingent claims (Appendix B) have now been defined. Option prices can be obtained by calculating $\pi_t(\psi_T) = E_P[\psi_T M_{t,T} \mid \mathcal{F}_t] = E_Q[\psi_T e^{-r(T-t)} \mid \mathcal{F}_t]$.

2.2.2 The BS formula

Black and Scholes derive their famous option pricing formula by showing that “it is possible to create a hedged position, consisting of a long position in the stock and a short position in the [call] option [on the same stock], whose value will not depend on the price of the stock” (1973, p. 641). Since such a hedge portfolio is risk-free, its rate of return must equal $r$ by the assumption of no-arbitrage.

More generally, this method of arbitrage-free pricing leads to a partial differential equation (PDE) for the price $H(t, S_t)$ of a European contingent claim⁷. Merton (1973) derives the BS model from weaker assumptions than in the original paper and also includes dividends. If the stock provides a dividend yield at constant rate $q$, then the BS PDE turns out to be

$$\frac{\partial H}{\partial t} + (r - q)S\frac{\partial H}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 H}{\partial S^2} - rH = 0 \tag{2.20}$$

with boundary condition $H(T, S_T) = \psi_T(S_T)$. For European plain vanilla stock options, the solution of the PDE can be analytically calculated and is known as the BS formula,

\begin{align}
C_t^{BS} &= S_t e^{-q(T-t)}\Phi(d_1) - K e^{-r(T-t)}\Phi(d_2) && \text{(call)} \tag{2.21}\\
P_t^{BS} &= K e^{-r(T-t)}\Phi(-d_2) - S_t e^{-q(T-t)}\Phi(-d_1) && \text{(put)} \tag{2.22}
\end{align}

where

$$\Phi(u) = \int_{-\infty}^{u} \varphi(z)\,dz, \qquad \varphi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2},$$

$$d_1 = \frac{\log(S_t/K) + \left(r - q + \frac{1}{2}\sigma^2\right)(T - t)}{\sigma\sqrt{T - t}}, \qquad d_2 = d_1 - \sigma\sqrt{T - t}.$$

Definition 2.3 The cp flag denotes a binary variable that equals 1 for a call and 0 for a put option.

The BS formula can then be written as

$$BS_t(S_t, \sigma, \text{cp flag}, K, T, r, q) = \begin{cases} C_t^{BS} & \text{if cp flag} = 1 \\ P_t^{BS} & \text{if cp flag} = 0. \end{cases} \tag{2.23}$$

⁷ Its payoff function $\psi_t = \psi_t(S_t)$ must be path-independent and a non-negative random variable that is $\mathcal{F}_t$-measurable. An integrability condition for $\psi_t$ can be found in Fengler (2005, Section 2).
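As an illustration of Eq. (2.23), and of the numerical inversion for the volatility parameter that defines implied volatility, a minimal Python sketch follows. The function names, the bisection bracket `[lo, hi]` and the example inputs are assumptions made for this sketch; it takes $\tau = T - t > 0$ as a single argument and uses bisection only because the BS price is increasing in $\sigma$.

```python
import math

def Phi(u):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def bs_price(S, sigma, cp_flag, K, tau, r, q=0.0):
    """Eq. (2.23): call if cp_flag == 1, put if cp_flag == 0; tau = T - t > 0."""
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    if cp_flag == 1:
        return S * math.exp(-q * tau) * Phi(d1) - K * math.exp(-r * tau) * Phi(d2)
    return K * math.exp(-r * tau) * Phi(-d2) - S * math.exp(-q * tau) * Phi(-d1)

def implied_vol(price, S, cp_flag, K, tau, r, q=0.0, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_price(sigma) = price for sigma by bisection on [lo, hi]."""
    p_lo = bs_price(S, lo, cp_flag, K, tau, r, q)
    p_hi = bs_price(S, hi, cp_flag, K, tau, r, q)
    if not (p_lo <= price <= p_hi):
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_price(S, mid, cp_flag, K, tau, r, q) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example inputs (assumed): at-the-money options, one year to expiry.
call = bs_price(100.0, 0.2, 1, 100.0, 1.0, 0.05)
put = bs_price(100.0, 0.2, 0, 100.0, 1.0, 0.05)
iv = implied_vol(call, 100.0, 1, 100.0, 1.0, 0.05)
```

Feeding the model price back into `implied_vol` recovers the volatility used to generate it. With market prices, the same routine would return a different number for each strike and maturity, which is the empirical observation that motivates modelling the IVS.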

2.2.3 Comments and clarifications

The solution of the BS PDE (2.20) with boundary condition $H(T, S_T) = \psi_T(S_T)$ is equivalent to the ‘linear pricing rule’ result that is inherent in the state price approach, $H(t, S_t) \equiv \pi_t(\psi_T(S_T))$. For example, the price of a European call option on a non-dividend paying stock is

$$\begin{aligned}
C_t(S_t, K, T) &= \pi_t(\max(S_T - K, 0)) \\
&= e^{-r(T-t)}\,E_Q[\max(S_T - K, 0) \mid \mathcal{F}_t] \\
&= e^{-r(T-t)} \int_0^\infty \max(S_T - K, 0)\,dQ(S_T \mid S_t) \\
&= e^{-r(T-t)} \int_K^\infty (S_T - K)\,q_{t,T}(S_T)\,dS_T.
\end{aligned} \tag{2.24}$$

The first part of the integral in (2.24) is

$$\begin{aligned}
\int_K^\infty S_T\,q_{t,T}(S_T)\,dS_T &= E_Q[S_T \mid S_t] - \int_0^K S_T\,q_{t,T}(S_T)\,dS_T \\
&= e^{r(T-t)}S_t - \int_0^K S_T\,q_{t,T}(S_T)\,dS_T
\end{aligned} \tag{2.25}$$

and the second part

$$\begin{aligned}
\int_K^\infty K\,q_{t,T}(S_T)\,dS_T &= K\,Q[S_T > K \mid S_t] \\
&= K\,(1 - Q[S_T \le K \mid S_t]) \\
&= K - K\int_0^K q_{t,T}(S_T)\,dS_T.
\end{aligned} \tag{2.26}$$

Indeed, it can be shown that

$$C_t(S_t, K, T) \equiv BS_t(S_t, \sigma, \text{cp flag} = 1, K, T, r, q = 0).$$

Remark 2.4 Breeden and Litzenberger (1978) prove that $q_{t,T}(x) = dQ[S_T \le x \mid S_t]$ is, up to the compounding factor $e^{r(T-t)}$, the second derivative of the price of a call option with maturity $T$ with respect to its strike, evaluated at strike $x$. This holds when “the relation between the future cash flow and the underlying portfolio may be of any type – not necessarily linear or jointly normal” (p. 649),

$$q_{t,T}(x) = e^{r(T-t)} \left.\frac{\partial^2 C_t(S_t, K, T)}{\partial K^2}\right|_{K = x}. \tag{2.27}$$

Note 2.5 This result is only based on the specific form of the call payoff function $\psi_T = \max(S_T - K, 0)$, as we briefly verify for Eq. (2.24) with the help of Equations (2.25) and (2.26):

\begin{align}
\frac{\partial C_t(S_t, K, T)}{\partial K} &= e^{-r(T-t)}\left(\int_0^K q_{t,T}(S_T)\,dS_T - 1\right) \tag{2.28}\\
\frac{\partial^2 C_t(S_t, K, T)}{\partial K^2} &= e^{-r(T-t)}\,q_{t,T}(K). \tag{2.29}
\end{align}
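The Breeden-Litzenberger relation (2.27) can be checked numerically within the BS model, where both sides are known. The sketch below is a hedged illustration: it assumes BS call prices with $q = 0$, replaces the second strike derivative by a central finite difference with an arbitrarily chosen step `h`, and compares the result with the log-normal density (2.17). Function names and parameter values are hypothetical.

```python
import math

def Phi(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def bs_call(S, K, tau, r, sigma):
    """BS call price for a non-dividend paying stock, Eq. (2.21) with q = 0."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * Phi(d1) - K * math.exp(-r * tau) * Phi(d2)

def spd_closed_form(x, S, tau, r, sigma):
    """Risk-neutral log-normal density, Eq. (2.17), with tau = T - t."""
    z = (math.log(x / S) - (r - 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi * tau))

def spd_from_calls(x, S, tau, r, sigma, h=0.05):
    """Eq. (2.27) with a central second difference in the strike (step h is an assumption)."""
    c_up, c_mid, c_dn = (bs_call(S, K, tau, r, sigma) for K in (x + h, x, x - h))
    return math.exp(r * tau) * (c_up - 2.0 * c_mid + c_dn) / h ** 2

S, tau, r, sigma = 100.0, 0.5, 0.03, 0.2   # illustrative values
rows = [(x, spd_from_calls(x, S, tau, r, sigma), spd_closed_form(x, S, tau, r, sigma))
        for x in (80.0, 100.0, 120.0)]
```

Each tuple in `rows` holds the strike, the finite-difference density and the closed-form density. With observed call prices on a discrete strike grid, the same second difference gives a model-free but much noisier density estimate.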

Remark 2.6 The BS PDE (2.20) and therefore also the BS formula (2.23) do not depend on µ. No individual investor preferences or agreements on expectations amongst investors are assumed in the BS framework.

It is quite reasonable to expect that investors may have quite different estimates for current (and future) expected returns due to different levels of information, techniques of analysis, etc. However, most analysts calculate estimates of variances and covariances in the same way: namely, by using previous price data. Since all have access to the same price history, it is also reasonable to assume that their variance-covariance estimates may be the same (Merton, 1973, p. 163).

This seems to contradict the implicit market utility found in Eq. (2.19). Using Eq. (2.27), Breeden and Litzenberger clarify this issue by showing that “a necessary and sufficient condition for the Black-Scholes option-pricing formula to correctly price options on aggregate consumption is that individuals’ preferences aggregate to a utility function displaying constant relative risk aversion” (1978, Theorem 3).

2.3 The Greeks

The Greeks of a European contingent claim represent the sensitivities of the value process $H(t, S_t)$ to a small change in underlying parameters of the financial model. Usually, they are denoted by Greek letters. Table 2.1 defines the Greeks as partial derivatives of $H(t, S_t)$. The most common ones are delta, gamma, vega, theta and rho:

$$\Delta := \frac{\partial H}{\partial S}, \qquad \Gamma := \frac{\partial^2 H}{\partial S^2}, \qquad \nu := \frac{\partial H}{\partial \sigma}, \qquad \theta := \frac{\partial H}{\partial t}, \qquad \rho := \frac{\partial H}{\partial r}.$$

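The Greeks can be analytically calculated in the case of European plain vanilla stock options with given price

$$BS_t(S_t, \sigma, \text{cp flag}, K, T, r, q) = C_t^{BS}\,\mathbf{1}_{\{\text{cp flag} = 1\}} + P_t^{BS}\,\mathbf{1}_{\{\text{cp flag} = 0\}},$$

where $\mathbf{1}_{\{\text{expression}\}}$ is a dummy variable that equals 1 if the expression is true and 0 otherwise, and $\tau := T - t$ denotes the time to expiry.

\begin{align}
\Delta_t^{BS} &= \frac{\partial BS_t}{\partial S_t} = e^{-q\tau}\Phi(d_1)\,\mathbf{1}_{\{\text{cp flag} = 1\}} - e^{-q\tau}\Phi(-d_1)\,\mathbf{1}_{\{\text{cp flag} = 0\}} \tag{2.30}\\
\Gamma_t^{BS} &= \frac{\partial^2 BS_t}{\partial S_t^2} = \frac{e^{-q\tau}\varphi(d_1)}{S_t\,\sigma\sqrt{\tau}} \tag{2.31}\\
\nu_t^{BS} &= \frac{\partial BS_t}{\partial \sigma} = e^{-q\tau} S_t \sqrt{\tau}\,\varphi(d_1) \tag{2.32}\\
\theta_t^{BS} &= \frac{\partial BS_t}{\partial t} = \left(-\frac{e^{-q\tau} S_t\,\sigma\,\varphi(d_1)}{2\sqrt{\tau}} + q e^{-q\tau} S_t \Phi(d_1) - r e^{-r\tau} K \Phi(d_2)\right)\mathbf{1}_{\{\text{cp flag} = 1\}} \nonumber\\
&\quad + \left(-\frac{e^{-q\tau} S_t\,\sigma\,\varphi(d_1)}{2\sqrt{\tau}} - q e^{-q\tau} S_t \Phi(-d_1) + r e^{-r\tau} K \Phi(-d_2)\right)\mathbf{1}_{\{\text{cp flag} = 0\}} \tag{2.33}\\
\rho_t^{BS} &= \frac{\partial BS_t}{\partial r} = \tau e^{-r\tau} K \Phi(d_2)\,\mathbf{1}_{\{\text{cp flag} = 1\}} - \tau e^{-r\tau} K \Phi(-d_2)\,\mathbf{1}_{\{\text{cp flag} = 0\}} \tag{2.34}
\end{align}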
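The analytical Greeks (2.30) to (2.34) can be validated against finite differences of the BS price. The following sketch is illustrative only: the function names, step sizes and parameter values are assumptions, and the sign variable `w` (+1 for a call, -1 for a put) is a compact way of writing the two indicator branches.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def d1_d2(S, sigma, K, tau, r, q):
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return d1, d1 - sigma * math.sqrt(tau)

def bs_price(S, sigma, cp_flag, K, tau, r, q):
    d1, d2 = d1_d2(S, sigma, K, tau, r, q)
    w = 1.0 if cp_flag == 1 else -1.0  # +1 for a call, -1 for a put
    return w * (S * math.exp(-q * tau) * Phi(w * d1) - K * math.exp(-r * tau) * Phi(w * d2))

def bs_greeks(S, sigma, cp_flag, K, tau, r, q):
    """Analytical Greeks, Eqs. (2.30)-(2.34)."""
    d1, d2 = d1_d2(S, sigma, K, tau, r, q)
    w = 1.0 if cp_flag == 1 else -1.0
    eq, er, st = math.exp(-q * tau), math.exp(-r * tau), math.sqrt(tau)
    return {
        "delta": w * eq * Phi(w * d1),
        "gamma": eq * phi(d1) / (S * sigma * st),
        "vega": eq * S * st * phi(d1),
        "theta": (-eq * S * sigma * phi(d1) / (2.0 * st)
                  + w * q * eq * S * Phi(w * d1) - w * r * er * K * Phi(w * d2)),
        "rho": w * tau * er * K * Phi(w * d2),
    }

def fd_greeks(S, sigma, cp_flag, K, tau, r, q):
    """Central finite differences of the price (step sizes are assumptions)."""
    f = lambda S_=S, v_=sigma, tau_=tau, r_=r: bs_price(S_, v_, cp_flag, K, tau_, r_, q)
    hS, hv, ht, hr = 1e-2, 1e-5, 1e-5, 1e-5
    return {
        "delta": (f(S_=S + hS) - f(S_=S - hS)) / (2 * hS),
        "gamma": (f(S_=S + hS) - 2 * f() + f(S_=S - hS)) / hS ** 2,
        "vega": (f(v_=sigma + hv) - f(v_=sigma - hv)) / (2 * hv),
        "theta": -(f(tau_=tau + ht) - f(tau_=tau - ht)) / (2 * ht),  # d/dt = -d/dtau
        "rho": (f(r_=r + hr) - f(r_=r - hr)) / (2 * hr),
    }

args = dict(S=100.0, sigma=0.25, K=105.0, tau=0.5, r=0.03, q=0.01)  # illustrative
max_abs_err = max(
    abs(bs_greeks(cp_flag=cp, **args)[k] - fd_greeks(cp_flag=cp, **args)[k])
    for cp in (0, 1) for k in ("delta", "gamma", "vega", "theta", "rho")
)
```

`max_abs_err` stays small for both calls and puts. Note that theta is the derivative with respect to calendar time $t$, so the finite difference in $\tau$ enters with a minus sign.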

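Definition of the Greeks

| | Spot price $S$ | Volatility $\sigma$ | Time $t$ [time to expiry $\tau := T - t$] | Risk-free rate $r$ |
|---|---|---|---|---|
| Value $H$ | delta: $\Delta := \frac{\partial H}{\partial S}$ | vega: $\nu := \frac{\partial H}{\partial \sigma}$ | theta: $\theta := \frac{\partial H}{\partial t}$ [$\theta = -\frac{\partial H}{\partial \tau}$] | rho: $\rho := \frac{\partial H}{\partial r}$ |
| Delta $\Delta$ | gamma: $\Gamma := \frac{\partial \Delta}{\partial S} = \frac{\partial^2 H}{\partial S^2}$ | vanna: $\frac{\partial \Delta}{\partial \sigma} = \frac{\partial \nu}{\partial S} = \frac{\partial^2 H}{\partial S\,\partial \sigma}$ | charm: $\frac{\partial \Delta}{\partial \tau} = -\frac{\partial \theta}{\partial S} = \frac{\partial^2 H}{\partial S\,\partial \tau}$ | |
| Gamma $\Gamma$ | speed: $\frac{\partial \Gamma}{\partial S} = \frac{\partial^3 H}{\partial S^3}$ | zomma: $\frac{\partial \Gamma}{\partial \sigma} = \frac{\partial^3 H}{\partial S^2\,\partial \sigma}$ | color: $\frac{\partial \Gamma}{\partial \tau} = \frac{\partial^3 H}{\partial S^2\,\partial \tau}$ | |
| Vega $\nu$ | vanna: $\frac{\partial \Delta}{\partial \sigma} = \frac{\partial \nu}{\partial S} = \frac{\partial^2 H}{\partial S\,\partial \sigma}$ | vomma: $\frac{\partial \nu}{\partial \sigma} = \frac{\partial^2 H}{\partial \sigma^2}$ | DvegaDtime: $\frac{\partial \nu}{\partial \tau} = \frac{\partial^2 H}{\partial \sigma\,\partial \tau}$ | |
| Vomma | | ultima: $\frac{\partial\,\text{vomma}}{\partial \sigma} = \frac{\partial^3 H}{\partial \sigma^3}$ | | |

Table 2.1: “The table shows the relationship of the more common sensitivities to the four primary inputs into the Black-Scholes model (spot price of the underlying security, time remaining until option expiration, volatility and the rate of return of a risk-free investment) and to the option’s value, delta, gamma, vega and vomma. Greeks which are a first-order derivative are in [blue], second-order derivatives are in [green], and third-order derivatives are in [orange]. Note that vanna is used, intentionally, in two places as these two sensitivities are mathematically equivalent” (Wikipedia contributors, 2009).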

Remark 2.7 First-order linear approximations of the loss distribution play an important role in risk management, for example when estimating the value at risk (VaR) of a stock portfolio. If the risk-factor changes have a multivariate normal distribution, then a linear combination of them is also normally distributed and it is not difficult to find the mean $\mu_p$ and variance $\sigma_p^2$ of the portfolio loss. VaR is the $\alpha$-quantile of the loss distribution over a specified period. In this case, the calculations simplify to $\mathrm{VaR}_\alpha = \mu_p + \sigma_p\Phi^{-1}(\alpha)$ because the normal distribution belongs to a location-scale family. Hence it is clear why this procedure is called variance-covariance method or delta-normal approach in the literature (see e.g. McNeil, Frey, and Embrechts, 2005, Section 2.3.1).

For spot or forward positions in the underlying, the delta approach is fully accurate, because the associated price function . . . is linear in the underlying. The delta approximation . . . is the foundation of delta hedging: A position in the underlying asset whose size is minus the delta of the derivative is a hedge of changes in price of the derivative, if continually re-set as delta changes, and if the underlying price does not jump (Duffie and Pan,2001, Section 3.1).
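The delta-normal formula $\mathrm{VaR}_\alpha = \mu_p + \sigma_p\Phi^{-1}(\alpha)$ of Remark 2.7 can be sketched in a few lines. The example assumes a linearised loss $L = -\sum_i e_i X_i$, where $e_i$ are currency exposures and $X_i$ are jointly normal risk-factor changes with a given mean vector and covariance matrix. The exposures and covariances are hypothetical numbers chosen for illustration, and the function name is not taken from any library.

```python
import math
from statistics import NormalDist

def delta_normal_var(exposures, mean, cov, alpha=0.99):
    """VaR_alpha = mu_p + sigma_p * Phi^{-1}(alpha) for the linearised loss L = -sum_i e_i X_i."""
    d = len(exposures)
    mu_p = -sum(exposures[i] * mean[i] for i in range(d))
    var_p = sum(exposures[i] * cov[i][j] * exposures[j]
                for i in range(d) for j in range(d))
    sigma_p = math.sqrt(var_p)
    return mu_p + sigma_p * NormalDist().inv_cdf(alpha)

# Hypothetical two-asset portfolio with daily risk-factor changes.
exposures = [1_000_000.0, 500_000.0]
mean = [0.0, 0.0]
cov = [[0.0001, 0.00005],
       [0.00005, 0.0004]]

var_99 = delta_normal_var(exposures, mean, cov, 0.99)
var_95 = delta_normal_var(exposures, mean, cov, 0.95)
```

Because only the mean and the covariance matrix enter, the calculation is fast, but it inherits the limitations of the linear approximation discussed in the remarks that follow.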

Remark 2.8 Applying the delta method to an option portfolio results in a poor approximation of the true change in value because an option price is a highly nonlinear function of $(t, S_t, \sigma, r, q)$. A better solution is given by a second-order Taylor expansion. For a general portfolio value process $V(t, X_t)$ that depends on a $d$-dimensional risk factor $X_t$, the delta-gamma method

$$\delta V_t \approx \theta\,\delta t + \Delta'\,\delta X_t + \frac{1}{2}\,\delta X_t'\,\Gamma\,\delta X_t \tag{2.35}$$

approximates the change in portfolio value $\delta V_t = V_{t+\delta t} - V_t$ over a short fixed time $\delta t$ as a function of risk-factor changes $\delta X_t = X_{t+\delta t} - X_t$. The symbol $'$ denotes the transpose. The Greeks of the portfolio are $\theta = \frac{\partial V_t}{\partial t}$, $\Delta = \left[\frac{\partial V_t}{\partial X_{t,1}}, \dots, \frac{\partial V_t}{\partial X_{t,d}}\right]'$ (gradient) and $\Gamma = [\Gamma_{ij}]$, a $d \times d$ matrix (Hessian) with $\Gamma_{ij} = \frac{\partial^2 V_t}{\partial X_{t,i}\,\partial X_{t,j}}$. Duffie and Pan (2001, Section 4) show how to calculate the portfolio VaR.
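A one-dimensional instance of Eq. (2.35) shows how much the gamma term matters for an option. The sketch below assumes a single BS call on a non-dividend paying stock (so $d = 1$ and $X_t = S_t$), an arbitrary spot move `dS` over one trading day `dt`, and compares the delta and delta-gamma approximations with full revaluation. All names and numbers are illustrative assumptions.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def call_price(S, tau, K=100.0, r=0.02, sigma=0.2):
    """BS call price with q = 0 and time to expiry tau."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * Phi(d1) - K * math.exp(-r * tau) * Phi(d2)

def call_greeks(S, tau, K=100.0, r=0.02, sigma=0.2):
    """Delta, gamma and theta of the call, Eqs. (2.30), (2.31), (2.33) with q = 0."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    delta = Phi(d1)
    gamma = phi(d1) / (S * sigma * math.sqrt(tau))
    theta = (-S * sigma * phi(d1) / (2.0 * math.sqrt(tau))
             - r * K * math.exp(-r * tau) * Phi(d2))
    return delta, gamma, theta

S, tau = 100.0, 0.25           # spot and time to expiry (assumed)
dS, dt = 5.0, 1.0 / 252.0      # spot move and one trading day (assumed)

delta, gamma, theta = call_greeks(S, tau)
exact = call_price(S + dS, tau - dt) - call_price(S, tau)   # full revaluation
approx_delta = theta * dt + delta * dS                       # first-order terms only
approx_delta_gamma = approx_delta + 0.5 * gamma * dS ** 2    # Eq. (2.35) with d = 1

err_delta = abs(exact - approx_delta)
err_delta_gamma = abs(exact - approx_delta_gamma)
```

For this move of five percent in the underlying, `err_delta_gamma` is much smaller than `err_delta`, although a residual error from third-order and cross terms remains.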

Remark 2.9 In his PhD thesis, Studer studied the delta-gamma method and noted that it “captures a part of the non-linearity of option portfolios. Nevertheless heavy-tailedness is not included and we have the problem of estimating a covariance matrix [of $X_t$]. Finally for the last step [finding the distribution of $\delta V_t$ for risk management purposes] we have to rely on numerical methods” (2001, p. 11).

Assuming a BS framework, Studer refined the delta-gamma method in Proposition 4.9 by using stochastic Taylor expansions to approximate the “distribution of the change in value of a portfolio . . . of positions in assets and derivatives in the market” (2001, p. 68).

2.4 No-arbitrage conditions and option bounds

The value of a contingent claim at expiry date $T$ equals its payoff, $\pi_T(\psi_T(S_T)) = \psi_T(S_T)$, and hence it is obvious that $C_T(S_T, K, T) - P_T(S_T, K, T) = \max(S_T - K, 0) - \max(K - S_T, 0) = S_T - K$. A simple no-arbitrage argument shows that this equality must also hold for $t < T$ when $K$ is discounted appropriately, the options are of European style and the stock does not pay dividends,

$$C_t - P_t = S_t - e^{-r(T-t)}K. \tag{2.36}$$

Eq. (2.36) is called put-call parity and is model-free, i.e. only based on the specific form of European option payoff functions, similar to the Breeden-Litzenberger result in Remark 2.4. The put-call parity also holds for the BS formula (here with $q = 0$),

$$\begin{aligned}
C_t^{BS} &= S_t\Phi(d_1) - K e^{-r(T-t)}\Phi(d_2) \\
P_t^{BS} &= K e^{-r(T-t)}\Phi(-d_2) - S_t\Phi(-d_1) \\
\Rightarrow\ C_t^{BS} - P_t^{BS} &= S_t - K e^{-r(T-t)}
\end{aligned}$$

since $\Phi(d_i) + \Phi(-d_i) = \Phi(d_i) + (1 - \Phi(d_i)) = 1$ for $i = 1, 2$. If the stock pays dividends, the present value of the dividends that will be paid out before the option’s expiry date $T$ needs to be subtracted from $S_t$ in Eq. (2.36). If we assume a dividend yield at constant rate $q$, the put-call parity becomes

$$C_t - P_t = e^{-q(T-t)}S_t - e^{-r(T-t)}K. \tag{2.37}$$

$C_t, P_t \ge 0$, hence the following lower and upper bounds for European option prices are implied by Eq. (2.37):

\begin{align}
\max\left(e^{-q(T-t)}S_t - e^{-r(T-t)}K,\ 0\right) &\le C_t \le S_t \tag{2.38}\\
\max\left(e^{-r(T-t)}K - e^{-q(T-t)}S_t,\ 0\right) &\le P_t \le e^{-r(T-t)}K. \tag{2.39}
\end{align}

From Eq. (2.28) it follows that $\frac{\partial C_t}{\partial K} < 0$ since $0 \le \int_0^K q_{t,T}(S_T)\,dS_T < 1$ for $0 \le K < \infty$, and by combining Eq. (2.28) with Eq. (2.37), we conclude that

$$\frac{\partial P_t}{\partial K} = \frac{\partial\left(C_t - e^{-q(T-t)}S_t + e^{-r(T-t)}K\right)}{\partial K} = e^{-r(T-t)}\int_0^K q_{t,T}(S_T)\,dS_T > 0. \tag{2.40}$$

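In the BS framework, a similar change of variable as in Eq. (2.11) leads to an explicit expression for $Q[S_T \le K \mid S_t]$ since

$$\int_0^K q_{t,T}(S_T)\,dS_T = \Phi\left(\frac{\log K - \left(r - \frac{1}{2}\sigma^2\right)(T - t) - \log S_t}{\sigma\sqrt{T - t}}\right) = \Phi(-d_2) = 1 - \Phi(d_2). \tag{2.41}$$

The partial derivatives w.r.t. $K$ are then given by

\begin{align}
\frac{\partial C_t^{BS}}{\partial K} &= -e^{-r(T-t)}\Phi(d_2) < 0 \tag{2.42}\\
\frac{\partial P_t^{BS}}{\partial K} &= e^{-r(T-t)}\{1 - \Phi(d_2)\} > 0. \tag{2.43}
\end{align}

As a consequence of $\frac{\partial C_t}{\partial K} < 0$ and $\frac{\partial P_t}{\partial K} > 0$, the strike monotonicity of European options is a no-arbitrage condition: for $K_1 < K_2$ and all other variables fixed,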

\begin{align}
C_t(K_1) &> C_t(K_2) \tag{2.44}\\
P_t(K_1) &< P_t(K_2). \tag{2.45}
\end{align}

When dealing with European options on non-dividend paying stocks ($q = 0$), there exists a maturity monotonicity due to $Q(S_{T_1} > K \mid S_t) < Q(S_{T_2} > K \mid S_t)$ for $T_1 < T_2$ under usual economic conditions ($r > 0$). In such a case, $\frac{\partial C_t}{\partial \tau} = -\frac{\partial C_t}{\partial t} > 0$: for $\tau_1 < \tau_2$ and all other variables fixed,

\begin{align}
C_t(\tau_1) &\le C_t(\tau_2) \tag{2.46}\\
P_t(\tau_1) - e^{-r\tau_1}K &\le P_t(\tau_2) - e^{-r\tau_2}K. \tag{2.47}
\end{align}

A long (short) butterfly spread with strikes $K_1 < K_2 < K_3$ is an option strategy that is neutral in the underlying at level $K_2$ and bearish (bullish) in volatility, i.e. a trader taking such a position does not assume anything about the direction in which $S_t$ moves relative to $K_2$ as $t \to T$ ($S_T \lessgtr K_2$?), but she believes in decreasing (increasing) volatility such that $S_T$ is close to (far away from) $K_2$. A long butterfly can be created by going long 1 call with strike $K_1$, short 2 calls with strike $K_2$ and long 1 call with strike $K_3$, all with the same maturity $T$, or by doing the same with puts. The payoff function of a long butterfly is shaped like an upside-down V. There is a non-zero probability that the payoff $\Psi_T$ is positive, hence it must have a non-negative price by no-arbitrage:

$$C_t(K_1) - 2C_t(K_2) + C_t(K_3) \ge 0 \tag{2.48}$$

$$P_t(K_1) - 2P_t(K_2) + P_t(K_3) \ge 0. \tag{2.49}$$

Cassese and Guidolin (2004) also discuss additional no-arbitrage conditions: reverse strike monotonicity (for $K_1 < K_2$ and all other variables fixed),

$$(K_1 - K_2)\,e^{-r\tau} \le C_t(K_2) - C_t(K_1) \tag{2.50}$$

$$P_t(K_2) - P_t(K_1) \le (K_2 - K_1)\,e^{-r\tau}, \tag{2.51}$$

box spreads,

$$[P_t(K_2) - C_t(K_2)] - [P_t(K_1) - C_t(K_1)] = (K_2 - K_1)\,e^{-r\tau}, \tag{2.52}$$

and maturity spreads (for $\tau_1 < \tau_2$ and all other variables fixed),

$$[P_t(\tau_2) - C_t(\tau_2)] - [P_t(\tau_1) - C_t(\tau_1)] = K\left(e^{-r\tau_2} - e^{-r\tau_1}\right). \tag{2.53}$$
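As an illustration of how conditions (2.44), (2.48) and (2.50) might be screened on a cross-section of call quotes with a common maturity, consider the following minimal Python sketch. The function name `check_call_strikes`, the tolerance `tol` and the sample quotes are hypothetical and not taken from the thesis. Two simplifications are assumed: strike monotonicity is tested in its weak form, and the butterfly condition is only tested on consecutive strike triples that are equidistant.

```python
import math


def check_call_strikes(strikes, calls, r, tau, tol=1e-12):
    """Return a list of violated no-arbitrage conditions.

    strikes: increasing strikes K_1 < K_2 < ... (same maturity)
    calls:   observed call prices C_t(K_i), in the same order
    r, tau:  risk-free rate and time to maturity
    """
    violations = []
    disc = math.exp(-r * tau)

    for i in range(len(strikes) - 1):
        k1, k2 = strikes[i], strikes[i + 1]
        c1, c2 = calls[i], calls[i + 1]
        # Weak form of Eq. (2.44): the call price must not increase in K.
        if c1 < c2 - tol:
            violations.append(("strike monotonicity", k1, k2))
        # Eq. (2.50): (K1 - K2) exp(-r tau) <= C(K2) - C(K1).
        if c2 - c1 < (k1 - k2) * disc - tol:
            violations.append(("reverse strike monotonicity", k1, k2))

    for i in range(len(strikes) - 2):
        k1, k2, k3 = strikes[i], strikes[i + 1], strikes[i + 2]
        # Assumption: Eq. (2.48) is only applied to equidistant strikes.
        if abs((k2 - k1) - (k3 - k2)) > 1e-9:
            continue
        if calls[i] - 2.0 * calls[i + 1] + calls[i + 2] < -tol:
            violations.append(("butterfly", k1, k2, k3))

    return violations


if __name__ == "__main__":
    # Hypothetical quotes: the middle price is too high for the butterfly.
    print(check_call_strikes([90, 100, 110], [12.0, 8.0, 1.5], r=0.02, tau=0.5))
```

In an empirical application the caveats discussed in the text still apply: a flagged violation in end-of-day data need not be an exploitable arbitrage once bid-ask spreads, transaction costs and asynchronous quotes are taken into account.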

Testing these no-arbitrage conditions and option price bounds empirically is rather tricky. Synchrony of option and equity prices is absolutely essential, but not necessarily ensured when using end-of-day settlement data. It is also important to consider the persistence of detected arbitrage opportunities. Market microstructure (Corsi, 2005; Bandi and Russell, 2008), transaction costs and dividends need to be taken into consideration. Put-call parity (2.37) and all derived results only hold for European options. For an overview of the classical empirical literature on testing no-arbitrage conditions in option prices, see Hull (2002, Section 8.8).

2.5 Criticism of BS framework

Undoubtedly, the assumptions made in the BS framework are unrealistic. The fact that financial markets are not frictionless lies at the heart of market microstructure theory. A continuously rebalanced hedge of option positions, with or without transaction costs, cannot be realized in practice.

“The many improvements on Black-Scholes are rarely improvements, the best that can be said for many of them is that they are just better at hiding their faults. Black-Scholes also has its faults, but at least you can see them” (Wilmott, 2008).

The main flaw of the BS framework is the assumed asset price dynamics with constant volatility, only driven by independent Gaussian increments. This has led to extensive research in option pricing theory. More realistic continuous-time models and different concepts of volatility will be introduced in the next chapter.


Chapter 3

The Implied Volatility Surface

The only unobservable variable in the BS framework is the most crucial one, the volatility $\sigma$. By equating the observed market price ($C_t$, $P_t$) of an option with the BS price and implicitly solving for $\tilde{\sigma}^{IV}$ in

$$\mathrm{BS}_t(S_t, \tilde{\sigma}^{IV}, \text{cp flag}, K, T, r, q) \stackrel{!}{=} C_t\,\mathbf{1}_{\{\text{cp flag}=1\}} + P_t\,\mathbf{1}_{\{\text{cp flag}=0\}}, \tag{3.1}$$

an implied volatility (IV) can be found numerically. $\tilde{\sigma}^{IV}$ is unique due to the monotonicity of the BS price in $\sigma$, see Eq. (2.30). According to the BS assumptions, this implicitly calculated volatility should be constant. Cassese and Guidolin remark that “since Rubinstein (1985), it is well known that option markets are characterized by systematic deviations from the constant volatility benchmark of Black and Scholes (1973), a fact that has become even more evident after the world market crash of October 1987” (2006, p. 146).
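Because the BS price is monotone in $\sigma$, Eq. (3.1) can be solved with any bracketing root finder. The following minimal Python sketch uses plain bisection. It assumes the standard BS formula with a continuous dividend yield $q$; the function names, the search interval `[lo, hi]` and the tolerance are illustrative assumptions, and the thesis does not state which numerical method was actually used.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(S, K, tau, r, q, sigma, is_call):
    """BS price with continuous dividend yield q (assumed standard form)."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    call = S * math.exp(-q * tau) * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)
    if is_call:
        return call
    # Put via put-call parity.
    return call - S * math.exp(-q * tau) + K * math.exp(-r * tau)


def implied_vol(price, S, K, tau, r, q, is_call,
                lo=1e-6, hi=5.0, tol=1e-10, max_iter=200):
    """Solve BS(sigma) = price for sigma by bisection on [lo, hi]."""
    f_lo = bs_price(S, K, tau, r, q, lo, is_call) - price
    f_hi = bs_price(S, K, tau, r, q, hi, is_call) - price
    if f_lo > 0.0 or f_hi < 0.0:
        raise ValueError("price is not bracketed by the volatility interval")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        # The BS price is increasing in sigma, so keep the half with the root.
        if bs_price(S, K, tau, r, q, mid, is_call) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Round trip with illustrative inputs: price an option, then recover sigma.
    S, K, tau, r, q = 1190.16, 1200.0, 0.25, 0.03, 0.01
    p = bs_price(S, K, tau, r, q, 0.25, is_call=True)
    print(implied_vol(p, S, K, tau, r, q, is_call=True))
```

Bisection is slow but robust. A market price that violates the bounds (2.38) or (2.39) has no IV, which the sketch reports by raising a `ValueError`.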

To visualize how far BS assumptions and reality are apart, IVs for options on the S&P 500 index with different strikes K and expiry dates T are calculated on t = 10 August 2001 and plotted in Figure 3.1. IV is not constant, contrary to what is assumed in deriving the BS formula. Instead, ‘smiles’ and ‘smirks’ across the K-axis as well as a term structure along the T-axis can be seen.

[Figure: Implied volatilities of S&P 500 index options, t = 10 August 2001. Axes: K, T, IV. Expiry dates *1 to *8: 18 Aug 2001, 22 Sep 2001, 20 Oct 2001, 22 Dec 2001, 16 Mar 2002, 22 Jun 2002, 21 Dec 2002, 21 Jun 2003.]

Figure 3.1: Scatter-plot of IVs for 214 calls and 147 puts with different strikes K and expiry dates T. The underlying S&P 500 index closed at 1,190.16 points on 10 August 2001.

Definition 3.1 (IVS in absolute coordinates) The mapping

$$\tilde{\sigma}_t^{IV}: (K, T) \mapsto \tilde{\sigma}_t^{IV}(K, T) \tag{3.2}$$

is called the implied volatility surface (IVS) in absolute coordinates.

Plugging $S_t$, $K$, $r$, $T$ and $\tilde{\sigma}_t^{IV}(K, T)$ back into the BS formula leads (by definition of IV) to the observed market price. As is usual in the IV literature, we describe the IVS in relative coordinates.

Definition 3.2 (Relative coordinates) Moneyness is an increasing function of the strike $K$, in general possibly also depending on time $t$, expiry date $T$, the spot price of the underlying security $S_t$ and the risk-free interest rate $r$. If not stated otherwise, moneyness is defined as $m = K/S_t$ throughout this thesis. $\tau = T - t$ is called time to maturity.

Strike and expiry date are fixed in the contract specification of each option. They can be easily derived from relative coordinates, $(K, T) = (m \cdot S_t, \tau + t)$.

        In-the-money (ITM)   At-the-money (ATM)   Out-of-the-money (OTM)
Call    m < 1                m = 1                m > 1
Put     m > 1                m = 1                m < 1

Table 3.1: Moneyness categories for options when moneyness is defined as $m = K/S_t$.

At any point in time during its lifetime, an option is either in-the-money (ITM), at-the-money (ATM) or out-of-the-money (OTM).

Definition 3.3 (Intrinsic value, time value) The ITM part of the option value is called intrinsic value at time $t$,

$$\max(S_t - K, 0)\,\mathbf{1}_{\{\text{cp flag}=1\}} + \max(K - S_t, 0)\,\mathbf{1}_{\{\text{cp flag}=0\}}. \tag{3.3}$$

The time value is the difference between option value and intrinsic value.

The former reflects the (hypothetical) value of immediate exercise of the option and the latter the value of holding on to the option. The time value is usually decreasing as time t approaches the expiry date T. As Hull points out, “an exception to this could be an in-the-money European put option on a non-dividend-paying stock or an in-the-money European call option on a currency with a very high interest rate” (2002, p. 310). This can also be seen in the BS framework.

From Eq. (2.33) it follows that $\theta_t^{BS} = \frac{\partial BS_t}{\partial t}$ is not always negative and might be positive in the aforementioned cases. A rational option holder exercises a European option at $T$ only if its intrinsic value is positive, which implies that the option is ITM. A moneyness classification is given in Table 3.1.
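Definition 3.3 and the classification in Table 3.1 translate directly into code. The following minimal Python sketch uses hypothetical function names. The tolerance `atm_tol` is an assumption that is not part of the thesis: since an observed moneyness rarely equals 1 exactly, a small band around $m = 1$ may be treated as ATM.

```python
def intrinsic_value(S: float, K: float, is_call: bool) -> float:
    """Intrinsic value as in Eq. (3.3); is_call plays the role of cp flag = 1."""
    return max(S - K, 0.0) if is_call else max(K - S, 0.0)


def time_value(price: float, S: float, K: float, is_call: bool) -> float:
    """Time value = option value minus intrinsic value."""
    return price - intrinsic_value(S, K, is_call)


def moneyness_category(m: float, is_call: bool, atm_tol: float = 0.0) -> str:
    """Classify an option by moneyness m = K / S_t following Table 3.1.

    atm_tol is a hypothetical half-width of the ATM band around m = 1.
    """
    if abs(m - 1.0) <= atm_tol:
        return "ATM"
    if is_call:
        return "ITM" if m < 1.0 else "OTM"
    return "ITM" if m > 1.0 else "OTM"


if __name__ == "__main__":
    S_t = 1190.16  # S&P 500 close on 10 August 2001, as quoted in Figure 3.1
    K = 1100.0     # illustrative strike
    print(moneyness_category(K / S_t, is_call=True))   # call with m < 1
    print(moneyness_category(K / S_t, is_call=False))  # put with m < 1
    print(intrinsic_value(S_t, K, is_call=True))
```

With `atm_tol=0.0` the classification follows Table 3.1 literally, so only $m = 1$ counts as ATM.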

Definition 3.4 (IVS in relative coordinates)

$$\sigma_t^{IV}: (m, \tau) \mapsto \sigma_t^{IV}(m, \tau) = \tilde{\sigma}_t^{IV}(m \cdot S_t,\ t + \tau) \tag{3.4}$$

is called the IVS in relative coordinates.
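The change between absolute and relative coordinates in Definitions 3.2 and 3.4 can be sketched as follows. The function names are hypothetical, and times are assumed to be measured in years as floating-point numbers, a convention that the thesis does not fix at this point.

```python
def to_relative(K: float, T: float, S_t: float, t: float):
    """Map absolute coordinates (K, T) to relative ones (m, tau)."""
    return K / S_t, T - t


def to_absolute(m: float, tau: float, S_t: float, t: float):
    """Map relative coordinates (m, tau) back to (K, T) = (m * S_t, tau + t)."""
    return m * S_t, tau + t


if __name__ == "__main__":
    S_t, t = 1190.16, 0.0   # index level from Figure 3.1; t = 0 is an assumption
    K, T = 1200.0, 0.5      # illustrative contract
    m, tau = to_relative(K, T, S_t, t)
    print(m, tau)
    print(to_absolute(m, tau, S_t, t))
```

Since $S_t$ and $t$ change from day to day, the same contract $(K, T)$ is mapped to a different point $(m, \tau)$ on each observation date.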

Figure 3.2 shows the IV plot of the S&P 500 index options mentioned earlier in relative coordinates. ‘Smiles’ and ‘smirks’ appear to form strings: there are only a few expiry dates, but a larger number of options with different strikes per expiry date. This is due to institutional and practical conventions. Terms and conditions of exchange-traded options are standardized. The difference between successive expiry dates is usually one month for short times to maturity (τ ≤ 3 months) and three months for long times to maturity. The scatter-plot looks different on another day, as IV levels, string shapes and (m, τ) locations are all functions of time.

[Figure: Implied volatilities of S&P 500 index options, t = 10 August 2001. Axes: m, τ, IV.]

Figure 3.2: IVs of S&P 500 index options on t = 10 August 2001 are plotted against relative coordinates (m, τ) = (K/S_t, T − t). IVs of 214 calls are blue and IVs of 147 puts are red.

Definition 3.5 (Degenerated option data structure) The fact that only a discrete set of strikes with a small number of maturities is available at each moment in time is called the degenerated structure of option data. The data is sparse and unequally distributed over the (m, τ) plane and arranged in IV strings that, as time passes, move deterministically along the τ-axis towards zero and randomly along the m-axis.
