THE ESTIMATION ERROR IN THE CHAIN-LADDER RESERVING METHOD: A BAYESIAN APPROACH

BY

ALOIS GISLER

ABSTRACT

We derive the estimation error in a Bayesian framework and discuss the estimates of Mack [2] and of Buchwalder, Bühlmann, Merz and Wüthrich (BBMW) [1] from a Bayesian point of view.

1. INTRODUCTION

In [1] the authors BBMW revisit the chain ladder method on the basis of a time-series model and suggest a new formula for the estimation error, which differs from the formula in the original paper [2] of Mack. The estimates resulting from the new formula are always greater than the ones resulting from Mack. In [3] Mack, Quarg and Braun comment on the approach taken in [1] and question whether the new formula is an improvement. Hence there is some controversy about how the estimation error should be estimated.

In this contribution we first briefly discuss the different formulae of Mack and BBMW in Section 2. The main part is Section 3, where we tackle the question of the estimation error in a new way by means of a Bayesian approach. We will consider two Bayesian models with a limiting prior, where the Bayes estimators of the unknown chain ladder factors coincide with the classical chain ladder estimates $\hat f_j$, such that the mean square error of this Bayes estimator is comparable with the estimation error of the classical chain ladder method.

By this different approach we hope to contribute to the discussion and to give some further insight into the problem.

Wherever possible and where not otherwise stated, we use the same notation as in [1].

2. DISCUSSION ON THE ESTIMATES OF BBMW AND MACK

The (conditional) estimation error considered in [1] and [2] is

$$e_{k,j} := E\left[\left(\hat X_{k,j} - E[X_{k,j} \mid \mathcal{D}_K]\right)^2 \,\middle|\, \mathcal{D}_K\right] = X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} \hat f_l - \prod_{l=K-k+1}^{j-1} f_l\right)^{\!2}. \tag{2.1}$$

In the original paper [2] Mack suggests estimating $e_{k,j}$ by

$$\hat e_{k,j} = X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) \sum_{l=K-k+1}^{j-1} \frac{\hat\sigma_l^2 / \hat f_l^{\,2}}{S_l}. \tag{2.2}$$

The reasoning behind this formula is well documented in [2].

First we should note that (2.2) also takes into account the information contained in the upper right corner $\mathcal{D}_K^O$ and that it can therefore also be considered as a conditional approach.

BBMW suggest estimating $e_{k,j}$ by

$$\hat e_{k,j} = X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} E\big[\hat F_l^2 \,\big|\, \mathcal{B}_{K-k+1}\big] - \prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) \tag{2.3}$$

$$\phantom{\hat e_{k,j}} = X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} \left(\hat f_l^{\,2} + \frac{\hat\sigma_l^2}{S_l}\right) - \prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right), \tag{2.4}$$

where in (2.4) $\sigma_l^2$ was replaced by $\hat\sigma_l^2$. The reasoning behind (2.4) is explained in [1], but the arguments are not so obvious and not so easy to follow.

The main objection in [3] to the new approach of BBMW is that, by taking the product
$$\prod_{l=K-k+1}^{j-1} E\big[\hat F_l^2 \,\big|\, \mathcal{B}_{K-k+1}\big]$$
in (2.3), this formula disregards the negative correlation between the random variables $\hat F_l^2$ given $\mathcal{B}_{K-k+1}$. Hence, on average over all upper right triangles $\mathcal{D}_K^O$, (2.3) has a systematic upward bias, i.e. for $\hat e_{k,j}$ as in (2.3) we have that $E[\hat e_{k,j} \mid \mathcal{B}_{K-k+1}] > E[e_{k,j} \mid \mathcal{B}_{K-k+1}]$ for all $f_j$.

The point is that we are interested in the conditional estimation error given the whole triangle $\mathcal{D}_K$ including the upper right corner $\mathcal{D}_K^O$. This conditional estimation error given by (2.1) is a real number, which is unknown to us, because we do not know the true chain ladder factors $f_j$. As a consequence, given the whole triangle $\mathcal{D}_K$ and hence the resulting estimates $\hat f_j$, for some values of the underlying $f_j$ (2.4) will be closer to the true estimation error (2.1), and for others (2.2). The question is which of the two methods gives on average a more accurate result.


But how should this average be defined? In [3] the authors consider the average over all possible upper right corners $\mathcal{D}_K^O$ given $\mathcal{B}_{K-k+1}$. This is one possible point of view, but not the only one.

Another possibility is to vary the underlying true chain ladder factors $f_j$ and to take the average over the set of all possible underlying $f_j$ which might be behind the observed triangle $\mathcal{D}_K$. It seems to me that this point of view is even better suited to the conditional situation, and hence to the estimation of the conditional estimation error, where the triangle $\mathcal{D}_K$ is given and where the true factors $f_j$ are unknown.

However, averaging over the set of possible $f_j$ can only be done in a strict mathematical way within a Bayesian framework. This is the subject of the next section.

3. DERIVATION OF THE ESTIMATION ERROR IN A BAYESIAN FRAMEWORK

In the Bayesian framework, we assume that the true chain ladder factors of a given triangle are a realisation

$$f = (f_1, \ldots, f_{J-1})$$

of a random vector

$$F = (F_1, \ldots, F_{J-1}),$$

and that conditionally, given F, the chain ladder assumptions hold true. In a Bayesian model, we further have to specify the family of conditional distributions on the one hand and the prior distribution on the other hand.

Conditionally given F we make the same assumption as (M2) of [1], namely that conditionally the $X_{k,j}$ ($j = 1, \ldots, J$) form a Markov chain. Strictly speaking, this is a slightly stronger assumption than the original chain ladder assumption in Mack [2]. In [2] it is only assumed that the first and second order moments of $X_{k,j+1}$ depend on $X_{k,j}$ alone and not on $X_{k,l}$ for $l < j$; it is not assumed that this property holds true for the whole distribution of $X_{k,j+1}$.

In the following two subsections we will consider two different Bayes models. In both models we will consider a limiting case of a prior under which the Bayes estimators of the $F_j$ as well as of $X_{k,j}$ given the triangle $\mathcal{D}_K$ coincide with the classical estimators $\hat f_j$ resp. $\hat X_{k,j}$. Hence the mean square error of this Bayes estimator is comparable with the estimation error of the classical chain ladder model.

Instead of the random variables $X_{k,j+1}$ we will rather consider the random variables
$$Y_{k,j} = \frac{X_{k,j+1}}{X_{k,j}}.$$
We denote by $Y_j = (Y_{1,j}, \ldots, Y_{K-j,j})$ the corresponding random vector and by $y_j = (y_{1,j}, \ldots, y_{K-j,j})$ a realisation of $Y_j$.

We will also distinguish between the random variable
$$\hat F_j = \sum_{k=1}^{K-j} \frac{X_{k,j}}{S_j}\, Y_{k,j}$$
and the observed chain ladder factor $\hat f_j$ as a realisation of $\hat F_j$.

3.1. Bayesian model I: normal distribution

In this model we assume that conditionally, given F, the $Y_{k,j}$ are normally distributed, i.e. we make the following assumptions:

(N1) Conditionally, given F and $\mathcal{B}_j$, the random variables $Y_{k,j}$ ($k = 1, \ldots, K-j$) are independent and normally distributed with moments
$$E[Y_{k,j} \mid F_j] = F_j, \qquad \mathrm{Var}[Y_{k,j} \mid F_j, X_{k,j}] = \frac{\sigma_j^2}{X_{k,j}}.$$

(N2) $F_1, \ldots, F_{J-1}$ are independent and uniformly distributed on $(-a, a)$.

Remarks:

The time-series model considered in [1], with the additional assumption that the $\varepsilon_{k,j}$ are normally distributed, assumes that (conditionally, given $F = f$) the triangle $\mathcal{D}_K$ is generated by
$$X_{k,j+1} = f_j X_{k,j} + \sigma_j \sqrt{X_{k,j}}\, \varepsilon_{k,j+1},$$
where the $\varepsilon_{k,j}$ are independent and $\varepsilon_{k,j} \sim N(0,1)$.

Assumption (N1) is slightly more general, because the random variables $\varepsilon_{k,j+1}$ ($k = 1, \ldots, K-j$) are only assumed to be conditionally independent given the observations $X_{k,j}$ in column $j$.

Assumption (N1) could be generalised by allowing the $\sigma_j$ to depend on the observations $X_{k,j}$ in column $j$.

Because of the normality assumption, some of the $X_{k,j}$ could become negative. A negative value of $X_{k,j}$ contradicts the chain ladder assumption, since the conditional variance of $X_{k,j+1}$ would then become negative. Therefore, from a theoretical point of view, this model is a bit questionable, as is the time-series model in [1]. We consider it here to get a direct Bayesian counterpart to the time-series model of [1] with normally distributed $\varepsilon_{k,j}$. In the next subsection we will consider a model which no longer has this theoretical deficiency. However, as we will see, it has other drawbacks.
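A quick simulation illustrates assumption (N1) read as a time-series model. The parameters $f_j$, $\sigma_j$ and the column values below are hypothetical values chosen only for illustration; the point is that the chain-ladder factor $\hat F_j$ is conditionally unbiased for $f_j$ with variance $\sigma_j^2 / S_j$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for one development step (not from the paper):
f_j, sigma_j = 1.3, 2.0
X_j = np.array([100.0, 120.0, 90.0, 110.0])   # observed column j
S_j = X_j.sum()

# Simulate X_{k,j+1} = f_j X_{k,j} + sigma_j sqrt(X_{k,j}) eps_{k,j+1}
n = 200_000
eps = rng.standard_normal((n, X_j.size))
X_next = f_j * X_j + sigma_j * np.sqrt(X_j) * eps

F_hat = X_next.sum(axis=1) / S_j   # chain-ladder factor for each simulated column
print(F_hat.mean())                # ≈ f_j : conditional unbiasedness
print(F_hat.var() * S_j)           # ≈ sigma_j^2 : Var[F_hat] = sigma_j^2 / S_j
```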

We now derive the Bayes estimator and its mean square error. On the condition $F = f$ and given $\mathcal{B}_1$, the joint density of $Y_1, \ldots, Y_{J-1}$ is given by

$$u_f(y_1, \ldots, y_{J-1}) = \prod_{j=1}^{J-1} \prod_{k=1}^{K-j} \frac{1}{\sqrt{2\pi}\,\sigma_j / \sqrt{x_{k,j}}}\, \exp\left\{-\frac{1}{2}\, \frac{x_{k,j}}{\sigma_j^2}\, \big(y_{k,j} - f_j\big)^2\right\}, \tag{3.1}$$

where $x_{k,j} = X_{k,1} \cdot y_{k,1} \cdots y_{k,j-1}$.

From (3.1) we obtain for the joint posterior distribution of F given the triangle $\mathcal{D}_K$
$$u(f \mid \mathcal{D}_K) \propto \prod_{j=1}^{J-1} \exp\left\{-\frac{1}{2} \sum_{k=1}^{K-j} \frac{X_{k,j}}{\sigma_j^2}\, \big(Y_{k,j} - f_j\big)^2\right\} \quad \text{for } f_j \in (-a, a). \tag{3.2}$$

The exponent can be written as
$$\sum_{k=1}^{K-j} \frac{X_{k,j}}{\sigma_j^2}\, \big(Y_{k,j} - f_j\big)^2 = \frac{S_j}{\sigma_j^2}\, \big(f_j - \hat F_j\big)^2 + h(\mathcal{D}_K), \tag{3.3}$$
where $h(\mathcal{D}_K)$ depends only on the observations $\mathcal{D}_K$ and not on the unknown parameters $f_j$. Inserting (3.3) into (3.2) yields

$$u(f \mid \mathcal{D}_K) \propto \prod_{j=1}^{J-1} \exp\left\{-\frac{1}{2}\, \frac{S_j}{\sigma_j^2}\, \big(f_j - \hat f_j\big)^2\right\} \quad \text{for } f_j \in (-a, a). \tag{3.4}$$

Now we make the prior non-informative by considering the limiting case $a \to \infty$. Then we obtain immediately from (3.4) that the posterior density of F is given by

$$u(f \mid \mathcal{D}_K) = \prod_{j=1}^{J-1} \frac{1}{\sqrt{2\pi}\,\sigma_j / \sqrt{S_j}}\, \exp\left\{-\frac{1}{2}\, \frac{S_j}{\sigma_j^2}\, \big(f_j - \hat f_j\big)^2\right\}, \qquad f_j \in \mathbb{R}. \tag{3.5}$$

From (3.5) we see that a posteriori, given the observed triangle $\mathcal{D}_K$, the random variables $F_1, \ldots, F_{J-1}$ are independent and normally distributed with
$$E[F_j \mid \mathcal{D}_K] = \hat f_j, \qquad \mathrm{Var}[F_j \mid \mathcal{D}_K] = \frac{\sigma_j^2}{S_j}.$$

Hence the Bayes estimator of $X_{k,j}$ is
$$\hat X_{k,j}^{\text{Bayes}} = E[X_{k,j} \mid \mathcal{D}_K] = X_{k,K-k+1} \prod_{l=K-k+1}^{j-1} E[F_l \mid \mathcal{D}_K] = X_{k,K-k+1} \prod_{l=K-k+1}^{j-1} \hat f_l = \hat X_{k,j},$$

and for the mean square error of the Bayes estimator we obtain
$$E\left[\left(\hat X_{k,j} - X_{k,K-k+1} \prod_{l=K-k+1}^{j-1} F_l\right)^{\!2} \,\middle|\, \mathcal{D}_K\right] = X_{k,K-k+1}^2\, E\left[\left(\prod_{l=K-k+1}^{j-1} \hat f_l - \prod_{l=K-k+1}^{j-1} F_l\right)^{\!2} \,\middle|\, \mathcal{D}_K\right] \tag{3.6}$$

$$= X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} E\big[F_l^2 \mid \mathcal{D}_K\big] - \prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) = X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} \left(\hat f_l^{\,2} + \frac{\sigma_l^2}{S_l}\right) - \prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right), \tag{3.7}$$

where the second equality uses the posterior independence of the $F_l$ and $E[F_l \mid \mathcal{D}_K] = \hat f_l$. By replacing $\sigma_l^2$ in (3.7) by $\hat\sigma_l^2$ we obtain the BBMW formula (2.4).
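The identity behind (3.7), namely that posterior independence turns the mean square error into a difference of products, can be checked numerically. The posterior means and variances below are hypothetical illustrative values, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior quantities (illustrative, not from the paper):
f_hat = np.array([1.5, 1.2, 1.05])       # E[F_l | D_K]
var   = np.array([0.02, 0.01, 0.004])    # Var[F_l | D_K] = sigma_l^2 / S_l

# Sample the a-posteriori independent, normally distributed factors F_l
F = rng.normal(f_hat, np.sqrt(var), size=(500_000, 3))

# Monte Carlo mean square error of the product estimator (X^2 omitted)
mse_mc = np.mean((np.prod(f_hat) - np.prod(F, axis=1)) ** 2)
# Closed form (3.7), X^2 omitted
mse_37 = np.prod(f_hat ** 2 + var) - np.prod(f_hat ** 2)

print(mse_mc, mse_37)   # agree up to Monte Carlo error
```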

The Mack formula (2.2) is also obtained in the Bayesian framework by the following Taylor approximation:
$$\prod_{l=K-k+1}^{j-1} F_l - \prod_{l=K-k+1}^{j-1} \hat f_l \;\approx\; \sum_{l=K-k+1}^{j-1} \big(F_l - \hat f_l\big) \prod_{\substack{h=K-k+1 \\ h \neq l}}^{j-1} \hat f_h. \tag{3.8}$$

Inserting (3.8) and using the posterior independence of the $F_l$ yields
$$E\left[\left(\prod_{l=K-k+1}^{j-1} F_l - \prod_{l=K-k+1}^{j-1} \hat f_l\right)^{\!2} \,\middle|\, \mathcal{D}_K\right] \approx \sum_{l=K-k+1}^{j-1} \mathrm{Var}\big[F_l \mid \mathcal{D}_K\big] \prod_{\substack{h=K-k+1 \\ h \neq l}}^{j-1} \hat f_h^{\,2} = \left(\prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) \sum_{l=K-k+1}^{j-1} \frac{\sigma_l^2 / \hat f_l^{\,2}}{S_l} \tag{3.9}$$

and hence
$$\hat e_{k,j} \approx X_{k,K-k+1}^2 \left(\prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) \sum_{l=K-k+1}^{j-1} \frac{\sigma_l^2 / \hat f_l^{\,2}}{S_l}, \tag{3.10}$$

which is identical to (2.2) if we replace $\sigma_l^2$ in (3.10) by $\hat\sigma_l^2$.
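Numerically, the first-order approximation (the Mack form (3.10)) sits slightly below the exact posterior value (the BBMW form (3.7)). The values below are hypothetical illustrative posterior quantities, not from the paper; with them the two agree to well below one percent.

```python
import numpy as np

# Hypothetical posterior quantities (illustrative, not from the paper):
f_hat = np.array([1.5, 1.2, 1.05])       # posterior means of the F_l
var   = np.array([0.02, 0.01, 0.004])    # posterior variances sigma_l^2 / S_l

exact  = np.prod(f_hat ** 2 + var) - np.prod(f_hat ** 2)   # BBMW form (3.7), X^2 omitted
taylor = np.prod(f_hat ** 2) * np.sum(var / f_hat ** 2)    # Mack form (3.10), X^2 omitted

print(exact, taylor)   # taylor <= exact; difference is below 1% here
```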

3.2. Bayesian model II: Gamma distribution

In this more realistic model, where the $X_{k,j}$ are always strictly positive, we assume that conditionally, given F, the observed chain ladder factors are Gamma-distributed, i.e. we make the following assumptions:

(G1) Conditionally, given F and $\mathcal{B}_j$, the random variables $Y_{k,j}$ ($k = 1, \ldots, K-j$) are independent and Gamma-distributed with moments
$$E[Y_{k,j} \mid F_j] = F_j, \tag{3.11}$$
$$\mathrm{Var}[Y_{k,j} \mid F_j, X_{k,j}] = \frac{t_j^2 F_j^2}{X_{k,j}}, \tag{3.12}$$
where the $t_j^2$ are given positive constants.

(G2) $\{F_1, \ldots, F_{J-1}\} = \{\Theta_1^{-1}, \ldots, \Theta_{J-1}^{-1}\}$, where the random variables $\Theta_1, \ldots, \Theta_{J-1}$ are independent and uniformly distributed on $(a^{-1}, a)$, where $a \in \mathbb{R}_+$.

Remarks:

Conditionally, given $F = f$, the chain ladder assumptions are fulfilled with $\sigma_j^2 = t_j^2 f_j^2$.

The difficulty of this model is that $\sigma_j^2$ depends on the parameter $f_j$ to be estimated. Contrary to the normal model, it is therefore not possible here that $E[F_l^2 \mid \mathcal{D}_K]$ in the Bayesian model is identical to $E[\hat F_l^2 \mid \mathcal{B}_l]$ in the classical model, since the latter depends on the parameter $f_l$ to be estimated. We can therefore not expect to recover from this model exactly the BBMW formula (2.4).

Assumption (G1) can also be written as a time-series model in the following way: conditionally, given $F = f$, the triangle $\mathcal{D}_K$ is generated by the process
$$X_{k,j+1} = \tilde G_{k,j} \cdot X_{k,j},$$
where conditionally, given the observations in column $j$, the random variables $\tilde G_{k,j}$ ($k = 1, \ldots, K-j$) are independent and Gamma-distributed with
$$E[\tilde G_{k,j}] = f_j, \qquad \mathrm{Var}[\tilde G_{k,j} \mid X_{k,j}] = \frac{t_j^2 f_j^2}{X_{k,j}}.$$
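A one-step simulation of this Gamma representation confirms the stated moments. The values of $f_j$, $t_j^2$ and $X_{k,j}$ below are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical parameters (illustrative, not from the paper):
f_j, t2_j, X_kj = 1.3, 2.0, 100.0

# G~ is Gamma with mean f_j and variance t_j^2 f_j^2 / X_{k,j};
# numpy's gamma takes (shape, scale): mean = shape*scale, var = shape*scale^2.
shape = X_kj / t2_j           # X_{k,j} t_j^{-2}
scale = f_j * t2_j / X_kj     # chosen so that the mean equals f_j
G = rng.gamma(shape, scale, size=1_000_000)

print(G.mean(), f_j)                       # mean ≈ f_j
print(G.var(), t2_j * f_j ** 2 / X_kj)     # variance ≈ t_j^2 f_j^2 / X_{k,j}
```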

It follows from assumption (G1) that, on the condition $\Theta = \vartheta$, the $Y_{k,j}$ ($k = 1, \ldots, K-j$) are Gamma-distributed with
$$\text{shape parameter } \gamma_{k,j} = X_{k,j}\, t_j^{-2} \quad \text{and} \tag{3.13}$$
$$\text{scale parameter } c_{k,j} = X_{k,j}\, t_j^{-2}\, \vartheta_j, \tag{3.14}$$
i.e. the conditional density of $Y_{k,j}$ is given by
$$u(y) = \frac{c_{k,j}^{\gamma_{k,j}}}{\Gamma(\gamma_{k,j})}\, y^{\gamma_{k,j}-1}\, e^{-c_{k,j}\, y}.$$

Note also that
$$\mathrm{Var}[\hat F_j \mid \Theta_j, \mathcal{B}_j] = \frac{t_j^2 F_j^2}{S_j}.$$

Assumption (G2) is equivalent to the assumption that the density of the $F_j$ is given by
$$u(f) = \frac{a}{a^2 - 1}\, f^{-2}, \qquad f \in (a^{-1}, a).$$

It might look a bit strange that we have introduced the $\Theta_j$ in assumption (G2). The reason is simply mathematical tractability.

We derive again the Bayes estimator of the unknown chain ladder factors $F_j$ as well as of $X_{k,j}$, and the mean square error of these estimators. On the condition $\Theta = \vartheta$ and given $\mathcal{B}_1$, the joint density of $\{Y_1, \ldots, Y_{J-1}\}$ is given by
$$u(y_1, \ldots, y_{J-1}) = \prod_{j=1}^{J-1} \prod_{k=1}^{K-j} \frac{\big(x_{k,j}\, t_j^{-2}\, \vartheta_j\big)^{x_{k,j} t_j^{-2}}}{\Gamma\big(x_{k,j}\, t_j^{-2}\big)}\; y_{k,j}^{\,x_{k,j} t_j^{-2} - 1}\; e^{-x_{k,j} t_j^{-2} \vartheta_j\, y_{k,j}}. \tag{3.15}$$

Denote by $u_{\mathcal{D}_K}(\vartheta_1, \ldots, \vartheta_{J-1})$ the posterior density of $\Theta_1, \ldots, \Theta_{J-1}$ given the observed triangle $\mathcal{D}_K$. From $x_{k,j}\, y_{k,j} = X_{k,j+1}$ and (3.15) it follows that
$$u_{\mathcal{D}_K}(\vartheta_1, \ldots, \vartheta_{J-1}) \propto \prod_{j=1}^{J-1} \vartheta_j^{\,t_j^{-2} \sum_{k=1}^{K-j} X_{k,j}}\; e^{-\vartheta_j\, t_j^{-2} \sum_{k=1}^{K-j} X_{k,j+1}} \quad \text{for } \vartheta_j \in (a^{-1}, a), \tag{3.16}$$
where terms not depending on the $\vartheta_j$ ($j = 1, \ldots, J-1$) are included in the normalising constant and therefore omitted on the right-hand side of the equation.

Now we consider the limiting case $a \to \infty$. Because $S_j = \sum_{k=1}^{K-j} X_{k,j}$, we then obtain from (3.16) that
$$u_{\mathcal{D}_K}(\vartheta_1, \ldots, \vartheta_{J-1}) \propto \prod_{j=1}^{J-1} \vartheta_j^{\,S_j t_j^{-2}}\; e^{-\vartheta_j\, t_j^{-2} \sum_{k=1}^{K-j} X_{k,j+1}} \quad \text{for } \vartheta_j \in \mathbb{R}_+. \tag{3.17}$$

From (3.17) it follows that, a posteriori, given the observed triangle $\mathcal{D}_K$, the random variables $\Theta_1, \ldots, \Theta_{J-1}$ are independent, and the $\Theta_j$ are Gamma-distributed with
$$\text{shape parameter } g_j = S_j\, t_j^{-2} + 1 \quad \text{and} \quad \text{scale parameter } c_j = t_j^{-2} \sum_{k=1}^{K-j} X_{k,j+1}.$$

Remark:

If we take an exponential distribution with density $u(\vartheta) = \lambda e^{-\lambda\vartheta}$ as a prior for the $\Theta_j$ and consider the limiting case $\lambda \to 0$, we obtain the same posterior distribution as in (3.17).

For the first and second order moments we get from (3.17)
$$E[F_j \mid \mathcal{D}_K] = E\big[\Theta_j^{-1} \mid \mathcal{D}_K\big] = \frac{c_j}{g_j - 1} = \frac{t_j^{-2} \sum_{k=1}^{K-j} X_{k,j+1}}{S_j\, t_j^{-2}} = \frac{\sum_{k=1}^{K-j} X_{k,j+1}}{S_j} = \hat f_j$$

and by an analogous calculation
$$E[F_j^2 \mid \mathcal{D}_K] = E\big[\Theta_j^{-2} \mid \mathcal{D}_K\big] = \begin{cases} \hat f_j^{\,2}\, \dfrac{S_j\, t_j^{-2}}{S_j\, t_j^{-2} - 1} & \text{if } S_j\, t_j^{-2} > 1, \\[1ex] \infty & \text{otherwise.} \end{cases}$$

Hence the Bayes estimator of $X_{k,j}$ is given by
$$\hat X_{k,j}^{\text{Bayes}} = E[X_{k,j} \mid \mathcal{D}_K] = X_{k,K-k+1} \prod_{l=K-k+1}^{j-1} \hat f_l = \hat X_{k,j},$$

and for the mean square error of the Bayes estimator we obtain
$$X_{k,K-k+1}^2\, E\left[\left(\prod_{l=K-k+1}^{j-1} \hat f_l - \prod_{l=K-k+1}^{j-1} F_l\right)^{\!2} \,\middle|\, \mathcal{D}_K\right] = \begin{cases} X_{k,K-k+1}^2 \left(\displaystyle\prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\, \frac{S_l\, t_l^{-2}}{S_l\, t_l^{-2} - 1} - \prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) & \text{if } S_l\, t_l^{-2} > 1 \text{ for all } l, \\[1ex] \infty & \text{otherwise.} \end{cases} \tag{3.18}$$

Formula (3.18) is not directly comparable to (2.4), because $t_l^2$ is not the same parameter as $\sigma_l^2$. Indeed, the difficulty in making this model comparable with the classical formula is that $\sigma_l^2 = t_l^2 F_l^2$ and thus itself depends on the parameter to be estimated. A natural estimator for $t_l^2$ is
$$\hat t_l^{\,2} = \frac{\hat\sigma_l^2}{\hat f_l^{\,2}}.$$
If we replace $t_l^2$ in (3.18) with $\hat t_l^{\,2}$ we obtain

$$\hat e_{k,j} = \begin{cases} X_{k,K-k+1}^2 \left(\displaystyle\prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\, \frac{S_l}{S_l - \hat\sigma_l^2 / \hat f_l^{\,2}} - \prod_{l=K-k+1}^{j-1} \hat f_l^{\,2}\right) & \text{if } S_l > \hat\sigma_l^2 / \hat f_l^{\,2} \text{ for all } l, \\[1ex] \infty & \text{otherwise.} \end{cases} \tag{3.19}$$

(3.19) is not exactly the same as the BBMW formula (2.4) and yields higher values than (2.4). The resulting estimates are approximately the same in the case where $S_l \gg \hat\sigma_l^2 / \hat f_l^{\,2}$ for all $l$.

3.3. Summary and Conclusions

The estimation error resulting from the Bayesian normal model coincides with the formula of BBMW. In this normal model the $X_{k,j}$ could become negative, which is problematic from a theoretical point of view.

In the Gamma Bayesian model we have obtained a formula which is similar, but not fully identical, to the BBMW formula.

Hence we could not find a model which confirms the new approach (2.4) of BBMW beyond doubt. Nevertheless, comparing the results with BBMW and Mack, they seem to be rather in favour of the formula of BBMW.

One question and criticism in [3] is whether it is allowed, and whether it makes sense, to take the product
$$\prod_{l=K-k+1}^{j-1} E\big[\hat F_l^2 \,\big|\, \mathcal{B}_{K-k+1}\big]$$
in (2.3). From a Bayesian point of view, the answer to this question is definitely positive, the reason being that in the Bayesian context the random variables $F_j$ are a posteriori independent and that therefore
$$\prod_{l=K-k+1}^{j-1} E\big[F_l^2 \mid \mathcal{D}_K\big] = E\left[\prod_{l=K-k+1}^{j-1} F_l^2 \,\middle|\, \mathcal{D}_K\right].$$

The result that the random variables $F_j$ are a posteriori independent is not specific to the two Bayesian models considered. It results essentially from the natural model assumptions that, on the condition $F = f$, the $X_{k,j}$ ($j = 1, \ldots, J$) form a Markov process and that the $F_j$ are a priori independent. The posterior independence property will therefore also be encountered in other Bayesian models with these natural assumptions.

The question whether one should consider the average over the set of all possible true factors $f_j$, or the average over all possible upper right corners $\mathcal{D}_K^O$ given $\mathcal{B}_{K-k+1}$, is rather a question of philosophy than of mathematics. The first definition of the average is in my view probably better suited to the problem of estimating the conditional estimation error. Then, based on the results above, I would give preference to the formula of BBMW. If one takes the average over all possible upper right corners $\mathcal{D}_K^O$ given $\mathcal{B}_{K-k+1}$, then the formula of Mack should be preferred, because BBMW then has an upward bias whereas Mack does not. In the classical as well as in the Bayesian framework, the Mack formula is obtained by a first order Taylor approximation to BBMW. Therefore the difference between BBMW and Mack will mostly be rather small and not essential for practical purposes. Finally, what is called the conditional approach in the paper by BBMW is a third way to look at the problem. This approach can be described as follows: each of the random variables $Y_{k,j}$ is associated with a weight $w_{k,j}$, and the variance of $Y_{k,j}$ is inversely proportional to $w_{k,j}$; the $Y_{k,j}$ are then considered as independent, and the weights are taken from the observed triangle, i.e. $w_{k,j} = X_{k,j} =$ the observed value in the triangle.

We have focused on the problem of the estimation error, since the discussion on this part of the mean square prediction error is controversial. We should however remark here that, surprisingly, the estimate of the process variance in the Bayesian framework would differ and would be greater than the estimate suggested by both Mack and BBMW. If we consider the Bayesian approach as the stringent conditional approach, then the estimate of the process variance would have to be changed too.

ACKNOWLEDGEMENT

I would like to thank Thomas Mack for many fruitful discussions which have helped me to better understand many features and which have very much improved the quality of this discussion contribution. I would also like to thank the authors BBMW of [1] for their comments on an earlier version and for having brought up a problem which might also be relevant in other areas.

REFERENCES

[1] BUCHWALDER, M., BÜHLMANN, H., MERZ, M. and WÜTHRICH, M.V. (2006) The Mean Square Error of Prediction in the Chain Ladder Reserving Method. ASTIN Bulletin 00(0), 000-000.

[2] MACK, T. (1993) Distribution-free calculation of the standard error of chain ladder reserve estimates. ASTIN Bulletin 23(2), 213-225.

[3] MACK, T., QUARG, G. and BRAUN, Ch. (2006) The Mean Square Error of Prediction in the Chain Ladder Reserving Method – a Comment. ASTIN Bulletin 00(0), 000-000.

ALOIS GISLER
Winterthur Insurance Company
Römerstrasse 17
Box 357, CH-8401 Winterthur
Switzerland
