
Selection of statistics for a jump-diffusion stochastic volatility model

Creel and Kristensen (forthcoming; henceforth CKf) use ABC methods to estimate a continuous time stochastic volatility model that has non-constant drift, leverage, and jumps with dynamic jump intensity. In this model, the true log price, pt, solves

dpt = (µ0 + µ1(ht − α)/σ) dt + √(exp(ht)) dW1,t + Jt dNt,   (5.4)

where ht is log volatility, Jt is jump size, and Nt is a Poisson process with time-varying jump intensity λt. Log-volatility is specified to follow

dht = κ(α − ht) dt + σ (ρ dW1,t + √(1 − ρ²) dW2,t),   (5.5)

where W1,t and W2,t are two independent standard Brownian motions. Finally, the jump intensity process λt is modeled as

λt = λ̃t · 1(λ̃t > 0), where λ̃t = λ0 + λ0 λ1 (ht − α)/σ.
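For concreteness, the model (5.4)-(5.5) can be simulated with a simple Euler discretization along the following lines. This is a minimal sketch under our own assumptions (the function name, the step size, and treating jump sizes as N(µJ, σJ²) draws are ours), not the authors' simulation code:

```python
import numpy as np

def simulate_sv_jump(theta, n_days, steps_per_day=10, seed=0):
    """Euler discretization of the jump-diffusion SV model (5.4)-(5.5).

    theta = (mu0, mu1, alpha, kappa, sigma, rho, lam0, lam1, muJ, sigJ).
    Returns daily log prices and log volatilities. Illustrative only.
    """
    mu0, mu1, alpha, kappa, sigma, rho, lam0, lam1, muJ, sigJ = theta
    rng = np.random.default_rng(seed)
    dt = 1.0 / steps_per_day
    n = n_days * steps_per_day
    p, h = np.empty(n + 1), np.empty(n + 1)
    p[0], h[0] = 0.0, alpha
    for t in range(n):
        z1, z2 = rng.standard_normal(2)
        # time-varying jump intensity, truncated at zero
        lam = max(lam0 + lam0 * lam1 * (h[t] - alpha) / sigma, 0.0)
        # one possible jump per step, size assumed N(muJ, sigJ^2)
        jump = rng.normal(muJ, sigJ) if rng.random() < lam * dt else 0.0
        p[t + 1] = (p[t] + (mu0 + mu1 * (h[t] - alpha) / sigma) * dt
                    + np.sqrt(np.exp(h[t]) * dt) * z1 + jump)
        h[t + 1] = (h[t] + kappa * (alpha - h[t]) * dt
                    + sigma * np.sqrt(dt) * (rho * z1 + np.sqrt(1 - rho**2) * z2))
    return p[::steps_per_day], h[::steps_per_day]
```

With parameter values such as those in Table 3, this returns daily series from which candidate statistics could in principle be computed.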

CKf discusses the model and interprets the parameters. One of the results was that measurement error in pt is not an important factor for the data set that was studied. For this reason, here, we use the model that does not include measurement error. We collect the 10 parameters of the model in the vector θ = (µ0, µ1, α, κ, σ, ρ, λ0, λ1, µJ, σJ).

The data series available for computing statistics include daily returns, r, realized volatility measured using 5 minute intervals (RV), and realized bipower variation (BV), which is a measure of volatility that is robust to jumps. The candidate statistics are functions of these three variables. The variable J = RV − BV is constructed as an indicator of jump activity, as is discussed by Andersen, Bollerslev and Diebold (2007). Another variable, r2, is identical to r, except that returns are set to zero in periods when Jt is greater than 2.5 times its own standard deviation. This is an attempt to create a returns series that is net of jumps. The candidate statistics are: the mean of the jump indicator (statistic 1); the correlation of the jump indicator with the average of its 10 most recent lags (statistic 2);

the mean, standard deviation, skew and kurtosis of the variables r, r2, RV, BV, J (statistics 3-22); the correlations between the same 5 variables (statistics 23-32); and regression coefficients, standard errors and R² from several auxiliary models (statistics 33-53), along with cross equation correlations of the residuals of regressions 2, 3 and 4 (statistics 54-56).
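The construction of the derived series is mechanical. A sketch, with all variable and function names ours; the convention J = RV − BV and the 2.5 standard deviation rule are taken from the text:

```python
import numpy as np

def jump_adjusted_series(r, RV, BV):
    """Construct the jump indicator J = RV - BV and the return series r2,
    in which returns are zeroed whenever J exceeds 2.5 times its own
    standard deviation (an attempt to net out jumps). Illustrative sketch."""
    J = RV - BV
    r2 = np.where(J > 2.5 * J.std(), 0.0, r)
    return J, r2
```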

The four auxiliary regressions³ are:

Aux. reg. 1: log BVt = α1 (log BVt−1 − log BVt−2)/2 + α2 (log BVt−3 − log BVt−4)/2 + ε1t

Aux. reg. 2: Jt = α1 log BVt−1 + α2 (Σ_{j=1}^{10} Jt−j)/10 + ε2t

Aux. reg. 3: log BVt = α1 log BVt−1 + α2 r2t−1 + α3 (log BVt−1)² + α4 (log r2t−1)² + ε3t

Aux. reg. 4: log r2t = α1 r2t−1 + α2 log BVt−1 + α3 (r2t−1)² + α4 (log BVt−1)² + ε4t.

³ All variables in the auxiliary regressions are standardized and normalized before estimation of the parameters.

These auxiliary statistics are somewhat different from those used in CKf. In particular, the EGARCH model used in that paper was the most costly statistic to compute, but it contributed no statistics to those that were selected, so it has been dropped. There have been some other minor adjustments to the auxiliary regressions, too, but overall, the statistics in the candidate set are very similar to those of the previous paper.
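As an illustration of how such auxiliary statistics are computed, auxiliary regression 2 can be fit by OLS roughly as follows. This is a sketch under our own naming, with a simple z-score standardization standing in for the normalization described in the footnote:

```python
import numpy as np

def aux_reg_2(J, logBV):
    """OLS fit of auxiliary regression 2:
    J_t = a1 * logBV_{t-1} + a2 * mean(J_{t-10..t-1}) + e_t,
    after standardizing both series. Returns coefficients and residuals."""
    z = lambda x: (x - x.mean()) / x.std()
    J, logBV = z(J), z(logBV)
    y = J[10:]                                  # J_t for t = 10..n-1
    x1 = logBV[9:-1]                            # logBV_{t-1}
    x2 = np.array([J[t - 10:t].mean()           # average of 10 most recent lags
                   for t in range(10, len(J))])
    X = np.column_stack([x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, resid
```

The coefficients, their standard errors, the regression R² and the residuals (for the cross equation correlations) would then enter the candidate set.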

The prior is different from that used in CKf, because the uniform prior of CKf, and the broad bounds chosen for its support, are probably unrealistically uninformative, and because CKf already presents results using selected statistics, for the original prior. For the purpose of selecting informative statistics, it is probably of more interest to investigate how the procedure performs when draws are made from a region of non-negligible posterior mass. Thus, the prior used here is one that makes use of the previous estimation results for the S&P 500 series, during the period 2008-2011, to focus on such a region. Nevertheless, we seek to select statistics which will perform well even if market conditions change. Specifically, the prior is that the parameters have independent normal distributions with means given by the estimates for the S&P 500 series during the period 2008-2011, found in Table 3 of CKf, and standard deviations equal to two times the estimated standard errors, also reported in the same table. However, the lower and upper bounds for parameters given in Table 1 of CKf continue to be enforced, through rejection sampling. Thus, the prior is not exactly unbiased, because the parameter bounds are occasionally binding for certain parameters (most notably, µ0 and ρ).
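A prior of this form is straightforward to sample by rejection. The sketch below is ours; the means, standard errors and bounds would come from Tables 3 and 1 of CKf, and the placeholder values shown are not the paper's actual numbers:

```python
import numpy as np

def sample_prior(means, ses, lb, ub, S, seed=0):
    """Draw S parameter vectors from independent normals (mean = point
    estimates, sd = 2 x estimated standard errors), enforcing lower and
    upper parameter bounds by rejection sampling. Illustrative sketch."""
    rng = np.random.default_rng(seed)
    means, ses, lb, ub = map(np.asarray, (means, ses, lb, ub))
    out = np.empty((S, len(means)))
    i = 0
    while i < S:
        draw = rng.normal(means, 2.0 * ses)
        # reject any draw that violates a bound; this slightly biases the
        # prior mean when a bound binds, as noted in the text
        if np.all(draw >= lb) and np.all(draw <= ub):
            out[i] = draw
            i += 1
    return out
```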

To apply the selection procedure, we set S = 20000 and R = 1000. A larger value of S is used than was the case for the linear regression model, because the number of candidate statistics is much larger, and a larger sample is required to help ensure that Ê_S[Θ | Z(δ) = z̃r(δ)] in equation (3.5) is a reasonably accurate nonparametric fit. The loss function is the same mean absolute error criterion as was used in the previous example.
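The criterion evaluation can be sketched as follows: for each of the R evaluation draws, the posterior mean is fit nonparametrically from the S prior draws, and the scaled mean absolute error is averaged over draws and parameters. Here a k-nearest-neighbor fit is a simplified stand-in for the paper's nonparametric estimator, and all names are ours:

```python
import numpy as np

def mae_criterion(theta, Z, theta_r, z_r, k=50):
    """Scaled MAE of a nonparametric fit E[theta | Z = z_r], averaged over
    the R evaluation draws and over parameters. theta: (S, p) prior draws;
    Z: (S, q) statistics for a candidate subset; theta_r, z_r: (R, p), (R, q)
    evaluation draws. KNN regression is a simplified stand-in."""
    sd = theta.std(axis=0)
    mu, s = Z.mean(axis=0), Z.std(axis=0)
    Zs, zs = (Z - mu) / s, (z_r - mu) / s      # put statistics on a common scale
    err = np.empty_like(theta_r)
    for r in range(len(z_r)):
        d = np.linalg.norm(Zs - zs[r], axis=1)
        nn = np.argsort(d)[:k]                 # k nearest prior draws
        err[r] = theta[nn].mean(axis=0) - theta_r[r]
    return np.mean(np.abs(err) / sd)
```

Minimizing this quantity over subsets of the candidate statistics is then the job of the discrete search.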

We performed a battery of 100 runs of the selection procedure.

Table 2 lists the selected statistics from the best of the 100 runs, and describes each selected statistic. Of the 56 candidate statistics, 17 were selected. Perhaps the most important point to highlight is that the selected statistics are a subset of the candidate set which one would be unlikely to arrive at if unsystematic experimentation were done. It is encouraging to see that statistics indicative of jump activity (statistic 1), jump dynamics (statistic 2) and leverage (statistics 25 and 55) are selected, because these features are present in the model. The means of r2 and BV are selected, but variances, skewness and kurtosis are never selected. Several correlations between variables are selected, and two of the cross equation residual correlations are selected. The four auxiliary regressions all contribute to the set of selected statistics.

Tab. 2: Selected statistics, jump-diffusion stochastic volatility model

Statistic  Description        Statistic  Description
1          jump detection     36         σ̂ε, aux. reg. 1
2          jump clustering    43         α̂2, aux. reg. 3
4          mean r2            44         α̂3, aux. reg. 3
6          mean BV            50         α̂3, aux. reg. 4
23         corr(r, r2)        52         α̂4, aux. reg. 4
25         corr(r, BV)        53         R², aux. reg. 4
30         corr(RV, BV)       55         corr(ε̂3, ε̂4)
31         corr(RV, J)        56         corr(ε̂2, ε̂4)
34         α̂2, aux. reg. 1

Next, to investigate how well the selection procedure performs, we do a Monte Carlo exercise similar to that of CKf. The sample size is n = 2000, as in CKf. For each estimator explored, we perform only 100 Monte Carlo replications of the ABC estimator, as the goal is not to explore carefully the performance of the ABC estimator, but rather to focus on how well the selection procedure performs. The true parameter values are the estimated parameters for the 2008-2011 period, in Table 3 of CKf. Table 3 presents bias and root mean squared error for the prior, the ABC estimator using the selected statistics described in the previous paragraph (the “Baseline” results), the ABC estimator that uses all 56 statistics in the candidate set, and an ABC estimator that uses 17 randomly selected statistics, out of the 56 candidate statistics. We may note that the prior is essentially unbiased, except for µ0 and ρ, for which the parameter bounds are sometimes binding. Comparing the “Prior” and “Baseline” columns, we see that the selection procedure yields an ABC estimator that reduces root mean squared error (RMSE), for all parameters. In a number of cases, the reduction is more than 50%. The “Baseline” results are more biased than the prior, for most parameters. This is not surprising, because the prior only has bias because of the enforcement of parameter bounds. However, the bias for α, ρ and λ1 is perhaps larger than we would like to see. Comparing to the ABC estimator that uses all available statistics (“All” in the table), the selection procedure achieves a reduction, in some cases considerable, of RMSE, for most parameters. It is perhaps somewhat surprising how well the estimator that uses all statistics performs. This may be due to two factors: first, the set of candidate statistics may have been chosen well, in that most or all statistics are in fact informative, and secondly, Algorithm 3, which is used for the ABC estimation, is successful in concentrating on an area of high posterior mass, even when the conditioning statistic has high dimension.

Finally, the column labeled “Random” gives results for an ABC estimator that uses 17 randomly selected statistics (this is the same number of statistics as was selected for the “Baseline” results). We note that this estimator has considerably larger RMSE than does the estimator that uses the “Baseline” set, which is indicative that the selection procedure does in fact succeed in choosing informative statistics out of a candidate set. The “Random” results do give a reduction in RMSE, compared to the prior. This is not surprising, because the candidate set was chosen to include statistics that one would hope are informative. Also, one should note that the results for the “Random” set are only indicative, because they depend on the specific set of 17 statistics that were drawn in a single case. The main point is that the selection procedure performs considerably better than the random selection.

Tab. 3: Monte Carlo results, stochastic volatility model

                Prior           Baseline        All             Random
Param.   True   Bias    RMSE    Bias    RMSE    Bias    RMSE    Bias    RMSE
µ0     -0.021   0.012   0.052  -0.007   0.026  -0.015   0.023  -0.022   0.029
µ1     -0.013   0.000   0.036  -0.017   0.027  -0.021   0.039  -0.013   0.031
α       0.785  -0.003   0.357   0.103   0.176   0.136   0.232   0.227   0.303
κ       0.036   0.003   0.021   0.007   0.008   0.007   0.013   0.009   0.011
σ       0.204   0.000   0.048  -0.010   0.025  -0.007   0.029   0.014   0.028
ρ      -0.713   0.014   0.147   0.027   0.089  -0.046   0.117   0.023   0.132
λ0      0.016   0.000   0.008   0.006   0.006   0.004   0.005   0.004   0.005
λ1      1.152   0.000   0.331   0.074   0.126   0.054   0.130   0.088   0.142
µJ     -0.002   0.000   0.012  -0.001   0.003  -0.001   0.003   0.000   0.003
σJ      1.472   0.000   0.072   0.014   0.023   0.002   0.018  -0.005   0.025

The ABC estimator using the “Baseline” set of statistics exhibits some biases that we might like to try to address. It is possible to modify the criterion that the selection procedure minimizes to focus on the MAE of one or more parameters, rather than on all of the parameters. The default criterion minimizes the scaled MAE, averaged over all of the parameters: the loss function L in equation (3.5) is set to

L(θ) = (1/dim θ) Σ_{j=1}^{dim θ} (1/σj) |θj|.

If we are interested in selecting statistics to fit well a certain parameter, for example, the fourth parameter, without regard to the other parameters, we can set the loss function to

L(θ) = (1/σ4) |θ4|.

This targets only the selected parameter. It may also be useful to introduce the penalty term p(δ) in equation (3.5), by setting a in that equation to a number greater than zero, in order to find a small set of statistics that are most informative for the chosen parameter. We did this for the parameters α, ρ and λ1, each in turn, setting a = 0.05. For α, the selected statistics were 5 and 48. For ρ, the selected statistics were 30, 34 and 55. For λ1, only statistic 25 was selected. Statistics 34 and 55 were already selected in the baseline set.
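The default loss, the targeted loss, and the parsimony penalty can be written compactly. The function below is our illustration, not the interface of the provided software:

```python
import numpy as np

def targeted_loss(theta_err, sd, target=None, a=0.0, n_selected=0):
    """Loss for the selection criterion: scaled MAE over all parameters by
    default, or over a single targeted parameter (column index `target`),
    plus a penalty a * (number of selected statistics), standing in for
    the term p(delta) in equation (3.5). Names are illustrative."""
    scaled = np.abs(theta_err) / sd          # scale each parameter by its sd
    base = scaled[..., target].mean() if target is not None else scaled.mean()
    return base + a * n_selected
```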

Table 4 presents ABC Monte Carlo results for three cases: 1) adding statistics 5 and 48 to the baseline set, in order to target α; 2) adding statistics 30 and 55 to the baseline set, to target ρ; and 3) adding statistics 5, 30, 48 and 55, to target both α and ρ. When α is targeted, the bias of the ABC estimator drops from 0.103 (“Baseline” results in Table 3) to 0.018 (“Baseline+α” results in Table 4). RMSE also drops slightly. For the other parameters, bias and RMSE are not affected in any systematic way by the inclusion of the two additional statistics. Some variation is expected, due to the fact that only 100 Monte Carlo replications were used. When ρ is specifically targeted, the results are in the column labeled “Baseline+ρ” in Table 4. Comparing to the “Baseline” results in Table 3, we see that bias for estimation of ρ declines from 0.027 to 0.008, and RMSE is essentially the same. Finally, if both parameters are targeted, we see that the results for “Baseline+α+ρ” in Table 4 have biases for both α and ρ that are lower than those obtained when only the “Baseline” statistics are used. These exercises illustrate the fact that the selection procedure can be performed for all parameters jointly, which is the normal usage of most general interest, and that it can be targeted to certain parameters, if these parameters are of special interest, or if one has reason to believe that the overall selection procedure may not have found statistics that are sufficiently informative for these parameters. The final set of statistics used for ABC estimation can be determined using both sources of information.

Tab. 4: Monte Carlo results, stochastic volatility model, adding specially selected statistics

                Baseline+α      Baseline+ρ      Baseline+α+ρ
Param.   True   Bias    RMSE    Bias    RMSE    Bias    RMSE
µ0     -0.021  -0.010   0.030  -0.011   0.027  -0.017   0.033
µ1     -0.013  -0.016   0.023  -0.023   0.031  -0.013   0.020
α       0.785   0.018   0.174   0.082   0.157   0.059   0.174
κ       0.036   0.009   0.011   0.009   0.010   0.010   0.012
σ       0.204  -0.000   0.030  -0.001   0.028  -0.001   0.029
ρ      -0.713   0.015   0.098   0.008   0.090   0.011   0.093
λ0      0.016   0.006   0.006   0.006   0.006   0.006   0.006
λ1      1.152   0.079   0.124   0.060   0.114   0.075   0.120
µJ     -0.002  -0.000   0.003  -0.001   0.004  -0.001   0.003
σJ      1.472   0.013   0.023   0.012   0.021   0.014   0.023

6 Conclusion

This paper presents a method of selection of statistics for ABC (and simulation-based GMM estimators) that is conceptually simple and easy to implement. It is designed to select statistics for accurate point inference, throughout the support of the prior (or on a parameter space). While one might hope that this would lead to an accurate posterior, no claims are made that this is the case. The method relies on the widely accepted criterion of minimizing mean absolute error (or a similar measure, such as root mean squared error), possibly with a penalty to encourage parsimonious selection. The criterion to be minimized is directly related to a point estimator associated with the ABC or GMM estimator of interest; there is no projection or dimension reduction step that intervenes.

The criterion must be minimized over a discrete search space that has many elements.

Simulated annealing, which is a widely used and well known tool, was used to address this problem. In principle, any global minimization algorithm that operates on a discrete choice set could be used in place of simulated annealing. Simulated annealing was chosen simply because it is well known in the economics literature and related fields, and because it works well for this application. It is a simple matter to run multiple simulated annealing minimizations in parallel, to speed up the search for a set of statistics that minimize the criterion.
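The discrete search can be sketched as a standard simulated annealing loop over binary inclusion vectors δ: flip one statistic in or out per step, and accept uphill moves with probability exp(−Δc/T) under a geometric cooling schedule. This is a generic textbook version, not the implementation in SelectionAlgorithm.jl:

```python
import numpy as np

def anneal_subset(cost, n_stats, iters=5000, T0=1.0, cooling=0.999, seed=0):
    """Minimize cost(delta) over binary inclusion vectors delta of length
    n_stats by simulated annealing. Returns the best vector found and its
    cost. Generic sketch; tuning constants are arbitrary."""
    rng = np.random.default_rng(seed)
    delta = rng.integers(0, 2, n_stats)
    c = cost(delta)
    best, best_c, T = delta.copy(), c, T0
    for _ in range(iters):
        cand = delta.copy()
        j = rng.integers(n_stats)
        cand[j] ^= 1                      # flip inclusion of statistic j
        cc = cost(cand)
        # always accept downhill moves; accept uphill with prob exp(-dc/T)
        if cc < c or rng.random() < np.exp((c - cc) / T):
            delta, c = cand, cc
            if c < best_c:
                best, best_c = delta.copy(), c
        T *= cooling                      # geometric cooling
    return best, best_c
```

Because each step changes a single inclusion indicator, independent restarts of this loop can be run in parallel, as noted above.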

It is to be emphasized that other researchers who wish to use the method for their own ABC research can very easily make use of the provided software. The only requirement is that the user provides a set of parameters, drawn from the prior, and the corresponding candidate statistics, {(θs, Ws)}. This is easy to do, as it is an essential part of any ABC method that is based on statistics. The software provided will then select Z, the subset of W, according to the default criterion, or the criterion that the user specifies. This takes very little effort on the part of the user.

The method has been tested using a simple example, for which theory tells us which statistics should be selected, and it is found to be able to select these statistics with good accuracy. In spite of the simplicity of the example, it is important to note that the model contains 6 parameters, and a candidate set of statistics with 35 elements. The stochastic volatility example presents a model with 10 parameters and 56 candidate statistics. In this case, theory does not tell us which statistics are optimal. However, the Monte Carlo results show that the selection method results in precise estimation, compared to results using different sets of statistics. This second example also shows that the proposed methods can be applied when the model is complex, with numerous parameters and a large set of candidate statistics.

The criterion function that is minimized focuses on the integrated Bayes expected loss of the simulated posterior mean. As explained in Section 4, it is a simple matter to change the criterion function in order to focus on other features of the posterior distribution, including the full distribution, and still use the simulated annealing algorithm for minimization. In order to do so, one would simply edit the function “select_obj”, which is near the top of the file “SelectionAlgorithm.jl”, and replace the least-squares criterion by the KLIC criterion developed in Section 4. The performance of this alternative selection method is left for future research.

[insert online link to SelectionAlgorithm.jl accompanying text file about here]

References

[1] Andersen, T. G., T. Bollerslev and F. X. Diebold, 2007, “Roughing it up: including jump components in the measurement, modeling and forecasting of return volatility”, Review of Economics and Statistics, 89, 701-720.

[2] Andrews, D.W.K. and B. Lu, 2001, “Consistent model and moment selection procedures for GMM estimation with application to dynamic panel data models”, Journal of Econometrics, 101, 123-164.


[3] Beaumont, M., W. Zhang and D. Balding, 2002, “Approximate Bayesian computation in population genetics”, Genetics, 162, 2025-2035.

[4] Beaumont, M.A., J.-M. Cornuet, J.-M. Marin, and C.P. Robert, 2009, “Adaptive approximate Bayesian computation”, Biometrika, 96, 983-990.

[5] Blum, M.G.B., M. A. Nunes, D. Prangle and S. A. Sisson, 2013, “A comparative review of dimension reduction methods in approximate Bayesian computation”, Statistical Science, 28, 189-208.

[6] Barnes, C.P., S. Filippi, M.P.H. Stumpf and T. Thorne, 2012, “Considerate approaches for constructing summary statistics for ABC model selection”, Statistics and Computing, 22, 1181-1197.

[7] Černý, V., 1985, “Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm”, Journal of Optimization Theory and Applications, 45, 41-51.

[8] Cheng, X. and Z. Liao, 2012, “Select the valid and relevant moments: one-step procedure for GMM with many moments”, PIER Working Paper 12-045.

[9] Chernozhukov, V. and H. Hong, 2003, “An MCMC approach to classical estimation”, Journal of Econometrics, 115, 293-346.

[10] Creel, M. and D. Kristensen, 2011 (unpublished), “Indirect likelihood inference”, Dynare Working Papers (CEPREMAP), Working Paper 8.

[11] Creel, M. and D. Kristensen, 2013 (unpublished), “Indirect likelihood inference (revised)”, Working Paper 931, UFAE and IAE Working Papers.

[12] Creel, M. and D. Kristensen, forthcoming, “ABC of SV: limited information likelihood inference in stochastic volatility jump-diffusion models”, Journal of Empirical Finance, doi:10.1016/j.jempfin.2015.01.002.

[13] Del Moral, P., A. Doucet and A. Jasra, 2012, “An adaptive sequential Monte Carlo method for approximate Bayesian computation”, Statistics and Computing, 22, 1009-1020.

[14] Drovandi, C.C. and A.N. Pettitt, 2011, “Estimation of parameters for macroparasite population evolution using approximate Bayesian computation”, Biometrics, 67, 225-233.

[15] Fearnhead, P. and D. Prangle, 2012, “Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation”, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74, 419-474.


[16] Gallant, A. R. and G. Tauchen, 1996, “Which moments to match?”, Econometric Theory, 12, 657-681.

[17] Goffe, W.L., G. D. Ferrier and J. Rogers, 1994, “Global optimization of statistical functions with simulated annealing”, Journal of Econometrics, 60, 65-99.

[18] Gouriéroux, C., A. Monfort, and E. Renault, 1993, “Indirect inference”, Journal of Applied Econometrics, 8, S85-S118.

[19] Hall, A. R., A. Inoue, J. M. Nason, and B. Rossi, 2012, “Information criteria for impulse response function matching estimation of DSGE models”, Journal of Econometrics, 170, 499-518.

[20] Hall, P., J. Racine and Q. Li, 2004, “Cross-validation and the estimation of conditional probability densities”, Journal of the American Statistical Association, 99, 1015-1026.

[21] Hansen, L. P., 1982, “Large sample properties of generalized method of moments estimators”, Econometrica, 50, 1029-1054.

[22] Härdle, W., P. Hall and J.S. Marron, 1988, “How far are automatically chosen regression smoothing parameters from their optimum?”, Journal of the American Statistical Association, 83, 86-95.

[23] Joyce, P. and P. Marjoram, 2008, “Approximately sufficient statistics and Bayesian computation”, Statistical Applications in Genetics and Molecular Biology, 7, Art. 26.

[24] McFadden, D., 1989, “A method of simulated moments for estimation of discrete response models without numerical integration”, Econometrica, 57, 995-1026.

[25] Marin, J.-M., P. Pudlo, C. Robert and R. Ryder, 2012, “Approximate Bayesian computational methods”, Statistics and Computing, 22, 1167-1180.

[26] Marjoram, P., J. Molitor, V. Plagnol and S. Tavaré, 2003, “Markov chain Monte Carlo without likelihoods”, Proceedings of the National Academy of Sciences, 100, 15324-15328.

[27] Marron, J.S. and D. Nolan, 1989, “Canonical kernels for density estimation”, Statistics and Probability Letters, 7, 195-199.

[28] Nunes, M. A. and D.J. Balding, 2010, “On optimal selection of summary statistics for approximate Bayesian computation”, Statistical Applications in Genetics and Molecular Biology, 9, Art. 34.


[29] Örkcü, H., 2013, “Subset selection in multiple linear regression models: A hybrid of genetic and simulated annealing algorithms”, Applied Mathematics and Computation, 219, 11018-11028.

[30] Sisson, S., Y. Fan, and M. Tanaka, 2007, “Sequential Monte Carlo without likelihoods”, Proceedings of the National Academy of Sciences, 104, 1760-1765.

[31] Smith, A., 1993, “Estimating nonlinear time series models using simulated vector autoregressions”, Journal of Applied Econometrics, 8, S63-S84.

[32] Stock, J., J. Wright and M. Yogo, 2002, “A survey of weak instruments and weak identification in generalized method of moments”, Journal of Business & Economic Statistics, 20, 518-529.

[33] Stone, M., 1974, “Cross-validatory choice and assessment of statistical predictions”, Journal of the Royal Statistical Society, Series B, 36, 111-147.

[34] Tauchen, G., 1986, “Statistical properties of generalized method-of-moments estimators of structural parameters obtained from financial market data”, Journal of Business & Economic Statistics, 4, 397-416.

[35] Wu, L. and Y. Wang, 1998, “An introduction to simulated annealing algorithms for the computation of economic equilibrium”, Computational Economics, 12, 151-169.
