From the document WORLD CLIMATE PROGRAMME DATA and MONITORING (pages 65-68)

5.6 Choosing tests and test statistics

5.6.1 Introductory notes

This section provides summaries of common tests for change. These are intended to be used as a source of test statistics for use within resampling methods. Further details on most of these tests are given in Appendices A-F.

5.6.2 Choosing which test statistics to use

Sections 5.6.5-5.6.7 list a number of common tests that can be used to test for change. For most studies, it is recommended that more than one of these tests be used. Criteria to be aware of when selecting test statistics are:

• Type of change that is of interest.

• Power of the test: more powerful tests are to be preferred.

• Different types of test: some tests are very similar to one another, and it is best to choose a selection of tests that are not too similar.

• Whether the test is for a known or unknown change-point time (see Section 5.6.4).

If a resampling technique is to be used, it is possible to either construct a test statistic to test for a particular type of change, or to extract a suitable test statistic from almost any other test for change.

5.6.3 Choosing how to evaluate significance levels

The tests given in Sections 5.6.5 to 5.6.7 fall into two main groups:

• Rank-based tests, which are inherently distribution-free.

• Tests that assume a normal distribution.

Tests assuming a normal distribution are provided with the intention that significance levels should be evaluated using resampling methods. For this, the test statistic is calculated as usual, but the significance level is obtained by resampling rather than by referring to tabulated values. An alternative would be to use the normal scores transformation (Section 5.3).
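
As an illustration of obtaining a significance level by resampling, the following Python sketch uses a simple permutation scheme with the regression gradient as the test statistic. The function names, the two-sided comparison, the number of resamples and the add-one correction are assumptions made for this example and are not taken from the source document.

```python
import random


def regression_gradient(values):
    """Least-squares slope of the values against the time index 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return sxy / sxx


def permutation_significance(values, statistic, n_resamples=2000, seed=0):
    """Two-sided significance level of statistic(values) by permutation.

    The series is shuffled repeatedly (which assumes independent data) and
    the observed statistic is compared with the shuffled ones.
    """
    rng = random.Random(seed)
    observed = abs(statistic(values))
    shuffled = list(values)
    count = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if abs(statistic(shuffled)) >= observed:
            count += 1
    # Add-one correction (an assumption of this sketch) so that the
    # estimated level is never exactly zero.
    return (count + 1) / (n_resamples + 1)
```

Any other test statistic that takes a series and returns a number could be passed in place of `regression_gradient`.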

Rank-based tests can be applied directly to the data, provided the data meet the assumptions of independence and constancy of distribution.

If the data violate independence assumptions, then it is recommended that block resampling methods be used to obtain significance levels (for both of the above types of test). Note that all tests assume that, under the null hypothesis, the distribution of data values does not change with either time or space (Section 5.5.2).
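
A minimal sketch of one possible block resampling scheme is given below in Python: the series is cut into contiguous blocks, the order of the blocks is shuffled, and the order of values within each block is kept so that short-range dependence is preserved. The block length, the function names and the helper statistic `half_mean_difference` are hypothetical choices for this example, not prescriptions from the source document.

```python
import random


def half_mean_difference(values):
    """Illustrative statistic: mean of second half minus mean of first half."""
    half = len(values) // 2
    first, second = values[:half], values[half:]
    return sum(second) / len(second) - sum(first) / len(first)


def block_permute(values, block_length, rng):
    """Shuffle contiguous blocks; the order inside each block is unchanged."""
    blocks = [values[i:i + block_length]
              for i in range(0, len(values), block_length)]
    rng.shuffle(blocks)
    return [v for block in blocks for v in block]


def block_permutation_significance(values, statistic, block_length,
                                   n_resamples=2000, seed=0):
    """Two-sided significance level using block permutation."""
    rng = random.Random(seed)
    observed = abs(statistic(values))
    count = 0
    for _ in range(n_resamples):
        resampled = block_permute(values, block_length, rng)
        if abs(statistic(resampled)) >= observed:
            count += 1
    # Add-one correction (an assumption of this sketch).
    return (count + 1) / (n_resamples + 1)
```

In practice the block length would be chosen to be long enough to cover the dependence in the data, for example one or more whole years for a seasonal series.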

5.6.4 Testing when the time of change is unknown

Tests for step change and tests for a change in distribution work by dividing the data into two parts and comparing the parts. Of the tests listed below, some assume a known time of change and others assume an unknown time of change.

In most cases, the time of change is unknown and tests assuming an unknown time of change are preferable. It is also possible to adapt a test that assumes a known time of change.

A recommended method is to replace the original test statistic with two new test statistics: (1) the maximum value of the original test statistic when it is calculated for all possible change-point times, and (2) the time at which this maximum occurs. These modified test statistics can then be evaluated using resampling methods.
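
The adaptation described above can be sketched in Python as follows. As a stand-in for the original known-change-point statistic, this example uses the absolute difference between the means of the two parts, weighted by sqrt(n1*n2/n) so that splits near the ends of the series are not favoured. That weighting, the minimum segment length and all function names are assumptions of this sketch; any other known-change-point statistic could be substituted.

```python
import random
from statistics import mean


def max_step_statistic(values, min_segment=2):
    """Return (maximum statistic, change-point index) over all split points.

    The index k is the position of the first value of the second part.
    """
    n = len(values)
    best_value, best_k = -1.0, None
    for k in range(min_segment, n - min_segment + 1):
        diff = abs(mean(values[k:]) - mean(values[:k]))
        weighted = diff * (k * (n - k) / n) ** 0.5
        if weighted > best_value:
            best_value, best_k = weighted, k
    return best_value, best_k


def unknown_change_point_test(values, n_resamples=1000, seed=0):
    """Permutation significance level for the maximum statistic."""
    rng = random.Random(seed)
    observed_max, observed_k = max_step_statistic(values)
    shuffled = list(values)
    count = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        # The maximum is recomputed over all split points for every resample.
        if max_step_statistic(shuffled)[0] >= observed_max:
            count += 1
    return observed_max, observed_k, (count + 1) / (n_resamples + 1)
```

Because the maximum is recomputed for each resampled series, the significance level accounts for the search over change-point times.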

5.6.5 Tests for step change

1. Median change-point test / Pettitt’s test for change. This is a rank-based test for a change in the median of a series with the exact time of change unknown (Siegel & Castellan, 1988; Pettitt, 1979). The test is considered to be robust to changes in distributional form and powerful relative to tests such as the Wilcoxon-Mann-Whitney test (see below).

2. Wilcoxon-Mann-Whitney test / Mann-Whitney test / Mann test / Rank-sum test. This is a rank-based test that looks for differences between two independent sample groups (Siegel & Castellan, 1988; WMO, 1988; Helsel & Hirsch, 1992). It is based on the Mann-Kendall test statistic (see Section 5.6.6), but is calculated for subsets of the series in order to detect the point of change in the mean (Chiew & McMahon, 1993). In its basic form it assumes that the time of change is known. When the time of change is unknown, use of the median change-point test is recommended.

3. Distribution-free CUSUM test. This is a rank-based test in which successive observations are compared with the median of the series (Chiew & McMahon, 1993; McGilchrist & Woodyer, 1975). The test statistic is the maximum cumulative sum (CUSUM) of the signs of the differences from the median (i.e. the CUSUM of a series of plus or minus ones), starting from the beginning of the series.

4. The Kruskal-Wallis test. The Kruskal-Wallis test (Sneyers, 1975) is a rank-based test for equality of sub-period means. It can also be used to test for equality of sub-period variability.

5. Cumulative deviations and other CUSUM tests. The cumulative deviation test (Buishand, 1982) is based on the rescaled cumulative sum of the deviations from the mean. The test is relatively powerful in comparison with other tests (e.g. the Worsley likelihood ratio test). The basic test assumes normally distributed data. Other CUSUM-based tests (using Bayesian and likelihood methods) are described in Buishand (1984).

6. Student’s t-test. This is a standard parametric test for testing whether two samples have different means. In its basic form it assumes normally distributed data and a known change-point time.

7. The Worsley likelihood ratio test. The Worsley likelihood ratio test (Worsley, 1979) is similar to Student’s t-test but is suitable for use when the change-point time is unknown. It assumes normality.
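
To make some of the step-change statistics above concrete, the Python sketch below computes three of them for use inside a resampling scheme: a Pettitt-type statistic, the distribution-free CUSUM of signs about the median, and a cumulative deviation statistic. The function names are hypothetical. Taking the maximum in absolute value and rescaling the cumulative deviations by the standard deviation times sqrt(n) are assumptions of this example and should be checked against the cited references before use.

```python
from statistics import mean, median, pstdev


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pettitt_statistic(values):
    """Return (max |U_t|, t), where U_t sums sign(x_i - x_j) over i < t <= j
    and t is the number of values in the first part."""
    n = len(values)
    best_value, best_t = 0, None
    for t in range(1, n):
        u = sum(sign(values[i] - values[j])
                for i in range(t) for j in range(t, n))
        if abs(u) > best_value:
            best_value, best_t = abs(u), t
    return best_value, best_t


def sign_cusum_statistic(values):
    """Maximum absolute cumulative sum of the signs about the median."""
    med = median(values)
    running, largest = 0, 0
    for v in values:
        running += sign(v - med)
        largest = max(largest, abs(running))
    return largest


def cumulative_deviation_statistic(values):
    """Maximum absolute cumulative deviation from the mean, rescaled by
    the (population) standard deviation times sqrt(n)."""
    n = len(values)
    m = mean(values)
    sd = pstdev(values)
    running, largest = 0.0, 0.0
    for v in values:
        running += v - m
        largest = max(largest, abs(running))
    return largest / (sd * n ** 0.5)
```

For a series with a clear upward step in the middle, such as `[1, 2, 3, 4, 11, 12, 13, 14]`, all three statistics reach their maximum at the fourth value.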

5.6.6 Tests for trend

1. Spearman’s rho. This is a rank-based test for correlation between two variables that can be used to test for a correlation between time and the data series (Siegel & Castellan, 1988). Spearman’s correlation is a rank-based version of the usual parametric measure of correlation (the Pearson product moment; Sprent, 1989).

2. Kendall’s tau / Mann-Kendall test. This is another rank-based test which is similar to Spearman’s rho (same power and still based on ranks) but using a different measure of correlation which has no parametric analogue.

3. Seasonal Kendall test. The seasonal Kendall test is a version of the Mann-Kendall test that allows for seasonality in the data (Hirsch et al., 1982). There is also a modified seasonal Kendall test that additionally allows for some autocorrelation in the data (Hirsch & Slack, 1984).

4. Linear regression. The test statistic for linear regression is the regression gradient. This is one of the most common tests for trend and in its basic form assumes that the data are normally distributed.

5. Other robust regression tests. There are a number of robust methods for estimating trend in series. These could potentially be used as alternative measures of the change. For example, in least absolute deviation regression, the gradient is that which minimises the sum of the absolute deviations of the points from the fitted line (Bloomfield & Steiger, 1983). Other robust means of estimating the rate of change include M-estimates of regression and trimmed regression (Rousseeuw & Leroy, 1987).
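
The two rank-based trend statistics listed above can be sketched in Python as follows, with time taken as the positions 1..n in the series. The function names are hypothetical, the Kendall statistic is shown without any correction for ties, and ties in Spearman’s rho are handled by average ranks; these are choices made for this example only.

```python
from statistics import mean


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def mann_kendall_s(values):
    """Sum of sign(x_j - x_i) over all pairs with i < j."""
    n = len(values)
    return sum(sign(values[j] - values[i])
               for i in range(n - 1) for j in range(i + 1, n))


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(values):
    """Pearson correlation between the time positions and the value ranks."""
    x = list(range(1, len(values) + 1))
    y = average_ranks(values)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5
```

Either function can be used as the test statistic in a permutation or block permutation scheme.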

5.6.7 Tests for a change in distribution

1. Kolmogorov-Smirnov test. This test can be used to decide whether two samples have the same distribution. It is a distribution-free approach. In its basic form it assumes that the time of change is known. The test statistic is based on the maximum difference between the empirical distributions of the data before and after the change-point.

2. Cramér-von Mises test. This is a second distribution-free statistic which is similar to the Kolmogorov-Smirnov test, except that it uses a different way of measuring the difference between distributions.
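
As a sketch of the Kolmogorov-Smirnov statistic for a known change-point, the Python example below computes the maximum absolute difference between the empirical distribution functions of the two parts of the series. The function names and the convention that the change-point index marks the first value of the second part are assumptions of this example.

```python
from bisect import bisect_right


def ks_statistic(before, after):
    """Maximum absolute difference between two empirical distributions."""
    a, b = sorted(before), sorted(after)
    # The empirical distribution functions only change at observed values,
    # so it is enough to evaluate the difference at each of them.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_change_statistic(values, change_index):
    """KS statistic comparing values before and after a known change-point."""
    return ks_statistic(values[:change_index], values[change_index:])
```

The significance level of this statistic could then be evaluated by resampling, as for the other tests.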