
6.2 Fit statistics

To quantify the level of agreement between the data and a given hypothesis, one can define a so-called fit statistic or goodness-of-fit parameter.

This statistic is a function of the data, and its value can be compared to what one expects under the assumption of H0. The significance of a discrepancy between data and hypothesis is then quantified by the p-value, which is defined as the probability to find a value of the fit statistic in the region of equal or lesser compatibility with H0 than the level of compatibility observed with the actual data [135].

Let us illustrate this for the concrete case of the goodness-of-fit statistic used for the oscillation analysis. The test used here to provide a measure of the agreement between the data and the hypothesised model is Pearson's χ² statistic:

χ² = Σ_{ij}^{N} ((Dij − Pij) / σij)² ,  (6.2)

where the data are represented by the events in the N = nL × nE position-energy histogram bins, Dij, and the hypothesis is contained in the model prediction, represented by the values Pij. In general, the model is not perfect and is only known within an estimated uncertainty, defined as the standard deviation σij.

In case the data Dij are indeed a statistical representation of the proposed model, the χ² statistic will follow a χ² probability density function (p.d.f.) with the number of degrees of freedom (n.d.o.f.) k equal to the number of position-energy bins (nL × nE) minus the number of free parameters in the fit:

f(χ², k) = (χ²)^{k/2 − 1} e^{−χ²/2} / (2^{k/2} Γ(k/2)) .  (6.3)

This p.d.f. and its cumulative distribution are plotted in figure 6.1 for values k = 1, ..., 6. Using the cumulative distribution, the fraction of possible outcomes for the prediction that result in an equal or larger χ² than the one obtained from the data can easily be derived. It represents the probability to find a more extreme fit result than the measured one, in other words the p-value of the test.
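As a sketch, the p-value for an observed χ² value can be obtained by numerically integrating the p.d.f. of equation 6.3 above that value. The snippet below is a minimal plain-Python illustration (function names are ours; in practice a library routine such as scipy.stats.chi2.sf serves the same purpose):

```python
import math

def chi2_pdf(x, k):
    """Chi-square p.d.f. of equation 6.3, for x > 0 and k degrees of freedom."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def p_value(chi2_obs, k, upper=100.0, steps=100_000):
    """P(chi2 >= chi2_obs): trapezoidal integral of the p.d.f. above chi2_obs."""
    h = (upper - chi2_obs) / steps
    xs = [chi2_obs + i * h for i in range(steps + 1)]
    ys = [chi2_pdf(x, k) for x in xs]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
```

For k = 2 the survival function is exactly exp(−χ²/2), which provides a simple cross-check of the numerical integration.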

When the p-value of the measurement is deduced, it tells us with what confidence the null hypothesis H0 can be accepted. Often, the level of confidence in the degree of rejection of a hypothesis is quoted in the literature. This

138 CHAPTER 6. OSCILLATION ANALYSIS


Figure 6.1: The χ² probability density function (top) and its complementary cumulative, the p-value (bottom), for numbers of degrees of freedom k = 1, ..., 6.

is called an exclusion confidence level (CL) and is the complement of the p-value: CL = 1 − p.

In case H0 is not rejected, i.e. when it has an acceptable p-value, the statistical analysis can continue with tests of the alternative hypotheses Hx and the determination of exclusion contours in the parameter space of the alternative model's parameters. The methods for the construction of those exclusion contours will be discussed in detail in section 6.5.

We should note that equation 6.2 gives the χ² fit statistic in its simplest form, where it is assumed that the Pij values, as well as the Dij values, are uncorrelated and where no systematic uncertainties are included. In the following paragraphs, we will discuss two methods for the inclusion of (correlated) systematic uncertainties in the χ² fit.

6.2.1 Nuisance parameters

One way to improve the model of the hypothesis is to introduce a number of additional parameters αk in the fit [136]:

χ² = Σ_{ij}^{N} ((Dij − Pij(αk)) / σij)² + Σ_{k} (αk / σk)² .  (6.4)

These αk's are so-called nuisance parameters and the values σk represent their assumed (correlated) errors. The best fit to the data is now found for those values of the αk that minimise the χ², in this way including the effect of (correlated) systematic uncertainties in the updated model Pij(αk). It is to constrain the values of the nuisance parameters that the additional pull terms, (αk/σk)², are introduced as well.

We want to stress that the fitted values of these αk parameters are in principle not of interest. They simply help to improve the fit to the data and thus allow one to account for some additional systematic uncertainties. However, they can sometimes be useful to identify possible biases in the model or to study poorly known systematics.
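For a model that responds linearly to a single nuisance parameter, Pij(α) = Pij + α·sij, the minimisation of equation 6.4 over α can even be done analytically. A minimal sketch, with illustrative names and under this linear-response assumption (not a description of the actual analysis code):

```python
def pull_term_chi2(data, pred, sigma, shift, sigma_alpha):
    """Chi2 of equation 6.4 for one nuisance parameter alpha, profiled
    analytically, assuming a linear model P_i(alpha) = pred_i + alpha * shift_i."""
    # Setting d(chi2)/d(alpha) = 0 gives alpha_hat in closed form:
    num = sum((d - p) * s / sg ** 2 for d, p, sg, s in zip(data, pred, sigma, shift))
    den = sum((s / sg) ** 2 for sg, s in zip(sigma, shift)) + 1.0 / sigma_alpha ** 2
    alpha_hat = num / den
    chi2 = sum(((d - p - alpha_hat * s) / sg) ** 2
               for d, p, sg, s in zip(data, pred, sigma, shift))
    chi2 += (alpha_hat / sigma_alpha) ** 2  # the pull term
    return chi2, alpha_hat
```

A tight constraint (small sigma_alpha) pins alpha near zero through the pull term, while a loose constraint lets the fit absorb more of the data-model discrepancy.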

6.2.2 Covariance matrix

The correlated systematic uncertainties in a model can also be represented by so-called covariance matrices. In such matrices, the statistical and uncorrelated systematic uncertainties for the model prediction make up the matrix diagonal, while the correlated uncertainties populate the off-diagonal matrix elements. Uncertainties related to different sources can be combined in one full covariance matrix, which we will simply denote here as Vtot. The covariance matrix approach is especially useful when detailed information on the systematic uncertainties is available from the experiment.

When working with a covariance matrix, the data, i.e. both measurement and prediction, have to be reshaped from a 2D histogram to a 1D matrix for practical use. For data that is binned in (Lreco, Ereco) in a total of nL = 5 by nE = 10 histogram bins, the related data matrix becomes a (1 × 50) matrix.

For a histogram bin with indices (i, j) that run over the bins in Lreco and Ereco, respectively, a new matrix position index is determined as

α = i · nE + j,  (6.5)

where nE represents the number of Ereco bins. This translation from histogram to matrix is illustrated in figure 6.2, for clarity. One can see that the matrix elements are grouped as one energy spectrum per baseline, running from shortest to longest baseline with increasing α.
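The mapping of equation 6.5 and its inverse can be sketched as follows (the function names are illustrative):

```python
nL, nE = 5, 10  # number of baseline and energy bins, as in the text

def flat_index(i, j, n_e=nE):
    """Histogram bin (i, j) -> matrix position alpha, equation 6.5."""
    return i * n_e + j

def bin_indices(alpha, n_e=nE):
    """Inverse mapping: matrix position alpha -> histogram bin (i, j)."""
    return divmod(alpha, n_e)
```

With nE = 10, bin (i = 2, j = 3) lands at α = 23; consecutive values of α within one baseline trace out one energy spectrum, as described above.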

Since the uncertainties on the model are evaluated per bin, and the covariance matrix also represents bin-to-bin correlations, this uncertainty matrix Vtot will be 50 × 50 in size. Each element Vαβ of Vtot represents the relation between bin α(i, j) and bin β(k, l) in the (Lreco, Ereco) data.

With the data and prediction rewritten to matrix format, the χ² definition is written as:

χ² = (D − P)^T Vtot^{-1} (D − P)  (6.6)

  = Σ_{α}^{N} Σ_{β}^{N} (Dα − Pα) (Vtot^{-1})αβ (Dβ − Pβ) .  (6.7)
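Equation 6.6 can be evaluated without explicitly forming Vtot^{-1}: solving Vtot·x = (D − P) for x and taking the scalar product with (D − P) gives the same number and is numerically more stable. A small self-contained sketch (in practice one would use a library routine such as numpy.linalg.solve):

```python
def solve(V, b):
    """Solve V x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] + [bi] for row, bi in zip(V, b)]  # augmented matrix [V | b]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (A[r][n] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

def cov_chi2(D, P, V):
    """chi2 = (D - P)^T V^{-1} (D - P), equation 6.6, without forming V^{-1}."""
    r = [d - p for d, p in zip(D, P)]
    return sum(ri * xi for ri, xi in zip(r, solve(V, r)))
```

For a purely diagonal Vtot this reduces to the simple χ² of equation 6.2, summed over the flattened bins.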

6.2.3 Nuisance parameters versus covariance matrix

Although it is not straightforward to derive, the use of nuisance parameters or of covariance matrices in the χ² formalism is known to be fully equivalent. This equivalence between both approaches can be derived analytically, e.g. as demonstrated by Fogli et al. in [136]. We should note that, although this assertion of equivalence holds for most realistic cases, where the systematic errors are not extremely large compared to the statistical ones, there are exceptional cases for which this assumption is incorrect [137].
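The equivalence is easy to verify numerically for the simple case of a single linear nuisance parameter, where the induced covariance is Vtot = diag(σi²) + σα² s sᵀ and the Sherman-Morrison identity gives the inverse in closed form. The sketch below is our own illustration under this linear-response assumption; r denotes the residual D − P:

```python
def profiled_pull_chi2(r, sigma, s, sa):
    """Equation 6.4 with P(alpha) = P + alpha * s, minimised over alpha."""
    num = sum(ri * si / sg ** 2 for ri, si, sg in zip(r, s, sigma))
    den = sum((si / sg) ** 2 for si, sg in zip(s, sigma)) + 1.0 / sa ** 2
    a = num / den  # analytic minimum of the chi2 in alpha
    return sum(((ri - a * si) / sg) ** 2 for ri, si, sg in zip(r, s, sigma)) + (a / sa) ** 2

def covariance_chi2(r, sigma, s, sa):
    """Equation 6.6 with V = diag(sigma^2) + sa^2 * s s^T,
    inverted via the Sherman-Morrison identity."""
    base = sum((ri / sg) ** 2 for ri, sg in zip(r, sigma))
    num = sum(ri * si / sg ** 2 for ri, si, sg in zip(r, s, sigma))
    den = 1.0 + sa ** 2 * sum((si / sg) ** 2 for si, sg in zip(s, sigma))
    return base - sa ** 2 * num ** 2 / den
```

Evaluating both functions on the same inputs returns identical χ² values, in line with the analytic derivation of [136].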

The choice between both approaches is mostly driven by the practical implementation and use. In particular, it is often a trade-off between the size


Figure 6.2: Scheme of the data and covariance matrix binning convention, illustrated for a schematic case with nL = nE = 5: histogram bin (i = 2, j = 3) maps to matrix position α = i · nE + j = 2 · 5 + 3 = 13.

of the set of measurements, in this case the total number of bins in the fit, N, and the number of systematic uncertainty contributors, K.

For the covariance matrix approach, the inversion of an N × N matrix is required, which becomes increasingly computationally demanding for larger values of N. For the pull terms method, on the other hand, the complexity of the minimisation procedure increases with K, and for K ≫ N the covariance matrix approach becomes more efficient.

A mixed approach is also possible, where one can exploit good knowledge of some uncertainties through the use of a related covariance matrix, while including one or more nuisance parameters for other systematics.

In this thesis, we choose to use covariance matrices for the representation of the systematic uncertainties.