Estimation of Uncertainties

6.1 Uncertainty Propagation to the Cross-Section Results
All uncertainties are calculated numerically by throwing toys, which generate pseudo-experiments that go through the unfolding framework. To evaluate the effect of a systematic uncertainty, the extracted cross section must be recalculated for each toy of the systematic parameters. For the data statistical uncertainty only the actual data selection is varied. For all the other error sources the MC selection (truth and reconstructed) and the flux are varied while the actual data selection is kept unaltered. This means that the unfolding matrix (Section 4.2), efficiency, purity, and flux may change depending on which error source is being propagated. Therefore, each pseudo-experiment corresponds to an alternative hypothesis and gives a different cross-section result.
For each independent error source, a covariance matrix is built across the bins in which the cross-section measurement is made. The covariance for bins (i, j) is defined as

\[
\mathrm{Cov}_s(i,j) = \frac{1}{N_s} \sum_{n=1}^{N_s} \left(\sigma^s_{i,n} - \overline{\sigma^s_i}\right)\left(\sigma^s_{j,n} - \overline{\sigma^s_j}\right), \tag{6.1}
\]

where \(\sigma^s_{i,n}\) and \(\sigma^s_{j,n}\) are the measured cross-section values in bins i and j for throw n of source s, \(N_s\) is the total number of throws, and the overline denotes the sample mean across all throws.
When sources are independent, these covariance matrices can be added to form the final overall covariance matrix, which is then used to extract the overall uncertainty across bins. The diagonal of the covariance matrix gives the variance for each bin.
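The covariance construction above can be sketched as follows. This is a minimal illustration with invented throw values; in practice `throws` would hold the cross-section result of every pseudo-experiment for one error source:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented cross-section results: N_s throws x n_bins, for one error source.
n_throws, n_bins = 500, 4
throws = rng.normal(loc=1.0, scale=0.05, size=(n_throws, n_bins))

# Eq. (6.1): covariance of bins (i, j) across throws, relative to the
# sample mean of the throws.
mean = throws.mean(axis=0)            # sample mean per bin
dev = throws - mean
cov = dev.T @ dev / n_throws          # Cov_s(i, j)

# Independent sources simply add: the total covariance is the sum of the
# per-source matrices (here two identical sources, purely for illustration).
cov_total = cov + cov
sigma = np.sqrt(np.diag(cov_total))   # per-bin uncertainty
```

The diagonal of `cov_total` gives the per-bin variances, as described in the text.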
The cross-section results of this analysis are given by the formulas derived in Section 4.3.2. Eq. (4.19) is evaluated for each throw to build the covariance matrix. This means that the numbers of events in FGD1 and in FGD2 in each bin, as they appear in the formulas, are evaluated at each throw, preserving the correlations between the FGDs in the propagation.
Different runs may have different flux properties (Section 5.1), so they need to be handled separately when measuring the cross section. Nevertheless, all uncertainties have to be treated as fully correlated across runs, which is achieved by ensuring identical seeding for all throws.
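The identical-seeding trick can be illustrated as below. This is only a sketch: the run-dependent flux values and the 10% throw width are placeholders, not the actual analysis numbers:

```python
import numpy as np

def throw_flux(run_flux_nominal, throw_index):
    # Seeding by throw index only (not by run) makes throw n apply the same
    # random fluctuation in every run, i.e. fully correlated across runs.
    rng = np.random.default_rng(seed=throw_index)
    shift = rng.normal(0.0, 0.1)       # one shared fractional shift per throw
    return run_flux_nominal * (1.0 + shift)

# Two runs with different nominal fluxes (placeholder numbers).
run1 = [throw_flux(1.00, n) for n in range(5)]
run2 = [throw_flux(1.10, n) for n in range(5)]

# The fractional variation is identical throw by throw, so the run-to-run
# ratio stays fixed at the ratio of the nominals.
ratios = [a / b for a, b in zip(run1, run2)]
```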
Uncertainties can be categorised into five categories, assumed to be independent:¹

• statistical, for data and for MC
• detector
• flux
• theory:
  – neutrino interaction model
  – final state interaction model (“FSI”)
The way the uncertainties are propagated onto the cross-section results differs somewhat among these error categories. Statistical uncertainties are evaluated using 10000 pseudo-experiments for both data and MC, with Poisson variations. Propagation of systematics is much more CPU intensive, so a smaller number of throws, 500, was used.
6.1.1 Propagation of Statistical Errors
While all the other systematic uncertainties are treated on an event-by-event basis, the statistical uncertainties are treated on a bin-by-bin basis. All the bins are assumed independent of each other and are smeared under the Poisson assumption (σ = √N). Statistical errors affect both MC and data in the same way. To evaluate the data statistical uncertainties, each pseudo-experiment varies only the data reconstructed distribution. To evaluate the MC statistical uncertainties, each pseudo-experiment varies the MC reconstructed and truth distributions, implying also a new unfolding matrix, efficiency, and background in each throw. The selections in FGD1 and in FGD2 are designed such that an event can pass only one of them, ensuring the two samples are statistically uncorrelated.
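A bin-by-bin Poisson throw of the reconstructed distribution might look like the sketch below; the bin contents are invented, and for a data-statistics throw the MC truth and reconstructed distributions would stay fixed:

```python
import numpy as np

rng = np.random.default_rng(42)

# Reconstructed data distribution (invented bin contents).
n_data = np.array([1200, 800, 450, 150])

def throw_data(counts, rng):
    # One pseudo-experiment: each bin smeared independently under the
    # Poisson assumption.
    return rng.poisson(counts)

toys = np.array([throw_data(n_data, rng) for _ in range(10000)])

# The per-bin spread approaches sqrt(N), as expected for Poisson statistics.
spread = toys.std(axis=0)
expected = np.sqrt(n_data)
```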
¹ In general, if control samples are used to constrain the background, systematic errors from different sources would become correlated and could not be thrown independently. However, in this analysis the background is relatively small (Section 5.2.2) and control samples are not used.
In an analysis which performs the background subtraction (Section 7.4), the MC background and the measured data are independent; thus, when propagating the data statistical uncertainties, the subtracted MC background is kept constant over throws. This means that the variation of the data distribution is the same with or without background subtraction:

\[
\mathrm{Var}[N_{\mathrm{data}} - N_{\mathrm{background}}] = \mathrm{Var}[N_{\mathrm{data}}] = \sigma^2[N_{\mathrm{data}}] \tag{6.2}
\]

This implies that the background-subtracted result carries a larger fractional uncertainty.
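The equality in Eq. (6.2), and the resulting growth of the fractional error when a constant background is subtracted, can be checked numerically. The bin contents below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

n_data, n_bkg = 1000.0, 200.0           # invented bin contents
toys = rng.poisson(n_data, size=200000).astype(float)

# The MC background is constant over throws, so subtracting it does not
# change the variance: Var[N_data - N_bkg] = Var[N_data].
var_raw = toys.var()
var_sub = (toys - n_bkg).var()

# But the fractional error grows, because the central value shrinks
# while the absolute spread stays the same.
frac_raw = np.sqrt(var_raw) / toys.mean()
frac_sub = np.sqrt(var_sub) / (toys.mean() - n_bkg)
```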
In an analysis which corrects the data distribution by the purity instead of subtracting the background, N_data is simply scaled, which preserves the fractional error.
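That a purity correction preserves the fractional error can be seen directly, since scaling by a constant scales the mean and the spread alike (the purity value below is invented):

```python
import numpy as np

rng = np.random.default_rng(2)

purity = 0.85                              # invented purity value
toys = rng.poisson(1000, size=100000).astype(float)

corrected = purity * toys                  # N_data simply scaled

# Both the spread and the mean scale by the same constant, so the
# fractional error is unchanged.
frac_before = toys.std() / toys.mean()
frac_after = corrected.std() / corrected.mean()
```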
Throwing toys, instead of calculating the statistical uncertainties analytically, guarantees that the FGD subtraction (Eq. (4.17)) and the water-to-scintillator ratio (Eq. (4.19)) are handled properly together with the background subtraction and the unfolding. Since the data statistical uncertainty is the largest in this analysis, it is worth giving an estimate.
The variance of the cross-section ratio R defined in Eq. (4.19) is given in Eq. (6.3). Choosing a simple case where there are exactly the same number of events in FGD1 and in FGD2 leads to the simplified expression in Eq. (6.5), where 0.46 is the ratio between the numbers of targets \(T^{\mathrm{water}}_{\mathrm{FGD2}}\) and \(T_{\mathrm{FGD1}}\) calculated in Section 4.3.3, and 1.011 is the flux normalisation ratio estimated in Eq. (4.20).
In reality FGD1 and FGD2 do not have exactly the same number of events; nevertheless, from Eq. (6.5) we can estimate that for a ratio R ∼ 1, in a bin with N̂_k = 2000 events the fractional error should be around 7%, and for N̂ = 20000 events the fractional error should be around 2.2%. Because of Eq. (6.3), the amount of background increases these values.
6.1.2 Propagation of Detector Errors
All detector uncertainties are treated on an event-by-event basis, some of them via reweighting (“weight systematics”), others by varying some kinematic variables and redoing the selection (“variation systematics”). Both types of systematics can affect the MC by altering the unfolding matrix, the efficiency, and the background. All detector systematics are thrown simultaneously, allowing correlations among them. The individual systematics are discussed in detail in Section 6.2.
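The two detector-systematic types can be contrasted with the sketch below. The event fields, the selection cut, and the sizes of the shifts are all hypothetical, standing in for the real selection and systematic sources:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical events: a reconstructed momentum and a per-event weight.
n_events = 10000
p_reco = rng.exponential(600.0, size=n_events)   # invented spectrum
weights = np.ones(n_events)

def select(p):
    # Toy cut standing in for the real event selection.
    return p > 450.0

n_nominal = weights[select(p_reco)].sum()

# "Weight systematic": the selection is unchanged; each selected event is
# reweighted (here by an invented flat 2% shift).
w_thrown = weights * 1.02
n_weighted = w_thrown[select(p_reco)].sum()

# "Variation systematic": a kinematic variable is varied (here an invented
# 1% momentum-scale shift) and the selection is redone, so events can
# migrate in or out of the sample.
p_thrown = p_reco * 1.01
n_varied = weights[select(p_thrown)].sum()
```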
The central value of the final cross-section results includes some cross-section model corrections (“NIWG tuning”, discussed in Section 6.4.1), which are not taken into account when the detector systematics are propagated (owing to software limitations). On the other hand, the covariance matrix of Eq. (6.1) associated with the detector uncertainties is evaluated using the average of the detector throws as a reference. This implies that the variance in each diagonal bin is relative to the average of the throws, not to the final result. Nevertheless, the fractional error is preserved, so it is easy to extrapolate the absolute error on the tuned final result.
6.1.3 Propagation of Flux and Theory Errors
Flux and cross-section model errors are all handled by altering a set of parameters (or dials), discussed in Section 6.3 and Section 6.4 respectively. Correlations between parameters are taken into account using the covariance matrix provided by the T2K NIWG group. Gaussian throws are performed using this covariance matrix across the three groups (flux, neutrino interaction model, and FSI model) via the Cholesky decomposition method: for each group, the parameters within that group are varied simultaneously while the other parameters are kept at their nominal values. A reweighting procedure is then run across all the events to generate event-by-event weights for each value of each altered parameter.
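Correlated Gaussian throws via Cholesky decomposition can be sketched as follows; the 3×3 covariance is a toy example, not the actual NIWG matrix:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy covariance for three dials of one group (not the actual NIWG matrix).
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
nominal = np.array([1.0, 1.0, 1.0])

L = np.linalg.cholesky(cov)          # cov = L @ L.T

def throw_dials(n_throws):
    # x = nominal + L z with z ~ N(0, I), so Cov[x] = L L.T = cov:
    # all dials in the group are varied simultaneously and correlated.
    z = rng.standard_normal((n_throws, len(nominal)))
    return nominal + z @ L.T

dials = throw_dials(200000)
cov_sample = np.cov(dials, rowvar=False)   # recovers the input covariance
```

Each row of `dials` would then feed the reweighting procedure to produce one event-by-event weight set per throw.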