Modelling of the prompt-photon contamination in the hadron-fake template 104

5.5 Modelling of uncertainties


[Figure 5.12, two panels: P(p^iso_T | α, E_T(γ)) in arbitrary units (logarithmic scale, 10^-4 to 10^-2) as a function of p^iso_T [GeV] (0 to 20) and α (-3 to 4). Left panel: E_T(γ) up variation. Right panel: E_T(γ) down variation.]

Figure 5.12: Parametrisation of T_bkg^{data, nom} as a function of α and p^iso_T, modelling the E_T(γ) dependence. Upward (downward) variations with respect to the E_T-independent T_bkg^{data, nom} are shown on the left (right).

In the differential measurement, each nominal template is calibrated to its corresponding E_T(γ) weight (see Tab. 5.1) by shifting the nominal value of the parameter α^{down,up}_{E_T(γ)} accordingly. In the inclusive measurement, no E_T dependence of the templates was used, and the corresponding uncertainty was found to be small.

5.5.2 Modelling of the prompt-photon contamination in the hadron-fake template

The hadron-fake template describes the binned probability of a jet being mis-reconstructed as a photon. While the nominal fake background template is extracted using a data-based procedure, the residual contamination of true prompt photons inside the template is estimated by combining data-based and simulation-based templates. To avoid a bin-by-bin dependence on simulation in the nominal likelihood fit (used to compute the cross section), the fact that prompt photons are distributed according to the signal template T_sig^{data,γ} is used. The corrected fake template T_bkg^corr, which takes the prompt-photon contamination into account, can thus be parametrised as

T_{bkg}^{corr}(p_T^{iso} | N_b^{fake}) = \frac{1}{1 - \alpha_{fake} \cdot f} \left[ T_{bkg}^{data,nom}(p_T^{iso} | N_b^{fake}) - \alpha_{fake} \cdot f \times T_{sig}^{data,\gamma}(p_T^{iso} | N_b^{fake}) \right]    (5.17)

where N_b^fake is the number of hadron-fakes, T_bkg^{data, nom} is the data-based nominal (i.e. uncorrected) background template, f is the prompt-photon contamination and α_fake is a scale factor modelling the strength of the correction. In the case of no correction being applied, α_fake = 0 and hence T_bkg^corr ≡ T_bkg^{data, nom}. As usual, the term 1/(1 − α_fake·f) is just a normalisation factor.
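The correction of Eq. 5.17 can be illustrated with a minimal numerical sketch. The function name `corrected_fake_template`, the five-bin templates and the value f = 0.10 are hypothetical and chosen only for illustration; they are not the templates or the contamination used in the measurement. The sketch assumes that both input templates are normalised to unit sum.

```python
import numpy as np


def corrected_fake_template(t_bkg_nom, t_sig, f, alpha_fake=1.0):
    """Apply Eq. 5.17 bin by bin (hypothetical helper, for illustration only).

    t_bkg_nom  : nominal data-based background template, normalised to unit sum
    t_sig      : signal (prompt-photon) template, normalised to unit sum
    f          : prompt-photon contamination of the background template
    alpha_fake : strength of the correction (0 means no correction)
    """
    t_bkg_nom = np.asarray(t_bkg_nom, dtype=float)
    t_sig = np.asarray(t_sig, dtype=float)
    scale = alpha_fake * f
    if scale >= 1.0:
        raise ValueError("alpha_fake * f must be smaller than 1")
    # Subtract the prompt-photon part, then renormalise with 1 / (1 - alpha_fake * f)
    return (t_bkg_nom - scale * t_sig) / (1.0 - scale)


# Hypothetical five-bin templates in p_T^iso (illustrative numbers only)
t_sig = np.array([0.60, 0.25, 0.10, 0.04, 0.01])
t_bkg_nom = np.array([0.30, 0.25, 0.20, 0.15, 0.10])
f = 0.10  # hypothetical contamination, not the value from the analysis

for alpha_fake in (0.0, 1.0, 2.0):
    t_corr = corrected_fake_template(t_bkg_nom, t_sig, f, alpha_fake)
    print(alpha_fake, np.round(t_corr, 4), round(float(t_corr.sum()), 6))
```

In this toy example, α_fake = 0 returns the nominal template unchanged, and the corrected templates keep unit normalisation for any α_fake, which is the role of the 1/(1 − α_fake·f) factor.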

Figure 5.13 shows the effect of different strength factors on the hadron-fake background template. The case α_fake = 0 corresponds to not applying any correction (i.e. using the nominal, contaminated, background template T_bkg^{data, nom}), while α_fake = 1 corresponds to the usage of the corrected template T_bkg^corr.

Because of the partial model dependency of f, the strength factor α_fake is left free to float and is determined by the data in the likelihood fit. This strength factor is treated as a nuisance parameter

[Figure 5.13 plot: P(p^iso_T | γ) / GeV (0 to 0.5) as a function of p^iso_T [GeV] (0 to 20), with curves for α_fake = 0, α_fake = 1 and α_fake = 2. ATLAS, ∫L dt = 4.59 fb^-1, √s = 7 TeV.]

Figure 5.13: Modelling of the hadron-fake template as a function of the fraction of true photons in the background control region. The strength factor α_fake is varied from α_fake = 0 to α_fake = 2. The nominal value corresponds to α_fake = 1 (i.e. the corrected data-driven template T_bkg^corr), while the templates for α_fake = 0 and α_fake = 2 correspond respectively to the ±1σ variation of f.

and, as described in Sec. 5.5.4, it is constrained by a Gaussian pdf whose width is set to 27%, corresponding to the maximum uncertainty estimated on f (see Eq. 5.14).
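How such a Gaussian constraint enters a fit can be sketched as a penalty term added to the negative log-likelihood. This is only a sketch under stated assumptions: the central value of 1 is taken from the nominal α_fake = 1 quoted in the caption of Fig. 5.13, the 27% is interpreted as an absolute width of 0.27 on α_fake, and the function name `gaussian_constraint_nll` is hypothetical. The actual implementation in the analysis may differ.

```python
import math


def gaussian_constraint_nll(alpha, alpha_nominal=1.0, width=0.27):
    """Negative log of a Gaussian constraint pdf evaluated at alpha.

    alpha_nominal and width are assumptions made for this sketch
    (nominal strength 1, absolute width 0.27).
    """
    z = (alpha - alpha_nominal) / width
    return 0.5 * z * z + math.log(width * math.sqrt(2.0 * math.pi))


# Cost, in units of negative log-likelihood, of pulling alpha_fake away
# from its nominal value by one and by two widths.
nominal = gaussian_constraint_nll(1.0)
for alpha in (1.0, 1.27, 1.54):
    print(alpha, gaussian_constraint_nll(alpha) - nominal)
```

A pull of one width costs 0.5 units of negative log-likelihood and a pull of two widths costs 2 units, so the data can move α_fake away from its nominal value only if the gain in the rest of the likelihood outweighs this penalty.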

5.5.3 Choice of probability density functions

The pdf chosen to describe the distribution of the nuisance parameters is typically the normal distribution N:

N(x | \hat{x}, \sigma_x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\left[ -\frac{(x - \hat{x})^2}{2\sigma_x^2} \right]    (5.18)

for an observable x with mean x̂ and variance σ_x². This is motivated by the central limit theorem [136], according to which the estimator of the mean of any well-defined pdf (with finite variance) is normally distributed in the limit of large samples.

Re-using the example of the integrated luminosity, if the measurement were to be repeated with several van der Meer scans, then the results would be distributed according to the normal distribution.
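The central limit theorem argument can be checked with a toy experiment. The sketch below is purely illustrative and unrelated to real luminosity data: it draws repeated "measurements" from a uniform distribution (a deliberately non-normal choice made here for illustration), averages them, and compares the spread of the averages with the normal expectation. The sample sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n_repeats = 20000   # number of repeated toy experiments
n_per_mean = 50     # individual values averaged in each experiment

# Uniform on [0, 1): mean 0.5 and variance 1/12, clearly not a normal distribution
samples = rng.uniform(0.0, 1.0, size=(n_repeats, n_per_mean))
means = samples.mean(axis=1)

# Central limit theorem expectation for the spread of the mean
expected_sigma = np.sqrt(1.0 / 12.0 / n_per_mean)

# For a normal distribution about 68.3% of the values lie within one sigma
z = (means - 0.5) / expected_sigma
frac_within_1sigma = float(np.mean(np.abs(z) < 1.0))

print("observed spread of the mean:", means.std())
print("expected spread of the mean:", expected_sigma)
print("fraction within one sigma  :", frac_within_1sigma)
```

The observed spread should agree with the expected one within the statistical precision of the toy, and the fraction within one sigma should be close to the normal value of about 0.683, even though each individual value is uniformly distributed.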

Moreover, the majority of nuisance parameters are considered to have a well-defined pdf, so N is a valid approximation. For a small fraction of systematic components this is not obvious. One example is the parametrisation with a normal distribution of systematic uncertainties evaluated from two disconnected variations, such as the choice of the Monte Carlo (MC) generator or the choice of parton shower program (see Sec. 7.1). The argument in this case is that these sources are calculated from the two-point variation of a large number of parameters of different types, varied simultaneously, which may follow very different distributions; in some cases their pdf is therefore not clearly defined or is unphysical.

This mis-modelling can be resolved by the definition of a pdf which is continuous, with a flat top, and which falls rapidly to zero for values close to its variance:

B(x | \hat{x}, \sigma_x) = \frac{1}{2} \left\{ \tanh\left[ \frac{x - \hat{x}}{\sqrt{\sigma_x}} - 2 \right] - \tanh\left[ -\frac{x - \hat{x}}{\sqrt{\sigma_x}} - 2 \right] \right\}.    (5.19)

The variance σ_x of B(x|x̂, σ_x) corresponds to the estimated uncertainty on the nuisance parameter x associated with a two-point systematic variation. The pdf is approximately constant for (small) variations of x, and is therefore independent of the unphysical meaning of x. A comparison of B(x|x̂, σ_x) with N(x|x̂, σ_x) for equally increasing variance is shown in Fig. 5.14. It can be seen that, compared to the normal distribution, B(x|x̂, σ_x) has a near-constant probability value for x ≤ σ.

[Figure 5.14 plot: P(x | x̂, σ_x) in arbitrary units (0 to 1) as a function of x (-1 to 1), showing the Box pdf and the Normal pdf for variances σ = 4, 6, 8 and 9%.]

Figure 5.14: Comparison, for increasing variance, of the normal distribution and the square distribution. The black lines labelled "Box pdf" correspond to B(x|x̂, σ_x) and the red lines labelled "Normal pdf" correspond to N(x|x̂, σ_x).

Initially, the B(x|x̂, σ_x) pdf was included in the likelihood for nuisance parameters modelling systematic uncertainties estimated from variations whose true physical distribution is unknown (specifically the signal modelling systematic uncertainties: MC generator choice, parton shower, etc.; see Sec. 7.1). However, since similar uncertainties on x and on the cross section were observed with either B(x|x̂, σ_x) or N(x|x̂, σ_x) in the likelihood, it was finally chosen to model all uncertainties with Gaussian pdfs, as this eases the reproducibility (or future combination with other measurements) of the result.

From the document Enlightened Top Quark: measurements of the ttγ cross section and of its spectrum in transverse energy of the photon in the single lepton channel at √s = 7 TeV in 4.59 fb−1 of pp collision data collected with the ATLAS detector (pages 110-113)