
5.4 Template for hadrons misidentified as photons

5.4.3 Prompt-photon contamination

The templates determined in Sec. 5.4.2 are based on the requirement that at least one, but not all, of the shower-shape variables fail the tight identification criterion; prompt photons may therefore leak into the hadron-enriched control region (CR). This possible leakage (also referred to here as the prompt-photon contamination) may bias the definition of the pdf for the hadron-fakes.

Therefore, the probability of a photon to be identified as a hadron or a hadron decay product needs to be estimated. This section explains the method used for extracting this probability from data and the method used for correcting the templates obtained in Eq. 5.12 for this leakage.

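In order to extract the prompt-photon contamination in the hadron-fake template, a simple extended likelihood function $L_f$ is maximised:

\[
L_f = \frac{n_{\mathrm{tot}}^{N_f}\, e^{-n_{\mathrm{tot}}}}{N_f!} \times n_{\mathrm{tot}} \times \left[ \left(1 - f\hat{\theta}\right) T_{jj}^{\mathrm{MC}} + f\hat{\theta}\, T_{\mathrm{sig}}^{\mathrm{data},\gamma} \right] \times \frac{1}{\sqrt{2\pi\sigma_\theta^{2}}} \exp\left[ -\frac{(\theta - \hat{\theta})^{2}}{2\sigma_\theta^{2}} \right] \tag{5.13}
\]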

where $N_f$ is the total number of events observed in data within the hadron-fake background control region, $f$ is the fraction of prompt photons leaking into this region, $T_{jj}^{\mathrm{MC}}$ is a simulation-based background template modelling the probability of true hadron-fakes (i.e. without photon contamination) and $T_{\mathrm{sig}}^{\mathrm{data},\gamma}$ is the signal template of Eq. 5.6. The parameter $\hat{\theta}$ represents the added uncertainty on $f$ and is treated as a nuisance parameter in the determination of $f$; it is distributed according to a Gaussian pdf of mean $\theta = 1$ and width $\sigma_\theta$.
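The structure of Eq. 5.13 can be illustrated with a minimal binned sketch. It is not the analysis code: the binning, the choice of fixing $n_{\mathrm{tot}}$ to the observed yield, the grid scan with $\hat{\theta}$ held at 1, and all function and variable names are assumptions made for illustration, and the toy templates are invented numbers.

```python
import math


def neg_log_likelihood(f, theta_hat, counts, t_jj, t_sig, sigma_theta, theta0=1.0):
    """Binned -ln(L_f): Poisson yield term, template mixture, Gaussian constraint."""
    n_obs = sum(counts)
    if n_obs == 0:
        raise ValueError("the control region contains no events")
    n_tot = float(n_obs)  # assumption: expected yield fixed to the observed yield
    # Extended (Poisson) term: -ln[ n_tot^N * exp(-n_tot) / N! ]
    nll = n_tot - n_obs * math.log(n_tot) + math.lgamma(n_obs + 1)
    # Shape term: (1 - f*theta) * T_jj + f*theta * T_sig, evaluated bin by bin
    frac = f * theta_hat
    for n_i, b_i, s_i in zip(counts, t_jj, t_sig):
        p_i = (1.0 - frac) * b_i + frac * s_i
        if p_i <= 0.0:
            return math.inf
        nll -= n_i * math.log(p_i)
    # Gaussian constraint on theta_hat (mean theta0, width sigma_theta)
    pull = (theta_hat - theta0) / sigma_theta
    nll += 0.5 * pull ** 2 + 0.5 * math.log(2.0 * math.pi * sigma_theta ** 2)
    return nll


def scan_fraction(counts, t_jj, t_sig, sigma_theta, n_steps=1000):
    """Grid scan of f in [0, 1] with theta_hat held at its nominal value of 1."""
    grid = [k / n_steps for k in range(n_steps + 1)]
    return min(
        grid,
        key=lambda f: neg_log_likelihood(f, 1.0, counts, t_jj, t_sig, sigma_theta),
    )


# Toy inputs (hypothetical): normalised templates and counts built from them
t_jj = [0.1, 0.2, 0.3, 0.4]
t_sig = [0.7, 0.2, 0.08, 0.02]
counts = [1360, 2000, 2868, 3772]
best_f = scan_fraction(counts, t_jj, t_sig, sigma_theta=0.27)
```

Because $f$ and $\hat{\theta}$ enter only through the product $f\hat{\theta}$, the Gaussian term is what keeps $\hat{\theta}$ near 1; a full fit would float both parameters rather than scan $f$ alone.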

Statistical model 98

Simulation-based templates

Dijet $(j, j')$ ensembles generated with PYTHIA at different $E_T(j)$ jet thresholds have been used (see Table 5.2). These simulations are based on the Leading-Order (LO) perturbative QCD matrix elements for the $pp \to jj'$ hard sub-processes, with initial- and final-state radiation included through a $p_T$-ordered parton-showering algorithm calculated in a leading-logarithmic approximation. The generated samples use an underlying-event model for multiple parton interactions and the Lund string model for hadronisation [133]. PYTHIA LO jet samples have been used in previous ATLAS analyses, e.g. for studies of multi-jet production with up to six jets in the final state [134].

Description                                        FE × σ [nb]
PYTHIA JF17 filtered dijet, ET(j) > 17 GeV         1.4 × 10^6
PYTHIA JF35 filtered dijet, ET(j) > 35 GeV         6.4 × 10^4
PYTHIA JF70 filtered dijet, ET(j) > 70 GeV         3.7 × 10^3

Table 5.2: PYTHIA jet samples used to extract the simulation-based background templates $T^{\mathrm{MC}}$ and $T_{jj}^{\mathrm{MC}}$. Dijet events are selected before detector simulation; the corresponding Filter Efficiency (FE) multiplied by the matrix-element cross section is shown.

The following procedure has been followed:

• First, the same selection as used for the data-based extraction of the hadron-fake templates is applied (see Sec. 5.4.1).

• Then, two different track-isolation templates, $T^{\mathrm{MC}}$ and $T_{jj}^{\mathrm{MC}}$, are obtained according to the following additional selections:

– Jet-photon selection $T^{\mathrm{MC}}$: events are required to have a photon candidate passing the tight identification criteria except for one of the four strip variables $F_{\mathrm{side}}$, $w_{s,3}$, $\Delta E$, $E_{\mathrm{ratio}}$.

– Jet-jet selection $T_{jj}^{\mathrm{MC}}$: information from the High Energy Monte Carlo Record (HepMC) [120] is used directly to discard events containing photons.

The $T^{\mathrm{MC}}$ and $T_{jj}^{\mathrm{MC}}$ templates therefore represent the probability, obtained from simulation, of hadron-fakes with and without the prompt-photon contamination, respectively. $T^{\mathrm{MC}}$ should be comparable to the nominal data-based template, since the contamination in the control region is unknown in data, whereas $T_{jj}^{\mathrm{MC}}$ corresponds to an ideal, non-contaminated hadron-fake template.
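The two selections above can be sketched as follows. This is only an illustration: the event record with keys `iso`, `one_strip_failed` and `truth_photon` is hypothetical, the bin edges are invented, and applying the jet-jet veto on top of the jet-photon candidate requirement is an assumption about how the two selections combine. Overflow is placed in the last bin, as stated for Fig. 5.10.

```python
from bisect import bisect_right


def build_template(iso_values, bin_edges):
    """Unit-normalised track-isolation histogram; the last bin holds overflow."""
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for x in iso_values:
        if x < bin_edges[0]:
            continue  # below the first edge: not part of the template
        idx = min(bisect_right(bin_edges, x) - 1, n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no events pass the selection")
    return [c / total for c in counts]


def jet_photon_selection(events):
    """Candidates failing exactly one strip variable (contaminated template)."""
    return [e["iso"] for e in events if e["one_strip_failed"]]


def jet_jet_selection(events):
    """Same candidates, with events containing generator-level photons removed."""
    return [
        e["iso"]
        for e in events
        if e["one_strip_failed"] and not e["truth_photon"]
    ]


# Hypothetical simulated events (isolation in GeV)
events = [
    {"iso": 0.5, "one_strip_failed": True, "truth_photon": True},
    {"iso": 0.7, "one_strip_failed": True, "truth_photon": False},
    {"iso": 4.0, "one_strip_failed": True, "truth_photon": False},
    {"iso": 25.0, "one_strip_failed": True, "truth_photon": False},
    {"iso": 2.0, "one_strip_failed": False, "truth_photon": False},
]
edges = [0.0, 2.0, 5.0, 20.0]
t_mc = build_template(jet_photon_selection(events), edges)
t_jj_mc = build_template(jet_jet_selection(events), edges)
```

In this toy, removing the single event with a generator-level photon shifts weight away from the low-isolation bin, which is the qualitative effect the two templates are meant to expose.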

[Figure 5.7, plot area. Top panel: nominal template $T_{\mathrm{bkg}}^{\mathrm{data}}$ and MC template (labelled $T_{j\gamma}^{\mathrm{MC}}$ in the legend), y-axis $P(p_T^{\mathrm{iso}}\,|\,\gamma)$ / GeV. Middle panel: Data / MC ratio with statistical uncertainty (×100) and total uncertainty. Bottom panel: normalised residuals, y-axis Significance [σ]. Common x-axis: $p_T^{\mathrm{iso}}$ [GeV], 0 to 20. Labels: $\int L\,dt = 4.59$ fb$^{-1}$, $\sqrt{s} = 7$ TeV.]

Figure 5.7: Comparison of the data-driven $T_{\mathrm{bkg}}^{\mathrm{data,nom}}$ and MC-based $T^{\mathrm{MC}}$ background templates (top), ratio of the two templates (middle) and normalised residuals (bottom). In the middle plot, the two templates disagree when only the statistical uncertainty is considered. Once an uncertainty of 27% is included (obtained from a $\chi^2$ test statistic using pseudo-experiments and indicated by the dashed area), the templates agree. The maximal deviation of the normalised residuals (bottom) is below $1\sigma$.


Determination of σθ

Without accounting for systematic uncertainties, the nominal data-based $T_{\mathrm{bkg}}^{\mathrm{data,nom}}$ and the simulation-based $T^{\mathrm{MC}}$ background templates do not agree (the statistical uncertainty, of $\mathcal{O}(10^{-4})$, is negligible in both cases). These two templates are shown in the upper plot of Fig. 5.7. The maximum difference is found to be about 18%, in the last $p_T^{\mathrm{iso}}$ bin.

In order to account for this data-to-simulation discrepancy, an uncertainty on the simulation-based template is extracted from a $\chi^2$ test statistic using pseudo-experiments. The amount of uncertainty on the overall normalisation, i.e. on the total number of events, is randomised and the $\chi^2$ between the $T_{\mathrm{bkg}}^{\mathrm{data,nom}}$ and $T^{\mathrm{MC}}$ templates is calculated. As shown in Fig. 5.8, the p-value reaches a plateau (p-value > 0.95) at an uncertainty of 27%. After inclusion of this additional uncertainty, both templates agree within $1\sigma$.
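A scan of this kind can be sketched as below. This is one possible reading of the procedure, not the analysis implementation: the per-bin Gaussian smearing of the simulated template, the way the relative uncertainty enters the $\chi^2$ denominator, the step size and the toy templates are all assumptions, and the function names are hypothetical.

```python
import math
import random


def chi2(a, b, sigma):
    """Chi-square between two binned templates for given per-bin uncertainties."""
    return sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, sigma))


def p_value(data, mc, rel_unc, stat=1e-4, n_toys=2000, seed=1):
    """Fraction of pseudo-experiments with chi2 at least as large as observed."""
    rng = random.Random(seed)
    # Statistical term plus an added relative uncertainty on the MC template
    sigma = [math.hypot(stat, rel_unc * m) for m in mc]
    observed = chi2(data, mc, sigma)
    n_above = 0
    for _ in range(n_toys):
        toy = [rng.gauss(m, s) for m, s in zip(mc, sigma)]
        if chi2(toy, mc, sigma) >= observed:
            n_above += 1
    return n_above / n_toys


def minimal_uncertainty(data, mc, threshold=0.95, step=0.01, max_unc=1.0):
    """Smallest scanned relative uncertainty whose p-value exceeds the threshold."""
    for k in range(1, int(round(max_unc / step)) + 1):
        rel_unc = k * step
        if p_value(data, mc, rel_unc) > threshold:
            return rel_unc
    return None


# Hypothetical three-bin templates that disagree at the 10-25% level
data_template = [0.5, 0.3, 0.2]
mc_template = [0.4, 0.35, 0.25]
needed = minimal_uncertainty(data_template, mc_template)
```

The fixed seed makes the scan reproducible; with a finite number of pseudo-experiments the threshold crossing carries a statistical uncertainty of its own, which a production study would need to quantify.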

[Figure 5.8, plot area. x-axis: Uncertainty [a.u.], 0 to 1. Curves from pseudo-experiments: p-value (axis 0 to 1) and $\chi^2$/ndf (axis 0 to 1.2). A dashed line marks the region where p-value > 0.95.]

Figure 5.8: $\chi^2$/ndf and p-value of the $T_{\mathrm{bkg}}^{\mathrm{data,nom}}$ and $T^{\mathrm{MC}}$ templates as a function of the background uncertainty, as obtained from pseudo-experiments. The dashed line indicates the minimal value for which the p-value > 0.95.

Extraction of the fraction f

The likelihood function $L_f$ of Eq. 5.13 is used to fit the data in the hadron-fake control region to extract the amount of signal (true prompt photons) leaking into the background template. The uncertainty $\sigma_\theta$ on the nuisance parameter $\hat{\theta}$ is taken as the 27% uncertainty on the simulation-based template, as explained previously. The fraction of prompt-photon contamination $f$ is derived from the maximisation of $L_f$ (Eq. 5.13).

Upper and lower limits on $f$ are extracted at a 68.3% Confidence Level (CL) by constructing the confidence belt with the Feldman-Cousins technique [125] using pseudo-experiments. The profile likelihood ratio is chosen as the ordering principle for the construction. Fig. 5.9 shows this interval, obtained from a set of $10^5$ pseudo-experiments. The result of the fit is shown in Fig. 5.10, from which $f$ is found to be:

\[
f = \left( 6.1^{+1.7}_{-0.9}\,(\mathrm{syst}) \right) \times 10^{-2}. \tag{5.14}
\]
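The Feldman-Cousins construction with likelihood-ratio ordering can be illustrated on a deliberately simplified toy model. The sketch below assumes that the estimate of $f$ is Gaussian-distributed around the true value with a known resolution `sigma` and that $f$ is physically bounded to [0, 1]. That is an assumption made for illustration, not the profile-likelihood fit of Eq. 5.13, and all names and numerical inputs are hypothetical.

```python
import math
import random


def ordering_ratio(x, f_true, sigma):
    """L(x | f_true) / L(x | f_best), with f_best restricted to [0, 1]."""
    f_best = min(max(x, 0.0), 1.0)
    return math.exp(-0.5 * ((x - f_true) ** 2 - (x - f_best) ** 2) / sigma ** 2)


def acceptance_interval(f_true, sigma, cl=0.683, n_toys=2000, seed=7):
    """Range of estimates accepted for one true value, from pseudo-experiments."""
    rng = random.Random(seed)
    toys = [rng.gauss(f_true, sigma) for _ in range(n_toys)]
    # Rank pseudo-experiments by decreasing ordering ratio, keep the top CL fraction
    toys.sort(key=lambda x: ordering_ratio(x, f_true, sigma), reverse=True)
    accepted = toys[: int(math.ceil(cl * n_toys))]
    return min(accepted), max(accepted)


def confidence_interval(f_obs, sigma, f_grid, cl=0.683):
    """True values whose acceptance interval contains the observed estimate."""
    inside = []
    for f_true in f_grid:
        low, high = acceptance_interval(f_true, sigma, cl)
        if low <= f_obs <= high:
            inside.append(f_true)
    return (min(inside), max(inside)) if inside else None


# Hypothetical inputs: observed estimate, assumed resolution, scan grid in f
grid = [k * 0.002 for k in range(101)]
limits = confidence_interval(0.06, 0.01, grid)
```

The set of acceptance intervals over the grid forms the confidence belt; reading it at the observed value gives the lower and upper limits. Far from the physical boundary the result reduces to a central interval, and the ordering only changes the limits when the estimate approaches 0 or 1.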

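[Figure 5.9, plot area. Left panel: confidence belt at 68.3% C.L. in the plane of the estimated versus true fraction $f$ (both axes 0 to 1), with the observed $f_\gamma$ and the F.C. lower and upper limits indicated. Right panel: distribution of the estimates of $f_\gamma$ (x-axis 0 to 0.12, y-axis entries per 0.001, 0 to 4000), with the 68.3% CL interval indicated. Labels: $\int L\,dt = 4.59$ fb$^{-1}$, $\sqrt{s} = 7$ TeV.]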

Figure 5.9: Extraction of the upper and lower limits on the fraction $f$. The Feldman-Cousins (F.C.) confidence belt, as obtained from pseudo-experiments, is shown on the left. The horizontal line (observed $f$) corresponds to the maximum of the likelihood evaluated on data, while the dotted horizontal lines correspond to the upper and lower limits at the 68.3% Confidence Level (CL). The distribution of the estimates of $f$ is shown on the right. The points represent the estimated values of $f$ using pseudo-experiments; the dashed area corresponds to the interval covering a 68.3% CL. The dotted line represents the best-fit value of $f$.

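[Figure 5.10, plot area. Top panel: nominal template $T_{\mathrm{bkg}}^{\mathrm{data}}$ (markers), MC template $(1 - f_\gamma) \times T_{jj}^{\mathrm{MC}}$ and prompt-$\gamma$ contamination $f_\gamma \times T_{\mathrm{sig}}^{\mathrm{data}}$ (stacked histograms), and background template total uncertainty; y-axis $P(p_T^{\mathrm{iso}}\,|\,\gamma)$ / GeV. Bottom panel: Residual [σ]. Common x-axis: $p_T^{\mathrm{iso}}$ [GeV], 0 to 20. Labels: ATLAS, $\int L\,dt = 4.59$ fb$^{-1}$, $\sqrt{s} = 7$ TeV.]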

Figure 5.10: Track-isolation background template distribution after maximisation of the likelihood $L_f$ defined in Eq. 5.13 (top) and normalised residuals (bottom). The markers correspond to the nominal hadron background template. The stacked filled histograms represent the fraction of prompt photons in the hadron-fake control region (obtained as $f \times T_{\mathrm{sig}}^{\mathrm{data}}$) and the fraction of hadron-fakes (obtained from the simulation-based template as $(1 - f) \times T_{jj}^{\mathrm{MC}}$), as given by the fit. The normalised residuals, shown in the bottom plot, are defined as the difference between the "Nominal template" and the sum of $(1 - f) \times T_{jj}^{\mathrm{MC}}$ and $f \times T_{\mathrm{sig}}^{\mathrm{data}}$, divided by the total uncertainty $\sigma_\theta$. The last bin contains any overflow [132].
