

8.3 Top-quark charge discrimination using neural networks

8.3.5 Optimised NN architecture and comparisons with alternative approaches

sensitivity estimate, as it reaches almost the same accuracy as the setup including all observables while using a much smaller set of input variables (17 vs 26). The optimised values of the hyper-parameters of the trained NN are listed in Table 8.10. The comparison of the NN discriminant for the training and testing sets is shown in Fig. 8.6, and the evolution of the training and validation loss for both cross-trained NNs in Fig. 8.8.

The validation loss is found to be systematically smaller than the training loss, which is not unexpected. Firstly, the training loss is evaluated after each batch within an epoch and the average over the epoch is quoted, whereas the validation loss is calculated only after the full epoch. The training loss average thus includes the higher loss values from early batches in the epoch, which increases its value. Additionally, the regularisation term is added only to the training loss function and not to the validation loss, further increasing the training loss relative to the validation loss. Nevertheless, a cross-check was performed by randomly swapping the four samples in Table 8.7 and repeating the NN training. The same trend in training and validation loss was reproduced, ruling out the possibility that jets with outlier properties are unevenly represented across the four samples.

In Fig. 8.7, the ROC curves for each of the cross-trained NNs are shown; the training-set and testing-set accuracies are very similar. A difference in accuracy of approximately 0.1 % is observed between the two cross-trained NNs. This is attributed to slightly different convergence of the NN training and validation loss due to the early stopping, and not to overtraining, since both the training and validation loss in Fig. 8.8b show a similar trend.

8. Prospect of charge asymmetry measurement in boosted all-hadronic $t\bar{t}$ events

Fig. 8.4: Linear correlations for the NN input variables and the discriminant for top-quark jets, expressed in %.

Fig. 8.5: Linear correlations for the NN input variables and the discriminant for top anti-quark jets, expressed in %.
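The two reasons given for the training loss exceeding the validation loss can be illustrated with a toy calculation. The sketch below is not the training code used for this study: the per-batch losses, the weights and the function name are invented for illustration, the penalty is assumed to take the common form $\lambda \sum_i w_i^2$, and only the regularisation strength $5 \times 10^{-5}$ is taken from Table 8.10.

```python
# Toy illustration (hypothetical numbers): why the epoch-averaged training
# loss can exceed the validation loss evaluated at the end of the epoch.

L2_STRENGTH = 5e-5  # value quoted in Table 8.10


def epoch_training_loss(batch_losses, weights, l2_strength=L2_STRENGTH):
    """Mean of the per-batch losses plus an L2 penalty on the weights."""
    data_term = sum(batch_losses) / len(batch_losses)
    penalty = l2_strength * sum(w * w for w in weights)
    return data_term + penalty


# Hypothetical per-batch losses that fall steadily within one epoch.
batch_losses = [0.560, 0.555, 0.550, 0.546, 0.543]
# Hypothetical network weights (a real NN has many more).
weights = [0.5, -1.2, 0.8, 2.0]

training_loss = epoch_training_loss(batch_losses, weights)
# Stand-in for the validation loss: the data term reached by the final
# batch of the epoch, with no regularisation penalty added.
validation_like_loss = batch_losses[-1]

print(f"training loss (epoch average + L2): {training_loss:.4f}")
print(f"validation-like loss (end of epoch): {validation_like_loss:.4f}")
```

In this toy case the averaged training loss is larger both because it includes the early, higher batch losses and because of the added penalty, even though the model state at the end of the epoch is the same.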

Table 8.10: The chosen hyper-parameters and the architecture of the optimised NN.

Hyper-parameter                           Value
Optimiser                                 Adam
Number of hidden layers                   2
Number of neurons in the hidden layers    15, 10
Hidden layer activation function          ELU
Output layer activation function          sigmoid
Learning rate                             $10^{-4}$
Batch size                                100
L2 regularisation strength                $5 \times 10^{-5}$
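To make the architecture in Table 8.10 concrete, the following is a minimal sketch of its forward pass in plain Python. It is an illustration only, not the implementation used in this study: the number of inputs (17) is taken from the reduced variable set mentioned in the text, a single output node is assumed for the sigmoid output, the ELU parameter is assumed to be 1, and the weights are random placeholders rather than trained values.

```python
import math
import random


def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Placeholder weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer, activation):
    """Fully connected layer: activation(W x + b) for each output node."""
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


N_INPUTS = 17              # assumed: the 17-variable setup described in the text
LAYER_SIZES = [15, 10, 1]  # two hidden layers (Table 8.10) plus one assumed output node

rng = random.Random(0)
layers = []
n_in = N_INPUTS
for n_out in LAYER_SIZES:
    layers.append(init_layer(n_in, n_out, rng))
    n_in = n_out


def nn_score(x):
    """NN discriminant in (0, 1) for one jet with input values x."""
    h = dense(x, layers[0], elu)
    h = dense(h, layers[1], elu)
    return dense(h, layers[2], sigmoid)[0]


n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
score = nn_score([0.0] * N_INPUTS)
print(n_parameters, score)
```

Under these assumptions the network has 441 trainable parameters (270 + 160 + 11), which shows how compact the chosen architecture is.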

[Figure 8.6, panels (a) and (b): distributions of the NN discriminant (axis ticks from 0.0 to 1.0, arbitrary units) for large-R jets (t) and large-R jets ($\bar{t}$), shown for the training and testing sets.]

Fig. 8.6: The comparison of the NN discriminant for the training and testing set, for the two cross-training configurations in (a) and (b) respectively, as defined in Table 8.7.

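[Figure 8.7, panels (a) and (b): ROC curves for the training and testing sets. Plot annotations, in order of appearance: (top) training accuracy = 72.89 %, testing accuracy = 72.86 %; (top) training accuracy = 72.96 %, testing accuracy = 72.96 %; training AUC = 0.808, testing AUC = 0.808.]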

Fig. 8.7: The comparison of ROC curves for the training and testing set, for the two cross-training configurations in (a) and (b) respectively.
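For readers unfamiliar with the quantities quoted in Fig. 8.7, the sketch below shows one way to compute a classification accuracy and the area under the ROC curve from a list of NN scores. It is not the evaluation code of this study: the scores are hypothetical, the 0.5 threshold for the accuracy is an assumption, and the label convention (1 for top-quark jets, 0 for top anti-quark jets) is chosen for illustration. The AUC is computed through its pairwise interpretation, which is simple but scales quadratically with the sample size.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of jets classified correctly when cutting at `threshold`."""
    correct = sum((s > threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one (ties count one half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical NN scores; label 1 = top-quark jet, 0 = top anti-quark jet.
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(accuracy(scores, labels), roc_auc(scores, labels))
```

Comparing these two numbers between the training and testing sets is the check for overtraining that the figure summarises.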


[Figure 8.8, panels (a) and (b): training and validation loss versus epoch. Epoch axis up to 20; loss-axis ticks from 0.540 to 0.552 in (a) and from 0.540 to 0.550 in (b).]

Fig. 8.8: The evolution of the training and validation loss as a function of the training epoch, for the two cross-training configurations in (a) and (b) respectively.

In the all-hadronic charge-asymmetry measurement, the NN is applied to the two candidate large-R jets passing the signal region criteria defined in Sec. 8.2.1. The two jets are denoted as $J_1$ and $J_2$, with their NN input values $x_{J_1}$ and $x_{J_2}$, respectively. They are identified as the top-quark jet $J_t$ and top anti-quark jet $J_{\bar{t}}$ by comparing the NN scores $f(x_{J_1})$ and $f(x_{J_2})$:

$$
\begin{aligned}
f(x_{J_1}) > f(x_{J_2}) &\;\Rightarrow\; J_t \equiv J_1,\quad J_{\bar{t}} \equiv J_2, \\
f(x_{J_1}) < f(x_{J_2}) &\;\Rightarrow\; J_t \equiv J_2,\quad J_{\bar{t}} \equiv J_1,
\end{aligned}
\tag{8.14}
$$

where the cross-training NN setup is employed as discussed in Sec. 8.3.3. The main source of dilution in the all-hadronic channel $A_C$ measurement is expected to be the mis-identification of the jets in Eq. 8.14, resulting in a wrong determination of the sign of $\Delta|y| = |y_t| - |y_{\bar{t}}|$, where $|y_t|$ is the absolute rapidity of $J_t$ and $|y_{\bar{t}}|$ is the absolute rapidity of $J_{\bar{t}}$. Therefore, the figure of merit when comparing the NN with alternative approaches is the $\Delta|y|$ sign assignment purity, labelled as $P_{|y|}$. It is defined as the fraction of events where the sign of the reconstructed $\Delta|y|$ matches the sign of $\Delta|y|^{\mathrm{parton}} = |y_t^{\mathrm{parton}}| - |y_{\bar{t}}^{\mathrm{parton}}|$, calculated from the true absolute rapidities of the top quark and top anti-quark at parton level, and is evaluated in the signal region using $t\bar{t}$ MC simulated events.
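The assignment rule of Eq. 8.14 and the purity definition can be written out as a short sketch. This is an illustration under stated assumptions, not the analysis code: the event dictionary layout, the key names and all numerical values are hypothetical, and events with equal NN scores, which Eq. 8.14 does not cover, are simply left unassigned.

```python
def sign(value):
    """Return +1, -1 or 0 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def reco_delta_abs_y_sign(score_1, y_1, score_2, y_2):
    """Sign of Delta|y| = |y_t| - |y_tbar| after the Eq. 8.14 assignment.

    The jet with the higher NN score is taken as the top-quark jet.
    """
    if score_1 == score_2:
        return 0  # tie: not covered by Eq. 8.14, left unassigned here
    if score_1 > score_2:
        y_t, y_tbar = y_1, y_2
    else:
        y_t, y_tbar = y_2, y_1
    return sign(abs(y_t) - abs(y_tbar))


def purity(events):
    """Fraction of events whose reconstructed sign matches the parton-level sign."""
    matched = 0
    for ev in events:
        reco = reco_delta_abs_y_sign(ev["score_1"], ev["y_1"], ev["score_2"], ev["y_2"])
        true = sign(abs(ev["y_t_parton"]) - abs(ev["y_tbar_parton"]))
        if reco == true:
            matched += 1
    return matched / len(events)


# Hypothetical events: NN scores and rapidities of J1 and J2, plus the
# parton-level rapidities of the top quark and top anti-quark.
events = [
    {"score_1": 0.80, "y_1": 1.2, "score_2": 0.30, "y_2": -0.4,
     "y_t_parton": 1.1, "y_tbar_parton": -0.5},
    {"score_1": 0.40, "y_1": 0.2, "score_2": 0.60, "y_2": -1.0,
     "y_t_parton": 0.25, "y_tbar_parton": -0.95},  # jets mis-identified
    {"score_1": 0.55, "y_1": -0.9, "score_2": 0.45, "y_2": 0.3,
     "y_t_parton": -0.85, "y_tbar_parton": 0.35},
    {"score_1": 0.20, "y_1": 1.5, "score_2": 0.70, "y_2": 0.6,
     "y_t_parton": 0.65, "y_tbar_parton": 1.45},
]
print(purity(events))
```

In the second toy event the two jets are swapped by the score comparison, which flips the sign of the reconstructed $\Delta|y|$ and lowers the purity; this is the dilution mechanism described in the text.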

For the comparison with alternative approaches, $P_{|y|}$ is also calculated using only the $b$-tagged track jets matched to the large-R jets, either by using the inclusive jet charge $Q_b$, or by using the aforementioned JVC discriminant. In these approaches, the charge classification of the candidate large-R jets is performed based on the assumption that the top-quark jet should contain a $b$-jet with a smaller value of $Q_b$ or JVC than the $b$-jet matched to the top anti-quark jet. Finally, $P_{|y|}$ is also calculated for the charge asymmetry measurement in the single-lepton channel, where the charge assignment is performed based on the single isolated lepton in the final state.

Since the all-hadronic channel targets a specific kinematic region, the $P_{|y|}$ comparisons are performed in the two highest-$m_{t\bar{t}}^{\mathrm{parton}}$ bins corresponding to the single-lepton channel $A_C$ measurement in Ch. 7, where $m_{t\bar{t}}^{\mathrm{parton}}$ is calculated from the parton-level top quark and anti-quark four-vectors. The true $m_{t\bar{t}}$ definition is used instead of the reconstructed $m_{t\bar{t}}$ in order to remove the differences in $t\bar{t}$ system reconstruction between the all-hadronic and single-lepton channels from the comparison. The $P_{|y|}$ results are summarised in Table 8.11.