Jet shapes for boosted jet two-prong decays from first-principles


HAL Id: cea-01464935

https://hal-cea.archives-ouvertes.fr/cea-01464935

Submitted on 10 Feb 2017


Mrinal Dasgupta, Laís Schunk, Gregory Soyez

To cite this version:

Mrinal Dasgupta, Laís Schunk, Gregory Soyez. Jet shapes for boosted jet two-prong decays from first-principles. Journal of High Energy Physics, Springer, 2016, 2016, pp.166. 10.1007/JHEP04(2016)166. cea-01464935


Prepared for submission to JHEP


Mrinal Dasgupta (a), Laís Schunk (b) and Gregory Soyez (b)

(a) Consortium for Fundamental Physics, School of Physics & Astronomy, University of Manchester, Manchester M13 9PL, United Kingdom

(b) IPhT, CEA Saclay, CNRS UMR 3681, F-91191 Gif-sur-Yvette, France


Abstract: Several boosted jet techniques use jet shape variables to discriminate the multi-pronged signal from Quantum Chromodynamics backgrounds. In this paper, we provide a first-principles study of an important class of jet shapes all of which put a constraint on the subjet mass: the mass-drop parameter ($\mu^2$), the N-subjettiness ratio ($\tau_{21}^{(\beta=2)}$) and energy correlation functions ($C_2^{(\beta=2)}$ or $D_2^{(\beta=2)}$). We provide analytic results both for QCD background jets and for signal processes. We further study the situation where cuts on these variables are applied recursively with Cambridge-Aachen de-clustering of the original jet. We also explore the effect of the choice of axis for N-subjettiness and jet de-clustering. Our results bring substantial new insight into the nature, gain and relative performance of each of these methods, which we expect will influence their future application for boosted object searches.

Keywords: QCD, Hadronic Colliders, Standard Model, Jets


1 Introduction

In recent years jet substructure studies have received unprecedented attention and have been the focus of many theoretical and experimental studies. Most of this research has been carried out in the direct context of boosted new particle searches at the LHC. For reviews and detailed studies we refer the reader to Refs. [1–4] and references therein.

The basic ideas that underpin such studies are simple to understand. A high-$p_T$ resonance with a mass $m \ll p_T$ will exhibit collimated decays where, in a significant fraction of events, the decay products would be reconstructed in a single “fat” jet. Tagging signal jets and removing jets arising from QCD background will thus rely crucially on detailed information about the jets themselves. In this context it is clear that valuable information will be obtained by studying the internal structure of jets in some detail.

Let us for example contrast the two-pronged hadronic decays of an electroweak boson (W/Z/H) with $1 \to 2$ QCD splittings. QCD emission probabilities are infrared enhanced, favouring soft splittings, and hence a QCD jet would typically consist of a single hard prong. On the other hand, decays of electroweak bosons show no preference for soft splittings and this results in a more symmetric energy sharing which gives rise to jets with a characteristic two-pronged internal structure. Another important difference results from the colour-neutral nature of electroweak bosons, which results in a strong suppression of radiation at angles that are large compared to the opening angle between the hard decay products. Soft large-angle radiation in a signal jet would thus typically arise from emissions that are uncorrelated with the decay of the electroweak boson in question, i.e. from initial state radiation (ISR) and underlying event (UE) as well as from pile-up. Such radiation serves to degrade signal peaks, making them less visible, and also pushes up the masses of background jets. It is therefore also desirable to eliminate this radiation. In the above context the two principal aims of a substructure analysis therefore emerge as identification of two hard prongs (tagging) and removal of uncorrelated soft radiation (grooming).

In recent years many tools have been developed to achieve the above aims of tagging and grooming jets. These include the mass-drop+filtering methods [5], trimming [6] and pruning [7, 8] amongst a whole host of other techniques. Monte Carlo event generator studies involving several of these techniques can be found in Refs. [1–4] and the original references.

Somewhat more recently there has also been the emergence of jet shape variables that directly attempt to quantify the N-pronged nature of a fat jet. Examples include the N-subjettiness variables [9–11] and the N-point energy correlation functions (ECFs) [12, 13], both of which are designed to take on small values for particle configurations corresponding to N collimated subjets of a fat jet, which one can naturally associate to an N-pronged decay. These techniques typically put constraints on the gluon radiation patterns in a jet. We expect this to have a good discriminating power both at small and large angles because gluon radiation is different for colour-neutral bosons compared to coloured QCD jets. At small angles, gluon radiation tends to be larger in QCD jets, made of a mixture of quarks and gluons, than in resonances, which decay mostly into quarks. At large angles, this is an even bigger effect since one expects a strong suppression of the radiation from collimated colour-neutral resonance decays compared to QCD jets. It is interesting to notice at this stage that the large-angle region, which shape variables try to constrain, is also the region that is sensitive to initial-state radiation and the underlying event. One typically uses grooming techniques to mitigate these effects and, therefore, one may wonder about the effectiveness of shape variable constraints when combined with grooming.

For studies involving two-pronged (W/Z/H) signal jets the N-subjettiness ratio $\tau_{21}^{(\beta)} = \tau_2^{(\beta)}/\tau_1^{(\beta)}$ and the ECF $C_2^{(\beta)}$ are known to provide good discrimination between signal and background, where $\beta$ is a parameter (angular exponent) that enters the definition of both variables. We shall provide precise definitions of these variables in the following section [footnote 1].

Footnote 1: Note that to satisfy infrared and collinear (IRC) safety one has the requirement $\beta > 0$.

There have also been several detailed studies carried out for both $\tau_{21}$ and $C_2$ in the literature. Again, nearly all of these studies have been done using Monte Carlo event generator tools. As examples we refer the reader to the work carried out in the original references [9, 11], while for more recent studies also including the implementation of these variables in multivariate combinations we refer to Ref. [4].

In contrast our principal aim here is to carry out analytical calculations for the above variables, based on the first principles of QCD. Such calculations have, for instance, been carried out for the mass-drop, pruning and trimming methods [14] and provided considerable new insight into the performance of those tools over and above what could be gained purely from Monte Carlo methods. We would therefore expect a similar level of information from analytical studies of the shape variables considered here. For our calculations in this paper we shall make the choice of $\beta = 2$, i.e. focus on $\tau_{21}^{(\beta=2)}$ and $C_2^{(\beta=2)}$, for which calculations are relatively straightforward to perform. Detailed numerical studies of the dependence on $\beta$ have been carried out, in particular for $C_2^{(\beta)}$, in Ref. [12]. These studies found that in the transverse momentum range $p_T \in [400, 500]$ GeV, for jet masses relevant to W/Z/H tagging, optimal $\beta$ values ranged between 1.5 and 2. For larger masses the optimal $\beta$ values were found to be smaller. An analytical understanding of the $\beta$ dependence of discrimination power would also be desirable but is left to future work.

As we shall show explicitly later in the article, cuts on $\tau_{21}^{(\beta=2)}$ and $C_2^{(\beta=2)}$ effectively serve to constrain subjet masses. Another similar variable, that has been far less investigated in the literature, is the parameter $\mu^2$ of the mass-drop tagger (MDT) [5]. This is obtained by declustering a jet into two subjets and taking the ratio of the squared jet mass for the heavier subjet to that for the original jet. The original mass-drop tagger uses a cut on $\mu^2$ along with an energy cut designed to discriminate against soft splittings, i.e. the $y_{\rm cut}$ parameter of the MDT. It was shown in Ref. [14] that in fact, in the presence of the $y_{\rm cut}$ condition, the dependence on $\mu^2$ could essentially be neglected. In the present article we instead study the dependence on $\mu^2$ without any $y_{\rm cut}$ requirement and compare the discriminating power it provides to that from similar variables, i.e. $\tau_{21}^{(\beta=2)}$ and $C_2^{(\beta=2)}$. Note that while the standard mass-drop tagger recurses, successively undoing the last step of a Cambridge/Aachen clustering, until the cut on $\mu^2$ (and the $y_{\rm cut}$ condition) is satisfied, here we study both recursive and non-recursive variants for each of the shape variables.

We carry out analytical studies for the jet mass distributions of QCD background jets with cuts on shape variables $v < v_{\max}$, with $v = \tau_{21}$, $C_2$ and $\mu^2$. We also study the probability for signal jets to pass the same cuts. We define $\rho = m^2/(p_T^2 R^2)$, with $m$ being the jet mass, and work in the limit $\rho \ll 1$ (relevant for boosted object studies) and $v_{\max} \ll 1$, which is desirable to separate two-pronged structures from QCD background. Our analytical results aim only to capture leading-logarithmic accuracy although we also retain several sources of next-to-leading logarithmic corrections. We test our analytical results by comparing to fixed-order results from EVENT2 [15, 16] and to results from parton shower Monte Carlos, and additionally carry out pure Monte Carlo studies of the impact of non-perturbative corrections. Since non-perturbative corrections are found to be large, we further examine with Monte Carlo studies the impact of grooming with SoftDrop [17]. This shows an important reduction of the non-perturbative effects. To avoid diluting the main message of this paper with additional technical considerations, we defer the study of groomed jet shapes to a forthcoming work.

Note that some level of analytic understanding for jet shapes already exists. For example, studies of the lowest-order Energy-Correlation Functions, $C_1^{(\beta)}$, have been carried out in Ref. [12]. Also, in the framework of Soft-Collinear Effective Theory (SCET) [18–20] and its extension SCET+ [21], results for N-subjettiness have been obtained at N³LL accuracy for signal jets [22], and studies of the Energy-Correlation Functions $C_2^{(\beta)}$ and $D_2^{(\beta)}$ [23] appeared as the present paper was being finalised. In contrast, rather than providing a high-accuracy calculation of a given method, the main aim of our work is a transparent comparison of different shapes for both signal and background jets with phenomenological applications in mind.

With that in mind, it is however interesting to compare our approach and results to what is obtained for $D_2$ in Ref. [23]. Besides using different approaches (SCET-based vs. more standard pQCD language), the main difference between this work and Ref. [23] is that, to the best of our understanding in terms of the variables $\rho$ and $D_2$, the latter provides a NLL resummation [footnote 2] in $\rho$, regardless of the value of $D_2$, while our approach assumes small $D_2$ and treats $\log(D_2)$ and $\log(\rho)$ on an equal footing [footnote 3]. Therefore, the calculation in Ref. [23] likely has a higher accuracy, at least in the region used in many phenomenological applications. However, it is limited to $D_2$ while our main goal here is to discover the source of and address the main differences between various shapes. The results of Ref. [23] require at least four numerical integrations (compared to a single one for our results), which, keeping in mind our purposes, makes a physical interpretation more involved.

Footnote 2: The treatment of the non-global logarithms and of their resummation is not totally clear to us.

Footnote 3: Strictly speaking, we reach (modified) LL accuracy but we include a series of NLL effects, see [...]

This article is organised as follows: In the next section we provide detailed definitions of the shapes mentioned above. Following this, in section 3, we discuss the general form of the results obtained for all the shapes under consideration, both for signal and background jets. In section 4 we perform the detailed calculations for background jets for both non-recursive and recursive variants for each shape variable. In the same section we compare the expansion of our results to fixed-order results from EVENT2, as a check on our calculations. We also carry out comparisons to results from Pythia with only final state radiation (FSR) turned on, to give a direct comparison against our calculations. In section 5 we perform the calculations, checks and comparisons to Monte Carlo for signal jets. Following this, in section 6 we study the impact of non-perturbative corrections, where we note the significant contributions from initial state radiation and the underlying event in particular. In order to obtain better control over such effects we combine shape variable studies with grooming using SoftDrop and study the impact on both signal and background efficiencies. In section 7 we discuss our findings in detail, including an assessment of the comparative performance of all the shapes studied here. Finally we present our conclusions.

2 Radiation-constraining jet shapes

Among a large family of jet shapes, this paper will identify and focus on a series of variables all of which place constraints on the subjet mass. In this category, we will study the following three variables:

• N-subjettiness computed with $\beta = 2$, $\tau_{21}^{(\beta=2)} = \tau_2^{(\beta=2)}/\tau_1^{(\beta=2)}$, with $\tau_N^{(\beta=2)}$ defined as [9, 24]

$$\tau_N^{(\beta=2)} = \frac{1}{p_{t,\rm jet} R^2} \sum_{i \in \rm jet} p_{t,i} \min(\theta_{i a_1}^2, \dots, \theta_{i a_N}^2), \tag{2.1}$$

where the sum runs over all the constituents of a given jet and $a_1, \dots, a_N$ denote the partition axes. While the choice $\beta = 1$ is more common in experimental studies at the LHC (likely because of an expected smaller sensitivity to non-perturbative effects), analytic studies have thus far mostly focused on $\beta = 2$. As argued earlier, the latter is expected to give better discriminative power. We decided to choose $\beta = 2$ for the present study because in that case $\tau_N$ acts like a measure of the subjet mass, which allows for a direct comparison with the mass-drop $\mu^2$ cut [footnote 4]. To fully define $\tau_{21}$, we still need to specify our choice for the partition axes $a_1, \dots, a_N$ in (2.1). We shall consider the following three options:

– the optimal axes which should minimise $\tau_N$;

– the $k_t$ axes obtained by clustering the jet with the $k_t$ algorithm [29–31] and taking the N exclusive subjets;

– the generalised-$k_t$ axes with $p = 1/2$ (gen-$k_t$(1/2)) obtained by clustering the jet with the generalised-$k_t$ algorithm (see Section 4.4 of [32]), with its extra parameter $p$ set to 1/2, and taking the N exclusive subjets.

The third option is new and leads to similar performance to the optimal axes at much smaller computational cost. The motivation to look into gen-$k_t$(1/2) axes is that its distance measure behaves again like a mass, as does $\tau_{21}^{(\beta=2)}$, and we can expect the resulting axes to be very close to the optimal axes. More generally, for $\tau_{21}^{(\beta)}$ with a generic $\beta$, we would expect the generalised-$k_t$ axes with $p = 1/\beta$ to give a close-to-optimal result.

Footnote 4: The choice $\beta = 1$ would fall in another category of observables, together with energy-correlation functions with $\beta = 1$ and Y-splitter [25]. A calculation similar to the one in this paper can be performed, although the situation is often more complicated. We leave the study of these variables for future work together with a comparison of the performance of the “$\beta = 1$” and “$\beta = 2$” shapes.

• a version of the mass-drop parameter [5], $\mu^2$, which, given two subjets $j_1$, $j_2$ in a given jet $j$, is defined as $\mu^2 = \max(m_{j_1}^2, m_{j_2}^2)/m_j^2$. In its original formulation, the cut on $\mu^2$ was applied in a recursive de-clustering of a jet obtained with the Cambridge/Aachen (C/A) algorithm [33, 34]. The present definition of $\mu^2$ is however non-recursive, i.e. it is a cut that the jet $j$ either satisfies or not, without any further de-clustering if it does not. Similarly to the definition of the N-subjettiness axes, we need to specify the procedure to separate the jet $j$ into two subjets $j_1$, $j_2$. We will denote by $\mu_p^2$ the result obtained by undoing the last step of a generalised-$k_t$ clustering, with extra parameter $p$, of the jet $j$. We shall concentrate on $\mu_{1/2}^2$, since it follows the ordering in mass, and $\mu_0^2$, since it corresponds to the historical choice [footnote 6].

• the energy correlation function double ratio. Here we again use $\beta = 2$, which will be kept fixed, and define [12]

$$e_2 = \frac{1}{p_T^2 R^2} \sum_{i<j \in \rm jet} p_{t,i}\, p_{t,j}\, \theta_{ij}^2, \tag{2.2}$$

$$e_3 = \frac{1}{p_T^3 R^6} \sum_{i<j<k \in \rm jet} p_{t,i}\, p_{t,j}\, p_{t,k}\, \theta_{ij}^2 \theta_{ik}^2 \theta_{jk}^2, \tag{2.3}$$

and work with $C_2 = e_3/e_2^2$. Note that, at the order of accuracy targeted in this paper, we can alternatively use the recently-proposed $D_2 = e_3/e_2^3$ [13], since, up to our accuracy, they only differ by a rescaling by the total jet mass.

For any of these three shapes, $v$, a cut of the form $v < v_{\rm cut}$ is expected to show good performance in discriminating two-pronged boosted objects from standard QCD jets. Note also that, if the cut is not satisfied, the jet is discarded.
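The definitions (2.1)–(2.3) can be sketched numerically as follows. This is only an illustration under stated assumptions: constituents are hypothetical (pt, rapidity, azimuth) tuples, the jet transverse momentum is approximated by the scalar sum of the constituent transverse momenta, and the partition axes are supplied by hand rather than obtained from any of the axis-finding procedures discussed above.

```python
from itertools import combinations
import math


def dist2(y1, phi1, y2, phi2):
    """Squared rapidity-azimuth distance, azimuth difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return (y1 - y2) ** 2 + dphi ** 2


def tau_n(particles, axes, R=1.0):
    """tau_N with beta = 2 as in Eq. (2.1); `axes` is a list of (y, phi) points.

    Assumption: p_t,jet is taken as the scalar sum of constituent pt.
    """
    pt_jet = sum(pt for pt, _, _ in particles)
    total = sum(
        pt * min(dist2(y, phi, ay, aphi) for ay, aphi in axes)
        for pt, y, phi in particles
    )
    return total / (pt_jet * R ** 2)


def ecf_ratios(particles, R=1.0):
    """Return (e2, e3, C2, D2) following Eqs. (2.2)-(2.3) with beta = 2."""
    pt_jet = sum(pt for pt, _, _ in particles)

    def d2(a, b):
        return dist2(a[1], a[2], b[1], b[2])

    e2 = sum(a[0] * b[0] * d2(a, b)
             for a, b in combinations(particles, 2)) / (pt_jet ** 2 * R ** 2)
    e3 = sum(a[0] * b[0] * c[0] * d2(a, b) * d2(a, c) * d2(b, c)
             for a, b, c in combinations(particles, 3)) / (pt_jet ** 3 * R ** 6)
    return e2, e3, e3 / e2 ** 2, e3 / e2 ** 3


# Hypothetical three-particle jet: hard parton, mass-setting emission, softer emission.
jet = [(80.0, 0.0, 0.0), (20.0, 0.3, 0.0), (1.0, 0.0, 0.1)]
tau1 = tau_n(jet, [(0.0, 0.0)])              # single axis along the hard parton
tau2 = tau_n(jet, [(0.0, 0.0), (0.3, 0.0)])  # axes along the two hardest particles
tau21 = tau2 / tau1
e2, e3, C2, D2 = ecf_ratios(jet)
```

In this toy configuration both $\tau_{21}$ and $C_2$ are small because the third particle is soft and close to one of the two hard prongs, which is the regime the cut $v < v_{\rm cut}$ is meant to select.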

Additionally, we shall also consider the cases where one of the three shape constraints introduced above is applied recursively. By this we mean that, for a shape $v$, we apply the following procedure:

Footnote 6: We shall see that, unless it is completed by a recursive declustering (as is the case in the original formulation) or a pre-grooming of the jet, e.g. using the SoftDrop procedure, $\mu^2$ [...]

1. recluster the jet $j$ with the C/A algorithm,

2. compute $v$ from $j$; if $v < v_{\rm cut}$, $j$ is the result of the procedure and exit the loop,

3. undo the last step of the clustering to get two subjets $j_1$ and $j_2$, define the hardest of $j_1$ and $j_2$ (in terms of their $p_t$) as the new $j$ and go back to 2.

This is of course motivated by the original mass-drop tagger proposal [5], where a cut was placed on the $\mu^2$ parameter. We have to note that, here, the recursion follows the hardest branch, as suggested in the modified version of the mass-drop tagger [14], rather than the most massive one, as in the original proposal.
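The recursive procedure above can be sketched as a loop over a binary clustering history. This is a minimal sketch, not a reference implementation: the `Jet` class is hypothetical, the C/A reclustering of step 1 is assumed to have been done already and stored in `parents`, the shape value is stored as a precomputed attribute `v` purely for illustration, and returning `None` when nothing is left to decluster is our own assumption, since the text does not specify that case.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Jet:
    """Hypothetical node of a C/A clustering tree."""
    pt: float
    v: float  # precomputed shape value, for illustration only
    parents: Optional[Tuple["Jet", "Jet"]] = None  # subjets merged in the last step


def recursive_cut(jet: Jet, shape: Callable[[Jet], float], v_cut: float) -> Optional[Jet]:
    """Follow the hardest branch until the shape cut v < v_cut is satisfied."""
    j = jet
    while True:
        if shape(j) < v_cut:          # step 2: cut satisfied, j is the result
            return j
        if j.parents is None:         # assumption: no subjets left, jet is rejected
            return None
        j1, j2 = j.parents            # step 3: undo the last clustering step
        j = j1 if j1.pt >= j2.pt else j2


# Toy history: a soft wide-angle emission is removed before the cut is satisfied.
inner = Jet(pt=100.0, v=0.05, parents=(Jet(pt=90.0, v=1.0), Jet(pt=10.0, v=1.0)))
top = Jet(pt=105.0, v=0.6, parents=(inner, Jet(pt=5.0, v=1.0)))
tagged = recursive_cut(top, lambda j: j.v, v_cut=0.1)
rejected = recursive_cut(top, lambda j: j.v, v_cut=0.01)
```

Following the larger-$p_t$ branch in step 3 corresponds to the modified mass-drop tagger choice mentioned in the text; following the more massive branch instead would only change the last line of the loop.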

3 Generic structure of the results

For QCD jets, there are two basic physical quantities that we will be interested in: the jet mass distribution after applying a given fixed, recursive or not, cut on one of the shapes described in the previous section; or the distribution of a jet shape for a given fixed value of the jet mass. The latter situation only applies to the non-recursive cases. For signal jets, we are interested in jets of a fixed mass so the calculation will mostly focus on what fraction of these jets satisfy the constraint on the jet shape v, hence on the distribution of v for an object of a given mass. Jets which fail the constraint on v will be discarded.

Our calculations apply to the boosted regime, where the jet transverse momentum is much larger than its mass. In that context, it is convenient to introduce $\rho = m^2/(p_t R)^2$, with $R$ the radius of the jet. The boosted regime means that we can take the limit $\rho \ll 1$. Furthermore, in this work, we shall focus on two-pronged decays, where we expect that the radiation-constraining shapes introduced above would be smaller for signal jets than for the QCD background. It is therefore natural to start the study of these shapes in the limit where they are small. In the following we shall thus also assume that the cut on the shape is small compared to 1. In this limit, we focus on the leading double logarithm [footnote 7], for which soft and collinear emissions can be considered as strongly ordered and the mass of the jet is dominated by the strongest of these emissions. Throughout the paper, we will therefore assume that this emission, dominating the mass of the jet, occurs at an angle $R\theta_1$ and with a fraction $z_1$ of the jet transverse momentum $p_t$. This has to satisfy the constraint $z_1(1 - z_1)\theta_1^2 = \rho$, where, for QCD jets, we can neglect the $(1 - z_1)$ factor, which would only lead to subleading power corrections in $\rho$.

Footnote 7: We will also include the hard-splitting corrections and discuss a series of NLL corrections in Section 4.7.

All the shapes, $v$, that we consider put constraints on additional emissions. This means that we can always consider, as a starting point, a system made of two partons (the “leading parton $p_0$” initiating the jet and the “first, leading, emission $p_1$” which sets the jet mass for QCD jets, or the two prongs of a massive boson decay for signal jets) and study additional radiation from this system.

In the leading-logarithmic approximation, the constraint on radiation will always take the form of a Sudakov suppression coming on top of the mass requirement. For QCD jets, the mass distribution with a cut on $v$ can always be written as

$$\left.\frac{\rho}{\sigma}\frac{d\sigma}{d\rho}\right|_{<v} = \int_\rho^1 \frac{d\theta_1^2}{\theta_1^2} \int_\rho^1 dz_1\, P(z_1)\, \rho\, \delta(z_1\theta_1^2 - \rho)\, \frac{\alpha_s(z_1\theta_1 p_t R)}{2\pi}\, e^{-R_{\rm mass}(\rho) - R_v(z_1,\rho)} = \int_\rho^1 dz_1\, P(z_1)\, \frac{\alpha_s(\sqrt{z_1\rho}\, p_t R)}{2\pi}\, e^{-R_{\rm mass}(\rho) - R_v(z_1,\rho)}. \tag{3.1}$$

In the above, $R_{\rm mass}(\rho)$ is the Sudakov exponent resumming the leading $\log(1/\rho)$ contributions to the plain jet mass and $R_v(z_1, \rho)$ is the extra contribution coming from the additional cut on $v$.

In the approximation we shall be working at, instead of $P(z_1)$, it is sufficient to consider its leading logarithmic contribution from its $2C_R/z_1$ term and a subleading hard collinear contribution $2C_R B_i\, \delta(z_1 - 1)$, where $C_R$ is the colour charge of a jet initiated by a parton of flavour $i$ and $B_i$ is the integral of the non-singular part of the splitting function:

$$B_q = \int_0^1 dz \left[\frac{P_{qq}(z)}{2C_F} - \frac{1}{z}\right] = -\frac{3}{4}, \tag{3.2}$$

$$B_g = \int_0^1 dz \left[\frac{P_{gg}(z) + 2n_f P_{qg}(z)}{2C_A} - \frac{1}{z}\right] = -\frac{11C_A - 4n_f T_R}{12C_A}. \tag{3.3}$$
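As a quick numerical cross-check of Eqs. (3.2) and (3.3), the integrals can be evaluated with a simple midpoint rule. The explicit splitting functions used below are assumptions on our side (the text does not spell them out): $P_{qq}(z) = C_F[1 + (1-z)^2]/z$ with $z$ the momentum fraction of the emitted gluon, the symmetrised form $P_{gg}(z) = C_A[2(1-z)/z + z(1-z)]$, and $P_{qg}(z) = T_R[z^2 + (1-z)^2]/2$, a normalisation chosen so that the integrand matches the quoted closed form.

```python
from fractions import Fraction

CF, CA, TR = Fraction(4, 3), Fraction(3), Fraction(1, 2)


def B_q():
    """Closed form of Eq. (3.2)."""
    return Fraction(-3, 4)


def B_g(nf):
    """Closed form of Eq. (3.3) for nf quark flavours."""
    return -(11 * CA - 4 * nf * TR) / (12 * CA)


def midpoint(f, n=20000):
    """Midpoint rule on [0, 1]; never evaluates f at z = 0."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def bq_integrand(z):
    # Assumed convention: P_qq(z) = CF (1 + (1 - z)^2) / z
    p_qq = float(CF) * (1.0 + (1.0 - z) ** 2) / z
    return p_qq / (2.0 * float(CF)) - 1.0 / z


def bg_integrand(z, nf):
    # Assumed conventions (see lead-in): symmetrised P_gg and P_qg with a factor 1/2
    p_gg = float(CA) * (2.0 * (1.0 - z) / z + z * (1.0 - z))
    p_qg = float(TR) * (z * z + (1.0 - z) ** 2) / 2.0
    return (p_gg + 2.0 * nf * p_qg) / (2.0 * float(CA)) - 1.0 / z


bq_numeric = midpoint(bq_integrand)
bg_numeric = midpoint(lambda z: bg_integrand(z, 5))
```

With these conventions the $1/z$ singularity cancels in the integrand, so the remaining integral is finite and the numerical values agree with the closed forms, e.g. $B_g = -23/36$ for $n_f = 5$.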

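Eq. (3.1) can therefore be replaced by

$$\left.\frac{\rho}{\sigma}\frac{d\sigma}{d\rho}\right|_{<v} = \int_\rho^1 \frac{dz_1}{z_1}\, \frac{\alpha_s(\sqrt{z_1\rho}\, p_t R)\, C_R}{\pi}\, e^{-R_{\rm mass}(\rho) - R_v(z_1,\rho)} + \frac{\alpha_s(\sqrt{\rho}\, p_t R)\, C_R}{\pi}\, B_i\, e^{-R_{\rm mass}(\rho) - R_v(z_1=1,\rho)}. \tag{3.4}$$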

Note however that keeping the full integration over the splitting function is sometimes useful in comparing background and signal efficiencies and can lead to potentially large subleading corrections. For all the analytic plots in this paper, where the integration over $z_1$ is done numerically, we have decided to keep the exact $P(z_1)$ splitting function and use Eq. (3.1).

If instead we want to obtain the probability to satisfy the cut on the shape $v$ for a jet of a given mass, one gets (for the non-recursive versions):

$$\Sigma(v) = \left[R'_{\rm mass}(\rho)\, e^{-R_{\rm mass}}\right]^{-1} \left.\frac{\rho}{\sigma}\frac{d\sigma}{d\rho}\right|_{<v}, \tag{3.5}$$

with $R'_{\rm mass}$ being the derivative of $R_{\rm mass}$ with respect to $\log(1/\rho)$. Note that the shapes we consider all require at least three particles in the jet to be non-zero, meaning that the distribution $d\sigma/d\rho|_{<v}$ (or, equivalently, the double-differential distribution in both the mass and the shape, $d^2\sigma/d\rho\, dv$) starts at order $\alpha_s^2$. Conversely, $\Sigma(v)$ will start at order $\alpha_s$, since it is normalised to the jet mass which itself starts at order $\alpha_s$.

At fixed coupling, the integration over $z_1$ can usually be carried out analytically. This however does not bring any additional insight on the underlying physics mechanisms and so will not be done explicitly. For the sake of clarity, we will give fixed-coupling results in the main body of the text, see Section 4, and defer the full results, including running-coupling corrections, to Appendix A (more precisely, Appendix A.2 for QCD jets). The analytic results presented for the radiator function $R_v$ in the main text therefore correspond to a fixed-coupling (modified) LL accuracy, i.e. they include the leading logarithms as well as the corrections due to the hard collinear splittings (the “B terms” in the forthcoming equations). Note that we treat logarithms of the shape and the jet mass on an equal footing. Hence, by leading logarithms, we mean, for fixed coupling, double logarithms of any kind, i.e. in either the shape or the jet mass or both. For the figures and the comparisons to Monte-Carlo simulations, we will also include the (leading order) running-coupling contributions as well as a few relevant NLL effects, discussed in Section 4.7 and Appendix A.

For signal jets, we will directly be interested in the efficiency, i.e. in the fraction of jets (of the original jet mass) that will satisfy the constraint on $v$. This can be written as

$$\Sigma_{\rm sig}(v) = \int_\rho^1 dz_1\, P_{\rm sig}(z_1)\, e^{-R_{v,\rm sig}(z_1,\rho)}, \tag{3.6}$$

where the signal “splitting function” $P_{\rm sig}(z_1)$ is assumed to be normalised to unity. Again, we can either decide to keep the full integration over $z_1$ or, at our level of accuracy, keep only the dominant part without any $z_1$ dependence and the first $\log(1/z_1)$ and $\log(1/(1 - z_1))$ corrections. Note that here $z_1$ can no longer be neglected in the constraint on the jet mass, $\rho = z_1(1 - z_1)\theta_1^2$. For the illustrative fixed-coupling results given in Section 5, we will only keep the first corrections in $\log(1/z_1)$ and $\log(1/(1 - z_1))$, while for the full results including running-coupling corrections given in Appendix A.3, we will include these factors in the resummation, mainly for simplicity reasons.

Given these basic expressions, our main task is to compute the Sudakov factors $R_v$ for all the shapes under consideration. We do that in the next two sections.

4 Calculations for the QCD background

The results below give the generic expression for the Sudakov form factor assuming one works in the (modified) leading-log approximation. It is helpful to clarify the notations once and for all:

$$L_\rho = \log(1/\rho) = \log(p_t^2 R^2/m^2), \quad L_\tau = \log(1/\tau_{21}), \quad L_1 = \log(1/z_1), \quad L_\mu = \log(1/\mu^2), \quad L_v = \log(1/[\tau_{21},\ \mu^2 \text{ or } C_2]), \quad L_e = \log(1/C_2). \tag{4.1}$$

We assume, as stated before, that the angles are normalised to the jet radius $R$ and we work with a jet initiated by a parton of flavour $i$. For a fixed mass $\rho$ and momentum fraction $z_1$, we have $\theta_1^2 = \rho/z_1$.
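To make the notation concrete, the kinematic quantities entering Eq. (4.1) can be evaluated for an illustrative jet. The numerical inputs below (an 80 GeV jet mass, $p_t = 500$ GeV, $R = 1$, $z_1 = 0.1$) are hypothetical and chosen only to show typical magnitudes; the function name is ours.

```python
import math


def boosted_logs(m, pt, R, z1):
    """Return rho, theta_1^2 and the logarithms L_rho, L_1 for a QCD-like splitting.

    Uses rho = m^2 / (pt R)^2 and theta_1^2 = rho / z1, i.e. the (1 - z1) factor
    in the mass constraint is neglected, as done in the text for QCD jets.
    """
    rho = m ** 2 / (pt * R) ** 2
    theta1_sq = rho / z1
    return {
        "rho": rho,
        "theta1_sq": theta1_sq,
        "L_rho": math.log(1.0 / rho),
        "L_1": math.log(1.0 / z1),
    }


k = boosted_logs(m=80.0, pt=500.0, R=1.0, z1=0.1)
```

Since $\theta_1^2 = \rho/z_1$, the logarithm of the emission angle is fixed by the other two, $\log(1/\theta_1^2) = L_\rho - L_1$, which is the combination that appears in the radiators below.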

4.1 $\tau_{21}$ cut (pure N-subjettiness cut)

We first consider the case where we impose a cut $\tau_{21} < \tau_{\rm cut}$ on the N-subjettiness of a jet of a given mass $\rho$. We are interested in the limit $\tau_{\rm cut} \ll 1$ [footnote 10].

Footnote 10: In order to keep the notation as light as possible, we shall drop the “cut” subscript when no [...]

The first step is to find an expression for $\tau_{21}$ in the limit where emissions are strongly ordered in angle and transverse momentum fraction. For this, let us assume that the second leading emission occurs at an angle $\theta_2$ with respect to the leading parton $p_0$ (initiating the jet) and carries a transverse momentum fraction $z_2$ of the leading parton.

The expression obtained for $\tau_{21}$ in this limit depends on the choice of axes. It is useful to consider three specific options:

• the optimal axes [11] which minimise $\tau_2$,

• the $k_t$ axes, which take the 2 exclusive $k_t$ subjets as axes,

• the gen-$k_t$(1/2) axes, which also take exclusive subjets as axes, except that this time we use the generalised-$k_t$ algorithm with $p = 1/2$.

We defer most of the technical discussions regarding how to obtain $\tau_{21}$ for the above choices to Appendix B.1. In the end, the $k_t$ axes choice leads to a more complex phase-space, while the optimal and gen-$k_t$(1/2) options are equivalent to taking the leading parton and the emission setting the mass (emission $p_1$) as axes, clustering emission $p_2$ with whichever axis is closest, and both lead to

$$\tau_{21} = \frac{z_2\theta_2^2}{z_1\theta_1^2}, \tag{4.2}$$

up to corrections which are beyond the LL accuracy we aim for here [footnote 11]. In what follows, we shall concentrate on the generalised-$k_t$ axes choice since they are simpler than the optimal axes.

Furthermore, we also have to consider secondary emissions, where the radiation is emitted from the gluon $(z_1, \theta_1^2)$ itself. If $z_2$ denotes the fraction of the (first emitted) gluon energy carried by the extra emission at an angle $\theta_{12}$, with $\theta_{12} < \theta_1$ due to angular ordering, we find

$$\tau_{21}^{\rm secondary} = z_2\, \frac{\theta_{12}^2}{\theta_1^2}, \tag{4.3}$$

where the different normalisation with respect to Eq. (4.2) is purely due to $z_2$ being normalised to the gluon energy fraction $z_1$.

In the limit of small $\tau_{21}$, additional emissions at smaller mass do not affect the result. The one-gluon emission will thus exponentiate according to Eq. (3.1) and we get

$$R_\tau(z_1) = \int_0^1 \frac{d\theta_2^2}{\theta_2^2} \int_0^1 dz_2\, \frac{\alpha_s(z_2\theta_2)}{2\pi}\, P_i(z_2)\, \Theta(\rho > z_2\theta_2^2 > \rho\tau) + \int_0^{\theta_1^2} \frac{d\theta_{12}^2}{\theta_{12}^2} \int_0^1 dz_2\, \frac{\alpha_s(z_1 z_2\theta_{12})}{2\pi}\, P_g(z_2)\, \Theta(z_2\theta_{12}^2/\theta_1^2 > \tau), \tag{4.4}$$

where the first term takes into account emissions from the leading parton $p_0$ while the second accounts for secondary gluon emissions from the first emitted gluon $p_1$. The arguments of the strong coupling are given as factors multiplying the “natural” scale of the problem, $p_t R$. The phase-space corresponding to the primary emissions is represented in Fig. 1a.

Figure 1: Plots of the phase-space constraints on emissions setting the mass (in red) and the jet shape (in blue), in the plane of $\log(1/\theta^2)$ and $\log(z\theta^2)$. Panels: (a) N-subjettiness, (b) Mass-drop, (c) Energy correlation function.

Footnote 11: Note however that there is a bug in MultiPass Axes in version 2.1.0 of the N-subjettiness implementation [35] available from FastJet contrib [36] which makes the minimisation step ineffective. Optimal axes obtained with that version of the N-subjettiness implementation will therefore return the $k_t$ axes.

For simplicity, we shall only quote results with a fixed coupling approximation in the main body of the paper. Results with a proper treatment of the running-coupling corrections are presented in the Appendices. In this case, the final exponent does not depend [footnote 12] on $z_1$ and we find

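$$R_\tau^{\rm (fixed)}(z_1) = \frac{\alpha_s C_R}{\pi}\left[\frac{L_\tau^2}{2} + L_\rho L_\tau + B_i L_\tau\right] + \frac{\alpha_s C_A}{\pi}\left[\frac{L_\tau^2}{2} + B_g L_\tau\right], \tag{4.5}$$

where, for quark jets, we have $C_R = C_F$ and $B_i = B_q = -3/4$, while for gluon jets we have $C_R = C_A$ and $B_i = B_g = -(11C_A - 4n_f T_R)/(12C_A)$.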
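Eq. (4.5) is simple enough to evaluate directly. The sketch below is a plain transcription of the fixed-coupling radiator; the function name, the value $\alpha_s = 0.1$ and the kinematic point are illustrative assumptions, and the default $B_g = -23/36$ corresponds to $n_f = 5$ with $C_A = 3$ and $T_R = 1/2$.

```python
import math


def radiator_tau_fixed(L_rho, L_tau, alpha_s, C_R, B_i, C_A=3.0, B_g=-23.0 / 36.0):
    """Fixed-coupling radiator of Eq. (4.5): primary plus secondary emissions."""
    primary = alpha_s * C_R / math.pi * (L_tau ** 2 / 2.0 + L_rho * L_tau + B_i * L_tau)
    secondary = alpha_s * C_A / math.pi * (L_tau ** 2 / 2.0 + B_g * L_tau)
    return primary + secondary


# Illustrative quark-jet point: rho = 0.0256, tau_cut = 0.2, alpha_s = 0.1 (assumed).
L_rho = math.log(1.0 / 0.0256)
L_tau = math.log(1.0 / 0.2)
R_example = radiator_tau_fixed(L_rho, L_tau, alpha_s=0.1, C_R=4.0 / 3.0, B_i=-0.75)
sudakov = math.exp(-R_example)  # suppression factor entering Eq. (3.1)
```

The $L_\rho L_\tau$ term is the only place where the jet mass enters, which is why, at fixed coupling, the exponent does not depend on $z_1$.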

4.2 $\mu^2$ cut

As for the case of N-subjettiness, we first have to find, given the emissions $p_1$ and $p_2$ with $p_1$ giving the dominant contribution to the mass, what is the value of the mass-drop parameter $\mu^2$. Since $\mu^2$ is defined by undoing the last clustering step, it will depend on the jet algorithm we use to (re-)cluster the jet. The Cambridge/Aachen algorithm is a common choice but does not work here. Indeed, undoing the last step of a Cambridge/Aachen clustering would separate the emission at the largest angle from the rest of the jet, regardless of the transverse momentum of that emission. This is not infrared safe. We further discuss infrared-safety issues in Appendix C.

Instead, we shall define $\mu^2$ by undoing the last step of a generalised-$k_t$ clustering with $p = 1/2$. The motivation for this is the same as the motivation for the axes choice in the previous section: the generalised-$k_t$ algorithm with $p = 1/2$ follows closely the ordering in mass. To keep things unambiguous, we shall denote by $\mu_p^2$ the mass-drop parameter obtained by undoing the last step of a generalised-$k_t$ clustering with parameter $p$. The (infrared-unsafe) case of a C/A clustering would correspond to $\mu_0^2$ while we will be interested in $\mu_{1/2}^2$, although the calculation can be performed for any positive $p$.

Footnote 12: This is no longer valid if we include running-coupling corrections due to the scale entering the [...]

Again, we leave the technical details of the calculation for Appendix B.2. In a nutshell, the hard parton and the first emission (setting the mass) will form two subjets, and the second emission, setting the subjet mass, will be clustered with whichever of these two subjets is closest. In the end, keeping in mind that, to our leading-logarithmic accuracy, we can assume strong ordering in angle ($\theta_2 \ll \theta_1$ or $\theta_2 \gg \theta_1$), we find

$$(z_1\theta_1^2)\,\mu_{1/2}^2 \approx \begin{cases} z_2\theta_2^2 & \text{for } \theta_2<\theta_1 \text{ or } (\theta_2>\theta_1 \text{ and } \theta_2<\theta_{12}),\\ z_1 z_2\theta_2^2 & \text{for } (\theta_2>\theta_1 \text{ and } \theta_2>\theta_{12}),\\ z_1^2 z_2\theta_{12}^2 & \text{for secondary emissions.} \end{cases} \tag{4.6}$$
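The case distinction of Eq. (4.6) can be written as a small function. This is only a sketch of the leading-logarithmic approximation quoted above: the function name and argument conventions are hypothetical, and the emission kinematics are taken as plain inputs rather than being obtained from an actual clustering.

```python
def mu2_half(z1, theta1, z2, theta2, theta12, secondary=False):
    """Leading-log value of the mass-drop parameter following Eq. (4.6).

    theta12 is the angle between emissions 1 and 2. For a secondary
    emission, z2 is the momentum fraction with respect to emission 1.
    """
    rho = z1 * theta1**2                  # emission 1 sets the jet mass
    if secondary:
        numerator = z1**2 * z2 * theta12**2
    elif theta2 < theta1 or theta2 < theta12:
        numerator = z2 * theta2**2        # clustered with the hard parton
    else:
        numerator = z1 * z2 * theta2**2   # clustered with emission 1
    return numerator / rho
```

For instance, with $z_1 = 0.1$, $\theta_1 = 0.5$, $z_2 = 0.01$ and $\theta_2 = 0.1$ the first case applies and the function returns $z_2\theta_2^2/(z_1\theta_1^2)$.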

There is a crucial difference between mass-drop and N-subjettiness: the latter can be seen as $(1/p_t)\sum_{j\in\text{subjets}} m_j^2/p_{t,j}$, which has an extra $1/p_{t,j}$ compared to $\mu_{1/2}^2$. This leads to different expressions whenever the jet with the largest mass is not the one with the largest $p_t$. The secondary emissions and large-angle radiation will therefore give additional suppression for N-subjettiness compared to the mass-drop.

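With similar arguments, it is easy to realise that additional emissions with smaller masses will not affect this calculation, so that, at leading-logarithmic accuracy, the lowest order simply exponentiates according to eq. (3.1). The vetoed phase-space for emissions is represented in Fig. 1b and we get

$$\begin{aligned} R_{\mu^2_{1/2}}(z_1) ={}& \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\Big\{\Theta(\theta_2^2<\theta_1^2)\,\Theta(\rho>z_2\theta_2^2>\rho\mu^2)\\ &+\Theta(\theta_2^2>\theta_1^2)\Big[\tfrac12\,\Theta(\rho>z_2\theta_2^2>\rho\mu^2)+\tfrac12\,\Theta(\rho>z_2\theta_2^2>\theta_1^2\mu^2)\Big]\Big\}\\ &+\int_0^{\theta_1^2}\frac{d\theta_{12}^2}{\theta_{12}^2}\int_0^1 dz_2\,\frac{\alpha_s(z_1z_2\theta_{12})}{2\pi}\,P_g(z_2)\,\Theta(z_1z_2\theta_{12}^2/\theta_1^2>\mu^2). \end{aligned} \tag{4.7}$$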

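For a fixed coupling approximation, we find

$$\begin{aligned} R^{(\text{fixed})}_{\mu^2_{1/2}}(z_1) ={}& \frac{\alpha_sC_R}{\pi}\Big[(L_\rho+L_1+L_\mu)L_\mu/2+\tfrac12(L_\rho-L_1)(L_\mu-L_1)\,\Theta(L_\mu>L_1)+B_iL_\mu\Big]\\ &+\frac{\alpha_sC_A}{\pi}\Big[(L_\mu-L_1)^2/2+B_g(L_\mu-L_1)\Big]\Theta(L_\mu>L_1). \end{aligned} \tag{4.8}$$

4.3 C2 cut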

For two strongly-ordered emissions $p_1(z_1,\theta_1)$ and $p_2(z_2,\theta_2)$, such that $z_1\theta_1^2 \gg z_2\theta_2^2$, one finds, for primary emissions,

$$C_2 = \frac{1}{z_1^2\theta_1^4}\,z_1z_2(1-z_1-z_2)\,\theta_1^2\theta_2^2\theta_{12}^2 \simeq \frac{z_2\theta_2^2}{z_1\theta_1^2}\,\max(\theta_1^2,\theta_2^2), \tag{4.9}$$

which is the same result as the one we obtained in the N-subjettiness case with an extra factor $\max(\theta_1^2,\theta_2^2)$.¹³ For secondary emissions, $\theta_{12}\ll\theta_1$, hence $\theta_2\simeq\theta_1$ and we have (with $z_2$ measuring the momentum fraction with respect to emission 1)

$$C_2 \simeq z_2\,\frac{\theta_{12}^2}{\theta_1^2}\,\theta_1^2 = z_2\theta_{12}^2. \tag{4.10}$$

The corresponding phase-space is represented in Fig. 1c and gives

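$$\begin{aligned} R_{C_2}(z_1) ={}& \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\,\Theta(\rho>z_2\theta_2^2)\Big[\Theta(\theta_2^2<\theta_1^2)\,\Theta(z_2\theta_2^2\theta_1^2>\rho C)+\Theta(\theta_2^2>\theta_1^2)\,\Theta(z_2\theta_2^4>\rho C)\Big]\\ &+\int_0^{\theta_1^2}\frac{d\theta_{12}^2}{\theta_{12}^2}\int_0^1 dz_2\,\frac{\alpha_s(z_1z_2\theta_{12})}{2\pi}\,P_g(z_2)\,\Theta(z_2\theta_{12}^2>C). \end{aligned} \tag{4.11}$$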

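For a fixed coupling approximation, one finds

$$\begin{aligned} R^{(\text{fixed})}_{C_2}(z_1) ={}& \frac{\alpha_sC_R}{\pi}\Big[L_e^2/2+(L_e-L_\rho+L_1)(L_1+B_i)\,\Theta(L_e>L_\rho-L_1)\Big]\\ &+\frac{\alpha_sC_A}{\pi}\Big[(L_e-L_\rho+L_1)^2/2+B_g(L_e-L_\rho+L_1)\Big]\Theta(L_e>L_\rho-L_1). \end{aligned} \tag{4.12}$$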
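The double-logarithmic part of the $C_R$ term in Eq. (4.12) is simply the area of the vetoed primary region of Eq. (4.11) in the plane of $\ell_\theta = \log(1/\theta_2^2)$ and $\ell_z = \log(1/z_2)$. The sketch below checks this numerically, assuming a fixed coupling, $P_i(z) \to 2C_R/z$ (so $B_i$ is dropped) and $\rho = z_1\theta_1^2$; the function names and the midpoint-rule integration are choices made for this illustration only.

```python
def primary_area_closed_form(L_e, L_rho, L_1):
    """Double-log coefficient of alpha_s C_R / pi in Eq. (4.12), with B_i = 0."""
    W = L_e - L_rho + L_1
    return L_e**2 / 2 + (W * L_1 if W > 0 else 0.0)


def primary_area_numeric(L_e, L_rho, L_1, n=20000):
    """Area of the vetoed primary region of Eq. (4.11) via a midpoint rule.

    For each l_theta we compute analytically the allowed range of l_z and
    integrate its width over l_theta.
    """
    L_theta1 = L_rho - L_1            # log(1/theta1^2), using rho = z1 theta1^2
    l_max = L_e + L_rho + L_1         # no vetoed emission beyond this angle
    h = l_max / n
    area = 0.0
    for k in range(n):
        lt = (k + 0.5) * h
        lower = max(0.0, L_rho - lt)          # z2 < 1 and z2 theta2^2 < rho
        if lt > L_theta1:                      # theta2 < theta1
            upper = L_e + L_1 - lt             # z2 theta2^2 theta1^2 > rho C
        else:                                  # theta2 > theta1
            upper = L_rho + L_e - 2.0 * lt     # z2 theta2^4 > rho C
        area += max(0.0, upper - lower) * h
    return area
```

Comparing the two functions for values on either side of $L_e = L_\rho - L_1$ probes both branches of the $\Theta$ function in Eq. (4.12).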

If we decide to work with $D_2 = C_2/\rho$ rather than $C_2$, and define $L_d = \log(1/D_2) = L_e - L_\rho$, we get, assuming $L_d > 0$,

$$R^{(\text{fixed})}_{D_2}(z_1) = \frac{\alpha_sC_R}{\pi}\Big[(L_d+L_\rho)^2/2+(L_1+L_d)(L_1+B_i)\Big]+\frac{\alpha_sC_A}{\pi}\Big[(L_d+L_1)^2/2+(L_d+L_1)B_g\Big]. \tag{4.13}$$

4.4 Recursive τ21 cut

We now move to the same calculations as above but apply the cut recursively, declustering a C/A jet until the cut is met (see Sec. 2).

The calculation of the shapes mostly remains unchanged but the recursion will affect the allowed phase-space for emissions. As before, let us assume that p1(θ1, z1) is

the emission that dominates the mass after the recursion procedure has been applied and see what constraints on the phase-space the cut imposes on additional emissions p2(θ2, z2).

¹³ Contrary to what we have for $\mu_{1/2}^2$ (see Appendix D), Eq. (4.9) is continuous for $\theta_1 = \theta_2$. Using the exact expression for $\theta_{12}$ in the region $\theta_2 \approx \theta_1$ will therefore not lead to (single) logarithmically […]

For emissions at angles θ2 smaller than θ1, the de-clustering will reach p1 before p2,

which corresponds to the same situation as for the non-recursive case. In fact it remains true for all shape variables under consideration in this paper that for such angular configurations the results from the recursive and non-recursive variants coincide.

Differences occur for emissions at angles larger than $\theta_1$. The physical reason for that comes from emissions at angles larger than $\theta_1$ which would dominate the mass, i.e. for which $z_2\theta_2^2 > z_1\theta_1^2$. In the non-recursive case, these emissions are forbidden by our constraint on the jet mass and this is included in the Sudakov suppression for the jet mass $R_{\text{mass}}(\rho)$ in Eq. (3.1), which imposes that the mass of the jet is truly dominated by the $(z_1,\theta_1^2)$ emission. In the situation where the cut on the shape is applied recursively, some extra care is needed since some of these emissions (those vetoed in the non-recursive case because they would lead to a larger jet mass) can be simply discarded by the recursive procedure. In such a case, they should no longer be forbidden.

For the large-angle region, $\theta_2 > \theta_1$, we therefore have to separate 4 different regions:

• for $z_2\theta_2^2 < \rho\tau$, we have $\tau_{21} \approx z_2\theta_2^2/z_1\theta_1^2 = z_2\theta_2^2/\rho < \tau$, meaning that the constraint is satisfied. That region is therefore allowed,

• for $\rho\tau < z_2\theta_2^2 < \rho$, we have $\tau_{21} \approx z_2\theta_2^2/z_1\theta_1^2 = z_2\theta_2^2/\rho$ as in the previous case, but this time it does not satisfy the condition $\tau_{21} < \tau$. The emission $(z_2,\theta_2^2)$ will thus be discarded, meaning that this region is again allowed,

• for $\rho < z_2\theta_2^2 < \rho/\tau$, we now have $\tau_{21} \approx z_1\theta_1^2/z_2\theta_2^2 = \rho/z_2\theta_2^2$, i.e. $\tau_{21} > \tau$. The condition is once again not satisfied and the region is allowed,

• for $z_2\theta_2^2 > \rho/\tau$, we find similarly $\tau_{21} \approx z_1\theta_1^2/z_2\theta_2^2 = \rho/z_2\theta_2^2 < \tau$. The condition on $\tau_{21}$ would be met, leaving a jet with a mass $z_2\theta_2^2 > \rho$. This region is therefore forbidden.

Compared to the non-recursive case, the vetoed region at large angle is therefore reduced.
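The four regions above can be summarised in a few lines of code. This is a sketch of the leading-logarithmic logic only: the function name and its return convention are hypothetical, and the emission is assumed to satisfy $\theta_2 > \theta_1$ with $\rho = z_1\theta_1^2$.

```python
def classify_large_angle_emission(z2, theta2, rho, tau):
    """Classify an emission with theta2 > theta1 for the recursive tau21 cut.

    Returns (region, vetoed): region is 1..4 in the order of the list above,
    vetoed is True only for emissions that must be forbidden.
    """
    m = z2 * theta2**2            # contribution of emission 2 to the mass
    if m < rho * tau:
        return 1, False           # tau21 < tau: cut satisfied, mass unchanged
    if m < rho:
        return 2, False           # tau21 > tau: cut fails, emission discarded
    if m < rho / tau:
        return 3, False           # emission 2 dominates, but tau21 > tau
    return 4, True                # cut satisfied with a jet mass above rho
```

Only the last region contributes to the large-angle veto in the recursive case.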

In the above discussion, we tacitly assumed that we were working with the gen-$k_t$(1/2) axes or with the optimal axes, but the argument is more general. We could also define $\tau_{21}$ using the exclusive C/A axes, automatically available from the declustering procedure. Indeed, in that case, all emissions with $z_2\theta_2^2 < \rho/\tau$ would fail the cut on $\tau_{21}$ […]

[Figure panels, axes log(1/θ²) and log(zθ²): (a) N-subjettiness or mass-drop; (b) Energy-correlation function]

Figure 2: Same as Fig. 1 but this time for cases where the cut is applied recursively.

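Again, the lowest order result simply exponentiates and the Sudakov suppression, depicted in Fig. 2a, is

$$\begin{aligned} R_{\tau,\text{rec}}(z_1) ={}& \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\Big[\Theta(\theta_2^2>\theta_1^2)\,\Theta(z_2\theta_2^2>\rho/\tau)+\Theta(\theta_2^2<\theta_1^2)\,\Theta(z_2\theta_2^2>\rho\tau)\Big]\\ &+\int_0^{\theta_1^2}\frac{d\theta_{12}^2}{\theta_{12}^2}\int_0^1 dz_2\,\frac{\alpha_s(z_1z_2\theta_2)}{2\pi}\,P_g(z_2)\,\Theta(z_2\theta_{12}^2/\theta_1^2>\tau)-R_{\text{mass}}(\rho), \end{aligned} \tag{4.14}$$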

where we have subtracted Rmass(ρ) which has already been included in (3.1).

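For a fixed coupling approximation, this gives

$$\begin{aligned} R^{(\text{fixed})}_{\tau,\text{rec}}(z_1) ={}& \frac{\alpha_sC_R}{\pi}\Big\{\Big[L_\tau^2/2-L_\rho L_\tau+2L_1L_\tau+B_iL_\tau\Big]\Theta(L_\tau<L_1)\\ &+\Big[L_\tau^2-L_\rho L_\tau+L_1L_\tau+L_1^2/2+B_iL_1\Big]\Theta(L_1<L_\tau<L_\rho)\\ &+\Big[\tfrac12(L_\rho+L_1+L_\tau+2B_i)(L_\tau+L_1-L_\rho)\Big]\Theta(L_\rho<L_\tau)\Big\}\\ &+\frac{\alpha_sC_A}{\pi}\Big[L_\tau^2/2+B_gL_\tau\Big]. \end{aligned} \tag{4.15}$$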
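The piecewise structure of Eq. (4.15) is easy to transcribe, and a useful sanity check is that the three branches match at $L_\tau = L_1$ and $L_\tau = L_\rho$. The sketch below assumes standard QCD colour factors, $n_f = 5$, an arbitrary value of the coupling and $L_1 \le L_\rho$; the function name is hypothetical.

```python
import math

CF, CA, TR = 4.0 / 3.0, 3.0, 0.5   # standard QCD colour factors (assumed)


def R_tau_rec_fixed(L_tau, L_rho, L_1, alpha_s=0.1, quark=True, nf=5):
    """Fixed-coupling radiator for the recursive tau21 cut, Eq. (4.15)."""
    Bq = -3.0 / 4.0
    Bg = -(11.0 * CA - 4.0 * nf * TR) / (12.0 * CA)
    CR, Bi = (CF, Bq) if quark else (CA, Bg)
    if L_tau < L_1:
        primary = L_tau**2 / 2 - L_rho * L_tau + 2 * L_1 * L_tau + Bi * L_tau
    elif L_tau < L_rho:
        primary = (L_tau**2 - L_rho * L_tau + L_1 * L_tau
                   + L_1**2 / 2 + Bi * L_1)
    else:
        primary = 0.5 * (L_rho + L_1 + L_tau + 2 * Bi) * (L_tau + L_1 - L_rho)
    secondary = L_tau**2 / 2 + Bg * L_tau
    return alpha_s / math.pi * (CR * primary + CA * secondary)
```

Evaluating the function just below and just above each boundary gives the same value, as expected for a radiator built from areas of the emission phase-space.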

4.5 Recursive µ2 cut (pure mass-drop tagger)

The situation is mostly the same as for the recursive $\tau_{21}$ cut. Here, the use of a recursive criterion allows one to use either the subjets naturally given by the C/A declustering or the gen-$k_t$(1/2) subjets. The results presented in this section are valid for both $\mu_0^2$ and $\mu_{1/2}^2$ […] answer for the mass distribution in different ways, and would give different answers for other observables.

As before, for $\theta_2$ smaller than $\theta_1$, the declustering has no effect and the results are as obtained in Sec. 4.2. The complication related to the clustering distance for $\theta_2 \gg \theta_1$ is absent here because of the declustering, and only emissions with $z_2\theta_2^2 > \rho/\mu^2$ have to be vetoed. In all other cases, either the mass-drop condition fails and the emission is simply discarded, or the mass-drop condition is satisfied but the mass of the jet remains $z_1\theta_1^2$.¹⁴ E.g., for the natural choice, $\mu_0^2$, all emissions in the region $z_2\theta_2^2 < \rho/\mu_0^2$ will fail the condition and be discarded before the recursion continues. That said, the only remaining difference between a recursive $\mu^2$ cut and a recursive $\tau_{21}$ cut will be in the extra factor $z_1$ in the secondary emissions (see, e.g. Sec. 4.2) and we find

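$$\begin{aligned} R_{\mu^2,\text{rec}}(z_1) ={}& \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\Big[\Theta(\theta_2^2>\theta_1^2)\,\Theta(z_2\theta_2^2>\rho/\mu^2)+\Theta(\theta_2^2<\theta_1^2)\,\Theta(z_2\theta_2^2>\rho\mu^2)\Big]\\ &+\int_0^{\theta_1^2}\frac{d\theta_{12}^2}{\theta_{12}^2}\int_0^1 dz_2\,\frac{\alpha_s(z_1z_2\theta_2)}{2\pi}\,P_g(z_2)\,\Theta(z_1z_2\theta_{12}^2/\theta_1^2>\mu^2)-R_{\text{mass}}(\rho). \end{aligned} \tag{4.16}$$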

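For a fixed coupling approximation, we get

$$\begin{aligned} R^{(\text{fixed})}_{\mu^2,\text{rec}}(z_1) ={}& \frac{\alpha_sC_R}{\pi}\Big\{\Big[L_\mu^2/2-L_\mu L_\rho+2L_\mu L_1+B_iL_\mu\Big]\Theta(L_\mu<L_1)\\ &+\Big[L_\mu^2-L_\mu L_\rho+L_\mu L_1+L_1^2/2+B_iL_1\Big]\Theta(L_1<L_\mu<L_\rho)\\ &+\Big[\tfrac12(L_\rho+L_1+L_\mu+2B_i)(L_\mu+L_1-L_\rho)\Big]\Theta(L_\rho<L_\mu)\Big\}\\ &+\frac{\alpha_sC_A}{\pi}\Big[(L_\mu-L_1)^2/2+B_g(L_\mu-L_1)\Big]\Theta(L_\mu>L_1), \end{aligned} \tag{4.17}$$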

where the $C_R$ contribution is the same as for the recursive $\tau_{21}$ cut and the $C_A$ contribution is the same as for the non-recursive $\mu_{1/2}^2$ cut.

4.6 Recursive C2 cut

Again, the calculation unfolds as for the two recursive cases above with a contribution from “failed” conditions for $\theta_2 > \theta_1$ and a standard constraint for $\theta_2 < \theta_1$. In the first case, $e_2$ (resp. $e_3$) is set by emission $p_2$ (resp. $p_1$) and $\theta_{12} \approx \theta_2$. In the second case, $e_2$ (resp. $e_3$) is set by emission $p_1$ (resp. $p_2$) and $\theta_{12} \approx \theta_1$, yielding

$$C_2 = \frac{z_1\theta_1^2}{z_2}\,\Theta(\theta_2>\theta_1)+\frac{z_2\theta_2^2}{z_1}\,\Theta(\theta_2<\theta_1). \tag{4.18}$$

¹⁴ As for the axes choice in N-subjettiness, these regions will differ for $\mu^2$ […]

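The Sudakov exponent will ultimately be given by

$$\begin{aligned} R_{C,\text{rec}}(z_1) ={}& \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\Big[\Theta(\theta_2^2>\theta_1^2)\,\Theta(z_2\theta_2^2>z_1\theta_1^2)\,\Theta(z_2>\rho/C)\\ &+\Theta(\theta_2^2<\theta_1^2)\,\Theta(z_2\theta_2^2>z_1\theta_1^2)+\Theta(\theta_2^2<\theta_1^2)\,\Theta(z_2\theta_2^2<z_1\theta_1^2)\,\Theta(z_2\theta_2^2>\rho C/\theta_1^2)\Big]\\ &+\int_0^{\theta_1^2}\frac{d\theta_{12}^2}{\theta_{12}^2}\int_0^1 dz_2\,\frac{\alpha_s(z_1z_2\theta_2)}{2\pi}\,P_g(z_2)\,\Theta(z_2\theta_{12}^2>C)-R_{\text{mass}}(\rho). \end{aligned} \tag{4.19}$$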

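For a fixed coupling approximation, we obtain

$$\begin{aligned} R^{(\text{fixed})}_{C,\text{rec}}(z_1) ={}& \frac{\alpha_sC_R}{\pi}\Big\{-L_e^2/2\;\Theta(L_e<L_\rho-L_1)\\ &+\Big[(L_e+L_1-L_\rho)(L_e+2L_1-L_\rho+B_i)-L_e^2/2\Big]\Theta(0<L_\rho-L_e<L_1)\\ &+\Big[(L_e+2L_1-2L_\rho)(L_e+2L_1)/2+B_i(L_e-2L_\rho+2L_1)\Big]\Theta(L_e>L_\rho)\Big\}\\ &+\frac{\alpha_sC_A}{\pi}\Big[(L_e+L_1-L_\rho)^2/2+B_g(L_e+L_1-L_\rho)\Big]\Theta(L_e>L_\rho-L_1). \end{aligned} \tag{4.20}$$

4.7 Towards NLL accuracy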

In this article, as we have stated before, we are aiming to achieve only a (modified) leading-logarithmic description of the shape variables we study here. This level of approximation has already been demonstrated to capture the main physical features of various jet tagging and grooming tools (see e.g. Refs. [14,37]).

Nevertheless it may ultimately prove important to extend the scope of our current

studies in various directions. One potential reason for this could be that here we

study tools that have some broad similarities e.g. all of them place constraints on subjet masses. In order to understand in more detail the differences between these tools it would be helpful to increase the accuracy of our analytical predictions, so that differences that may arise beyond LL effects are effectively highlighted. We would also expect such differences to show up in the Monte Carlo event generator studies, like those carried out below, since event generators would partially capture many sources of subleading corrections.

Secondly, we do not study here the question of optimal values of cuts on subjet variables, mainly confining ourselves to the region with both $v_{\text{cut}}$ and $\rho \ll 1$. To meaningfully explore the dependence on $v_{\text{cut}}$ and $\rho$ over a broader range of values of the variables concerned, one may need to carefully investigate effects beyond leading-logarithmic level including the role of hard non-leading-logarithmically enhanced contributions.

With such future developments in mind we discuss below several extra ingredients that are required to reach NLL accuracy: soft-and-large-angle contributions, multiple emissions, the two-loop β function for $\alpha_s$, finite $z_1$ corrections and non-global logarithms [38].

For the figures where we compare to Monte Carlo simulations, we will include multiple emission effects (numerically important; see below for their effect on the radiator function), two-loop running coupling corrections (trivial to add, see Appendix A.1) as well as finite $z_1$ corrections (important for the physics discussion; see Appendix A.4).

We have not included in our analytic results contributions which are power-suppressed in the jet radius R. Although they would be relevant for a full phenomenological prediction, and can be substantial at the peak of the distributions (see e.g. Section 5 of [39]), these are expected to have little impact when comparing the discriminative power of different jet shapes. Moreover, they would be further reduced by the combination with a grooming procedure which, as we argue in Section 6, is the natural future direction of this work.

Soft-and-large-angle radiation. A source of single-logarithmic corrections comes

from radiating soft gluons at large angles. This would correspond to all the limits beyond the strict collinear ordering that we have adopted until now i.e. it can come from either θ1 ∼ R, or θ2 ∼ R, or θ1 ∼ θ2.

The first two regions would give single-logarithmic corrections proportional to $R^2$. In the small-R approximation we have adopted so far, these would further be suppressed. At the same order of accuracy, one would also have to include contributions coming from initial-state radiation and potential colour-correlation with the recoiling partonic system [39]. Taking these into account would also add single-logarithmic contributions to the mass distributions. This significantly complicates the discussion, especially for signal jets, where the mass would no longer be identical to the boosted heavy-boson mass and we would have to impose a certain window around the signal mass. In practice, therefore, one usually applies these techniques together with some grooming procedure which would drastically change this discussion. Some first results have already been obtained in [40] for grooming techniques and we reserve for future work the addition of radiation constraints to that discussion. We will comment on that a bit further in Section 6.

The situation for $\theta_1 \sim \theta_2$ is a bit more involved and we show in Appendix D that it would only contribute to single-logarithmic corrections suppressed by $\theta_1^2$. These contributions are also at most proportional to $R^2$, although since radiation constraints tend to take most of their discriminative power from the large-angle region $\theta_2 > \theta_1$, […] $\theta_1 \sim \theta_2$ region would be even further suppressed.

Multiple emissions. Multiple gluon emissions also bring single-logarithmic corrections to our results and we briefly discuss below how to account for them for the non-recursive variants of the shapes.

They correspond to cases where several gluon emissions, $(z_2,\theta_2),\dots,(z_n,\theta_n)$, are only strongly ordered in angle and give similar contributions to the shape $v$, i.e. when $v(z_2,\theta_2^2;z_1,\theta_1^2)\sim\dots\sim v(z_n,\theta_n^2;z_1,\theta_1^2)$. This will come with a single-logarithmic correction $\alpha_s^{n-1}L_v^{n-1}$ to the resummed exponent $R$.

It is important to realise that we will keep working in the $v \ll 1$ limit and so neglect the contribution where all the $z_i\theta_i^2$, $i\ge2$, are of the same order as $z_1\theta_1^2$. This would also give a single-logarithmic correction of the form $\alpha_s^nL_\rho^nf_n(v)$. Up to power corrections, we can take $f_n$ constant and this correction would therefore simply be equivalent to the multiple-emission correction to the plain jet mass, cancelling against the corresponding normalisation in the spectrum of $v$.¹⁵ So, from now on, we focus on the region where all the $z_i\theta_i^2$, $i\ge2$, are much smaller than $z_1\theta_1^2$ and compute the corresponding correction to $R_v(z_1)$ for a fixed $z_1$.

The cases of N-subjettiness and energy-correlation functions are mostly straightforward. In the kinematical configurations under consideration, the (optimal or gen-$k_t$) N-subjettiness axes will still align with the jet axis and with the emission $(z_1,\theta_1)$ setting the mass. At a given $z_1$, both $\tau_{21}$ and $C_2$ will therefore be additive and the correction to $R_v(z_1)$ will be $\gamma_ER'_v(z_1)+\log[\Gamma(1+R'_v(z_1))]$ where $\gamma_E$ is the Euler constant and $R'_v(z_1)$ is the derivative of $R_v(z_1)$ with respect to $L_v$.
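The multiple-emission correction for additive shapes is a one-line function of $R'_v$. The sketch below uses only the Python standard library; the function name is hypothetical and $R'_v$ is taken as a plain number, to be supplied from whichever radiator is being studied.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant


def multiple_emission_correction(R_prime):
    """Correction gamma_E R' + log Gamma(1 + R') to the exponent R_v(z1)."""
    return EULER_GAMMA * R_prime + math.lgamma(1.0 + R_prime)
```

For small $R'_v$ the linear terms cancel and the correction starts at order $R'^2_v$, with coefficient $\pi^2/12$ from the expansion of $\log\Gamma$, which is why it only matters beyond the leading double logarithms.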

The situation is a bit more involved for the mass-drop parameter. Had we defined $\mu^2$ as $(m_{j_1}^2+m_{j_2}^2)/m^2$, $\mu^2$ would have been additive and a similar conclusion as for $\tau_{21}$ and $C_2$ would have been reached. Since $\mu^2$ is defined as a maximum over the two subjets rather than a sum, we should instead use the fact that the condition $\mu^2 < \mu^2_{\text{cut}}$ will be satisfied if both $m_{j_1}^2 < \mu^2m^2$ and $m_{j_2}^2 < \mu^2m^2$.

In practice, the emissions will either be clustered with the original hard parton or with the emission setting the mass. How exactly the particles in the jet are sifted in these two sets can depend non-trivially on the details of the clustering. If we take, as an approximation, the assumption that particles behave independently, they will be clustered with the hard parton or the emission setting the mass according to which is geometrically closer, in a way similar to the heavy-jet mass in $e^+e^-$ collisions [41]. If we split $R_{\mu^2_{1/2}}(z_1)$ into two contributions according to whether the emissions are clustered with one or the other of the subjets,

$$R_{\mu^2_{1/2},0}(z_1) = \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\Big[\Theta(\theta_2^2<\theta_1^2)\,\Theta(\rho>z_2\theta_2^2>\rho\mu^2)+\tfrac12\,\Theta(\theta_2^2>\theta_1^2)\,\Theta(\rho>z_2\theta_2^2>\rho\mu^2)\Big] \tag{4.21}$$

and

$$\begin{aligned} R_{\mu^2_{1/2},1}(z_1) ={}& \int_0^1\frac{d\theta_2^2}{\theta_2^2}\int_0^1 dz_2\,\frac{\alpha_s(z_2\theta_2)}{2\pi}\,P_i(z_2)\,\Theta(\theta_2^2>\theta_1^2)\,\tfrac12\,\Theta(\rho>z_2\theta_2^2>\theta_1^2\mu^2)\\ &+\int_0^{\theta_1^2}\frac{d\theta_{12}^2}{\theta_{12}^2}\int_0^1 dz_2\,\frac{\alpha_s(z_1z_2\theta_{12})}{2\pi}\,P_g(z_2)\,\Theta(z_1z_2\theta_{12}^2/\theta_1^2>\mu^2), \end{aligned} \tag{4.22}$$

each of these two parts becomes additive and we obtain the following correction to $R_{\mu^2_{1/2}}$:

$$\gamma_ER'_{\mu^2_{1/2}}(z_1)+\log[\Gamma(1+R'_{\mu^2_{1/2},0}(z_1))]+\log[\Gamma(1+R'_{\mu^2_{1/2},1}(z_1))]. \tag{4.23}$$

¹⁵ These types of corrections may however be crucial in trying to obtain the spectrum of $v$ at finite […]

This is however only an approximation and we leave a more precise treatment for future work. At this stage, it can also be seen as the fact that, compared to N -subjettiness and energy-correlation functions, the mass-drop parameter is more delicate to tackle analytically.

Before going to comparisons with Monte Carlo simulations, we can observe that the two axes of 2-subjettiness can be viewed as partitioning the jet in two subjets, one with the jet constituents closer to the hard parton, one with those closer to the emission setting the mass. If instead of summing over all particles in the jet we were summing independently over the contributions of each of the two subjets and defining a modified 2-subjettiness as the maximum of these two contributions, the resummation of multiple emissions for that observable would follow Eq. (4.23). However, since $\Gamma(1+R'_0)\,\Gamma(1+R'_1)/\Gamma(1+R'_0+R'_1) < 1$ we should expect this variant of 2-subjettiness to perform worse than its original definition. Conversely, defining the mass-drop parameter as $(m_{j_1}^2+m_{j_2}^2)/m_j^2$ would not only make its analytic behaviour simpler but could also translate into a slightly more efficient tool.
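The inequality on the ratio of Γ functions can be checked numerically. In the sketch below the function name is hypothetical and $R'_0$, $R'_1$ are treated as arbitrary positive numbers; the logarithm of the ratio is the difference between the "maximum" form of the correction, Eq. (4.23), and the additive form with $R' = R'_0 + R'_1$.

```python
import math


def log_gamma_ratio(R0_prime, R1_prime):
    """log of Gamma(1+R0') Gamma(1+R1') / Gamma(1+R0'+R1')."""
    return (math.lgamma(1.0 + R0_prime) + math.lgamma(1.0 + R1_prime)
            - math.lgamma(1.0 + R0_prime + R1_prime))


# Scan a small grid of positive values: the logarithm stays negative,
# i.e. the ratio is below one, so the "maximum" variant has a smaller
# Sudakov exponent than the additive one.
grid = [0.1 * k for k in range(1, 31)]
all_negative = all(log_gamma_ratio(a, b) < 0.0 for a in grid for b in grid)
```

For $R'_0 = R'_1 = 1$ the ratio is exactly $1/2$, which gives a feeling for the size of the effect when the derivatives of the radiator are of order one.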

Two-loop running coupling. The inclusion of the two-loop β function is purely a technical complication. In the results presented in Appendix A, we have included its effects.

Finite $z_1$ corrections. Finite $z_1$ corrections would typically give contributions to $R(z_1)$ like $\alpha_s\log(1/v)\log(1/z_1)$ or $\alpha_s\log(1/v)\log(1/(1-z_1))$. The first of these two […] emission, will give a double-logarithmic contribution that we already have included. The second term, as well as the first term integrated over the non-singular contributions to the $P(z_1)$ splitting function, will become important at NLL accuracy. Indeed, after integration over $z_1$, they would give corrections proportional to $\alpha_sL_v$ which contribute at the single-log accuracy. To properly include these corrections, it is sufficient to integrate over the full $P(z_i)$ splitting function (rather than just including the finite piece as a $B_i$ term) and to keep the full $z_1$ dependence when we calculate the shapes in order to get single-logarithmic corrections to $R(z_1)$.

The corresponding results are presented in Appendix A.4. It is interesting to note

that their calculation allows for a nice physical discussion of similarities and differences between background and signal jets. Unless explicitly mentioned, these results will be used for the figures in this paper.

Non-global logarithms. Non-global logarithms are known to be difficult contributions to handle, especially if we want to go beyond the large-$N_c$ approximation, where a general treatment is still lacking. We will not provide an explicit calculation of their contribution in this paper. We note however that it might be beneficial to apply grooming techniques such as SoftDrop which are known to eliminate the contributions from non-global logarithms.

4.8 Comparison with fixed-order Monte-Carlo

As a partial cross-check of our results, the expressions obtained above can be expanded in a series in $\alpha_s$ and compared to EVENT2 [15,16] simulations. Here we compare the (non-recursive) $\tau_{21}$, $\mu_{1/2}^2$ and $C_2$ distributions at order $\alpha_s$.

Note that since we are using the N-subjettiness implementation from FastJet contrib, we have to use pp coordinates (transverse momentum, rapidity and azimuth) rather than $e^+e^-$ ones (energy and polar coordinates).¹⁶ To maximise the efficiency and provide quark jets with a monochromatic $p_t$, events are rotated so that their original $2\to2$ scattering gives 2 jets at $y = 0$.¹⁷ After that rotation, jets are reconstructed with the standard (pp) anti-$k_t$ algorithm [42] with $R = 0.4$.

¹⁶ Alternatively, we could have used an $e^+e^-$ implementation of the jet shapes (and clustering) together with unmodified $e^+e^-$ events. Such an implementation is already readily available in the fastjet-contrib implementation of Energy Correlation Functions. This would however give the same logarithms as in our pp study so we decided to stay with a single coordinate system throughout this paper.

¹⁷ Given the block structure of EVENT2 events, each event can be uniquely associated with a corresponding event with 2 partons in the final state. The latter can be used to define the event rotation. Another approach would be to rotate the event so as to align its thrust axis at $y = 0$.

On the analytic side, we take the fixed-order results,¹⁸ expand (3.5) to first order in $\alpha_s$, and perform the $z_1$ integration.

For N-subjettiness, starting from (4.5) we get

$$\tau\frac{d\Sigma(\tau)}{d\tau} = \frac{\alpha_sC_F}{\pi}(L_\rho+L_\tau+B_q)+\frac{\alpha_sC_A}{\pi}(L_\tau+B_g). \tag{4.24}$$

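For the mass-drop parameter, we use (4.8) and reach

$$\mu^2\frac{d\Sigma(\mu^2)}{d\mu^2} \overset{L_\mu<L_\rho}{=} \frac{1}{L_\rho+B_q}\Big[\frac{\alpha_sC_F}{4\pi}\big(3L_\rho^2+6L_\rho L_\mu-L_\mu^2+4B_q(2L_\rho+L_\mu)+4B_q^2\big)+\frac{\alpha_sC_A}{2\pi}\big(L_\mu^2+2B_qL_\mu+2B_g(L_\mu+B_q)\big)\Big]$$

$$\phantom{\mu^2\frac{d\Sigma(\mu^2)}{d\mu^2}} \overset{L_\mu>L_\rho}{=} \frac{1}{L_\rho+B_q}\Big[\frac{\alpha_sC_F}{\pi}\big(L_\rho^2+L_\rho L_\mu+B_q(2L_\rho+L_\mu)+B_q^2\big)+\frac{\alpha_sC_A}{2\pi}\big(2L_\mu L_\rho-L_\rho^2+2B_qL_\mu+2B_g(L_\rho+B_q)\big)\Big]. \tag{4.25}$$

Finally, for the energy correlation function, we start from (4.12) and obtain

$$C_2\frac{d\Sigma(C_2)}{dC_2} \overset{L_e<L_\rho}{=} \frac{1}{L_\rho+B_q}\Big[\frac{\alpha_sC_F}{2\pi}L_e(4L_\rho-L_e+4B_q)+\frac{\alpha_sC_A}{2\pi}L_e(L_e+2B_g)\Big]$$

$$\phantom{C_2\frac{d\Sigma(C_2)}{dC_2}} \overset{L_e>L_\rho}{=} \frac{\alpha_sC_F}{2\pi}\Big(2L_e+L_\rho+B_q\frac{L_\rho+2B_q}{L_\rho+B_q}\Big)+\frac{\alpha_sC_A}{2\pi}\Big(2L_e-L_\rho+2B_g-B_q\frac{L_\rho}{L_\rho+B_q}\Big). \tag{4.26}$$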
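The $z_1$ integration leading to these expressions can be reproduced numerically. The sketch below does so for the $C_F$ term of Eq. (4.26). It rests on an assumption that is our reading rather than something spelled out here: the $z_1$ integral is approximated by a flat integral over $L_1 \in [0, L_\rho]$ plus a hard-collinear endpoint term $B_q$ at $L_1 = 0$, normalised by $L_\rho + B_q$. Function names are hypothetical.

```python
BQ = -0.75   # hard-collinear coefficient for quark jets


def dR_dLe_CF(L_e, L_rho, L_1):
    """d/dL_e of the C_F bracket of Eq. (4.12) for quark jets (B_i = B_q)."""
    return L_e + ((L_1 + BQ) if L_e > L_rho - L_1 else 0.0)


def cf_coefficient_numeric(L_e, L_rho, n=20000):
    """Coefficient of alpha_s C_F / pi after the (assumed) z1 average."""
    h = L_rho / n
    integral = h * sum(dR_dLe_CF(L_e, L_rho, (k + 0.5) * h) for k in range(n))
    endpoint = BQ * dR_dLe_CF(L_e, L_rho, 0.0)   # hard-collinear piece at L1 = 0
    return (integral + endpoint) / (L_rho + BQ)


def cf_coefficient_eq_4_26(L_e, L_rho):
    """Coefficient of alpha_s C_F / pi read off from Eq. (4.26)."""
    if L_e < L_rho:
        return 0.5 * L_e * (4 * L_rho - L_e + 4 * BQ) / (L_rho + BQ)
    return 0.5 * (2 * L_e + L_rho + BQ * (L_rho + 2 * BQ) / (L_rho + BQ))
```

Under that assumption the two functions agree on both sides of $L_e = L_\rho$; the same procedure applied to Eqs. (4.5) and (4.8) gives the structure of Eqs. (4.24) and (4.25).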

The comparison with EVENT2 is presented in Fig. 3 where we have plotted the shape distributions at order $\alpha_s$ together with our analytic prediction. In these plots, a constant factor $\alpha_s/(2\pi)$ has been factored out. From Fig. 3, we see that this difference goes at least to a constant at large $L_v$, meaning that we do control the leading logarithmic behaviour.

In principle, one can also wonder if the constant term can be obtained from an analytic calculation, which is, strictly speaking, beyond our leading-logarithmic accuracy. For example, we have included in equations (4.24)-(4.26) corrections coming from the hard part of the splitting function. However, we have neglected large-angle contributions proportional to $R^2$ and expected to be small for $R = 0.4$, as well as possible finite $z_1$ corrections. It is unclear from Fig. 3 whether or not this fully accounts for the apparent constant value observed at large $L_v$. In this respect, it is also interesting to note that, contrary to the jet mass where besides the logarithmic and constant terms we would only have power corrections, the constant term in the $L_v$ expansion has some corrections proportional to $1/L_\rho$, coming from the normalisation of the shape distributions by the jet mass cross-section (see Eq. (3.5)). These terms can make the convergence slower.

¹⁸ Running coupling corrections would only enter at order $\alpha_s^2$.

[Figure panels: N-subjettiness; Mass-drop (non-recursive); Energy-correlation function. Top row: $(2\pi/\alpha_s)\,(1/N)\,dN/dL_v$ versus $L_\tau = \log(1/\tau_{21})$, $L_\mu = \log(1/\mu^2_{1/2})$ and $L_e = \log(1/C_2)$, for the mass bins $3<L_\rho<3.5$, $5<L_\rho<5.5$ and $7<L_\rho<7.5$. Bottom row: difference.]

Figure 3: Distributions for the (non-recursive) shapes at order $\alpha_s$ for a few specific bins in the jet mass. A constant factor $\alpha_s/(2\pi)$ has been factored out of the cross-section. The top row shows the distributions themselves, with solid lines corresponding to EVENT2 simulations and dashed lines to our analytic calculation. The bottom row shows the difference between the two.

[Figure panels: N-subjettiness; Mass-drop (non-recursive); Energy correlation function. Top row: $L_v$ coefficient versus $L_\rho$; bottom row: constant term versus $L_\rho$. Curves: all, CF, CA, NF.]

Figure 4: Coefficients of the $L_v$ (top row) and constant (bottom row) terms extracted from the distributions in different bins of the jet mass. For each distribution, we have separated the results in the different colour channels. In all cases, a factor $\alpha_s/(2\pi)$ has been factored out.

To extract more precise information, we have fitted, in each bin of the jet mass,

the coefficient of Lv and the constant term. This has been done in each colour channel

and reported in Fig. 4. Again, we see a good agreement for the linear rise with Lv as

well as for the constant terms proportional to CA and Nf. The slow convergence of the

CF term is related to the above discussion.

More precise statements would require going to larger values of Lv and Lρ. This is

difficult to explore due to limited machine precision.

4.9 Comparison with parton-shower Monte-Carlo

Our resummed analytic results can be directly compared to parton-shower Monte Carlo event generators such as Pythia [43] or Herwig [44]. To do this, we have generated QCD dijet events in 14 TeV pp collisions simulated with Pythia. We have selected anti-$k_t$ ($R = 1$) jets with a transverse momentum of at least 3 TeV.

For our analytical predictions, we have used the results from Appendix A.4, which, unless explicitly mentioned otherwise, include all the computed global NLL corrections discussed in Section 4.7. We have fixed $\alpha_s(M_Z) = 0.1185$ with $N_f = 5$ and frozen the coupling at $\mu_{fr} = 1$ GeV.¹⁹

In Fig. 5, we compare the analytic results obtained for the distribution of N-subjettiness, the mass-drop parameter and the energy-correlation functions, at a given jet mass, with the same distributions obtained with Pythia at parton-level, including only final-state radiation. First of all, if we look at the large $L_v$ region, where our analytic description is valid, we see that it does reproduce nicely the Pythia simulations. However, at smaller $L_v$, Pythia tends to produce more peaked distributions than what we obtain analytically.²⁰ In any case, the main message that one has to take from this comparison is that the generic ordering between the different shapes is well captured by our analytic calculations.

Instead of plotting the distributions themselves, we can instead look at the mass distributions. This has the advantage that we can also consider the recursive versions of the cuts on the shapes. In Fig. 6, we plotted the ratio of the mass distribution obtained after a given cut, $L_v > 2.4$, applied recursively (dashed lines) or not (solid lines) on our three shapes, divided by the jet mass distribution without applying any cut. Globally, our analytic calculations tend to reproduce the main features of the Monte Carlo simulations, although they show longer tails at small masses. Note that for these plots, we have used $D_2$ instead of $C_2$ since, compared to the latter, the former peaks at values of $L_v$ closer to the other two shapes. Furthermore, since we have not computed multiple-emission corrections for the recursive versions of the shape constraints, we have also left aside the multiple-emission corrections to the non-recursive versions for the analytic results plotted in Fig. 6. It is interesting to notice that including the multiple-emission corrections for the non-recursive shapes tends to reduce the tails towards small mass, bringing more resemblance to the Pythia results. We could expect a similar behaviour for the corresponding recursive versions.

¹⁹ Note that Pythia uses a different prescription for the strong coupling, with $\alpha_s(M_Z) = 0.1383$ and a 1-loop running. However, our analytic results use the 2-loop β function. We show in Appendix E that this does not affect our conclusions in any way.

²⁰ Using the prescription from [45] we can replace $R(v)$ by $R(v/(1-v))$ and impose an endpoint, e.g. at $v = 1/2$, which would be the case for N-subjettiness at the order $\alpha_s$. That would produce distributions which look much closer to Pythia, although a more detailed resummation of subleading logarithms of $\rho$ (and $L_v$ when it becomes small), and potentially fixed-order corrections (e.g. for […]

[Figure panels: left, quark - Pythia8(FSR); right, quark - analytic. Both show $(1/\sigma)\,d\sigma/dL_v$ versus $L_v$ at $L_\rho = 4.25$ for $\tau_{21}$, $\mu^2_{1/2}$ and $C_2$.]

Figure 5: Distributions obtained from quark jets for each of the three shapes studied. Left: results obtained with Pythia including only final-state radiation (we used $p_{t,\text{jet}} > 3$ TeV, and $4 < L_\rho < 4.5$); right: results of our analytic calculations (for $p_t = 3$ TeV and $L_\rho = 4.25$).

Finally, we want to investigate how the three shapes we have considered are affected by initial-state radiation (ISR) and non-perturbative effects such as hadronisation and the Underlying Event (UE). To get an insight about the importance of these effects,

we have looked, for each jet mass, at the cut on Lv that has to be applied to obtain a

(30)

0 0.2 0.4 0.6 0.8 1 1.2 0.001 0.01 0.1 1 L_v>2.4 (d σ /d ρ )cu t / ( d σ /d ρ )pla in ρ=m2_/(p t2R2) quark - Pythia8(FSR) τ21 µ21/2 D2 0 0.2 0.4 0.6 0.8 1 1.2 0.001 0.01 0.1 1 L_v>2.4 (d σ /d ρ )cu t / ( d σ /d ρ )pla in ρ=m2_/(p t2R2) quark - analytic τ21 µ21/2 D2

Figure 6: Ratio of the mass spectrum obtained with a cut on one of the shapes, divided by the plain jet mass spectrum. The solid lines are obtained imposing a fixed cut on the jet, while the dashed lines are obtained by imposing the cut recursively. Left: results obtained with Pythia including only final-state radiation (we used pt,jet > 3 TeV, and

Lv > 2.4 corresponding to v < 0.09); right: results of our analytic calculations (for

pt = 3 TeV). Note that multiple emissions are not included in these expressions since

they have not been computed for the recursive versions.

0 1 2 3 4 5 6 7 8 0.001 0.01 0.1 1 Lv,c u t ρ=m2_/(p t2R2) N-subjettiness FSR parton hadron hadr+UE 0 1 2 3 4 5 6 7 8 0.001 0.01 0.1 1 Lv,c u t ρ=m2_/(p t2R2) Mass-drop FSR parton hadron hadr+UE 0 1 2 3 4 5 6 7 8 0.001 0.01 0.1 1 Lv,c u t ρ=m2_/(p t2R2) Energy correlation FSR parton hadron hadr+UE

Figure 7: As a function of the jet mass, value of the cut on a given shape, log(1/vcut)

which would correspond to a 25% tagging rate. Results correspond to dijet events ob-tained with Pythia with pt,jet > 3 TeV. The various curves correspond to different levels

of the simulations. The three plots, from left to right, correspond to N -subjettiness, the mass-drop parameter and the energy-correlation function.


that, as expected, the cuts are quite sensitive to ISR and the UE, with hadronisation effects remaining relatively small.

We attribute this behaviour to the sensitivity of the shapes to soft and large-angle radiation. We also see that the energy correlation function tends to be more sensitive to these effects than N-subjettiness and the mass-drop parameter.
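The cut corresponding to a fixed tagging rate, as plotted in Fig. 7, can be sketched numerically. The snippet below is a minimal illustration only, assuming that per-jet arrays of ρ and of a shape value v are available. The function name `lv_cut_for_tagging_rate`, the binning and the toy data are hypothetical and are not taken from our analysis. In each ρ bin, the cut that keeps 25% of the jets with Lv = log(1/v) above it is simply the 75% quantile of Lv.

```python
import numpy as np

def lv_cut_for_tagging_rate(rho, v, rho_edges, rate=0.25):
    """Return, for each rho bin, the value L_cut such that a fraction
    `rate` of the jets in that bin satisfy L_v = log(1/v) > L_cut."""
    rho = np.asarray(rho, dtype=float)
    lv = np.log(1.0 / np.asarray(v, dtype=float))
    n_bins = len(rho_edges) - 1
    cuts = np.full(n_bins, np.nan)  # NaN marks empty bins
    for i in range(n_bins):
        in_bin = (rho >= rho_edges[i]) & (rho < rho_edges[i + 1])
        if in_bin.any():
            # Keeping the upper `rate` fraction means cutting at the
            # (1 - rate) quantile of L_v.
            cuts[i] = np.quantile(lv[in_bin], 1.0 - rate)
    return cuts

# Toy input (an assumption for illustration, not simulated jets):
# rho is log-uniform in [1e-3, 1) and v is uniform in (0, 1].
rng = np.random.default_rng(0)
toy_rho = 10 ** rng.uniform(-3, 0, 20000)
toy_v = 1.0 - rng.uniform(0, 1, 20000)
toy_edges = np.logspace(-3, 0, 4)
print(lv_cut_for_tagging_rate(toy_rho, toy_v, toy_edges))
```

Running the same function on samples generated at different levels (FSR only, parton, hadron, hadron with UE) would give one curve per level, in the spirit of Fig. 7.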

These conclusions should, however, be taken with some care, since the mass of the jet itself is also subject to non-perturbative effects. In practice, one would rarely use such a cut without some additional grooming of the jet, limiting the non-perturbative effects at least on the reconstruction of the jet mass. We will come back to this point later, in Section 6.

5 Calculations for the signal

We now turn to the case of signal jets, i.e. jets coming from boosted colourless objects that decay into a q q̄ pair (or a pair of gluons), like a W, Z or Higgs boson, or a photon. As already briefly discussed in Sec. 3, the splitting of such a boosted object X into a q q̄ pair differs from a QCD gluon emission in the sense that it does not diverge as 1/z at small transverse-momentum fraction. This means that, although we are still in the regime ρ ≪ 1 and we shall still consider the limit of small v for all jet shapes v we study in this paper, now L1 = log(1/z1) is no longer large. As for the case of QCD jets, we shall write the results as a function of z1, see eq. (3.6), but now we will keep the corrections in z1 and 1 − z1. These finite-z1 corrections generate single-logarithmic terms, in the form of contributions with one logarithm of z1 or 1 − z1 and one logarithm of ρ or v. It is illustrative to expand our results in series of log(1/ρ) and log(1/v) to see explicitly how these terms appear. We shall do this in this Section and use a fixed-coupling approximation to better highlight the physics behind our calculation. In Appendices A.3 and A.4, we give the results with a running coupling. In that case, we found it easier to keep the z1 dependence without making an explicit series expansion, knowing that both results are equivalent at single-logarithmic accuracy.
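To illustrate why log(1/z) is large for QCD emissions but not for signal decays, one can compare the average of log(1/z) for a splitting density proportional to 1/z with that for a density that stays finite at small z. The sketch below is purely illustrative. The flat density used for the signal-like case, the range z_min < z < 1/2 and the helper name `mean_log_inv_z` are assumptions made for this example, not the matrix elements used in our calculation.

```python
import numpy as np

def mean_log_inv_z(weight, z_min, n=200_001):
    """Average of log(1/z) over z in [z_min, 1/2] for a splitting
    density proportional to weight(z), using the trapezoidal rule
    on a logarithmically spaced grid."""
    z = np.geomspace(z_min, 0.5, n)
    w = weight(z)
    f = w * np.log(1.0 / z)
    dz = np.diff(z)
    numerator = np.sum(0.5 * (f[1:] + f[:-1]) * dz)
    normalisation = np.sum(0.5 * (w[1:] + w[:-1]) * dz)
    return numerator / normalisation

for z_min in (1e-2, 1e-4, 1e-6):
    # QCD-like density, divergent as 1/z at small z.
    qcd_like = mean_log_inv_z(lambda z: 1.0 / z, z_min)
    # Signal-like density, taken flat in z (illustrative assumption).
    flat = mean_log_inv_z(lambda z: np.ones_like(z), z_min)
    print(f"z_min={z_min:g}: 1/z density {qcd_like:.3f}, "
          f"flat density {flat:.3f}")
```

For the 1/z density the average is (log(1/z_min) + log 2)/2, which grows with the logarithm of the cut-off, whereas for the flat density it tends to 1 + log 2 as z_min → 0. This is the sense in which the logarithm of the momentum fraction is not enhanced for signal jets, so that its finite corrections have to be kept.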

Besides the careful inclusion of the z1 and 1 − z1 dependence, the calculation follows the same logic as what has been done above and mostly consists of two copies of the contribution from “secondary emissions” in the QCD case, one for each of the decay products of the boosted colourless object. The contributions from each parton will just differ by the replacement z1 ↔ (1 − z1). For simplicity, we still use L1 = log(1/z1) and