A Theory of Quark vs. Gluon Discrimination
Andrew Larkoski
Reed College
with Eric Metodiev 1906.01639
BOOST 2019, July 24, 2019
So, what is a jet?
Experimental Representation
$\{p_i\}_{i \in J}$
N particles: 3N - 4 unique real numbers
Abstract Representation
$\vec{f}(\{p_i\})$
N particles: 3N - 4 unique real numbers
[Figure: each jet mapped to a point $(x_i, y_i, z_i)$ in a space with axes x, y, z; additional dimensions suppressed]
Examples of mappings
$\{p_i\}_{i \in J}$ → pixels (de Oliveira, Kagan, Mackey, Nachman, Schwartzman 2015)
$\{p_i\}_{i \in J}$ → Energy Flow Polynomials (Komiske, Metodiev, Thaler 2017)
$\{p_i\}_{i \in J}$ → clustering history (Louppe, Cho, Becot, Cranmer 2017)
Universal Approximation Theorem: all mappings are equal, but some are more equal than others.
Choose the representation that works for your problem
Theory of Quark vs. Gluon Jet Discrimination
Want infrared and collinear safe representation: eliminates four-vectors, pixels, clustering history, etc.
Want simple, additive all-orders properties
$e_{N+1} = e_N + e_s$
N-subjettinesses and related observables accomplish this
$\{p_i\}_{i \in J}$ → N-subjettiness (also EFPs, ECFs, …) (Datta, AJL 2017)
History: Thaler, van Tilburg 2010, 2011; Stewart, Tackmann, Waalewijn 2010; Brandt, Dahmen 1979; Wu, Zobernig 1979; Nachtmann, Reiter 1982
[Figure: jets as points in a space with axes $\tau_1$, $\tau_2$, $\tau_3$]
$\tau_N^{(\beta)} = \frac{1}{p_{TJ}} \sum_{i \in J} p_{Ti} \min\left\{R_{1i}^{\beta}, R_{2i}^{\beta}, \ldots, R_{Ni}^{\beta}\right\}$
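The N-subjettiness definition can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the talk: it assumes the N axes are supplied by the caller (the slide does not say how they are chosen), it takes $R_{ki}$ to be the rapidity-azimuth distance between axis k and particle i, and it uses the scalar sum of constituent $p_T$ as a stand-in for $p_{TJ}$. The names `delta_r` and `n_subjettiness` are hypothetical.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Rapidity-azimuth distance, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def n_subjettiness(particles, axes, beta=1.0):
    """tau_N^(beta) for particles [(pT, y, phi), ...] and N axes [(y, phi), ...].

    Assumption: p_TJ is approximated by the scalar sum of constituent pT.
    """
    pt_jet = sum(pt for pt, _, _ in particles)
    total = 0.0
    for pt, y, phi in particles:
        # Each particle contributes its pT times the distance to its nearest axis.
        total += pt * min(delta_r(y, phi, ay, aphi) ** beta for ay, aphi in axes)
    return total / pt_jet

# Toy jet: one hard particle and one softer particle 0.3 away in rapidity.
jet = [(90.0, 0.0, 0.0), (10.0, 0.3, 0.0)]
tau1 = n_subjettiness(jet, axes=[(0.0, 0.0)])
tau2 = n_subjettiness(jet, axes=[(0.0, 0.0), (0.3, 0.0)])
print(tau1, tau2)
```

With one axis on the hard particle only the softer particle contributes; with an axis on each particle every distance is zero and $\tau_2$ vanishes, so $\tau_N$ measures the radiation left unresolved by N axes.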
Where do jets live?
For visualization simplicity, just consider $(\tau_1, \tau_2)$
[Figure: the $(\tau_1, \tau_2)$ plane]
IRC safety + additivity = exponential suppression
Exponentially small probability in regions where $\tau_N \to 0$
Particle production as Poisson process
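A rough numerical sketch of this exponential suppression, under stated assumptions: fixed coupling, angular exponent $\beta = 1$, and independent (Poisson) emissions with the soft-collinear density $(2\alpha_s C/\pi)\,dz/z\,d\theta/\theta$ that appears in the fixed-order integrals of this talk. The region $z\theta > \tau$ then has logarithmic area $\frac{1}{2}\log^2\tau$, so the probability of no emission above $\tau$ is $\exp(-\frac{\alpha_s C}{\pi}\log^2\tau)$. The value $\alpha_s = 0.1$ and the function names are illustrative choices only.

```python
import math

C_F = 4.0 / 3.0  # quark color factor
C_A = 3.0        # gluon color factor

def expected_emissions(tau, color, alpha_s=0.1):
    """Mean number of emissions with z*theta > tau: (2 alpha_s C / pi) * (1/2) log^2(tau)."""
    return alpha_s * color / math.pi * math.log(tau) ** 2

def prob_below(tau, color, alpha_s=0.1):
    """Poisson probability of zero emissions above tau, i.e. P(tau_1 < tau)."""
    return math.exp(-expected_emissions(tau, color, alpha_s))

for tau in (0.3, 0.1, 0.01, 0.001):
    pq = prob_below(tau, C_F)
    pg = prob_below(tau, C_A)
    print(f"tau={tau:<6} P_q={pq:.4f}  P_g={pg:.4f}  P_g/P_q={pg / pq:.4f}")
```

Both probabilities fall off exponentially in $\log^2\tau$, and because $C_A > C_F$ the gluon probability falls faster, so the ratio $P_g/P_q$ itself tends to zero as $\tau \to 0$.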
Gluon jet: emission rate $\propto C_A$
Quark jet: emission rate $\propto C_F < C_A$
Additivity then implies $\tau_N^{\text{gluon}} > \tau_N^{\text{quark}}$
Exponentially more likely to be quark than gluon where $\tau_N \to 0$
Arbitrarily pure sample of quark jets for $\tau_N \to 0$
Quark “reducibility factor” $\kappa_q = 0$
Can also determine best gluon purity
Argument is more subtle; details in our paper
More likely to be gluon than quark here
Where do gluon jets live?
Simplified argument:
For all $\tau_N \sim 1$, probabilities described at fixed-order:
$p_g(\{\tau_N \sim 1\}) \simeq (\alpha_s C_A)^N f_g(\{\tau_N\})$
$p_q(\{\tau_N \sim 1\}) \simeq (\alpha_s C_F)^N f_q(\{\tau_N\})$
No non-analytic structure where $\tau_N \sim 1$
Quark likelihood = Gluon “reducibility factor” = $\kappa_g \sim \left(\frac{C_F}{C_A}\right)^N$
Gluon jets are always contaminated by some quark jets
In practice small because $C_F/C_A \approx 0.44$
Resolving only 6 emissions: $\left(\frac{C_F}{C_A}\right)^6 < 1\%$
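The size of the quark contamination can be checked with exact arithmetic. This sketch only evaluates the scaling $(C_F/C_A)^N$ with the SU(3) values $C_F = 4/3$ and $C_A = 3$; it ignores the ratio of the functions $f_q$ and $f_g$, and the function name is hypothetical.

```python
from fractions import Fraction

C_F = Fraction(4, 3)  # SU(3) quark color factor
C_A = Fraction(3)     # SU(3) gluon color factor

def gluon_reducibility_estimate(n):
    """Scaling estimate (C_F / C_A)^n for n resolved emissions."""
    return (C_F / C_A) ** n

for n in range(1, 7):
    kappa = gluon_reducibility_estimate(n)
    print(f"N={n}: (C_F/C_A)^N = {kappa} = {float(kappa):.4f}")
```

In this estimate six resolved emissions is the first N at which the contamination drops below one percent; five emissions still leave it above that level.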
One final observation
What is the optimal quark versus gluon discriminant?
Ans: Jerzy Neyman and Egon Pearson say likelihood ratio (Neyman, Pearson 1933)
$\mathcal{L} = \frac{p_g(\{\tau_N\})}{p_q(\{\tau_N\})}$
What are the properties of the likelihood ratio?
Ans: No clue. Some nasty function of $\{\tau_N\}$ (AJL, BOOST 2019)
Ans: As any $\tau_N \to 0$, likelihood vanishes
$\tau_N \to 0$ is the fixed-order divergent limit
Quark vs. Gluon likelihood ratio is IRC safe!
$\mathcal{L} = \frac{p_g(\{\tau_N\})}{p_q(\{\tau_N\})} \to 0$ as any $\tau_N \to 0$
$\mathcal{L} = \frac{p_g(\{\tau_N\})}{p_q(\{\tau_N\})} \to \left(\frac{C_A}{C_F}\right)^N$ where jets are more likely to be gluon than quark
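Both limits can be seen explicitly in the simplest case, $N = 1$. The following is a worked sketch under assumptions that are not taken from the slides: fixed coupling, the double-logarithmic approximation with $\beta = 1$, and only $\tau_1$ resolved. It is an illustration, not the calculation in the paper.

```latex
\begin{align}
\Sigma_i(\tau_1) &= \exp\!\left(-\frac{\alpha_s C_i}{\pi}\log^2\tau_1\right),
  \qquad i \in \{q, g\},\quad C_q = C_F,\ C_g = C_A \\
p_i(\tau_1) &= \frac{d\Sigma_i}{d\tau_1}
  = -\frac{2\alpha_s C_i}{\pi}\,\frac{\log\tau_1}{\tau_1}
    \exp\!\left(-\frac{\alpha_s C_i}{\pi}\log^2\tau_1\right) \\
\mathcal{L}(\tau_1) &= \frac{p_g(\tau_1)}{p_q(\tau_1)}
  = \frac{C_A}{C_F}\exp\!\left(-\frac{\alpha_s (C_A - C_F)}{\pi}\log^2\tau_1\right)
\end{align}
```

Because $C_A > C_F$, the exponential drives $\mathcal{L}(\tau_1) \to 0$ as $\tau_1 \to 0$, while for $\tau_1 \to 1$ the exponent vanishes and $\mathcal{L} \to C_A/C_F$, the $N = 1$ case of $(C_A/C_F)^N$. The likelihood ratio stays finite everywhere even though each $p_i$ diverges at fixed order as $\tau_1 \to 0$.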
IRC Safety of the Likelihood
Consequences
$\mathcal{L}(q, g)$
General: Independent of number of resolved emissions N
Non-vacuous: Pronginess discriminators ($D_2$, $\tau_{2,1}$, $\tau_{3,2}$, …) all IRC unsafe
Practical: IRC safe observables are good q/g discriminants out of the box; good $\tau_N$ discrimination well-known (Gallicchio, Schwartz 2012)
Caveat Emptor: Does not mean that likelihood can be calculated at fixed-order
Conclusions
Three results from simple considerations:
Can always purify quark jet sample: $\mathcal{L} = \frac{p_g(\{\tau_N\})}{p_q(\{\tau_N\})} \to 0$
Gluons contaminated by $(C_F/C_A)^N$ quarks: $\mathcal{L} = \frac{p_g(\{\tau_N\})}{p_q(\{\tau_N\})} \to \left(\frac{C_A}{C_F}\right)^N$
Quark/gluon likelihood ratio is IRC safe
Moral: Choice of jet representation matters for understanding!
Other results
Many other results presented in our paper:
Explicit calculations up through $\alpha_s^3$
Derivation of quantitative performance bounds
Relationship between multiplicity and $\tau_N$
[Figure: Quark vs. Gluon AUC versus N for N-subjettiness $\tau_N^{(\beta)}$ with $\beta = 0.5, 1.0, 2.0$, compared with multiplicity. Pythia 8.226, $\sqrt{s} = 14$ TeV, $R = 0.4$, $p_T \in [1000, 1100]$ GeV.]
[Figure: Gluon Jet Background Rejection versus Quark Jet Signal Efficiency for N-subjettiness DNNs with $\beta = 2.0$: DNN($\tau_1$), DNN($\tau_1, \tau_2$), DNN($\tau_1, \tau_2, \tau_3$), DNN($\tau_1, \tau_2, \tau_3, \tau_4$). Pythia 8.226, $\sqrt{s} = 14$ TeV, $R = 0.4$, $p_T \in [1000, 1100]$ GeV.]
[Figure: schematic ROC curve, Background Fraction versus Signal Fraction.]
Validation in simulation
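$\frac{1}{\sigma_0}\frac{d^3\sigma_q^{C_F^3}}{d\tau_1\, d\tau_2\, d\tau_3} = \left(\frac{2\alpha_s}{\pi}\right)^3 C_F^3 \int_0^1 \frac{dz_1}{z_1} \int_0^1 \frac{d\theta_1}{\theta_1} \int_0^1 \frac{dz_2}{z_2} \int_0^{\theta_1} \frac{d\theta_2}{\theta_2} \int_0^1 \frac{dz_3}{z_3} \int_0^{\theta_2} \frac{d\theta_3}{\theta_3} \, \delta(\tau_1 - z_1\theta_1)\, \delta(\tau_2 - z_2\theta_2)\, \delta(\tau_3 - z_3\theta_3)$
$= \left(\frac{2\alpha_s}{\pi}\right)^3 \frac{1}{2} \frac{C_F^3}{\tau_1 \tau_2 \tau_3} \left( \frac{1}{3}\log^3\frac{\tau_3}{\tau_1} - \frac{1}{3}\log^3\tau_3 + \log\tau_1 \log^2\frac{\tau_2}{\tau_3} \right)$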
[Figure: Quark vs. Gluon AUC versus $\alpha$ for the N-subjettiness combination $\left(\tau_1^{(\beta)}\right)^{\alpha} \tau_2^{(\beta)}$ with $\beta = 0.5, 1.0, 2.0$. Pythia 8.226, $\sqrt{s} = 14$ TeV, $R = 0.4$, $p_T \in [1000, 1100]$ GeV.]
Now Available!
Bonus Slides
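Reducibility Factors for QCD vs. Z jets
$\frac{d\sigma_Z(\tau_1^{(2)})}{d\tau_2^{(2)}} = \frac{2\alpha_s}{\pi} \frac{C_F}{\tau_2^{(2)}} \log\frac{\tau_1^{(2)}}{\tau_2^{(2)}}$
$\frac{d\sigma_q(\tau_1^{(2)})}{d\tau_2^{(2)}} = -\frac{\alpha_s}{\pi} \frac{1}{\tau_2^{(2)}} \left[ \frac{C_F}{2}\log\tau_1^{(2)} + (C_F + C_A)\log\frac{\tau_2^{(2)}}{\tau_1^{(2)}} \right]$
$\kappa_q = \left. \frac{-2 C_F \log\frac{\tau_1^{(2)}}{\tau_2^{(2)}}}{\frac{C_F}{2}\log\tau_1^{(2)} + (C_F + C_A)\log\frac{\tau_2^{(2)}}{\tau_1^{(2)}}} \right|_{\tau_2^{(2)} \to \tau_1^{(2)}} = 0$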
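Form of Calculation
[Figure: two emission phase-space diagrams in the $\left(\log\frac{1}{\theta}, \log\frac{1}{z}\right)$ plane, with $\log\frac{1}{\tau_1}$ and $\log\frac{1}{\tau_2}$ marked on the axes]
$p(\tau_1, \tau_2) = \int dz_1 \, p(\tau_1)\, p(z_1 | \tau_1)\, p(\tau_2 | \tau_1, z_1)$
$\text{Area}_{C_F} = \frac{\alpha_s}{\pi} C_F \left( \log^2\frac{\tau_2}{\tau_1} + 2 \log z_1 \log\frac{\tau_2}{\tau_1} \right)$
$\text{Area}_{C_A} = \frac{\alpha_s}{\pi} C_A \log^2\frac{\tau_2}{\tau_1}$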
Robust bound on AUC
[Figure: schematic ROC curve, Background Fraction versus Signal Fraction]
$\text{AUC} \geq \frac{\kappa_S + \kappa_B - 2\kappa_S\kappa_B}{2 - 2\kappa_S\kappa_B}$
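The bound can be evaluated directly. This sketch assumes the two symbols in the formula are the signal and background reducibility factors $\kappa_S$ and $\kappa_B$, and that AUC is the area under the curve of background fraction versus signal fraction, so 0 is perfect separation and 0.5 is a random guess. Under those assumptions the bound is the area under a two-segment ROC curve that leaves the origin with slope $\kappa_S$ and arrives at $(1, 1)$ with slope $1/\kappa_B$. The function name is hypothetical.

```python
def auc_lower_bound(kappa_s, kappa_b):
    """(kS + kB - 2 kS kB) / (2 - 2 kS kB) for reducibility factors in [0, 1]."""
    if kappa_s * kappa_b == 1.0:
        # Both factors equal 1: the two samples are indistinguishable.
        return 0.5
    return (kappa_s + kappa_b - 2 * kappa_s * kappa_b) / (2 - 2 * kappa_s * kappa_b)

# Quark vs. gluon estimate: kappa_q = 0 and kappa_g ~ (C_F / C_A)^N with C_F/C_A = 4/9.
for n in (1, 2, 3, 4):
    kappa_g = (4.0 / 9.0) ** n
    print(f"N={n}: AUC >= {auc_lower_bound(0.0, kappa_g):.4f}")
```

With $\kappa_q = 0$ the bound reduces to $\kappa_g / 2$, so in this estimate each additional resolved emission lowers the smallest attainable AUC by a factor of $C_F/C_A$.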
Gluon Reducibility Factor in Simulation
[Figure: Gluon Jet Background Rejection versus Quark Jet Signal Efficiency for N-subjettiness DNNs with $\beta = 2.0$: DNN($\tau_1$), DNN($\tau_1, \tau_2$), DNN($\tau_1, \tau_2, \tau_3$), DNN($\tau_1, \tau_2, \tau_3, \tau_4$). Pythia 8.226, $\sqrt{s} = 14$ TeV, $R = 0.4$, $p_T \in [1000, 1100]$ GeV.]
Slope of ROC compared to predicted $(C_A/C_F)^N$
Relationship to Multiplicity
[Figure: Quark vs. Gluon AUC versus N, shown for N from 0 to 100 and for N from 0 to 5, for N-subjettiness $\tau_N^{(\beta)}$ with $\beta = 0.5, 1.0, 2.0$, compared with multiplicity. Pythia 8.226, $\sqrt{s} = 14$ TeV, $R = 0.4$, $p_T \in [1000, 1100]$ GeV.]
[Figure: normalized cross section versus constituent multiplicity for quarks and gluons. Pythia 8.226, $\sqrt{s} = 14$ TeV, $R = 0.4$, $p_T \in [1000, 1100]$ GeV.]