a deep learning approach to pid and alignment of the dircc. fanelli user group annual meeting at...
TRANSCRIPT
A deep learning approach to PID and
alignment of the DIRC
C. FanelliUser Group Annual Meeting at JLab, June 2018
JLAB User Group Workshop and Annual Meeting - June 2018 2
The GlueX detector in Hall D and the DIRC
DIRC will improve GlueX PID capabilities
(current π/K separation limited to 2 GeV/c)
(with DIRC)
3JLAB User Group Workshop and Annual Meeting - June 2018
Milestone:
● 6/5/2018 all of the DIRC bar boxes in Hall D
● No issues mechanically and optically
● 2 installed in the lower part
Transportation and installation
4JLAB User Group Workshop and Annual Meeting - June 2018
Optical Box
Particle track
Cherenkov photons
5JLAB User Group Workshop and Annual Meeting - June 2018
Cherenkov Photon “Ring” in PMT plane
On average about ~30 photons detected per particle
6JLAB User Group Workshop and Annual Meeting - June 2018
Misalignment● After installation the optical box will be filled by
distilled water (refraction index close to bars).
● Optical box made by several components, system for calibration.
● During data-taking this becomes a black-box problem with many non-differentiable terms.
○ relative alignment of the tracking system with the location and angle of the bars
○ mirrors shifts cause parts of the image change
○ other offsets
● These aspects make seemingly impossible to analytically understand the change in PMT pattern
7JLAB User Group Workshop and Annual Meeting - June 2018
A complex inverse problem
● A grid search with high dimensionality seems less suitable. ● Need of dedicated algorithms for alignment optimization
using pure samples of pions from Ks
8JLAB User Group Workshop and Annual Meeting - June 2018
Hit Patterns
● 3D (x,y,t) readout and this allows to separate spatial overlaps.
● Patterns take up significant fractions of the PMT in x,y and are read out over 50-100 ns due to propagation time in bars.
● H12700 PMTs have a time resolution of O(200 ps) and read-out electronics giving time information in 1 ns buckets.
DIRC rings for π⁺ plotted with time on the z-axis.
Credits: J. Hardin, PhD thesis
t
yx
J. Hardin and M. Williams, JINST 11.10 (2016)
9JLAB User Group Workshop and Annual Meeting - June 2018
Bayesian Optimization
t
x
● BO is a strategy for global optimization of black-box functions.
● After gathering evaluations BO builds a posterior distribution used to construct an acquisition function.
● This determines what is next query point.
● BO is agnostic to what is optimized.
-Log
10JLAB User Group Workshop and Annual Meeting - June 2018
Bayesian Optimization
● BO worked amazingly well for tuning MC (Ilten, Williams, Yang [1610.08328]).
● BO could align the DIRC detector leveraging current reconstruction algorithms.
11JLAB User Group Workshop and Annual Meeting - June 2018
Bayesian Optimization
t
x
Toy model● 3 parameters: ● 3-seg mirror θx, θy, θz
Each call based on high puritysample of (only) 100 pions
true: (0.50, 1.0, -0.50) deg
found: (0.48, 0.9, -0.44) deg
true offsets <θx, θy, θz>: (0.50, 1.0, -0.50) deg
Grid search
Toy model
BO-
log
L(π
|π)
12JLAB User Group Workshop and Annual Meeting - June 2018
Detector Optimization
t
x
-Log
● Optimization of detector design is quite complex problem that can be accomplished with BO
● Multi-purpose detector like EIC requires large-scale simulations of the main processes to make decision
● Goal: satisfy detector requirements and minimize cost R&D
13JLAB User Group Workshop and Annual Meeting - June 2018
Reconstruction Algorithms and PIDJ. Hardin and M. Williams, JINST 11.10 (2016)
basically a trade-off memory/CPU usage
faster reconstruction/hit pattern better resolution in regions with high overlap
R. Dzhygadlo et al. Nucl. Instr. And Meth. A, 766:263 (2014)
1. Creation of the LUT: store directions at the end of the radiator for each hit pixel
2. Direction from the LUT for the hit pixels are combined with the track directions (from tracking)
Fast tracing mapping straight lines through a tiled plane
1. Generation - 2. Traces through bars - 3. Traces through expansion volume
https://github.com/jmhardin/FasDIRC
LUT-
base
d g
eom
etric
al
KD
E-ba
sed
Can we use the same technology that facebook uses to
recognize faces for doing particle ID?
14 slide format suggested by a ML algorithmArtwork by Sandbox Studio, Chicago with Ana Kova
15JLAB User Group Workshop and Annual Meeting - June 2018
Artificial Intelligence Any technique enabling
computers to mimic human behaviour
16JLAB User Group Workshop and Annual Meeting - June 2018
Artificial Intelligence
A subset of AI based on
statistical methods to
enable machines to improve with experiences
Machine Learning
17JLAB User Group Workshop and Annual Meeting - June 2018
Artificial Intelligence
A subset of ML which makes the computation of multi-layer
Neural Network feasible
Machine Learning
Deep Learningwhen applied to
massive datasets (as particle
physics experiments) and
giving massive computer power it
outperforms all other models
most of the time
18JLAB User Group Workshop and Annual Meeting - June 2018
Deep Learning
● “Hottest” field in AI, and it’s everywhere
● use of ML (and DL) in HEP is becoming ubiquitous
19JLAB User Group Workshop and Annual Meeting - June 2018
How does it work?
● The real magic about NN is the result of an optimization technique: back-propagation (how a NN works to improve its output over time)
● DL (more hidden) nets are good in
learning non-linear functions (heavy processing tasks)
● Based on old school NN revitalized by augmented capabilities (e.g. GPU) and a plethora of new architectures (RNN, CNN, autoencoders, GAN, etc.)
Forward Propagation
ErrorEstimation
Backward Propagation
20JLAB User Group Workshop and Annual Meeting - June 2018
Why does it work? ● Debatable and may seem a dark art
(e.g. pruning/dropout neurons, transfer learning)
● No doubts it works…
Forward Propagation
ErrorEstimation
Backward Propagation DeepMind
Deep Q-learning playing Atari Breakout
Mnih et al, 1312.5602
Nature, 518.7540 (2015)
21JLAB User Group Workshop and Annual Meeting - June 2018
Example PID: NN vs Likelihood Approach
● LHCb uses NNs trained on 32 features from all subsystems each of which is trained to identify a specific particle type
● Standard candles are used as calibration samples to characterize performance of NN
Typically getting ~3x less misID background per particle.
arXiv:1412.6352v2
better
22JLAB User Group Workshop and Annual Meeting - June 2018
Deep Learning
● In particle physics, our goal is often (if not always) that of distinguishing signal from background.
● We need to build high-level features (e.g. invariant masses) to accomplish this task.
● DNN does not need our “help” and can learn from low-level features.
The DNN is able to learn all that it needs in this case, as providing high-level features results in no gains. DNN using low-level features outperforms any selection based only on high-level features.
Baldi, Sadowski, Whiteson arXiv:1402.4735
23JLAB User Group Workshop and Annual Meeting - June 2018
Convolutional Neural Network
● CNN architecture is inspired by human visual cortex. An image can be thought as a group of numbers each describing an intensity value. This can be input in a NN for classification in output.
● The neurons in a CNN look for local examples of translationally invariant features. This is done using convolutional filters to locate patterns producing maps of simple features, then build complex features using many layers of simple feature maps.
CNN “scans the image”
24JLAB User Group Workshop and Annual Meeting - June 2018
http://scs.ryerson.ca/~aharley/vis/conv/
25JLAB User Group Workshop and Annual Meeting - June 2018
CNNs for neutrinos
DNN at NOvA led to an impressive improvement for νe-detection (equivalent to ~ 30% more in exposure than previous PID techniques: $$$)
● MicroBoone has managed to train CNNs that can locate neutrino interactions within an event (draw bounding boxes), identify objects and assign pixels to them arXiv 1611.05531
Similar work ongoing at other ν-experiments (see, e.g. , NOvA
[1604.01444]), and also at colliders in the area of jet physics
[1511.05190], [1603.09349], …)
26JLAB User Group Workshop and Annual Meeting - June 2018
Generative Adversarial Network
Data sample
sample
R/F
is data sample?
Discriminator
Generator
from noise to an event
CALOGAN can generate the reconstructed CALO image using random noise, skipping the GEANT and RECO steps
Fast Simulations
● Detailed simulation of detector response is provided by amazing tools like Geant, which is slow and often prohibitive for generating large enough samples.
● Cutting-edge application of deep learning uses GAN for fast simulation.
● 2-NN game, one model maps noise to images, the other classifies the images if real or fake.
● The goal is to confuse the discriminator.
- CALOGAN: Paganini, de Oliveira, Nachman 1705.02355- jet images production: 1701.05927
arXiv:1406.2661
27JLAB User Group Workshop and Annual Meeting - June 2018
Generative Adversarial Network
CALOGAN can generate the reconstructed CALO image using random noise, skipping the GEANT and RECO steps
Fast Simulations
● Detailed simulation of detector response is provided by amazing tools like Geant, which is slow and often prohibitive for generating large enough samples.
● Cutting-edge application of deep learning uses GAN for fast simulation.
● 2-NN game, one model maps noise to images, the other classifies the images if real or fake.
● The goal is to confuse the discriminator.
- CALOGAN: Paganini, de Oliveira, Nachman 1705.02355- jet images production: 1701.05927
arXiv:1406.2661
28JLAB User Group Workshop and Annual Meeting - June 2018
Deliverables and timeline
● DIRC alignment with BO. ~ ½ year
● Explore deep nets architectures for GlueX DIRC/PID; determination of best approaches enhanced by BO hyperparameter tuning (on a longer term ~ 1 year).
● Collaboration: MIT/DIRC group.
● Resources: deep learning workstation.
29JLAB User Group Workshop and Annual Meeting - June 2018
Summary
● Use of ML became ubiquitous in particle physics. Deep Learning is starting to make an impact.
● A lot of exciting cutting-edge work not discussed, e.g., DKF, DNNs on FPGAs, automatic anomaly detection, compression using autoencoders, etc.
● Systematics are vital in our field: we are developing systematics aware ML algorithms and we are characterizing “black boxes” (e.g. DIRC optical box, EIC detector design).
● Beyond the issue of systematics, our data have other interesting features from a CS perspective: sparse data, irregularities in detector geometries, heterogeneous information, physical symmetries and conservation laws (e.g. recursive NN), etc.
We just began to scratch the surface when it comes to using tools like BO and DL and recent achievements in our field (e.g. NOvA, LHCb, ...) suggest it’s worth exploring these strategies.
30
BACKUP
JLAB User Group Workshop and Annual Meeting - June 2018
31JLAB User Group Workshop and Annual Meeting - June 2018
Deep jet tagging
u,d,s jet gluon jet b,c jet pileup jet
W,Z jet Higgs jet top jet
● A jet in LHC is a spray of hadrons from shower initiated by some fundamental particle
● Define set of features with
distributions depending on the jet nature
● Train NN using a sample
of jets whose nature is known
32JLAB User Group Workshop and Annual Meeting - June 2018
Deep jet tagging● DNN based on high-level features (jet
masses, multiplicities, energy correlation functions, etc.)
J. Duarte et al arXiv:1804.06913v2
D. Guest et al arXiv:1607.08633v2
33JLAB User Group Workshop and Annual Meeting - June 2018
Intuitively understanding CNN
34JLAB User Group Workshop and Annual Meeting - June 2018
Other Architectures: Recurrent & Recursive NNRecursive Neural Network
G. Louppe, K. Cho, C. Becot, and K. Cranmer 1702.00748v1
● Example: jet physics. ● More efficient than image-based networks.
Recurrent Neural Network
● Applications in jet physics, tracking
35JLAB User Group Workshop and Annual Meeting - June 2018
Other Architectures: Recurrent & Recursive NN
T. Cheng [1711.02633v1]
Recursive Neural Network
Recurrent Neural Network
36JLAB User Group Workshop and Annual Meeting - June 2018
Frameworks
37JLAB User Group Workshop and Annual Meeting - June 2018
Geometrical Reconstruction
38JLAB User Group Workshop and Annual Meeting - June 2018
Geometrical Reconstruction
σ σ σ
pionskaonspionskaons
39JLAB User Group Workshop and Annual Meeting - June 2018
Time-based imaging
pions 2 GeV/c, θ=1.2º, φ=90º
kaons 2 GeV/c, θ=1.2º, φ=90º
40JLAB User Group Workshop and Annual Meeting - June 2018
Time-based imaging
41JLAB User Group Workshop and Annual Meeting - June 2018
FastDIRC - KDE based Reconstruction
42JLAB User Group Workshop and Annual Meeting - June 2018
FastDIRC - KDE based Reconstruction
Fraction Worse = (LUTres - FastDIRCres)/LUTres FastDIRC