machine learning workflows in km3net

42
Machine Learning workflows in KM3NeT Stefan Reck IWAPP workshop 2021-03-10

Upload: others

Post on 27-May-2022

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Machine Learning workflows in KM3NeT

Machine Learning workflows in KM3NeT

Stefan Reck

IWAPP workshop

2021-03-10

Page 2: Machine Learning workflows in KM3NeT

ORCA and ARCAsame technology – different sites

Page 3: Machine Learning workflows in KM3NeT

3Machine Learning workflows in KM3NeT

KM3NeT: neutrino detector network

ORCA

Page 4: Machine Learning workflows in KM3NeT

4Machine Learning workflows in KM3NeT

KM3NeT: neutrino detector network

ARCA

Page 5: Machine Learning workflows in KM3NeT

5Machine Learning workflows in KM3NeT

→ detection principle: measure Cherenkov radiation of charged particles

43 cm

KM3NeT: neutrino detector network

DOM

DOM 31 PMTs

String 18 DOMs

Total (planned) 3 blocks of 115

strings each

Currently 6 strings (ORCA)

1 string (ARCA)

Page 6: Machine Learning workflows in KM3NeT

6Machine Learning workflows in KM3NeT

KM3NeT ORCA

KM3NeT ARCA _

- for neutrinos with TeV energy

- cosmic neutrinos, …

- height: 600m

- line spacing: about 90m

- for neutrinos with GeV energy

- neutrino oscillations, mass ordering, …

- height: 150m

- line spacing: about 20 - 23m

KM3NeT: neutrino detector networkhttps://arxiv.org/abs/1601.07459

Page 7: Machine Learning workflows in KM3NeT

7Machine Learning workflows in KM3NeT

Neutrino interactions

νμ − CCνe − CC ντ − CC ν − NC

Credit: J. Tiffenberg,

NUSKY11

Page 8: Machine Learning workflows in KM3NeT

8Machine Learning workflows in KM3NeT

Detector signatures

shower-like shower-like (83%) shower-like

track-like (17%)

track-like

Credit: J. Tiffenberg,

NUSKY11

Page 9: Machine Learning workflows in KM3NeT

9Machine Learning workflows in KM3NeT

● Two types of background producing photons in the deep sea:

1. Atmospheric muons passing the detector from above

2. Random noise, by K-40 beta decays and bioluminescent organisms

μ μ μ

νμ − CC

KM3NeT backgrounds

Page 10: Machine Learning workflows in KM3NeT

10Machine Learning workflows in KM3NeT

𝝁

𝝂𝝁

Up-going ν𝜇 − CC track-like event ν𝑒 − CC shower-like event

𝝂𝒆

Event topologieshttps://arxiv.org/abs/1601.07459

Page 11: Machine Learning workflows in KM3NeT

11Machine Learning workflows in KM3NeT

𝝁

𝝂𝝁

Up-going ν𝜇 − CC track-like event ν𝑒 − CC shower-like event

𝝂𝒆

Event topologies

How to separate these topologies?

Page 12: Machine Learning workflows in KM3NeT

Random Decision

Forests

Page 13: Machine Learning workflows in KM3NeT

13Machine Learning workflows in KM3NeT

Random decision forest

▪ ensemble of decision trees

▪ input: "hand-crafted" features

→ these need to be manually designed

▪ each tree is trained on random subset of features

▪ each tree outputs either "track" or "shower"

→ final score:

in total: ~150 features!

Page 14: Machine Learning workflows in KM3NeT

14Machine Learning workflows in KM3NeT

what are the input features?

→ e.g. fit quality

▪ use maximum-likelihood-based reconstruction of observables

▪ compare the reco quality of track and shower hypothesis

Random decision forest

Page 15: Machine Learning workflows in KM3NeT

15Machine Learning workflows in KM3NeT

what are the input features?

→ e.g. hit distribution

▪ compare distribution of hits to expectation in simulations

▪ shower events are more spherical

Random decision forest

Page 16: Machine Learning workflows in KM3NeT

16Machine Learning workflows in KM3NeT

Random decision forest

Result:

▪ good separation between track-

like and shower-like events

▪ define separation power S:

quantifies the overlap

between the distributions

PhD thesis Steffen Hallmann

Page 17: Machine Learning workflows in KM3NeT

17Machine Learning workflows in KM3NeT

Random decision forest

Result:

▪ good separation between track-

like and shower-like events

▪ define separation power S:

quantifies the overlap

between the distributions

RDFs are also used for separating

neutrinos from background

PhD thesis Steffen Hallmann

/ORCA

Page 18: Machine Learning workflows in KM3NeT

18Machine Learning workflows in KM3NeT

hits

features

track/shower?

extract

RDF

difficult

incomplete?

● Problem: feature design is not easy and maybe we missed

some good features?

Random decision forest

Page 19: Machine Learning workflows in KM3NeT

19Machine Learning workflows in KM3NeT

● Problem: feature design is not easy and maybe we missed

some good features?

● idea: let an algorithm learn the features directly on low-level

simulations

hits

track/shower?

neural network

Random decision forest

Page 20: Machine Learning workflows in KM3NeT

convolutional

neural networks

Page 21: Machine Learning workflows in KM3NeT

21Machine Learning workflows in KM3NeT

Successful model architecture in image recognition:

Convolutional neural networks (CNNs)

Simplified working principle of a CNN

Convolutional networks

Page 22: Machine Learning workflows in KM3NeT

22Machine Learning workflows in KM3NeT

Our data

How does our data look like?

→ spatial: 3D detector

→ temporal

→ pmt direction: 31 orientations per DOM

DOM

Page 23: Machine Learning workflows in KM3NeT

23Machine Learning workflows in KM3NeT

Our data: X-Y-plane

• line anchors in x-y plane are not on rectangular grid

• apply grid with <1 anchor per bin (assuming static detector)

→ 11 bins in X, 13 bins in Y (ORCA115)

JINST 15 P10005 (2020)

Page 24: Machine Learning workflows in KM3NeT

24Machine Learning workflows in KM3NeT

Our data: Z-plane

• each line has 18 DOMs

→ similar heights for all lines

→ can be easily binned (assuming static detector)

18 bins

Page 25: Machine Learning workflows in KM3NeT

25Machine Learning workflows in KM3NeT

Our data: time

• time coordinate is unbounded and continuous

→only use hits in a time window (e.g. 750 ns)

→ choose time resolution (e.g. 7.5 ns/bin)

• time resolution limited by hardware

• choose e.g. 100 time bins

JINST 15 P10005 (2020)

Page 26: Machine Learning workflows in KM3NeT

26Machine Learning workflows in KM3NeT

Our data: pmt direction

DOM

31 pmts arranged on a 2 sphere

→ no spherical convolution in tensorflow,

so we use the color channel of

convolutions

→ instead of multiple colors, we supply

multiple pmt directions!

Page 27: Machine Learning workflows in KM3NeT

27Machine Learning workflows in KM3NeT

2d plot, other

dimensions not shown

Input for convolutional networks

Page 28: Machine Learning workflows in KM3NeT

28Machine Learning workflows in KM3NeT

Input XYZ-T

CNN

Input XYZ-P

OutputInput XYZ-T

Input XYZ-P

11 x 13 x 18 x (100+31)

● In total, we end up with 5d data (x, y, z, t, pmt)

● But tensorflow only supports up to 4D input to convolutions!

→ Solution:

● Stack two projections of the event: xyz-pmt and xyz-t

● use color channel of convolution for stacked dimension

Input for convolutional networks

softmax / cat. cross.

Page 29: Machine Learning workflows in KM3NeT

29Machine Learning workflows in KM3NeT

Event topology classification

Neural networkvs

random forest

● Separability between track and shower

JINST 15 P10005 (2020)

Page 30: Machine Learning workflows in KM3NeT

30Machine Learning workflows in KM3NeT

Background classification

Neural networkvs

random forest

● separate atmospheric muons and

neutrinos

JINST 15 P10005 (2020)

Page 31: Machine Learning workflows in KM3NeT

31Machine Learning workflows in KM3NeT

● Convolutional networks on our data have various issues:

● no 5D convolution

● xyz positions need to be binned (problem for non-static detector)

● fixed time window with limited resolution

Graph networks

Idea: Use a network architecture that operates on graphs

Page 32: Machine Learning workflows in KM3NeT

graph convolutional

neural networks

Page 33: Machine Learning workflows in KM3NeT

33Machine Learning workflows in KM3NeT

Input for graph networks

Page 34: Machine Learning workflows in KM3NeT

34Machine Learning workflows in KM3NeT

Edge convolution

each hit is a node 𝒙𝒊

= (𝑥, 𝑦, 𝑧, 𝑡, 𝑝𝑚𝑡) of the hit

2 nodes are connected with edge 𝒆𝒊𝒋

= (𝒙𝒊, 𝒙𝒊 − 𝒙𝒋) of the hits i,j

https://arxiv.org/abs/1902.08570

Page 35: Machine Learning workflows in KM3NeT

35Machine Learning workflows in KM3NeT

Edge convolution

• define a multi-layer perceptron and

convolve over all edges

➢ produces an update 𝒖𝒊𝒋 from each edge

Then:

Page 36: Machine Learning workflows in KM3NeT

36Machine Learning workflows in KM3NeT

Edge convolution

• define a multi-layer perceptron and

convolve over all edges

• Update central node 𝒙𝒊 with averaged

updates from k nearest neighbours:

𝒙𝒊 → 𝒙𝒊 + 𝒖𝒊𝒋 𝒋

➢ produces an update 𝒖𝒊𝒋 from each edge

Then:

Page 37: Machine Learning workflows in KM3NeT

37Machine Learning workflows in KM3NeT

● Convolutional networks on our data have various issues:

● no 5D convolution

● xyz positions need to be binned (problem for non-static detector)

● fixed time window with limited resolution

Graph networks

Idea: Use a network architecture that operates on graphs

Fixed, can use n-D convolution

Fixed, unlimited resolution/time window

Fixed, no spatial binning necessary

Page 38: Machine Learning workflows in KM3NeT

38Machine Learning workflows in KM3NeT

How good is the EdgeConv compared to convolutions?

→ faster, fewer parameters (→ less overfitting)

Convolution Graph

train time / epoch 8.3h 2.0h

free parameters 8.4m 370k

Graph networks

Page 39: Machine Learning workflows in KM3NeT

39Machine Learning workflows in KM3NeT

▪ goal: reconstruct direction of atmospheric muons

Convolution Graph

0.0354 0.0349

Best validation loss

Graph networks: direction

loss (here: mean absolute error) is used to

judge performance of the reconstruction – the

lower the better!

(mean absolute error)

Page 40: Machine Learning workflows in KM3NeT

40Machine Learning workflows in KM3NeT

▪ goal: reconstruct number of atmospheric muons in an event

Convolution Graph

0.389 0.361

Best validation loss:

Graph networks: multiplicity

(categorical cross-entropy)

Page 41: Machine Learning workflows in KM3NeT

41Machine Learning workflows in KM3NeT

▪ goal: reconstruct distance between atmospheric muons

Convolution Graph

-1.911 -2.156

Best validation loss:

Graph networks: muon distance

(negative log-likelihood)

Page 42: Machine Learning workflows in KM3NeT

42Machine Learning workflows in KM3NeT

Summary

▪ Machine Learning is an important tool for event reconstruction in KM3NeT

▪ allows to solve otherwise difficult to tackle problems

▪ workflows are improved continuously and adapted to our data