honkanen_heli


TRANSCRIPT

  • 8/7/2019 Honkanen_Heli

    1/24

    Obtaining Parton Distribution Functions

    from Self-Organizing Maps

    Heli Honkanen, ISU & UVa

    In collaboration with:

    Simonetta Liuti (UVa, physics)

    Joseph Carnahan, Yannick Loitiere, Paul Reynolds (UVa, cs)

    Heli Honkanen, SPIN 2008

  • Slide 2/24

    Omnipresent bias

    Theoretical bias: bias introduced by researchers through the precise
    structure of the model they use, which invariably constrains the
    form of the solutions

    Systematic bias: bias introduced by algorithms, such as
    optimization algorithms, which may favor some results in ways
    which are not justified by their objective functions, but rather
    depend on the internal operation of the algorithm

    PDFs always present in hadronic processes involving high
    virtualities:

    σ(x,Q²), F₂(x,Q²) ∝ Σ_{i=q,q̄,g} f_{i/h}(x,Q²) ⊗ σ̂^{(i)}(x,Q²)

    Knowledge of PDFs and their errors crucial in calculations of

    new physics and measurements at the LHC

  • Slide 3/24

    PDF fast facts

    In principle moments of F₂ calculable on the lattice, in practice
    PDFs need to be extracted from measurements

    Needed also for x, Q² combinations not available in DIS,
    DY, ... data → parametrization

    Specific to the incoming hadron, independent of the hard
    scattering process → universal

    Subject to scale evolution: once known at one scale Q₀², can be
    predicted for other Q²

    Current methods: Global Analysis & Neural Networks

  • Slide 4/24

    Extracting PDFs I: Global analysis

    Initial scale (Q₀ ~ 1 GeV ≲ Q^{dat}_{min}) ansatz:

    f_{i/h}(x,Q₀) = a₀ x^{a₁} (1-x)^{a₂} P(x; a₃, ...)

    Evolve to higher scale → compute all the available observables →
    compare with all the available data, e.g.

    χ² = Σ_{expt.} Σ_{i,j=1}^{N_e} (Data_i - Theory_i) (V⁻¹)_{ij} (Data_j - Theory_j)

    Adjust parameters and repeat until the global minimum is found

    Errors estimated with the Hessian method:

    (ΔX)² = Δχ² Σ_{i,j} (∂X/∂y_i) (H⁻¹)_{ij} (∂X/∂y_j)

    Estimates for the current major global analyses are that something like
    Δχ² = 50-100 corresponds to a 90% confidence interval.

    Differences between current sets ~ size of the estimated errors
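The fit-adjust-repeat loop on this slide can be sketched in a few lines. This is a toy: the data points, errors and starting parameters are invented, the covariance is diagonal, and the polynomial factor P and the DGLAP evolution step are omitted.

```python
import numpy as np
from scipy.optimize import minimize

# Toy "data": a structure-function-like quantity at a few x points,
# with a diagonal covariance matrix (real fits use correlated errors).
x_dat = np.array([0.01, 0.05, 0.1, 0.3, 0.5])
data = np.array([1.80, 1.10, 0.80, 0.35, 0.10])
v_inv = np.diag(1.0 / np.array([0.05, 0.03, 0.02, 0.02, 0.01]) ** 2)

def ansatz(x, a):
    # f(x, Q0) = a0 * x**a1 * (1 - x)**a2, the polynomial factor P omitted
    return a[0] * x ** a[1] * (1.0 - x) ** a[2]

def chi2(a):
    r = data - ansatz(x_dat, a)
    return r @ v_inv @ r          # (Data - Theory) V^-1 (Data - Theory)

start = np.array([1.0, -0.2, 3.0])
fit = minimize(chi2, start, method="Nelder-Mead")
print(fit.x, chi2(fit.x))
```

A real global analysis would evolve the ansatz to each data point's Q² and sum this χ² over many experiments before adjusting the parameters.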

  • Slide 5/24

    Uncertainties on Uncertainties

    Choice of statistical estimator: the global χ² is not adequate, as shown by inconsistencies between different data sets

    Error analysis → ambiguities in the usage of data from
    different experiments

    Parametrization dependence → bias from the functional forms
    chosen at the initial scale Q₀²

    Theoretical assumptions → α_s; s, c quark content; details of
    evolution (NNLO, large/small-x resummation, ...)

  • Slide 6/24

    Extracting PDFs II: Neural Network Approach

    (The NNPDF Collaboration)

    State of the NN represented by the weight vector

    ω = {ω^{(1)}_{11}, ω^{(2)}_{11}, ..., θ^{(1)}_1, θ^{(2)}_1, ...}

    ω_{ij} (weights) and θ_i (thresholds) are free parameters to be
    determined by the fitting procedure

  • Slide 7/24

    Neural Network Schematically

    Output of the i-th neuron in the l-th layer:

    ξ^{(l)}_i = g(h^{(l)}_i),   i = 1, ..., n_l,   l = 2, ..., L,

    where the nonlinear activation function

    g(x) = 1/(1 + e^{-x})   (g(x) = x for the last layer)

    is evaluated at a linear combination of the outputs ξ^{(l-1)}_j of all
    neurons in the previous layer,

    h^{(l)}_i = Σ_{j=1}^{n_{l-1}} ω^{(l-1)}_{ij} ξ^{(l-1)}_j - θ^{(l)}_i

    Example: for the (1-2-1) case:

    ξ^{(3)}_1 = ω^{(2)}_{11} / (1 + e^{θ^{(2)}_1 - ξ^{(1)}_1 ω^{(1)}_{11}})
              + ω^{(2)}_{12} / (1 + e^{θ^{(2)}_2 - ξ^{(1)}_1 ω^{(1)}_{21}})
              - θ^{(3)}_1

    General architecture: Σ_{l=1}^{L-1} (n_l n_{l+1} + n_{l+1}) parameters
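The forward pass above can be written out directly. This is a minimal sketch: the weights and thresholds are random placeholders rather than fitted values, and the sizes follow a (2-5-3-1) architecture for illustration.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def forward(x, weights, thresholds):
    """Feed x through the net: g(h) in hidden layers, linear output layer.

    h_i^(l) = sum_j omega_ij^(l-1) * xi_j^(l-1) - theta_i^(l)
    """
    xi = np.asarray(x, dtype=float)
    last = len(weights) - 1
    for l, (w, th) in enumerate(zip(weights, thresholds)):
        h = w @ xi - th
        xi = h if l == last else sigmoid(h)   # g(x) = x for the last layer
    return xi

# Random placeholder parameters for a (2-5-3-1) architecture:
rng = np.random.default_rng(0)
sizes = [2, 5, 3, 1]
ws = [rng.normal(size=(nout, nin)) for nin, nout in zip(sizes, sizes[1:])]
ths = [rng.normal(size=nout) for nout in sizes[1:]]

# Parameter count: sum over layers of (n_l * n_{l+1} + n_{l+1}) = 37 here
n_par = sum(w.size + t.size for w, t in zip(ws, ths))
print(forward([0.1, 1.3], ws, ths), n_par)
```

The parameter count printed at the end is exactly the Σ (n_l n_{l+1} + n_{l+1}) formula of this slide evaluated for (2-5-3-1).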

  • Slide 8/24

    NNPDF algorithm

    Monte Carlo sampling of the data:

    F^{(art)(k)}_i = (1 + r^{(k)}_N σ_N) F^{(exp)}_i + Σ_{p=1}^{N_{sys}} r^{(k)}_p σ_{i,p} + r^{(k)}_i σ_{i,s},   k = 1, ..., N_{rep}

    Use neural networks as universal unbiased interpolating
    functions for each replica (= an individual fit for each replica):

    χ^{2(k)}[ω] = (1/N_{dat}) Σ_{i,j=1}^{N_{dat}} (F^{(art)(k)}_i - F^{(net)(k)}_i) (cov⁻¹)_{ij} (F^{(art)(k)}_j - F^{(net)(k)}_j)

    Genetic algorithm for the minimization

    Global minimum given by the average over the sample of
    trained NNs:

    χ² = (1/N_{dat}) Σ_{i,j=1}^{N_{dat}} (F^{(exp)}_i - ⟨F^{(net)}_i⟩_{rep}) (cov⁻¹)_{ij} (F^{(exp)}_j - ⟨F^{(net)}_j⟩_{rep})

    The uncertainty on the final result is found from the variance
    of the Monte Carlo sample
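A stripped-down sketch of the replica sampling: only uncorrelated statistical noise is kept (the normalization factor and correlated systematic shifts in the formula above are dropped), and the data values are invented. It shows how the spread of the replica sample reproduces the input uncertainty.

```python
import numpy as np

rng = np.random.default_rng(42)
f_exp = np.array([0.45, 0.41, 0.33, 0.21])    # toy measured values
sigma = np.array([0.02, 0.02, 0.03, 0.03])    # statistical errors

n_rep = 1000
# F_art^(k)_i = F_exp_i + r^(k)_i * sigma_i,  k = 1, ..., N_rep
# (each r is a unit Gaussian random number; systematics omitted)
replicas = f_exp + rng.normal(size=(n_rep, f_exp.size)) * sigma

# In the full method one NN is fitted to each replica; here we only
# check that the sample mean and variance reproduce the input:
central = replicas.mean(axis=0)
spread = replicas.std(axis=0)
print(central, spread)
```

In the actual procedure each replica gets its own NN fit, and the variance is taken over the N_rep fitted networks rather than over the raw pseudo-data.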

  • Slide 9/24

    NN results for nonsinglet PDF and gluon

    Architecture of the NN (2-5-3-1)

    [Figure: xg(x,Q₀²) vs. x (10⁻⁵ ≤ x ≤ 1) for N_rep = 100 replicas,
    compared with CTEQ6.5, MRST2001E, Alekhin02 and NNPDF1.0]

    0809.3716 [hep-ph]

  • Slide 10/24

    Things to consider

    MC sampling eliminates the problem of choosing a suitable
    value of Δχ²

    Not tied to the use of NNs → how would a functional-form fit
    behave in MC sampling?

    NN training fully automated

    What happens when the data is sparse (nPDFs, GPDs)? →
    no control over the parameters

    How to implement information not given directly by the data? →
    nonperturbative models, lattice calculations

    Are bigger error bars really what is needed?

  • Slide 11/24

    Give up this...

  • Slide 12/24

    ...for this!

    Introduce Researcher Insight instead of Theoretical bias

  • Slide 13/24

    Extracting PDFs III: Self-Organizing Maps

    The SOM is an algorithm used to visualize and interpret large
    high-dimensional data sets (a subtype of neural networks)

    The map attempts to represent all the available observations
    with optimal accuracy using a restricted set of models

    Widely used in several fields of research

    A SOM is a set of vectors that are isomorphic to the data samples
    used for training (PDFs, observables, RGB color triplets, ...),
    arranged e.g. as a 2-D rectangular grid

    Each vector V_i, a cell, is assigned spatial coordinates

    A distance metric M_map (us: L1) determines the topology of the
    map

    Implementation proceeds in 3 steps: initialization, training and

    clustering

  • Slide 14/24

    Initializing SOM

  • Slide 15/24

    Training the SOM

    V_i(t+1) = V_i(t) (1 - w(t) N_{j,i}(t)) + S_j(t) w(t) N_{j,i}(t)
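The update rule can be sketched for a small one-dimensional map. This is a minimal online (sample-by-sample) version with made-up Gaussian neighborhood and linearly decaying learning-rate schedules; those schedules, and the RGB-like toy data, are assumptions of the sketch, and the actual fits use batch training.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, dim, n_steps = 10, 3, 500
V = rng.random((n_cells, dim))        # map vectors, random initialization
samples = rng.random((60, dim))       # training data, e.g. RGB triplets
coords = np.arange(n_cells)           # cell coordinates on a 1-D map

for t in range(n_steps):
    w = 0.5 * (1.0 - t / n_steps)             # decaying learning rate w(t)
    radius = max(n_cells / 2.0 * (1.0 - t / n_steps), 0.5)
    S = samples[rng.integers(len(samples))]   # pick a training sample S_j
    j = np.abs(V - S).sum(axis=1).argmin()    # best-matching cell (L1)
    N = np.exp(-((coords - coords[j]) / radius) ** 2)  # neighborhood N_{j,i}
    # V_i(t+1) = V_i(t) (1 - w N_{j,i}) + S_j w N_{j,i}
    V = V * (1.0 - w * N[:, None]) + S * w * N[:, None]
```

Because the update is a convex combination of V_i and S_j, each map vector stays inside the range of the training data while the shrinking neighborhood gradually orders the map.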

  • Slide 16/24

    Training the SOM II

    In the end, on a properly trained SOM, cells that are
    topologically close to each other will contain map vectors
    which are similar to each other.

    Data that is introduced (clustered) on a trained SOM gets
    distributed according to similarity → each map vector represents
    a class of similar data
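Clustering data on a trained map then amounts to assigning each sample to the cell with the most similar map vector, e.g. (toy 2-D vectors, L1 distance as on the earlier slide):

```python
import numpy as np

def cluster(samples, map_vectors):
    """Assign each sample to its most similar cell (L1 distance)."""
    # dists[s, c] = L1 distance between sample s and map vector c
    dists = np.abs(samples[:, None, :] - map_vectors[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)

# Three cells and three samples, each clearly closest to a different cell:
map_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
samples = np.array([[0.1, 0.1], [0.9, 0.2], [0.2, 0.8]])
print(cluster(samples, map_vectors))   # -> [0 1 2]
```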

  • Slide 17/24

    Colors Example

  • Slide 18/24

    1. step - Automated minimization: ENVPDF

    1. iteration:

    Use existing PDF sets as a guideline:

    For each flavour separately, select randomly either the range [0.5, 1],

    [1.0, 1.5] or [0.75, 1.25] times any of the

    {PDF} = {CTEQ6(or 4), CTEQ5, MRST02, Alekhin, GRV98} sets at

    Q0 = 1.3 GeV

    Set a value for each x_data randomly within the selected range (uniform
    distribution), apply smoothing

    Scale the combined set PDF^comb_i to obey the sum rules, linear interpolation
    between {x_data}

    Initialize an N×N SOM such that V_i = {PDF^comb_i, F₂^i}

    Batch train (in N_step steps), training data 4N² PDF^comb sets (= database)

    Similarity criterion: similarity of the observables F₂(x_data, Q²_data)

    Always rescale {PDF^comb_i} to obey the sum rules after updating the V_i

    Evolution as in CTEQ6

    After training, compute χ² against the experimental data for every PDF set on the
    map; pick the N_init best to start a new iteration with a whole new SOM

    DIS data (H1, ZEUS, BCDMS) only for now
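The generate → evaluate → select iteration of these slides can be caricatured as follows. This is only a sketch of the loop's control flow: the SOM training, DGLAP evolution and sum-rule rescaling are all omitted, and the "data", seed shape and jitter widths are invented for illustration. The mother sets are kept in the candidate pool, as on the next slide's later iterations, so the best χ² cannot get worse.

```python
import numpy as np

rng = np.random.default_rng(7)
x_grid = np.linspace(0.01, 0.9, 20)
target = x_grid ** -0.2 * (1.0 - x_grid) ** 3   # stand-in for the data

def chi2(pdf):
    return ((pdf - target) ** 2).sum()

def candidates(n, mothers=None):
    """1st iteration: random rescalings of a seed shape; later iterations:
    Gaussian jitter around the selected mother sets (kept in the pool)."""
    if mothers is None:
        seed = 0.8 * target                      # deliberately off
        scale = rng.uniform(0.5, 1.5, size=(n, 1))
        return seed * scale * rng.uniform(0.9, 1.1, size=(n, x_grid.size))
    picks = mothers[rng.integers(len(mothers), size=n)]
    jittered = picks * rng.normal(1.0, 0.05, size=(n, x_grid.size))
    return np.concatenate([jittered, mothers])   # keep the mother sets

n_init, best, history = 5, None, []
for _ in range(20):
    pool = candidates(100, best)
    order = np.argsort([chi2(c) for c in pool])
    best = pool[order[:n_init]]                  # N_init best survive
    history.append(chi2(best[0]))
print(history[0], history[-1])
```

In the real algorithm each pool would be used to train a fresh SOM, with the similarity of the computed observables as the clustering criterion, before the χ² selection.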

  • Slide 19/24

    ENVPDF algorithm II

    Later iterations:

    For each selected init PDF, use the best nearest-neighbour PDFs to establish a
    1σ envelope

    For each flavour at each x_data, jitter around the init PDF within the selected
    range (Gaussian distribution), smooth

    Scale the combined set to obey the sum rules, linear interpolation between
    {x_data}

    Preserve PDF variety by using N_orig 1st-iteration generators in turn with
    N_init Gaussian generators

    Initialize an N×N SOM, and batch train in N_step steps with 4N² database sets
    + N_init mother sets

  • Slide 20/24

    Input quality

    PDF      LO χ²/N   NLO χ²/N
    Alekhin  3.34      29.1
    CTEQ6    1.67      2.02
    CTEQ5    3.25      6.48
    CTEQ4    2.23      2.41
    MRST02   2.24      1.89
    GRV98    8.47      9.58

    *These are the χ²/N for the quoted initial-scale PDF sets evolved with
    CTEQ6 DGLAP settings; no kinematical cuts or normalization factors for the
    experimental data were imposed. We don't claim these values describe the quality
    of the quoted PDF sets.

  • Slide 21/24

    ENVPDF results I

    SOM     Nstep  Norig  Case  LO χ²/N  NLO χ²/N
    5×5     5      2      1     1.04     1.08
    5×5     5      0      1     1.41     -
    5×5     5      2      2     1.14     1.25
    15×15   5      6      1     1.00     1.07
    15×15   5      6      2     1.13     1.18

    [Figure: χ²/N vs. iteration (0-50) at LO and NLO for the 5×5 map with
    N_step = 5; best and worst of 10 runs shown for Case 1 and Case 2]

  • Slide 22/24

    ENVPDF results II: LO

    [Figure: LO results at Q = 1.3 GeV (0.25*xg, xū, xu_V) and Q = 3.0 GeV
    (0.1*xg, xū, xu_V), compared with CTEQ6 and MRST02; 5×5 map,
    N_step = 5, Case 1, χ²/N ≤ 1.2]

    ⟨χ²/N⟩ = 1.065, σ = 0.014, Δχ² = 10

  • Slide 23/24

    ENVPDF results III: NLO

    [Figure: NLO results at Q = 1.3 GeV (0.85*xg, xū, xu_V) and Q = 3.0 GeV
    (0.25*xg, xū, xu_V), compared with CTEQ6 and MRST02; 5×5 map,
    N_step = 5, Case 1, χ²/N ≤ 1.2]

    ⟨χ²/N⟩ = 1.122, σ = 0.029, Δχ² = 20

  • Slide 24/24

    2. Step - Interactive GUI

    Method extremely open to user interaction

    Build an interactive GUI, let the user set the shape of the

    envelope

    Replace jittering with a NN (or functional form); generators then
    sample the NN weight vector (or parameters)

    Clustering criteria could be anything that can be mathematically
    formulated → project the desired quality out of the map

    Study of flexible points (opportunities for adapting and fine-tuning),
    e.g. DGLAP variables, data selection, SOM parameters,
    theoretical assumptions, ...

    Extend to nPDFs and GPDs...
