HEP data analysis using ROOT week 2 File I/O, memory and object ownership Fitting, peak finding and Fourier analysis Mark Hodgkinson


Post on 11-Oct-2020


Page 1: HEP data analysis using ROOT · – let’s try it, open hist2.root and invoke the browser • right click the histogram and select FitPanel – default fit is a Gaussian (normal)

HEP data analysis using ROOT: week 2

▪ File I/O, memory and object ownership
▪ Fitting, peak finding and Fourier analysis

Mark Hodgkinson

Page 2

Week 2

• File I/O
  – recap of I/O from last week
  – more I/O operations
  – balancing load on queues, local/remote files
• Memory and object ownership
  – residency
  – persistency
• Fitting, peak finding and simple Fourier analysis
  – analysis with ROOT built-ins
• We will go through the slides
• Then use the remaining time to work through the tasks (this is the more important part).
• Any questions to [email protected] or find me in D36.

Page 3

File I/O

• Have already performed some basic I/O operations
  – via the TGraph constructor
      TGraph TGraph(const char* filename, const char* format = "%lg %lg", Option_t* option = "")
  – loading and browsing a TFile
      $ root tree2.root
      root [1] new TBrowser
  – writing/saving macros
    • TCanvas::SaveAs("*.C")

[Figure: plot titled "Graph"]

Page 4

File I/O

• More I/O related operations:
  – the global directory and the system directory
  – hadding and chaining
  – tree friends

Page 5

gDirectory

• ROOT provides a global pointer to the current (ROOT) directory
  – just as gFile is the global pointer to the current (ROOT) file
  – and gPad is the global pointer to the current pad/canvas
• also
  – gEnv
  – gRandom
• Task: call the associated Print() methods to see more on these

Page 6

gDirectory

• if you load a file into your session, it exists in the global directory

$ root tree2.root
root [0]
Processing /Users/perkin/rootlogon.C...
Setting MyStyle
Loaded ROOT logon script.
Attaching file tree2.root as _file0...
root [2] gDirectory->ls()
TFile**   tree2.root
 TFile*   tree2.root
  KEY: TTree   t2;1   a Tree with data from a fake Geant3

Page 7

gDirectory

• and (as shown last week) newly created objects reside in it
• Task: Repeat last week's exercise below, and check the contents of gDirectory.

root [2] t2->Draw("destep>>hDEStep","destep")
Info in <TCanvas::MakeDefCanvas>: created default TCanvas with name c1
(Long64_t)9999
root [3] h = (TH1F*)gDirectory->Get("hDEStep")
(class TH1F*)0x7fe633db95c0
root [4] h->GetMean()
(const Double_t)2.00359400802118163e-05

Page 8

Ownership

• The last line removes the object that mark2 points at from memory.
• If you don’t delete it, the memory stays allocated (it “leaks”) until you shut down the process (ROOT), at which point it is freed. In real-world examples with thousands of lines of analysis code using complex C++ objects, such leaks can cause serious performance problems.
• In C++, if something is declared inside a pair of {}, then once control exits those braces only objects created with “new” persist. Remember to delete them!
• Most other languages you will use have automatic “garbage collection”, so you don’t need to worry about this kind of thing (e.g. Python and hence PyROOT).
• C++ does have the concept of smart pointers, which clean up after themselves. They are not supported in CINT.
• Compiled C++ ROOT code can use them, but you may need to ensure you use a newer C++ standard (C++11); see later lectures on compilation.
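
The rules above can be tried out in plain C++ without ROOT. The sketch below uses a hypothetical `Tracked` struct (not a ROOT class) that counts how many instances are alive, so that a stack object, a new/delete pair, a `std::unique_ptr` and a forgotten pointer can be compared. It assumes a C++11 compiler and is only meant as an illustration.

```cpp
#include <memory>

// Hypothetical helper type (not part of ROOT): counts live instances.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Stack object: destroyed automatically when control leaves the braces.
int aliveAfterStackScope() {
    { Tracked t; }
    return Tracked::alive;
}

// Heap object with a matching delete: one delete for every new.
int aliveAfterNewAndDelete() {
    {
        Tracked* p = new Tracked;
        delete p;
    }
    return Tracked::alive;
}

// Smart pointer (C++11): the unique_ptr destructor calls delete for us.
int aliveAfterSmartPointer() {
    { std::unique_ptr<Tracked> p(new Tracked); }
    return Tracked::alive;
}

// The pointer p goes out of scope, but the object created with new survives.
// Without the saved copy of the address it could never be deleted (a leak).
int aliveAfterPointerLeavesScope() {
    Tracked* saved = nullptr;
    { Tracked* p = new Tracked; saved = p; }
    const int stillAlive = Tracked::alive;
    delete saved;  // clean up so that this sketch itself does not leak
    return stillAlive;
}
```

The first three functions return 0, the last returns 1: the object outlives the braces even though the pointer that was declared inside them does not.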

Page 9

Some remarks about ownership

• Ownership is dictated by access to the delete method
  – ROOT owns some objects, you own others
• delete calls the class destructor
• Failure to delete objects owned by you results in memory leaks
  – rule of thumb: one delete for every new
• Attempting to delete something owned by ROOT (double delete)
  – segmentation fault!

Page 10

Some remarks about ownership

• From the ROOT manual:
  – Histograms, trees, and event lists created by the user are owned by the current directory (gDirectory).
    • When the current directory is closed or deleted, the objects it owns are deleted
  – The TROOT master object (gROOT) has several collections of objects.
    • Objects that are members of these collections are owned by gROOT
  – Objects created by another object
    • for example, the function object (e.g. TF1) created by the TH1::Fit method is owned by the histogram.
  – An object created by a DrawCopy method is owned by the pad it is drawn in.

Page 11

Example

• Cannot delete this TH1F, because gDirectory took ownership of it when you created it.
• This is probably not the only example of ROOT objects taking ownership, so beware!
• It works the other way around though: presumably gDirectory is careful to check the pointer is valid before calling delete, with something like “if (f) { delete f; }”.
• If not sure of ownership, being careful to do it this way will offer protection against crashes. Note that such a check only helps if the pointer is set to null once the object it pointed at has been deleted.
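
A toy model of directory ownership, in plain C++. `ToyDirectory` and `ToyHist` are hypothetical stand-ins, not ROOT classes. The sketch assumes, as ROOT's TH1 destructor does for its directory, that an object removes itself from its owner when it is destroyed; under that assumption deleting your histogram first is safe, whereas deleting it after the directory has already deleted it would be a double delete.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

class ToyHist;

// Hypothetical stand-in for a ROOT directory: it owns what is registered with it.
class ToyDirectory {
public:
    ~ToyDirectory();  // deletes everything still registered
    void add(ToyHist* h) { objects_.push_back(h); }
    void remove(ToyHist* h) {
        objects_.erase(std::remove(objects_.begin(), objects_.end(), h), objects_.end());
    }
    std::size_t size() const { return objects_.size(); }
private:
    std::vector<ToyHist*> objects_;
};

// Hypothetical stand-in for a histogram: registers itself on creation.
class ToyHist {
public:
    ToyHist(const std::string& name, ToyDirectory* dir) : name_(name), dir_(dir) {
        if (dir_) dir_->add(this);
    }
    ~ToyHist() {
        if (dir_) dir_->remove(this);  // tell the owner that we are gone
    }
    const std::string& name() const { return name_; }
private:
    std::string name_;
    ToyDirectory* dir_;
};

ToyDirectory::~ToyDirectory() {
    // Each delete calls remove(), which shrinks objects_, so the loop terminates.
    while (!objects_.empty()) {
        delete objects_.back();
    }
}

// Deleting a histogram yourself first is safe: the directory forgets it.
// The second histogram is never deleted by us; the directory cleans it up.
std::size_t ownedAfterUserDelete() {
    ToyDirectory dir;
    ToyHist* mine = new ToyHist("h", &dir);
    ToyHist* left = new ToyHist("h2", &dir);
    (void)left;
    delete mine;
    return dir.size();  // only "h2" is still registered
}
```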

Page 12

gSystem

• also have a global pointer to the local system
  – how to get the list of file names in the current (system) directory?

root [2] TSystemDirectory dir("dir",gSystem->pwd())
root [3] TList* filelist = (TList*)dir.GetListOfFiles()
root [4] TSystemFile *f
root [5] TString fName
root [6] TIter next(filelist)
root [7] while(f=(TSystemFile*)next()){ fName = f->GetName(); cout<<fName<<endl; }

Page 13

Aside: TList and TIter

• Note the use of TList and TIter: these are ROOT analogues of the Standard Template Library (STL) containers (such as std::vector) and their iterators.
• We could also have used a standard std::vector, though some ROOT interfaces may only support ROOT's TList and TIter.
• To some extent it is a matter of personal preference (mine is to use standard C++ where possible, since it is more portable to other people).
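
For comparison, here is the same kind of loop over names held in a `std::vector<std::string>`. The function `selectRootFiles` is a hypothetical helper, not a ROOT call: it keeps only the names ending in `.root`, the sort of filtering one might do on a directory listing before merging files.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: return the names that end in ".root".
std::vector<std::string> selectRootFiles(const std::vector<std::string>& names) {
    const std::string suffix = ".root";
    std::vector<std::string> selected;
    for (const std::string& name : names) {  // range-based for loop (C++11)
        if (name.size() >= suffix.size() &&
            name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
            selected.push_back(name);
        }
    }
    return selected;
}
```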

Page 14

hadding

• Sometimes it is necessary to concatenate/merge several ROOT TFiles
• This is facilitated by the hadd command-line utility
• It does a generic merge of all contents (TObjects) in all directories in the files
• It can be slow (many hours) if, for example, you have large numbers of histograms in a file
• Advanced Task: Take a look at what hadd does by reading its macro. For those interested in Python, a version we wrote in ATLAS using PyROOT (faster because it can ignore entire directories) can be found here:
    /home/hodgkinson/scripts/mergePhysValFiles.py

$ hadd merged.root file0.root file1.root … fileN.root

Page 15

hadding example

• histogram 1
  – random (white) noise

Page 16

hadding example

• histogram 2
  – bipolar pulse (setting width and sigma twice below is NOT a typo: if you don’t, then your histogram won’t look like the one on this slide)

Page 17

hadding example

• now, hadd the two histogram files
  – open merged.root and view in TBrowser
  – Task: what is the result?

$ hadd merged.root hist1.root hist2.root
$ root merged.root
root [1] new TBrowser

Page 18

hadding example

• The two histograms are merged into one file
  – they are not, however, added together! (hadd only sums histograms that have the same name in each input file; h1 and h2 have different names, so both are kept.)
• hadding simply consolidates the contents of multiple files
  – useful if your analysis output is generated over many small jobs
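
The behaviour seen in this example can be mimicked with a toy model in plain C++: objects are combined by name, so histograms that share a name across the inputs are added bin by bin, while histograms with different names are simply copied side by side. The `ToyFile` type and `toyHadd` function are hypothetical; real hadd also handles directories, trees and much else that this sketch ignores.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a ROOT file: histogram name -> bin contents.
typedef std::map<std::string, std::vector<double> > ToyFile;

// Merge two toy files. Same-name histograms are summed bin by bin;
// histograms that appear in only one input are copied unchanged.
// Assumes same-name histograms have the same number of bins.
ToyFile toyHadd(const ToyFile& a, const ToyFile& b) {
    ToyFile merged = a;
    for (ToyFile::const_iterator it = b.begin(); it != b.end(); ++it) {
        ToyFile::iterator found = merged.find(it->first);
        if (found == merged.end()) {
            merged[it->first] = it->second;  // new name: just copy
        } else {
            const std::size_t n = std::min(found->second.size(), it->second.size());
            for (std::size_t i = 0; i < n; ++i) {
                found->second[i] += it->second[i];  // same name: add contents
            }
        }
    }
    return merged;
}
```

Merging a file holding "h1" with a file holding "h2" gives a result with two separate histograms, which is what the TBrowser shows for merged.root; merging two files that both hold "h1" gives one summed histogram.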

Page 19

residency

• h1 was created, filled, then written to file
  – creating the histogram before the file dictates that h1 is memory resident
    • upon instantiation
• h2 was created after f2 was opened
  – opening a TFile dictates that all subsequent objects are disk resident
    • not permanently until Write() is called

Page 20

residency

• Implicit method of determining whether objects are disk or memory resident
  – distinctly and uniquely ROOTish
    • cause of much consternation to some conventionalists
• Important to understand how resources are being used
  – though ROOT has some ways of coping with large objects in memory and large objects on file
• Persistency of pointers and references is a deep and nuanced topic
  – covered in detail in the manual
    • for now, sufficient to flag importance of when/where objects are saved
• You can observe where objects reside via the TBrowser

Page 21

chaining

• similar to hadding, but specifically for TTrees
  – containing (exactly) the same branches
  – does not actually merge the files!
• simple, illustrative macro here
  – http://www.hep.shef.ac.uk/people/perkin/tchain.C
• N.B. trees must have the same name
  – will be default when we write user objects to a branch [week 4]
    • i.e. branch name = object name

Page 22

fitting

• ROOT comes with lots of built-in fit functionality
  – let’s try it, open hist2.root and invoke the browser
    • right click the histogram and select FitPanel
  – default fit is a Gaussian (normal) distribution
    • can change the range of the fit
    • also, invoke the DrawPanel and draw simple histogram errors
  – the default bin error is sqrt(bin entries)
  – use TH1F::SetBinError(bin,error) to apply errors to individual bins

Task: Use the Fit and Draw panels to explore what else you can do.
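
The default error quoted above, sqrt(bin entries), is the Poisson counting error for an unweighted histogram. A minimal sketch in plain C++ (the function names are hypothetical, and the bin contents are passed in as a plain vector rather than read from a TH1F):

```cpp
#include <cmath>
#include <vector>

// Default (Poisson) bin errors: sqrt of the number of entries in each bin.
std::vector<double> defaultBinErrors(const std::vector<double>& entries) {
    std::vector<double> errors;
    errors.reserve(entries.size());
    for (double n : entries) {
        errors.push_back(std::sqrt(n));
    }
    return errors;
}

// Relative error on a bin: sqrt(N)/N = 1/sqrt(N); returns 0 for an empty bin.
double relativeBinError(double n) {
    return n > 0.0 ? 1.0 / std::sqrt(n) : 0.0;
}
```

A bin with 100 entries therefore has an error of 10, i.e. 10%, while a bin with 10000 entries has an error of 100, i.e. 1%: the relative error shrinks as the statistics grow.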

Page 23

Page 24

Page 25

user fit function

• Don’t fit Gaussians to data that aren’t normally distributed
  – instead define a user function
• Delete previous fit
  – right click and select Delete
• Now, try this…

$ root hist2.root
root [1] TF1* fSinFit = new TF1("fSinFit","[0]*sin([1]*x+[2])", -20, 20);
root [2] fSinFit->SetParameter(0,h2->GetMaximum())
root [3] h2->Fit(fSinFit)
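
By default TH1::Fit performs a chi-square fit: it adjusts the parameters until the sum over bins of ((content - f(x)) / error)^2 is as small as possible. The sketch below evaluates that sum for the sinusoid used above, in plain C++. The names `sineModel` and `chiSquare` are hypothetical, the minimiser itself is not shown, the three input vectors are assumed to have the same length, and skipping bins with zero error is a simplifying choice made here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// The model from the slide: [0]*sin([1]*x + [2]).
double sineModel(double x, const double par[3]) {
    return par[0] * std::sin(par[1] * x + par[2]);
}

// Chi-square between bin contents y (at bin centres x, with errors err)
// and the model evaluated with parameters par.
double chiSquare(const std::vector<double>& x, const std::vector<double>& y,
                 const std::vector<double>& err, const double par[3]) {
    double chi2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (err[i] <= 0.0) continue;  // skip bins without a usable error
        const double pull = (y[i] - sineModel(x[i], par)) / err[i];
        chi2 += pull * pull;
    }
    return chi2;
}
```

A fit is then a search over par for the smallest chiSquare, which is why a sensible starting value (such as the SetParameter call above) matters for a periodic function with many local minima.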

Page 26

user fit function

Page 27

user functions

• User functions can reference other user fit functions
  – by function name

root [1] f = new TF1("sinc","sin(x)/x",-20,20)
root [2] f->Draw()
root [3] g = new TF1("g","sinc * x * x",-20,20)
root [4] g->Draw()

Page 28

peak finding

• Embed the pulse in the noise
  – begin with a clone
  – ROOT complains if one attempts to add histograms with different binning (here, a different number of bins)

$ root merged.root
root [2] summed = (TH1F*)h1->Clone("summed")
root [3] summed->Add(h2)
Error in <TH1F::Add>: Attempt to add histograms with different number of bins
(Bool_t)0
root [4] summed->Draw()

Page 29

peak finding

• Sum by hand (ROOT bins are numbered from 1 to GetNbinsX(); bin 0 is the underflow bin)

root [1] int iSum;
root [2] for (int i=1; i<=h2->GetNbinsX(); i++) { iSum = int(summed->GetNbinsX()/2.)+i; summed->SetBinContent(iSum,summed->GetBinContent(iSum)+h2->GetBinContent(i)); }
root [3] summed->Draw();

• Find the peak

root [4] summed->ShowPeaks()

Page 30

peak finding

• threshold too low!

Page 31

peak finding

• Apply threshold
  – in range 0<threshold<1
• coordinates extracted from the arrays held by the TPolyMarker
  – observe marker ownership is via histogram’s list of functions

root [4] summed->ShowPeaks(2,"",0.6)
root [9] pm = (TPolyMarker*)summed->GetListOfFunctions()->FindObject("TPolyMarker")
root [13] pm->GetX()[0]
root [13] pm->GetY()[0]
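
The role of the threshold can be sketched without ROOT. In ShowPeaks, peaks with an amplitude below threshold times the highest peak are discarded. The function below is a deliberately naive stand-in (hypothetical name `findPeaks`; it is not the algorithm ShowPeaks uses, which is considerably more sophisticated): it reports every local maximum whose height is at least threshold times the highest bin.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Indices of local maxima with height >= threshold * (highest bin),
// where 0 < threshold < 1. The first and last bins are never reported.
std::vector<std::size_t> findPeaks(const std::vector<double>& bins, double threshold) {
    std::vector<std::size_t> peaks;
    if (bins.size() < 3) return peaks;
    const double cut = threshold * *std::max_element(bins.begin(), bins.end());
    for (std::size_t i = 1; i + 1 < bins.size(); ++i) {
        const bool localMax = bins[i] > bins[i - 1] && bins[i] >= bins[i + 1];
        if (localMax && bins[i] >= cut) peaks.push_back(i);
    }
    return peaks;
}
```

With a low threshold every noise fluctuation that happens to be a local maximum is reported, which is the "threshold too low!" situation; raising the threshold to 0.6 keeps only maxima within 40% of the tallest one.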

Page 32

TSpectrum

• Built-in class for more advanced spectral analysis
  – peak finding
  – background subtraction
  – Fourier analysis
• Worthy of further reading
  – will show simple use of FFT

Page 33

FFT

• Create a histogram from a periodic function (bins are filled from 1 to GetNbinsX(); bin 0 is the underflow bin)

root [1] f = new TF1("f","[0]*sin([1]*x + [2])",-20,20);
root [2] const Double_t pars[] = {2,10,0};
root [3] f->SetParameters(pars);
root [4] h = new TH1F("h","h",100,-20,20);
root [5] for (int i=1; i<=h->GetNbinsX(); i++) h->SetBinContent(i,f->Eval(h->GetBinCenter(i)));
root [6] h->Draw();

Page 34

FFT

• Plot FFT

root [7] fft = (TH1F*)h->Clone("fft")
root [8] h->FFT(fft,"MAG");
root [9] fft->Draw();

  – where “MAG” is the magnitude
• can also request
  – real component “RE”
  – imaginary component “IM”
  – phase “PH”
• TASK: Work through the examples in the previous slides, make sure you understand them, and reproduce the plots.
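
To see what the "MAG" output represents, the magnitude of a discrete Fourier transform can be computed directly in plain C++. This is a sketch only: the names `dftMagnitude` and `sineSamples` are hypothetical, the direct sum is O(N^2) whereas an FFT gives the same numbers faster, and normalisation conventions differ between libraries, so only the position of the peak should be compared with the ROOT plot.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Magnitude of the discrete Fourier transform, computed as a direct sum.
std::vector<double> dftMagnitude(const std::vector<double>& samples) {
    const std::size_t n = samples.size();
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> mag(n, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = twoPi * static_cast<double>(k * j) / static_cast<double>(n);
            re += samples[j] * std::cos(angle);
            im -= samples[j] * std::sin(angle);
        }
        mag[k] = std::sqrt(re * re + im * im);
    }
    return mag;
}

// n samples of amplitude*sin(2*pi*cycles*j/n): a whole number of cycles.
std::vector<double> sineSamples(std::size_t n, double amplitude, unsigned cycles) {
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> s(n);
    for (std::size_t j = 0; j < n; ++j) {
        s[j] = amplitude * std::sin(twoPi * cycles * static_cast<double>(j) / static_cast<double>(n));
    }
    return s;
}
```

For a pure sinusoid with a whole number of cycles in the window, the magnitude is zero everywhere except at bin `cycles` and its mirror image at bin n - cycles, where it equals amplitude * n / 2 in this unnormalised convention.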

Page 35

Tasks Recap

• Make sure you can print the properties of the global pointers and understand what they are for (gRandom, gDirectory, etc.)
• Use hadd and understand what it does. Bonus: read hadd.C, and look at the PyROOT version mentioned earlier if you are interested in Python.
• Explore the Fit and Draw panels.
• Work through the examples shown in the slides for spectral analysis, and ensure you understand them and are able to reproduce the plots.

Page 36

Closing remarks

• ROOT is somewhat idiosyncratic
  – especially when determining whether objects are disk or memory resident
• Attention must be paid to resource utilisation
  – and object ownership
• Have introduced basic fitting and analysis
  – much more available, worthy of further reading
• Next time:
  – ROOT physics libraries
  – ROOT maths libraries
• Any questions?