TRANSCRIPT
It’s the difference between if you had airplanes where you threw away an airplane after every flight, versus you could reuse them multiple times.
— Elon Musk
L. Heinrich, USLUA Meeting
October 2017
Reinterpretation and Analysis Preservation Infrastructure for LHC Experiments
/lukasheinrich
New Physics seems to be hiding….
What models are still viable?
How good is simplified model coverage?
What's the best strategy to test as many models as possible at the LHC?
Basic Tension in HEP
Theorists
• do not have data
• even with data, analysis too complex
• have many ideas for still viable / interesting models :)
Experiments
• have data
• know how to analyze it
• do not have resources to design a dedicated analysis for each model :(
[Figure: one Signal Region, interpreted for model A and reinterpreted for model B; example results CLs = 0.03 and CLs = 0.05]
Reinterpretation: an efficient way to resolve the tension
• re-use existing analyses that have good sensitivity for the new model: no need for a new, dedicated analysis
• only need to simulate the new signal: data/background estimates unchanged
• not optimal, but we learn something about the model: if no analysis can exclude it, good motivation for a dedicated search
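To make the "data/background estimates unchanged" point concrete, here is a minimal sketch of a CLs value for a single counting bin. It assumes a plain Poisson model without systematic uncertainties, which is far simpler than a real analysis fit; the yields and the function names `poisson_cdf` and `cls_single_bin` are illustrative only.

```python
import math


def poisson_cdf(n_obs: int, mean: float) -> float:
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls_single_bin(n_obs: int, background: float, signal: float) -> float:
    """CLs = CL(s+b) / CL(b) for one counting bin, ignoring systematics."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


# Data and background come from the existing analysis (toy numbers here).
N_OBS, BACKGROUND = 10, 10.0

# Only the expected signal yield changes from model to model (toy numbers).
for name, signal in [("model A", 9.0), ("model B", 8.0)]:
    print(name, round(cls_single_bin(N_OBS, BACKGROUND, signal), 3))
```

Only the `signal` argument differs between the two calls; the observed count and the background estimate are reused as they are.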
The recipe
• store backgrounds and data: you’ll need them for the final fit
• store the data pipeline that transports a BSM signal model to a limit result
[Figure: the same chain, signal generation → event selection → stat. eval. → limit in the Signal Region, run for model A and for model B; example results CLs = 0.03 and CLs = 0.05]
Storing backgrounds and data: Easy
• only need final ntuples / histos
Signal generation: Easy
• well known, analysis-independent process for MC production
• can scale complexity / accuracy as needed:
  • (very) fast / parametrized simulation, O(s) / event
  • traditional fast sim
  • full Geant sim
  • future: GANs?
Event selection and stat. eval.: Hard?
• many frameworks
• one-off code
• distributed knowledge & development
• people move on
Systematic analysis preservation enables reinterpretation
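A rough sketch of what "storing the pipeline" could look like in code: the preserved pieces (data, background, selection, statistical evaluation) are bundled once, and only new signal events are fed in. The class `PreservedAnalysis`, the `met` field, and `toy_evaluate` are hypothetical stand-ins, not part of any experiment framework.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, float]


@dataclass(frozen=True)
class PreservedAnalysis:
    """What gets stored once: data, backgrounds, and the processing steps."""
    n_observed: int                                  # stored data yield
    background: float                                # stored background estimate
    select: Callable[[Event], bool]                  # preserved event selection
    evaluate: Callable[[int, float, float], float]   # preserved stat. eval.

    def limit_for(self, signal_events: List[Event], weight: float) -> float:
        """Transport newly generated signal events to a limit-like number."""
        n_pass = sum(1 for event in signal_events if self.select(event))
        return self.evaluate(self.n_observed, self.background, weight * n_pass)


def toy_evaluate(n_obs: int, background: float, signal: float) -> float:
    """Stand-in for the preserved fit: NOT a real statistical evaluation."""
    return math.exp(-signal)


analysis = PreservedAnalysis(
    n_observed=0,
    background=0.5,
    select=lambda event: event["met"] > 200.0,  # hypothetical cut
    evaluate=toy_evaluate,
)

# New model: only the signal sample is new, everything else is reused.
model_b_events = [{"met": 250.0}, {"met": 120.0}, {"met": 310.0}]
print(analysis.limit_for(model_b_events, weight=1.5))
```

The "Hard?" part of the recipe corresponds to `select` and `evaluate`: in practice these are large, framework-specific codes rather than two small callables.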
[Figure: screenshot of an analysis record page (placeholder text) in a portal created and hosted by CERN and powered by Invenio; tabs: Overview, Publications, Files, Workflow, Measurements, Contributors, ReCASTs; example publication entry: Eur.Phys.J. C76 (2016) 451, DOI 10.1140/epjc/s10052-016-4286-3]
Analysis Preservation for re-use
Recent developments in industry make HEP analysis preservation feasible
• capture heterogeneity in industry-standard containers: sharable, archivable analysis software (and runtimes)
• for analysis re-execution, use industry-grade container orchestration tools, developed for industry cloud services, to execute on a cluster
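As an illustration of running one analysis step inside a container, the sketch below only builds a command line and does not execute it. It assumes Docker as the container runtime (the slides say only "industry-standard containers"); the image name, host directory, and entry-point script are hypothetical.

```python
import shlex
from typing import List


def container_command(image: str, workdir: str, command: str) -> List[str]:
    """Build (but do not run) a `docker run` invocation for one analysis step."""
    return [
        "docker", "run", "--rm",      # remove the container when it exits
        "-v", f"{workdir}:/work",     # mount the job directory into the container
        "-w", "/work",                # run the step from that directory
        image,
        "sh", "-c", command,
    ]


argv = container_command(
    image="example.org/analysis/eventsel:1.0",   # hypothetical image name
    workdir="/tmp/recast-job",                   # hypothetical host directory
    command="./run_selection.sh signal.root",    # hypothetical entry point
)
print(shlex.join(argv))
```

An orchestration tool would schedule many such container runs on a cluster instead of a single local invocation.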
Analysis Preservation for re-use
LHC-wide effort to enable analysis re-use
• CERN Analysis Preservation Portal (CAP)
  • central place for LHC experiments to store software, runtime environments, workflows, and data needed for re-execution
• Analysis Re-execution Service (REANA)
  • provides interfaces to pull info from CAP and re-execute the analysis in the cloud
[Figure: preserved components (workflow.yml, eventsel.yml, fit.yml, event selection code image, fitting code image, data, backgrounds) and the step "import analysis workflow"]
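The relation between a workflow file and its steps can be sketched as a small dependency graph. The dictionary layout below is an assumption made for illustration and is not the actual CAP or REANA schema; only the step names `eventsel` and `fit` come from the slide.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical workflow description: each step names its container image
# and the steps whose outputs it needs.
workflow = {
    "eventsel": {"image": "eventsel-code-img", "needs": []},
    "fit": {"image": "fitting-code-img", "needs": ["eventsel"]},
}


def execution_order(steps):
    """Return step names so that every step runs after the steps it needs."""
    graph = {name: set(spec["needs"]) for name, spec in steps.items()}
    return list(TopologicalSorter(graph).static_order())


print(execution_order(workflow))
```

Keeping the dependencies explicit is what lets a re-execution service run the steps without knowing anything about the analysis itself.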
Once preserved: re-use analyses for reinterpretation
• prepare pipeline inputs for new model (scan)
• run pipeline to get new limits
[Figure: model → signal generation → event selection → stat. eval. → limit, executed on REANA; example result CLs = 0.05]
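The two steps above (prepare scan inputs, run the pipeline) can be sketched as a loop over a parameter grid. The parameter names, the `toy_pipeline` function, and the grid values are invented for illustration; the threshold CLs < 0.05 is the usual 95% CL exclusion convention, assumed here rather than taken from the slides.

```python
import itertools
from typing import Callable, Dict, Iterable, List

Point = Dict[str, float]


def scan_grid(axes: Dict[str, Iterable[float]]) -> List[Point]:
    """Build every combination of the given model parameter values."""
    names = list(axes)
    combos = itertools.product(*(axes[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def excluded_points(points: List[Point],
                    run_pipeline: Callable[[Point], float],
                    threshold: float = 0.05) -> List[Point]:
    """Keep the points whose CLs from the pipeline falls below the threshold."""
    return [point for point in points if run_pipeline(point) < threshold]


def toy_pipeline(point: Point) -> float:
    """Stand-in for the preserved pipeline: NOT a real limit calculation."""
    return point["m_heavy"] / 10000.0


points = scan_grid({"m_heavy": [200.0, 400.0, 600.0], "m_light": [0.0, 100.0]})
excluded = excluded_points(points, toy_pipeline)
print(len(points), "points scanned,", len(excluded), "excluded")
```

In a real setup `run_pipeline` would submit the preserved workflow for each point and read back the resulting limit.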
Current Status
• system was developed during the Run-1 ATLAS pMSSM paper
• prototype analysis: Run-1 SUSY electroweak 2L production
• now preserving Run-2 analyses; reinterpretations underway
ATLAS-CONF-2017-021, ATLAS-CONF-2017-028
It’s the difference between if you had HEP analyses where you threw away a HEP analysis after testing it against a single model, versus if you could reuse them multiple times.
(adapted from the Elon Musk quote: "airplanes" replaced by "HEP analyses", "every flight" by "testing it against a single model")
• reinterpretation needed to fully exploit the LHC dataset
• preservation enables full-fidelity reinterpretation within the experiments
• recent IT advances make preservation of original HEP analyses feasible