GLAST ADASS London Sept 2007

GLAST Large Area Telescope: A Fusion of HEP and Astro Computing

Richard Dubois
Stanford Linear Accelerator Center

[email protected]

Outline

• Introduction to GLAST & LAT

• A HEP detector in space

• Code Reuse (Beg, Borrow, Steal)

• Bulk Processing: turning around a downlink in an hour

• Data Access: Catalogues and Portals

• Data and Service Challenges

• Astrophysics Analysis

• Summary

GLAST Key Features

• Huge field of view
– LAT: 20% of the sky at any instant; in sky survey mode, expose all parts of sky for ~30 minutes every 3 hours. GBM: whole unocculted sky at any time.
• Huge energy range, including band 10 GeV - 100 GeV
• Will transform the HE gamma-ray catalog:
– by > order of magnitude in # point sources
– spatially extended sources
– sub-arcmin localizations (source-dependent)

Two GLAST instruments:
Large Area Telescope (LAT): 20 MeV – >300 GeV
GLAST Burst Monitor (GBM): 10 keV – 25 MeV

Spacecraft partner: General Dynamics

Launch: Apr 2008, Cape Kennedy. 565 km circular orbit. 5-year mission (10-year goal).
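The sky-survey figures above can be cross-checked with simple arithmetic. The sketch below assumes, as a simplification that is not stated on the slide, that survey mode spreads exposure uniformly over the sky; under that assumption a 20% field of view swept evenly for 3 hours gives each direction roughly the quoted ~30 minutes.

```python
# Back-of-envelope check of the sky-survey exposure quoted on the slide.
# Assumption (not from the talk): exposure is spread uniformly over the sky.

FOV_SKY_FRACTION = 0.20   # LAT sees ~20% of the sky at any instant
SURVEY_PERIOD_MIN = 180   # "every 3 hours"


def mean_exposure_minutes(fov_fraction: float, period_min: float) -> float:
    """Average time a given sky direction spends inside the field of view."""
    return fov_fraction * period_min


exposure = mean_exposure_minutes(FOV_SKY_FRACTION, SURVEY_PERIOD_MIN)
print(f"mean exposure per direction: {exposure:.0f} min per {SURVEY_PERIOD_MIN} min")
```

The uniform-coverage estimate comes out at 36 minutes, the same order as the ~30 minutes quoted.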

GLAST Mission Elements

[Diagram. Labeled elements: GLAST Spacecraft (Large Area Telescope & GBM); Delta 7920H; GPS; TDRSS SNS & Ku; White Sands; GN; Mission Operations Center (MOC); LAT Instrument Science Operations Center; GBM Instrument Operations Center; GLAST Science Support Center; HEASARC, GSFC; GRB Coordinates Network. Labeled flows: Telemetry 1 kbps; Alerts; Data, Command Loads; Schedules.]

Overview of LAT

• Precision Si-strip Tracker (TKR): 18 XY tracking planes. Single-sided silicon strip detectors (228 μm pitch). Measure the photon direction; gamma ID.
• Hodoscopic CsI Calorimeter (CAL): Array of 1536 CsI(Tl) crystals in 8 layers. Measure the photon energy; image the shower.
• Segmented Anticoincidence Detector (ACD): 89 plastic scintillator tiles. Reject background of charged cosmic rays; segmentation removes self-veto effects at high energy.
• Electronics System: Includes flexible, robust hardware trigger and software filters.

Systems work together to identify and measure the flux of cosmic gamma rays with energy 20 MeV - >300 GeV.

[Figure labels: e+ e–; Calorimeter; Tracker; ACD (surrounds 4x4 array of TKR towers). Photo: Integrated Observatory in Phoenix, AZ]

Fusion of HEP & Astro Computing

Event Interpretation

[Figure: full simulation/reconstruction of 1 GeV gamma. Labels: incident gamma; e-; e+; radiated gammas; ~8.5 radiation lengths. Note energy flow in direction of incident gamma.]

“Science Tools”

Collection of tools for detection and characterization of gamma-ray sources (point sources and extended sources)

• source finding
• max likelihood fitting (binned/unbinned)
• parameterized instrument response
• exposure maps
• comparisons to model (observation sim)
• GRBs, periodicity searches, light curves

• Science Tools are FITS/FTOOLS based
• for dissemination to astro community
• Data distributed to public by Goddard

+ full code development environment on Linux, Windows (Mac imminent), code and data distribution, automated code builds, documentation, etc.

Instrument Design Considerations

[Figure labels: e+ e–; Electronics]

Energy range and energy resolution requirements bound the thickness of the calorimeter.

Effective area and PSF requirements drive the converter thicknesses and layout. PSF requirements also drive the sensor performance, layer spacings, and the design of the mechanical supports.

Field of view sets the aspect ratio (height/width).

Time accuracy provided by electronics and intrinsic resolution of the sensors.

Background rejection requirements drive the ACD design (and influence the calorimeter and tracker layouts).

• Background rejection:
– Filter out 97% of downlink on the ground
– Use Classification Trees
• Effects of Trigger & Onboard Filtering
– Hardware trigger scheme
– CPU cycle requirements and throughput
– Data volume per event
• Segmentation of ACD
– Relative importance and size of side tiles
– Rejection efficiency due to gaps and screws

Important Design Considerations: Optimized via simulations - Spot Checked in particle beam tests

• Lateral dimension < 1.8 m
– Restricts geometric area => FOV
• Mass < 3000 kg
– Primarily restricts total depth of the Cal
• Power Budget 650 W
– Primarily restricts number of Tracker channels

Event Processing Flow

• event based processing
• C++ framework provides base class definition & services
• completely configurable - code loaded at run time when needed

Root: object I/O needed for structured data with cross linkages
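The pattern in the bullets above (a base class, per-event execution, and a chain chosen by configuration at run time) can be sketched in a few lines. This is a Python illustration only, not the LAT C++ framework: the names `Algorithm`, `Digitize`, `Reconstruct`, `load_algorithm` and `run_job` are all hypothetical, and a name registry plus `importlib` stands in for loading shared libraries on demand.

```python
import importlib
from abc import ABC, abstractmethod

REGISTRY = {}  # class name -> class, filled in as subclasses are defined


class Algorithm(ABC):
    """Base class that every processing step derives from (hypothetical)."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        REGISTRY[cls.__name__] = cls

    def initialize(self):
        """Called once before the event loop; steps may override it."""

    @abstractmethod
    def execute(self, event):
        """Called once per event; reads from and writes to the event dict."""


class Digitize(Algorithm):
    def execute(self, event):
        event["digis"] = [round(x) for x in event["hits"]]


class Reconstruct(Algorithm):
    def execute(self, event):
        event["energy"] = sum(event["digis"])


def load_algorithm(spec):
    """Turn a configuration string into an algorithm instance at run time.

    'module:ClassName' imports the module only when it is asked for;
    a bare 'ClassName' is looked up in the local registry.
    """
    if ":" in spec:
        module_name, class_name = spec.split(":", 1)
        cls = getattr(importlib.import_module(module_name), class_name)
    else:
        cls = REGISTRY[spec]
    return cls()


def run_job(config, events):
    """Build the chain named in config and push every event through it."""
    chain = [load_algorithm(spec) for spec in config]
    for alg in chain:
        alg.initialize()
    for event in events:
        for alg in chain:
            alg.execute(event)
    return events


print(run_job(["Digitize", "Reconstruct"], [{"hits": [1.4, 2.6]}]))
```

Changing the list passed to `run_job` changes what runs without touching the event loop, which is the point of the "completely configurable" bullet.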

Sim/Recon Toolkit

Package               Description              Provider
ACD, CAL, TKR Recon   Data reconstruction      LAT
ACD, CAL, TKR Sim     Instrument sim           LAT
GEANT4 v8             Particle transport sim   G4 worldwide collaboration
xml                   Parameters               World standard
Root 5                C++ object I/O           HEP standard
Gaudi                 C++ skeleton             CERN standard
doxygen               Code doc tool            World standard
Visual C++/gnu        Development envs         World standards
CMT SCons             Package mgmt tool        HEP standard
ViewCvs               cvs web viewer           World standard
cvs                   File version mgmt        World standard

Data Challenges

• A progression of data challenges.
– DC1 in 2004. 1 simulated week all-sky survey simulation.
  • find the sources, including GRBs
  • a few physics surprises
– DC2 in 2006, completed in June.
  • 55 simulated days (1 orbit precession period) of all-sky survey.
  • First generation of LAT source catalogue
  • Added source variability (AGN flares, pulsars). Lightcurves and spectral studies. Correlations with other wavelengths. Add GBM. Study detection algorithms. Benchmark data processing/volumes/reliability.
  • 200k batch jobs - worked out reliability issues (< 0.1% failure rate now)

Data challenges provided excellent testbeds for science analysis software.

Full observation, instrument, and data processing simulation. Team uses data and tools to find the science.

“Truth” revealed at the end.

Post DC: Service Challenge

• No longer need blind science exercises!

• Coordinate simulation studies for science and Operations

– a common set of simulations plus a near-constant stream of simulations to support special studies. Develop capabilities outside SLAC as needed using collaboration resources.

• Operations

– Simulating first 16 orbits of L&EO

– Run them through full LAT ground processing chain

– Develop shift procedures and train collaborators

• Science

– Full simulation of 1 orbit-year

– Definitive pre-launch dataset for working groups

– Expect to require 400 CPU-months to create

Service Challenge Eye Candy

[Figure: Pointing with two targets. Label: LSI +61 303]

Offline Sim of Onboard GRB Filter

Alert notice!

GRB Trigger Time 0.02089589834
FirstRA 151.2563276
FirstDEC -38.92002236
First Estimated Error 0.5743404438
nPhot w/ [0,100) MeV 20
nPhot w/ [100,1000) MeV 2
nPhot w/ [1,10) GeV 0
nPhot w/ > 10 GeV 0
Trigger window size 40
EnergyCut -1

Pipeline Processing

Started with STScI’s OPUS - then rolled our own

Features:
• execute independent tasks
• keep track of state in db
• web view/admin of jobs
• use dataset catalogue (db) to track files
• expect millions of files!
• Java/Tomcat, jsp - not GLAST specific
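The two database roles listed above, tracking task state and cataloguing output files, can be illustrated with a toy example. This is a minimal sketch assuming an in-memory SQLite database in place of the production database; the table layout, the state names (`WAITING`, `RUNNING`, `SUCCESS`, `FAILED`) and the function names are invented for illustration and are not taken from the GLAST pipeline.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a task table and a dataset catalogue."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE task (name TEXT PRIMARY KEY, state TEXT NOT NULL)")
    db.execute("CREATE TABLE dataset (path TEXT PRIMARY KEY, task TEXT NOT NULL)")
    return db


def submit(db, name):
    """Register a task; it waits until something runs it."""
    db.execute("INSERT INTO task VALUES (?, 'WAITING')", (name,))


def run_task(db, name, func):
    """Run func, record the outcome, and catalogue any files it reports."""
    db.execute("UPDATE task SET state = 'RUNNING' WHERE name = ?", (name,))
    try:
        outputs = func()
    except Exception:
        db.execute("UPDATE task SET state = 'FAILED' WHERE name = ?", (name,))
        return
    for path in outputs:
        db.execute("INSERT INTO dataset VALUES (?, ?)", (path, name))
    db.execute("UPDATE task SET state = 'SUCCESS' WHERE name = ?", (name,))


def states(db):
    """Return {task name: state}, the data a web view would display."""
    return dict(db.execute("SELECT name, state FROM task"))


def broken():
    raise RuntimeError("simulated job failure")


db = make_db()
for task_name in ("recon", "calib"):
    submit(db, task_name)
run_task(db, "recon", lambda: ["recon_0001.root"])
run_task(db, "calib", broken)
print(states(db))
```

Because each task only reads and writes its own row, tasks stay independent and a failed one can be rerun without disturbing the others.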

The Hardest Task: Downlink Processing

[Flow diagram. Stage labels: Reconstruction; Digitization; Merge; Merge; Register; Verify; Clean; Calibration; Monitoring]

Process each downlink before the next arrives:

~100 cores for 1.5 hrs

Split input data into ~100 parallel pieces

On success: put Humpty back together again

Do monitoring
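The split, process, merge sequence described above is a standard scatter/gather pattern. The sketch below is illustrative only: a thread pool stands in for the batch farm, `reconstruct` is a placeholder for the real digitization and reconstruction steps, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(events, n_pieces):
    """Cut the event list into at most n_pieces contiguous chunks."""
    size = max(1, -(-len(events) // n_pieces))  # ceiling division
    return [events[i:i + size] for i in range(0, len(events), size)]


def reconstruct(chunk):
    """Placeholder for digitization + reconstruction of one chunk."""
    return [{"id": e, "energy": 2 * e} for e in chunk]


def process_downlink(events, n_pieces=100):
    """Scatter the downlink, process the pieces in parallel, then merge."""
    chunks = split(events, n_pieces)
    with ThreadPoolExecutor(max_workers=8) as pool:
        # list() re-raises the first exception, so a failed piece stops the merge.
        parts = list(pool.map(reconstruct, chunks))
    merged = [record for part in parts for record in part]
    if len(merged) != len(events):  # a very simple "verify" step
        raise RuntimeError("merged output does not match input size")
    return merged


merged = process_downlink(list(range(1000)))
print(len(merged), merged[0], merged[-1])
```

At the scale quoted on the slide, ~100 cores for 1.5 hours corresponds to about 150 core-hours per downlink.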

Automated Source Monitoring

Usage Plots: Activity Summary

Many details stored per step in Oracle: web displays to track usage and performance

Data Portal/Catalog

Browsable tree of datasets

Events, file size, run range automatically set by “crawler”

Access/authentication handled by web

Meta-data added by creator

Supports mirroring at multiple sites
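The division of labour described above, where the creator supplies meta-data and a crawler fills in the measurable fields, can be sketched as follows. This is a toy under stated assumptions: the catalog is a plain dict, the file format is one event per text line, and the names `crawl` and `demo` are hypothetical. Run ranges are not modelled, and real ROOT or FITS files would need their own readers.

```python
import os
import tempfile


def crawl(root, catalog):
    """Fill in size and event count for every file found under root.

    Existing entries (for example creator-supplied meta-data) are kept;
    only the fields the crawler can measure itself are written.
    """
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            entry = catalog.setdefault(os.path.relpath(path, root), {})
            entry["size_bytes"] = os.path.getsize(path)
            with open(path) as fh:
                entry["n_events"] = sum(1 for _ in fh)  # toy format: one event per line
    return catalog


def demo():
    """Build a tiny dataset tree in a temporary directory and crawl it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "dc2"))
        with open(os.path.join(root, "dc2", "run0001.txt"), "w") as fh:
            fh.write("evt1\nevt2\nevt3\n")
        key = os.path.join("dc2", "run0001.txt")
        catalog = {key: {"description": "toy run"}}  # meta-data added by creator
        return crawl(root, catalog)


print(demo())
```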

Skimmer: Data to the user

• Can skim any data from catalog
– Data available as ROOT and/or FITS files
• Skimmer jobs parallelized using Pipeline
– Need xrootd to spread disk load, avoiding individual disk server overload
• Output available for download for 10 days

Access to data will require registration with GLAST member db

Computing Resource Projections

Providing resources for: flight data, reprocessing, simulations, user analysis

Currently: 350 TB disk & 400 cores. Add 250 TB & 400 cores for 2008.

xrootd

• Beginning to use xrootd
– System developed at SLAC for BABAR to manage large datasets
– Distributes files across disks
• Maximizes throughput
• Minimizes manual disk management
• Automates archiving datasets to (and restoring from) tape
• Provides more reliability and scalability than NFS
• Supports access control based on GLAST collaborator list

[Diagram labels: query; redirector; file servers; STK tape silo]

Conforming to HEASARC FTOOLS

• Agreed from the beginning with Mission that science tools would be jointly developed with (and distributed by) Science Support Center and adhere to FTOOLS standard
– Atomic toolkit with FITS files as input/output to a string of applications, controlled by IRAF parameter files
– Use scripting language to glue apps together
– Very different from the instrument sim/reconstruction code!
– Shared code development environment, languages
– Caused a certain amount of early tension, having to bifurcate coding styles. People are spanning both worlds now.

[Flow diagram steps: Select events; Create Exposure Map; Compute Diffuse Response; Do Max Likelihood Fit]
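The "atomic tools glued together by a script" idea can be made concrete with a small driver. In the sketch below the tool names, parameter names and file names are all hypothetical placeholders chosen to mirror the four steps in the diagram; they are not the names of the actual Science Tools. The script only builds and checks the command lines, it does not execute anything.

```python
# Hypothetical tool and parameter names, used only to mirror the four steps.
STEPS = [
    ("select_events", {"infile": "events.fits", "outfile": "selected.fits"}),
    ("make_expmap", {"infile": "selected.fits", "outfile": "expmap.fits"}),
    ("diffuse_response", {"infile": "selected.fits", "outfile": "diffrsp.fits"}),
    ("likelihood_fit", {"infile": "diffrsp.fits", "expmap": "expmap.fits",
                        "outfile": "fit.fits"}),
]


def command_line(tool, params):
    """Render one tool call in the name=value style of parameter-file tools."""
    return [tool] + [f"{key}={value}" for key, value in params.items()]


def build_chain(steps, raw_inputs=("events.fits",)):
    """Check that each step only reads files produced upstream, then render it."""
    available = set(raw_inputs)
    commands = []
    for tool, params in steps:
        inputs = [v for k, v in params.items() if k != "outfile"]
        missing = [f for f in inputs if f not in available]
        if missing:
            raise ValueError(f"{tool} needs files nobody produced: {missing}")
        available.add(params["outfile"])
        commands.append(command_line(tool, params))
    return commands


for cmd in build_chain(STEPS):
    print(" ".join(cmd))
```

Since every step communicates only through files, any step can be rerun or swapped out on its own, which is what makes the toolkit "atomic".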

Gamma Ray Analysis: Model Fitting

• A scarcity of photons in the GeV range… :-(
• Must do max likelihood model fitting
– Use parametrized instrument response functions for energy, angular resolution and effective area
– Tabulated exposures
– Computationally intensive for crowded regions of sky
• HEP approach would be to perform full simulations of the sky using complete knowledge, including correlations, of the instrument performance
– It remains to be seen in practice whether this approach is needed or feasible
– Note that a recent 2 month orbit full simulation took ~500 CPU-days to perform
• BUT - that was one elapsed day on the batch farm
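To make the max likelihood bullet concrete, here is a toy binned Poisson fit with a single free parameter, the source strength. It is a sketch under strong simplifying assumptions: the spatial template and background are taken as known, there is no instrument response or exposure, and a ternary search (valid because the log-likelihood is concave in the source strength) replaces a real optimizer. It is not the Science Tools implementation, and all names are hypothetical.

```python
import math


def log_likelihood(s, counts, template, background):
    """Poisson log-likelihood for expected counts mu_i = s * template_i + background_i.

    The constant term -log(n_i!) is dropped because it does not depend on s.
    """
    total = 0.0
    for n, t, b in zip(counts, template, background):
        mu = s * t + b
        total += n * math.log(mu) - mu
    return total


def fit_source(counts, template, background, s_max=1000.0, tol=1e-4):
    """Maximize the log-likelihood over 0 <= s <= s_max by ternary search."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (log_likelihood(m1, counts, template, background)
                < log_likelihood(m2, counts, template, background)):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


# Toy data: 5 bins, a peaked source template on a flat background.
template = [0.1, 0.2, 0.4, 0.2, 0.1]    # fraction of source counts per bin
background = [2.0] * 5                  # expected background counts per bin
counts = [7, 12, 22, 12, 7]             # observed counts

s_hat = fit_source(counts, template, background)
print(f"fitted source strength: {s_hat:.2f} counts")
```

With only a handful of photons per bin, as in the GeV range, the Poisson form matters: a least-squares fit would implicitly assume Gaussian errors that do not hold at these counts.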

Summary

• GLAST Observatory approaching final testing now
– Launch in early 2008
• LAT uses HEP techniques to handle science data stream and produce photon list
• HEASARC FTOOLS for mainstream astrophysics analysis
• It remains to be seen whether HEP’s extensive use of simulations will extend into the data taking era
– Invaluable pre-launch
– Will “error is in the exponent” make the extra analysis precision unnecessary?