DUNE FD SP DAQ Georgia Karagiorgi Columbia University LBNC Meeting @ FNAL February 28, 2019


Page 1: DUNE FD SP DAQ - Fermilab · DUNE FD SP DAQ • Design principles:- A single, scalable system design for all detector modules- Ability to self-trigger with zero dead-time and with

DUNE FD SP DAQ

Georgia Karagiorgi

Columbia University

LBNC Meeting @ FNAL

February 28, 2019


DUNE FD SP DAQ

•  DUNE FD SP DAQ passed its CDR Review in December 2018

•  TDR has been lagging behind actual developments in design

-  Pragmatic and useful feedback from TDR Chapter Reviewers (thank you!)

•  This talk: Review

-  DAQ Design

-  Design Justification

-  Cost Walkthrough

-  Resources and Organization

2 G. Karagiorgi | DAQ


DUNE FD SP DAQ

•  Scope:

-  Begins at fibers from the detector (electrically decoupled from cryostats)

-  Ends at WAN network interface (fibers to FNAL via ESnet)

-  Provides common computing and network services for other systems

-  All slow control and safety functions are outside DAQ

•  Functionality:

-  Provides basic timing and synchronization for sub-detectors

-  Receives, synchronizes, compresses, buffers streaming data from sub-detectors

-  Extracts trigger primitives from data, summarizing local activity in detector; makes local, module, and cross-module trigger decisions

-  Builds “event records” from selected space-time volumes, buffers, and relays those to permanent storage

-  Carries out local data reduction and filtering as required


DUNE FD SP DAQ

•  Design principles:

-  A single, scalable system design for all detector modules

-  Ability to self-trigger with zero dead-time and with high efficiency for targeted physics signals and to record triggered data covering the full detector

-  Evolutionary design: begin with very conservative design for first module

-  Preserve possibility of additional capacity as required

•  Key challenges:

-  Long (“permanent”) commissioning state: partition-able system

-  Supernova Burst (SNB) requirements: large buffering upstream, low fake trigger rates

-  Difficult-to-access location: reliability and remote operation

-  Power, cooling and space considerations


DUNE FD SP DAQ Design

•  DAQ is split between 4850ft level and surface: Bulk of processing/buffering underground, minimizing data traffic to surface.

•  Strong power, cooling, space constraints in CUC (450kW for all 4 modules, ~50 racks)

CUC: DAQ Front-End lives here

Surface: Back-End DAQ lives here


DUNE FD SP DAQ Design

Not shown: Timing System, Control Paths


DUNE FD SP DAQ Design → Cost Basis

For 1 SP Module: 150 APAs → 150 Detector Links

-  75 FELIX Boards

-  150 Co-processors (FPGA)

-  75 FELIX Host servers

-  75 Data Selection servers* (possibly → on surface)

-  + networking

-  + CCM

PDS System → Assume data level of 10% of TPC

-  6-8 FELIX Boards

-  6-8 FELIX Host servers

-  6-8 Data Selection servers (possibly → on surface)

-  + networking

-  + CCM

*recently updated (WBS)
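The counts on this slide follow from simple ratios. A minimal sketch, assuming the ratios implied by the numbers above (2 APAs per FELIX board, one co-processor per APA, one host server and one Data Selection server per board); the class and function names are illustrative and not part of any DAQ software:

```python
from dataclasses import dataclass


@dataclass
class FrontEndCounts:
    detector_links: int
    felix_boards: int
    coprocessors: int
    host_servers: int
    data_selection_servers: int


def tpc_front_end_counts(n_apas: int, apas_per_board: int = 2) -> FrontEndCounts:
    """Scale the per-module hardware counts from the number of APAs.

    apas_per_board=2 is an assumption inferred from 150 APAs -> 75 boards.
    """
    boards = -(-n_apas // apas_per_board)  # ceiling division
    return FrontEndCounts(
        detector_links=n_apas,          # one detector link per APA
        felix_boards=boards,
        coprocessors=n_apas,            # one FPGA co-processor per APA
        host_servers=boards,            # one host server per FELIX board
        data_selection_servers=boards,  # one Data Selection server per board
    )


counts = tpc_front_end_counts(150)
print(counts)

# PDS assumed at 10% of the TPC data level: 0.1 * 75 boards = 7.5,
# which is consistent with the 6-8 boards quoted on the slide.
pds_boards_estimate = 0.1 * counts.felix_boards
print(pds_boards_estimate)
```

Changing `n_apas` or `apas_per_board` shows how the front-end cost basis would scale for a different module layout.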


DUNE FD SP DAQ Design → Cost Basis

For 1 SP Module:

-  1 (+1) Data Selection (MLT) server

-  + networking + CCM

Common to all four modules:

-  1 (+1) Data Selection (ETM) server*

-  Timing System Interface

-  + networking + CCM

Also: Trigger Primitive Local Storage System* (~1 PB)

*recently updated (WBS)


DUNE FD SP DAQ Design → Cost Basis

For 1 SP Module:

-  30 Event Builder servers

-  + networking + CCM

-  5 Data Selection (HLT Farm) servers

-  + networking + CCM

-  Storage buffer (~0.5 PB)


DUNE FD SP DAQ Design → Cost Basis

Timing System: UK design (ProtoDUNE)

Control, Configuration and Management (CCM) for Downstream DAQ:

-  4 Servers for Config, Run Control, Databases

-  4 Servers for Data Management and Transfer

Infrastructure:

-  Facilities costs (16 racks, contracts/procurement)

-  IT infrastructure


Substantiating Design: 1. ProtoDUNE-SP

Demonstrated Hardware Components:

•  Front-end Readout: 2 out of 6 APAs are now read out by FELIX-based Front-end system (~DUNE FD system without Co-processor(s) and only 1 APA/FELIX board)

→ interface to front-end electronics, scalability, data flow

→ host server requirements and specifications

→ platform for further system development (co-processor, trigger primitives)

•  Downstream DAQ: Event Builder farm, CCM machines, disk buffers

→ system partitioning, scalability

→ platform for further system development (CCM, IPC, Data Flow Orchestrator)

•  Data Selection/Timing: (External Triggering)

→ Timing Distribution System


Substantiating Design: 2. Simulations

•  Key challenge for DUNE FD (new with respect to ProtoDUNE, MicroBooNE, SBND, ICARUS…): Continuous self-triggering of detector

-  Self-triggering demonstration is being planned for ProtoDUNE (target: 2019, and ~2021).

-  Other detectors (e.g. ICARUS, MicroBooNE) have demonstrated self-triggering in coincidence with external gates (data and trigger rates are capped). In DUNE FD, such “throttling” would limit sensitivity to off-beam physics.

→ Need intelligent trigger (Data Selection) design and validation on reliable simulation, plus accurate noise and background predictions. (This has been the Data Selection focus thus far.)


Substantiating Design: 2. Simulations

(Figure labels: CUC, Surface, Control Room)


Substantiating Design: 2. Simulations

Data Selection Baseline Strategy and Hierarchy: TPC-based

Lowest Level

•  Trigger Primitives: “hit on a wire”

•  Trigger Candidates: “clusters of hits”

•  Trigger Command:

-  “localized (HE) interaction” (prompts “event record”)

-  “extended (SNB) interactions”

•  Filter: down-selection of triggers or event records

Highest Level


Substantiating Design: 2. Simulations

Expected Trigger Rates:

•  100 MeV trigger candidate deposited energy threshold for “localized (HE) interaction” trigger: >99% efficiency (→ 5.4 ms of losslessly compressed data from entire module to event builder; rate up to 1 Hz)

•  10 MeV trigger candidate deposited energy threshold as input to “extended (SNB) interactions” = SN burst trigger: >90% (galactic coverage) (→ 100 s of losslessly compressed data from entire module to event builder; rate ~1/month)


Substantiating Design: 2. Simulations

This stage MUST keep up with trigger primitive formation.

(Figure labels: CUC, Surface, Control Room)


Substantiating Design: 2. Simulations

Trigger Primitives in CPU: TPC Simulations

Ongoing effort on trigger primitive generation: efficiency and TP rates due to noise and radiologicals.
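To illustrate what a “hit on a wire” trigger primitive could look like, here is a minimal sketch of a threshold-based hit finder on a single channel's waveform. The median pedestal estimate, the threshold value, and the fields stored per hit are assumptions for illustration only; this is not the algorithm used in the DAQ:

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Sequence


@dataclass
class TriggerPrimitive:
    channel: int
    start_tick: int           # first sample above threshold
    time_over_threshold: int  # consecutive samples above threshold
    adc_sum: int              # pedestal-subtracted charge summed over the hit
    adc_peak: int             # largest pedestal-subtracted sample in the hit


def find_trigger_primitives(
    channel: int, waveform: Sequence[int], threshold: int = 10
) -> List[TriggerPrimitive]:
    """Return one TriggerPrimitive per contiguous run of samples above threshold."""
    pedestal = int(median(waveform))  # assumed pedestal estimate
    hits: List[TriggerPrimitive] = []
    start = None
    adc_sum = 0
    adc_peak = 0
    for tick, sample in enumerate(waveform):
        value = sample - pedestal
        if value > threshold:
            if start is None:  # a new hit begins here
                start = tick
                adc_sum = 0
                adc_peak = 0
            adc_sum += value
            adc_peak = max(adc_peak, value)
        elif start is not None:  # the current hit ends here
            hits.append(
                TriggerPrimitive(channel, start, tick - start, adc_sum, adc_peak)
            )
            start = None
    if start is not None:  # hit still open at the end of the waveform
        hits.append(
            TriggerPrimitive(channel, start, len(waveform) - start, adc_sum, adc_peak)
        )
    return hits


example = [500] * 20 + [520, 540, 530] + [500] * 20
print(find_trigger_primitives(channel=7, waveform=example))
```

Lowering `threshold` admits more noise and radiological hits, which is the efficiency versus TP-rate trade-off described above.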


Substantiating Design: 2. Simulations

Trigger Primitives in CPU: TPC Simulations

Ongoing effort on filtering (simulation based) and trigger primitive generation speed.

(Plot annotation: Need to be below this line to keep up with APA raw data rate)


Substantiating Design: 2. Simulations

Trigger Primitives in CPU: TPC TP Validation in ProtoDUNE-SP

“Pure” 39Ar rate should be ~33 kHz/APA (half of DUNE, since only one side of APA is active). Currently working on understanding additionally expected contribution from cosmics and (known) noisy channels in ProtoDUNE-SP.

Also, full stream, single APA trigger primitive generation on CPU (10 cores) demonstrated at ProtoDUNE-SP!


Substantiating Design: 2. Simulations

This stage MUST keep up with TP processing and trigger candidate formation; >99% efficient on HE interactions and [sufficiently*] efficient on SN interactions

(Figure labels: CUC, Surface, Control Room)


Substantiating Design: 2. Simulations

Trigger Candidates in CPU: TPC Simulations

Efficiency on individual interactions strongly dependent on energy; also dependent on Trigger Primitive threshold.

Trigger Candidates formed by clustering Trigger Primitives in Channel (collection) vs. Time.
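Clustering in channel vs. time can be sketched as follows. This is a minimal single-linkage example in which two trigger primitives are linked when they are close in both channel and time; the gap parameters, the `TP` fields, and the use of summed ADC as a stand-in for deposited energy are all assumptions, not the clustering used in the simulation studies:

```python
from typing import List, NamedTuple


class TP(NamedTuple):
    channel: int  # collection-plane channel number
    tick: int     # start time in sampling ticks
    adc_sum: int  # summed ADC of the hit


def cluster_tps(
    tps: List[TP], max_channel_gap: int = 2, max_tick_gap: int = 20
) -> List[List[TP]]:
    """Group TPs into clusters by single linkage in (channel, tick)."""
    remaining = list(tps)
    clusters: List[List[TP]] = []
    while remaining:
        seed = remaining.pop()
        cluster = [seed]
        frontier = [seed]
        while frontier:
            current = frontier.pop()
            # Every unassigned TP close to `current` joins the cluster.
            linked = [
                tp
                for tp in remaining
                if abs(tp.channel - current.channel) <= max_channel_gap
                and abs(tp.tick - current.tick) <= max_tick_gap
            ]
            for tp in linked:
                remaining.remove(tp)
            cluster.extend(linked)
            frontier.extend(linked)
        clusters.append(sorted(cluster, key=lambda tp: (tp.tick, tp.channel)))
    return clusters


def candidate_adc(cluster: List[TP]) -> int:
    """Summed ADC of a cluster, used here as a proxy for deposited energy."""
    return sum(tp.adc_sum for tp in cluster)


tps = [TP(10, 100, 50), TP(11, 105, 60), TP(12, 110, 40), TP(200, 100, 30)]
for cluster in cluster_tps(tps):
    print(len(cluster), candidate_adc(cluster))
```

A trigger candidate could then be kept when `candidate_adc` exceeds a threshold corresponding to the deposited-energy cut; the ADC-to-MeV calibration is outside the scope of this sketch.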


Substantiating Design: 2. Simulations

Trigger Candidates in CPU: TPC Simulations

It is critical to keep the background rate low in order to minimize SNB fake rates. In the worst case: minimize background low-energy candidates at the expense of lower low-energy candidate efficiency. (SNB trigger efficiency can still be high!)


Substantiating Design: 2. Simulations

Trigger Candidates in CPU: TPC Simulations

Trigger efficiency for “localized HE interaction” events: beam, atmospheric neutrinos, nucleon decay, neutron-antineutron oscillation, cosmics. Sufficient efficiency down to 10 MeV (for individual SN interactions → SNB trigger input)

NC events lead to loss in efficiency


Substantiating Design: 2. Simulations

Trigger Rates and Event Builder System:

•  Simulation studies show that:

-  >99% trigger efficiency for “localized (HE) interactions”

-  >90% galactic coverage for SNB

can be maintained while keeping:

-  “localized (HE) interaction” trigger rate to ~0.1 Hz (<1 Hz)

-  “extended (SN) interactions” (SNB) trigger rate to ~1/month

•  The ProtoDUNE-SP downstream DAQ has demonstrated our ability to keep up with 1/25th the size of a single DUNE FD SP module, for trigger rates up to 40 Hz and 3 ms readout window (x5 design goal).

•  SNB trigger Event Building remains to be demonstrated.
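As a rough cross-check, the trigger rates quoted on this slide can be combined with the uncompressed event sizes from the backup slide (6.22 GB per localized HE trigger, 115 TB per SNB trigger) to estimate the average data rate reaching the event builder. A minimal sketch, assuming a 30-day month and ignoring lossless compression, so the results are upper bounds rather than design figures:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month

HE_EVENT_BYTES = 6.22e9   # uncompressed, 1 SP module, 5.4 ms window
SNB_EVENT_BYTES = 115e12  # uncompressed, 1 SP module, 100 s window


def average_rate_bytes_per_s(event_bytes: float, triggers_per_s: float) -> float:
    """Average throughput for a given event size and trigger rate."""
    return event_bytes * triggers_per_s


he_nominal = average_rate_bytes_per_s(HE_EVENT_BYTES, 0.1)  # ~0.1 Hz expected
he_maximum = average_rate_bytes_per_s(HE_EVENT_BYTES, 1.0)  # <1 Hz cap
snb_average = average_rate_bytes_per_s(SNB_EVENT_BYTES, 1.0 / SECONDS_PER_MONTH)

print(f"HE at 0.1 Hz:   {he_nominal / 1e9:.2f} GB/s")
print(f"HE at 1 Hz:     {he_maximum / 1e9:.2f} GB/s")
print(f"SNB at 1/month: {snb_average / 1e6:.1f} MB/s")
```

The SNB contribution is small on average but arrives as a single 100 s burst, which is why upstream buffering is listed as a key challenge.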


Substantiating Design: 2. Simulations

MLT MUST have high SNB “efficiency”: >90% galactic coverage (galactic SNB probability-weighted efficiency)

(Figure labels: CUC, Surface, Control Room)
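“Galactic coverage” as a probability-weighted efficiency can be written as a weighted average over, for example, supernova distance bins. A minimal sketch; the bin probabilities and per-bin trigger efficiencies below are made-up placeholders, not results from the sensitivity studies:

```python
from typing import Sequence


def galactic_coverage(
    probabilities: Sequence[float], efficiencies: Sequence[float]
) -> float:
    """Probability-weighted SNB trigger efficiency.

    probabilities[i]: relative probability of a galactic SNB falling in bin i
    efficiencies[i]:  SNB trigger efficiency for a burst in bin i
    """
    if len(probabilities) != len(efficiencies):
        raise ValueError("need one efficiency per probability bin")
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    weighted = sum(p * e for p, e in zip(probabilities, efficiencies))
    return weighted / total


# Hypothetical distance bins: near, mid-galaxy, far side of the galaxy.
bin_probability = [0.2, 0.5, 0.3]
bin_efficiency = [1.0, 1.0, 0.7]
coverage = galactic_coverage(bin_probability, bin_efficiency)
print(f"galactic coverage: {coverage:.2f}")
```

With these placeholder numbers the far bins dominate the loss, which mirrors why the requirement is phrased as coverage rather than a single efficiency.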


Substantiating Design: 2. Simulations

Sensitivity to Radiological Backgrounds

(Plot labels: neutrons, Radon)

Extensive SN burst sensitivity studies demonstrate high (>90%) galactic coverage while keeping fake SNB trigger rate to 1/month.

SNB trigger: Multiplicity-based: low-energy trigger candidates over up to 10 seconds. An energy-weighted multiplicity count scheme could be applied to increase efficiency/minimize background.
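A multiplicity-based SNB trigger of this kind can be sketched as a sliding-window count over low-energy trigger candidates, with an optional energy weighting. The threshold values, the `(time, energy)` input format, and the function name are assumptions for illustration; the actual trigger logic and tuning are not given on the slide:

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def snb_trigger_times(
    candidates: Iterable[Tuple[float, float]],
    window_s: float = 10.0,
    threshold: float = 5.0,
    energy_weighted: bool = False,
) -> List[float]:
    """Return the candidate times at which the windowed count reaches threshold.

    candidates: (time_s, energy_mev) pairs, sorted by time.
    With energy_weighted=True each candidate counts with weight energy_mev,
    so threshold is then a summed energy rather than a multiplicity.
    """
    window: Deque[Tuple[float, float]] = deque()
    fired: List[float] = []
    for time_s, energy_mev in candidates:
        weight = energy_mev if energy_weighted else 1.0
        window.append((time_s, weight))
        # Drop candidates that have fallen out of the sliding window.
        while time_s - window[0][0] > window_s:
            window.popleft()
        if sum(w for _, w in window) >= threshold:
            fired.append(time_s)
    return fired


# Sparse background candidates followed by a burst starting at t = 100 s.
background = [(0.0, 12.0), (30.0, 11.0), (60.0, 15.0)]
burst = [(100.0 + 0.5 * i, 20.0) for i in range(6)]
print(snb_trigger_times(background + burst))
print(snb_trigger_times(background))
```

Raising `threshold` or shortening `window_s` lowers the fake rate from radiological backgrounds at the cost of coverage for distant, low-multiplicity bursts.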


Substantiating Design: 2. Simulations

This stage MUST keep up with EVB rates.

(Figure labels: CUC, Surface, Control Room)


Substantiating Design: 2. Simulations

Filtering (CNN-based) in GPU for event classification and/or downsizing:

Ongoing effort on further radiological + noise reduction for SN and HE interactions using image analysis/classification. CNN vgg16b network: 20-30 ms classification time per APA-frame on single GPU card.


Cost Walkthrough


Resources and Organization


Co-processor design and FELIX integration plans


Summary


DUNE FD SP DAQ Conceptual Design is complete.

Substantial efforts have gone into substantiating design and costing through simulation and ProtoDUNE experience.

All this information, and feedback from TDR reviewers, will be incorporated in updated TDR draft.

We are rapidly moving toward a Technical Design this year (ProtoDUNE-SP development platform) and expect substantial progress toward Engineering Design.


Backup Slides


Uncompressed Event Rates

Localized HE triggers (1 SP module): 5.4 ms x 150 x 2560 x 2 MHz x 12 bit = 6.22 GB

Extended SNB triggers (1 SP module): 100 s x 150 x 2560 x 2 MHz x 12 bit = 115 TB
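The two products above can be reproduced directly. A minimal sketch, reading the factors as readout window x APAs x channels per APA x sampling rate x bits per sample, and using decimal units (1 GB = 10^9 bytes), which is what matches the quoted totals; the function and parameter names are illustrative:

```python
def uncompressed_bytes(
    window_s: float,
    n_apas: int = 150,
    channels_per_apa: int = 2560,
    sample_rate_hz: float = 2e6,
    bits_per_sample: int = 12,
) -> float:
    """Uncompressed size in bytes of one readout window for a full SP module."""
    samples = window_s * sample_rate_hz * channels_per_apa * n_apas
    return samples * bits_per_sample / 8


he_bytes = uncompressed_bytes(5.4e-3)  # localized HE trigger window
snb_bytes = uncompressed_bytes(100.0)  # extended SNB trigger window

print(f"Localized HE trigger: {he_bytes / 1e9:.2f} GB")   # 6.22 GB
print(f"Extended SNB trigger: {snb_bytes / 1e12:.0f} TB")  # 115 TB
```

The same function with `n_apas=1` and `window_s=1.0` gives the raw data rate of a single APA, 7.68 GB/s under these parameters.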


Trigger Primitive Definitions


•  Currently in TDR: Conservative TP Model

•  Expected to be dominated by 39Ar radiological backgrounds (~100 Hz per channel for 0 threshold)


Substantiating Design: 2. Simulations

Trigger Primitives in CPU: TPC Simulations

Ongoing effort on trigger primitive generation.