DUNE FD SP DAQ
Georgia Karagiorgi
Columbia University
LBNC Meeting @ FNAL
February 28, 2019
DUNE FD SP DAQ
• DUNE FD SP DAQ has passed CDR Review in December 2018
• TDR has been lagging behind actual developments in design
- Pragmatic and useful feedback from TDR Chapter Reviewers (thank you!)
• This talk: Review
- DAQ Design
- Design Justification
- Cost Walkthrough
- Resources and Organization
2 G. Karagiorgi | DAQ
DUNE FD SP DAQ
• Scope:
- Begins at fibers from the detector (electrically decoupled from cryostats)
- Ends at WAN network interface (fibers to FNAL via ESnet)
- Provides common computing and network services for other systems
- All slow control and safety functions are outside DAQ
• Functionality:
- Provides basic timing and synchronization for sub-detectors
- Receives, synchronizes, compresses, buffers streaming data from sub-detectors
- Extracts trigger primitives from data, summarizing local activity in detector; makes local, module, and cross-module trigger decisions
- Builds “event records” from selected space-time volumes, buffers, and relays those to permanent storage
- Carries out local data reduction and filtering as required
DUNE FD SP DAQ
• Design principles:
- A single, scalable system design for all detector modules
- Ability to self-trigger with zero dead-time and with high efficiency for targeted physics signals and to record triggered data covering the full detector
- Evolutionary design: begin with very conservative design for first module
- Preserve possibility of additional capacity as required
• Key challenges:
- Long (“permanent”) commissioning state: partition-able system
- Supernova Burst (SNB) requirements: large buffering upstream, low fake trigger rates
- Difficult-to-access location: reliability and remote operation
- Power, cooling and space considerations
DUNE FD SP DAQ Design
• DAQ is split between 4850 ft level and surface: bulk of processing/buffering underground, minimizing data traffic to surface.
• Strong power, cooling, space constraints in CUC (450 kW for all 4 modules, ~50 racks)
CUC: DAQ Front-End lives here
Surface: Back-End DAQ lives here
DUNE FD SP DAQ Design
Not shown: Timing System, Control Paths
DUNE FD SP DAQ Design → Cost Basis
For 1 SP Module: 150 APAs → 150 Detector Links
- 75 FELIX Boards
- 150 Co-processors (FPGA)
- 75 FELIX Host servers
- 75 Data Selection servers* (possibly → on surface)
- + networking
- + CCM
PDS System → Assume data level of 10% of TPC
- 6-8 FELIX Boards
- 6-8 FELIX Host servers
- 6-8 Data Selection servers (possibly → on surface)
- + networking
- + CCM
*recently updated (WBS)
DUNE FD SP DAQ Design → Cost Basis
For 1 SP Module:
- 1 (+1) Data Selection (MLT) server
- + networking + CCM
Common to all four modules:
- 1 (+1) Data Selection (ETM) server*
- Timing System Interface
- + networking + CCM
Also: Trigger Primitive Local Storage System* (~1 PB)
*recently updated (WBS)
DUNE FD SP DAQ Design → Cost Basis
For 1 SP Module:
- 30 Event Builder servers
- + networking + CCM
- 5 Data Selection (HLT Farm) servers
- + networking + CCM
- Storage buffer (~0.5 PB)
DUNE FD SP DAQ Design → Cost Basis
Timing System: UK design (ProtoDUNE)
Control, Configuration and Management (CCM) for Downstream DAQ:
- 4 Servers for Config, Run Control, Databases
- 4 Servers for Data Management and Transfer
Infrastructure:
- Facilities costs (16 racks, contracts/procurement)
- IT infrastructure
Substantiating Design: 1. ProtoDUNE-SP
Demonstrated Hardware Components:
• Front-end Readout: 2 out of 6 APAs are now read out by FELIX-based Front-end system (~DUNE FD system without Co-processor(s) and only 1 APA/FELIX board)
→ interface to front-end electronics, scalability, data flow
→ host server requirements and specifications
→ platform for further system development (co-processor, trigger primitives)
• Downstream DAQ: Event Builder farm, CCM machines, disk buffers
→ system partitioning, scalability
→ platform for further system development (CCM, IPC, Data Flow Orchestrator)
• Data Selection/Timing: (External Triggering)
→ Timing Distribution System
Substantiating Design: 2. Simulations
• Key challenge for DUNE FD (new with respect to ProtoDUNE, MicroBooNE, SBND, ICARUS…): continuous self-triggering of detector
- Self-triggering demonstration is being planned for ProtoDUNE (target: 2019, and ~2021).
- Other detectors (e.g. ICARUS, MicroBooNE) have demonstrated self-triggering in coincidence with external gates (data and trigger rates are capped). In DUNE FD, such “throttling” limits sensitivity to off-beam physics.
→ Need intelligent trigger (Data Selection) design and validation on reliable simulation, plus accurate noise and background predictions. (This has been the Data Selection focus thus far.)
Substantiating Design: 2. Simulations
[Diagram labels: CUC, Surface, Control Room]
Substantiating Design: 2. Simulations
Data Selection Baseline Strategy and Hierarchy: TPC-based
Lowest Level
- Trigger Primitives: “hit on a wire”
- Trigger Candidates: “clusters of hits”
- Trigger Command: “localized (HE) interaction” (prompts “event record”); “extended (SNB) interactions”
- Filter: down-selection of triggers or event records
Highest Level
Substantiating Design: 2. Simulations
Expected Trigger Rates:
- 100 MeV trigger candidate deposited energy threshold for “localized (HE) interaction” trigger: >99% efficiency (→ 5.4 ms of losslessly compressed data from entire module to event builder; rate up to 1 Hz)
- 10 MeV trigger candidate deposited energy threshold as input to “extended (SNB) interactions” = SN burst trigger: >90% (galactic coverage) (→ 100 s of losslessly compressed data from entire module to event builder; rate ~1/month)
Substantiating Design: 2. Simulations
This stage MUST keep up with trigger primitive formation.
[Diagram labels: CUC, Surface, Control Room]
Substantiating Design: 2. Simulations
Trigger Primitives in CPU: TPC Simulations
Ongoing effort on trigger primitive generation: efficiency and TP rates due to noise and radiologicals.
Substantiating Design: 2. Simulations
Trigger Primitives in CPU: TPC Simulations
Ongoing effort on filtering (simulation based) and trigger primitive generation speed.
[Plot annotation: Need to be below this line to keep up with APA raw data rate]
Substantiating Design: 2. Simulations
Trigger Primitives in CPU: TPC TP Validation in ProtoDUNE-SP
“Pure” 39Ar rate should be ~33 kHz/APA (half of DUNE, since only one side of APA is active). Currently working on understanding the additional contribution expected from cosmics and (known) noisy channels in ProtoDUNE-SP.
Also, full-stream, single-APA trigger primitive generation on CPU (10 cores) demonstrated at ProtoDUNE-SP!
Substantiating Design: 2. Simulations
This stage MUST keep up with TP processing and trigger candidate formation; >99% efficient on HE interactions and [sufficiently*] efficient on SN interactions
[Diagram labels: CUC, Surface, Control Room]
Substantiating Design: 2. Simulations
Trigger Candidates in CPU: TPC Simulations
Efficiency on individual interactions strongly dependent on energy; also dependent on Trigger Primitive threshold.
Trigger Candidates formed by clustering Trigger Primitives in Channel (collection) vs. Time.
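The clustering of Trigger Primitives in channel vs. time can be illustrated with a minimal sketch. All names and parameter values here (`TriggerPrimitive`, `max_channel_gap`, `max_time_gap`, `adc_sum`) are hypothetical and chosen for illustration; they are not taken from the DUNE DAQ software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerPrimitive:
    channel: int   # collection-wire channel number
    time: int      # hit time in sampling ticks
    adc_sum: int   # summed ADC of the hit (proxy for deposited energy)


def is_near(tp, cluster, max_channel_gap, max_time_gap):
    """True if tp is adjacent in channel and time to any member of cluster."""
    return any(
        abs(tp.channel - other.channel) <= max_channel_gap
        and abs(tp.time - other.time) <= max_time_gap
        for other in cluster
    )


def cluster_primitives(tps, max_channel_gap=2, max_time_gap=20):
    """Single-linkage clustering of TPs in the (channel, time) plane.

    The gap values are illustrative assumptions, not tuned numbers.
    """
    clusters = []
    for tp in sorted(tps, key=lambda t: (t.time, t.channel)):
        touching, rest = [], []
        for c in clusters:
            (touching if is_near(tp, c, max_channel_gap, max_time_gap) else rest).append(c)
        # Merge the new TP with every cluster it touches.
        merged = [tp] + [other for c in touching for other in c]
        clusters = rest + [merged]
    return clusters


def trigger_candidates(tps, min_adc_sum=0):
    """Summarize each cluster; keep those above a (hypothetical) ADC-sum threshold."""
    candidates = []
    for c in cluster_primitives(tps):
        total = sum(tp.adc_sum for tp in c)
        if total >= min_adc_sum:
            candidates.append({"n_hits": len(c), "adc_sum": total,
                               "t_start": min(tp.time for tp in c)})
    return candidates
```

Raising `min_adc_sum` in this sketch trades low-energy candidate efficiency for a lower background candidate rate.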
Substantiating Design: 2. Simulations
Trigger Candidates in CPU: TPC Simulations
It is critical to keep the background rate low, to minimize SNB fake rates. Worst case: minimize background low-energy candidates at the expense of lower low-energy candidate efficiency. (SNB trigger efficiency can still be high!)
Substantiating Design: 2. Simulations
Trigger Candidates in CPU: TPC Simulations
Trigger efficiency for “localized HE interaction” events: beam, atmospheric neutrinos, nucleon decay, neutron-antineutron oscillation, cosmics. Sufficient efficiency down to 10 MeV (for individual SN interactions → SNB trigger input)
NC events lead to loss in efficiency
Substantiating Design: 2. Simulations
• Simulation studies show that:
- >99% trigger efficiency for “localized (HE) interactions”
- >90% galactic coverage for SNB
can be maintained while keeping:
- “localized (HE) interaction” trigger rate to ~0.1 Hz (<1 Hz)
- “extended (SN) interactions” (SNB) trigger rate to ~1/month
• The ProtoDUNE-SP downstream DAQ has demonstrated our ability to keep up with 1/25th the size of a single DUNE FD SP module, for trigger rates up to 40 Hz and 3 ms readout window (x5 design goal).
• SNB trigger Event Building remains to be demonstrated.
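As a rough cross-check of the ProtoDUNE-SP scale quoted above, the sketch below estimates the uncompressed data volume for 6 APAs at 40 Hz with a 3 ms window. It assumes the same per-APA parameters used for the far detector elsewhere in this talk (2560 channels per APA, 2 MHz sampling, 12-bit samples) and ignores compression and headers, so the result is an illustrative bound and not a measured throughput. The function name is hypothetical.

```python
def raw_throughput_bytes_per_s(trigger_hz, window_s, n_apas,
                               channels_per_apa=2560,
                               sampling_hz=2_000_000,
                               bits_per_sample=12):
    """Uncompressed bytes per second for a triggered readout (no headers, no compression)."""
    samples_per_channel = round(window_s * sampling_hz)
    bytes_per_event = (samples_per_channel * channels_per_apa * n_apas
                       * bits_per_sample) // 8
    return trigger_hz * bytes_per_event


# ProtoDUNE-SP: 6 APAs out of the 150 in one FD SP module, i.e. 1/25 of a module.
protodune_fraction = 6 / 150
protodune_rate = raw_throughput_bytes_per_s(trigger_hz=40, window_s=3e-3, n_apas=6)
print(f"fraction of a module: {protodune_fraction:.2f}")
print(f"uncompressed rate: {protodune_rate / 1e9:.2f} GB/s")
```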
Trigger Rates and Event Builder System:
Substantiating Design: 2. Simulations
MLT MUST have high SNB “efficiency”: >90% galactic coverage (galactic SNB probability-weighted efficiency)
[Diagram labels: CUC, Surface, Control Room]
Substantiating Design: 2. Simulations
Sensitivity to Radiological Backgrounds
neutrons
Radon
Extensive SN burst sensitivity studies demonstrate high (>90%) galactic coverage while keeping fake SNB trigger rate to 1/month.
SNB trigger: Multiplicity-based: low-energy trigger candidates over up to 10 seconds
An energy-weighted multiplicity count scheme could be applied to increase efficiency/minimize background.
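A multiplicity-based burst trigger of this kind can be sketched as a sliding-window count. The 10 s window comes from the slide; the threshold value, the function name, and the candidate format `(time_s, weight)` are assumptions made for illustration. Setting every weight to 1.0 gives a plain multiplicity count, while using an energy-derived weight gives the energy-weighted variant.

```python
from collections import deque


def snb_trigger_times(candidates, window_s=10.0, threshold=5.0):
    """Return the times at which the weighted candidate count inside a
    trailing window of length window_s reaches threshold.

    candidates: iterable of (time_s, weight) tuples.
    threshold is a hypothetical value, not a tuned DUNE number.
    """
    window = deque()
    total = 0.0
    triggers = []
    for t, w in sorted(candidates):
        window.append((t, w))
        total += w
        # Drop candidates that have fallen out of the trailing window.
        while t - window[0][0] > window_s:
            total -= window.popleft()[1]
        if total >= threshold:
            triggers.append(t)
            # Reset so that a single burst fires only once in this sketch.
            window.clear()
            total = 0.0
    return triggers


# Sparse background: one candidate every 7 s never reaches the threshold.
background = [(7.0 * i, 1.0) for i in range(20)]
# Burst: six candidates within 2.5 s.
burst = [(200.0 + 0.5 * i, 1.0) for i in range(6)]
print(snb_trigger_times(background))          # no triggers expected
print(snb_trigger_times(background + burst))  # one trigger during the burst
```

In this picture the fake trigger rate is set by how often background candidates alone cross the threshold within one window, which is why the low-energy candidate background rate matters so much.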
Substantiating Design: 2. Simulations
This stage MUST keep up with EVB rates.
[Diagram labels: CUC, Surface, Control Room]
Substantiating Design: 2. Simulations
Filtering (CNN-based) in GPU for event classification and/or downsizing:
Ongoing effort on further radiological + noise reduction for SN and HE interactions using image analysis/classification. CNN vgg16b network: 20-30 ms classification time per APA-frame on single GPU card.
Cost Walkthrough
Resources and Organization
Co-processor design and FELIX integration plans
Summary
DUNE FD SP DAQ Conceptual Design is complete.
Substantial efforts have gone into substantiating design and costing through simulation and ProtoDUNE experience.
All this information, and feedback from TDR reviewers, will be incorporated in updated TDR draft.
We are rapidly moving toward a Technical Design this year (ProtoDUNE-SP development platform) and expect substantial progress toward Engineering Design.
Backup Slides
Uncompressed Event Rates
Localized HE triggers (1 SP module): 5.4 ms x 150 x 2560 x 2 MHz x 12 bit = 6.22 GB
Extended SNB triggers (1 SP module): 100 s x 150 x 2560 x 2 MHz x 12 bit = 115 TB
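The two products above can be reproduced with a few lines of arithmetic. This is a sketch using only the factors on this slide (readout window, 150 APAs, 2560 channels per APA, 2 MHz sampling, 12-bit samples); the function name is hypothetical, and headers and compression are ignored.

```python
def uncompressed_event_bytes(window_s,
                             n_apas=150,
                             channels_per_apa=2560,
                             sampling_hz=2_000_000,
                             bits_per_sample=12):
    """Uncompressed size in bytes of one readout window for a full SP module."""
    samples_per_channel = round(window_s * sampling_hz)
    total_bits = samples_per_channel * n_apas * channels_per_apa * bits_per_sample
    return total_bits // 8


he_bytes = uncompressed_event_bytes(5.4e-3)   # localized HE trigger
snb_bytes = uncompressed_event_bytes(100.0)   # extended SNB trigger
print(f"Localized HE: {he_bytes / 1e9:.2f} GB")
print(f"Extended SNB: {snb_bytes / 1e12:.1f} TB")
```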
Trigger Primitive Definitions
• Currently in TDR: Conservative TP Model
• Expected to be dominated by 39Ar radiological backgrounds (~100 Hz per channel for 0 threshold)
Substantiating Design: 2. Simulations
Trigger Primitives in CPU: TPC Simulations
Ongoing effort on trigger primitive generation.