Database Access
Elizabeth Gallas - Oxford -
October 06, 2009
ATLAS Week - Barcelona, Spain
What does a job need?
1. Data (Events)
2. Database (Geometry, Conditions)
3. Efficient I/O (sometimes across a network), CPU
4. (A Purpose and a) Place for Output

Needs:
1. Food
2. Water
3. Love
4. Place for output
06-Oct-2009 Elizabeth Gallas 2
Outline
- Overview of Oracle databases in ATLAS
- Conditions Database: database replication, database distribution technologies
  - Emphasis on “Frontier”: the decision for grid-wide Frontier deployment, and what you need to know and do
- TAG DB: architecture, services, resource planning
- Ongoing work (on current topics)
- Summary and conclusions

Insufficient time to describe many ongoing activities; please see the presentations from the recent Software Week: http://indico.cern.ch/conferenceDisplay.py?confId=50976 (but there has been a lot of activity since then as well!)
Overview – Oracle usage in ATLAS
Oracle is used extensively at every stage of data taking and analysis:
- Configuration
  - PVSS – Detector Control System (DCS) configuration and monitoring
  - Trigger – trigger configuration (online and simulation)
  - OKS – configuration databases for the TDAQ
  - Detector Description – geometry
- File and job management
  - T0 – Tier-0 processing
  - DQ2/DDM – distributed file and dataset management
  - Dashboard – monitors jobs and data movement on the ATLAS grid
  - PanDA – workload management: production and distributed analysis
- Dataset selection catalogue
  - AMI (dataset selection catalogue)
- Conditions data (non-event data for offline analysis)
  - Conditions Database in Oracle [POOL files in DDM (referenced from the Conditions DB)]
- Event summary – event-level metadata
  - TAGs – ease selection of, and navigation to, events of interest
Thanks to Oracle Operations Support
Many applications share Oracle resources and can affect operational capacity; supporting the databases is critical! Special thanks to:
- CERN Physics Database Services: support Oracle-based services at CERN and coordinate Distributed Database Operations (WLCG 3D)
- Tier-1 (Tier-2) DBAs and system managers
- ATLAS DBAs: Florbela Viegas, Gancho Dimitrov
  - Schedule/apply Oracle interventions
  - Advise us on application development
  - Coordinate database monitoring (experts, shifters)
Helping to develop, maintain and distribute this critical data takes specialized knowledge and considerable effort, which is frequently underestimated.
“Conditions”
“Conditions” – a general term for information which is not event-wise, reflecting the conditions or state of a system; conditions are valid for an interval ranging from very short to infinity.
Sets of conditions can be versioned (a version is called a COOL tag).
Any conditions data needed for offline processing and/or analysis must be stored in the ATLAS Conditions Database or in its referenced POOL files (DDM).
[Figure: subsystems feeding the ATLAS Conditions Database (any non-event-wise data needed for offline processing/analysis): ZDC, DCS, TDAQ, OKS, LHC, DQM, …]
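To make the interval-of-validity (IOV) and tagging model concrete, here is a minimal, hypothetical Python sketch of how a conditions lookup by time and tag behaves. This is not the COOL API; the class and method names are invented for illustration:

```python
from bisect import bisect_right

class ConditionsFolder:
    """Toy model of a COOL-like folder: payloads valid over [since, until),
    with independently versioned sets of intervals ("tags")."""
    def __init__(self):
        self.tags = {}  # tag name -> sorted list of (since, until, payload)

    def store(self, tag, since, until, payload):
        self.tags.setdefault(tag, []).append((since, until, payload))
        self.tags[tag].sort(key=lambda iv: iv[0])

    def lookup(self, tag, when):
        """Return the payload whose validity interval contains 'when', or None."""
        intervals = self.tags.get(tag, [])
        sinces = [iv[0] for iv in intervals]
        i = bisect_right(sinces, when) - 1
        if i >= 0 and intervals[i][0] <= when < intervals[i][1]:
            return intervals[i][2]
        return None

folder = ConditionsFolder()
folder.store("BEST", 0, 100, {"hv": 1500})             # older calibration
folder.store("BEST", 100, float("inf"), {"hv": 1530})  # open-ended interval
print(folder.lookup("BEST", 250))  # -> {'hv': 1530}
```

Versioning the whole set of intervals under a tag name is what makes a specific computation reproducible: re-running with the same tag yields the same payloads.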
Conditions DB infrastructure in ATLAS
Relies on considerable infrastructure: COOL, CORAL, Athena (developed by ATLAS and CERN IT) – a generic schema design which can store, accommodate and deliver a large amount of data for a diverse set of subsystems.
Athena / Conditions DB: considerable improvements in Release 15 (TWiki: DeliverablesForRelease15):
- More efficient use of COOL connections
- Enabling of Frontier/Squid connections to Oracle
- IOVDbSvc refinements
- … and much more
Continued refinement, subsystem by subsystem, as we chip away at inefficient usage (thanks to this “sculpting”). Coordination by Richard Hawkings.
COOL tagging – distinct sets of conditions making specific computations reproducible. Coordination by Paul Laycock. Used at every stage of data taking and analysis: from online calibrations, alignment and monitoring, to offline processing, more calibrations, further alignment, reprocessing and analysis, to luminosity and data quality.
DB Access Software Components
[Figure: the chain of software components between an Athena job and the database-resident data]
Oracle Distribution of Conditions data
Oracle stores a huge amount of essential data ‘at our fingertips’. But ATLAS has many… many… many… fingers, which may be looking for anything from the oldest to the newest data.
Conditions in Oracle: the master copy at Tier-0 is replicated to 10 Tier-1 sites.
- Running jobs at Oracle sites (direct access) performs well; it is important to continue testing and optimizing the RACs.
- But direct Oracle access on the grid from a remote site goes over the Wide Area Network: even after tuning efforts, direct access requires many back-and-forth communications on the network – excessive RTT (Round Trip Time)… SLOW.
- Cascade effect: jobs hold connections for longer, which prevents new jobs from starting.
Use alternative technologies, especially over the WAN: “caching” Conditions from Oracle when possible.
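The WAN penalty is dominated by round trips rather than bandwidth. A back-of-the-envelope sketch (the query count and server time are illustrative assumptions, not measurements from this talk):

```python
def job_db_time(n_queries, rtt_ms, server_ms=5.0):
    """Rough time a job spends in DB access: each query pays one network
    round trip plus server-side processing. Ignores bandwidth and pipelining."""
    return n_queries * (rtt_ms + server_ms) / 1000.0  # seconds

# Assume 2000 small queries for a conditions-heavy job (invented number):
local = job_db_time(2000, rtt_ms=0.5)    # client in the same machine room
wan = job_db_time(2000, rtt_ms=290.0)    # ~290 ms RTT, a realistic WAN value
print(f"local: {local:.0f} s, WAN: {wan:.0f} s")  # -> local: 11 s, WAN: 590 s
```

The same job that spends seconds on database access next to the Oracle RAC spends roughly ten minutes over the WAN, which is why caching close to the job pays off.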
[Figure: Conditions DB replication – the online CondDB on the Tier-0 farm feeds the offline master CondDB in the computer centre (with an isolation cut between them, and calibration updates flowing back); the master is replicated to Tier-1 replicas in the outside world.]
Technologies for Conditions “caching”
- “DB Release”: make a system of files containing the data needed; used in reprocessing campaigns. Includes:
  - SQLite replicas: a “mini” Conditions DB with specific folders, an IOV range and a COOL tag (a ‘slice’ – a small subset of many rows in particular tables)
  - the associated POOL files and a PFC
- “Frontier”: store results in a web cache. Developed by Fermilab and used by CDF; adopted and further refined for the CMS model.
  - One or more Frontier/Squid servers located at/near the Oracle RAC negotiate transactions between grid jobs and the Oracle DB (load levelling), reduce the load on Oracle by caching the results of repeated queries, and reduce the latency observed connecting to Oracle over the WAN.
  - Additional Squid servers at remote sites help even more. Picture on next slide.
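A DB Release ‘slice’ can be pictured as copying only the rows of interest into a small SQLite file. A minimal sketch using Python's sqlite3, with an invented single-table layout (the real COOL schema is far richer):

```python
import sqlite3

def make_slice(src_conn, dst_path, folder, tag, iov_lo, iov_hi):
    """Copy conditions rows for one folder/tag and IOV range into a mini DB."""
    dst = sqlite3.connect(dst_path)
    dst.execute("CREATE TABLE conditions "
                "(folder TEXT, tag TEXT, since INT, until INT, payload TEXT)")
    rows = src_conn.execute(
        "SELECT folder, tag, since, until, payload FROM conditions "
        "WHERE folder=? AND tag=? AND until>? AND since<?",  # intervals overlapping the range
        (folder, tag, iov_lo, iov_hi))
    dst.executemany("INSERT INTO conditions VALUES (?,?,?,?,?)", rows)
    dst.commit()
    return dst

# Build a toy 'master' DB in memory and slice it:
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE conditions "
            "(folder TEXT, tag TEXT, since INT, until INT, payload TEXT)")
src.executemany("INSERT INTO conditions VALUES (?,?,?,?,?)", [
    ("/TRT/HV", "BEST", 0, 100, "old"),
    ("/TRT/HV", "BEST", 100, 200, "new"),
    ("/TRT/HV", "OTHER", 0, 200, "x"),
])
mini = make_slice(src, ":memory:", "/TRT/HV", "BEST", 150, 300)
print(mini.execute("SELECT payload FROM conditions").fetchall())  # -> [('new',)]
```

The resulting file is self-contained and can be shipped with a DB Release: jobs read it locally instead of opening a connection back to Oracle.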
[Figure – FroNTier for ATLAS (picture: David Front): on the client machine, Athena's IOVDbSvc goes through COOL and CORAL, which can talk to Oracle, SQLite or MySQL directly (or fetch files from ROOT via POOL) or use the FroNTier client; FroNTier requests pass through a local Squid, then the server-site Squid and the Tomcat FroNTier servlet, which queries the Oracle DB server.]
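The essence of the Frontier/Squid chain is HTTP caching of query results keyed by the query itself. A much-simplified sketch of that behaviour (my own illustration, not Frontier code):

```python
import hashlib

class QueryCache:
    """Squid-like cache: identical queries are served from the cache,
    so only the first client pays the trip to the database."""
    def __init__(self, backend):
        self.backend = backend      # callable: query string -> result
        self.store = {}
        self.hits = self.misses = 0

    def get(self, query):
        key = hashlib.sha1(query.encode()).hexdigest()
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        result = self.store[key] = self.backend(query)
        return result

calls = []
def oracle(query):                  # stand-in for the real database
    calls.append(query)
    return f"rows for: {query}"

cache = QueryCache(oracle)
for _ in range(1000):               # 1000 jobs asking for the same conditions
    cache.get("SELECT ... FROM conditions WHERE tag='BEST'")
print(len(calls), cache.hits)       # -> 1 999
```

This is why many identical grid jobs produce almost no load on Oracle once the first job has primed the caches; the real system adds cache-expiry rules to avoid serving stale conditions.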
What does a remote collaborator (or grid job) need?
Fred Luehring (Indiana U) – use case: TRT monitoring (one of MANY examples…). Needs the latest conditions (Oracle + POOL files); explored all 3 access methods.
Talked in hallways, then at meetings (BNL Frontier, ATLAS DB), with experts and many other users facing similar issues. Many, many talented people from around the globe have been involved in this process – impressive collaboration!
Collective realization:
- Use cases continue to grow for distributed processing, calibration, alignment and analysis; expect a sustained surge in all use cases with collision data.
- Frontier technology seems to satisfy the needs of most use cases in a reasonable time.
- It is now a matter of final testing to refine configurations and go global, for all sites wanting to run jobs with the latest data.
Very recent results from Japan: TadaAki Isobe reported last week – http://www.icepp.s.u-tokyo.ac.jp/~isobe/rc/conddb/DBforUserAnalysisAtTokyo.pdf
ReadReal.py script, Athena 15.4.0, 3 methods:
- SQLite: access files on NFS
- Oracle: ~290 ms RTT to CC-IN2P3 (Lyon)
- FroNTier: ~200 ms RTT to BNL, zip-level 5 (with zip-level 0, i.e. no compression, it takes ~15% longer)
Work is ongoing to understand these kinds of tests.
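The zip-level observation is the classic compression trade-off: higher levels shrink the payload (less time on the wire) at more CPU cost. A quick illustration with Python's zlib, using invented repetitive data rather than actual Frontier payloads:

```python
import zlib

# Repetitive text compresses well, as tabular conditions payloads often do.
payload = b"since,until,channel,value\n" + b"0,100,42,1500.0\n" * 5000

for level in (0, 1, 5, 9):
    out = zlib.compress(payload, level)  # level 0 = stored, no compression
    print(f"level {level}: {len(out):6d} bytes "
          f"({100 * len(out) / len(payload):.1f}% of original)")
```

Over a high-RTT link the transfer time saved by levels 1–9 usually dwarfs the extra compression CPU, which matches the ~15% penalty reported for zip-level 0.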
Frontier/Squid – grid-wide deployment. Thanks to many!!! We have established “proof of principle”; much testing and refinement of various components is ongoing, and this is quickly falling into place.
Requested and received approval from ATLAS Computing Management for full-scale deployment enabling grid-wide Frontier access to Conditions data.
- Frontier/Squid servers: established/fleshed out at Tier-1: BNL, FZK; new Frontier service at Tier-0: CERN; additional sites (RAL, TRIUMF, LYON?) improve robustness/failover.
- Squid servers (experience at Indiana, Michigan, SLAC, Glasgow, DESY, …): required at all Tier-1 and Tier-2 sites intending to be an analysis center. Complemented by HOTDISK for POOL file subscription.
Documentation:
- Frontier installation (needed only at selected Tier-1 sites): https://www.racf.bnl.gov/docs/services/frontier/installation/
- Squid installation (all Tier-1, Tier-2, and no limitation beyond that): https://www.racf.bnl.gov/docs/services/frontier/squid-installation
- Squid testing: https://www.racf.bnl.gov/docs/services/frontier/testing
- TWiki to register Squid servers on the ATLAS grid (Rod Walker): https://twiki.cern.ch/twiki/bin/view/Atlas/T2SquidDeployment
- Hypernews (e-group): email: [email protected]; forum: https://groups.cern.ch/group/hn-atlas-DBOps/
- Database Operations TWiki (will be updated): https://twiki.cern.ch/twiki/bin/view/Atlas/DatabaseOperations
Ongoing Work (in Conditions DB Access)
Addressing inefficiencies in individual subsystems is essential for collective long-term stability; many use cases mean many different types of queries.
Associated with Frontier deployment:
- Final checks to ensure the cache is not stale
- Understanding the ‘scale’ of deployment needed
- POOL containers, smaller IOVs
- POOL file subscriptions (sites install HOTDISK)
- PFC (POOL File Catalog) update automation
- Squid server setup, testing, registration
- Hammercloud tests (J. Elmsheuser, D. van der Ster, R. Walker): http://homepages.physik.uni-muenchen.de/~johannes.elmsheuser/dbaccess/ – gradually expanding to include more sites
Ultimate goal: streamline/unify the configuration of grid jobs – configure sites in a uniform way according to capability; input to AGIS, the ATLAS Grid Information System.
ATLAS TAGs in the ATLAS Computing model
Stages of ATLAS reconstruction:
- RAW data file
- ESD (Event Summary Data) ~ 500 kB/event
- AOD (Analysis Object Data) ~ 100 kB/event
- TAG (not an acronym) ~ 1 kB/event (stable)
TAGs are produced in reconstruction in 2 formats:
- File-based: AthenaAwareNTuple format (AANT); TAG files are distributed to all Tier-1 sites
- Oracle database: the event TAG DB is populated from the files in an ‘upload’ process; can be re-produced in re-processing; available globally through a network connection
In addition:
- ‘Run Metadata’ at temporal, fill, run and LB levels
- File- and dataset-related metadata
- TAG Browser (ELSSI) – uses combined event, run, file, … metadata
[Figure: reconstruction chain RAW → ESD → AOD → TAG]
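The per-event sizes explain why event-level selection on TAGs is cheap. For a nominal billion events (my illustrative number, not a figure from the talk):

```python
EVENTS = 1_000_000_000  # illustrative: one billion events
sizes_kb = {"ESD": 500, "AOD": 100, "TAG": 1}  # kB/event, from the slide

for fmt, kb in sizes_kb.items():
    tb = EVENTS * kb / 1e9  # 1e9 kB = 1 TB
    print(f"{fmt}: {tb:.0f} TB")  # -> ESD: 500 TB, AOD: 100 TB, TAG: 1 TB
```

A TAG store small enough to hold in a single database is what makes interactive, query-style event selection feasible, while the selected events themselves stay in the far larger AOD/ESD files.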
TAG Services and Architecture
Evolution of the TAG services / architecture model: from everything deployed at all voluntary sites, to specific aspects deployed to optimize resources. A decoupling of services is underway, which increases the flexibility of the system to deploy resources depending on the evolution of usage.
TAG upload: now automated/triggered, running stably with monitoring. Automated upload for initial reconstruction; controlled by Tier-0; integrated with AMI and DQ2 tools where appropriate; balanced for read and write operations.
TAG Services and Architecture – Components:
- TAG database(s) at CERN and voluntary Tier-1s, Tier-2s
- ELSSI – Event Level Selection Service Interface; TAG usage (ELSSI and file-based) is covered in every software tutorial
- Web services:
  - Extract – dependencies: ATLAS software, AFS maxidisk to hold ROOT files from Extract
  - Skim – ATLAS software, DQ2, Ganga, …; a surge in effort is helping to make TAG jobs grid-enabled
Response times vary: O(sec) for interactive queries (event selection, histograms, …), O(min) for extract, O(hr) for skim.
TAGs by Oracle site (size in GB):
- CERN ATLR (total 404 GB): ATLAS_TAGS_CSC_FDR 22, ATLAS_TAGS_COMM_2009 142, ATLAS_TAGS_USERMIX 18, ATLAS_TAGS_COMM 200, ATLAS_TAGS_ROME 22
- BNL (total 383 GB): ATLAS_TAGS_CSC_FDR 18, ATLAS_TAGS_COMM_2009 105, ATLAS_TAGS_COMM 260
- TRIUMF (total 206 GB): ATLAS_TAGS_CSC_FDR 16, ATLAS_TAGS_COMM 190
- DESY: ATLAS_TAGS_MC 231
- RAL – gearing up …
A respectable level of TAG deployment, which should entertain a wide variety of users (commissioning and physics analysis). TAG upload is now routine for commissioning. Intensive work on the deployment of TAG services is making them increasingly accessible to users (ELSSI).
Summary of Ongoing Efforts
Oracle databases (online, offline, offsite) are a shared resource. Applications are living, breathing, consuming… and evolving systems; developers are coming to understand the global consequences. Cooperation, and read/write control via interfaces, are critical for collective stability.
Hot topics:
- Online operations
- Trying to anticipate modes of user analysis
- Conditions distribution: Frontier deployment; Hammercloud testing of direct and Frontier access
- Data quality and luminosity
- TAG development and resource optimization
BACKUP
Online Database
Coordinator: Giovanna Lehmann (with support from many)
A lot of work finalizing the development of applications over the last few months; many changes enhancing stability for collisions (details on each bullet would fill many slides):
- ATONR (online Oracle RAC): isolation from the GPN
- Devising methods for offline/online data exchange
- Understanding operational reliance on ATONR
- Phasing in the locking of owner accounts
- Applications evolving from development to operations
- Sources of Oracle Streams interruptions
Dynamic (TAG) services composition
The user accesses an “integrated ELSSI”:
- Which data do you want to query? (data09, mc08, …)
- Select metadata criteria
- Select service(s): count, histogram, tabulate, extract, skim
- The system detects the user's identification and location
The system transparently deploys services based on this input: using an internal service catalogue, it configures the appropriate service and executes the operation, optimizing the response using load balancing of the infrastructure.
Anticipated additional features: logging access patterns to learn more about user behaviour and deploy resources to optimize the service; logging good “service combinations” to allow sharing between users.
(Elisabeth Vinek – ATLAS Software & Computing Week, 03.09.2009)
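The composition logic described above is essentially a dispatch table plus load balancing. A hypothetical sketch (the catalogue layout and all host/operation names are invented):

```python
import itertools

# Internal service catalogue: operation -> hosts that can serve it (assumed layout).
CATALOGUE = {
    "count":   ["cern-1", "cern-2"],
    "extract": ["cern-1", "tier1-a"],
    "skim":    ["tier1-a", "tier1-b"],
}
# Simple round-robin load balancing per operation.
_robins = {op: itertools.cycle(hosts) for op, hosts in CATALOGUE.items()}

def dispatch(operation, dataset, criteria):
    """Pick a host for the requested operation and describe the job to run."""
    if operation not in CATALOGUE:
        raise ValueError(f"unknown service: {operation}")
    host = next(_robins[operation])
    return {"host": host, "operation": operation,
            "dataset": dataset, "criteria": criteria}

job = dispatch("count", "data09", {"trigger": "EF_mu10"})
print(job["host"])  # -> cern-1 (first host in the round-robin)
```

The real system would additionally weight the choice by the user's location and by logged access patterns, as the slide anticipates.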
Snapshot of ATLAS 3D replication
Conditions data needed for offline analysis on the grid: 10 Tier-1 sites, using Oracle Streams replication technology.
Where are the POOL files?
DQ2 (DDM) distributes event data files and Conditions POOL files (TWiki: StorageSetUp for T0, T1's and T2's). ADC/DDM maintains the ToA sites (Tiers of ATLAS):
- ToA sites are subscribed to receive DQ2 POOL files
- ToA sites have "space tokens" (areas for file destinations) such as:
  - "DATADISK" for real event data
  - "MCDISK" for simulated event data …
  - "HOTDISK" for holding POOL files needed by many jobs; has more robust hardware for more intense access
Some sites also use Charles Waldman's "pcache": it duplicates files to a scratch disk accessible to local jobs, avoiding network access to "hotdisk". Magic in pcache tells the job to look in the scratch disk first.
Deployment of POOL files to all ToA sites 'on the GRID'? ADC – in progress.
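The pcache idea – check a local scratch copy before touching shared storage – can be sketched as follows (hypothetical paths and function names, not the real pcache code):

```python
import shutil
import tempfile
from pathlib import Path

def open_cached(filename, hotdisk="/hotdisk", scratch="/scratch/pcache"):
    """Return a local path for 'filename', copying from hotdisk only on a miss."""
    local = Path(scratch) / filename
    if not local.exists():                     # cache miss: fetch once
        local.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(hotdisk) / filename, local)
    return local                               # later jobs hit the local copy

# Demo with temporary directories standing in for hotdisk and scratch:
hot = Path(tempfile.mkdtemp())
scr = Path(tempfile.mkdtemp())
(hot / "cond.pool.root").write_text("payload")
p1 = open_cached("cond.pool.root", hot, scr)   # first job: copies the file
p2 = open_cached("cond.pool.root", hot, scr)   # second job: served locally
print(p1 == p2, p1.read_text())  # -> True payload
```

As with Frontier for query results, only the first job at a worker node pays the cost of fetching the file; subsequent jobs read the scratch-disk copy.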
Request: sites create HOTDISK
Email from Stephane Jezequel (Sept 15): "Could you please forward this request to all ATLAS Grid sites which are included in DDM: as discussed during the ATLAS software week, sites are requested to implement the space token ATLASHOTDISK. More information: https://twiki.cern.ch/twiki/bin/view/Atlas/StorageSetUp#The_ATLASHOTDISK_space_token
Sites should assign at least 1 TB to this space token (and should foresee 5 TB). In case of a storage crisis at the site, the 1 TB can be reduced to 0.5 TB. Because of the special usage of these files, sites should decide whether or not to assign a specific pool.
When it is done, please report to DDM Ops (a Savannah ticket is a good solution) to create the new DDM site."
Where are the PFCs (POOL File Catalogs)?
Mario Lassnig has modified the DQ2 client dq2-ls:
- It can create the PFC for the POOL files on a system 'on the fly'; written to work for SRM systems. Systems without SRM cannot subscribe automatically, but can get files via more manual procedures.
- Options in dq2-ls detect the type of system but may not always successfully remove SRM-specific descriptors; it was unclear at the time of the meeting in which cases development is required.
- The DQ2 client continues to evolve/improve with use cases.
Updating the PFC with new POOL files:
- Detect the arrival of a new POOL file
- Generate an updated PFC
- Run the above script if needed, preparing the file for local use
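A POOL File Catalog is an XML file mapping file GUIDs to physical and logical file names. A minimal sketch of generating one (the GUID and paths are invented, and the element layout follows the common POOL XML catalog shape, so treat it as an approximation rather than the exact schema):

```python
import xml.etree.ElementTree as ET

def build_pfc(entries):
    """entries: list of (guid, pfn, lfn) tuples -> POOL-style XML catalog text."""
    root = ET.Element("POOLFILECATALOG")
    for guid, pfn, lfn in entries:
        f = ET.SubElement(root, "File", ID=guid)
        phys = ET.SubElement(f, "physical")
        ET.SubElement(phys, "pfn", filetype="ROOT_All", name=pfn)
        log = ET.SubElement(f, "logical")
        ET.SubElement(log, "lfn", name=lfn)
    return ET.tostring(root, encoding="unicode")

xml_text = build_pfc([
    ("01234567-89AB-CDEF-0123-456789ABCDEF",          # invented GUID
     "/hotdisk/cond/cond09_mc.000042.pool.root",      # invented physical path
     "cond09_mc.000042.pool.root"),                   # invented logical name
])
print(xml_text[:30])
```

Automating exactly this step – regenerate the catalog whenever a new POOL file lands on HOTDISK – is what "PFC update automation" refers to.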
Thanks to many experts in many areas: Richard Hawkings, Andrea Valassi (CERN IT), Hans von der Schmitt, Walter Lampl, Shaun Roe, Paul Laycock, David Front, Slava Khomutnikov, Stefan Schlenker, Saverio D'auria, Joerg Stelzer, Vakho Tsulaia, Thilo Pauly, Marcin Nowak, Yuri Smirnov, Solveig Albrand, Fred Luehring, John DeStefano, Carlos Gamboa, Rod Walker, Bob Ball, Jack Cranshaw, Alessandro Desalvo, Xin Zhao, Mario Lassnig, ADC … (apologies – not a complete list; should delete slide … too incomplete …)
And many, many application developers and subsystem experts who ensure that the data going in is what we want coming out.
Features of Athena
Previous to Release 15.4: Athena (RH) looks at the IP the job is running at and uses dblookup.xml in the release to decide the order of database connections to try to get the Conditions data.
Release 15.4: Athena looks for the Frontier environment variable; if found, it ignores dblookup.xml, using instead another env
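The selection logic can be sketched roughly as follows. This is an illustration of the described behaviour, not Athena code; the variable name FRONTIER_SERVER and the dblookup.xml layout are assumptions based on common usage:

```python
import os
import xml.etree.ElementTree as ET

def conditions_connections(dblookup_path, service_name):
    """Mimic the described logic: prefer the Frontier environment variable;
    otherwise fall back to the connection order listed in dblookup.xml."""
    frontier = os.environ.get("FRONTIER_SERVER")  # assumed variable name
    if frontier:
        return [f"frontier://{frontier}"]
    tree = ET.parse(dblookup_path)
    return [svc.get("name")
            for logical in tree.getroot().iter("logicalservice")
            if logical.get("name") == service_name
            for svc in logical.iter("service")]

# Demo with a minimal, hypothetical dblookup.xml:
with open("dblookup.xml", "w") as f:
    f.write('<servicelist>'
            '<logicalservice name="COOLOFL_TRT">'
            '<service name="oracle://ATLR/ATLAS_COOL" accessMode="read"/>'
            '<service name="sqlite://mini.db" accessMode="read"/>'
            '</logicalservice></servicelist>')
os.environ.pop("FRONTIER_SERVER", None)       # no Frontier configured
print(conditions_connections("dblookup.xml", "COOLOFL_TRT"))
# -> ['oracle://ATLR/ATLAS_COOL', 'sqlite://mini.db']
```

With the environment variable set, the dblookup.xml order is bypassed entirely, which is what lets a site switch its jobs to Frontier without touching the software release.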