Globus Data Replication Services
Ann Chervenak, Robert Schuler
USC Information Sciences Institute
Motivation for Data Replication Services
• Data-intensive applications need higher-level data management services that integrate lower-level Grid functionality
  • Efficient data transfer (GridFTP, RFT)
  • Replica registration and discovery (RLS)
  • Eventually validation of replicas, consistency management, etc.
• Goal is to generalize the custom data management systems developed by several application communities
• Eventually plan to provide a suite of general, configurable, higher-level data management services
• Globus Data Replication Service (DRS) is the first of these services
The Data Replication Service
• Included in the Tech Preview of the GT4.0 release
• Design is based on the publication component of the Lightweight Data Replicator system
  • Developed by Scott Koranda from U. Wisconsin at Milwaukee
• Functionality
  • Replicate a set of files in the Grid on a local site
  • Users identify a set of desired files
  • DRS queries the Replica Location Service to discover current locations of these files
  • Creates local replicas of desired files using the Reliable File Transfer Service
  • Registers new replicas in the Replica Location Service for discovery
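The functionality listed above amounts to a discover, transfer, register loop. The following is a minimal sketch of that loop, assuming a plain dictionary in place of the Replica Location Service and a caller-supplied `copy_file` callable in place of the Reliable File Transfer Service. The function name `replicate`, the example host names, and the logical file names are hypothetical and are not part of the DRS interface.

```python
from typing import Callable, Dict, List, Optional


def replicate(lfns: List[str],
              catalog: Dict[str, List[str]],
              copy_file: Callable[[str, str], None],
              local_prefix: str) -> Dict[str, Optional[str]]:
    """Discover, transfer, and register each requested logical file.

    `catalog` is a stand-in for the replica registry (logical name ->
    list of physical locations). Returns the new local location for each
    logical name, or None when no source replica is known.
    """
    results: Dict[str, Optional[str]] = {}
    for lfn in lfns:
        sources = catalog.get(lfn, [])           # 1. discover
        if not sources:
            results[lfn] = None
            continue
        destination = local_prefix + lfn
        copy_file(sources[0], destination)       # 2. transfer
        catalog[lfn] = sources + [destination]   # 3. register
        results[lfn] = destination
    return results


# Usage with invented names; the lambda only records what would be copied.
catalog = {"run42.dat": ["gsiftp://remote.example.org/data/run42.dat"]}
copied = []
result = replicate(["run42.dat", "missing.dat"], catalog,
                   lambda src, dst: copied.append((src, dst)),
                   "gsiftp://local.example.org/replicas/")
```

The sketch always takes the first listed source; the real service applies a selection algorithm and delegates the copy to RFT.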
Outline
• Terminology
• Functionality of Data Replication Service
• Background: Components used by DRS
  • Replica Location Service
  • GridFTP Data Transport protocol
  • Reliable File Transfer Service
• DRS Design
• Implementation in GT4 environment
• Evaluation of DRS performance in a wide area Grid
• Future work
Some Terminology
• A logical file name (LFN) is a unique identifier for the contents of a file
  • Typically, a scientific collaboration defines and manages the logical namespace
  • Guarantees uniqueness of logical names within that organization
• A physical file name (PFN) is the location of a copy of the file on a storage system
  • The physical namespace is managed by the file system or storage system
• For example, the LIGO environment currently contains:
  • More than six million unique logical files
  • More than 40 million physical files stored at ten sites
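The relationship between the two namespaces is one-to-many: one logical name, many physical copies. A minimal sketch of that data structure follows; the file names and URLs are invented for illustration and do not come from LIGO.

```python
from collections import defaultdict
from typing import DefaultDict, Set

# Logical file name -> set of physical file names (hypothetical data).
replicas: DefaultDict[str, Set[str]] = defaultdict(set)


def add_replica(lfn: str, pfn: str) -> None:
    """Record that `pfn` holds a copy of the contents named by `lfn`."""
    replicas[lfn].add(pfn)


add_replica("H-R-7000.gwf", "gsiftp://site-a.example.org/frames/H-R-7000.gwf")
add_replica("H-R-7000.gwf", "gsiftp://site-b.example.org/archive/H-R-7000.gwf")
add_replica("L-R-7000.gwf", "gsiftp://site-a.example.org/frames/L-R-7000.gwf")
# Registering the same copy twice leaves the set unchanged.
add_replica("L-R-7000.gwf", "gsiftp://site-a.example.org/frames/L-R-7000.gwf")

logical_count = len(replicas)
physical_count = sum(len(pfns) for pfns in replicas.values())
```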
DRS Overview
• Client uses DRS interface to specify which files are required at local site
• DRS uses:
  • Delegation Service to delegate proxy credentials
  • Globus RLS to discover whether replicas exist locally and where they exist in the Grid
  • Selection algorithm to choose among available source replicas
  • Globus Reliable File Transfer service to copy data to local site
    • This uses GridFTP data transport protocol
  • Globus RLS to register new replicas
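The slides do not say which selection algorithm DRS uses. As an illustration only, the sketch below assumes a simple policy: skip files that already have a local replica, otherwise prefer sources on an ordered list of hosts and fall back to the first replica found. The function names and host names are hypothetical.

```python
from typing import List
from urllib.parse import urlparse


def has_local_replica(pfns: List[str], local_host: str) -> bool:
    """True if any physical location is already on the local host."""
    return any(urlparse(pfn).hostname == local_host for pfn in pfns)


def select_source(pfns: List[str], preferred_hosts: List[str]) -> str:
    """Pick a source replica (assumed policy, not the DRS algorithm)."""
    if not pfns:
        raise LookupError("no replicas registered for this file")
    for host in preferred_hosts:
        for pfn in pfns:
            if urlparse(pfn).hostname == host:
                return pfn
    return pfns[0]


pfns = ["gsiftp://far.example.org/d/f1", "gsiftp://near.example.org/d/f1"]
already_local = has_local_replica(pfns, "local.example.org")
chosen = select_source(pfns, ["near.example.org"])
fallback = select_source(pfns, [])
```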
Background: The Replica Location Service
• A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows replica discovery
  • RLS maintains mappings between logical identifiers and target names
  • Must perform and scale well: support hundreds of millions of objects, hundreds of clients
• E.g., Laser Interferometer Gravitational Wave Observatory (LIGO)
  • RLS servers at 8 sites
  • Maintain associations between 6 million logical file names & 40 million physical file locations
RLS Features
• Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings
• Replica Location Index (RLI) nodes aggregate information about one or more LRCs
• LRCs use soft state update mechanisms to inform RLIs about their state: relaxed consistency of index
• Optional compression of state updates reduces communication, CPU and storage overheads
[Figure: Replica Location Indexes (RLIs) and Local Replica Catalogs (LRCs)]
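Soft state means the index holds what each catalog last reported and forgets it unless it is refreshed. A minimal sketch of that idea follows, assuming a fixed timeout and full (uncompressed) updates. The class and method names are hypothetical and do not mirror the RLS API; timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from typing import Dict, Set, Tuple


class ReplicaLocationIndex:
    """Hypothetical RLI: remembers which LRCs reported which logical names."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout = timeout_seconds
        # LRC name -> (logical names it reported, time the update arrived)
        self._state: Dict[str, Tuple[Set[str], float]] = {}

    def soft_state_update(self, lrc: str, lfns: Set[str], now: float) -> None:
        """Replace everything previously known about `lrc`."""
        self._state[lrc] = (set(lfns), now)

    def lrcs_for(self, lfn: str, now: float) -> Set[str]:
        """Return LRCs whose unexpired state lists `lfn`."""
        return {lrc for lrc, (lfns, received) in self._state.items()
                if lfn in lfns and now - received <= self.timeout}


rli = ReplicaLocationIndex(timeout_seconds=30)
rli.soft_state_update("lrc-a", {"f1", "f2"}, now=0)
fresh = rli.lrcs_for("f1", now=10)      # within the timeout
expired = rli.lrcs_for("f1", now=31)    # no refresh arrived in time
rli.soft_state_update("lrc-a", {"f2"}, now=40)   # f1 was deleted at the LRC
after_refresh = rli.lrcs_for("f2", now=41)
```

Between updates the index can be stale in either direction, which is the relaxed consistency the slide refers to.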
Background: GridFTP
• A secure, robust, fast, efficient, standards-based, widely accepted data transfer protocol
• Features:
  • Standard FTP: get/put etc., 3rd-party transfer
  • GSS binding, extended directory listing, simple restart
  • Striped/parallel data channels
  • Partial file transfer
  • TCP buffer setting
  • Progress monitoring, extended restart
• The Globus Toolkit supplies a reference implementation:
  • Server
  • Client tools (globus-url-copy)
  • Development Libraries
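Parallel data channels and partial file transfer both rest on addressing a file by byte range. The sketch below only illustrates that idea by assigning (offset, length) blocks round-robin to a number of streams. It is not the GridFTP wire format, and the function name and parameters are assumptions made for illustration.

```python
from typing import List, Tuple


def split_into_ranges(file_size: int, streams: int,
                      block_size: int) -> List[List[Tuple[int, int]]]:
    """Assign (offset, length) blocks round-robin to `streams` channels."""
    if file_size < 0 or streams < 1 or block_size < 1:
        raise ValueError("sizes must be non-negative and counts positive")
    plan: List[List[Tuple[int, int]]] = [[] for _ in range(streams)]
    offset = 0
    index = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        plan[index % streams].append((offset, length))
        offset += length
        index += 1
    return plan


# A 10-byte file, 2 streams, 4-byte blocks.
plan = split_into_ranges(file_size=10, streams=2, block_size=4)
```

Because every block carries its own offset, blocks can arrive in any order, and a transfer can resume by requesting only the ranges that are still missing.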
Background: Reliable File Transfer Service
• RFT accepts a SOAP description of the transfer
• Writes state to a database
• Uses the Java GridFTP client library to initiate third-party transfers
• Restart markers are stored in the database
  • Allow for restart in the event of RFT failure
• Supports concurrency, i.e., multiple files in transit
• Check status:
  • Subscribe to notifications
  • Poll for status
[Figure: RFT Client and RFT Service exchange SOAP messages and (optional) notifications; control and data channels of the GridFTP transfer]
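Persisting restart markers is what lets a transfer resume after the service itself fails. A minimal sketch of the idea follows, using SQLite from the Python standard library. The table layout and function names are hypothetical and are not RFT's schema.

```python
import sqlite3


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the marker store; schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS marker ("
               "source TEXT, dest TEXT, bytes_done INTEGER, "
               "PRIMARY KEY (source, dest))")
    return db


def save_marker(db: sqlite3.Connection, source: str, dest: str,
                bytes_done: int) -> None:
    """Record how far the transfer has progressed."""
    db.execute("INSERT OR REPLACE INTO marker VALUES (?, ?, ?)",
               (source, dest, bytes_done))
    db.commit()


def resume_offset(db: sqlite3.Connection, source: str, dest: str) -> int:
    """Return the offset to restart from (0 if nothing was recorded)."""
    row = db.execute("SELECT bytes_done FROM marker "
                     "WHERE source = ? AND dest = ?",
                     (source, dest)).fetchone()
    return row[0] if row else 0


db = open_state()
src, dst = "gsiftp://a.example.org/f", "gsiftp://b.example.org/f"
before = resume_offset(db, src, dst)
save_marker(db, src, dst, 4096)
save_marker(db, src, dst, 8192)
after = resume_offset(db, src, dst)
```

With a file path instead of `:memory:`, the marker survives a process restart, which is the property the slide describes.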
DRS Functionality
• Initiate a DRS Request
  • Create a delegated credential
  • Create a Replicator resource
• Monitor Replicator resource
• Discover replicas of desired files in Replica Location Service, select among replicas
• Transfer data to local site with Reliable File Transfer Service
• Register new replicas in RLS catalogs
• Allow client inspection of DRS results
• Destroy Replicator resource
• DRS implemented in Globus Toolkit Version 4, complies with Web Services Resource Framework (WS-RF)
Relationship to Other Globus Services
• At requesting site, deploy:
  • WS-RF Services
    • Data Replication Service
    • Delegation Service
    • Reliable File Transfer Service
  • Pre WS-RF Components
    • Replica Location Service (Local Replica Catalog, Replica Location Index)
    • GridFTP Server
[Figure: Local Site. A Web Service Container hosts the Data Replication Service (Replicator Resource), Reliable File Transfer Service (RFT Resource), and Delegation Service (Delegated Credential), alongside the Local Replica Catalog, Replica Location Index, and GridFTP Server]
WSRF in a Nutshell
• Service
• State Management:
  • Resource
  • Resource Property
• State Identification:
  • Endpoint Reference
• State Interfaces:
  • GetRP, QueryRPs, GetMultipleRPs, SetRP
• Lifetime Interfaces:
  • SetTerminationTime
  • ImmediateDestruction
• Notification Interfaces
  • Subscribe
  • Notify
• ServiceGroups
[Figure: a Service exposes GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy on a Resource and its RPs, addressed by EPRs]
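The pattern above can be sketched in plain Python: a service creates stateful resources, hands back a reference, and exposes property access, notification, and lifetime management. This is an in-memory illustration only. The class names are hypothetical, and the string used as a reference is a simplification of a real endpoint reference, which is a structured XML document.

```python
import itertools
from typing import Any, Callable, Dict, List


class Resource:
    """Hypothetical stateful resource with resource properties (RPs)."""

    def __init__(self, termination_time: float) -> None:
        self.properties: Dict[str, Any] = {}
        self.termination_time = termination_time
        self._subscribers: Dict[str, List[Callable[[str, Any], None]]] = {}

    def get_rp(self, name: str) -> Any:
        return self.properties[name]

    def set_rp(self, name: str, value: Any) -> None:
        self.properties[name] = value
        for callback in self._subscribers.get(name, []):
            callback(name, value)          # notify subscribers of the change

    def subscribe(self, name: str,
                  callback: Callable[[str, Any], None]) -> None:
        self._subscribers.setdefault(name, []).append(callback)


class Service:
    """Hypothetical service that creates resources and returns references."""

    def __init__(self, address: str) -> None:
        self.address = address
        self._resources: Dict[str, Resource] = {}
        self._ids = itertools.count(1)

    def create(self, termination_time: float) -> str:
        epr = f"{self.address}?resource={next(self._ids)}"
        self._resources[epr] = Resource(termination_time)
        return epr

    def resolve(self, epr: str) -> Resource:
        return self._resources[epr]

    def destroy(self, epr: str) -> None:
        del self._resources[epr]

    def sweep(self, now: float) -> List[str]:
        """Destroy resources whose termination time has passed."""
        expired = [epr for epr, res in self._resources.items()
                   if res.termination_time <= now]
        for epr in expired:
            self.destroy(epr)
        return expired


service = Service("https://host.example.org/ReplicationService")
epr = service.create(termination_time=100.0)
seen = []
service.resolve(epr).subscribe("Stage", lambda n, v: seen.append((n, v)))
service.resolve(epr).set_rp("Stage", "discover")
stage = service.resolve(epr).get_rp("Stage")
early = service.sweep(now=50.0)
late = service.sweep(now=100.0)
```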
Create Delegated Credential
• Initialize user proxy cert.
• Create delegated credential resource
• Set termination time
• Credential EPR returned
[Figure: a Client and a Service Container hosting the Delegation, Data Rep., and RFT services with a Credential resource and RP; also shown are the Replica Index, Replica Catalogs, GridFTP Servers, and MDS]
Create Replicator Resource
• Create Replicator resource
• Pass delegated credential EPR
• Set termination time
• Replicator EPR returned
• Access delegated credential resource
Monitor Replicator Resource
• Periodically polls Replicator RP via GetRP or GetMultRP
• Add Replicator resource to MDS Information service Index
• Subscribe to ResourceProperty changes for “Status” RP and “Stage” RP
• Conditions may trigger alerts or other actions (Trigger service not pictured)
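Polling, the first monitoring option above, can be sketched as a bounded loop that reads the "Stage" and "Status" properties together. The function below is a hypothetical client-side helper, not part of the DRS client API. The "Active" status value in the example is an invented placeholder; the slides name only "Finished".

```python
from typing import Callable, Dict, List


def poll_until_finished(
        get_multiple_rps: Callable[[List[str]], Dict[str, str]],
        max_polls: int,
        sleep: Callable[[float], None],
        interval: float = 1.0) -> List[str]:
    """Poll "Stage" and "Status"; return the distinct stages seen in order."""
    stages: List[str] = []
    for _ in range(max_polls):
        rps = get_multiple_rps(["Stage", "Status"])
        if not stages or stages[-1] != rps["Stage"]:
            stages.append(rps["Stage"])
        if rps["Status"] == "Finished":
            return stages
        sleep(interval)
    raise TimeoutError("replication did not finish within max_polls")


# Scripted responses stand in for a real Replicator resource.
snapshots = iter([
    {"Stage": "discover", "Status": "Active"},
    {"Stage": "transfer", "Status": "Active"},
    {"Stage": "transfer", "Status": "Active"},
    {"Stage": "register", "Status": "Finished"},
])


def fake_get_multiple_rps(names: List[str]) -> Dict[str, str]:
    snapshot = next(snapshots)
    return {name: snapshot[name] for name in names}


observed = poll_until_finished(fake_get_multiple_rps, max_polls=10,
                               sleep=lambda seconds: None)
```

Subscribing to property changes avoids the repeated requests, at the cost of the client having to accept incoming notifications.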
Query Replica Information
• Notification of “Stage” RP value changed to “discover”
• Replicator queries RLS Replica Index to find catalogs that contain desired replica information
• Replicator queries RLS Replica Catalog(s) to retrieve mappings from logical name to target name (URL)
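The two-step lookup above can be sketched with dictionaries standing in for the index and the catalogs. The names and URLs are invented for illustration. The sketch also shows why the second step is needed: the index is only loosely consistent, so a catalog it names may no longer hold the mapping.

```python
from typing import Dict, List, Set


def discover(lfn: str,
             index: Dict[str, Set[str]],
             catalogs: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Ask the index which catalogs may know `lfn`, then ask each catalog."""
    targets: List[str] = []
    for catalog_name in sorted(index.get(lfn, set())):
        # A stale index entry simply yields no mappings from that catalog.
        targets.extend(catalogs.get(catalog_name, {}).get(lfn, []))
    return targets


index = {"f1": {"lrc-a", "lrc-b"}}
catalogs = {
    "lrc-a": {"f1": ["gsiftp://a.example.org/data/f1"]},
    "lrc-b": {},   # mapping was removed after the last index update
}
found = discover("f1", index, catalogs)
unknown = discover("f2", index, catalogs)
```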
Transfer Data
• Notification of “Stage” RP value changed to “transfer”
• Create Transfer resource
• Pass credential EPR
• Set termination time
• Transfer resource EPR returned
• Access delegated credential resource
• Set up GridFTP Server transfer of file(s)
• Data transfer between GridFTP Server sites
• Periodically poll “ResultStatus” RP via GetRP
• When “Done”, get state information for each file transfer
Register Replica Information
• Notification of “Stage” RP value changed to “register”
• Replicator registers new file mappings in RLS Replica Catalog
• RLS Replica Catalog sends update of new replica mappings to the Replica Index
Client Inspection of State
• Notification of “Status” RP value changed to “Finished”
• Client inspects Replicator state information for each replication in the request
Resource Termination
• Termination time (set by client) expires eventually
• Resources destroyed (Credential, Transfer, Replicator)
Performance Measurements: Wide Area Testing
• The destination for the pull-based transfers is located in Los Angeles
  • Dual-processor, 1.1 GHz Pentium III workstation with 1.5 GBytes of memory and 1 Gbit Ethernet
  • Runs a GT4 container and deploys services including RFT and DRS as well as GridFTP and RLS
• The remote site where desired data files are stored is located at Argonne National Laboratory in Illinois
  • Dual-processor, 3 GHz Intel Xeon workstation with 2 gigabytes of memory and 1.1 terabytes of disk
  • Runs a GT4 container as well as GridFTP and RLS services
DRS Operations Measured
• Create the DRS Replicator resource
• Discover source files for replication using local RLS Replica Location Index and remote RLS Local Replica Catalogs
• Initiate a Reliable File Transfer operation by creating an RFT resource
• Perform RFT data transfer(s)
• Register the new replicas in the RLS Local Replica Catalog
Experiment 1: Replicate 10 Files of Size 1 Gigabyte
Component of Operation      Time (milliseconds)
Create Replicator Resource                317.0
Discover Files in RLS                     449.0
Create RFT Resource                       808.6
Transfer Using RFT                    1186796.0
Register Replicas in RLS                 3720.8
• Data transfer time dominates
• Wide area data transfer rate of 67.4 Mbits/sec
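The quoted rate follows from the transfer time in the table. The short calculation below reproduces it, assuming decimal units (1 gigabyte = 10^9 bytes, 1 Mbit = 10^6 bits); the function name is chosen here for illustration.

```python
def transfer_rate_mbits(total_bytes: float, elapsed_ms: float) -> float:
    """Average rate in Mbits/sec for `total_bytes` moved in `elapsed_ms`."""
    return total_bytes * 8 / (elapsed_ms / 1000.0) / 1e6


# Experiment 1: 10 files of 1 gigabyte, 1186796.0 ms in "Transfer Using RFT".
rate = transfer_rate_mbits(10 * 1e9, 1186796.0)
print(f"{rate:.1f} Mbits/sec")   # prints 67.4 Mbits/sec
```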
Experiment 2: Replicate 1000 Files of Size 10 Megabytes
Component of Operation      Time (milliseconds)
Create Replicator Resource               1561.0
Discover Files in RLS                       9.8
Create RFT Resource                      1286.6
Transfer Using RFT                     963456.0
Register Replicas in RLS                11278.2
• Time to create Replicator and RFT resources is larger
  • Need to store state for 1000 outstanding transfers
• Data transfer time still dominates
• Wide area data transfer rate of 85 Mbits/sec
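How strongly data transfer dominates can be read off the two tables by computing its share of the summed component times. The calculation below uses only the measured values from the slides; the dictionary keys and the function name are chosen here for illustration.

```python
from typing import Dict

experiment_1: Dict[str, float] = {      # 10 files of 1 gigabyte, times in ms
    "create_replicator": 317.0,
    "discover": 449.0,
    "create_rft": 808.6,
    "transfer": 1186796.0,
    "register": 3720.8,
}
experiment_2: Dict[str, float] = {      # 1000 files of 10 megabytes, times in ms
    "create_replicator": 1561.0,
    "discover": 9.8,
    "create_rft": 1286.6,
    "transfer": 963456.0,
    "register": 11278.2,
}


def transfer_share(times_ms: Dict[str, float]) -> float:
    """Fraction of the summed component time spent in the RFT transfer."""
    return times_ms["transfer"] / sum(times_ms.values())


share_1 = transfer_share(experiment_1)
share_2 = transfer_share(experiment_2)
print(f"Experiment 1: {share_1:.1%}")   # prints Experiment 1: 99.6%
print(f"Experiment 2: {share_2:.1%}")   # prints Experiment 2: 98.6%
```

The share drops slightly in the second experiment because resource creation and registration grow with the number of files.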
Future Work
• Continued performance testing of DRS:
  • Increasing the size of the files being transferred
  • Increasing the number of files per DRS request
• Add and refine DRS functionality as needed by GEON and other applications
  • E.g., add a push-based replication capability
• Add fine-grained authorization capability to RLS, DRS
• Long-term: Will develop a suite of general, configurable, composable, high-level data management services