Computational grids and grid projects
DSS, 4.4.2005
[email protected] (kiv.zcu.cz)
Content
- Grid computing (terminology)
- EGEE grid elements, how it works
- Gilda testbed (example of a simple job)
- Grid projects
Grid computing
- model for solving massive computational problems
- uses otherwise unused resources (CPU cycles, disk storage, ...)
- supports computation across administrative domains
  – unlike traditional clusters
- creates a "virtual cluster" embedded in the network infrastructure
- multi-user environment
- issue of authorization: remote users are allowed to control computing resources
Grid computing - resources
- sharing heterogeneous resources
  – different platforms
  – hw / sw architectures
  – computer languages
- located in different places
  – different administrative domains
  – connected through the network
- virtualizing computing resources
Grid vs. cluster
- grids: heterogeneous
  – can use ordinary desktops as well
- cluster: homogeneous
  – located in data centres
- Grids are built from Computing Elements (CE)
- A cluster can act as a CE of the whole grid system
Global Grid Forum
- GGF: defines specifications for grid computing
- Globus Alliance: implements the standards in GT
- Globus Toolkit (GT): middleware on which grid services are built; de facto standard; only one part of the grid
Globus – implemented services
- Resource management
  – GRAM (Grid Resource Allocation Management)
- Information services
  – MDS (Monitoring and Discovery Services)
- Security services
  – GSI (Grid Security Infrastructure)
- Data movement and management
  – GridFTP, GASS (Global Access to Secondary Storage)
EGEE grid components
- UI (User Interface)
  – user access to the computational grid
  – log on, start jobs, get information about the state of jobs
  – information about free resources
  – management of the user's data
- CE (Computing Element)
  – receives jobs for a given cluster or farm (homogeneous)
  – provides information about computational power and installed software
  – passes the jobs to the local job management system, LJMS (PBS, LSF, NQE, LoadLeveler, Condor); the LJMS later sends the job to the worker nodes
EGEE grid components II.
- SE (Storage Element)
  – interface for storing user data inside the grid
  – access to the files
  – replication of files
  – a file is registered inside the grid under an internal name (independent of the file name and the location)
- RC (Replica Catalog), RLS (Replica Location Server)
  – information about file replicas, selection of the appropriate replica
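The idea of an internal file name that is independent of location can be illustrated with a small sketch. This is not the RLS interface: the class `ReplicaCatalog`, the `lfn:` name prefix, the host names and the "prefer a replica in my own domain" rule are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Toy catalog mapping an internal (logical) file name to its replicas."""

    # logical name -> list of (storage element host, physical path)
    replicas: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def register(self, logical_name: str, se_host: str, path: str) -> None:
        """Record that a copy of the file exists on the given SE."""
        self.replicas.setdefault(logical_name, []).append((se_host, path))

    def select(self, logical_name: str, preferred_domain: str) -> tuple[str, str]:
        """Pick a replica, preferring an SE whose host ends with the given domain."""
        candidates = self.replicas[logical_name]
        for se_host, path in candidates:
            if se_host.endswith(preferred_domain):
                return se_host, path
        # No replica in the preferred domain: fall back to the first one registered.
        return candidates[0]


# Hypothetical hosts and paths; the same logical name has two physical copies.
catalog = ReplicaCatalog()
catalog.register("lfn:dataset-01", "se1.example.org", "/storage/a/dataset-01")
catalog.register("lfn:dataset-01", "se2.example.cz", "/data/grid/dataset-01")
print(catalog.select("lfn:dataset-01", ".cz"))
```

The user only ever refers to `lfn:dataset-01`; which physical copy is read is decided at lookup time.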
EGEE grid components III.
- WN (Worker Nodes)
  – computation nodes, where the computation actually runs
  – have access to the application software (mounted from a server)
  – can manipulate data stored on the SE
  – accessible only from the CE, not from the whole environment
EGEE grid components IV.
- IS (Information Service)
  – state information about grid elements (CE, SE, ...)
  – monitoring of the state of jobs
- RB (Resource Broker)
  – scheduler; finds suitable resources for the job requirements
  – distributes jobs to the CEs, sending JDL (Job Description Language)
  – uses the IS for its decisions
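The broker's decision can be sketched as a filter over the state published by the IS. The record fields, the CE names and the ranking rule (most free CPUs wins) are assumptions for illustration; the real broker matches on the requirements and rank expressions of the job description.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    """State of one CE as it might be published by the IS (hypothetical fields)."""
    name: str
    free_cpus: int
    installed_software: frozenset


def choose_ce(information_service, required_cpus, required_software):
    """Return the suitable CE with the most free CPUs, or None if none fits."""
    suitable = [
        ce for ce in information_service
        if ce.free_cpus >= required_cpus
        and required_software <= ce.installed_software  # subset test
    ]
    if not suitable:
        return None
    return max(suitable, key=lambda ce: ce.free_cpus)


# Hypothetical snapshot of the information service.
information_service = [
    ComputingElement("ce-a.example.org", 4, frozenset({"gcc"})),
    ComputingElement("ce-b.example.org", 16, frozenset({"gcc", "root"})),
    ComputingElement("ce-c.example.org", 32, frozenset({"gcc"})),
]

chosen = choose_ce(information_service, required_cpus=8, required_software={"root"})
print(chosen.name if chosen else "no suitable CE")
```

Here `ce-c` has the most free CPUs but lacks the required software, so the job goes to `ce-b`.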
[Figure: GILDA testbed diagram. Student terminals enter the grid via the UI (PKI X.509 certificate keys, JDL files); other elements shown: RB, CE, SE, RLS and several WN.]
How it all works together – step by step
- The user connects to the UI
  – a time-limited proxy certificate is created
- The user defines the computational job and passes it to the resource broker
  – by means of a JDL file
  – the JDL file may contain some input data (for more datasets: SE)
- The resource broker talks to the IS and finds a suitable CE
- The resource broker creates the job and sends it to the CE
How it all works together II.
- The CE receives the job and sends it to the local job management system
- The job runs on the WN (worker nodes)
  – when using larger datasets: copy the data from the SE
  – new large output data: copy to the SE, registered with the RLS (Replica Location Server)
- At the end of the job, the output (stdout, stderr) is copied back to the RB
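The data movement in the worker-node phase can be sketched as a simulation. The dictionaries standing in for the SE and the RLS, the `lfn:` names, the host `se.example.org` and the "computation" itself are hypothetical; only the order of the steps follows the list above.

```python
def run_on_worker_node(job, storage_element, replica_index):
    """Simulate the WN phase: stage in, compute, stage out, return small output."""
    # Stage in: larger input datasets are copied from the SE.
    inputs = [storage_element[name] for name in job["input_datasets"]]

    # Stand-in for the real computation: total size of all inputs.
    result = str(sum(len(data) for data in inputs))

    # Stage out: large output goes to the SE and is registered with the RLS.
    output_name = job["output_dataset"]
    storage_element[output_name] = result.encode()
    replica_index.setdefault(output_name, []).append("se.example.org")

    # Small output (stdout, stderr) is what travels back to the RB.
    return {"stdout": f"wrote {output_name}\n", "stderr": ""}


# Hypothetical SE contents and an empty replica index.
storage_element = {"lfn:input-a": b"abc", "lfn:input-b": b"de"}
replica_index = {}
job = {
    "input_datasets": ["lfn:input-a", "lfn:input-b"],
    "output_dataset": "lfn:result",
}

sandbox = run_on_worker_node(job, storage_element, replica_index)
print(sandbox["stdout"], end="")
```

The split matters in practice: bulk data stays on storage elements, and only the small standard streams are shipped back through the broker.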
How to try it and participate
- Genius portal: access to the grid
- Gilda
  – demo applications
  – latest versions of the middleware software
- https://grid-demo.ct.infn.it/
Example – hostname.jdl
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
RetryCount = 7;
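A job description like the one above is a list of `Attribute = value;` statements, so its structure is easy to inspect programmatically. The sketch below is not an official JDL parser: `parse_jdl` and `parse_value` are hypothetical helpers that handle only the three value shapes in this example (quoted string, `{...}` list of strings, integer) and assume no semicolons inside values.

```python
import re

# The attributes of hostname.jdl, copied from the example.
JDL_TEXT = '''
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
RetryCount = 7;
'''


def parse_value(raw):
    """Convert one value: {list of strings}, quoted string, or integer."""
    raw = raw.strip()
    if raw.startswith("{") and raw.endswith("}"):
        return re.findall(r'"([^"]*)"', raw)
    if raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]
    return int(raw)


def parse_jdl(text):
    """Parse `Name = value;` statements into a dict (simple cases only)."""
    attributes = {}
    for match in re.finditer(r"(\w+)\s*=\s*(.*?);", text, flags=re.DOTALL):
        attributes[match.group(1)] = parse_value(match.group(2))
    return attributes


job = parse_jdl(JDL_TEXT)
print(job["Executable"], job["Arguments"])
print(job["OutputSandbox"])
```

The parsed form makes the intent of the file visible: run `/bin/hostname -f`, capture both standard streams into files, and return those two files in the output sandbox.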
Example – log after job submission
Let the GILDA Resource Broker choose
Selected Virtual Organisation name (from UI conf file): gilda
Connecting to host grid004.ct.infn.it, port 7772
Logging to host grid004.ct.infn.it, port 9002
================================ edg-job-submit Success =====================================
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:
- https://grid004.ct.infn.it:9000/YWwYrwIircPajba_1pAdeg
The edg_jobId has been saved in the following file:
/home/demo03/.genius/.tmp_submittedjob_demo03
==================================================
Example – job queue
- The status of the job can be checked in the job queue
  – ready
  – scheduled
  – running
  – done (Get Output is available)
  – cleared (after Get Output)
- Output
  – hostname.err 0
  – hostname.out.txt 24
- hostname.out.txt
  – testbed010.cnaf.infn.it {Heureka! We got it!}
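The sequence of states in the job queue can be modelled as a small state machine. This sketch assumes a strictly linear progression through the five states listed above; failure or abort states, which real middleware also has, are left out, and the names `JobState`, `advance` and `can_get_output` are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    READY = "ready"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    DONE = "done"
    CLEARED = "cleared"


# Allowed forward transitions, in the order listed in the job queue.
NEXT_STATE = {
    JobState.READY: JobState.SCHEDULED,
    JobState.SCHEDULED: JobState.RUNNING,
    JobState.RUNNING: JobState.DONE,
    JobState.DONE: JobState.CLEARED,  # happens once the output has been fetched
}


def advance(state):
    """Move a job to its next state; CLEARED is final."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is a final state")
    return NEXT_STATE[state]


def can_get_output(state):
    """Output can be retrieved only while the job is in the done state."""
    return state is JobState.DONE


state = JobState.READY
history = [state.value]
while state in NEXT_STATE:
    state = advance(state)
    history.append(state.value)
print(" -> ".join(history))
```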
Grid Projects
- EGEE (Enabling Grids for E-sciencE)
  – connects European grids, creates a production grid
  – started on 1 April 2004
  – 70 partners (EU, USA, Russia)
  – 7 federations (the Czech Rep. belongs to the Central Europe federation)
  – CERN: a federation by itself
  – CESNET: scheduling and state-monitoring part of the middleware
Project Geneva
- CoreGrid, Akogrimo, DataMiningGrid
- GridCoord, HPC4U, IntelliGrid
- K-WF Grid, NextGrid, OntoGrid
- Provenance, SIMDAT, UniGridS
Literature, Materials
- Wikipedia
- http://egee.cesnet.cz
- http://www.globus.org