ServillaBEAM20041214GridIntro
  • 8/7/2019 ServillaBEAM20041214GridIntro

    An Introduction to Grid Computing

    BEAM Workshop

    December 2004

    Mark Servilla

    [email protected]

    LTER Network Office

    SEEK-BEAM Workshop Dec 2004 2

    Presentation Agenda

    Definitions

    Evolution of the Grid

    Characteristics

    Computing Model

    Protocols

    Examples

    References

    Definitions of a Grid

    a network of conductors for distribution of electric power; also: a network of radio or television stations -Merriam-Webster

    the illusion of a simple yet large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources -IBM Redbooks

    Grid Computing enables virtual organizations to share geographically distributed resources as they pursue common goals, assuming the absence of central location, central control, omniscience, and an existing trust relationship. -Globus Alliance

    The Web provides us information; the grid allows us to process it. -Ahmar Abbas

    The Evolution of Grid Technology

    High-Performance Computing

    Cluster Computing

    Peer-to-Peer Computing

    Internet Computing

    High-Performance Computing

    Traditionally known as super-computing

    Specialized for parallel processing algorithms

    Shared equally among academia, research, and commercial sectors

    Cluster Computing

    Originated in 1994 with the Beowulf cluster at NASA

    High-performance

    Massively-parallel (2 to 1000 nodes)

    Commodity hardware (Intel, AMD)

    Low-cost software (Linux, FreeBSD)

    Interconnected via high-speed private networks

    Shared storage (SAN/NAS)

    AMD Athlon cluster at University of Heidelberg, Germany: 825 Gflops, 35th fastest high-performance computer in the world

    [Figure: Cluster Computing]

    Peer-to-Peer Computing

    Primarily used for distributed storage and file-sharing

    Early models (rcp, scp, ftp)
      Restricted to LANs, or limited to known peers

    Internet-based models
      Centralized (Napster, Kazaa*)
      Decentralized (Gnutella)

    *100,000,000 downloads by 2004; 2 million new downloads a week

    Centralized Peer-to-Peer

    [Figure: diagram of peers issuing queries (?) for .mp3 files]

    Decentralized Peer-to-Peer

    [Figure: diagram of peers passing queries (?) to one another for .mp3 files]
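
The difference between the centralized and decentralized models can be sketched in a few lines. This is only an illustration and not the protocol of Napster, Kazaa, or Gnutella: the peer names, file names, neighbor links, hop limit, and the functions `central_lookup` and `flood_lookup` are all assumptions made for the example.

```python
from collections import deque

# Hypothetical peers and the files each one holds.
FILES = {
    "peer_a": {"song1.mp3"},
    "peer_b": {"song2.mp3"},
    "peer_c": {"song3.mp3"},
    "peer_d": {"song2.mp3", "song4.mp3"},
}

# Hypothetical neighbor links used only in the decentralized case.
NEIGHBORS = {
    "peer_a": ["peer_b"],
    "peer_b": ["peer_a", "peer_c"],
    "peer_c": ["peer_b", "peer_d"],
    "peer_d": ["peer_c"],
}


def build_central_index(files):
    """Centralized model: one server maps each file name to the peers holding it."""
    index = {}
    for peer, names in files.items():
        for name in names:
            index.setdefault(name, set()).add(peer)
    return index


def central_lookup(index, name):
    """A single query to the central index returns every known holder."""
    return sorted(index.get(name, set()))


def flood_lookup(start, name, max_hops):
    """Decentralized model: the query spreads from neighbor to neighbor."""
    found = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        if name in FILES[peer]:
            found.append(peer)
        if hops < max_hops:
            for neighbor in NEIGHBORS[peer]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, hops + 1))
    return sorted(found)


index = build_central_index(FILES)
central_hits = central_lookup(index, "song2.mp3")
near_hits = flood_lookup("peer_a", "song2.mp3", max_hops=1)
far_hits = flood_lookup("peer_a", "song2.mp3", max_hops=3)
```

In this sketch the central index sees every holder at once, while the flooded query only finds holders within its hop limit.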

    Internet Computing

    Volunteer or philanthropic computing; utilizes personal desktop computers connected to the Internet

    Desktop computers are idle approximately 95% of their lifespan

    Divide-and-conquer approach
      Tasks broken into smaller subtasks
      Desktop executes subtasks during idle time
      Desktop sends data back to central server, which aggregates results
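
The split, execute, and aggregate steps can be sketched as follows. The workload (a sum of squares), the chunk size, and the function names are assumptions chosen for illustration; in a real project each subtask would be shipped to a different desktop rather than run in a local loop.

```python
def split_task(numbers, chunk_size):
    """Central server: break the task into smaller subtasks."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def run_subtask(chunk):
    """Desktop: the work done during idle time (here, a sum of squares)."""
    return sum(n * n for n in chunk)


def aggregate(partials):
    """Central server: combine the results sent back by the desktops."""
    return sum(partials)


task = list(range(1, 101))
subtasks = split_task(task, chunk_size=10)
# Each subtask is independent, so this loop stands in for many desktops.
partials = [run_subtask(chunk) for chunk in subtasks]
total = aggregate(partials)
```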

    Synthesis entre Grid

    High-performance computing
      pioneered the use of parallel algorithms

    Cluster computing
      demonstrated the nature of shared computing and storage
      load balancing protocols

    Peer-to-peer computing
      distributed storage resource with no central authority

    Internet computing
      geographically distributed virtual organization
      fabric of the project vanishes with completion of the task

    Grid Characteristics

    Resources that
      are connected via a network
      are geographically distributed
      may consist of heterogeneous hardware and/or software
      are managed transparently for performance and fault tolerance

    Creates the illusion of virtual organizations and projects
      without the presence of a central authority or a central control

    Explicit trust relationships between users and resources

    A system that scales in space and time

    Types of Resources

    Computation
      utilization of computing cycles found on processors of the machines on the grid

    Storage
      to increase capacity, performance, sharing, and reliability of data

    Communication
      to increase capacity, performance, and reliability of data communication

    Collaboration
      tools to facilitate collaboration through conferencing, visualization, and data sharing

    Software and Licenses
      to share site-specific software and/or licenses

    Special equipment, capacities, architectures, and policies
      printers, imaging, sensors, or other local specialty resources

    [Figure: Grid Ingredients]

    Grid Topologies

    Departmental Grids
      localized to a specific group of people
      generally, same hardware and software
      designed for high throughput and high performance over a dedicated network

    Enterprise Grids
      service to numerous groups within a single company or campus
      resource heterogeneity increases
      company-wide local area network

    Extraprise Grids
      service to multiple companies, partners, and customers within a particular domain
      domain-based private network

    Global Grids
      established over the public Internet

    Resource-based Grids

    Compute Grids
      desktop nodes
      server nodes
      high-performance computing clusters

    Data Grids
      performance-based distributed storage
      replication for fault-tolerance

    Collaboration Grids
      support for video-conferencing, visualization, and data sharing

    Utility Grids
      maintained and managed by a commercial service provider
      compute resources acquired on a per-need basis
      application resources that are purchased on a per-use or per-minute basis

    Application Characteristics

    Optimized for parallel execution
      Perfect Parallelism: computations run autonomously (Monte Carlo simulations)
      Data Parallelism: operations performed on data simultaneously (db searches)
      Functional Parallelism: multiple operations are performed simultaneously

    Not capable of parallel computation
      Fibonacci Series (1, 1, 2, 3, 5, 8, 13, 21, ...)
      F(k+2) = F(k+1) + F(k)
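
The contrast between the two cases can be sketched in code. The Monte Carlo estimate of pi is an assumed example of perfect parallelism (the slide names Monte Carlo simulations only in general), and the seeds, sample counts, and function names are illustrative. The Fibonacci loop follows the recurrence above, where each term depends on the two before it.

```python
import math
import random


def estimate_batch(seed, samples):
    """One independent batch: count random points inside the unit quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(seeds, samples_per_batch):
    # No batch reads another batch's result, so each call could run on a
    # different node; only the final sum needs the partial counts together.
    hits = sum(estimate_batch(seed, samples_per_batch) for seed in seeds)
    return 4.0 * hits / (len(seeds) * samples_per_batch)


def fibonacci(count):
    """F(k+2) = F(k+1) + F(k): each term is computed from the previous two, in order."""
    terms = []
    a, b = 1, 1
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b
    return terms


pi_estimate = estimate_pi(seeds=[1, 2, 3, 4], samples_per_batch=20000)
series = fibonacci(8)
```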

    Questions to ask when thinking Grid

    Identity and Authentication: Is this user who he says he is? Is this program the right program?

    Authorization and Policy: What can the user do on the grid? What can the application do on the grid? What resources are the user and/or application allowed to access?

    Resource Discovery: Where are the resources?

    Resource Characterization: What types of resources are available?

    Resource Allocation: What policy is applied when assigning the resources? What is the actual process of assigning the resources? Who gets how much?

    Resource Management: Which resource can be used at what time and for what purpose?

    Accounting/Billing/Service Level Agreement (SLA): How much of the resources is being used? What is the rating schedule? What is the SLA?

    Security: How do I make sure that this is done securely? How do we know if we have been compromised? What steps are taken once a security breach is detected?

    A Grid Computing Model (the Globus view)

    Software stack consisting of
      Standards
      Protocols
      APIs and SDKs

    Loosely based on the Internet model

    Grid Protocols

    Grid Security Infrastructure (GSI)

    Grid Resource Allocation and Management (GRAM)

    Grid File Transfer Protocol (GridFTP)

    Grid Information Services (GIS)

    Grid Security Infrastructure

    Extended from SSL/TLS and X.509 protocols

    Utilizes PKI for Certificate Authority

    Primary objective is Authorization
      Generates primary credential
      Generates temporary proxy credential

    Certificate Authority
      Positively identify entities requesting certificates
      Issuing, removing, and archiving certificates
      Protecting the Certificate Authority server
      Maintaining a namespace of unique names for certificate owners
      Serve signed certificates to those needing to authenticate entities
      Logging activity

    Public Key Infrastructure

    User A
      1. User A encrypts message with his private key
      2. Obtains User B's public key from CA
      3. Encrypts message with B's public key
      4. Sends message

    User B
      1. User B decrypts message with his private key
      2. Obtains User A's public key from CA
      3. Decrypts A's message with public key
      4. B knows message is from A

    [Figure: Users A and B, each with a private key; the Certificate Authority holds the public keys and provides B's public key and A's public key. Labels: Keys, Authentication, Credential]
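
The eight steps above can be traced with a toy example. This sketch uses textbook RSA with tiny primes, which is insecure and is not how GSI or any real library implements the exchange; the primes, exponents, the `directory` dictionary standing in for the Certificate Authority, and the helper names are all assumptions. B's modulus is deliberately larger than A's so the value A produces in step 1 fits under B's key in step 3.

```python
def make_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse; requires Python 3.8+
    return (e, n), (d, n)  # (public key, private key)


def apply_key(value, key):
    """Encrypt or decrypt: both are modular exponentiation in textbook RSA."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


a_public, a_private = make_keypair(61, 53, 17)
b_public, b_private = make_keypair(89, 97, 5)

# Stand-in for the Certificate Authority: it hands out public keys by name.
directory = {"A": a_public, "B": b_public}

message = 65

# User A: encrypt with own private key, then with B's public key, then send.
signed = apply_key(message, a_private)
sealed = apply_key(signed, directory["B"])

# User B: decrypt with own private key, then with A's public key.
opened = apply_key(sealed, b_private)
recovered = apply_key(opened, directory["A"])
```

Only B's private key undoes the outer layer, and only A's public key undoes the inner one, which is why B can conclude the message came from A.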

    [Figure: Grid Security Infrastructure]

    Grid Resource Allocation and Management

    Allows programs to be started on remote resources

    Resource Specification Language (RSL)
      Resource requirements: machine type, number of nodes, memory, etc.
      Job configuration: directory, executable, arguments, environment

    Communication protocols
      HTTP-based RPC (early protocol)
      Web services (WSDL, SOAP)

    "create 5-10 instances of myprog, each on a machine with at least 64MB memory that is available to me for 4 hours, or 10 instances, on a machine with at least 32MB of memory"
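
The quoted request describes two alternatives that are tried in order. The sketch below models that logic in plain Python; it is not RSL syntax and not the GRAM API. The `Machine` and `Requirement` classes, the machine inventory, and the matching rules are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    memory_mb: int
    hours_available: float


@dataclass(frozen=True)
class Requirement:
    min_count: int
    max_count: int
    min_memory_mb: int
    min_hours: float = 0.0


def allocate(req, machines):
    """Return up to max_count eligible machines, or None if fewer than min_count qualify."""
    eligible = [
        m for m in machines
        if m.memory_mb >= req.min_memory_mb and m.hours_available >= req.min_hours
    ]
    if len(eligible) < req.min_count:
        return None
    return eligible[:req.max_count]


def allocate_first(alternatives, machines):
    """Try each alternative in order and use the first one that can be satisfied."""
    for req in alternatives:
        chosen = allocate(req, machines)
        if chosen is not None:
            return chosen
    return None


# "5-10 instances ... at least 64MB ... for 4 hours, or 10 instances ... at least 32MB"
request = [
    Requirement(min_count=5, max_count=10, min_memory_mb=64, min_hours=4),
    Requirement(min_count=10, max_count=10, min_memory_mb=32),
]

# Hypothetical inventory: three large machines and eight small ones.
machines = [Machine(f"big{i}", 64, 5.0) for i in range(3)]
machines += [Machine(f"small{i}", 32, 1.0) for i in range(8)]

placement = allocate_first(request, machines)
```

With this inventory the first alternative cannot be met (only three machines have 64MB), so the request falls back to ten machines with at least 32MB.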

    Grid File Transfer Protocol

    Providing high-speed and reliable transfer of large volume data (petabytes)

    Extension of standard FTP to include

    striped/parallel data channels

    partial files

    automatic and manual TCP buffer size settings

    progress monitoring

    extended restart functionality
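
The idea behind parallel data channels and partial-file transfer can be sketched without any network code. This is not the GridFTP protocol or a Globus API: the in-memory `source` bytes, the thread pool standing in for data channels, and the function names are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(size, channels):
    """Split [0, size) into at most `channels` contiguous (start, end) ranges."""
    if size == 0:
        return []
    step = -(-size // channels)  # ceiling division
    return [(start, min(start + step, size)) for start in range(0, size, step)]


def fetch_range(source, start, end):
    """Stand-in for one data channel transferring part of the file."""
    return start, source[start:end]


def parallel_transfer(source, channels):
    """Copy `source` by moving several byte ranges at once and reassembling them."""
    target = bytearray(len(source))
    with ThreadPoolExecutor(max_workers=channels) as pool:
        futures = [
            pool.submit(fetch_range, source, start, end)
            for start, end in byte_ranges(len(source), channels)
        ]
        for future in futures:
            start, block = future.result()
            # Each block carries its offset, so arrival order does not matter.
            target[start:start + len(block)] = block
    return bytes(target)


source = bytes(range(256)) * 40
copy = parallel_transfer(source, channels=4)
```

Because every block is placed by offset, a failed range could in principle be fetched again on its own, which is the intuition behind partial files and restart.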

    Grid Information Services

    Grid Resource Information Service (GRIS)
      provides resource specific information

    Grid Resource Registration (GRR)
      updates GRIS about resource status

    Grid Index Information Service (GIIS)
      an aggregate directory service
      provides a collection of information that has been gathered from multiple GRIS servers

    Grid Resource Inquiry (GRI)
      queries GRIS server for resource information
      queries GIIS server for information
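
The relationship between the four pieces can be sketched with two small classes. The class names borrow the acronyms above, but the methods, site names, and status strings are assumptions for illustration and do not reflect the actual Globus interfaces.

```python
class GRIS:
    """Holds information about the resources at one site."""

    def __init__(self, site):
        self.site = site
        self.resources = {}

    def register(self, name, status):
        # Registration: record or update the status of a resource.
        self.resources[name] = status

    def inquire(self, name):
        # Inquiry against a single site.
        return self.resources.get(name)


class GIIS:
    """Aggregate directory built from several GRIS servers."""

    def __init__(self, servers):
        self.servers = servers

    def inquire(self, name):
        # Inquiry across sites: collect the status from every GRIS that knows the name.
        return {s.site: s.resources[name] for s in self.servers if name in s.resources}


site_one = GRIS("site_one")
site_two = GRIS("site_two")
site_one.register("cluster", "idle")
site_two.register("cluster", "busy")
site_two.register("storage", "online")

index = GIIS([site_one, site_two])
cluster_view = index.inquire("cluster")
storage_view = index.inquire("storage")
```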

    Open Grid Services Architecture

    Marriage of grid protocols with web service protocols

    Specifications for
      How Grid Services are created and discovered
      How Grid Service instances are named and referenced
      Interfaces that define any Grid Service

    Initial release with GT 3.0 mid-2003; GT 4.0 Jan 2005

    Grid Examples

    Network for Earthquake Engineering and Simulation (NEESGrid)

    Biomedical Informatics Research Network (BIRN)

    EcoGrid

    NEESGrid (Network for Earthquake Engineering and Simulation)

    Linking scientists and facilities
      observation of an experiment in progress
      observation before and after an experiment
      remote operation of an experiment

    Linking facilities and data
      hybrid operation of physical simulations with other simulations, both physical and numerical
      automatic archiving of raw data, calibration data, and processed data

    Linking scientists and data
      collaborative views (static) of time synchronized data visualizations
      collaborative views of time synchronized data visualizations with video and audio recordings

    Linking scientists and other scientists
      synchronous communication, such as with colleagues during an experiment
      asynchronous communication, such as with colleagues over the course of preparing a publication resulting from an experiment

    [Figure: NEESGrid (Network for Earthquake Engineering and Simulation)]

    NEESGrid (Network for Earthquake Engineering and Simulation)

    [Figure: Network Architecture Diagram. Labels: Shake table with instrumentation; Data General; POP; Local computers & storage; NEES Equipment Site; Edge Router; Wide Area Network; Equipment Site; User Site; NEESgrid Operations; Resource Site; Gigabit Ethernet; > Gb/s WAN]

    BIRN (Biomedical Informatics Research Network)

    Testbed for a biomedical knowledge infrastructure

    Federated database of neuro-imaging data
    Fusion of diverse data sources (location; level of aggregation)
    Grid access to computational resources
    Data mining software
    Scalable and extensible
    Driven by research needs, not technology-pull or technology-push

    [Figure: BIRN (Biomedical Informatics Research Network)]

    EcoGrid

    Metadata standardization
      Ecological Metadata Language (EML)

    Integrate diverse data networks from ecology, biodiversity, and environmental sciences

    Standardized interfaces to data resources
      Metacat
      SRB
      DiGIR
      Xanthoria

    Metadata-mediated data access (application-based)
      Supports multiple metadata standards
      EML, Darwin Core as foci

    Computational services
      Pre-defined analytical services
      On-the-fly analytical services

    EcoGrid

    *EML facilitates semi-automatic data binding

    [Figure: EcoGrid]

    Grid Organizations

    Globus Alliance
      Globus Toolkit(TM): reference implementation of the grid architecture and grid protocols
      http://www.globus.org

    NSF Middleware Initiative (NMI)
      Supports the design, development, testing, and deployment of middleware for HPC
      http://www.nsf-middleware.org

    GRIDS Center
      Grid Research Integration Deployment and Support Center, part of NMI
      http://www.grids-center.org

    Global Grid Forum
      Main standards body governing the world-wide grid community
      http://www.globalgridforum.org

    Recommended Texts

    Grid Computing: A Practical Guide to Technology and Applications. Ahmar Abbas. Charles River Media, 2004.

    Introduction to Grid Computing with Globus. Luis Ferreira et al. IBM Redbooks, 2004.

    Enabling Applications for Grid Computing with Globus. Bart Jacob et al. IBM Redbooks, 2003.

    Grid Services Programming and Application Enablement. Luis Ferreira et al. IBM Redbooks, 2004.