Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Date post: 14-Jan-2016
Description: Constructing Web Portals for Computational Communities (PowerPoint presentation)
Transcript
Page 1: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Constructing Web Portals for Computational Communities

Tomasz Haupt

Northeast Parallel Architectures Center, Syracuse University

Page 2: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Overview

• What is a Web Portal?
• Web Portal Architecture
• Distributed Components: WebFlow
• Interfaces:
– Task Descriptor
– Grid Interface
• Portal Security
• Summary

Page 3: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

What is a Web Portal?

Customizable access to information and services

Page 4: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

A portal is not a static web page

• It relies on sophisticated browser technology
– DHTML, JavaScript, cookies, applets, …
• Server-side processing
– cgi-bin, servlets, asp, jsp, server side includes, XML
– search engines, mail servers, calendar, ...
• Back End
– databases
– credit card processing
– external services: news, weather, stock quotes, ...

Page 5: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Computational Portals

Goal: provide a problem-oriented interface (a Web portal) that makes it possible to use HPC resources more effectively from the desktop, through a Web browser.

This “point & click” view hides the underlying complexities and details of the HPC resources and creates a seamless interface between the user’s problem description on the desktop system and the heterogeneous computing resources.

These HPC resources include supercomputers, mass storage systems, databases, workstation clusters, instruments, and visualization servers.

Page 6: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Example: Nanomaterials Research

[Diagram: data-flow graph of modules: Gaussian, Gamess, two convert modules, data repository, select, edit, and four QS modules]

Features: data-flow computations, user-supplied modules, seamless access to a heterogeneous mixture of computational resources, seamless data transfer, visualizations, data management.

Goals: automate the task, maximize throughput.

Page 7: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Example: LMS (Landscape Management System)

[Diagram: WMS, EDYS, and CASC2D, with input data: DEM, Land Use, Soil Texture, Vegetation]

EDYS: vegetation model
CASC2D: watershed model
WMS: Watershed Modeling System

Features: access to remote data (distributed databases, internet repositories), data pre- and postprocessing, tightly coupled applications running on remote hosts, visualizations.

Goal: a decision support system available anytime, anywhere.

Page 8: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Example: Gateway System (Problem Solving Environment)

[Diagram: Problem Description (new, select, arch); Resources (software, hardware), templates, visualization tools; Resource Allocation; input files, output files]

Features: access to remote data, data pre- and postprocessing, applications running on remote hosts, visualizations, archivization.

Goal: guide the user to select software, generate input files, submit jobs, analyze data; hide complexity and details of a heterogeneous back end.

Page 9: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Design Issues

• Support for seamless access (security)
• Support for distributed, heterogeneous Back-End services (HPCC, DBMS, Internet, ...) that are managed independently
• Variable pool of resources: support for discovery and dynamic incorporation into the system
• Scalable, extensible, low-maintenance Middle Tier
• Web-based, extensible, customizable Front End that adjusts itself to the varying capacities and capabilities of clients (humans, software, and hardware)
• Access to desktop applications

Page 10: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Towards the solution ...

problem description (physics, chemistry, ...)

Task description: I need 64 nodes of SP-2 at Argonne to run my MPI-based executable “a.out” you can find in “/tmp/users/haupt” on marylin.npac.syr.edu. In addition, I need any idle workstation with jdk1.1 installed. Make sure that the output of my a.out is transferred to that workstation.

Middle-Tier: map the user’s task description onto the resource specification; this may include resource discovery, and other services

Resource Specification, Control Access, Events

Resource Allocation: run, transfer data, run

Page 11: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Three Tier System

Front End: Tools to select or specify the problem to solve

Middle Tier: Translates the user task into resource requests

Back End: Resources and data to execute the task

Interfaces between the tiers: Task Descriptor, Resource Descriptor

Page 12: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Abstract Application Descriptor (AAD)

• “man pages written in XML”
• specifies how to install and run the application on different hosts [current status of Gateway]
• describes requirements, input and output data, options, arguments, etc.
• to submit a job, it must be reduced to a job descriptor (select host, options, input data, …)

More on AAD: http://www.npac.syr.edu/users/haupt/WebFlow/MODULES/AAD.html

Page 13: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Reducing AAD to JD (Abstract Application Descriptor to Job Descriptor)

AAD → select host, select options, select input, ... → JD → submit

[Diagram labels: GUI Problem Solving Environment; Data Flow manager; Resource broker; Information service (MDS); JINI; Condor; Generate batch script; Generate RSL]
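The reduction step can be sketched in a few lines of Python. This is only an illustration of the idea, not Gateway code: the class names, the `reduce_to_job_descriptor` function, and the `-v` option are all hypothetical, and a real AAD carries far more information than this stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractApplicationDescriptor:
    """Hypothetical stand-in for an AAD: what is known before any choice is made."""
    app_id: str
    commands: dict  # host name -> command line used on that host
    options: list = field(default_factory=list)  # options the application accepts


@dataclass
class JobDescriptor:
    """Hypothetical stand-in for a JD: one concrete, submittable choice."""
    app_id: str
    host: str
    command: str
    options: list
    input_files: list


def reduce_to_job_descriptor(aad, host, options, input_files):
    """Fix the host, options, and input files that the AAD leaves open."""
    if host not in aad.commands:
        raise ValueError(f"{aad.app_id} is not installed on {host}")
    unknown = [opt for opt in options if opt not in aad.options]
    if unknown:
        raise ValueError(f"unsupported options: {unknown}")
    return JobDescriptor(aad.app_id, host, aad.commands[host],
                         list(options), list(input_files))


aad = AbstractApplicationDescriptor(
    app_id="Casc2d",
    commands={"aga.npac.syr.edu": "/npac/home/haupt/CASC2D/casc2d"},
    options=["-v"],  # illustrative option, not taken from the slides
)
jd = reduce_to_job_descriptor(aad, "aga.npac.syr.edu", ["-v"], ["sand.map"])
print(jd)
```

Validating the selections at reduction time means that a broken choice (an unknown host, an unsupported option) is reported before anything is submitted.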

Page 14: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Example Job Descriptor (with selected application, host, and i/o files)

<?xml version="1.0"?>
<!DOCTYPE application SYSTEM "ApplDescV2.dtd">
<!-- selected application -->
<application id="Casc2d" installable="No">
  <!-- selected host -->
  <target id="aga.npac.syr.edu">
    <status installed="Yes"/>
    <installed>
      <!-- how to run it -->
      <CmdLine command="/npac/home/haupt/CASC2D/casc2d"/>
      <input>
        <!-- it expects this input file -->
        <inFile Path="/npac/home/haupt/CASC2D/lms/" Name="sand.map"/>
        <!-- actual location of the file -->
        <source Host="maine.npac.syr.edu" Path="C:\LMS\fromEdys\" Name="S.map"/>
      </input>
      <output>
        <!-- it generates this output file -->
        <outFile Path="/npac/home/haupt/CASC2D/lms/" Name="sed.out"/>
        <!-- store it there -->
        <dest Host="maine.npac.syr.edu" Path="C:\LMS\toEdys\" Name="sed.out"/>
      </output>
      <!-- save stdout and stderr -->
      <stdout Host="aga.npac.syr.edu" Path="/npac/home/haupt/CASC2D/history/" Name="job2001.out"/>
      <stderr Host="aga.npac.syr.edu" Path="/tmp/" Name="haupt_job2001.err"/>
    </installed>
  </target>
</application>
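A middle-tier component could read such a descriptor with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened, well-formed copy of the descriptor (the DOCTYPE and the stdout/stderr elements are left out). The `staging_plan` function and the shape of its result are assumptions made for illustration, not part of Gateway.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the job descriptor from the slide.
JOB_DESCRIPTOR = """<application id="Casc2d" installable="No">
  <target id="aga.npac.syr.edu">
    <status installed="Yes"/>
    <installed>
      <CmdLine command="/npac/home/haupt/CASC2D/casc2d"/>
      <input>
        <inFile Path="/npac/home/haupt/CASC2D/lms/" Name="sand.map"/>
        <source Host="maine.npac.syr.edu" Path="C:\\LMS\\fromEdys\\" Name="S.map"/>
      </input>
      <output>
        <outFile Path="/npac/home/haupt/CASC2D/lms/" Name="sed.out"/>
        <dest Host="maine.npac.syr.edu" Path="C:\\LMS\\toEdys\\" Name="sed.out"/>
      </output>
    </installed>
  </target>
</application>"""


def full_name(element):
    """Join the Path and Name attributes of a file element."""
    return element.get("Path") + element.get("Name")


def staging_plan(xml_text):
    """Extract host, command, and file movements (hypothetical result layout)."""
    root = ET.fromstring(xml_text)
    target = root.find("target")
    installed = target.find("installed")
    source = installed.find("input/source")
    dest = installed.find("output/dest")
    return {
        "host": target.get("id"),
        "command": installed.find("CmdLine").get("command"),
        # (remote host, remote file, local file the application expects)
        "stage_in": (source.get("Host"), full_name(source),
                     full_name(installed.find("input/inFile"))),
        # (local file the application writes, remote host, remote file)
        "stage_out": (full_name(installed.find("output/outFile")),
                      dest.get("Host"), full_name(dest)),
    }


plan = staging_plan(JOB_DESCRIPTOR)
print(plan["host"], plan["command"])
```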

Page 15: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Simple job object (atomic task)

• “input port”: method to be invoked
• “output port”: event fired

[Diagram: a job object, labeled AAD, with input port run(); and output ports success and failure]
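A job object of this kind can be sketched as a class with one method acting as the input port and two events acting as the output ports. The `JobObject` class, its `on` method, and the callback convention below are hypothetical; the actual WebFlow components are CORBA objects.

```python
class JobObject:
    """Atomic task: run() is the input port, success/failure are output ports."""

    def __init__(self, name, action):
        self.name = name
        self._action = action  # callable that does the actual work
        self._listeners = {"success": [], "failure": []}

    def on(self, event, callback):
        """Connect a callback to one of the output ports."""
        self._listeners[event].append(callback)

    def run(self):
        """Input port: execute the action, then fire exactly one event."""
        try:
            self._action()
        except Exception:
            event = "failure"
        else:
            event = "success"
        for callback in self._listeners[event]:
            callback(self)
        return event


def broken_action():
    raise RuntimeError("simulated crash")


log = []
good = JobObject("casc2d", lambda: None)
good.on("success", lambda job: log.append(job.name + " ok"))
bad = JobObject("edys", broken_action)
bad.on("failure", lambda job: log.append(job.name + " failed"))

good.run()
bad.run()
print(log)
```

Because an output port is just a list of callbacks, connecting the success port of one job to the `run` method of another is enough to chain jobs into a larger task.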

Page 16: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Complex Tasks

[Diagram: six job objects, each with a run(); input port and success/failure output ports]

Page 17: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Task Descriptor

A computational task requested by the user may involve many steps.

Some steps can be performed concurrently, but typically there are data dependencies that force execution of the steps in some particular order.

Tasks can be defined recursively.

A task may specify resources explicitly, or provide requirements and/or preferences, leaving the selection of resources to the discretion of a resource broker.
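The interplay of concurrency and data dependencies can be illustrated with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) to group steps into "waves": every step in a wave depends only on earlier waves, so the steps of one wave could run concurrently. The step names are hypothetical, loosely modeled on the LMS example.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group steps into waves; steps within one wave are mutually independent."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are cyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


# step -> set of steps whose output it needs (hypothetical step names)
steps = {
    "preprocess_dem": {"retrieve_data"},
    "preprocess_soil": {"retrieve_data"},
    "run_casc2d": {"preprocess_dem", "preprocess_soil"},
    "visualize": {"run_casc2d"},
}

waves = execution_waves(steps)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```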

Page 18: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

ATD.dtd

<!ELEMENT Task (TaskName, (Task|connection)*, InputPort+, OutputPort+)>
<!ELEMENT TaskName EMPTY>
<!ATTLIST TaskName name CDATA #REQUIRED
                   descriptor CDATA #IMPLIED>
<!ELEMENT connection (output+, input+)>
<!ELEMENT output EMPTY>
<!ATTLIST output task CDATA #REQUIRED
                 event CDATA #IMPLIED>
<!ELEMENT input EMPTY>
<!ATTLIST input task CDATA #REQUIRED
                method CDATA #IMPLIED>
<!ELEMENT InputPort EMPTY>
<!ATTLIST InputPort task CDATA #REQUIRED>
<!ELEMENT OutputPort EMPTY>
<!ATTLIST OutputPort task CDATA #REQUIRED>

Example Task Descriptor

<Task>
  <TaskName name="ComplexTask"/>
  <Task>
    <TaskName name="atomic_task1" descriptor="task1.xml"/>
    <InputPort method="run"/>
    <OutputPort event="done"/>
  </Task>
  <Task>
    <TaskName name="atomic_task2" descriptor="task2.xml"/>
    <InputPort method="run"/>
    <OutputPort event="done"/>
  </Task>
  <connection>
    <output task="atomic_task1"/>
    <input task="atomic_task2"/>
  </connection>
  <InputPort task="atomic_task1"/>
  <OutputPort task="atomic_task2"/>
</Task>
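To show how a middle-tier component might consume a task descriptor, the sketch below parses a well-formed version of the example with Python's `xml.etree.ElementTree` and extracts the subtasks and their connections. It assumes that connections refer to subtasks by the names given in their `TaskName` elements; the `summarize` function and its result layout are hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the example, with consistent task names (assumption).
TASK_DESCRIPTOR = """<Task>
  <TaskName name="ComplexTask"/>
  <Task>
    <TaskName name="atomic_task1" descriptor="task1.xml"/>
    <InputPort method="run"/>
    <OutputPort event="done"/>
  </Task>
  <Task>
    <TaskName name="atomic_task2" descriptor="task2.xml"/>
    <InputPort method="run"/>
    <OutputPort event="done"/>
  </Task>
  <connection>
    <output task="atomic_task1"/>
    <input task="atomic_task2"/>
  </connection>
  <InputPort task="atomic_task1"/>
  <OutputPort task="atomic_task2"/>
</Task>"""


def summarize(xml_text):
    """Return the subtasks and the event-to-method links of a composite task."""
    root = ET.fromstring(xml_text)
    # findall("Task") matches direct children only, i.e. one level of nesting.
    subtasks = {}
    for task in root.findall("Task"):
        name_element = task.find("TaskName")
        subtasks[name_element.get("name")] = name_element.get("descriptor")
    links = [(out.get("task"), inp.get("task"))
             for connection in root.findall("connection")
             for out in connection.findall("output")
             for inp in connection.findall("input")]
    for source, destination in links:
        if source not in subtasks or destination not in subtasks:
            raise ValueError(f"connection refers to unknown task: {source} -> {destination}")
    return {
        "name": root.find("TaskName").get("name"),
        "subtasks": subtasks,
        "links": links,
        "entry": [port.get("task") for port in root.findall("InputPort")],
        "exit": [port.get("task") for port in root.findall("OutputPort")],
    }


summary = summarize(TASK_DESCRIPTOR)
print(summary)
```

Since tasks can be defined recursively, a fuller version would call `summarize` again on each nested `Task` element that itself contains subtasks.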

Page 19: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

How are the task descriptors generated?

• Predefined (“set of scenarios”)
• Created interactively by the user using Front End tools
• Generated by middle-tier components

Page 20: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

LMS Front End

Navigate and choose an existing application to solve the problem at hand. Import all necessary data.

Retrieve data

Pre/post-processing

Run simulations

Select host

Select model

Set parameters

Run

Page 21: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

QS Front End

Compose your application interactively from pre-existing modules

Data-Flow Front-End

Page 22: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Building an application

A visual representation is converted into an XML document.

[Diagram labels: Front-End Applet; XML; Web Server (save); XML service (parse); ApplContext; Middle-Tier]

The XML service:
• generates Java code to add modules to ApplContext
• publishes the IOR

Page 23: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Gateway Front End

Gateway Navigator: where do you want to go today?

Define the system you are interested in

Control applet: File Access, Job monitor

Page 24: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Middle-Tier: WebFlow Server

CORBA-based distributed components

• The WebFlow server is given by a hierarchy of containers (contexts) and components
• The server is the root context
• A context
– knows its location in the hierarchy
– has attributes
– maintains a persistent state
– controls its children's life-cycle
– is responsible for intercomponent communications (events)
– can be specialized by adding services (WebFlow modules)
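The context hierarchy can be pictured as a tree in which every node knows its parent and owns its children. The Python sketch below is a simplified, in-process illustration only: the `Context` class and its method names are hypothetical, and persistence, events, and services are left out. In WebFlow the contexts are CORBA objects.

```python
class Context:
    """Container in a hierarchy: knows its location and owns its children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.attributes = {}
        self.children = {}

    def path(self):
        """Location of this context in the hierarchy, starting at the root."""
        if self.parent is None:
            return "/" + self.name
        return self.parent.path() + "/" + self.name

    def create_child(self, name):
        """Life-cycle control: a child exists only inside its parent."""
        child = Context(name, parent=self)
        self.children[name] = child
        return child

    def destroy_child(self, name):
        """Remove a child together with its whole subtree."""
        child = self.children.pop(name)
        for grandchild in list(child.children):
            child.destroy_child(grandchild)
        child.parent = None


server = Context("WebFlow")  # the server is the root context
user = server.create_child("user1")
application = user.create_child("application1")
application.attributes["status"] = "idle"
print(application.path())

server.destroy_child("user1")
print(server.children)
```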

[Diagram: context hierarchy with User 1 and User 2, Application 1 and Application 2 (App 1, App 2), and WebFlow Services]

Page 25: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Distributed Middleware

[Diagram labels: Clients; Web Server (download applet); Master WebFlow Server; WebFlow context proxies; Task Descriptor; “slave” WebFlow Server; Grid Interface; JDBC; Information Services; Distributed Back-End Resources]

Page 26: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Example: Gateway Components

• Using the PSE (Front End), the user defines a problem
• A session is an instance of it (an attempt to solve it)
• A session comprises jobs
• The session context reflects the structure of the ATD
• The session context can submit itself

[Diagram: User Context, Problem Context, Session Context, Job context]

Job context: application descriptor, job id, date submitted, date completed, input file(s), output file(s)

Page 27: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

WebFlow Events

[Diagram: Module A in Context 1 fires Event e; the Event Adapter invokes Method m of Module B in Context 2; a Client can invoke Method m as well]

• Module A does not care who is expecting the event; its fireEvent method invokes a method of its parent context.
• The Event Adapter uses the CORBA dynamic interfaces (DSI, DII).
• Method m is a public method: anyone can invoke it, including the Event Adapter of Context 1. No protection against misuse!
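The event mechanism can be approximated in plain Python. In the sketch below, a module fires an event by calling its parent context, and the context looks up who is bound to that event and invokes the target method by name. The lookup with `getattr` is only a stand-in for the CORBA dynamic invocation used by the real Event Adapter; the class names and the single shared context are simplifying assumptions.

```python
class EventContext:
    """Parent context that routes events from one module to methods of others."""

    def __init__(self):
        self.modules = {}
        self.bindings = {}  # (source name, event name) -> [(target name, method name)]

    def add_module(self, name, module):
        self.modules[name] = module
        module.name = name
        module.context = self

    def bind(self, source, event, target, method):
        self.bindings.setdefault((source, event), []).append((target, method))

    def fire(self, source, event, payload):
        """Event adapter: invoke the bound methods dynamically, by name."""
        for target, method in self.bindings.get((source, event), []):
            getattr(self.modules[target], method)(payload)


class ModuleA:
    def finish(self, result):
        # A does not know who listens; it only tells its parent context.
        self.context.fire(self.name, "e", result)


class ModuleB:
    def __init__(self):
        self.received = []

    def m(self, payload):
        # Public method: the adapter can call it, and so can anyone else.
        self.received.append(payload)


context = EventContext()
a, b = ModuleA(), ModuleB()
context.add_module("A", a)
context.add_module("B", b)
context.bind("A", "e", "B", "m")

a.finish(42)        # delivered through the adapter
b.m("direct call")  # nothing prevents a direct invocation
print(b.received)
```

The last two calls make the slide's warning concrete: the adapter and an arbitrary caller reach `m` in exactly the same way.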

Page 28: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Middle Tier Components

[Diagram: a Component Container between the Task Specification and the Resource Specification, holding a User Context (profile, credentials (proxy)) and a Session Context with Job objects]

Components and services: XML parser; file access & transfer; job control; batch script generator; resource broker; data flow manager; NetSolve Linear Algebra proxy; PSE support; information services; database access; data analysis; multi-disciplinary task control; archivization; access control; context lifecycle

Page 29: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Grid Interface

How to hide complexity and details of Back End resources?

Example: JDBC

[Diagram: Front End: Servlet or Application with Back-End independent business logic. Middle Tier: java.sql Driver Manager with Oracle, Sybase, and mSQL drivers. Back End: Oracle, Sybase, mSQL.]

Page 30: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

JDBC model to provide access to computational resources

[Diagram: Front End: Servlet or Application with Back-End independent business logic. Grid Interface: GRAM with PBS, NQS, and CONDOR. Back End: O2K, SP2, NOW.]

Grid Interface: access control, allocations, resource look-up, discovery, (co)allocation, monitoring, QoS, fault tolerance, services, events, ... Addressed by Grid Forum. Approximated by Globus.

Portal: builds on top of it, implementing proxies
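The JDBC analogy suggests a driver-manager pattern: the business logic asks a manager for a driver that understands a resource address, and only the driver knows the back-end scheduler. The Python sketch below illustrates that pattern under stated assumptions. The class names, the `pbs:` and `condor:` address prefixes, and the host names are hypothetical; the drivers only build the command line (`qsub`, `condor_submit`) and do not execute anything.

```python
from abc import ABC, abstractmethod


class SchedulerDriver(ABC):
    """Back-end specific part, analogous to a JDBC driver."""

    @abstractmethod
    def accepts(self, address):
        """Tell whether this driver can handle the given resource address."""

    @abstractmethod
    def submit_command(self, script):
        """Return the command line that would submit the batch script."""


class PBSDriver(SchedulerDriver):
    def accepts(self, address):
        return address.startswith("pbs:")

    def submit_command(self, script):
        return ["qsub", script]


class CondorDriver(SchedulerDriver):
    def accepts(self, address):
        return address.startswith("condor:")

    def submit_command(self, script):
        return ["condor_submit", script]


class DriverManager:
    """Back-end independent part, analogous to java.sql.DriverManager."""

    def __init__(self):
        self._drivers = []

    def register(self, driver):
        self._drivers.append(driver)

    def get_driver(self, address):
        for driver in self._drivers:
            if driver.accepts(address):
                return driver
        raise LookupError(f"no driver for {address}")


manager = DriverManager()
manager.register(PBSDriver())
manager.register(CondorDriver())

# The business logic never names a scheduler, only a resource address.
command = manager.get_driver("pbs://o2k.example.org").submit_command("job.sh")
print(command)
```

Adding support for another scheduler, such as NQS, would mean registering one more driver; the calling code stays unchanged.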

Page 31: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Example of a proxy module

&(rsl_substitution = (MYDIR "/tmp/haupt") (DATADIR $(MYDIR)/data) (EXECDIR $(MYDIR)/bin))
 (executable = $(EXECDIR)/a.out)
 (arguments = $(DATADIR)/file1)
 (stdout = $(MYDIR)/result.dat)
 (count = 1)

GRAM resource description

[Diagram: modules Generate Data, Run Job, Analyze]

The Run Job module is a proxy module. It generates the RSL on the fly and submits the job for execution using the globusrun function.

The module has access to a job descriptor.
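Generating RSL on the fly amounts to turning the fields of a job descriptor into a string of attribute-value relations. The sketch below is a minimal illustration in Python: the `to_rsl` function and the dictionary layout of the job are hypothetical, the paths are written out in full instead of using `rsl_substitution`, and the quoting rule (double quotes, with embedded quotes doubled) is an assumption about RSL syntax that should be checked against the Globus documentation. The string is only built, not submitted.

```python
def quote(value):
    """Quote an RSL value; embedded double quotes are doubled (assumed rule)."""
    return '"' + str(value).replace('"', '""') + '"'


def to_rsl(job):
    """Build a GRAM-style RSL string from a job dictionary (hypothetical layout)."""
    parts = [f"(executable={quote(job['executable'])})"]
    if job.get("arguments"):
        arguments = " ".join(quote(argument) for argument in job["arguments"])
        parts.append(f"(arguments={arguments})")
    if job.get("stdout"):
        parts.append(f"(stdout={quote(job['stdout'])})")
    parts.append(f"(count={int(job.get('count', 1))})")
    return "&" + "".join(parts)


job = {
    "executable": "/tmp/haupt/bin/a.out",
    "arguments": ["/tmp/haupt/data/file1"],
    "stdout": "/tmp/haupt/result.dat",
    "count": 1,
}
rsl = to_rsl(job)
print(rsl)
```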

Page 32: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Security: Issues

Front End (Applet or Application)

Connection through open Internet

access control

Gatekeeper

HPCC resources

Layer 1: Secure Web

Layer 2: Secure Middle Tier

Layer 3: Secure access to resources

Policies defined by resource owners

access control and delegation

The same or a different security domain

Page 33: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Security (2)

Different model than most commercial solutions:
– charge for service and not CPU used to render the service
– identify by credit card number

Three-tier architecture:
– delegation of credentials

Page 34: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Security: CORBA security service

Page 35: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Kerberos/SecurID

[Diagram, spanning Front End, Middle Tier, and Back End: Web Server (download frameset, applet); ORBs of the Master WebFlow Server and the Slave WebFlow Server, connected through SECIOP; C:\>kinit and C:\>krsh (with forwardable ticket); krsh]

Page 36: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

SSL

[Diagram, spanning Front End, Middle Tier, and Back End: Web Server with Servlets (https; download frameset, applet); ORBs of the Master WebFlow Server and the Slave WebFlow Server (IIOP); Globus GSSAPI]

Page 37: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Proxy Objects

The master creates and maintains proxies for each component in order to:
– forward requests from the Web client to remote objects
– simplify the association of the distributed components
– enable communication between the client and the slave servers running on different hosts
– log, track, and filter all messages between components in the system, in order to implement fault tolerance, security, and transaction monitors
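The forwarding, logging, and filtering roles of a proxy can be shown with a small Python class that intercepts attribute access. This is an in-process sketch only: `LoggingProxy`, `JobControl`, and the allow-list are hypothetical names, and a real WebFlow proxy forwards requests to remote CORBA objects rather than to a local instance.

```python
class LoggingProxy:
    """Stands in front of a component: forwards, logs, and filters calls."""

    def __init__(self, target, log, allowed=None):
        self._target = target
        self._log = log
        self._allowed = allowed  # None means every method may be forwarded

    def __getattr__(self, name):
        # Called only for names not found on the proxy itself.
        if self._allowed is not None and name not in self._allowed:
            raise PermissionError(f"call to {name} is filtered out")
        method = getattr(self._target, name)

        def forward(*args, **kwargs):
            self._log.append((name, args))  # tracking hook
            return method(*args, **kwargs)

        return forward


class JobControl:
    """Hypothetical remote component."""

    def submit(self, job_id):
        return f"submitted {job_id}"

    def kill(self, job_id):
        return f"killed {job_id}"


messages = []
proxy = LoggingProxy(JobControl(), messages, allowed={"submit"})
result = proxy.submit("job2001")
print(result, messages)
```

Because every message passes through one place, the same hook could also retry failed calls or check credentials, which is how the proxies can support fault tolerance and security.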

Page 38: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Distributed Middleware

[Diagram labels: Clients; Web Server (download applet); Master WebFlow Server; WebFlow context proxies; Task Descriptor; “slave” WebFlow Server; Grid Interface; JDBC; Information Services; Distributed Back-End Resources]

Page 39: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Emerging Object Web Multi-Server Model

[Diagram: Clients and their servers; Middle Tier Custom Servers; Back End Servers and their services]

Page 40: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Summary

We build Web Portals using new, emerging, often ephemeral technologies and standards

What will survive?
– Multitier architectures
– Distributed Components
– XML to define interfaces
– Metadata (UML, XMI, …)

What next?
– Developer tools for enterprise servers
– ASP: Application Service Providers

Page 41: Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Summary (2)

Academic example: WebFlow
– Gateway, LMS, GEM, NCSA

We extended the notion of a Web Portal
– support for HPCC
– Grid Interface (a new CORBA facility?)
– Abstract Task Descriptor in XML

Middle-Tier components as proxies

To be added: proxy communication channels

