The XCAT Science Portal
Sriram Krishnan*, Randall Bramley*, Dennis Gannon*, Rachana Ananthakrishnan*,
Madhusudhan Govindaraju*, Aleksander Slominski*, Yogesh Simmhan*, Jay Alameda†,
Richard Alkire‡, Timothy Drews‡, Eric Webb‡
* Department of Computer Science, Indiana University, Bloomington, IN 47404, USA
Tel: (812) 855-8305, Fax: (812) 855-4829, [email protected]
† National Center for Supercomputing Applications, Champaign, IL 61820, USA
Tel: (217) 244-4696, Fax: (217) 244-2909, [email protected]
‡ Department of Chemical Engineering, University of Illinois, Urbana, IL 61801, USA
Tel: (217) 333-0063, Fax: (217) 333-5052, [email protected]
Abstract
This paper describes the design and prototype implementation of the XCAT Grid Science
Portal. The portal lets grid application programmers script complex distributed computations
and package these applications with simple interfaces for others to use. Each application is
packaged as a notebook, which consists of web pages and editable parameterized scripts. The
portal is a workstation-based specialized personal web server, capable of executing the
application scripts and launching remote grid applications for the user. The portal server can
receive event streams published by the application and grid resource information published by
Network Weather Service (NWS) [35] or Autopilot [16] sensors. Notebooks can be published and
stored in web-based archives for others to retrieve and modify. The XCAT Grid Science Portal
has been tested with various applications, including the distributed simulation of chemical
processes in semiconductor manufacturing and collaboratory support for X-ray crystallographers.
1 Introduction
The concept of a Science Portal was first introduced by the National Computational Science
Alliance (NCSA) as part of a project designed to provide computational biologists with access
to advanced tools and databases that could be shared by a community of users via web technology.
A Science Portal can be broadly defined as an application-specific environment for using and
programming complex tasks involving remote resources. Over the past year the Science Portal
concept has been heavily influenced by the emergence of the Grid [13] as a computational platform.
A Grid is a set of distributed services and protocols that have been deployed across a large
set of resources. These services include authentication, authorization, security, namespaces
and file/object management, events, resource co-scheduling, user services, network quality of
service, and information/directory services. Together these services enable applications to
access and manage the remote resources and computations. Web-based Grid Portals provide
mechanisms to launch and manage jobs on the grid via the web. Grid Science Portals are
problem-solving environments that allow scientists to program, access and execute distributed
applications that use grid resources, and that are launched and managed from a conventional
Web browser and other desktop tools. In such portals, scientific domain knowledge and tools are
presented to the user in terms of the application science, and not in terms of complex
distributed computing protocols. The system effectively makes the grid into a vast and powerful
computation engine that seamlessly extends the user's desktop to remote resources like compute
servers, data sources and on-line instruments.
This paper describes the XCAT Science Portal (XCAT-SP), which is an implementation of the
NCSA Grid Science Portal concept. XCAT-SP is based on the idea of an "active document", which
can be thought of as a "notebook" containing pages of text and graphics describing the science
of a particular computational application and pages of parameterized, executable scripts. These
scripts launch and manage the computation on the grid, and results are dynamically added to the
document in the form of data or links to output results and event traces.
XCAT-SP is a tool which allows the user to read, edit, and execute these notebook documents.
The goal of this research and the focus of this paper is to address the following set of questions.
• How well does the active document model work for real scientific applications?
• How does one use scripts to steer computations from the portal?
• What is a simple and efficient mechanism to store and retrieve data specific to each
  application?
• How should the portal be designed to interact with an event system to receive feedback from
  the remotely executing applications?
• How can a portal use a grid monitoring system to provide resource utilization information
  about its environment?
2 Existing Grid Portals
The area of Grid Portal design is now an extremely active and important part of the emerging
Grid research agenda. The existing projects can be grouped into three categories:
• User Portals for simple job submission and tracking, file management and resource selection
• Portal Construction Kits, which provide the APIs necessary for a portal to communicate with
  Grid services
• Science Portals, as defined earlier
In the user portal category, the NPACI Hot Page [31] is the first and most successful system.
Other user portal projects are the European project Unicore [5], Nimrod-G from Australia [12],
and the IPG Launch Pad, which is the user portal for NASA's Information Power Grid [7].
At least three projects provide portal construction toolkits. The Argonne Commodity Grid
(CoG) [20] toolkit is a Java interface for Globus. GPDK from Lawrence Berkeley Labs [9] is a JSP
API for CoG, and JiPANG from Tokyo Institute of Technology [28] uses Sun Microsystems' Jini [26]
to provide an interface to both CoG and networked solvers like Ninf [29] and Netsolve [8].
Science Portals have a variety of forms. Some are designed around relatively specific
application domains. For example, the Cactus Portal [19] from the Albert Einstein Institute was
originally designed for black hole simulations and the ECCE/ELN [30] project from ORNL, LBNL and
PNNL is for Computational Chemical Engineering. The Lattice Portal [23] from Jefferson Labs is a
user portal for high-energy physics. One category of science portals directly addresses the
problem of building multidisciplinary applications. The Gateway project [6] and the Mississippi
project [34] use CORBA [15] and Enterprise Java Beans (EJB) [27] to build a three-tier
architecture for launching and scheduling multiple applications. These two projects also use
scripting to orchestrate large, complex application scenarios. Another CORBA-based project is
the Rutgers Discover portal [11], which also provides a good interface for computational
steering and collaborations.
3 The XCAT Science Portal
A prototype science portal that tests some of the features described above has been developed
at Indiana University with the help of the Chemical Engineering Team from NCSA. The portal
differs in its architecture from the examples described above because it does not use a
centralized web server on a remote machine. Instead, the portal software that runs on each
user's desktop/laptop has a built-in server. The reason for this is that the XCAT Science Portal
is designed to integrate the user's desktop environment with the remote grid resources. If the
portal resides elsewhere, the only tools the user can use to interact with the Grid are a Web
browser or other HTTP clients. In our model, the portal server provides a single, local gateway
between the Grid Services and local applications. A local web browser can still interact with it
through HTTP, but other applications may also communicate with it via local protocols and
services, such as COM [25], .NET [24] and Bonobo/Gnome [1].
As illustrated in Figure 1, the major components of the portal server include:
• A Java-based server engine, which spawns a set of Java Servlets that manage access to the
  other components. The current version runs on Jakarta Tomcat 4.0, is deployable as a Web
  Archive (WAR) file, and works on various flavors of Windows and Unix.
• A notebook database. A notebook is an active document defined by an XML object that
  describes a set of resources used in a computational application. It consists of documents,
  web pages, execution scripts, and other notebooks.
• A Script Engine that is used to execute complex Grid operations. The scripting is currently
  in Jython, a pure Java implementation of the Python scripting language, which has become
  popular with many computational scientists. We provide Jython-based interfaces to the Argonne
  CoG toolkit, which in turn provides access to Globus functionality and the GSI [21] Grid
  authentication mechanisms. It also has an API that allows easy access to the DOE Common
  Component Architecture (CCA) services [22].
• An Event Subsystem that is capable of handling event messages, which may be generated by
  grid resources or user applications.
• A Grid Performance Monitor that provides the user with a view of available resources, their
  current loads and network loads.
• A Component Browser that uses an SQL Database backend to provide the user with information
  about components which can be deployed. The user can use this information to write Jython
  scripts to create and wire together components.
• A Remote File Management Interface that uses the GSI-enabled FTP service.
3.1 The Notebook Database
The underlying directory structure of the filesystem is used as the database to support the
portal. The database stores a notebook corresponding to each computational application. Each
notebook is stored as a directory and each page of the notebook is stored in a different
subdirectory. An XML file containing meta-data about the notebook and a list of pointers and
references to the pages in the notebook is also stored in the local database. Figure 2 shows a
snippet of such an XML file. It describes a notebook session, with a title Notebook Intro,
containing a notebook page, BigPicture. The complete schemas can be viewed at
http://www.extreme.indiana.edu/an/xsd.
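As a rough illustration of how such a metadata file could be read, the sketch below parses a trimmed fragment modeled on Figure 2 using Python's standard xml.etree.ElementTree module. The element names and the namespace are taken from Figure 2; the function name summarize_notebook and the shape of the returned dictionary are assumptions made for this example and are not part of XCAT-SP.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the Figure 2 snippet.
AN_NS = {"an": "http://www.extreme.indiana.edu/an"}

# A trimmed copy of the Figure 2 metadata, used here as sample input.
SAMPLE = """
<activeNotebook xmlns="http://www.extreme.indiana.edu/an">
  <activeNotebookInfo>
    <title>Notebook_Intro (session)</title>
    <id>NotebookIntro.7444</id>
  </activeNotebookInfo>
  <pageContent>
    <title>BigPicture</title>
    <id>BigPicture</id>
    <number>1</number>
  </pageContent>
</activeNotebook>
"""


def summarize_notebook(xml_text):
    """Return the notebook title, id, and a sorted (number, title) list of pages."""
    root = ET.fromstring(xml_text.strip())
    info = root.find("an:activeNotebookInfo", AN_NS)
    pages = [
        (int(page.findtext("an:number", namespaces=AN_NS)),
         page.findtext("an:title", namespaces=AN_NS))
        for page in root.findall("an:pageContent", AN_NS)
    ]
    return {
        "title": info.findtext("an:title", namespaces=AN_NS),
        "id": info.findtext("an:id", namespaces=AN_NS),
        "pages": sorted(pages),
    }
```

Calling summarize_notebook(SAMPLE) yields the session title, its id, and the single BigPicture page; a portal could use a summary of this kind to build the notebook view without opening each page directory.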
The active document representing a notebook session can be converted into a Java Archive (JAR)
and the meta information about the notebook can be stored in a WebDAV [17] server. This meta
information includes the URI for the actual location of the archive. Notebook users can browse
the information about published sessions using a WebDAV client. This enables them to get
information such as the notebook name, author, and abstract before actually deciding to retrieve
the archive. The author of the notebook can set privileges, such as read-only access or
permission for other authors to update the archived session. The authorization information is
stored in the WebDAV server as an Access Control List. The JAR can be published into a
repository using GSI-enabled FTP or other file transfer services. Since the JAR corresponds to
an active document, it is self-sufficient: it can simply be plugged into an authorized user's
local database and is ready to use. Thus, the portal users in a scientific community can
collaborate with their peers by sharing data corresponding to their experiments.
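The packaging step can be pictured with a short sketch. Because a JAR uses the ZIP file format, Python's standard zipfile module is enough to bundle a notebook directory and to unpack it into another user's database directory. The function names pack_notebook and unpack_notebook are hypothetical, and the sketch leaves out the manifest that the Java jar tool normally adds as well as the WebDAV and GSI-enabled FTP steps described above.

```python
import os
import zipfile


def pack_notebook(notebook_dir, archive_path):
    """Write every file under notebook_dir into a ZIP-format archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as jar:
        for folder, _subdirs, files in os.walk(notebook_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                # Store paths relative to the notebook root so the archive
                # can be unpacked under any database directory.
                jar.write(full_path, os.path.relpath(full_path, notebook_dir))


def unpack_notebook(archive_path, database_dir, notebook_name):
    """Extract an archive into database_dir/notebook_name and return that path."""
    target = os.path.join(database_dir, notebook_name)
    with zipfile.ZipFile(archive_path) as jar:
        jar.extractall(target)
    return target
```

Storing relative paths is what makes the archive "pluggable": the page subdirectories are recreated under whatever local database directory the receiving user chooses.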
3.2 Grid Application Scripting
One difference between a user portal and a science portal is the complexity of the tasks that
the portal supports. A user portal allows users to submit single jobs to the grid. The portal
provides features that make it very simple to manage the job: it supplies load-time and run-time
information, helps the user select resources, and monitors the execution of the job. In a
science portal, the applications tend to be more complex. A single scientific experiment may
involve running many different computational simulations and data analysis tasks. It may involve
coupled multidisciplinary applications, collaboration, and remote software components linked
together to form a distributed application. Often these complex tasks take a great deal of
effort to plan and orchestrate, and the entire application may need to be run many times, each
with a slightly different set of parameter values. We have found that the best way to allow this
sort of computation to be carried out is to give the scientist access to a simple scripting
language which has been endowed with a library of utilities to manage Grid applications.
Furthermore, we provide a simple tool which allows the scientist to build a web form interface
to configure and launch the scripts. Users of the scripts simply fill in parameter values in the
web form and then click the Submit button. This launches a script which executes on the user's
desktop, but manages remote applications on the grid. Our prototype implementation uses the
Jython language for scripting because it is popular with scientists and has an excellent
interface to Java, and we make the scripts grid-enabled by providing an API to Globus Services
using the CoG Toolkit.
Figure 4 illustrates a portal interface, which is typically application-dependent and is
configurable by the users. In the panel on the left, there is a view of an open notebook
session. It consists of a set of pages and script forms. In this figure, the form for a simple
script which launches a local visualization application is shown. Parameter values selected by
the user from the form page are bound to variables in the script. By selecting Edit, both the
script and the form page may be edited as shown in Figure 5. In this case, the script launches a
local program called animator which takes as a parameter the name of a simulation output file to
animate. In this example the script is trivial, but it is not much more difficult to write a
script to launch an application on the grid and to manage remote files.
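The binding of form values to script variables can be sketched in a few lines. The example below is only an illustration of the idea under simple assumptions: the form submission is modeled as a dictionary, and run_script, SCRIPT, and the parameter name output_file are hypothetical names chosen for this example rather than parts of the portal. The script builds a command line for the animator program mentioned above instead of launching it.

```python
def run_script(script_source, form_values):
    """Execute script_source with each form value pre-bound as a variable."""
    namespace = dict(form_values)   # copy so the caller's dictionary is untouched
    exec(script_source, namespace)  # the script sees every parameter by name
    return namespace


# Hypothetical script page: it only assembles the command for a local viewer.
SCRIPT = """
command = ["animator", output_file]
"""
```

After run_script(SCRIPT, {"output_file": "run42.dat"}), the returned namespace holds the command list built by the script. Note that exec runs arbitrary code, so a design along these lines relies on the notebook author being trusted.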
A second form of scripting is used to manage the local details of the program's execution on
a remote site. The remote applications are managed by application managers. In most cases, the
applications that the scientists and engineers want to run on the Grid are not grid-aware, i.e.
they are ordinary programs that read and write data to and from files. In some cases, we have
access to the application source, but often that is not available, e.g., when using commercial
application codes. An application manager is an agent process that helps the application make
use of grid services. For example, the manager can stage input files from remote locations or
invoke post-processing on the application output when the application has finished. The manager
also serves as an event conduit between the application and the portal. If the application dies
or creates a file, the manager can send an event back to the portal with the appropriate
message. The application manager is shown in Figure 6.
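The event-conduit role of an application manager can be sketched as a small wrapper around an ordinary program. In the example below, the function manage_application, the event dictionaries, and the event type strings are all assumptions made for illustration; the real managers are CCA components and publish SOAP events, neither of which is modeled here.

```python
import os
import subprocess


def manage_application(command, watch_dir, publish):
    """Run command in watch_dir, then report new files and exit status via publish()."""
    before = set(os.listdir(watch_dir))
    publish({"type": "app.started", "command": command})
    status = subprocess.call(command, cwd=watch_dir)
    # Report every file that appeared while the application was running.
    for name in sorted(set(os.listdir(watch_dir)) - before):
        publish({"type": "file.created", "name": name})
    kind = "app.finished" if status == 0 else "app.failed"
    publish({"type": kind, "status": status})
    return status
```

The wrapped program needs no changes: it reads and writes files as usual, while the wrapper turns its side effects (new files, a non-zero exit status) into events for whoever supplied the publish callback.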
The application manager can also act as a service broker for the application. The manager can
register itself with the Grid Information Service [14] and advertise the application's
capabilities. If a user with the appropriate authorization discovers it, then the manager can
launch the application on behalf of the user and mediate the interaction with the user. For
example, suppose the application is a library for solving sparse linear systems of equations on
a large parallel supercomputer. The manager can export a remote solver interface that takes a
sparse linear system as input and returns solution vectors and error flags as output. If a user
has a remote reference to the manager, the solver can be invoked by a remote method call passing
a linear system (or its URI) as a parameter and the solution vector can be received as a result
of the call. This is the model used by JiPANG to invoke Ninf and Netsolve.
In the XCAT system, the application managers conform to the DOE Common Component Architecture
(CCA) specification. XCAT is our implementation of the CCA specification, built on top of XSOAP
(formerly SoapRMI [32]), that allows the users to write CCA-compliant components in C++ and
Java. The application managers are designed to be scriptable components, which have one standard
port providing the creator with the ability to download a script which the component can run.
The scripting language and library used by the component is identical to the language and
library available to the portal engine. The application managers combine the advantages of a
persistent remote shell with that of a remote object which may be invoked through a well-defined
set of interfaces. Furthermore, the interfaces that a manager component supports can change
dynamically by simply downloading a new script. This allows the portal to dynamically change the
behavior of a remote application to suit new problems or requirements. For a more detailed
description of the Application Manager, including APIs, consult ProgGrid [4].
XCAT provides a Jython-based scripting interface to instantiate remote components, wire them
together using input and output ports and orchestrate the computations. The portal uses this
scripting interface, whose API is described in Figure 8.
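To give a feel for how a portal script might call this interface, the sketch below wires two components together. The five function names and their parameter lists are copied from Figure 8, but their bodies here are local stand-ins that only record calls; the return value of createComponent, the componentInfo strings, the host names, the port names, and the "gram" creation mechanism are all assumptions made for this example.

```python
CALLS = []  # stand-in call log; the real API would contact remote hosts


def createComponent(componentInfo):
    CALLS.append(("createComponent", componentInfo))
    return {"info": componentInfo}  # assumed: a wrapper object is returned


def setMachineName(componentWrapper, machineName):
    componentWrapper["machine"] = machineName


def setCreationMechanism(componentWrapper, creationMechanism):
    componentWrapper["mechanism"] = creationMechanism


def createInstance(componentWrapper):
    CALLS.append(("createInstance", componentWrapper["machine"]))


def connectPorts(outputPortComponent, outputPortName,
                 inputPortComponent, inputPortName):
    CALLS.append(("connectPorts", outputPortName, inputPortName))


def launch(info, machine, mechanism):
    """Create, place, and instantiate one component; return its wrapper."""
    wrapper = createComponent(info)
    setMachineName(wrapper, machine)
    setCreationMechanism(wrapper, mechanism)
    createInstance(wrapper)
    return wrapper


def wire_pipeline():
    """Instantiate a producer and a consumer and connect their ports."""
    producer = launch("producer-info", "hostA.example.org", "gram")
    consumer = launch("consumer-info", "hostB.example.org", "gram")
    connectPorts(producer, "dataOut", consumer, "dataIn")
    return producer, consumer
```

The point of the sketch is the order of operations (create, configure, instantiate, connect), which follows the listing in Figure 8; the start, kill, and invokeMethodOnComponent calls are omitted because their port arguments depend on the components involved.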
3.3 Event Subsystem
The XCAT Science Portal uses the SOAP Events system [33] to decouple communication between the
scripting engine and the remote jobs launched by the portal. This decoupling ensures that the
remote applications can continue execution when the portal itself shuts down. Communication is
reestablished seamlessly when the portal is restarted. The events that occur in the interim are
stored by a persistent event channel and can be retrieved by the portal on restart. A remote job
can be an instantiated component that reads and writes files or a Grid Monitoring Architecture
(GMA) [2] that collects data for fault detection and performance tuning from a computational
grid. Such systems can indicate their progress by sending out events at regular intervals to
interested listeners.
The SOAP Events system is based on XSOAP, which uses HTTP as the network protocol and SOAP 1.1
[3] compliant XML messages as the wire protocol. By using XSOAP, the portal can receive events
from any SOAP 1.1 system. As SOAP events are just XML strings, they can be published by writing
preformatted strings onto a socket, allowing frameworks that use different languages and
platforms to publish to the channel. To make the channel firewall-friendly, we allow the
publishers to "push" events to the channel and the subscribers to "pull" events from it. Such a
model obviates the need for the event channel to initiate the communication with publishers or
listeners that may reside behind firewalls. We simulate asynchronous event notification to the
listener by using listener agents (see Figure 10). The listener agents constantly query the
channel for newly arrived events and forward these events to the listener. The agent and the
channel use a cookie-based scheme to keep track of the retrieved events. A cookie, held by the
agent, has complete state information about the progress of the event pull invocation. Using
this cookie, the listener agent can resume the pull in case the channel fails and is restarted.
Publishers use similar agents to ensure delivery of events to the channel, so that network
outages or failure of the channel do not prevent the event from being sent. The publisher agents
store the unsent events in a local store and periodically retry publishing them. The listener
and publisher agents can also locate a suitable event channel to connect to, based on a set of
constraints provided by the listener or publisher application. Event channels register with a
directory service when they start and the agents use this service to select the channel.
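The push/pull model with cookies and retrying publisher agents can be sketched with an in-memory channel. This is a minimal illustration under strong assumptions: the cookie is reduced to an integer position in the channel's event list, persistence and SOAP transport are not modeled, and the class and method names (EventChannel, PublisherAgent, push, pull, flush) are invented for the example rather than taken from the SOAP Events system.

```python
class EventChannel:
    """In-memory channel: publishers push events, listeners pull by cookie."""

    def __init__(self):
        self.events = []
        self.up = True  # toggled to simulate a channel or network outage

    def push(self, event):
        if not self.up:
            raise ConnectionError("channel unreachable")
        self.events.append(event)

    def pull(self, cookie=0):
        """Return the events stored after position `cookie` and the next cookie."""
        batch = self.events[cookie:]
        return batch, cookie + len(batch)


class PublisherAgent:
    """Buffers events locally and retries until the channel accepts them."""

    def __init__(self, channel):
        self.channel = channel
        self.unsent = []

    def publish(self, event):
        self.unsent.append(event)
        self.flush()

    def flush(self):
        while self.unsent:
            try:
                self.channel.push(self.unsent[0])
            except ConnectionError:
                return  # keep the event in the local store; retry on a later flush
            self.unsent.pop(0)
```

Because the listener side holds the cookie, it can ask for "everything after position N" at any time, which is what lets a pull resume after an interruption; the publisher side never drops an event until the channel has accepted it.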
The portal listener registers with the listener agent running on the local machine using
Jython scripts, as illustrated by snippets of code in Figure 11. The listener agent locates the
nearest or a well-known SOAP event channel using the directory service. The listener can use a
filter to restrict the events it receives to those that it is interested in. This filtering can
be based on matching event attributes or using SQL to query the persistent channel. Applications
can provide information about the status of their computation by publishing events to the event
channel via the publisher agent.
The event channel, in its simplest incarnation, is just a listener and publisher working in
tandem. With all its features enabled, it provides complex filtering and querying, persistence
to allow retrieval of historical events, and handling of user-defined events that the channel is
not aware of.
3.4 The Grid Performance Monitor
The Grid Performance Monitor (GPM) uses the event subsystem to provide the user with
visualization of available resources and the current and predicted future loads on these
resources. The data for these loads is obtained from the NWS. The GPM is designed as a thin
layer on top of SoapRMI events and an event channel. This provides the portal with the
flexibility to cooperate and exchange signals/events with a vast variety of event generators.
The event visualizer component of the GPM subscribes to the event channel through its agent.
The visualizer registers interest only in those events that are for resource utilization or a
related type. Using the event channel removes the need for a global registry of all sensors.
Sensors send events to the event channel at periodic intervals. A stoppage in events from a
particular sensor can therefore be attributed to the failure of that sensor. Event generators
that send events at irregular intervals can be required to send simple heart-beat events at
regular intervals to indicate that they are still operational.
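The failure-detection rule described here amounts to a timeout on the most recent event from each sensor. A minimal sketch, assuming timestamps are plain numbers in seconds and using the hypothetical names silent_sensors, last_seen, and timeout:

```python
def silent_sensors(last_seen, now, timeout):
    """Return the sorted names of sensors whose latest event is older than timeout."""
    return sorted(name for name, stamp in last_seen.items()
                  if now - stamp > timeout)
```

For example, with last_seen = {"sensorA": 100.0, "sensorB": 40.0}, a current time of 110.0, and a timeout of 30.0 seconds, only sensorB is reported. The timeout would normally be chosen as a small multiple of the sensors' reporting or heart-beat interval, so that one delayed event is not mistaken for a failure.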
Figure 12 shows an example of an XML resource event. The XML Schema event system lends itself
to extensibility and self-describing event formats, thus making it possible for the portal to
interoperate with a wide variety of other event systems, including NWS and Autopilot sensors.
Applications that are aware of their resource utilization can also write application-level
resource events and send them to the event channel. Thus, the user can receive not only resource
utilization information about the target machines, but also performance information from their
executing applications.
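A consumer such as the event visualizer needs to turn a resource event into numbers it can plot. The sketch below does this for a trimmed event modeled on Figure 12, using Python's standard xml.etree.ElementTree module. The element names come from Figure 12; the function name read_utilization and the keys of the returned dictionary are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A trimmed event modeled on Figure 12.
SAMPLE_EVENT = """
<MachineUtilizationEvent>
  <eventType>resdata.machine.utilization</eventType>
  <timestamp>2002-01-07T17:41:28.072Z</timestamp>
  <source>rainier.extreme.indiana.edu</source>
  <cpuUtilization>0.88</cpuUtilization>
  <memoryUsed>123988</memoryUsed>
</MachineUtilizationEvent>
"""


def read_utilization(xml_text):
    """Extract the event type, source host, CPU utilization, and memory used."""
    root = ET.fromstring(xml_text.strip())
    return {
        "type": root.findtext("eventType"),
        "source": root.findtext("source"),
        "cpu": float(root.findtext("cpuUtilization")),
        "memory_used": int(root.findtext("memoryUsed")),
    }
```

Because the event is self-describing, a listener that does not recognize the root element can still inspect eventType and decide whether to ignore the event or pass it on.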
3.5 Authentication & Security
In the future, the portal is planned to be run in one of two modes: personal or multiuser. At
present, we only support the personal mode, while work on the multiuser mode is in progress. In
both cases, the authentication is handled via the Globus GSI. The user can either use local
Globus credentials on the portal's server via the Globus CoG Kit, or can remotely upload
credentials into the portal via the MyProxy [10] CoG Kit. The initial startup screen has text
fields for the user to enter the appropriate information: his/her Globus credential password for
a local credential, or a server, tag name and password for a MyProxy credential. In either case,
the portal server loads a Globus Proxy object from the relevant source for use in authentication
and instantiation, on behalf of the user. In the personal mode, only the owner is authorized to
run jobs using the portal, while in the multiuser mode the user can run jobs if he/she is
permitted to use the portal, which can be configured by the portal owner using an Access Control
List mechanism. If cookies are enabled by the user, the server sets a cookie object in the
user's browser that maps the session to the Proxy so that, when the user leaves the site,
his/her identity isn't lost. This helps the portal do better session management. Even if the
user has disabled the use of cookies, the portal works fine, although it loses some of its
session tracking capabilities.
4 Sample Applications
The XCAT Science Portal has been used for a number of different applications. It has been used
for distributed simulation of chemical processes in semiconductor manufacturing by a team of
Chemical Engineers at NCSA, for collaboratory support by a team of X-ray crystallographers at
Indiana University, and for Linear Systems Analysis [4] and Collision Risk Assessment of
Satellites with space debris [4] by the Extreme Computing Lab at Indiana University. We describe
two of the above in the next subsections.
4.1 NCSA Chemical Engineering
The work done with the Chemical Engineering team from NCSA is an example of the kind of science
problems the portal is intended to solve. The simulation models the process of copper
electrodeposition in a submicron-sized trench which is used to form the interconnections on
microprocessor chips. The simulation consists of two linked codes. One consists of a continuum
model of the convective diffusion processes in the deposition bath adjacent to the trench. The
second consists of a Monte Carlo model of events that occur in the near-surface region where
solution additives influence the evolution of deposit shape and roughness during filling of the
trench. The codes communicate by sharing data files about common boundary conditions. Figure 13
shows the coupled codes and the filter that is added to ensure stability of the linked
computational system.
The codes are run separately on the Grid. The transfer of files is done using grid-based file
management and transfer utilities. The interface to the Grid is provided by "Application
Managers". As described before, these are wrappers which provide access to grid services such as
GSI, grid events, etc. to the codes and make them grid-aware. Each execution is set up and
controlled from the controlling Jython script which runs inside the portal. The primary
mechanism for getting feedback is the event system. Grid file-management tools can be used to
transfer output files which are generated. Events which come back from the applications are
handed off to event handlers which have been registered, or are logged. Special events could be
used to trigger event handlers which can change or control the course of the execution.
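The "handed off to registered handlers, or logged" behavior can be sketched as a small dispatcher. The class name EventDispatcher, its methods, and the dictionary-based event representation are assumptions made for this illustration; they are not the portal's actual handler API.

```python
class EventDispatcher:
    """Routes each event to the handlers registered for its type, else logs it."""

    def __init__(self):
        self.handlers = {}  # event type -> list of callables
        self.log = []       # events that had no registered handler

    def register(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def dispatch(self, event):
        handlers = self.handlers.get(event["type"])
        if not handlers:
            self.log.append(event)
            return
        for handler in handlers:
            handler(event)
```

A controlling script could, for instance, register a handler for a hypothetical "file.created" event type that starts a file transfer, while all other events simply accumulate in the log for later inspection.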
This application illustrates several interesting scenarios in collaboration. The experiment is
set up by the chemical engineers using the tools provided in the portal. Simple web forms are
created for parameter input which will control the experiment. An example of one such form is
shown in Figure 15. Subsequent users do not need to know about these parameters or the mechanics
of the grid computation. They will interact with only the portal web interface and event
notification mechanisms.
4.2 IU Xports project
A second application is a collaboratory for X-ray crystallographers using the beam lines at
Argonne's Advanced Photon Source (APS) and Lawrence Berkeley's Advanced Light Source (ALS). This
work will allow users at remote laboratories to send sample crystals to the beam lines,
collaborate with the scientists preparing and mounting the sample, and then receive initial
images of the execution over the network. They can then dynamically upload new control
parameters or, if the sample appears flawed, terminate the run. In addition to large amounts of
data (up to a Terabyte/day) and numbers of files (1-3 per second), this application requires
multiple video streams, access to high-speed research networks, and synchronous geographically
distributed collaboration.
The portal was used to launch part of the experimental setup from the client site. Using the
Jython controlling script and the Java Application Managers, local applications were launched
and controlled. The setup of the experiment closely resembled that of the Chemical Engineering
one. Events were used to get feedback on the progress of the execution.
5 Conclusions
This paper has described the XCAT Science Portal system. The contributions of this research
project include
• providing a generic programming tool for grid application designers that allows them to
  script complex applications and access them using a simple forms-based web browser interface.
• providing an "active document" model for packaging applications for collaborative purposes.
• demonstrating how a grid event system can be integrated into both the grid applications and
  resource monitoring to provide the user with important feedback about the runtime behavior of
  his or her applications.
• showing that a distributed software component architecture (in this case the DOE CCA model)
  can be used as an effective tool to manage distributed applications based on legacy software
  which is not grid-aware.
6 Future Work
Future work includes integration of the resource and component directory services with the Grid
Forum standards for information services and with the emerging work on the Web Services
Description Language (WSDL) that is being advocated by industry groups. In addition, we are
building interfaces to intelligent resource brokers and building components that are capable of
adapting to available grid resources. We are working on the multiuser version of the portal, and
trying to use it for the Grid Access Portal for Physics Applications [18]. We are also working
on a secure implementation of SOAP, which will be built using GSI and Secure Sockets. We plan to
integrate it with a multiprotocol messaging architecture, which is capable of switching between
SOAP and binary protocols, depending upon the performance needs of the user.
7 Acknowledgements
The authors would like to thank the reviewers and the members of the Extreme Computing
Laboratory, Indiana University for their insightful comments. In particular, we are grateful to
Kenneth Chiu, Al Rossi and Shava Smallen, who are current staff members at the Extreme Lab, and
to Venkatesh Choppella, Rahul Indurkar, Nirmal Mukhi, Benjamin Temko and Juan Villacis, who have
been past members of the project group.
This research was supported by NSF grants 4029710 and 4029713, NCSA Alliance, and DOE2000.
References
[1] GNOME, visited 4-1-2001. www.gnome.org.
[2] Brian Tierney et al. White paper: A grid monitoring service architecture (draft), visited
03-10-01. http://www-didc.lbl.gov/GridPerf/papers/GMA.pdf.
[3] D. Box et al. Simple Object Access Protocol 1.1. Technical report, W3C, 2000.
http://www.w3.org/TR/2000/NOTE-SOAP-20000508/.
[4] Dennis Gannon et al. Programming the Grid: Distributed Software Components, P2P and
Grid Web Services for Scientific Applications. Journal of Cluster Computing, 2002. To appear.
[5] Dietmar Ervin et al. The Unicore HPC Portal, visited 04-25-2001. http://www.unicore.de/.
[6] Geoffrey Fox et al. The Gateway Computational Web Portal, visited 04-27-01.
http://www.gatewayportal.org/.
[7] George Myers et al. The NASA Information Power Grid (IPG) Launch Pad Portal, visited
04-27-2001. http://www.ipg.nasa.gov/.
[8] Jack Dongarra et al. Netsolve, visited 04-27-01. http://www.cs.utk.edu/netsolve/.
[9] Jason Novotny et al. The grid portal development kit (gpdk) project, visited 04-01-01.
http://dast.nlanr.net/Features/GridPortal/.
[10] Jason Novotny et al. Myproxy, visited 04-12-01. http://dast.nlanr.net/Features/MyProxy/.
[11] Manish Parashar et al. DISCOVER, visited 04-27-01. http://www.discoverportal.org/.
[12] Rajkumar Buyya et al. Nimrod, A Tool for Distributed Parametric Modelling, visited
04-25-2001. http://www.csse.monash.edu.au/davida/nimrod.html/.
[13] Ian Foster and Carl Kesselman. The GRID: Blueprint for a New Computing Infrastructure.
Morgan-Kaufmann, 1998.
[14] Grid Forum Information Services Working Group. GGF GIS Working Group Charter, visited
06-29-01. http://www-unix.mcs.anl.gov/gridforum/gis/.
[15] Object Management Group. The Common Object Request Broker: Architecture and
specification, July 1995. Revision 2.0.
[16] Pablo Group. AutoPilot: Real-Time Adaptive Resource Control, visited 04-01-2001.
http://www-pablo.cs.uiuc.edu/Project/Autopilot/AutopilotOverview.htm.
[17] IETF. WebDav, visited 8-20-01. http://www.ics.uci.edu/ejw/authoring/.
[18] Indiana University. The grid access portal for physics applications, visited 08-14-01.
http://lexus.physics.indiana.edu/griphyn/grappa/.
[19] Albert Einstein Institute. Cactus, visited 04-27-01. http://www.cactuscode.org/.
[20] Argonne National Lab. CoG, visited 04-12-2001. http://www.globus.org/cog.
[21] Argonne National Lab. GSI, visited 04-12-2001. http://www-fp.globus.org/security/v1.1/.
[22] Argonne National Laboratory, Indiana University, The Advanced Computing Laboratory at
Los Alamos National Laboratory, Lawrence Livermore National Lab, and University of Utah.
Common Component Architecture, visited 1-10-2000. http://z.ca.sandia.gov/ cca-forum; see also
http://www.extreme.indiana.edu/ccat.
[23] Jefferson Labs. Lattice Portal, visited 04-27-01. http://lqcd.jlab.org/.
[24] Microsoft. .NET framework, visited 02-10-01. http://www.microsoft.com/net/.
[25] Microsoft. COM, visited 4-2-2001. http://www.microsoft.com/com.
[26] Sun Microsystems. Jini, visited 3-1-2001. http://www.sun.com/jini.
[27] Sun Microsystems. EJB, visited 7-15-99. http://java.sun.com/products/ejb/index.html.
[28] Tokyo Institute of Technology. JiPang: A Jini-based Computing Portal System, visited
04-27-01. http://matsu-www.is.titech.ac.jp/suzumura/jipang/.
[29] Tokyo Institute of Technology. Ninf, visited 04-27-01. http://ninf.etl.go.jp.
[30] ORNL, LBNL, and PNNL. The DOE2000 Electronic Notebook, visited 04-27-01.
http://www.emsl.pnl.gov:2080/docs/collab/research/ENResearch.html.
[31] San Diego Supercomputer Center (SDSC), the University of Texas (UT), and the University
of Michigan (UM). NPACI Hot Page, visited 04-25-2001. https://hotpage.npaci.edu/.
[32] A. Slominski, M. Govindaraju, D. Gannon, and R. Bramley. Design of an XML based
Interoperable RMI System: SoapRMI C++/Java 1.1. In Proceedings of the International Conference
on Parallel and Distributed Processing Techniques and Applications, Las Vegas, pages 1661-1667,
June 25-28, 2001.
[33] Aleksander Slominski, Madhusudhan Govindaraju, Dennis Gannon, and Randall Bramley.
SoapRMI Events: Design and Implementation. Technical Report TR-549, Indiana University,
May 2001.
[34] Mississippi State University. The Mississippi Computational Web Portal, visited 04-27-01.
http://WWW.ERC.MsState.Edu/labs/mssl/mcwp/.
[35] Rich Wolski, Neil T. Spring, and Jim Hayes. The Network Weather Service: A Distributed
Resource Performance Forecasting Service for Metacomputing. Journal of Future Generation
Computing Systems, 1999. Also UCSD Technical Report Number TR-CS98-599, September 1998.
List of Figures
1. The XCAT Science Portal Architecture
2. An XML file with notebook metadata
3. The Notebook Database
4. Snapshot of XCAT SP
5. Script Page
6. XCAT Application Managers
7. Scriptability of Application Managers
8. Jython API to XCAT
9. Event visualizer showing machine utilization events
10. The Event Subsystem
11. Subscribing to an Event Channel
12. An example XML Resource Event
13. Linked Chemical Engineering Codes
14. Chemical Engineering Application Setup
15. Parameter Form
Figure 1: The XCAT Science Portal Architecture. [Diagram labels: Workstation Environment with
Web Browser, Local Components, and Viz Tools; MyPortal Active Notebook Server with
Authentication (GSI, CoG), Grid Tools, Script Engine, Notebook Database, and Grid Performance
Monitor; The Grid with Application Proxies, Wrapped Applications, Machine Sensors, and a SOAP
Event Channel.]
<activeNotebook xmlns="http://www.extreme.indiana.edu/an">
<activeNotebookInfo>
<title>Notebook_Intro (session)</title>
<creationDate>Thu Apr 19 10:54:10 EST 2001</creationDate>
<modifiedDate>Thu Apr 19 10:54:18 EST 2001</modifiedDate>
<version>1.0</version>
<id>NotebookIntro.7444</id>
<open>true</open>
<relatedTo>NotebookIntro</relatedTo>
<unsaved>false</unsaved>
</activeNotebookInfo>
<pageContent>
<title>BigPicture</title>
<url>/an/database/notebook/nNotebookIntro.7444/
pBigPicture/big_picture.html</url>
<id>BigPicture</id>
<number>1</number>
<open>false</open>
</pageContent>
</activeNotebook>
Figure 2: An XML file with notebook metadata
Figure 3: The Notebook Database. [Diagram labels: Notebook Server; Database Interface;
Notebook 1, Notebook 2, Notebook 3, each containing pages and scripts.]
Figure 4: Snapshot of XCAT SP
Figure 5: Script Page
Figure 6: XCAT Application Managers
Figure 7: Scriptability of Application Managers
def createComponent (componentInfo):
def setMachineName (componentWrapper, machineName):
def setCreationMechanism (componentWrapper, creationMechanism):
def createInstance (componentWrapper):
def connectPorts (outputPortComponent, outputPortName,
inputPortComponent, inputPortName):
def start (componentWrapper, usesPortClassName,
usesPortType, providesPortName):
def kill (componentWrapper, usesPortClassName,
usesPortType, providesPortName):
def invokeMethodOnComponent (componentWrapper, usesPortClassName,
usesPortType, providesPortName,
methodName, methodParams):
Figure 8: Jython API to XCAT
Figure 9: Event visualizer showing machine utilization events
Figure 10: The Event Subsystem. [Diagram labels: Publisher and Publisher Agent; Listener and
Listener Agent; Event Channels; Discovery Service (LDAP); calls handleEvent, Response,
start/continue/endEventPull, subscribeLease, eventLease; results returned as arrays of events.]
# A specialization of the generic EventListener
class MyEventListener(EventListener, RemoteObject,
                      SubscriptionRenewListener):
    def __init__(self, expID):
        # constructor code goes here

# Code to register with the event channel
def subscribeToListenerAgent(expID, url):
    .... # some initialization
    # create an instance of the EventListener
    receiver = MyEventListener(expID)
    # register with the Listener Agent
    agent = Util.getLocalListenerAgent(...params...)
    # Get first batch of events through agent
    result = agent.startPull(timePeriod, filter)
    # Consume list of events from result.events[]
    while (...interested in more events...):
        # Get next batch of events
        result = agent.continuePull(result.cookie)
        # Consume list of events from result.events[]
    # Done pulling events
    agent.stopPull(result.cookie)
Figure 11: Subscribing to an Event Channel
<MachineUtilizationEvent>
<eventNamespace>http://www.extreme.indiana.edu/soap/
events/resdat#MachineUtilizationEvent
</eventNamespace>
<eventType>resdata.machine.utilization</eventType>
<timestamp>2002-01-07T17:41:28.072Z</timestamp>
<arriveTimestamp>2002-01-07T17:41:29.151Z
</arriveTimestamp>
<source>rainier.extreme.indiana.edu</source>
<handback>resviz_channel</handback>
<cpuUtilization>0.88</cpuUtilization>
<memoryUsed>123988</memoryUsed>
</MachineUtilizationEvent>
Figure 12: An example XML Resource Event
Figure 13: Linked Chemical Engineering Codes
Figure 14: Chemical Engineering Application Setup. [Diagram labels: Science Portal; two
Application Managers, one for the Monte Carlo Simulation and one for the Continuum Simulation;
Grid File Management exchanging files between them.]
Figure 15: Parameter Form