QuakeSim Project: Portals and Web
Services for Geophysics
Marlon Pierce
Indiana University
QuakeSim Project Summary
Goal is to provide a distributed environment for connecting scientific computing and data resources with Web-based user interfaces.
QuakeSim’s IT development includes:
  Portals for user interfaces.
  Web Services for running remote applications and accessing databases.
  Databases for semantic fault models (USC).
This talk reviews major revisions that we have undertaken since 2006.
  Almost a complete rewrite of the 2004-2006 system.
[Figure: My “octopus” diagram, from the archives. Labels: Browser Interface; Portlets + Client Stubs; DB Service (JDBC, DB); Job Sub/Mon and File Services; Operating and Queuing Systems; Visualization or Map Service (DB); WSDL interfaces on the services; Host 1 (Quaketables), Host 2 (Grid), Host 3 (G Maps); links labeled HTTP(S) and SOAP/HTTP.]
Some Design Choices
Build portals out of portlets (Java Standard).
  Reuse capabilities from our Open Grid Computing Environments (OGCE) project, the REASoN GPS Explorer project, and many TeraGrid Science Gateways.
  Decorate with Google Maps, Yahoo UI gadgets, etc.
Use JavaServer Faces to build individual component portlets.
  Build standalone tools, then convert to portlets at the very end.
Use simple Web Services for accessing codes and data.
  Keep It Stateless …
Use Globus job and file management services for interacting with high performance computers.
Favor Google Maps and Google Earth for their simplicity, interactivity and open APIs.
  Generate KML and GeoRSS.
Use an Apache Maven based build and compile system.
Some QuakeSim Applications and Their Data
Disloc, Simplex
  Disloc uses fault models to calculate surface displacements with the Okada method.
  Simplex is the inverse: it estimates fault model parameters from observed surface displacements.
GeoFEST (JPL/Caltech)
  Finite element code for detailed modeling of fault stresses and seismic displacements; uses fault models as input.
  Coupled to mesh generation tools.
Regularized Dynamic Annealing Hidden Markov Method (RDAHMM) (JPL)
  Time series analysis code; can be applied to GPS and seismic archives.
  Identifies signal components (possibly associated with underlying physical causes) with no fixed parameters.
QuakeSim, Version 1: Application Web Service for wrapping a.out executables. Execution management service built with Apache Ant.
Reason to Revise: Services too coupled to portal; no simple WSDL programming interface; could not be used in workflow engines; not self-contained.
QuakeSim, Version 2: Give each code a proper service interface. Retain Apache Ant core but extend. Keep WSDL message structure simple (Strings, ints, doubles, URLs), wrapped as JavaBeans.

QuakeSim, Version 1: File Management Service.
Reason to Revise: Unnecessary, too coupled to Apache Axis 1.0.
QuakeSim, Version 2: HTTP GET, URLs.

QuakeSim, Version 1: Context Management Service manages persistent portal sessions using recursive XML structure.
Reason to Revise: Too slow (file system); didn’t scale; XML databases didn’t mature; Object-Relational Mappings (ORM) not efficient.
QuakeSim, Version 2: Using db4o; all services communicate with easily XML-serializable JavaBeans.

QuakeSim, Version 1: OGC-compatible map and data services.
Reason to Revise: Too complicated; ORM is a big overhead.
QuakeSim, Version 2: Google Maps, KML generating services.

QuakeSim, Version 1: Serial job submission.
Reason to Revise: NSF TeraGrid and Open Science Grid run full-time production Grids for HPC.
QuakeSim, Version 2: Condor-G/Birdbath based job management extensions to GeoFEST service.
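
The "keep the message structure simple" choice can be illustrated with a short sketch. This is not QuakeSim code: it uses a Python dataclass as a stand-in for a JavaBean, and the class name `DislocJobRequest` and its fields are hypothetical. It only shows why a flat message made of strings, ints, doubles and URLs is easy to serialize to XML and read back.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class DislocJobRequest:
    """Hypothetical request message; the field names are illustrative only."""
    user_name: str
    fault_count: int
    grid_spacing_km: float
    input_url: str


def to_xml(message) -> str:
    """Serialize a flat dataclass into one XML element per field."""
    root = ET.Element(type(message).__name__)
    for field in fields(message):
        child = ET.SubElement(root, field.name)
        child.text = str(getattr(message, field.name))
    return ET.tostring(root, encoding="unicode")


def from_xml(cls, text: str):
    """Rebuild the dataclass, converting each field back to its declared type."""
    root = ET.fromstring(text)
    values = {f.name: f.type(root.findtext(f.name)) for f in fields(cls)}
    return cls(**values)


if __name__ == "__main__":
    request = DislocJobRequest("alice", 3, 2.5, "http://example.org/faults.txt")
    print(to_xml(request))
```

Because every field is a simple type, the round trip needs no custom mapping layer, which is the property the Version 2 column is after.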
Daily RDAHMM Updates
TeraGrid Supercomputing Resources (GPIR)
Queue Prediction Service (QBETS)
Forecasts the time you will wait in the queue on various TG supercomputers. Inherited from the OGCE project.
GeoFEST Finite Element Modeling portlet and plotting tools
Disloc output converted to KML and plotted.
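
As a rough illustration of converting displacement output to KML, the sketch below writes one Placemark per point. The input tuple layout `(name, lon, lat, dx, dy, dz)` is an assumption made for this example and is not the actual Disloc output format; only the KML element names and the longitude,latitude,altitude coordinate order come from the KML standard.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def displacements_to_kml(points):
    """Build a KML document with one Placemark per (name, lon, lat, dx, dy, dz) tuple."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat, dx, dy, dz in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        # The displacement components go into the description balloon.
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = (
            f"dx={dx} dy={dy} dz={dz}"
        )
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are ordered longitude,latitude,altitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample point, not real model output.
    print(displacements_to_kml([("p1", -118.0, 34.0, 0.01, -0.02, 0.0)]))
```

The resulting string can be saved as a .kml file and opened in Google Earth or loaded as a map overlay.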
Web 2.0 for Science Gateways

Enterprise Approach                       | Web 2.0 Approach
JSR 168 Portlets                          | Gadgets, Widgets
Server-side integration and processing    | AJAX, client-side integration and processing, JavaScript
SOAP                                      | RSS, Atom, JSON
WSDL                                      | REST (GET, PUT, DELETE, POST)
Portlet Containers                        | Open Social Containers (Orkut, LinkedIn, Shindig); Facebook; StartPages
User Centric Gateways                     | Social Networking Portals
Workflow managers (Taverna, Kepler, etc.) | Mash-ups
Grid computing: Globus, Condor, etc.      | Cloud computing: Amazon WS Suite, Xen Virtualization
More Information
Email: [email protected]
QuakeSim Web Site: www.quakesim.org
Portal URL: http://gf7.ucs.indiana.edu:8080/gridsphere
Portal SourceForge Page: https://sourceforge.net/projects/crisisgrid
Code SVN: http://crisisgrid.svn.sourceforge.net/viewvc/crisisgrid/
Acknowledgments
QuakeSim work is funded by NASA AIST (A. Donnellan, PI) and ACCESS (Y. Bock, PI) programs.
Indiana University developers: Galip Aydin, Xiaoming Gao, Zhigang Qi
Robert Granat (JPL), Jay Parker (JPL), Maggi Glasscoe (JPL), John Rundle (UC-Davis), Harout Nazerian (JPL), Rami Al-Ghanmi (USC), Dennis Mcleod (USC), Paul Jamason (Scripps), Ruey-Juin Chang (Scripps), Gerry Simila (CSUN)
Grid Job Submission
Globus provides a universal queuing system interface.
  PBS, LoadLeveler, Sun Grid Engine, LSF
We chose Condor-G, from the University of Wisconsin, as our job management software for submitting jobs to HPC queuing systems.
  Works with Globus, Matlab DCE, Unicore, etc.
We co-locate Condor-G with our GeoFEST Web Service.
  Communication is through Birdbath, Condor’s Web Service interface.
  So the GeoFEST service API is more or less the same, just now Grid enabled.
We also plan to release a general version of this service.
  Condor command line and Birdbath have different names for job description parameters.
  Big Easter Egg hunt to find this, but now we know.
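
For readers unfamiliar with Condor-G, here is a minimal sketch of the command-line style job description it accepts for a Globus-fronted queue. The gatekeeper host, job manager and file names are placeholders, not QuakeSim settings, and the sketch deliberately uses only the command-line (submit file) parameter names; as noted above, Birdbath uses different names, and those are not reproduced here.

```python
def condor_g_submit_description(gatekeeper, jobmanager, executable, arguments=""):
    """Render a Condor-G submit description for a pre-WS Globus (gt2) resource.

    All argument values are supplied by the caller; nothing here is a real
    QuakeSim host or executable.
    """
    lines = [
        "universe = grid",
        f"grid_resource = gt2 {gatekeeper}/jobmanager-{jobmanager}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical gatekeeper and executable names, for illustration only.
    print(condor_g_submit_description("gatekeeper.example.org", "pbs",
                                      "geofest", "input.dat"))
```

The same description text works whether the remote scheduler is PBS, LSF or another system; only the job manager suffix changes, which is the "universal interface" point made on this slide.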
Portlet Summary
RDAHMM: Set up and run RDAHMM, query Scripps GRWS GPS Service, maintain persistent user sessions.
ST_Filter: Similar to RDAHMM portlet; ST_Filter has much more input.
Station Monitor: Shows GPS stations on a Google Map, displays last 10 minutes of data.
Real Time RDAHMM: Displays RDAHMM results of last 10 minutes of GPS data in a Google Map.
Daily RDAHMM: Calculates and updates RDAHMM event classifications with daily updated GPS data from SOPAC’s GRWS service (14-day delay, but uses all the data).
GeoFEST: Create input geometries, generate FE meshes, run parallel FEM solvers.
Disloc, Simplex: Calculate surface displacements from fault models.
Security Concerns
They’ll see the Big Board!
QuakeSim
Distributed Environment for Modeling Observations
Managing Real Time GPS Data
Slides from Galip Aydin
California Real Time Network
Network Data Rates by Message Format

Time       | RYO       | ASCII     | GML
CRTN GPS Site Positions (9 stations)
1 second   | 1.5 KB    | 4.03 KB   | 48.7 KB
1 hour     | 5.31 MB   | 14.18 MB  | 171.31 MB
1 day      | 127.44 MB | 340.38 MB | 4.01 GB
1 month    | 3.8 GB    | 9.97 GB   | 123.3 GB
1 year     | 45.8 GB   | 119.67 GB | 1.41 TB
Entire SCIGN Network (250 stations)
1 year     | 1.23 TB   | 16.18 TB  | 160 TB
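
The hourly and daily rows for the 9-station CRTN feed follow from the per-second figures by simple scaling. The sketch below redoes that arithmetic, assuming a constant rate and binary units (1 MB = 1024 KB); both are assumptions for this example. Because the per-second values in the table are rounded, the results agree with the table only to within about one percent.

```python
# Per-second message sizes in KB for the 9-station CRTN feed (from the table).
RATES_KB_PER_SECOND = {"RYO": 1.5, "ASCII": 4.03, "GML": 48.7}

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def volume_mb(kb_per_second, seconds):
    """Data volume in MB for a constant rate, assuming 1 MB = 1024 KB."""
    return kb_per_second * seconds / 1024.0


if __name__ == "__main__":
    for fmt, rate in RATES_KB_PER_SECOND.items():
        hourly = volume_mb(rate, SECONDS_PER_HOUR)
        daily = volume_mb(rate, SECONDS_PER_DAY)
        print(f"{fmt}: {hourly:.2f} MB per hour, {daily:.2f} MB per day")
```

The same function makes the cost of the verbose GML encoding easy to see: at the same sampling rate it produces roughly thirty times the volume of the binary RYO format.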
Continuous GPS Stations (CGPS) are depicted as triangles while the Real-Time stations are represented as circles. Image is obtained from SOPAC GPS Explorer at http://sopac.ucsd.edu/projects/realtime
How does one manage all the data generated by the 85 stations? How can you get just the data you want?
Note this is fundamentally different from traditional request/response style Web Services.
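
The difference from request/response can be sketched with a toy, in-memory topic broker. This is not the NaradaBrokering API; the class `TopicBroker` and the topic names are hypothetical. The point is only that a consumer registers interest in a topic once and is then pushed every matching message, rather than polling a service for data.

```python
from collections import defaultdict


class TopicBroker:
    """Toy in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback that receives every message published to topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Push the message to all subscribers of topic; return how many got it."""
        callbacks = list(self._subscribers.get(topic, ()))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    broker = TopicBroker()
    received = []
    # Hypothetical topic name: one station's position stream.
    broker.subscribe("crtn/station-1/positions", received.append)
    broker.publish("crtn/station-1/positions", {"t": 0, "x": 1.0})
    broker.publish("crtn/station-2/positions", {"t": 0, "x": 2.0})  # no subscriber
    print(received)
```

Selecting "just the data you want" then becomes a matter of choosing which topics to subscribe to.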
Processing Real-Time GPS Streams
[Figure: A Complete Sensor Message Processing Path, including a data analysis application. Labels: GPS Networks; Scripps RTD Server; RYO Ports 7010, 7011, 7012; Raw Data; NB Server.]
Application Integration with Real-Time Filters
Station Monitor Filter records real-time positions for 10 minutes and calculates position changes.
  Graph Plotter Application creates a visual representation of the positions.
RDAHMM Filter records real-time positions for 10 minutes and invokes the RDAHMM application, which determines state changes in the XYZ signal.
  Graph Plotter Application creates a visual representation of the RDAHMM output.
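
A filter of this kind can be sketched as a sliding time window over incoming samples. The slides do not say how the Station Monitor Filter computes position changes, so the sketch below assumes the simplest reading: the difference between the newest and oldest positions still inside a 10-minute window. The class name `PositionWindow` is hypothetical.

```python
from collections import deque


class PositionWindow:
    """Keep the samples from the last window_seconds and report net position change."""

    def __init__(self, window_seconds=600):
        self.window_seconds = window_seconds
        self._samples = deque()  # (time_seconds, x, y, z), oldest first

    def add(self, t, x, y, z):
        """Append a sample and drop any that have fallen out of the window."""
        self._samples.append((t, x, y, z))
        while t - self._samples[0][0] > self.window_seconds:
            self._samples.popleft()

    def position_change(self):
        """Return (dx, dy, dz) between oldest and newest sample, or None if < 2 samples."""
        if len(self._samples) < 2:
            return None
        _, x0, y0, z0 = self._samples[0]
        _, x1, y1, z1 = self._samples[-1]
        return (x1 - x0, y1 - y0, z1 - z0)


if __name__ == "__main__":
    window = PositionWindow()
    window.add(0, 0, 0, 0)
    window.add(300, 1, 2, 3)
    window.add(700, 2, 4, 6)  # the t=0 sample is now older than 600 s and is dropped
    print(window.position_change())
```

An RDAHMM-style filter would keep the same window but hand its contents to the analysis application instead of computing a simple difference.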
2 – Multiple Publishers Test
We add more GPS networks by running more publishers.
The results show that 1000 publishers can be supported with no performance loss. This is an operating system limit.
[Figure: publishers and topics, labeled Topic 1A, Topic 1B, Topic 2, ..., Topic n.]
4 – Multiple Brokers Test
NaradaBrokering allows creation of broker networks.
We create a two-broker network.
Messages published to first broker can be received from the second broker.
We take timings on each broker.
We connect 750 clients to each broker and run for 24 hours. We chose 750 clients to stay well below the saturation limit.
The results show that the performance is very good and similar to single broker test.
[Figure: two-broker network. Labels: NB Server 1; NB Server 2; Topic 1A; Topic 1B.]