Date post: | 18-Jul-2015 |
Category: |
Technology |
Upload: | lab-southwest |
University of Illinois at Urbana-Champaign National Center for Supercomputing Applications
Cyberenvironments @ NCSA
Supporting Community-scale Science
Jim Myers, Associate Director
Collaborative Technologies, NCSA
Beyond Cyberinfrastructure
• CyberInfrastructure commonly refers to infrastructure (networks, compute, and data resources) plus the middleware (grid) that links those resources together and presents them in a uniform standard way.
• CyberEnvironments is a term NCSA has coined to describe the complete end-to-end solution. It integrates shared and custom cyberinfrastructure into a process-oriented framework that allows communities and researchers to focus on their research, not on accessing and managing the CI.
• A CyberCommunity is a distributed group of people (a virtual organization) with common goals and shared knowledge. Size ranges from a few individuals to interdisciplinary or international groups. These groups can include researchers, policy makers, responders, educators, and citizens, and often have a long-term identity and purpose.
Cyberenvironments:
• Enable researchers to tackle more, and more complex, challenges, leading to
  – Enhanced production of knowledge and
  – Enhanced application of that knowledge to understanding our world, developing solutions, and making informed decisions
The Systems Science Revolution
• Research spans multiple disciplines/sub-disciplines
• Coordination through
  – Community Resources
  – Bi-directional flow/feedback of information
    • Partial results being combined to produce new knowledge
    • Experiment/Theory/Model comparisons
    • Multiscale optimizations
• Rapid Evolution
• High Complexity
• Resources will be distributed
• With multiple curators
Supernova Cosmology Requires Complex, Widely Distributed Workflow Management
Slide from Bill Johnston, LBNL
End-to-end Scientific Progress is limited by the manual processes:
• Data discovery
• Translation
• Experiment setup
• Group coordination
• Tool integration
• Training
• Feature Extraction
• Data interpretation
• Acceptance of new models/tools
• Dissemination of best practices
• Interdisciplinary communication
Data production Processing power Data transfer/storage !
Round-Trip Information Logistics
• Desktop applications accessing remote resources
• Individuals publishing to communities and accessing reference information, best practices, etc.
• Unique capabilities linked into end-to-end community processes
• Inter-community connectivity
• Evolving at the speed of science
Individual
Unique capabilities
High Performance Resources
Desktop Community
End-to-end processes
Key Issues
• How do we build a system before the parts are done?
• How do we evolve the system to keep it current?
• How do we convey knowledge as well as tools to end users?
• How do we coordinate without centralizing?
• Technology Responses:
  – Workflow
    • Ability to integrate independent web services
    • Ability to hide workflow behind applications
  – Rich metadata
    • Tracking provenance
    • Context-based data discovery
    • Distributed data stores
    • Data translation/data virtualization
  – Cyberenvironments
    • Engineering view of cutting-edge science
    • Collaboration capabilities
    • ‘Publication’ – exposing work to groups & the public
  – Streams/Events/Feature Management
  – Core Domain Services, e.g. GIS
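The workflow and provenance responses listed above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Workflow` and `ProvenanceRecord` classes, the two stand-in "services" (plain local functions where a real deployment would call independent web services), and the `analyze` wrapper that hides the workflow behind an application-style call. It is not the design used by any NCSA project.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, List, Tuple


@dataclass
class ProvenanceRecord:
    """One audit-trail entry: which step turned which input into which output."""
    step: str
    input_value: Any
    output_value: Any
    timestamp: str


@dataclass
class Workflow:
    """Chains independent services and records provenance for every step."""
    steps: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)
    provenance: List[ProvenanceRecord] = field(default_factory=list)

    def add(self, name: str, service: Callable[[Any], Any]) -> "Workflow":
        self.steps.append((name, service))
        return self

    def run(self, data: Any) -> Any:
        for name, service in self.steps:
            result = service(data)
            stamp = datetime.now(timezone.utc).isoformat()
            self.provenance.append(ProvenanceRecord(name, data, result, stamp))
            data = result  # the output of one step feeds the next
        return data


# Hypothetical stand-ins for remote services (local functions in this sketch).
def translate_to_kelvin(celsius: List[float]) -> List[float]:
    return [c + 273.15 for c in celsius]


def extract_peak(values: List[float]) -> float:
    return max(values)


def analyze(readings: List[float]) -> Tuple[float, List[ProvenanceRecord]]:
    """Application-level entry point: the caller never sees the workflow."""
    wf = Workflow()
    wf.add("translate_to_kelvin", translate_to_kelvin)
    wf.add("extract_peak", extract_peak)
    return wf.run(readings), wf.provenance


if __name__ == "__main__":
    peak, trail = analyze([20.0, 25.0])
    print("peak (K):", peak)
    for record in trail:
        print(record.step, record.input_value, "->", record.output_value)
```

Because each record stores both the input and the output of a step, the trail can be walked backwards from a result to the raw data it came from, which is the kind of question a provenance graph is meant to answer.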
NCSA Processes
• Analysis of science and engineering processes across many disciplines
• Identification of challenges and appropriate design responses
• Research/Technology Roadmaps
• Integrated project teams (IPTs) taking leadership roles within specific communities with strong partners to develop Cyberenvironments/CI
  – Producing pilot/production capabilities
  – Advancing technologies along roadmaps
• Backed by:
  – 20 years of experience in user/community engagement
  – Leadership roles in cutting-edge Cyberenvironment projects in many disciplines
  – Strong R&D efforts in Environments/Grid/Viz/Knowledge Discovery, ...
  – Central role in national/global cyberinfrastructure definition/development
• Want a systems-science approach to address complex problems
  – New knowledge is assimilated from different data, tools, and disciplines at each scale
  – Real-time bi-directional information flow
  – Multiple applications for the same information
• But
  – Normal publication is slow and lossy
  – Data has different formats, hidden dependencies
  – Standardization is hard to do up-front
  – Multi-scale information is complex, and its pedigree and context matter
Need lighter weight, flexible, adaptive mechanisms for sharing data
groups communities
Combustion: a Multi-scale Chemical Science Challenge
CMCS Portal
• CHEF (Sakai precursor)
• SAM
  – Basic data/metadata management
  – Metadata extraction
  – Data Translations
• Additional portlets
  – Metadata view/search
  – Provenance graph
  – E-notebook
  – Chemistry apps
• Email notifications
CMCS Pilot Science Groups
• DNS – Jackie Chen, David Leahy
  – Feature detection & tracking in DNS data
• HCCI University Consortium – Bill Pitz
  – Homogeneous Charge Compression Ignition
• PrIMe – led by Michael Frenklach
  – Development and publishing chemical reaction models
• Real Fuels Project – Wing Tsang, Tom Allison
  – Lead real fuels chemistry at NIST
• IUPAC – led by Branko Ruscic
  – Develop and publish validated thermochemical data
• Quantum Chemistry – Theresa Windus
  – QM Reference data
Community Curation of Data: Quantum Chemistry Basis Sets
MAEViz Cyberenvironment: Consequence-Based Risk Management
Mid-America Earthquake Center
[Screenshot: MAEviz – [Memphis Test Bed] decision-support window. Menus: File, Interventions, Inventory, Hazards, Vulnerability, Decision support, Interdependencies, Help. Earthquake Level: 5% PE in 50 years. Decision Option: Equivalent Cost Analysis. Scheme Comparison: Scheme #1 (C2M Rebuild, C2L Rebuild, URML Rebuild) and Scheme #2 (C2M Rehab LS, C2L Rehab LS, URML No Action). Chart "Consequence Comparison": Life Loss and Dollar Loss, Loss ($M) from 0 to 100, for the alternatives No Action, Scheme #1, Scheme #2. Diagram axes: Input Motion Parameter, Limit State, Social/Economic Impact, with input and response error margins. Map legend: 0.3g, 0.5g, 0.6g.]
• Engineering View of MAE Center Research
• Portal-based Collaboration Environment
• Distributed data/metadata Sources
• Builds on NEESgrid technologies
Hazard Definition
Inventory Selection
Fragility Models
Damage Prediction
Decision Support
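The five stages named above (hazard definition, inventory selection, fragility models, damage prediction, decision support) can be sketched as a small calculation. This is a toy illustration under loud assumptions, not MAEViz's actual models: the lognormal fragility form is a common textbook choice, and every number (building counts, values, median capacities, intervention effects and costs) is hypothetical. Only the structure-type labels, the scheme definitions, and the 0.3g to 0.6g hazard range are taken from the slide. Damage is treated as total loss of the building's value, which is a deliberate simplification.

```python
import math


def fragility(pga_g: float, median_g: float, beta: float = 0.6) -> float:
    """Lognormal fragility curve: P(damage | peak ground acceleration)."""
    z = math.log(pga_g / median_g) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Inventory selection (hypothetical): type -> (building count, value in $M each).
INVENTORY = {"C2M": (10, 2.0), "C2L": (20, 1.0), "URML": (50, 0.4)}

# Fragility models (hypothetical): median PGA capacity in g per structure type.
BASE_MEDIAN_G = {"C2M": 0.50, "C2L": 0.45, "URML": 0.30}

# Interventions (hypothetical): name -> (capacity multiplier, cost / value).
INTERVENTIONS = {
    "No Action": (1.0, 0.0),
    "Rehab LS": (1.5, 0.2),
    "Rebuild": (2.5, 1.0),
}

# Scheme definitions as listed on the slide.
SCHEMES = {
    "No Action": {"C2M": "No Action", "C2L": "No Action", "URML": "No Action"},
    "Scheme #1": {"C2M": "Rebuild", "C2L": "Rebuild", "URML": "Rebuild"},
    "Scheme #2": {"C2M": "Rehab LS", "C2L": "Rehab LS", "URML": "No Action"},
}


def evaluate(scheme: dict, pga_g: float) -> tuple:
    """Damage prediction: return (expected damage loss, intervention cost) in $M."""
    damage_loss = 0.0
    intervention_cost = 0.0
    for structure_type, (count, value) in INVENTORY.items():
        multiplier, cost_fraction = INTERVENTIONS[scheme[structure_type]]
        p_damage = fragility(pga_g, BASE_MEDIAN_G[structure_type] * multiplier)
        damage_loss += count * value * p_damage
        intervention_cost += count * value * cost_fraction
    return damage_loss, intervention_cost


def compare_schemes(pga_g: float) -> dict:
    """Decision support: evaluate every scheme for one hazard level."""
    return {name: evaluate(scheme, pga_g) for name, scheme in SCHEMES.items()}


if __name__ == "__main__":
    for name, (loss, cost) in compare_schemes(pga_g=0.5).items():
        print(f"{name}: expected loss ${loss:.1f}M, intervention cost ${cost:.1f}M")
```

Comparing expected loss against intervention cost per scheme is the kind of trade-off the "Equivalent Cost Analysis" option in the screenshot appears to present, though the weighting MAEViz applies is not given in the slides.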
NEESgrid UIUC
http://neespop.ce.uiuc.edu:9271/chef/portal/group/NEESgridUIUC/page/default.psml/js_pane/P-f16a0kkk
Narutoshi Nakata
Project Name: UIUC_ShakeTableExperiment
Environmental Observatories
NCSA, including CAC, is involved in the development of CI for a number of environmental communities:
• CUAHSI (Consortium of Universities for the Advancement of Hydrologic Sciences Inc.) for hydrology
• NEON (National Ecological Observatory Network) for ecology
• LOOKING (Laboratory for the Ocean Observatory Knowledge Integration Grid)
• CLEANER (Collaborative Large Scale Engineering Analysis Network for Environmental Research) for environmental engineering
• LTER (U.S. Long-Term Ecological Research Network), investigating ecological processes over long temporal and broad spatial scales
Long Term Ecological Research (LTER)
• Established 1980 (25 years)
• 26 Research Sites & 1 Support Site (LNO)
  – North America
  – Arctic/Antarctica
  – Puerto Rico/Tahiti
• Five Core Areas of Study
  – Primary Plant Production
  – Organism Population Studies
  – Movement of Organic Matter
  – Movement of Inorganic Matter
  – Disturbance Patterns
• Questions are being asked at the Regional, National, and Global scale
LTER Pilot Study
• Portal User Interface
• Single Signon
• Data Discovery
• Secure Data Staging
• Data Audit Trail
• Data Analysis via HPC system
Large Synoptic Survey Telescope (LSST)
• A new telescope located in Chile
  o 8.4 m dia. mirror, 10 sq. degrees FOV
  o 3 GPixel camera
  o Image available sky every 3 days
  o First light: January 2012
• Science Mission: observe the time-varying sky
  o Dark Energy and the accelerating universe
  o Comprehensive census of Solar System objects
  o Study optical transients
  o Create a galactic map
• The LSST collaboration
  o Currently about a dozen institutions, including 3 DOE labs
  o Schedule:
    • D&D phase: 2004-2007 (funded by NSF grant, private money, in-kind contributions)
    • Construction: 2007-2012 (funded by NSF & DOE)
    • Operation: 2012-
  o NCSA Team headed by Ray Plante: 4 FTEs from NCSA, 2 FTEs from UIUC, 3 FTEs from NSF
Data Generation Rate: 30 TB/night, 6 PB/year
Total Disk Storage: 18 PB
Nominal Computing required: 20+ Tflops
Site-to-archive network bandwidth: 2.5 Gbits/s
Processing latency for real time alerts: ~ 60 secs
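A quick consistency check of the data-rate figures above, as a sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and treats the quoted numbers as raw, uncompressed volumes; neither assumption is stated on the slide, and the function names are made up for illustration.

```python
TB = 10**12  # assumed decimal terabyte
PB = 10**15  # assumed decimal petabyte
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365 * 24 * SECONDS_PER_HOUR

NIGHTLY_BYTES = 30 * TB          # "30 TB/night"
YEARLY_BYTES = 6 * PB            # "6 PB/year"
LINK_BITS_PER_SECOND = 2.5e9     # "2.5 Gbits/s" site-to-archive


def observing_nights_per_year() -> float:
    """Nights per year implied by the nightly and yearly volumes."""
    return YEARLY_BYTES / NIGHTLY_BYTES


def hours_to_ship_one_night() -> float:
    """Time to move one night of data over the site-to-archive link."""
    return NIGHTLY_BYTES * 8 / LINK_BITS_PER_SECOND / SECONDS_PER_HOUR


def average_yearly_rate_gbps() -> float:
    """Sustained rate needed to move a year of data in a year."""
    return YEARLY_BYTES * 8 / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    print("implied observing nights per year:", observing_nights_per_year())
    print("hours to ship one night:", round(hours_to_ship_one_night(), 1))
    print("average rate over a year (Gbit/s):", round(average_yearly_rate_gbps(), 2))
```

Under these assumptions the figures imply about 200 observing nights per year, roughly 26.7 hours to ship a single night's data at the full link rate, and an average of about 1.5 Gbit/s over the year. The link therefore covers the yearly average but leaves little headroom after a run of consecutive observing nights.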
LEAD
• Mesoscale weather is VERY DYNAMIC but our tools, cyber environments, research methodologies and learning modalities are VERY STATIC
• Getting even static capability is an enormous challenge due to the complexity of the tools and the primitive information technology infrastructures used to link them
Cyberenvironments Architecture Perspective
Community CyberEnvironments
Security
Database
SMP Mass Store
Network
Visualization systems
Applications Services (HPC, Instrument, Analysis, …)
Core Services
Orchestration
Scientific Content/Process Mgmt Services
Collaborative Services
E-Science Services
Data Mgmt Analytics Visualization Stream Mgmt
Community Knowledge Services
Instruments, Sensor nets
Key concepts
• Lightweight environment frameworks
  – Portlet/plug-in models
  – Contextualized collaboration capabilities
• Distributed Scientific Content & Process Mgmt / Semantics
  – Tracking provenance
  – Metadata: context-based data discovery, translation, virtualization
  – Base for knowledge services
• Workflow/Services
  – Ability to integrate independent web services, manage complexities of CI
  – Application/process-oriented interface (Schema/ontology-driven)
• Visual Analytics
  – Identification of features/patterns from one domain in terms of another…
• Streaming/steering/event-driven science
  – Marshaling additional sensors for interesting phenomena
  – On-demand simulation
• Living Cyberenvironments
  – End-to-end, e.g. Engineering view of cutting-edge science
  – Community managed/evolved
  – Science lifecycle support – research, publication, curation, …
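The "event-driven science" item above (marshaling extra sensors or launching a simulation when something interesting shows up in a stream) can be sketched with a rolling-window detector. The `EventDetector` class, its window size, the three-sigma rule, and the callback are all hypothetical choices for illustration; a real deployment would sit on a streaming or messaging service rather than a Python list.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Callable, Iterable, List, Optional


class EventDetector:
    """Flags readings that deviate strongly from the recent rolling window."""

    def __init__(self, window: int = 5, threshold_sigmas: float = 3.0,
                 on_event: Optional[Callable[[float], None]] = None) -> None:
        self.history = deque(maxlen=window)
        self.threshold_sigmas = threshold_sigmas
        self.on_event = on_event

    def observe(self, value: float) -> bool:
        """Process one reading; return True if it triggered an event."""
        triggered = False
        if len(self.history) == self.history.maxlen:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0 and abs(value - mu) > self.threshold_sigmas * sigma:
                triggered = True
                if self.on_event is not None:
                    # Hypothetical hook: request more sensors or a simulation.
                    self.on_event(value)
        self.history.append(value)
        return triggered


def run_stream(readings: Iterable[float]) -> List[float]:
    """Feed a stream through the detector; return the readings that fired."""
    requests: List[float] = []
    detector = EventDetector(on_event=requests.append)
    for reading in readings:
        detector.observe(reading)
    return requests


if __name__ == "__main__":
    stream = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 25.0, 10.0]
    print("readings that triggered follow-up:", run_stream(stream))
```

The callback is the extension point: swapping `requests.append` for a function that submits a simulation job or re-tasks a sensor turns the same detector into the steering loop the slide describes.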
Cyberenvironments
Mosaic and Cyberenvironments
• Mosaic
  – By the early 1990s, the internet had a wealth of resources, but they were inaccessible to most scientists
  – Hyperlinking and document formatting did nothing new except lower the barriers to information access
• Cyberenvironments
  – By the early 2000s, the internet and grid had a wealth of interactive resources, but they were inaccessible to most scientists
  – Cyberenvironments will lower barriers to orchestrating these resources
SNAC: My Position Statement
• Cyberenvironments have unsolved issues
  – How do we discover data, services, best practices without hierarchical management?
    • Organization virtual organizations
    • Disciplines system science
  – How do we structure large systems projects so they succeed?
    • Can we identify communities who are ‘cyber-ready’?
    • Can we suggest technologies based on community structure?