Cyberistructure

Date post: 18-Jul-2015

University of Illinois at Urbana-Champaign National Center for Supercomputing Applications

Cyberenvironments @ NCSA

Supporting Community-scale Science

Jim Myers, Associate Director

Collaborative Technologies, NCSA

Beyond Cyberinfrastructure

• CyberInfrastructure commonly refers to infrastructure (networks, compute, and data resources) plus the middleware (grid) that links those resources together and presents them in a uniform standard way.

• CyberEnvironments is a term NCSA has coined to describe the complete End-to-End solution. This integrates Shared and Custom Cyberinfrastructure into a process-oriented framework for the community and researchers that allows them to focus on their research, not on accessing and managing the CI.

• A CyberCommunity is a distributed group of people (virtual organization) with common goals and shared knowledge. Size ranges from a few individuals to interdisciplinary or international groups. These groups can include researchers, policy makers, responders, educators, and citizens, and often have a long-term identity and purpose.

Cyberenvironments:

• Enable researchers to tackle more, and more complex, challenges, leading to
– Enhanced production of knowledge and
– Enhanced application of that knowledge to understanding our world, developing solutions, and making informed decisions

The Systems Science Revolution

• Research spans multiple disciplines/sub-disciplines
• Coordination through
– Community Resources
– Bi-directional flow/feedback of information
• Partial results being combined to produce new knowledge
• Experiment/Theory/Model comparisons
• Multiscale optimizations
• Rapid Evolution
• High Complexity
• Resources will be distributed
• With multiple curators

Supernova Cosmology Requires Complex, Widely Distributed Workflow Management

Slide from Bill Johnston, LBNL

End-to-end Scientific Progress is limited by the manual processes:

• Data discovery
• Translation
• Experiment setup
• Group coordination
• Tool integration
• Training

• Feature Extraction
• Data interpretation
• Acceptance of new models/tools
• Dissemination of best practices
• Interdisciplinary communication

Data production Processing power Data transfer/storage !

Round-Trip Information Logistics

• Desktop applications accessing remote resources

• Individuals publishing to communities and accessing reference information, best practices, etc.

• Unique capabilities linked into end-to-end community processes

• Inter-community connectivity

• Evolving at the speed of science

[Diagram labels: Individual; Desktop; Community; High Performance Resources; Unique capabilities; End-to-end processes]

Key Issues

• How do we build a system before the parts are done?
• How do we evolve the system to keep it current?
• How do we convey knowledge as well as tools to end users?
• How do we coordinate without centralizing?

• Technology Responses:
– Workflow
  • Ability to integrate independent web services
  • Ability to hide workflow behind applications
– Rich metadata
  • Tracking provenance
  • Context-based data discovery
  • Distributed data stores
  • Data translation/data virtualization
– Cyberenvironments
  • Engineering view of cutting-edge science
  • Collaboration capabilities
  • ‘Publication’ – exposing work to groups & the public
– Streams/Events/Feature Management
– Core Domain Services, e.g. GIS

NCSA Processes

• Analysis of science and engineering processes across many disciplines
• Identification of challenges and appropriate design responses
• Research/Technology Roadmaps
• Integrated project teams (IPTs) taking leadership roles within specific communities with strong partners to develop Cyberenvironments/CI
– Producing pilot/production capabilities
– Advancing technologies along roadmaps
• Backed by:
– 20 years of experience in user/community engagement
– Leadership roles in cutting edge Cyberenvironment projects in many disciplines
– Strong R&D efforts in Environments/Grid/Viz/Knowledge Discovery, ...
– Central role in national/global cyberinfrastructure definition/development

Combustion: a Multi-scale Chemical Science Challenge

• Want a systems-science approach to address complex problems
– New knowledge is assimilated from different data, tools, and disciplines at each scale
– Real-time bi-directional information flow
– Multiple applications for the same information
• But
– Normal publication is slow and lossy
– Data has different formats, hidden dependencies
– Standardization is hard to do up-front
– Multi-scale information is complex and its pedigree and context matter

Need lighter weight, flexible, adaptive mechanisms for sharing data

[Diagram labels: groups; communities]

CMCS Portal

• CHEF (Sakai precursor)
• SAM
– Basic data/metadata management
– Metadata extraction
– Data Translations
• Additional portlets
– Metadata view/search
– Provenance graph
– E-notebook
– Chemistry apps
• Email notifications

CMCS Pilot Science Groups

• DNS – Jackie Chen, David Leahy
– Feature detection & tracking in DNS data
• HCCI University Consortium – Bill Pitz
– Homogeneous Charge Compression Ignition
• PrIMe – led by Michael Frenklach
– Developing and publishing chemical reaction models
• Real Fuels Project – Wing Tsang, Tom Allison
– Lead real fuels chemistry at NIST
• IUPAC – led by Branko Ruscic
– Develop and publish validated thermochemical data
• Quantum Chemistry – Theresa Windus
– QM Reference data

Community Curation of Data: Quantum Chemistry Basis Sets

MAEViz Cyberenvironment: Consequence-Based Risk Management

Mid-America Earthquake Center

• Engineering View of MAE Center Research
• Portal-based Collaboration Environment
• Distributed data/metadata Sources
• Builds on NEESgrid technologies

[Screenshot: MAEViz (Memphis Test Bed) with menus File, Interventions, Inventory, Hazards, Vulnerability, Decision support, Interdependencies, Help. Consequence Table dialog: Earthquake Level: 5% PE in 50 years; Decision Option: Equivalent Cost Analysis. Scheme Comparison dialog: Scheme #1 (C2M Rebuild, C2L Rebuild, URML Rebuild); Scheme #2 (C2M Rehab LS, C2L Rehab LS, URML No Action). Consequence Comparison chart: Loss ($M), showing Life Loss and Dollar Loss for the alternatives No Action, Scheme #1, Scheme #2.]

[Figure panels: Social/Economic Impact and Limit State versus Input Motion Parameter, with input error margin and response error margin; map contours labeled 0.3g, 0.5g, 0.6g.]

[Diagram labels: Hazard Definition; Inventory Selection; Fragility Models; Damage Prediction; Decision Support]

NEESgrid UIUC

[Screenshot: NEESgrid UIUC portal at http://neespop.ce.uiuc.edu:9271/chef/portal/group/NEESgridUIUC/page/default.psml/js_pane/P-f16a0kkk; Narutoshi Nakata; Project Name: UIUC_ShakeTableExperiment]

Environmental Observatories

NCSA, including CAC, is involved in the development of CI for a number of environmental communities

• CUAHSI (Consortium of Universities for the Advancement of Hydrologic Sciences Inc.) for hydrology
• NEON (National Ecological Observatory Network) for ecology
• LOOKING (Laboratory for the Ocean Observatory Knowledge Integration Grid)
• CLEANER (Collaborative Large Scale Engineering Analysis Network for Environmental Research) for environmental engineering
• LTER (U.S. Long-Term Ecological Research Network) investigating ecological processes over long temporal and broad spatial scales

Long Term Ecological Research (LTER)

• Established 1980 (25 years)
• 26 Research Sites & 1 Support Site (LNO)
– North America
– Arctic/Antarctica
– Puerto Rico/Tahiti
• Five Core Areas of Study
– Primary Plant Production
– Organism Population Studies
– Movement of Organic Matter
– Movement of Inorganic Matter
– Disturbance Patterns
• Questions are being asked at the Regional, National, and Global scale

LTER Pilot Study

• Portal User Interface

• Single Sign-on

• Data Discovery

• Secure Data Staging

• Data Audit Trail

• Data Analysis via HPC system

Large Synoptic Survey Telescope (LSST)

• A new telescope located in Chile
o 8.4m dia. Mirror, 10 sq. degrees FOV
o 3 GPixel Camera
o Image available sky every 3 days
o First light: January 2012
• Science Mission: observe the time-varying sky
o Dark Energy and the accelerating universe
o Comprehensive census of Solar System objects
o Study optical transients
o Create a galactic map
• The LSST collaboration
o Currently about a dozen institutions, including 3 DOE labs
o Schedule:
  • D&D phase: 2004-2007 (funded by NSF grant, private money, in-kind contributions)
  • Construction: 2007-2012 (funded by NSF & DOE)
  • Operation: 2012-
o NCSA Team headed by Ray Plante: 4 FTEs from NCSA, 2 FTEs from UIUC, 3 FTEs from NSF

Data Generation Rate: 30 TB/night, 6 PB/year
Total Disk Storage: 18 PB
Nominal Computing required: 20+ Tflops
Site-to-archive network bandwidth: 2.5 Gbits/s
Processing latency for real time alerts: ~ 60 secs
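
The LSST figures above can be related to each other with a little arithmetic. The following is only a back-of-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), a link running at its full nominal rate, and no compression or protocol overhead, none of which the slide specifies. The function names are illustrative and not part of any LSST or NCSA software.

```python
# Back-of-envelope arithmetic on the LSST figures quoted on the slide.
# Assumption: decimal units (1 TB = 1e12 bytes, 1 PB = 1e15 bytes).

TB = 1e12  # bytes
PB = 1e15  # bytes

NIGHTLY_BYTES = 30 * TB      # data generation rate per night (from the slide)
YEARLY_BYTES = 6 * PB        # data generation rate per year (from the slide)
LINK_BITS_PER_S = 2.5e9      # site-to-archive bandwidth (from the slide)


def observing_nights_per_year(yearly: float, nightly: float) -> float:
    """Number of observing nights implied by the yearly and nightly volumes."""
    return yearly / nightly


def average_rate_gbits(yearly: float) -> float:
    """Yearly volume expressed as a constant rate in Gbit/s over 365 days."""
    seconds_per_year = 365 * 24 * 3600
    return yearly * 8 / seconds_per_year / 1e9


def transfer_hours(volume_bytes: float, bits_per_s: float) -> float:
    """Hours needed to move a volume over a link running at its full rate."""
    return volume_bytes * 8 / bits_per_s / 3600


if __name__ == "__main__":
    print("nights/year:", observing_nights_per_year(YEARLY_BYTES, NIGHTLY_BYTES))
    print("average Gbit/s:", round(average_rate_gbits(YEARLY_BYTES), 2))
    print("hours per night's data:", round(transfer_hours(NIGHTLY_BYTES, LINK_BITS_PER_S), 1))
```

Under these assumptions the yearly and nightly volumes imply about 200 observing nights per year, and the yearly volume averages out to a sustained rate below the quoted link bandwidth.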

LEAD

• Mesoscale weather is VERY DYNAMIC but our tools, cyber environments, research methodologies and learning modalities are VERY STATIC

• Getting even static capability is an enormous challenge due to the complexity of the tools and the primitive information technology infrastructures used to link them

Cyberenvironments Architecture Perspective

[Architecture diagram labels: Community CyberEnvironments; Community Knowledge Services; Collaborative Services; E-Science Services; Scientific Content/Process Mgmt Services; Orchestration; Core Services; Data Mgmt, Analytics, Visualization, Stream Mgmt; Applications Services (HPC, Instrument, Analysis, …); Security; Database; SMP; Mass Store; Network; Visualization systems; instruments; Sensor nets]

Key concepts

• Lightweight environment frameworks
– Portlet/plug-in models
– Contextualized collaboration capabilities
• Distributed Scientific Content & Process Mgmt / Semantics
– Tracking provenance
– Metadata
– Context-based data discovery, translation, virtualization
– Base for knowledge services
• Workflow/Services
– Ability to integrate independent web services, manage complexities of CI
– Application/process-oriented interface (Schema/ontology-driven)
• Visual Analytics
– Identification of features/patterns from one domain in terms of another…
• Streaming/steering/event-driven science
– Marshaling additional sensors for interesting phenomena
– On-demand simulation
• Living Cyberenvironments
– End-to-end, e.g. Engineering view of cutting-edge science
– Community managed/evolved
– Science lifecycle support – research, publication, curation, …
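
As an illustration of what "tracking provenance" through a workflow of independent steps can mean in practice, here is a minimal sketch. It is not the CMCS/SAM implementation or any NCSA API; `Record`, `fingerprint`, `run_step`, and `run_workflow` are hypothetical names, and plain Python functions stand in for the web services a real workflow would call.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Record:
    """A data item plus the chain of steps that produced it."""
    value: Any
    provenance: List[dict] = field(default_factory=list)


def fingerprint(value: Any) -> str:
    """Short content hash used to identify a step's input and output."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_step(name: str, func: Callable[[Any], Any], record: Record) -> Record:
    """Apply one step and append a provenance entry describing it."""
    result = func(record.value)
    entry = {
        "step": name,
        "input": fingerprint(record.value),
        "output": fingerprint(result),
    }
    return Record(value=result, provenance=record.provenance + [entry])


def run_workflow(steps: List[Tuple[str, Callable[[Any], Any]]], value: Any) -> Record:
    """Run the steps in order, carrying the provenance chain along."""
    record = Record(value=value)
    for name, func in steps:
        record = run_step(name, func, record)
    return record


if __name__ == "__main__":
    # Hypothetical two-step workflow: scale the readings, then total them.
    steps = [
        ("scale", lambda xs: [2 * x for x in xs]),
        ("total", lambda xs: sum(xs)),
    ]
    result = run_workflow(steps, [1, 2, 3])
    print(result.value)
    for entry in result.provenance:
        print(entry)
```

Because each entry stores a hash of its input and output, the output hash of one step matches the input hash of the next, which is the kind of linkage a provenance graph would display.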

Mosaic and Cyberenvironments

• Mosaic
– By the early 1990s, the internet had a wealth of resources, but they were inaccessible to most scientists
– Hyperlinking and document formatting did nothing new except lower the barriers to information access
• Cyberenvironments
– By the early 2000s, the internet and grid had a wealth of interactive resources, but they were inaccessible to most scientists
– Cyberenvironments will lower barriers to orchestrating these resources

SNAC: My Position Statement

• Cyberenvironments have unsolved issues
– How do we discover data, services, best practices without hierarchical management?
  • Organization → virtual organizations
  • Disciplines → system science
– How do we structure large systems projects so they succeed?
  • Can we identify communities who are ‘cyber-ready’?
  • Can we suggest technologies based on community structure?

SNAC: My Position Statement (2)

• Cyberenvironments will be a rich resource for network research
– Computer mediated communication
– Workflow
– E-notebooks/annotation services
– Computer mediated model translation