Tieniu TAN Deputy Secretary-General Chinese Academy of Sciences (CAS) 29 Mar. 2010, Irvine, USA

Date post: 03-Jan-2016
The 4th China-US Roundtable on Scientific Data Cooperation: Advanced Cyber-infrastructure for Scientific Data Applications in CAS.
Page 1

Tieniu TAN

Deputy Secretary-General

Chinese Academy of Sciences (CAS)

29 Mar. 2010, Irvine, USA

The 4th China-US Roundtable on Scientific Data Cooperation

Advanced Cyber-infrastructure for Scientific Data Applications in CAS

Page 2

Outline

Background

Advanced Cyber-Infrastructure in CAS

Typical Data Intensive e-Science Applications in CAS

Conclusion

Page 3

Scientific Data Deluge

Scientists face a data deluge
– Vast volume of scientific data captured by large scientific facilities, ubiquitous sensors, new instruments and computer models

Science and engineering research has become increasingly data-intensive
– New scientific opportunities are emerging from increasingly effective data organization, access and usage (NSF, 2007)

Page 4

Data-intensive scientific discovery: e-Science

The fourth paradigm: data-intensive scientific discovery (Microsoft, 2009)
– A Transformed Scientific Method

e-Science is the synthesis of information technology and science, giving priority to the scientific data lifecycle and data exploration (Jim Gray)
– Data are captured by instruments or generated by simulators; processed by software; the resulting information/knowledge is stored in computers; scientists analyze databases/files using data management and statistics

Page 5

China National Scientific Data Sharing Initiatives

The Ministry of Science and Technology (MOST) started implementing the Scientific Data Sharing Program (SDSP) in 2002
– Supporting almost 20 projects to promote scientific data sharing

The National Science & Technology Infrastructure (NSTI) was launched in 2005 by MOST and the Ministry of Finance (Http://www.escience.gov.cn)
– Supporting 38 projects for promoting Science and Technology Resources, data and information sharing and Open Access
– Total funding ~2 billion RMB

Page 6

Advanced CI for Data Lifecycle in CAS

[Figure: data lifecycle stages linked by information streams: Generation & Collection; Transmission; Storage & Curation; Computing & Analysis; Application]

Data
– 1. Field observation stations
– 2. Large scientific facilities
– 3. Others

High Speed Network
– CSTNET
– CSTNET-CNGI
– GLORIAD

Data Centers
– Storage & preservation
– Curation
– Sharing and Service

Supercomputing Grid
– Computing
– Analysis
– Mining
– Visualization

Data intensive e-Science Applications

Page 7

Data generation

Large scientific facilities produce huge volumes of data
– 20+ in operation
– 20+ under construction

Long-term field observation stations
– 100+ stations covering Ecology, Environment, Space, etc.

Other research data, including experiments, modeling, computing, etc.
– 100 institutes, more than 50000 researchers in CAS

Page 8

Network Field Observation

Network expanded to link field observations

– Real Time Data Collection

Chinese Ecosystem Research Network (CERN)

Disaster and Environment Observation

Astronomy and space observation

Page 9

Meridian Space Weather Monitoring Program

More than 10TB of data will be generated and transmitted to Beijing per year

Data analysis needs 20 Tflops

A data system and processing infrastructure is being built
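To put the 10TB per year figure in network terms, the following minimal sketch converts a yearly data volume into the average sustained rate it implies. It assumes decimal units (1 TB = 10^12 bytes), a 365-day year, and a perfectly even flow; the function name `sustained_rate_mbps` is illustrative and does not come from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def sustained_rate_mbps(terabytes_per_year: float) -> float:
    """Average rate in megabits per second needed to move a yearly volume.

    Assumes decimal units (1 TB = 10**12 bytes) and a constant flow
    throughout the year; real traffic would be burstier.
    """
    bits_per_year = terabytes_per_year * 10**12 * 8
    return bits_per_year / SECONDS_PER_YEAR / 10**6


if __name__ == "__main__":
    # 10 TB per year is the lower bound quoted on the slide.
    print(f"{sustained_rate_mbps(10):.2f} Mb/s")
```

Under these assumptions the yearly volume corresponds to an average of only a few megabits per second, so peak rates and real-time requirements, which the slide does not state, would matter more for link sizing than the yearly total.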

Page 10

Cosmic-ray observatory: ARGO/AS

Cosmic-ray observatory at Yangbajing in Tibet:
– ARGO: China-Italy
– AS: China-Japan

~200TB raw data per year.

Data transferred from YBJ-ARGO and processed at IHEP and INFN

Reconstructed (rec.) data accessible by collaborators.

Page 11

BEPCII / BESIII

BEPC: Beijing Electron-Positron Collider
– Upgrade: BEPCII/BESIII, operational in 2008
– 2.0 ~ 4.6 GeV/C
– (3~10)×10^32 cm^-2 s^-1
– 36 institutions from China, US, Germany, Russia, and Japan
– 4000+ KSI2K for data processing and physics analysis
– 5+ PB in five years

Page 12

Data Transmission: High Speed Network

China Science and Technology Network (CSTNet)

Non-profit academic and research network in China that supports advanced science applications and research on the next generation Internet

Connects some 200 institutes and 1,000,000 end users

Page 13

[Figure: CSTNET Backbone map]

Link speeds in legend: 2.5Gb/s; 155Mb/s; < 155Mb/s; 1Gb/s (labelled at HongKong)

Nodes: Beijing, Lanzhou, Xinjiang, Xian, Shenyang, Changchun, Chengdu, Kunming, Wuhan, Guangzhou, Shanghai, Hefei, Lasa, Qingdao, Haerbin, Xining, Dalian, Guiyang, Yangbajing, Xishuangbanna, Changsha, TianJin, HongKong, Taiwan, Shenzhen, Fuzhou, Ningbo, Nanjing, Shanxi, Shijiazhuang
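The difference between the backbone link classes (2.5Gb/s and 155Mb/s) can be made concrete with a rough transfer-time estimate. The sketch below is an illustration only: it uses the ~200TB yearly raw data volume quoted for ARGO/AS as a sample payload, assumes decimal units, and treats the `efficiency` parameter (fraction of nominal capacity actually achieved) as a hypothetical knob. The slides do not say which link carries which data.

```python
def transfer_days(terabytes: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move `terabytes` over a link of `link_mbps` megabits/s.

    Assumes 1 TB = 10**12 bytes and a dedicated link; `efficiency` scales
    the nominal capacity (1.0 = full line rate, an idealisation).
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and 0 < efficiency <= 1")
    bits = terabytes * 10**12 * 8
    seconds = bits / (link_mbps * 10**6 * efficiency)
    return seconds / 86400


if __name__ == "__main__":
    for label, mbps in [("155Mb/s", 155), ("2.5Gb/s", 2500)]:
        print(f"200 TB over {label}: {transfer_days(200, mbps):.1f} days")
```

At full line rate the same payload takes roughly four months on a 155Mb/s link and about a week on a 2.5Gb/s link, which illustrates why remote sites on the slower tier are the tighter constraint for data-intensive facilities.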

Page 14

Interconnecting with other Networks

[Figure: CSTNET interconnection diagram, centred on Beijing and Hongkong]

Peers shown: Russia, Netherlands, USA, KISTI Korea, NICT Japan, AS Hongkong, GOOGLE Hongkong, HKIX Hongkong, CUHK Hongkong, China169 (China Unicom), ChinaNet (TELECOM), CERNET, HKOEP, Gloriad, BJ NAP, Internet

Link capacities shown: 10G, 2.5G, 2G, 1G, 700M, 155M

Page 15

CSTNET-CNGI

An IPv6 network for science based on CSTNET will start to be built this year

100+ Institutes

40+ Field stations and big science facilities

Computing facilities and storage facilities

[Figure: CSTNET-CNGI map with 10Gbps links and a 10G international link. Nodes: Beijing, Shanghai, Jilin, Liaoning, Guangzhou, Lanzhou, XinJiang, Chengdu, XI’AN, Kunming, WuHan, Hefei, Nanjing]

Page 16

Data Storage and Curation

A General Scientific Data Center
– Common data infrastructure construction, operation
– Data archive and preservation

Some domain specific scientific data centers
– Discipline data curation and sharing service

A CAS scientific data app project
– Multi-discipline data sharing and applications

A series of domain-based scientific data sharing systems and institute level data sharing infrastructure

Page 17

Data Resource Center

A General Scientific Data Center

A new organization responsible for data preservation, curation and access service in CAS

[Figure: Data Resource Center]
– Services: mass data backup; data online service; mass data analysis and processing; long-term preservation of important data; technology service
– Other labels: network; storage space; system environment; application service; mass data; management system; collaborators; staff

Page 18

Massive Storage System in Data Resource Center

Massive Storage System
– Scientific data archive system (5PB tape)
– Online data storage system (1PB disk array)

Internet-based service (Cloud Service)
– Data backup
– Archiving and curation
– On-line data access and analysis

Page 19

Domain Specific Scientific Data Centers

World Data Center (World Data System) in CAS

– Natural Resource Environment Data Center

– Astronomy Data Center

– Space Data Center

– Geophysics Data Center

– Glacier and Frozen Earth Data Center

Page 20

Scientific Databases (SDB)

A long-term mission started in 1986 and funded by CAS
– Data from research, for research

Collecting multi-discipline research data and promoting data sharing
– More than 350 research databases and 400 datasets by 61 institutes
– Over 60TB of data available for open access and download

http://www.csdb.cn

Page 21

Scientific Databases (cont.)

8 Resource databases

– Geo-Science

– Biodiversity

– Chemistry

– Astronomy

– Space Science

– Micro biology and virus

– Material science

– Environment

2 Reference databases

– China Species

– Compound

4 Application-Oriented databases

– High Energy (ITER)

– Western Environment Research

– Ecology research

– Qinghai Lake Research

Page 22

CAS Scientific Data Grid

Integrating distributed scientific data into a comprehensive service and application environment

Linking all data centers as a data net

[Figure: Scientific Data Grid layers]
– Scientific Data and databases
– Scientific Data Grid Middleware
– Scientific Data Grid Applications: Bioscience Gateway, Geosciences Gateway, Chemistry Gateway, Other Gateways

Page 23

Scientific Computing Grid

[Figure: Super Computing Grid architecture. Labels: Access through network; Local/Remote User; Resource Abstracting; Cooperation; Resource Interconnection; Other network resource and environment; Database, e-Science, ARP, website, science, TRP; CNGRID & environment; Uniform Regulations; Application service and technical supporting system; Uniform system operating, supporting & service]

Computing capacity
– SCCAS: 120+ Tflops computing capacity
– 8+ Branches: 50 Tflops common computing capacity
– Institute Computing Resource: 50 Tflops common computing capacity
– Lenovo 7000, Peak: 143 TeraFLOPS

Page 24

Scientific Computing Grid

HPC, Cluster, Workstation, Storage

Windows / Linux Clients

Web Portal

Grid Middleware

Users Administrator

Page 25

HEP Grid in China

Access to the LHC data for scientific research: A grid computing system is built in CAS

WLCG MoU signed with CERN in 2006 to build a Tier-2 center at IHEP for both the ATLAS and CMS experiments.

[Figure labels: IHEP, PKU, SDU, USTC, NJU]

Page 26

Tier-2 site at IHEP

WLCG site based on EGEE/gLite

Associated with CC-IN2P3 in Lyon

Worker nodes with 1600 cores

400 TB disk space

Page 27

Typical data intensive e-Science Applications

Developing a series of pilot e-Science applications
– Most are data intensive

Page 28

HEP Grid Applications: ATLAS MC Study

[Figure panels: Pt>20 GeV/c Tracks; ttH(2l2b4j2) full simulation event display; ttH-2L selection; ttbar mimic to ttHWW]

Page 29

HEP Grid application: protein prediction

Explore the non-natural protein sequence space

Set up a massive protein structure prediction environment

Develop web tools for the biology community

Result of EUChinaGrid project (EU FP6 project)

[Figure labels: Rosetta; Early/Late Stage]

KWCWPFASHNDLKVQSQWYVEPPDTIPPYNKYGTNFIKHCQYIAHMQGDTHFFNRVRMHQLWKIIVDCAY

Page 30

ChinaFLUX

Built in 2002 for climate change and environment research

Page 31

ChinaFLUX e-Science Environment

[Figure components: Observation system; Data transmission; Data System; Modeling and visualization]

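Page 32

Cyberinfrastructure for data collection

Real data flow from sensors to field stations, then to institutes, and finally to data centers for processing and sharing

[Figure: ChinaFLUX data collection network]
– Synthesis research center: servers, visualization display, storage devices (archive, online storage)
– Field station: flux tower, cameras, wireless transmission equipment, server
– Connectivity: next-generation Internet, short-range wireless links, long-range wireless links, Internet access
– Software tools: instrument status monitoring and anomaly alerts, data management, data processing, ChinaFlux data service portal
– Beijing Ecological Network Synthesis Center, linked by VPN to the stations: Changbaishan, Inner Mongolia, Yucheng, Qianyanzhou, Dinghushan, Ailaoshan, Dangxiong, Haibei
– Base: Internet, concentrator/switch, sensors, wireless network terminal receivers
– Monitoring points: distributed sensor monitoring points, centralized sensor monitoring points, wireless sensor network monitoring points (wireless network sensors)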
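The hop-by-hop flow on this slide (sensors, field stations, institutes, data centers) can be sketched as a small staged pipeline. Everything in the sketch is hypothetical: the `Reading` type, the function names, and the choice of a count-and-mean summary at the institute stage are illustrative assumptions, not a description of the ChinaFLUX software.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Reading(NamedTuple):
    station: str  # hypothetical field-station identifier
    value: float  # hypothetical sensor measurement


def collect_at_stations(readings: Iterable[Reading]) -> Dict[str, List[float]]:
    """Hop 1: group raw sensor readings by the field station receiving them."""
    by_station: Dict[str, List[float]] = defaultdict(list)
    for reading in readings:
        by_station[reading.station].append(reading.value)
    return dict(by_station)


def summarise_at_institute(by_station: Dict[str, List[float]]) -> Dict[str, dict]:
    """Hop 2: the institute reduces each station's batch to a count and a mean."""
    return {
        station: {"count": len(values), "mean": sum(values) / len(values)}
        for station, values in by_station.items()
    }


def publish_at_data_center(summaries: Dict[str, dict]) -> List[Tuple[str, dict]]:
    """Hop 3: the data center exposes a catalogue sorted by station name."""
    return sorted(summaries.items())


if __name__ == "__main__":
    raw = [Reading("A", 1.0), Reading("A", 3.0), Reading("B", 5.0)]
    catalogue = publish_at_data_center(summarise_at_institute(collect_at_stations(raw)))
    print(catalogue)
```

Keeping each hop as a separate function mirrors the slide's separation of responsibilities, so any one stage could be swapped out (for example, a different summary at the institute) without touching the others.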

Page 33

Data intensive application environment

Data synthesis and integration

Data analysis and modeling

visualization

Page 34

Conclusion

Building an Open Science Cloud serving not only CAS researchers, but also the wider scientific community!

OPEN SCIENCE CLOUD
– IaaS: Network Service, Computing Service, Storage Service
– PaaS: Data intensive application environment
– SaaS: Software and tools for data curation, analysis, mining and visualization…
– DaaS: Scientific data and databases service

Page 35

Thank you!