Page 1: The Pacific Research Platform

“The Pacific Research Platform”

Briefing to The Quilt Visit to Calit2’s Qualcomm Institute

University of California, San Diego, February 10, 2016

Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology

Harry E. Gruber Professor, Dept. of Computer Science and Engineering

Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net

Page 2: The Pacific Research Platform

Vision: Creating a West Coast “Big Data Freeway” Connected by CENIC/Pacific Wave

Use Lightpaths to Connect All Data Generators and Consumers, Creating a “Big Data” Freeway Integrated with High-Performance Global Networks

“The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 20-Campus Scale.”

This Vision Has Been Building for Over a Decade

Page 3: The Pacific Research Platform

NSF’s OptIPuter Project: Demonstrating How SuperNetworks Can Meet the Needs of Data-Intensive Researchers

OptIPortal – Termination Device for the OptIPuter Global Backplane

Calit2 (UCSD, UCI), SDSC, and UIC Leads; Larry Smarr PI
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST

Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent

2003-2009 $13,500,000

In August 2003, Jason Leigh and his students used RBUDP to blast data from NCSA to SDSC over the TeraGrid DTFnet, achieving an 18Gbps file transfer out of the available 20Gbps.

LS Slide 2005
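Those rates translate directly into transfer times. A minimal sketch (the helper function is hypothetical, decimal units assumed) of what sustaining 18Gbps means for moving a large dataset, versus a typical shared 1Gbps campus link:

```python
# Hypothetical helper: hours needed to move a dataset at a sustained rate.
# Decimal units (1 TB = 1e12 bytes), as network figures usually use.
def transfer_time_hours(dataset_tb: float, throughput_gbps: float) -> float:
    bits = dataset_tb * 1e12 * 8                  # terabytes -> bits
    return bits / (throughput_gbps * 1e9) / 3600  # bits / (bits/s) -> hours

print(f"Utilization: {18 / 20:.0%}")                              # 90% of the link
print(f"10 TB at 18 Gb/s: {transfer_time_hours(10, 18):.1f} h")   # ~1.2 h
print(f"10 TB at 1 Gb/s:  {transfer_time_hours(10, 1):.1f} h")    # ~22.2 h
```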

Page 4: The Pacific Research Platform

DOE ESnet’s Science DMZ: A Scalable Network Design Model for Optimizing Science Data Transfers

• A Science DMZ integrates 4 key concepts into a unified whole:
– A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network
– The use of dedicated systems for data transfer
– Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for high-performance science environments

http://fasterdata.es.net/science-dmz/
“Science DMZ” Coined 2010

The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program
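One of the four concepts above is routine performance measurement. Production Science DMZs use dedicated tools such as perfSONAR and iperf3 for this; purely as an illustrative sketch (all function names are hypothetical), a loopback memory-to-memory throughput probe shows the basic measure-what-the-path-can-do idea:

```python
import socket
import threading
import time

def start_sink() -> int:
    """Start a one-shot TCP sink on an ephemeral loopback port; return the port."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def drain() -> None:
        conn, _ = srv.accept()
        with conn, srv:
            while conn.recv(1 << 16):   # discard everything received
                pass

    threading.Thread(target=drain, daemon=True).start()
    return port

def measure_gbps(port: int, seconds: float = 0.5) -> float:
    """Send zero-filled buffers for `seconds`; return the achieved rate in Gb/s."""
    payload = bytes(1 << 16)
    sent = 0
    start = time.monotonic()
    with socket.create_connection(("127.0.0.1", port)) as conn:
        while time.monotonic() - start < seconds:
            conn.sendall(payload)
            sent += len(payload)
    return sent * 8 / (time.monotonic() - start) / 1e9

port = start_sink()
print(f"loopback throughput: {measure_gbps(port):.2f} Gb/s")
```

On a real Science DMZ path the sender and sink would sit on dedicated DTNs at either end of the WAN link, not on loopback.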

Page 5: The Pacific Research Platform

Based on Community Input and on ESnet’s Science DMZ Concept, NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways

2012-2015 CC-NIE / CC*IIE / CC*DNI PROGRAMS

Red: 2012 CC-NIE Awardees
Yellow: 2013 CC-NIE Awardees
Green: 2014 CC*IIE Awardees
Blue: 2015 CC*DNI Awardees
Purple: Multiple-Time Awardees

Source: NSF

Page 6: The Pacific Research Platform

The Pacific Research Platform: The Next Logical Step – Connect Multiple Campus Science DMZs with 10-100Gbps Lightpaths

NSF CC*DNI Grant, $5M, 10/2015-10/2020

PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS
• Tom DeFanti, UC San Diego Calit2
• Philip Papadopoulos, UC San Diego SDSC
• Frank Wuerthwein, UC San Diego Physics and SDSC

Page 7: The Pacific Research Platform

FIONA – Flash I/O Network Appliance: Termination Device for 10-100Gbps Flows

UCOP Rack-Mount Build:

FIONAs Are Science DMZ Data Transfer Nodes & Optical Network Termination Devices

UCSD CC-NIE Prism Award & UCOP
Phil Papadopoulos & Tom DeFanti

Joe Keefe & John Graham

Cost:                $8,000               $20,000
Intel Xeon Haswell:  E5-1650 v3 6-Core    2x E5-2697 v3 14-Core
RAM:                 128 GB               256 GB
SSD:                 SATA 3.8 TB          SATA 3.8 TB
Network Interface:   10/40GbE Mellanox    2x40GbE Chelsio+Mellanox
GPU:                 NVIDIA Tesla K80
RAID Drives:         0 to 112TB (add ~$100/TB)

John Graham, Calit2’s QI
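The spec sheet prices storage incrementally, so the add-on cost follows by simple arithmetic (using the $8,000 entry build and the listed ~$100/TB rate):

```python
# Arithmetic check of the FIONA storage option: RAID adds ~$100 per TB.
base_cost = 8_000            # entry FIONA build price from the table
rate_per_tb = 100            # approximate add-on cost per TB
max_raid_tb = 112            # maximum RAID capacity listed

full_raid_addon = max_raid_tb * rate_per_tb
print(full_raid_addon)                 # 11200
print(base_cost + full_raid_addon)     # 19200
```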

Page 8: The Pacific Research Platform

FIONAs as Uniform DTN End Points

[Map, as of October 2015: existing DTNs and FIONA DTNs across PRP sites]

UC FIONAs Funded by UCOP “Momentum” Grant

Page 9: The Pacific Research Platform

Ten Week Sprint to Demonstrate the West Coast Big Data Freeway System: PRPv0

Presented at CENIC 2015, March 9, 2015

FIONA DTNs Now Deployed to All UC Campuses and Most PRP Sites

Page 10: The Pacific Research Platform

Pacific Research Platform Multi-Campus Science Driver Teams

• Jupyter Hub
• Biomedical
– Cancer Genomics Hub/Browser
– Microbiome and Integrative ‘Omics
– Integrative Structural Biology
• Earth Sciences
– Data Analysis and Simulation for Earthquakes and Natural Disasters
– Climate Modeling: NCAR/UCAR
– California/Nevada Regional Climate Data Analysis
– CO2 Subsurface Modeling
• Particle Physics
• Astronomy and Astrophysics
– Telescope Surveys
– Galaxy Evolution
– Gravitational Wave Astronomy
• Scalable Visualization, Virtual Reality, and Ultra-Resolution Video

Page 11: The Pacific Research Platform

PRP First Application: Distributed IPython/Jupyter Notebooks: Cross-Platform, Browser-Based Application Interleaves Code, Text, & Images

IJulia, IHaskell, IFSharp, IRuby, IGo, IScala, IMathics, Ialdor, LuaJIT/Torch, Lua Kernel, IRKernel (for the R language), IErlang, IOCaml, IForth, IPerl, IPerl6, ioctave, the Calico Project (kernels implemented in Mono, including Java, IronPython, Boo, Logo, BASIC, and many others), IScilab, IMatlab, ICSharp, Bash, Clojure Kernel, Hy Kernel, Redis Kernel, jove (a kernel for io.js), IJavascript, Calysto Scheme, Calysto Processing, idl_kernel, Mochi Kernel, Lua (used in Splash), Spark Kernel, Skulpt Python Kernel, MetaKernel Bash, MetaKernel Python, Brython Kernel, IVisual VPython Kernel

Source: John Graham, QI

Page 12: The Pacific Research Platform

GPU JupyterHub: 2 x 14-core CPUs, 256GB RAM, 1.2TB FLASH, 3.8TB SSD, Nvidia K80 GPU, dual 40GbE NICs, and a Trusted Platform Module (40Gbps)

GPU JupyterHub: 1 x 18-core CPU, 128GB RAM, 3.8TB SSD, Nvidia K80 GPU, dual 40GbE NICs, and a Trusted Platform Module

PRP UC-JupyterHub Backbone, UC Berkeley and UC San Diego. Next Step: Deploy Across PRP

Source: John Graham, Calit2

Page 13: The Pacific Research Platform

OSG Federates Clusters in 40 of 50 States: A Major XSEDE Resource

Source: Miron Livny, Frank Wuerthwein, OSG

Page 14: The Pacific Research Platform

Open Science Grid Has Had Huge Growth Over the Last Decade, Currently Federating Over 130 Clusters

Crossed 100 Million Core-Hours/Month in Dec 2015

Over 1 Billion Data Transfers Moved 200 Petabytes in 2015

Supported Over 200 Million Jobs in 2015

Source: Miron Livny, Frank Wuerthwein, OSG


Page 15: The Pacific Research Platform

PRP Prototype of LambdaGrid Aggregation of OSG Software & Services Across California Universities in a Regional DMZ

• Aggregate Petabytes of Disk Space & PetaFLOPs of Compute, Connected at 10-100 Gbps
• Transparently Compute on Data at Their Home Institutions & Systems at SLAC, NERSC, Caltech, UCSD, & SDSC

Participating sites: SLAC, UCSD & SDSC, UCSB, UCSC, UCD, UCR, CSU Fresno, UCI, Caltech

PRP Builds on SDSC’s LHC-UC Project

[Pie chart: OSG Hours 2015 by Science Domain: ATLAS, CMS, other physics, life sciences, other sciences]

Source: Frank Wuerthwein, UCSD Physics; SDSC; co-PI PRP

Page 16: The Pacific Research Platform

Two Automated Telescope Surveys Creating Huge Datasets Will Drive PRP

One survey takes 300 images per night at 100MB per raw image (30GB per night); the other takes 250 images per night at 530MB per raw image (150GB per night). When processed at NERSC, nightly volumes grow roughly 4x, to 120GB and 800GB respectively.

Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL; Professor of Astronomy, UC Berkeley

Precursors to LSST and NCSA

PRP Allows Researchers to Bring Datasets from NERSC to Their Local Clusters for In-Depth Science Analysis
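The nightly volumes quoted above follow directly from the image counts and sizes; a quick sanity check (decimal units assumed):

```python
# Sanity check of the survey data rates quoted above (decimal units).
def nightly_gb(images_per_night: int, mb_per_image: float) -> float:
    """Raw data volume per night in GB."""
    return images_per_night * mb_per_image / 1000

raw = nightly_gb(300, 100)     # 300 images x 100 MB = 30.0 GB/night
processed = raw * 4            # ~4x growth when processed at NERSC
print(raw, processed)          # 30.0 120.0
```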

Page 17: The Pacific Research Platform

Global Scientific Instruments Will Produce Ultralarge Datasets Continuously, Requiring Dedicated Optical Fiber and Supercomputers

Square Kilometer Array Large Synoptic Survey Telescope

https://tnc15.terena.org/getfile/1939 www.lsst.org/sites/default/files/documents/DM%20Introduction%20-%20Kantor.pdf

Tracks ~40B Objects, Creates 10M Alerts/Night Within 1 Minute of Observing

2x40Gb/s
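The LSST alert figure implies a demanding sustained processing rate. A rough estimate, assuming a ~10-hour observing night (the night length is an assumption, not from the slide):

```python
# Rough sustained alert rate implied by 10M alerts per night.
alerts_per_night = 10_000_000
observing_hours = 10                  # assumed night length, not from the slide
rate_per_sec = alerts_per_night / (observing_hours * 3600)
print(f"~{rate_per_sec:.0f} alerts/s sustained")   # ~278 alerts/s
```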

Page 18: The Pacific Research Platform

PRP Will Support the Computation and Data Analysis in the Search for Sources of Gravitational Radiation

Augment the aLIGO data and computing systems at Caltech by connecting at 10Gb/s to the SDSC Comet supercomputer, enabling LIGO computations to enter via the same PRP “job cache” as for the LHC.

Page 19: The Pacific Research Platform

HPWREN Users and Public Safety Clients Gain Redundancy and Resilience from PRP Upgrade

San Diego Countywide Sensors and Camera Resources: UCSD & SDSU Data & Compute Resources

UCI & UCR: Data Replication and PRP FIONA Anchors as HPWREN Expands Northward

10X Increase During Wildfires

• PRP CENIC 10G Link UCSD to SDSU
– DTN FIONA Endpoints
– Data Redundancy
– Disaster Recovery
– High Availability
– Network Redundancy

Data From Hans-Werner Braun

Source: Frank Vernon, Greg Hidley, UCSD

Page 20: The Pacific Research Platform

PRP Backbone Sets Stage for 2016 Expansion of HPWREN, Connected to CENIC, into Orange and Riverside Counties

• Anchor to CENIC at UCI
– PRP FIONA Connects to CalREN-HPR Network
– Data Replication Site
• Potential Future UCR CENIC Anchor
• Camera and Relay Sites at: Santiago Peak, Sierra Peak, Lake View, Bolero Peak, Modjeska Peak, Elsinore Peak, Sitton Peak, Via Marconi

Collaborations through COAST, the County of Orange Safety Task Force

Source: Frank Vernon, Greg Hidley, UCSD

Page 21: The Pacific Research Platform

PRP Links FIONA Clusters, Creating Distributed Virtual Reality

WAVE@UC San Diego and CAVE@UC Merced, linked over PRP via 20x40G PRP-connected 40G FIONAs

Page 22: The Pacific Research Platform

PRP is NOT Just for Big Data Science and Engineering: Linking Cultural Heritage and Archaeology Datasets

[Map of PRP sites: UCD, UCSF, Stanford, NASA AMES/NREN, UCSC, UCSB, Caltech, USC, UCLA, UCI, UCSD, SDSU, UCR, ESnet DoE Labs, UW/PNWGP Seattle, Berkeley, UCM, Los Nettos, Internet2 Seattle; asterisks in the original mark institutions with active archaeology programs. This diagram represents a subset of sites and connections.]

“In an ideal world – extremely high bandwidth to move large cultural heritage datasets around the PRP cloud for processing & viewing in CAVEs around PRP, with unlimited storage for permanent archiving.” – Tom Levy, UCSD

Building on CENIC’s Expansion to Libraries, Museums, and Cultural Sites

Page 23: The Pacific Research Platform

Next Step: Global Research Platform, Building on CENIC/Pacific Wave and GLIF

Current International GRP Partners
