Source: Jim Dolgonas, CENIC
CENIC is Removing the Inter-Campus Barriers in California
~$14M Invested in Upgrade
Now Campuses Need to Upgrade
The “Golden Spike” UCSD Experimental Optical Core: Ready to Couple Users to CENIC L1, L2, L3 Services
Source: Phil Papadopoulos, SDSC/Calit2 (Quartzite MRI PI, OptIPuter co-PI)
Funded by NSF MRI Grant
Lucent
Glimmerglass
Force10
OptIPuter Border Router
CENIC L1, L2 Services
Cisco 6509
Currently:
>= 60 endpoints at 10 GigE
>= 30 Packet switched
>= 30 Switched wavelengths
>= 400 Connected endpoints
Approximately 0.5 Tbps Arrive at the “Optical”
Center of Hybrid Campus Switch
Network Today
Quartzite
Calit2 Sunlight Optical Exchange Contains Quartzite
Maxine Brown, EVL, UIC
OptIPuter Project Manager
What the Network Enables
• Data, Computing anywhere on Campus
• Always-on high-resolution streaming
• Large-scale data movement w/o impacting commodity net.
• Complete re-factoring of where network-connected resources are located
Campus Fiber Network Based on Quartzite Allowed UCSD CI Design Team to Architect Shared Resources
UCSD Storage
OptiPortal
Research Cluster
Digital Collections
Lifecycle Management
PetaScale Data Analysis Facility
HPC System
Cluster Condo
UC Grid Pilot
Research Instrument
N x 10GbE
DNA Arrays, Mass Spec., Microscopes, Genome Sequencers
Source: Phil Papadopoulos, SDSC/Calit2
Triton – A Downpayment on Campus-scale CI
• Standard Compute Cluster (256 nodes, 2048 cores, 6TB RAM)
• Large-memory Cluster (28 nodes, 896 cores, 9TB RAM)
• Large-scale storage
– At baby stage with 180TB and 4GB/sec
– Goal is ~4PB and 100GB/sec BW
• Structure managed with Rocks. An open system.
• Will also function as a high-performance cloud platform
Triton Resource: Expect initial production on compute systems ~June 2009
Data Oasis storage system expected fall 2009
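To put the storage figures above in perspective, here is a minimal sketch of how long one full pass over the store would take at the quoted bandwidths. It assumes decimal units (1 TB = 10^12 bytes) and sustained peak bandwidth, neither of which the slide states; the function name is illustrative.

```python
# Assumption: decimal units and sustained peak bandwidth (not stated on the slide).
GB = 1e9
TB = 1e12
PB = 1e15


def full_scan_hours(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Hours needed to stream the entire store once at the given bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s / 3600.0


# "Baby stage": 180TB at 4GB/sec.
current_hours = full_scan_hours(180 * TB, 4 * GB)
# Stated goal: ~4PB at 100GB/sec.
goal_hours = full_scan_hours(4 * PB, 100 * GB)

print(f"current: {current_hours:.1f} h, goal: {goal_hours:.1f} h")
```

Under these assumptions the goal configuration holds roughly 22 times more data yet can be read end to end in about the same time, since bandwidth grows by a similar factor (25x).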
Triton Designed for Particular Apps
• Overriding need for Large Memory nodes
– 8 @ 512GB, 20 @ 256GB (4 dedicated as DBs)
A Small Sampling:
• Regional Ocean Circulation (COMPAS @ Scripps)
– Scalable algorithm + single node optimization step (> 150GB memory needed)
• 3D Tomographic Reconstruction of EM Images (Medicine)
– 256, 512GB “on the small side”
• DNA Sequence Analysis with short sequence reads
– > 128 GB
• Human Heart Full Beat Simulation (Bioengineering)
– 100 – 200 GB
• Drug discovery and design from first principles.
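The node counts and per-application memory needs above can be cross-checked with a few lines of arithmetic. This is a rough sketch only: the per-application numbers are the lower bounds or upper range ends read off the slide, and the helper names are hypothetical.

```python
# Large-memory node classes from the slide: (RAM in GB, node count).
NODE_CLASSES = [(256, 20), (512, 8)]

# Assumed memory needs in GB, taken from the slide's bounds and ranges.
APP_MEMORY_GB = {
    "Regional Ocean Circulation": 150,        # "> 150GB" (lower bound)
    "3D Tomographic Reconstruction": 512,     # "256, 512GB on the small side"
    "DNA Sequence Analysis": 128,             # "> 128 GB" (lower bound)
    "Human Heart Full Beat Simulation": 200,  # upper end of "100 - 200 GB"
}


def total_ram_gb(classes):
    """Aggregate RAM across all node classes."""
    return sum(ram * count for ram, count in classes)


def smallest_fitting_node(need_gb, classes):
    """Smallest node RAM size (GB) that holds the job, or None if none does."""
    fitting = [ram for ram, _count in classes if ram >= need_gb]
    return min(fitting) if fitting else None


print(total_ram_gb(NODE_CLASSES), "GB total")  # 9216 GB, i.e. 9TB at 1024 GB/TB
for app, need in APP_MEMORY_GB.items():
    print(app, "->", smallest_fitting_node(need, NODE_CLASSES), "GB node")
```

The total matches the 9TB quoted for the 28-node large-memory cluster. Because several of the inputs are lower bounds, the node assignments are best read as minimums.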
Triton Network Connectivity
• Total Switch Capacity – 512 X 10 Gbit/sec = 5 Tbit/s ($150K)
• 32 x 10GbE to Campus Networks including at least 5x10GbE to Quartzite OptIPuter.
– All external-to-UCSD high-speed networks could terminate on Triton at full rate
Mid Construction – Large Memory Nodes Integrated into Switch (28 nodes, 40Gbit/s/Node)
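The connectivity figures above reduce to simple port arithmetic, sketched here. The one interpretive assumption is that the "$150K" figure is the price of the whole 512-port switch; the slide does not say so explicitly.

```python
PORTS = 512
PORT_GBPS = 10
SWITCH_COST_USD = 150_000  # assumption: cost of the full switch

switch_capacity_gbps = PORTS * PORT_GBPS   # 5120 Gbit/s, quoted as 5 Tbit/s
cost_per_port_usd = SWITCH_COST_USD / PORTS

campus_uplink_gbps = 32 * 10               # 32 x 10GbE to campus networks
quartzite_min_gbps = 5 * 10                # at least 5 x 10GbE to Quartzite
large_memory_gbps = 28 * 40                # 28 nodes at 40 Gbit/s each

uplink_share = campus_uplink_gbps / switch_capacity_gbps

print(f"switch capacity: {switch_capacity_gbps / 1000:.2f} Tbit/s")
print(f"cost per 10GbE port: ${cost_per_port_usd:.0f}")
print(f"campus uplinks: {campus_uplink_gbps} Gbit/s ({uplink_share:.2%} of switch)")
print(f"large-memory nodes: {large_memory_gbps} Gbit/s aggregate")
```

On these numbers the campus uplinks occupy a small fraction of the switch, and the large-memory nodes alone could source more traffic than the 32 uplinks carry.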
The NSF-Funded GreenLight Project
Giving Users Greener Compute and Storage Options
• Measure and Control Energy Usage:
– Sun Has Shown up to 40% Reduction in Energy
– Active Management of Disks, CPUs, etc.
– Measures Temperature at 5 Levels in 8 Racks
– Power Utilization in Each of the 8 Racks
– Chilled Water Cooling Systems
UCSD Structural Engineering Dept. Conducted Sun MD Tests May 2007
UCSD (Calit2 & SOM) Bought Two Sun MDs May 2008
Source: Tom DeFanti, Calit2; GreenLight PI
The GreenLight Project: Instrumenting the Energy Cost of Computational Science
• Focus on 5 Communities with At-Scale Computing Needs:
– Metagenomics
– Ocean Observing
– Microscopy
– Bioinformatics
– Digital Media
• Measure, Monitor, & Web Publish Real-Time Sensor Outputs
– Via Service-oriented Architectures
– Allow Researchers Anywhere To Study Computing Energy Cost
– Enable Scientists To Explore Tactics For Maximizing Work/Watt
• Develop Middleware that Automates Optimal Choice of Compute/RAM Power Strategies for Desired Greenness
• Partnering With Minority-Serving Institutions Cyberinfrastructure Empowerment Coalition
Source: Tom DeFanti, Calit2; GreenLight PI
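As an illustration of the "Work/Watt" metric mentioned above, here is a minimal sketch that turns a series of power readings into work per watt and total energy. The function names, the sample values, and the definition of "work" as a count of completed jobs are all assumptions made for illustration; the slides do not define the metric or the sensor data format.

```python
from statistics import mean
from typing import Sequence


def work_per_watt(work_units: float, power_samples_w: Sequence[float]) -> float:
    """Work completed divided by the average power draw over the run."""
    if not power_samples_w:
        raise ValueError("need at least one power sample")
    return work_units / mean(power_samples_w)


def energy_kwh(power_samples_w: Sequence[float], interval_s: float) -> float:
    """Energy used, treating each sample as constant power for interval_s seconds."""
    joules = sum(power_samples_w) * interval_s
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


# Hypothetical rack readings in watts, sampled once per minute.
samples = [400.0, 420.0, 380.0]
print(work_per_watt(1200, samples))  # jobs per watt of average draw
print(energy_kwh(samples, 60.0))     # kWh over the three-minute window
```

Comparing this ratio across scheduling or power-management strategies is one simple way to rank them by "greenness".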
Research Needed on How to Deploy a Green CI
• Computer Architecture – Rajesh Gupta/CSE
• Software Architecture – Amin Vahdat, Ingolf Kruger/CSE
• CineGrid Exchange – Tom DeFanti/Calit2
• Visualization – Falko Kuester/Structural Engineering
• Power and Thermal Management – Tajana Rosing/CSE
• Analyzing Power Consumption Data – Jim Hollan/Cog Sci
• Direct DC Datacenters – Tom DeFanti, Greg Hidley
http://greenlight.calit2.net
MRI
New Techniques for Dynamic Power and Thermal Management to Reduce Energy Requirements
Dynamic Thermal Management (DTM)
• Workload Scheduling:
• Machine learning for Dynamic Adaptation to get Best Temporal and Spatial Profiles with Closed-Loop Sensing
• Proactive Thermal Management
• Reduces Thermal Hot Spots by Average 60% with No Performance Overhead
Dynamic Power Management (DPM)
• Optimal DPM for a Class of Workloads
• Machine Learning to Adapt
• Select Among Specialized Policies
• Use Sensors and Performance Counters to Monitor
• Multitasking/Within Task Adaptation of Voltage and Frequency
• Measured Energy Savings of Up to 70% per Device
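To show why adapting voltage and frequency saves energy, here is a small sketch using the standard CMOS dynamic-power approximation P = C·V²·f. This is a textbook model, not the SEELab method, and the capacitance, voltage, and frequency values are hypothetical. It ignores static (leakage) power and is not meant to reproduce the 70% figure.

```python
def dynamic_power_w(c_eff_farads: float, voltage_v: float, freq_hz: float) -> float:
    """Textbook CMOS dynamic power: P = C_eff * V^2 * f."""
    return c_eff_farads * voltage_v ** 2 * freq_hz


def task_energy_j(cycles: float, c_eff_farads: float,
                  voltage_v: float, freq_hz: float) -> float:
    """Dynamic energy to run a fixed number of cycles at one operating point."""
    runtime_s = cycles / freq_hz
    return dynamic_power_w(c_eff_farads, voltage_v, freq_hz) * runtime_s


# Hypothetical operating points for the same 1e9-cycle task.
CYCLES = 1e9
C_EFF = 1e-9  # assumed effective switched capacitance, in farads
high = task_energy_j(CYCLES, C_EFF, voltage_v=1.2, freq_hz=2.0e9)
low = task_energy_j(CYCLES, C_EFF, voltage_v=0.9, freq_hz=1.5e9)

print(f"energy at high point: {high:.2f} J")
print(f"energy at low point:  {low:.2f} J")
print(f"dynamic energy saved: {1 - low / high:.1%}")
```

In this model frequency cancels out of the per-task energy, so the saving comes from the lower voltage that a lower frequency permits, at the cost of a longer runtime.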
NSF Project Greenlight
• Green Cyberinfrastructure in Energy-Efficient Modular Facilities
• Closed-Loop Power & Thermal Management
System Energy Efficiency Lab (seelab.ucsd.edu)
Prof. Tajana Šimunić Rosing, CSE, UCSD