Scientific Computing in the Consumer Digital Infrastructure
David P. Anderson
Space Sciences Lab
University of California, Berkeley
The Austin Forum
November 7, 2013
Science needs computing power
● High-performance computing
● High-throughput computing
  – Thousands or millions of independent jobs
  – What matters is the rate of job completion, not the turnaround time of individual jobs
High-throughput computing applications
● Physical simulation
  – particle collision
  – atomic/molecular (bio, nano)
  – Earth climate system
● Compute-intensive data analysis
  – particle physics (LHC)
  – astrophysics (radio, gravitational)
  – genomics
● Bio-inspired optimization
  – genetic algorithms, flocking, ant colony, etc.
Approaches to HTC
● Cluster computing
  – lots of commodity or rack-mounted PCs in a room
● Grid computing
  – share clusters between organizations
● Cloud computing
  – rent cluster nodes, e.g. Amazon EC2
● Volunteer computing
  – use computers owned by consumers
The Consumer Digital Infrastructure
● Computing devices
  – Desktop and laptop computers
  – Mobile devices: tablets, smart phones
  – Game consoles
  – Set-top boxes, DVRs
  – Appliances
● Commodity Internet
  – Cable, DSL, fiber to the home, cell networks
Measures of computing speed
● Floating-point operation (FLOP)
● GigaFLOPS (10^9/sec): 1 Central Processing Unit (CPU)
● TeraFLOPS (10^12/sec): 1 Graphics Processing Unit (GPU)
● PetaFLOPS (10^15/sec): 1 supercomputer
● ExaFLOPS (10^18/sec): current Holy Grail
CDI performance potential
● 1 billion desktop/laptop PCs
  – CPUs: 10 ExaFLOPS
  – GPUs: 1,000 ExaFLOPS
● 2.5 billion smartphones
  – CPUs: 10 ExaFLOPS
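These totals are device counts multiplied by a per-device speed. A minimal sketch of that arithmetic follows. The per-device figures (about 10 GigaFLOPS per PC CPU, 1 TeraFLOPS per PC GPU, 4 GigaFLOPS per smartphone CPU) are assumptions inferred here from the slide totals, not numbers stated in the talk.

```python
# Back-of-envelope aggregate capacity of consumer devices.
# Per-device speeds are assumptions chosen to reproduce the slide totals.
GIGA, TERA, EXA = 1e9, 1e12, 1e18


def aggregate_exaflops(device_count, flops_per_device):
    """Total speed of a device population, in ExaFLOPS."""
    return device_count * flops_per_device / EXA


pc_cpu = aggregate_exaflops(1e9, 10 * GIGA)      # 1 billion PCs, CPU only
pc_gpu = aggregate_exaflops(1e9, 1 * TERA)       # 1 billion PCs, GPU only
phone_cpu = aggregate_exaflops(2.5e9, 4 * GIGA)  # 2.5 billion smartphones

print(f"PC CPUs:         {pc_cpu:g} ExaFLOPS")
print(f"PC GPUs:         {pc_gpu:g} ExaFLOPS")
print(f"Smartphone CPUs: {phone_cpu:g} ExaFLOPS")
```

Changing the assumed per-device speed rescales the totals linearly, so the sketch is mainly useful for seeing how sensitive the headline numbers are to those inputs.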
Volunteer computing
● Consumers donate computing capacity to
  – support science
  – be in a community
  – compete
● History
  – 1997: GIMPS, distributed.net
  – 1999: SETI@home, Folding@home
  – 2003: BOINC
Limiting factors
● Volunteership
  – Study of college students [Toth 2006]
    ● 5% would “definitely participate”
    ● 10% would “possibly participate”
● PC availability
  – 65% average availability [Kondo 2008]
  – 35% of PCs are available 24/7
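To see how these two factors reduce the usable pool, one rough approach is to multiply a device population by the participation rate and the average availability. The sketch below is illustrative only: it assumes the 5% "definitely participate" figure from a study of college students generalizes to all PC owners and that participation and availability are independent, neither of which the talk claims.

```python
# Illustrative estimate only: scale a device population by the fraction of
# owners who volunteer and the fraction of time their PCs are available.
def effective_devices(population, participation, availability):
    """Average number of devices computing at any moment."""
    return population * participation * availability


# Assumptions: 1 billion PCs; 5% participation (the "definitely" group in
# Toth 2006); 65% average availability (Kondo 2008).
pcs = effective_devices(1_000_000_000, 0.05, 0.65)
print(f"{pcs / 1e6:.1f} million PCs computing on average")
```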
Other limiting factors
● Network bandwidth (client, server)
  – Commodity Internet
● Memory, disk usage
  – new PCs average 6 GB RAM
BOINC: middleware for volunteer computing
● Supported by NSF since 2002
● Open source (LGPL)
● Based at University of California, Berkeley
● http://boinc.berkeley.edu
Volunteer computing with BOINC
[Diagram: volunteers on one side, projects (CPDN, LHC@home, WCG) on the other, linked by attachments]
How to volunteer
Choose projects
Configure
Community
Creating a BOINC project
● Install BOINC server software on a Linux box
● Compile apps for Windows/Mac/Linux
● Attract volunteers
  – develop web site
  – generate publicity
  – communicate with volunteers
Volunteer computing today
● 500,000 active computers
● 50 projects
● 15 PetaFLOPS average
Some BOINC-based projects
● IBM World Community Grid
● Einstein@home
● Climateprediction.net
● LHC@home
● Rosetta@home
Cost
The cost of 10 TeraFLOPS for 1 year:
● CPU cluster: $1.5M
● Amazon EC2: $4M
  – 5,000 small instances
● Volunteer: ~ $0.1M
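The comparison is easier to read when normalized. The sketch below uses only the figures on this slide to compute cost per TeraFLOPS-year and the ratio to the volunteer option. The implied EC2 price per instance-hour assumes all 5,000 small instances run continuously for the whole year, which the slide does not state.

```python
# Cost figures from the slide: dollars per year for 10 TeraFLOPS.
TERAFLOPS = 10
HOURS_PER_YEAR = 365 * 24

costs = {
    "CPU cluster": 1_500_000,
    "Amazon EC2": 4_000_000,
    "Volunteer": 100_000,  # the slide gives this as approximate
}

for name, dollars in costs.items():
    per_tflops = dollars / TERAFLOPS
    ratio = dollars / costs["Volunteer"]
    print(f"{name}: ${per_tflops:,.0f} per TeraFLOPS-year, "
          f"{ratio:g}x the volunteer cost")

# Implied EC2 price, assuming all 5,000 small instances run all year.
ec2_per_instance_hour = costs["Amazon EC2"] / (5_000 * HOURS_PER_YEAR)
print(f"EC2: ${ec2_per_instance_hour:.3f} per instance-hour")
```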
How BOINC works
[Diagram: the BOINC client on a home PC talks to the project's BOINC server over HTTP in a cycle: get jobs; download data, executables; compute; upload outputs]
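The cycle on this slide (get jobs, download data and executables, compute, upload outputs) can be sketched as a toy simulation. Everything here is hypothetical: `FakeProjectServer`, `Job`, `compute`, and `client_loop` are illustration names, the server is an in-memory object rather than an HTTP endpoint, and none of it reflects the actual BOINC protocol or API.

```python
# Toy simulation of the get-jobs / download / compute / upload cycle.
# FakeProjectServer and its methods are hypothetical; a real BOINC client
# talks to the project server over HTTP with its own protocol.
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: int
    input_data: list  # stands in for downloaded data files


@dataclass
class FakeProjectServer:
    pending: list
    results: dict = field(default_factory=dict)

    def get_jobs(self, max_jobs):
        """Hand out up to max_jobs pending jobs."""
        batch = self.pending[:max_jobs]
        self.pending = self.pending[max_jobs:]
        return batch

    def upload(self, job_id, output):
        """Record a finished job's output."""
        self.results[job_id] = output


def compute(job):
    """Placeholder science application: sum of squares of the input."""
    return sum(x * x for x in job.input_data)


def client_loop(server, batch_size=2):
    """Repeat the cycle until the server has no more work."""
    while True:
        jobs = server.get_jobs(batch_size)
        if not jobs:
            break
        for job in jobs:
            server.upload(job.job_id, compute(job))


server = FakeProjectServer(pending=[Job(i, [i, i + 1]) for i in range(5)])
client_loop(server)
print(server.results)
```

The client pulls work rather than the server pushing it, which matches the direction of the "get jobs" arrow in the diagram and suits home PCs that sit behind firewalls or are only intermittently online.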
Issues handled by BOINC
● Heterogeneous computers
● Untrusted, anonymous computers
  – Result validation
    ● replication, adaptive replication
● Credit: amount of work done
● Consumer-friendly client
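Validation by replication means sending the same job to several computers and accepting an output only when enough of them agree. A minimal sketch of that idea follows. The function names, the quorum of 2, and the exact-equality comparison are assumptions for illustration, not BOINC's validator interface.

```python
# Sketch of validation by replication: accept a result once `quorum`
# replicas agree. Names and the exact-match rule are illustrative assumptions.
from collections import Counter


def validate(results, quorum=2):
    """Return the agreed result if at least `quorum` replicas match, else None.

    `results` maps a host id to the output that host returned.
    """
    if not results:
        return None
    value, votes = Counter(results.values()).most_common(1)[0]
    return value if votes >= quorum else None


def hosts_to_credit(results, canonical):
    """Hosts whose output matched the agreed result earn credit."""
    return sorted(h for h, out in results.items() if out == canonical)


replicas = {"host_a": 42, "host_b": 42, "host_c": 41}
canonical = validate(replicas)
print(canonical, hosts_to_credit(replicas, canonical))
```

Adaptive replication, as generally described, lowers the replication level for hosts with a history of valid results. That policy is not modeled here.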
Using GPUs
● BOINC detects and schedules GPUs
  – NVIDIA, AMD, Intel
  – multiple/mixed GPUs
  – various language systems (CUDA, OpenCL, CAL)
● Issues
  – non-preemptive GPU scheduling
  – no paging of GPU memory
Multicore apps
● Next-generation PCs may have 100 cores
● BOINC supports multi-core apps
  – OpenMP, MPI
  – OpenCL CPU apps
Using VM technology
● CDI platforms:
  – 85% Windows
  – 7% Linux
  – 7% Mac OS X
● Developing and maintaining versions for different platforms is hard
● Even making a portable Linux executable is hard
Virtual machines
[Diagram: an application runs on a guest operating system, which runs on a host operating system]
Virtual machines
[Diagram, example: application on Debian Linux 2.6 (guest) on Windows 7 (host)]
BOINC VM support
● Create a VM image for your favorite environment
● Create executables for that environment
[Diagram components: BOINC client, Vboxwrapper, VirtualBox executive, VM instance; a shared directory holds the executable and the input and output files]
VM advantages
● Develop in your favorite environment
  – No need for multiple versions
● A VM is a strong “sandbox”
  – Can run untrusted applications
● Free “checkpointing”
BOINC on Android
● New GUI
● Battery-related issues
● Released July 2013
  – Google, Amazon App Stores
  – ~50K active devices
Why hasn’t volunteer computing gained traction?
● “Ecosystem of projects” model
  – Lots of competing projects
● Problems with this model
  – Creating/operating a project is too hard and risky
  – Volunteers need simplicity
  – No coherent PR; too many brands
Umbrella projects
● One project serves many scientists
● Examples
  – CAS@home (Chinese Academy of Science)
  – World Community Grid (IBM)
  – U. of Westminster (desktop grid)
  – Ibercivis (Spanish consortium)
Integrating BOINC
● HTCondor (U. of Wisconsin)
  – Goal: BOINC-based back end for Open Science Grid or any Condor pool
[Diagram components: job submission, HTCondor node, grid manager, BOINC GAHP, BOINC server]
Integrating BOINC
● HUBzero (Purdue)
  – Goal: BOINC-based back end for science portals such as nanoHUB
[Diagram components: Hub, BOINC server, projects, PCs]
Proposal: Science@home
● Single “brand” for volunteer computing
● Volunteers register for science areas rather than projects
● How to allocate computing power?
  – Involve the HPC, scientific funding communities
Implementing Science@home
● BOINC “account manager” architecture
[Diagram components: Science@home (account manager), BOINC client, projects]
Summary
● Volunteer computing is
  – Usable for most HTC applications
  – A path to ExaFLOPS computing
  – A way to popularize science
● BOINC provides the software infrastructure
● Barriers are largely organizational