BOINC: The Year in Review
David P. Anderson
Space Sciences Laboratory, U.C. Berkeley
22 Oct 2009
Volunteer computing
• Throughput is now 10 PetaFLOPS
– mostly Folding@home
• Volunteer population is constant
– 330K BOINC, 200K F@h
• Volunteer computing still unknown in
– HPC world
– scientific computing world
– general public
ExaFLOPS
• Current PetaFLOPS breakdown:
[Bar chart: PetaFLOPS by processor type (NVIDIA, CPU, PS3 (Cell), ATI); bar values shown: 4.6, 2.4, 2.2, 1.2]
• Potential: ExaFLOPS by 2010
– 4M GPUs * 1 TFLOPS * 0.25 availability
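The ExaFLOPS estimate is a simple product of the three factors on the slide. A minimal check of the arithmetic (the variable names are chosen here for illustration):

```python
# Back-of-envelope check of the ExaFLOPS estimate.
gpus = 4_000_000        # 4M volunteer GPUs
flops_per_gpu = 1e12    # 1 TFLOPS per GPU
availability = 0.25     # fraction of time a GPU is usable

potential_flops = gpus * flops_per_gpu * availability
potential_exaflops = potential_flops / 1e18   # 1 ExaFLOPS = 1e18 FLOPS
print(f"{potential_exaflops:.2f} ExaFLOPS")
```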
Projects
• No significant new academic projects
– but signs of life in Asia
• No new umbrella projects
• AQUA@home: D-Wave systems
• Several hobbyist projects
BOINC funding
• Funded into 2011
• New NSF proposal
Facebook apps
• Progress thru Processors (Intel/GridRepublic)
– Web-only registration process
– lots of fans, not so many participants
• BOINC Milestones
• IBM WCG
Research
• Host characterization
• Scheduling policy analysis
– EmBOINC: project emulator
• Distributed applications
– Volpex
• Apps in VMs
• Volunteer motivation study
Fundamental changes
• App versions now have dynamically-determined processor usage attributes (#CPUs, #GPUs)
• Server can have multiple app versions per (app, platform) pair
• Client can have multiple versions per app
• An issued job is linked to an app version
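One way to picture these relationships is the sketch below. It is not BOINC's actual data model: the class names, the `variant` field, and the sample values are hypothetical and exist only to show several versions per (app, platform) pair and a job tied to one of them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AppVersion:
    app: str
    platform: str
    variant: str    # hypothetical label telling versions apart, e.g. "cpu" or "gpu"
    ncpus: float    # processor usage attributes, determined per version
    ngpus: float

@dataclass
class Job:
    name: str
    app_version: AppVersion   # an issued job is linked to one app version

# Two versions of the same app for the same platform (illustrative values).
versions = [
    AppVersion("example_app", "windows_x86", "cpu", ncpus=1.0, ngpus=0.0),
    AppVersion("example_app", "windows_x86", "gpu", ncpus=0.1, ngpus=1.0),
]

def versions_for(app, platform):
    """Return every app version registered for an (app, platform) pair."""
    return [v for v in versions if v.app == app and v.platform == platform]

candidates = versions_for("example_app", "windows_x86")
job = Job("job_1", candidates[1])   # this job is bound to the GPU version
```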
Scheduler request
• Old (CPU only)
– requested # seconds
– current queue length
• New: for each resource type (CPU, NVIDIA, ...)
– requested # seconds
– current high-priority queue length
– # of idle instances
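The new request can be thought of as one record per resource type. The sketch below is illustrative only: the field names and the `wants_work` helper are assumptions, not the actual request format.

```python
from dataclasses import dataclass, field

@dataclass
class ResourceRequest:
    req_seconds: float        # requested # seconds of work
    hp_queue_seconds: float   # current high-priority queue length
    idle_instances: int       # # of idle instances of this resource

@dataclass
class SchedulerRequest:
    # One entry per resource type, keyed by name ("CPU", "NVIDIA", ...).
    resources: dict = field(default_factory=dict)

    def wants_work(self):
        """Hypothetical helper: resource types that are asking for work."""
        return [name for name, r in self.resources.items()
                if r.req_seconds > 0 or r.idle_instances > 0]

request = SchedulerRequest({
    "CPU": ResourceRequest(req_seconds=0.0, hp_queue_seconds=7200.0, idle_instances=0),
    "NVIDIA": ResourceRequest(req_seconds=3600.0, hp_queue_seconds=0.0, idle_instances=1),
})
needy = request.wants_work()
```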
Scheduler reply
• Application versions include
– resource usage (# CPUs, # GPUs)
– FLOPS estimate
• Jobs specify an app version
• A given reply can include both CPU and GPU jobs for a given application
Client: work fetch policy
• When? From which project? How much?
• Goals
– maintain enough work
– minimize scheduler requests
– honor resource shares
• per-project “debt”
[Diagram: work queues for CPU 0 to CPU 3, with "min" and "max" levels marked]
Work fetch for GPUs: goals
• Queue work separately for different resource types
• Resource shares apply to aggregate
Example: projects A, B have same resource share
A has CPU and GPU jobs, B has only GPU jobs
[Diagram: allocation of CPU and GPU between projects A and B]
Work fetch for GPUs
• For each resource type
– per-project backoff
– per-project debt
• accumulate only while not backed off
• A project’s overall debt is weighted average of resource debts
• Get work from project with highest overall debt
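The policy above can be sketched roughly as follows. The slide fixes only three things: debt accumulates per project and per resource while the project is not backed off, overall debt is a weighted average, and the highest overall debt wins. Everything else here is an assumption made for illustration: the accumulation formula (share-proportional entitlement minus actual usage), the weights, and all function names.

```python
def accumulate(debts, backed_off, shares, used, dt):
    """Update one resource type's per-project debts over an interval of dt seconds.

    debts, shares, used: dicts keyed by project; backed_off: set of projects.
    Backed-off projects neither gain nor lose debt for this resource.
    """
    active = [p for p in debts if p not in backed_off]
    total_share = sum(shares[p] for p in active)
    if total_share == 0:
        return
    for p in active:
        owed = dt * shares[p] / total_share      # assumed entitlement
        debts[p] += owed - used.get(p, 0.0)      # minus what was actually used

def overall_debt(project, debts_by_resource, weights):
    """Weighted average of a project's debts across resource types."""
    total_weight = sum(weights.values())
    return sum(w * debts_by_resource[r][project]
               for r, w in weights.items()) / total_weight

def pick_project(projects, debts_by_resource, weights):
    """Get work from the project with the highest overall debt."""
    return max(projects, key=lambda p: overall_debt(p, debts_by_resource, weights))

# A and B have the same resource share; B has no CPU jobs, so it is
# backed off for CPU. A has been using both the CPU and the GPU.
shares = {"A": 1.0, "B": 1.0}
debts = {"CPU": {"A": 0.0, "B": 0.0}, "GPU": {"A": 0.0, "B": 0.0}}
accumulate(debts["CPU"], {"B"}, shares, {"A": 60.0}, dt=60.0)
accumulate(debts["GPU"], set(), shares, {"A": 60.0}, dt=60.0)
weights = {"CPU": 1.0, "GPU": 1.0}               # assumed equal weights
chosen = pick_project(["A", "B"], debts, weights)
```

In this run B ends up with the higher overall debt, so the next GPU work request goes to B.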
Client: job scheduling
• GPU job scheduling
– client allocates GPUs
– GPU prefs
• Multi-thread job scheduling
– handle a mix of single-, multi-thread jobs
– don’t overcommit CPUs
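The "don't overcommit CPUs" constraint can be illustrated with a simple greedy pass over jobs in priority order. This is a sketch of the idea, not the client's actual scheduler; the function name and the job tuples are hypothetical.

```python
def choose_jobs(jobs, ncpus):
    """Pick jobs to run without using more than ncpus CPUs.

    jobs: list of (name, cpus_needed) tuples, highest priority first.
    A job that does not fit in the remaining CPUs is skipped, so lower
    priority single-thread jobs can still fill the gap.
    """
    free = ncpus
    running = []
    for name, need in jobs:
        if need <= free:
            running.append(name)
            free -= need
    return running

# A 4-CPU host with a mix of single- and multi-thread jobs.
first = choose_jobs([("mt4", 4), ("st1", 1), ("mt2", 2)], ncpus=4)
second = choose_jobs([("mt2", 2), ("mt4", 4), ("st1", 1), ("st2", 1)], ncpus=4)
```

In the first case the 4-thread job takes the whole host; in the second, the 4-thread job is skipped and the two single-thread jobs run alongside the 2-thread job.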
GPU odds and ends
• Default install is non-service
• Dealing with sporadic usability
– e.g. Remote Desktop
• Multiple non-identical GPUs
• GPUs and anonymous platform
Other client changes
• Proxy auto-detection
• Exclusive app feature
• Don’t write state file on each checkpoint
Screensaver
• Screensaver coordinator
– configurable
• New default screensaver
• Intel screensaver
Scheduler/feeder
• Handle multiple app versions per platform
• Handle requests for multiple resources
– app selection
– completion estimate, deadline check
• Show specific messages to users
– “no work because you need driver version N”
• Project-customized job check
– jobs need different # of GPU processors
• Mixed locality and non-locality scheduling
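The completion estimate and deadline check can be pictured as below, using the app version's FLOPS estimate and the queue length reported for the relevant resource. This is a simplified sketch: the function names, parameters, and numbers are assumptions, and the real scheduler takes more factors into account.

```python
def estimated_runtime(job_flops, version_flops):
    """Seconds to run a job, from its size and the app version's FLOPS estimate."""
    return job_flops / version_flops

def can_meet_deadline(job_flops, version_flops, queue_seconds, deadline_seconds):
    """True if the job would finish in time behind the work already queued.

    queue_seconds: work already queued on this resource type.
    deadline_seconds: time allowed between sending the job and its deadline.
    """
    finish = queue_seconds + estimated_runtime(job_flops, version_flops)
    return finish <= deadline_seconds

job_flops = 3.6e15   # illustrative job size
gpu_ok = can_meet_deadline(job_flops, 1e11, queue_seconds=10_000, deadline_seconds=86_400)
cpu_ok = can_meet_deadline(job_flops, 2e9, queue_seconds=10_000, deadline_seconds=86_400)
```

With these numbers the GPU version finishes in about 10 hours plus the queue and passes the check, while the CPU version would need about 500 hours and fails it.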
Server
• Automated DB update
• Protect admin web interface
Manager
• Terms of use feature
• Show only projects supporting platform
– need to extend for GPUs
• Advanced view is keyboard navigable
• Manager can read cookies (Firefox, IE)
– web-only install
Apps
• Enhanced wrapper
– checkpointing, fraction done
• PyMW: master/worker Python system
Community contributions
• Pootle-based translation system
– projects can use this
• Testing
– alpha test project
• Packaging
– Linux client, server packages
• Programming
– lots of flames, little code
What didn’t get done
• Replace runtime system
• Installer: deal with “standby after X minutes”
• Global shutdown switch
Things on hold
• BOINC on mobile devices
• Replace Simple GUI
Important things to do
• New system for credit and runtime estimation
– we have a design!
• Keep track of GPU availability separately
• Steer computers with GPUs towards projects with GPU apps
• Sample CUDA app
BOINC development
• Let us know if you want something
• If you make changes of general utility:
– document them
– add them to trunk