Introduction to the BOINC software

Post on 06-Jan-2016


Introduction to the BOINC software

David P. Anderson

Space Sciences Laboratory

University of California, Berkeley

Outline

• Abstractions

• The BOINC server software

• The BOINC client software and runtime system

Design goals

• A few applications, lots of jobs

• High performance

– millions of jobs per day

• Scalability

• Fault tolerance

Abstractions

• Platform

• App version

– a collection of files, one of which is an executable main program

– associated with a platform

• App

– a set of app versions that all perform roughly the same computation

– may have versions for different platforms

– may have different versions for one platform (GPU, non-GPU)

Abstractions

• Workunit (job)

– a collection of input files

– associated with an app (not an app version!)

– attributes: resource estimates and bounds, latency bound

• Result (job instance)

– a collection of output files

– associated with a workunit
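The relationships among these abstractions can be sketched as a small data model. This is only an illustration: the class and field names are hypothetical and are not the BOINC database schema, and the platform strings are just examples.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AppVersion:
    platform: str          # example value: "windows_x86_64"
    files: tuple           # every file in the version
    main_program: str      # the one file that is the executable
    uses_gpu: bool = False


@dataclass
class App:
    name: str
    versions: list = field(default_factory=list)

    def versions_for(self, platform):
        """All versions of this app built for one platform."""
        return [v for v in self.versions if v.platform == platform]


@dataclass
class Workunit:
    app: App               # tied to the app, not to an app version
    input_files: tuple
    flops_estimate: float  # resource estimate
    flops_bound: float     # resource bound
    latency_bound_s: float # latency bound, in seconds


@dataclass
class Result:
    workunit: Workunit     # one instance (replica) of the job
    output_files: tuple = ()


app = App("example_app")
app.versions.append(AppVersion("x86_64-pc-linux-gnu", ("main",), "main"))
app.versions.append(
    AppVersion("x86_64-pc-linux-gnu", ("main_gpu",), "main_gpu", uses_gpu=True))
app.versions.append(AppVersion("windows_x86_64", ("main.exe",), "main.exe"))

wu = Workunit(app, ("in0",), 1e12, 1e13, 86400.0)
replicas = [Result(wu), Result(wu)]
```

Because the workunit points at the app rather than at a version, the same job can be run by whichever version suits the host that receives it.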

Anatomy of a BOINC project

MySQL database

project root/
    bin/
    cgi-bin/
    download/
        00/ .. 3ff/
    html/
    log_*/
    templates/
    upload/
        00/ .. 3ff/

servers; daemons and periodic tasks

clients
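The 00 .. 3ff subdirectories suggest that files are spread over 0x400 (1024) buckets, so no single directory grows too large. A minimal sketch of that idea follows. The hash function and the directory naming are assumptions made for illustration and may not match what BOINC actually computes.

```python
import hashlib

FANOUT = 0x400  # 1024 buckets, matching the 00 .. 3ff range on the slide


def hier_dir(filename: str) -> str:
    """Map a file name to a hex bucket name (illustrative hash, an assumption)."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % FANOUT
    return format(bucket, "x")


def download_path(project_root: str, filename: str) -> str:
    """Path of a file under the hashed download hierarchy."""
    return f"{project_root}/download/{hier_dir(filename)}/{filename}"
```

The mapping depends only on the file name, so any server program can locate a file without a lookup table.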

Work generator

(Diagram: the work generator, shown with the MySQL database and the project root directory tree.)

• Creates input files

• Creates workunits

• One per app

• Flow control

– disk space

– DB size
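Flow control can be sketched as a loop that creates jobs only while the backlog is below a threshold. This is a simulation under stated assumptions: `count_unsent` and `make_job` are hypothetical callables standing in for a database query and for the input-file and workunit creation steps, and the thresholds are arbitrary.

```python
import time


def run_work_generator(count_unsent, make_job, low_water=500, batch=100,
                       max_iterations=None, sleep_s=0.0):
    """Keep roughly low_water unsent jobs queued; return how many were created."""
    created = 0
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        if count_unsent() >= low_water:
            time.sleep(sleep_s)   # enough work queued: back off
            continue
        for _ in range(batch):
            make_job()            # create input files, then the workunit
            created += 1
    return created


queue = []
n_created = run_work_generator(lambda: len(queue), lambda: queue.append("wu"),
                               low_water=250, batch=100, max_iterations=10)
```

Here the generator stops adding work after three batches, because the simulated backlog has passed the threshold. A disk-space check could be added to the same condition.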

Specifying a job

• Workunit template

– XML document describing input files (logical and physical names) and job attributes

• Result template

– XML document describing output files

• create_work()

– specifies templates, app, input files

Validator

• Check result validity

• Compare replicas

• May be app-specific

(Diagram: the validator, shown with the MySQL database and the project root directory tree.)

Validation

• Clients may

– return bad results

– exaggerate claimed credit

• Strategies

– app-specific consistency checking

– replication, using fuzzy comparison or homogeneous redundancy

– adaptive replication
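Replication with fuzzy comparison can be sketched as follows: results are treated as lists of numbers, two results agree if they match within a tolerance, and a result becomes canonical once a quorum of replicas agree with it. The function names, the tolerance, and the quorum of 2 are assumptions for illustration, not BOINC's validator interface.

```python
import math


def equivalent(a, b, rel_tol=1e-6):
    """Fuzzy comparison: same length and element-wise close."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))


def find_canonical(results, quorum=2):
    """Index of the first result that a quorum of replicas (itself included)
    agree with, or None if there is no agreement yet."""
    for i, candidate in enumerate(results):
        agreeing = sum(1 for r in results if equivalent(candidate, r))
        if agreeing >= quorum:
            return i
    return None


replicas = [[1.0, 2.0], [1.0000000001, 2.0], [5.0, 2.0]]
canonical = find_canonical(replicas)
```

When no quorum is reached, a project would typically issue another replica and try again.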

Assimilator

• Processes completed results

• App-specific

(Diagram: the assimilator, shown with the MySQL database and the project root directory tree.)

Summary

• Create app, app versions for different platforms

• Develop work generator

• Develop validator

• Develop assimilator

Isn’t there a simpler way?

Single-job submission

• Assemble your input files and executable, then

boinc_submit --input foo --output blah program

• How this works:

– uses “wrapper” app

– executable is part of workunit

– templates are created automatically

• What it doesn’t do:

– multi-platform

– validation

Job dispatch

(Diagram: the job dispatch components, shown with the project root directory tree.)

• MySQL database

• transitioner

• feeder

• shared-memory job cache

• scheduler (CGI or FastCGI)

• clients
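The division of labor between the feeder and the scheduler can be sketched with a fixed-size cache: the feeder refills empty slots from the database, and the scheduler hands out jobs from the cache without querying the database on every request. The class below is a toy model with hypothetical names. A Python list stands in for the shared-memory segment, and the transitioner is not modeled.

```python
class JobCache:
    """Fixed-size array of slots standing in for the shared-memory job cache."""

    def __init__(self, size):
        self.slots = [None] * size

    def fill(self, db_queue):
        """Feeder side: move unsent jobs from the database into empty slots."""
        added = 0
        for i, slot in enumerate(self.slots):
            if slot is None and db_queue:
                self.slots[i] = db_queue.pop(0)
                added += 1
        return added

    def take(self, wanted, accept=lambda job: True):
        """Scheduler side: hand out up to `wanted` jobs the client can accept."""
        sent = []
        for i, job in enumerate(self.slots):
            if len(sent) == wanted:
                break
            if job is not None and accept(job):
                sent.append(job)
                self.slots[i] = None   # slot is free for the feeder again
        return sent


db = list(range(10))        # ten unsent jobs, identified by number
cache = JobCache(4)
first_fill = cache.fill(db)
sent = cache.take(2)
second_fill = cache.fill(db)
```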

File transfer

(Diagram: the file transfer components, shown with the project root directory tree.)

• Apache

• file upload handler

• clients

Janitorial daemons

(Diagram: the janitorial daemons, shown with the MySQL database and the project root directory tree.)

• file deleter

• DB purger

Ways to deploy a BOINC server

• Linux server

• Server VM for VMware

• Server VM for Amazon EC2

The BOINC runtime system

Directory structure:

BOINC/
    projects/
        lhcathome/
            physical_name0
            physical_name1
        setiathome/
    slots/
        0/
            logical_name0 (link file)
            logical_name1
        1/

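(Diagram: the BOINC client and the application, which is linked with the BOINC runtime, communicate by shared-memory message passing.)

• application to client: fraction done, CPU time

• client to application: suspend, resume, quit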

Basic API

• boinc_init()

– creates a thread that handles messages

• boinc_finish()

– creates a “finish file”

• boinc_resolve_filename()

– maps logical to physical file names
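The logical-to-physical mapping can be sketched as reading a link file in the slot directory. This is a Python stand-in for what boinc_resolve_filename() does conceptually, not the real API: the `<soft_link>` tag format used here is an assumption, and `resolve_filename` and `demo` are hypothetical names.

```python
import os
import tempfile

LINK_OPEN, LINK_CLOSE = "<soft_link>", "</soft_link>"


def resolve_filename(logical_path):
    """Return the physical path named by a link file, else the path itself."""
    try:
        with open(logical_path, "r", encoding="utf-8", errors="replace") as f:
            head = f.read(4096).strip()
    except OSError:
        return logical_path          # no such file: use the name as given
    if head.startswith(LINK_OPEN) and head.endswith(LINK_CLOSE):
        return head[len(LINK_OPEN):-len(LINK_CLOSE)]
    return logical_path              # an ordinary file, not a link


def demo():
    """Create a link file in a temporary slot directory and resolve it."""
    with tempfile.TemporaryDirectory() as slot:
        link = os.path.join(slot, "logical_name0")
        with open(link, "w", encoding="utf-8") as f:
            f.write("<soft_link>../../projects/lhcathome/physical_name0"
                    "</soft_link>\n")
        return resolve_filename(link)
```

The application always opens the resolved name, so the same program works whether a file sits in the slot directory or is only linked from the project directory.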

Checkpointing

• boinc_time_to_checkpoint()

– call at points where you can checkpoint

• boinc_checkpoint_done()

– call when you’re finished checkpointing
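The calling pattern for the two checkpoint functions can be sketched as a main loop that asks, at each safe point, whether to checkpoint, writes its state, and then reports completion. In this Python sketch the callables `time_to_checkpoint` and `checkpoint_done` are stand-ins for boinc_time_to_checkpoint() and boinc_checkpoint_done(). The state file layout and the computation are invented for illustration.

```python
import json
import os
import tempfile


def run(total_steps, state_path, time_to_checkpoint, checkpoint_done):
    """Sum 0..total_steps-1, resuming from state_path if it exists."""
    state = {"next": 0, "acc": 0}
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as f:
            state = json.load(f)
    for i in range(state["next"], total_steps):
        state["acc"] += i
        state["next"] = i + 1
        if time_to_checkpoint():             # a point where state is consistent
            tmp = state_path + ".tmp"
            with open(tmp, "w", encoding="utf-8") as f:
                json.dump(state, f)
            os.replace(tmp, state_path)      # swap in the complete file
            checkpoint_done()                # tell the runtime we are finished
    return state["acc"]


def demo():
    """Run once, then run again with more steps to show resumption."""
    done_calls = []
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "state.json")
        first = run(10, path, lambda: True, lambda: done_calls.append(1))
        resumed = run(20, path, lambda: True, lambda: done_calls.append(1))
    return first, resumed, len(done_calls)
```

Writing to a temporary file and renaming it means an interruption during the write leaves the previous checkpoint intact.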

Compound applications

• Examples:

– coordinator program runs several worker programs in sequence

– “switcher” program probes CPU architecture, selects which executable to run

• Variants of boinc_init() let you specify which app is main program, and how messages are handled

• Each message type must be handled by exactly one process

Long-running applications

• Trickle-up messages

• Trickle-down messages

• Intermediate file transfers

Legacy applications

• The BOINC wrapper

– takes XML “job file”

– handles all messages

GPU and multithread apps

• Server

– you supply a function that takes an app version and a host, and returns resource usage and estimated FLOPS

– the BOINC scheduler chooses the best version

• Client

– senses and reports coprocessors (e.g. NVIDIA GPUs)

– coprocessor-aware scheduling and work fetch
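The server-side selection can be sketched as a project-supplied function that returns resource usage and estimated FLOPS for a version on a host, followed by a choice of the fastest usable version. Everything here is hypothetical: the `Host` and `Usage` fields, the version kinds, and the efficiency factors are assumptions, not BOINC's actual interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    cpu_flops: float         # per-core speed
    ncpus: int
    gpu_flops: float = 0.0   # 0 means no usable GPU


@dataclass
class Usage:
    ncpus: float
    ngpus: float
    flops: float             # estimated speed of this version on this host


def plan(version_kind: str, host: Host) -> Optional[Usage]:
    """Project-supplied function: usage and speed, or None if unusable."""
    if version_kind == "cpu":
        return Usage(1, 0, host.cpu_flops)
    if version_kind == "mt":     # multithreaded, assumed 90% efficient
        return Usage(host.ncpus, 0, 0.9 * host.ncpus * host.cpu_flops)
    if version_kind == "gpu" and host.gpu_flops > 0:
        return Usage(0.1, 1, host.gpu_flops)
    return None


def best_version(kinds, host):
    """Return (kind, usage) with the highest estimated FLOPS, or None."""
    best = None
    for kind in kinds:
        usage = plan(kind, host)
        if usage is not None and (best is None or usage.flops > best[1].flops):
            best = (kind, usage)
    return best


no_gpu = best_version(["cpu", "mt", "gpu"], Host(1e9, 4))
with_gpu = best_version(["cpu", "mt", "gpu"], Host(1e9, 4, gpu_flops=5e10))
```

On the host without a GPU the multithreaded version wins. On the host with a GPU the GPU version does.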