
Science Project to Boinc Server Interfaces Constellation v2

Date post: 24-Oct-2014
Constellation Platform - Science Project to BOINC Server Interfaces
by Dipl.-Ing. (FH) Andreas HORNIG

[email protected]

This text (text only, without images) is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Table of Contents

1 PURPOSE
2 POINT OF CONTACT
3 CONSTELLATION PLATFORM
4 DISTRIBUTED COMPUTING GRID SYSTEM
  4.1 Layer 0: Science Project
  4.2 Layer 1: BOINC-Server / Constellation Platform
  4.3 Layer 2: BOINC-Client
5 APPLICATION
  5.1 Programming Language(s)
  5.2 Hard- and Software
    5.2.1 Operating System
    5.2.2 CPU/GPU
    5.2.3 Execution: USB Flash Drive Test
  5.3 Workunit
  5.4 Resumability
  5.5 Result
  5.6 Control Mechanisms
    5.6.1 Validation
    5.6.2 Result Control
6 COMPARISON
7 REFERENCES:

Keywords: distributed computing, aerospace, engineering, numerics, cloud, services, cluster, server

1 PURPOSE

This document gives an overview of the typical workflow of a distributed computing project and its basic segments. It can be used as decision support when considering whether to get involved.

2 POINT OF CONTACT

Dipl.-Ing. (FH) Andreas HORNIG (Head-of-Platform)

Address:
AerospaceResearch.net/Constellation
König-Karl-Straße 27
70372 Stuttgart / Germany

Email: [email protected]

„Distributed Computing for Humankind!“


3 CONSTELLATION PLATFORM

Constellation[0] is a distributed computing platform open to various aerospace-related applications that need computing power to solve their numerical tasks.

The computing capacity for the tasks is provided by a distributed computing grid formed by volunteers donating (idle) computing time.

The infrastructure for this kind of assignment is called BOINC (Berkeley Open Infrastructure for Network Computing)[1]. It is highly scalable, and its computing nodes range from high-performance cluster computers to energy-efficient netbooks.

4 DISTRIBUTED COMPUTING GRID SYSTEM

A distributed computing grid can be divided into three layers with independent authorities: the Science Project, the BOINC-Server and the BOINC-Client.

4.1 Layer 0: Science Project

The science project provides the software (app), the workunits (wu), technical support in the forums and a project description for the website.

The science project owner and project scientists transfer the files needed for the application and workunits to the BOINC-Server. Workunits can be transferred via a secured FTP connection, on external drives, or via an external, parsable site that the BOINC-Server regularly checks for new workunits. A new application version also has to be transferred to the BOINC-Server.

The platform administration can check in and register workunits and applications on the BOINC-Server.

After the results have been returned and validated, they are available to the science project owner, who can transfer them to another destination via SFTP or external drive.

Figure 1: Distributed Computing Layers - Workflow

4.2 Layer 1: BOINC-Server / Constellation Platform

The BOINC-Server is responsible for sending the application and workunits to the BOINC-Clients and for receiving the results from them.

To find and eliminate erroneous results, a workunit can be replicated according to a quorum and sent to different clients for processing. This makes it possible to detect errors or deviations in the results due to round-off errors or active and passive manipulation. The quorum instructs the validator to compare all results of one workunit and to accept only those that meet the validation criterion. In case of invalid results, additional copies of the workunit are sent to other clients and the new results are included in the validation process. If no valid result can be established, the workunit is marked as failed for the science project owner and project scientist.
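The quorum logic described above can be illustrated with a minimal sketch. It is not the BOINC validator itself: the function names, the `max_copies` limit and the representation of a result as a `bytes` object are assumptions made for illustration only.

```python
from collections import Counter


def validate_workunit(results, quorum):
    """Check the results received so far for one workunit.

    results: list of bytes objects, one per returned result (assumed format).
    quorum:  number of identical results required to accept one as valid.
    Returns ("valid", result) or ("need_more", None).
    """
    if not results:
        return "need_more", None
    # The most frequent result is the candidate for the canonical result.
    candidate, votes = Counter(results).most_common(1)[0]
    if votes >= quorum:
        return "valid", candidate
    return "need_more", None


def process(results_stream, quorum, max_copies):
    """Consume results one by one until the quorum is met.

    max_copies is a hypothetical limit on how many copies of the workunit
    are sent out before the workunit is marked as failed.
    """
    received = []
    for result in results_stream:
        received.append(result)
        status, canonical = validate_workunit(received, quorum)
        if status == "valid":
            return "valid", canonical
        if len(received) >= max_copies:
            break
    return "failed", None
```

With a quorum of 2, a deviating result from one client simply triggers another copy; the workunit only fails once the assumed copy limit is reached without two matching results.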

4.3 Layer 2: BOINC-Client

The BOINC-Client layer is the execution layer, where the application is started and the task is processed.

The user attaches the BOINC-Client software to the Constellation server or any other BOINC project. The client then asks for apps and workunits and downloads them automatically.

Once downloaded, the application is launched, and the client itself monitors the process, displays its progress and pauses or resumes it.

When the workunit's task is finished and the result file is saved, the BOINC-Client sends back the result and cleans up the files that are no longer needed.

There is no remote control interface; all commands for the application have to be stored in the workunits or in the app itself.

5 APPLICATION

The application (app) is the executing part of the science project and solves the numerical task specified in a workunit file.

5.1 Programming Language(s)

The application has to be provided by the science project and can be coded in different languages. Native integration into BOINC is done in C/C++ using the basic API[2], but almost any self-contained executable can be used as long as it does not depend on pre-installed software and includes all components necessary to run. For such executables, the BOINC-Wrapper[3] serves as the interface between the executable and the BOINC-Client. If the application passes a „USB flash drive test“ on a standard, unprepared operating system, the executable can be used for BOINC purposes.

5.2 Hard- and Software

5.2.1 Operating System

Currently (29.05.12), there are 2700 users and 7552 host PCs registered to Constellation[4], with 300 users sending back workunits each day.

The heterogeneous nature of distributed computing results in a mix of 80.5 % Windows, 14.8 % Linux and 4.1 % Mac operating systems. As the system is self-scaling, this mix changes over time, and the project owner can influence it by attaching his own hardware to the project or by promoting the app among the volunteer community.


5.2.2 CPU/GPU

The application can be coded to use the CPU, a GPU[5] or both. Some volunteers actively demand GPU support so that their hardware is used to the maximum, but CPU support covers the basic needs.

Parallel computation is possible within the system environment of a single PC, using all of its CPU cores, its GPU shaders or a combination of both.

5.2.3 Execution: USB Flash Drive Test

Due to the vast number of possible combinations of user hardware and software, an app should include everything it needs for execution, or rely only on support that the system itself provides (operating system, drivers, libraries and binaries).

This can be tested by running the application from a USB flash drive on a PC without administration rights. If the app can be executed and runs successfully, the application bundle is suitable for BOINC.

5.3 Workunit

Workunits must be generated by the project scientist so that each one can be solved independently on one client, without the need to transfer data to other clients. It is a strict server-client connection. On the client itself the workunit can be processed in parallel, for example by splitting a mesh and letting different CPU cores or GPU shaders compute the parts and exchange data between the parts on the same client system.

A workunit runtime can range from several minutes to several days. Runtimes of less than 10 minutes result in a high BOINC-Server load and should be avoided. Runtimes of 200 hours are possible, as ClimatePrediction.net[6] demonstrates, but they reduce the number of volunteers supporting the project, because credits (see later section) are awarded late.

Constellation users voted for runtimes of 12 hours on a 1.6 GHz Intel Centrino Core Duo reference laptop. So the project scientist can choose from a wide range of runtimes.
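As an illustration of how a batch could be cut into independent workunits of a suitable length, the following sketch splits a number of independent cases into ranges of roughly 12 hours each and flags ranges below the 10-minute mark. The function names and the assumption that the runtime per case on the reference machine is known in advance are hypothetical; real workunit generation depends on the application.

```python
import math

TARGET_RUNTIME_S = 12 * 3600   # 12 hours, the runtime Constellation users voted for
MIN_RUNTIME_S = 10 * 60        # shorter workunits cause a high server load


def split_into_workunits(n_cases, seconds_per_case, target_s=TARGET_RUNTIME_S):
    """Split n_cases independent cases into (start, end) ranges, end exclusive.

    seconds_per_case is the assumed runtime of one case on the reference
    machine. Each range is sized to take about target_s seconds.
    """
    if n_cases <= 0 or seconds_per_case <= 0:
        raise ValueError("n_cases and seconds_per_case must be positive")
    cases_per_wu = max(1, int(target_s // seconds_per_case))
    n_wus = math.ceil(n_cases / cases_per_wu)
    ranges = []
    for i in range(n_wus):
        start = i * cases_per_wu
        end = min(start + cases_per_wu, n_cases)
        ranges.append((start, end))
    return ranges


def too_short(wu_range, seconds_per_case, min_s=MIN_RUNTIME_S):
    """Return True if the estimated runtime of a range is below min_s."""
    start, end = wu_range
    return (end - start) * seconds_per_case < min_s
```

For example, 1000 cases of 120 seconds each yield three workunits; the last one is shorter than the others but still far above the 10-minute limit.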

5.4 Resumability

For runtimes of more than 1 hour, a resume functionality should be included. By default the BOINC-Client switches between apps every 66 minutes, and a typical BOINC user is attached to more than one BOINC project besides Constellation. To avoid losing progress that has already been computed, and to avoid exceeding the maximum allowed runtime set by the project scientist, the app should save its current data set so that it can continue from there after resumption.

It also has to be kept in mind that a user could accidentally shut down the PC, that the software can be closed, or that a system error can force the software to close. In these cases the computation can continue from the last save point instead of starting from the beginning.
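A minimal sketch of such a resume functionality is shown below. It does not use the BOINC API; the checkpoint file name, the JSON state format and the sum of squares standing in for the real computation are assumptions for illustration.

```python
import json
import os


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    return {"next_step": 0, "total": 0.0}


def save_checkpoint(path, state):
    """Write to a temporary file first, then rename it over the old file,
    so an interrupted write does not leave a half-written checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, n_steps, checkpoint_every=100, stop_after=None):
    """Sum the squares of 0..n_steps-1 as a stand-in for a real computation.

    stop_after simulates an interruption after that many steps in this run;
    in that case None is returned and work since the last checkpoint is lost.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None
        step = state["next_step"]
        state["total"] += step * step
        state["next_step"] = step + 1
        done_this_run += 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If a run is interrupted after 250 steps, the next call continues from step 200, the last save point, rather than from step 0. The checkpoint interval is a trade-off between lost work and write overhead.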


5.5 Result

The results are the stored solutions of the processed workunit and can be formatted as needed by the science project owner. The result files are sent back by the BOINC-Client and are then used by the validation system.

5.6 Control Mechanisms

5.6.1 Validation

The BOINC-Server includes a validation system that can be used. The project scientist can decide not to use it and simply send out one copy of each workunit and receive one result.

Alternatively, the project scientist can set a quorum and validate the results of the replicated workunits either with a standard bitwise file comparison or with a self-written validator script. The bitwise comparison only checks whether all results are identical at the bit level, which is well suited to simple cases.

Attention has to be paid to how a result file is saved and on what system. For example, line endings differ between Windows (carriage return plus line feed) and Linux (line feed only). This has no influence on the computed data, but the files differ at the bit level and the validator will flag them as invalid.

For more advanced validation, a custom validator can be written and integrated, depending on the needs of the project scientist.
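The line-ending problem can be demonstrated with a short sketch. The comparison functions below are illustrative only and are not part of BOINC; the result format (one number per line) is an assumption.

```python
import math


def bitwise_equal(a: bytes, b: bytes) -> bool:
    """Strict comparison: every byte must match."""
    return a == b


def normalize_line_endings(data: bytes) -> bytes:
    """Convert Windows line endings (CR LF) to Linux line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def text_equal(a: bytes, b: bytes) -> bool:
    """Comparison that ignores the Windows/Linux line-ending difference."""
    return normalize_line_endings(a) == normalize_line_endings(b)


def numbers_close(a: bytes, b: bytes, rel_tol: float = 1e-9) -> bool:
    """Custom comparison for results holding one number per line (assumed
    format): values may differ by a small relative round-off error."""
    xs = [float(t) for t in a.decode("ascii").split()]
    ys = [float(t) for t in b.decode("ascii").split()]
    return len(xs) == len(ys) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(xs, ys)
    )


windows_result = b"1.5\r\n2.0\r\n"
linux_result = b"1.5\n2.0\n"
print(bitwise_equal(windows_result, linux_result))  # False
print(text_equal(windows_result, linux_result))     # True
```

The same data written on Windows and on Linux fails the bitwise check but passes once line endings are normalized. A tolerance-based comparison such as `numbers_close` is one possible starting point for a self-written validator; the tolerance value is a placeholder that the project scientist would have to choose.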

5.6.2 Result Control

A distributed computing grid is as safe as other networks or cloud services. The main difference is that the receivers are spread across the whole internet. On the one hand, processing does not take place at a central location in a closed environment, so validation is what prevents manipulated results from being fed back into the system. On the other hand, hundreds of thousands of distributed workunits, each containing only a small part of the overall solution, are hard to reverse engineer.

The application itself can be protected by standard programming mechanisms, and the results can be encrypted before they are temporarily stored on the user's PC and sent back to the BOINC-Server. If encryption is used, the validation technique has to be able to handle it.

It is also possible to set up an internal BOINC system in an intranet, but in this case the grid does not scale by itself, because external volunteers are missing.

The self-scaling effect also provides a wide distribution of workunits among all users at different locations. The more workunits a workbatch contains, the less probable it is that one external party can assemble all workunits back into one batch.
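The last statement can be illustrated with a toy model. It assumes that every workunit is sent to exactly one host, chosen independently and uniformly at random, which ignores quorum replication and the real BOINC scheduling; the function name and parameters are hypothetical.

```python
from fractions import Fraction


def probability_of_full_batch(n_workunits, hosts_controlled, hosts_total):
    """Probability that one party receives every workunit of a batch.

    Toy model (assumption): each workunit goes to one host picked uniformly
    at random, independently of all others, so the probability is
    (hosts_controlled / hosts_total) ** n_workunits.
    """
    if hosts_total <= 0 or not 0 <= hosts_controlled <= hosts_total:
        raise ValueError("need 0 <= hosts_controlled <= hosts_total, hosts_total > 0")
    if n_workunits < 0:
        raise ValueError("n_workunits must not be negative")
    return Fraction(hosts_controlled, hosts_total) ** n_workunits


# A party controlling 1 of 10 hosts, batches of 2 and of 6 workunits:
print(probability_of_full_batch(2, 1, 10))  # 1/100
print(probability_of_full_batch(6, 1, 10))  # 1/1000000
```

Under this model the probability shrinks exponentially with the batch size, which is the effect described above.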


Figure 2: control mechanisms in workflow – decryption a) on project side b) on server side


6 COMPARISON

(NOTE: The weights and ranks in the decision matrix are examples and are meant to give an impression only.)

7 REFERENCES:

[0] Constellation Platform - http://aerospaceresearch.net/constellation/
[1] BOINC - http://boinc.berkeley.edu/
[2] BOINC Basic Api - http://boinc.berkeley.edu/trac/wiki/BasicApi
[3] BOINC Wrapper App - http://boinc.berkeley.edu/trac/wiki/WrapperApp
[4] BoincStats.com - http://boincstats.com/en/stats/104/project/detail
[5] BOINC GPU Computing - http://boinc.berkeley.edu/wiki/GPU_computing
[6] Climateprediction.net - http://www.climateprediction.net/


Decision matrix for section 6 (COMPARISON). Each option cell lists: value; score; weight x score.

Criterion | Weight (1-5, best) | Cluster | Cloud | DC (public)

Service:
access | 5 | slots; 2; 10 | slots; 2; 10 | permanent; 3; 15
access accommodation | 4 | buy; 1; 4 | paid on demand; 2; 8 | basically free; 3; 12
accessibility | 3 | 99.0%; 2; 6 | 99.9% (guaranteed); 3; 9 | when node drops out it's automatically replaced; 1; 3

Hardware:
capacity | 4 | high; 3; 12 | high; 3; 12 | basic; 2; 8
scalable capacity | 2 | limited; 2; 4 | limited; 2; 4 | self scaling; 2; 4
latency for iterations | 3 | short; 3; 9 | short; 3; 9 | high / not practical; 1; 3

Application Software:
Operating Systems (Linux, Unix, Win, Mac) | 1 | homogeneous; 3; 3 | homogeneous; 3; 3 | heterogeneous; 2; 2
Programming Language | 1 | specific; 3; 3 | specific; 3; 3 | various self executable; 3; 3

Costs:
hardware (basic) | 5 | high; 2; 10 | high; 2; 10 | low; 3; 15
hardware (updates) | 2 | regularly; 2; 4 | regularly; 2; 4 | no, scaling; 3; 6
power (running) | 2 | high; 2; 4 | high; 2; 4 | low; 3; 6

Internet:
workunit transfer | 1 | n; 3; 3 | y; 2; 2 | y; 2; 2
for processing | 1 | n; 3; 3 | n; 3; 3 | y; 1; 1
location interconnection | 1 | y; 3; 3 | y; 3; 3 | y; 3; 3

Safety:
encryption of results | 1 | y; 3; 3 | y; 3; 3 | y; 3; 3
transfer via internet | 3 | y; 3; 9 | y; 3; 9 | y; 3; 9
transfer via flash drives | 2 | y; 3; 6 | y; 3; 6 | y; 3; 6
local processing | 3 | y; 3; 9 | y; 3; 9 | n; 2; 6
world wide distribution of tasks (no single point of attack) | 2 | n; 2; 4 | n; 2; 4 | y; 3; 6
world wide distribution of tasks (publicity) | 2 | n; 3; 6 | n; 2; 4 | y; 1; 2

Public Relation & Outreach:
awareness | 1 | Top500.org; 2; 2 | Top500.org; 2; 2 | direct, citizen science; 3; 3

Total (#) | | 117 | 121 | 118
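The totals of the decision matrix (117 for the cluster, 121 for the cloud, 118 for public distributed computing) are the sums of weight times score over all criteria. The sketch below recomputes them from the weights and scores in the matrix. The labels of the operating system row and of the two "world wide distribution" rows are inferred from the extracted table and should be read as assumptions.

```python
# (criterion, weight, score_cluster, score_cloud, score_dc_public)
# Row labels marked "inferred" were not attached to their rows in the source.
MATRIX = [
    ("access", 5, 2, 2, 3),
    ("access accommodation", 4, 1, 2, 3),
    ("accessibility", 3, 2, 3, 1),
    ("capacity", 4, 3, 3, 2),
    ("scalable capacity", 2, 2, 2, 2),
    ("latency for iterations", 3, 3, 3, 1),
    ("operating systems (inferred)", 1, 3, 3, 2),
    ("programming language", 1, 3, 3, 3),
    ("hardware (basic)", 5, 2, 2, 3),
    ("hardware (updates)", 2, 2, 2, 3),
    ("power (running)", 2, 2, 2, 3),
    ("workunit transfer", 1, 3, 2, 2),
    ("for processing", 1, 3, 3, 1),
    ("location interconnection", 1, 3, 3, 3),
    ("encryption of results", 1, 3, 3, 3),
    ("transfer via internet", 3, 3, 3, 3),
    ("transfer via flash drives", 2, 3, 3, 3),
    ("local processing", 3, 3, 3, 2),
    ("no single point of attack (inferred)", 2, 2, 2, 3),
    ("publicity (inferred)", 2, 3, 2, 1),
    ("awareness", 1, 2, 2, 3),
]

OPTIONS = ("Cluster", "Cloud", "DC (public)")


def weighted_totals(matrix):
    """Return the sum of weight * score for each of the three options."""
    totals = [0, 0, 0]
    for _criterion, weight, *scores in matrix:
        for i, score in enumerate(scores):
            totals[i] += weight * score
    return dict(zip(OPTIONS, totals))


print(weighted_totals(MATRIX))
# {'Cluster': 117, 'Cloud': 121, 'DC (public)': 118}
```

Changing a weight in `MATRIX` shows directly how sensitive the ranking is to the example weights, since the three totals lie within a few points of each other.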

