WP3 Tools: distributed molecular simulations and remote submission interface
Toni Giorgino
Computational Biochemistry and Biophysics Lab / GRIB-IMIM
www.multiscalelab.org
22-23 Jan 2009
VPH NoE Meeting - Oxford Thursday, January 22, 2009 Toni Giorgino 13
MD at the microsecond scale
• Full-atom MD is computationally expensive
• All solvation water molecules modeled
• Non-bonded interactions
• All degrees of freedom in a protein
• Tackled at several scales
• Combine advances of large-scale infrastructures ⇒ distributed computing
• ...with those in commodity computing architectures ⇒ GPU
[Figure: throughput at two scales. Within-node: one node (CPU), 0.4 ns/day; one node, accelerated, 4.5 ns/day. Among-nodes: distributed computing, × 1,700, growing.]
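A rough worked example of what these rates mean for a 1 µs trajectory. The two per-node rates are the ones quoted on the slide; reading "× 1,700" as 1,700 accelerated nodes, each running an independent trajectory, is an assumption made only for this sketch, as are the function and variable names.

```python
NS_PER_MICROSECOND = 1000.0


def days_to_simulate(target_ns: float, ns_per_day: float) -> float:
    """Wall-clock days needed to accumulate target_ns of trajectory."""
    if ns_per_day <= 0:
        raise ValueError("ns_per_day must be positive")
    return target_ns / ns_per_day


# Rates quoted on the slide.
cpu_rate = 0.4  # ns/day, one node (CPU)
gpu_rate = 4.5  # ns/day, one node, accelerated

cpu_days = days_to_simulate(NS_PER_MICROSECOND, cpu_rate)
gpu_days = days_to_simulate(NS_PER_MICROSECOND, gpu_rate)
speedup = gpu_rate / cpu_rate

# ASSUMPTION: "x 1,700" is read as 1,700 accelerated nodes, each running
# its own independent trajectory. Aggregate sampling then scales linearly,
# but the length of any single trajectory does not.
nodes = 1700
aggregate_ns_per_day = nodes * gpu_rate

print(f"CPU node:         {cpu_days:8.1f} days per microsecond")
print(f"Accelerated node: {gpu_days:8.1f} days per microsecond")
print(f"Within-node speedup: {speedup:.2f}x")
print(f"Aggregate sampling (assumed): {aggregate_ns_per_day:.0f} ns/day")
```

Under these assumptions a single CPU node needs roughly 2,500 days per microsecond and an accelerated node roughly 220 days, which is why the among-nodes scale matters for aggregate sampling.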
Tools
[Figure: throughput at two scales. Within-node: one node (CPU), 0.25 ns/day; one node, accelerated, 4.6 ns/day. Among-nodes: distributed computing, × 1,700, growing.]
• Within-node: AceMD
• Among-nodes: Boinc
Computing architecture
• Based on BOINC (Berkeley Open Infrastructure for Network Computing)
• Allows (loosely-coupled) computations to be distributed over the Internet
• Volunteers contribute CPU cycles
• Not MD-specific
• Public statistics at www.boincstats.com
Using computing power
• The infrastructure becomes useful for the community with a distributed submission system
[Figure: throughput at two scales. Within-node: one node (CPU), 0.25 ns/day; one node, accelerated, 4.6 ns/day. Among-nodes: distributed computing, × 2,000, growing. Users reach the infrastructure through a "Sim. entrypoint": Research team 1 – hERG; Scientist 2 – gA; …; Research team N – protN.]
Remote simulation submission
• Pre-defined “best practice” submission protocols
• Researchers stage files in and out
  – Boinc takes charge of them
• Accounting:
  – Given the scale of deployment, increasing “trustedness” associated with each simulation
  – FLOPS consumed
Example: submission
1. Client parses local files and prepares a description of the simulation (may contain user-supplied metadata)
2. Inputs are uploaded to a staging area
3. Execution
   • Consistency check
   • Move to execution area
   • Launch simulation
4. Results are produced continuously; “work units” are spawned until the protocol ends (# of ns × bins)
5. Clients periodically retrieve & delete results
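Steps 1 to 3 above can be sketched as follows. This is a minimal local illustration, not the actual client or server code: the function names, the JSON description format, the `description.json` file name and the use of SHA-256 checksums as the consistency check are all assumptions made for the example.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum used here as a stand-in for the consistency check."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def describe_simulation(input_dir: Path, metadata: dict) -> dict:
    """Step 1: parse local files and prepare a description (hypothetical format)."""
    files = {p.name: sha256_of(p) for p in sorted(input_dir.iterdir()) if p.is_file()}
    return {"files": files, "metadata": metadata}


def stage(input_dir: Path, staging_dir: Path, description: dict) -> None:
    """Step 2: copy the inputs and their description into a staging area."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    for name in description["files"]:
        shutil.copy2(input_dir / name, staging_dir / name)
    (staging_dir / "description.json").write_text(json.dumps(description, indent=2))


def check_and_move(staging_dir: Path, execution_dir: Path) -> bool:
    """Step 3: consistency check, then move to the execution area.

    Returns False and leaves the staging area untouched if any file is
    missing or does not match its recorded checksum. Launching the
    simulation itself is outside the scope of this sketch.
    """
    description = json.loads((staging_dir / "description.json").read_text())
    for name, digest in description["files"].items():
        staged = staging_dir / name
        if not staged.is_file() or sha256_of(staged) != digest:
            return False
    execution_dir.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(staging_dir), str(execution_dir))
    return True
```

Keeping the description next to the staged files lets the check run without any client-side state, which fits a submission command that returns immediately.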
Technically
• boinc_submit
  − Uploads local input files
  − Creates config, launches WUs, etc.
  − Returns immediately
• boinc_stat
• boinc_retrieve
• [boinc_cancel et al.]
Common traits
• All client commands have a notion of username and password to send for authentication
• Authentication checked with Boinc protocol
• Server remembers uploaded files for later deletion (server-side “RemoteWU” entity)
• A RemoteWU has an ID for the clients to retrieve
  − ideally equal to the usual WU id
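The server-side bookkeeping described above might look like the following in-memory sketch. Everything here is hypothetical: the class and method names are invented for illustration, the plain username/password dictionary merely stands in for the Boinc authentication protocol, and a real server would persist this state rather than keep it in memory.

```python
from __future__ import annotations

import itertools
from dataclasses import dataclass, field


@dataclass
class RemoteWU:
    """Hypothetical server-side record of one remote submission."""
    wu_id: int
    owner: str
    uploaded_files: list[str] = field(default_factory=list)
    results: list[str] = field(default_factory=list)


class RemoteWURegistry:
    def __init__(self, credentials: dict[str, str]) -> None:
        # ASSUMPTION: a plain dict stands in for the Boinc authentication protocol.
        self._credentials = credentials
        self._ids = itertools.count(1)
        self._wus: dict[int, RemoteWU] = {}

    def _owned(self, username: str, password: str, wu_id: int) -> RemoteWU:
        """Authenticate the caller and return a RemoteWU they own."""
        if self._credentials.get(username) != password:
            raise PermissionError("authentication failed")
        wu = self._wus.get(wu_id)
        if wu is None or wu.owner != username:
            raise KeyError(f"no RemoteWU {wu_id} for user {username}")
        return wu

    def submit(self, username: str, password: str, files: list[str]) -> int:
        """Register uploaded files and return the ID immediately."""
        if self._credentials.get(username) != password:
            raise PermissionError("authentication failed")
        wu = RemoteWU(next(self._ids), username, list(files))
        self._wus[wu.wu_id] = wu
        return wu.wu_id

    def record_result(self, wu_id: int, result_file: str) -> None:
        """Called by the (not modeled) execution side as results appear."""
        self._wus[wu_id].results.append(result_file)

    def retrieve(self, username: str, password: str, wu_id: int) -> list[str]:
        """Hand over pending results and delete them from the server record."""
        wu = self._owned(username, password, wu_id)
        pending, wu.results = wu.results, []
        return pending

    def cancel(self, username: str, password: str, wu_id: int) -> list[str]:
        """Forget the RemoteWU and return the uploaded files due for deletion."""
        wu = self._owned(username, password, wu_id)
        del self._wus[wu_id]
        return wu.uploaded_files
```

The point of the sketch is that every command carries credentials and that the ID returned at submission is the only handle a client needs afterwards.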
Outlook
• Expose remote submission interfaces
• Identify ontologies for inclusion of relevant annotations
• Introduce metadata at least for annotating inputs (to be carried over into outputs)