HIGH PERFORMANCE
BIOINFORMATICS
Group May 09-06
Bryan McCoy
Kinit Patel
Tyson Williams
Problem/Need Statement
Current ways to solve bioinformatics problems are either slow or very expensive.
There is a need for a way to reduce cost and still deliver high performance in a computer system that can solve bioinformatics problems.
What is Bioinformatics?
Genetic sequencing.
Massive amounts of data.
Simple operations, but many of them.
Perfect for distributed computing.
Proposed Solution
Use a cluster of PlayStation 3s (PS3s) with their embedded Cell processors.
Cell Broadband Engine
Has 1 central PowerPC-based PPE.
Has 8 surrounding SPEs.
The 8 SPEs are connected via the Element Interconnect Bus (EIB).
Functional Requirements
FR1. Ported applications shall run on the Cell B.E.
FR2. The results returned shall be the same as the original program.
FR3. The applications shall return their runtime.
FR4. The applications shall execute in parallel on multiple Cell B.E.s.
Non-Functional Requirements
NF1. The Cells shall all run the Linux OS.
NF2. The runtimes of the ported applications shall be shorter than those of the original applications.
NF3. The ported applications shall be coded in the C language.
Operating Environment
Use the Fedora 9 OS, as it is currently supported by the Cell SDK 3.1.
Use the command line for the user interface.
Use the IBM XLC compiler and/or the current GCC compiler.
Market Survey
Results of the survey point to large speedups for computationally intensive programs.
Dr. Gaurav Khanna at the University of Massachusetts Dartmouth used a cluster of 8 PS3s to replace a supercomputer.
In 2007, Universitat Pompeu Fabra in Barcelona deployed PS3GRID, a BOINC system for collaborative biological computing.
Deliverables
The Source Code.
Compiled Executable.
Runtime Comparisons.
Project Final Report.
Project Poster.
Project Final Presentation.
Work Breakdown Structure
Port Apps to Cluster PS3s
  Problem Definition
    Research Cell B.E.
    Research Bioperf Suite
    Research Distributed Parallel Algorithms
    Research Previously Done Work
  End Product Design
    Design Requirements
    Design Process
    Design Documents
  Considerations and Selections
    Decide Which Linux to Install
    Decide Which Applications to Port
  End Product Implementation
    Hardware Implementation
    Prototyping Implementation
    Software Implementation
  End Product Testing
    Ensure Correctness of Output Results
    Benchmarking
  Final Documentation and Demonstration
    Create Final Report
    Create Project Poster
    Prepare for Presentation
Costs
Time
  Approximately 555 man-hours total.
  Freely donated.
  Total cost: $0.
Equipment
  3 PS3s
  Crossbar router
  Provided for us by the client.
  Total cost: $0.
Resource Requirements
3 PlayStation 3s.
High performance network switch.
Books on distributed computing on Cell.
Time.
Work Schedule
Gantt chart
Risk Assessment
Slow network speed.
Software support.
Limited RAM.
Hardware failure.
Lower-quality entertainment hardware.
Limited prior experience.
Software development schedule.
Design
Further divide the application into multiple threads for SPE execution on multiple PS3s, alter the functional logic, and vectorize the code where possible.
Software Decomposition Diagram
System Requirements
SR1. The system shall allow the user to input multiple DNA sequences in FASTA format through a file interface.
SR2. The system shall output all of the most parsimonious trees implied by the input data to the screen.
SR3. The system shall share computational work among the PPE and SPEs available to each client/server process.
SR4. The front-end shall share computational work with available back-end processes.
SR5. The front-end shall be able to connect to at least 2 back-end processes via a high performance router.
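SR1 calls for FASTA input through a file interface. As a rough illustration only, a minimal FASTA reader might look like the sketch below. It is written in Python for brevity (the project itself targets C per NF3), the function name `parse_fasta` is hypothetical, and nothing here is taken from the DNAPenny or Bioperf source.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {sequence name: sequence} from FASTA-formatted lines.

    Assumption: the name is the first word after '>' on a header line.
    """
    sequences: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            # Fall back to a positional name if the header is empty.
            name = header.split()[0] if header else f"seq{len(sequences) + 1}"
            sequences[name] = ""
        elif name is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            # Sequence data may span several lines; join and upper-case it.
            sequences[name] += line.upper()
    return sequences


example = [">taxonA first sample", "ACGT", "AC", ">taxonB", "acga", "AC"]
print(parse_fasta(example))  # {'taxonA': 'ACGTAC', 'taxonB': 'ACGAAC'}
```

To read from a file instead, an open file object can be passed directly, for example `with open(path) as fh: parse_fasta(fh)`.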
System Analysis
The key is data flow, which is broken into 3 stages.
First, the DNA sequences are distributed to the PPEs and from there down to the SPEs.
Next, each SPE searches the space of possible trees for the best (lowest) parsimony score, using branch and bound to prune the search.
Finally, the results are aggregated back to the main PPE and output.
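The three-stage data flow can be sketched as a scatter, search, and gather pipeline. The sketch below is only an analogy under stated assumptions: Python threads stand in for SPEs, integers stand in for candidate trees, each worker scans its chunk exhaustively rather than using branch and bound, and the names `scatter`, `local_best`, and `distributed_search` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def scatter(items: Sequence[int], n_workers: int) -> List[List[int]]:
    """Stage 1: deal the candidates out round-robin, one chunk per worker."""
    return [list(items[i::n_workers]) for i in range(n_workers)]


def local_best(chunk: List[int], score: Callable[[int], float]) -> Tuple[float, int]:
    """Stage 2: a worker scans its own chunk for the lowest score."""
    return min((score(item), item) for item in chunk)


def distributed_search(items: Sequence[int],
                       score: Callable[[int], float],
                       n_workers: int = 6) -> Tuple[float, int]:
    """Stage 3: gather the per-worker results and keep the global best."""
    if not items:
        raise ValueError("no candidates to search")
    chunks = [c for c in scatter(items, n_workers) if c]  # drop empty chunks
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: local_best(c, score), chunks))
    return min(partials)


# Toy stand-in for a parsimony score: distance from 5 (lower is better).
best = distributed_search([8, 3, 10, 6], lambda x: abs(x - 5), n_workers=3)
print(best)  # (1, 6)
```

Only the small per-worker results travel back in the gather step, which keeps network traffic low; this matters given that slow network speed is listed among the project risks.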
Specifications
Input
  DNA sequence files in FASTA format.
Output
  Runtime of the application.
  The most parsimonious phylogenetic tree.
  The parsimony score of the phylogenetic tree.
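To make the "parsimony score" output concrete: for a fixed tree, the score is the minimum number of character changes needed to explain the aligned sequences. The sketch below uses Fitch's small-parsimony algorithm, a standard textbook technique chosen here for illustration; the slides do not say how DNAPenny computes the score internally. The tree encoding (nested 2-tuples of taxon names) and the function names are assumptions.

```python
from typing import Dict, FrozenSet, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]  # leaf name or (left, right) pair


def _site_cost(node: Tree, seqs: Dict[str, str], site: int) -> Tuple[FrozenSet[str], int]:
    """Return (possible states, change count) for one subtree at one site."""
    if isinstance(node, str):
        return frozenset(seqs[node][site]), 0
    left, right = node
    lset, lcost = _site_cost(left, seqs, site)
    rset, rcost = _site_cost(right, seqs, site)
    common = lset & rset
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, lcost + rcost
    # Children disagree: take the union and count one change.
    return lset | rset, lcost + rcost + 1


def parsimony_score(tree: Tree, seqs: Dict[str, str]) -> int:
    """Sum the Fitch change counts over all aligned sites."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    n_sites = lengths.pop()
    return sum(_site_cost(tree, seqs, i)[1] for i in range(n_sites))


seqs = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"}
print(parsimony_score((("A", "B"), ("C", "D")), seqs))  # 3
print(parsimony_score((("A", "C"), ("B", "D")), seqs))  # 4
```

A tree search such as the one described for the SPEs would evaluate a score like this for many candidate trees and keep the trees with the lowest value.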
Specifications
User Interface
  No changes to the user interface.
  Uses a command line interface.
Specifications
Hardware
  3 PlayStation 3s
  High performance crossbar network switch.
Specifications
Software
  Fedora 9 with Linux 2.6.25 kernel for the PowerPC
  IBM Cell SDK 3.1
  IBM XLC 9.0 and GCC 4.3 compilers.
  DNAPenny 3.6.
  Bioperf Suite
Specifications
Testing
  Compare benchmarked runtimes over several iterations and inputs to get averages.
  Compare these runtimes with the previous group’s runtimes on a single Cell processor.
  Compare these runtimes with the previous group’s runtimes on a high performance server.
    Quad-core Intel Xeon 3.0GHz, 6GB RAM.
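The testing approach (average runtimes over several iterations, plus FR2's requirement that results match the original program) can be sketched as a small harness. This is an illustration under assumptions, not the group's actual test setup: `benchmark`, `reference_sum`, and `ported_sum` are hypothetical names, and the two summing functions are toy stand-ins for the original and ported applications.

```python
import statistics
import time
from typing import Any, Callable, Tuple


def benchmark(func: Callable[..., Any], args: tuple, repeats: int = 5) -> Tuple[Any, float]:
    """Run func(*args) `repeats` times; return (last result, mean seconds)."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        timings.append(time.perf_counter() - start)
    return result, statistics.mean(timings)


def reference_sum(values):
    """Toy stand-in for the original program."""
    return sum(values)


def ported_sum(values):
    """Toy stand-in for the ported program."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(10000))
ref_result, ref_time = benchmark(reference_sum, (data,))
new_result, new_time = benchmark(ported_sum, (data,))

# Correctness first (FR2): the ported output must match the original.
assert new_result == ref_result
print(f"reference: {ref_time:.6f} s  ported: {new_time:.6f} s")
```

Checking the outputs before comparing times keeps a fast but wrong port from being counted as a speedup.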
Acknowledgements
May08-24 group
  Kyle Byerly
  Shannon McCormick
  Matt Rohlf
  Bryan Venteicher
Bioperf developers
  David A. Bader, Georgia Tech
  Yue Li, Univ. of Florida
  Tao Li, Univ. of Florida
  Vipin Sachdeva, IBM Austin
Questions?
Previous Results and Projected Results
| Code revision                           | 4-Way 3.0GHz Machine (seconds) | X Speedup  | PlayStation 3 (seconds) | X Speedup |
|-----------------------------------------|--------------------------------|------------|-------------------------|-----------|
| dnapenny_orig                           | 823.568                        | 1          | 7793.915                | 1         |
| dnapenny_slimmer                        | 360.131                        | 2.28685673 | 941.981                 | 8.273962  |
| parallel_dnapenny_1.0                   | 221.432                        | 3.71928177 | 780.867                 | 9.9811043 |
| supplement_spe_parallel_1SPE            | -                              | -          | 1111.471                | 7.0122522 |
| supplement_spe_parallel_3SPE            | -                              | -          | 443.521                 | 17.572821 |
| supplement_spe_parallel_6SPE            | -                              | -          | 277.233                 | 28.11323  |
| supplement_parallel_vector_1SPE         | -                              | -          | 260.952                 | 29.867236 |
| supplement_parallel_vector_3SPE         | -                              | -          | 153.656                 | 50.723141 |
| supplement_parallel_vector_6SPE         | -                              | -          | 130.59                  | 59.682326 |
| Cluster with 3 PlayStations (Projected) | -                              | -          | ~54.8                   | ~142.224  |
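The speedup columns appear to be the runtime of dnapenny_orig on a given machine divided by the runtime of each revision on that same machine. That reading is an inference from the numbers rather than something the slides state, but it reproduces the listed values, as the short sketch below shows. The function name `speedup` is hypothetical and only a few rows are included.

```python
def speedup(baseline_seconds: float, runtime_seconds: float) -> float:
    """Speedup factor relative to the baseline run on the same machine."""
    if runtime_seconds <= 0:
        raise ValueError("runtime must be positive")
    return baseline_seconds / runtime_seconds


PS3_BASELINE = 7793.915     # dnapenny_orig on the PlayStation 3 (seconds)
SERVER_BASELINE = 823.568   # dnapenny_orig on the 4-way 3.0GHz machine (seconds)

ps3_runtimes = {
    "dnapenny_slimmer": 941.981,
    "supplement_spe_parallel_6SPE": 277.233,
    "supplement_parallel_vector_6SPE": 130.59,
}

for name, seconds in ps3_runtimes.items():
    print(f"{name:34s} {speedup(PS3_BASELINE, seconds):8.3f}x")

print(f"{'parallel_dnapenny_1.0 (server)':34s} {speedup(SERVER_BASELINE, 221.432):8.3f}x")
```

Note that each speedup is relative to its own machine's baseline, so the two speedup columns are not directly comparable; the runtimes in seconds are.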
[Figure: speedup (compared to the original program running on one PPE) versus number of available SPEs + PPE, with linear fit f(x) = 5.72802144736842 x + 21.9361413947368, R² = 0.887915258548363]
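The projected cluster figures can be reproduced from the chart's linear fit. The sketch below assumes that each PS3 contributes 7 processing units (6 usable SPEs plus the PPE), so 3 PS3s give x = 21. The slides do not state this, but it matches the 6-SPE maximum in the results table and yields the projected ~142.224x speedup and ~54.8 second runtime. The constant and function names are hypothetical.

```python
# Coefficients of the linear fit shown on the chart.
SLOPE = 5.72802144736842
INTERCEPT = 21.9361413947368

PS3_BASELINE_SECONDS = 7793.915  # dnapenny_orig on one PPE
UNITS_PER_PS3 = 7                # assumption: 6 usable SPEs + 1 PPE


def projected_speedup(units: int) -> float:
    """Evaluate the fitted line f(x) = SLOPE * x + INTERCEPT."""
    return SLOPE * units + INTERCEPT


units = 3 * UNITS_PER_PS3                  # 21 units for a 3-PS3 cluster
factor = projected_speedup(units)
runtime = PS3_BASELINE_SECONDS / factor

print(f"{units} units: about {factor:.2f}x speedup, about {runtime:.1f} s")
```

This extrapolates a line fitted on 1 to 7 units out to 21, and with R² near 0.89 the fit is only approximate. It also ignores communication between consoles, so the projection is best read as an optimistic estimate.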
Summary
Cost: $0. Equipment provided.
Time: approximately 555 man-hours, freely donated.
Results: projected 4x the performance of a similarly priced system.