Cluster Computer for Bioinformatics Applications
Nile University,
Bioinformatics Group.
Hisham Adel
2008
Points
• Introduction.
• Cluster and Supercomputers.
• Cluster Types and Advantages.
• Our Cluster.
• Cluster Performance.
• Cluster Computer for Basic Problems.
• General Idea about Sequence Alignment.
• BLAST and Parallel BLAST Algorithm.
• Sequence Alignment and Parallel Sequence Alignment.
• Learned Skills.
Cluster Definition
• A group of computers and servers, connected together, that act like a single system.
• Each system is called a node.
• A node contains one or more processors, RAM, a hard disk, and a LAN card.
• Nodes work in parallel.
• Performance can be increased by adding more nodes.
Cluster Types
• Load-balancing cluster (used here for Parallel BLAST).
• Computing cluster (used here for parallel sequence alignment).
• High-availability (HA) cluster.
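To make the load-balancing idea concrete, here is a minimal sketch of one common way to spread independent queries over nodes: round-robin assignment. The slides do not say how queries were distributed, so the function name `assign_round_robin` and the query names are hypothetical, and this is an illustration rather than the scheme used on this cluster.

```python
def assign_round_robin(queries, node_count):
    """Deal queries out to nodes one at a time, so batch sizes differ by at most one."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    batches = [[] for _ in range(node_count)]
    for index, query in enumerate(queries):
        batches[index % node_count].append(query)
    return batches


# Hypothetical workload: seven query names spread over three worker nodes.
example_queries = ["q1", "q2", "q3", "q4", "q5", "q6", "q7"]
for node, batch in enumerate(assign_round_robin(example_queries, 3), start=1):
    print(f"node {node}: {batch}")
```

Round-robin only balances well when queries cost roughly the same; with very uneven queries, a work queue that hands out the next query to whichever node is idle would likely do better.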
Our Cluster Specification
• Communication: 5-port 10/100 Mbps switch.
• Master node: Duo core processor, 1.86 GHz, 1 GB RAM.
• Node 1: Pentium 4, 1 GB RAM.
• Node 2: Pentium 4, 1 GB RAM.
• Node 3: Pentium 4, 512 MB RAM.
Our Cluster Specification (cont'd)
• Operating system: openSUSE 10.3 (http://software.opensuse.org/)
• Message passing: MPICH2 (http://www.mcs.anl.gov/research/projects/mpich2/)
How to Align Two Sequences
Suppose we have the two sequences A A A C G A and A A T G A, and let match = 1, gap = -1, mismatch = 0.
They can be aligned as:
1-  A A A C G A
    | | | | | |
    A A T _ G A      Score = 3 (4 matches, 1 mismatch, 1 gap)
2-  A A A C _ G A
    | | | | | | |
    A A _ _ T G A    Score = 1 (4 matches, 3 gaps)
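The scoring of the two alignments above can be checked with a short sketch. It uses the slide's values (match = 1, mismatch = 0, gap = -1) and "_" as the gap symbol; the function name `alignment_score` is hypothetical, and the code only scores an alignment that is already given, it does not search for the best one.

```python
MATCH, MISMATCH, GAP = 1, 0, -1  # scoring values taken from the slide


def alignment_score(top, bottom, gap_char="_"):
    """Sum the column scores of two already-aligned, equal-length strings."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for a, b in zip(top, bottom):
        if a == gap_char or b == gap_char:
            score += GAP       # a column containing a gap
        elif a == b:
            score += MATCH     # identical letters
        else:
            score += MISMATCH  # different letters
    return score


print(alignment_score("AAACGA", "AAT_GA"))    # alignment 1
print(alignment_score("AAAC_GA", "AA__TGA"))  # alignment 2
```

With these values the first alignment scores higher, so it would be preferred over the second.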
BLAST Search Types
BLASTN - Compares a nucleotide query sequence against a nucleotide sequence database.
BLASTP - Compares an amino acid query sequence against a protein sequence database.
TBLASTN - Compares a protein query sequence against a nucleotide sequence database (the database is translated in all reading frames).
BLASTX - Compares a nucleotide query sequence (translated in all reading frames) against a protein sequence database.
Parallel BLAST (cont'd)
Formatdb.c
Nucleotide sequence database: “formatdb -i DATABASE -p F”.
Protein sequence database: “formatdb -i DATABASE -p T”.
Parallel BLAST (cont'd)
Linux_Cluster_BLASTALL.c
“blastall -p SEARCH_TYPE -d DATABASE -i QUERY_FILE -o out.txt”
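As a sketch of how the formatdb and blastall command lines on these slides fit together, the helpers below assemble the argument lists from the same flags (-i, -p, -d, -o). The helper names and the file names yeast.nt and query.fasta are hypothetical, and the sketch only builds and prints the commands; it does not reproduce what Formatdb.c or Linux_Cluster_BLASTALL.c actually do.

```python
def formatdb_command(database, is_protein):
    """Build the formatdb call: -p T for a protein database, -p F for nucleotide."""
    return ["formatdb", "-i", database, "-p", "T" if is_protein else "F"]


def blastall_command(search_type, database, query_file, output_file="out.txt"):
    """Build the blastall call for one search type, database, and query file."""
    return ["blastall", "-p", search_type, "-d", database,
            "-i", query_file, "-o", output_file]


# Hypothetical file names, for illustration only.
print(" ".join(formatdb_command("yeast.nt", is_protein=False)))
print(" ".join(blastall_command("blastn", "yeast.nt", "query.fasta")))
```

Keeping each command as a list of arguments makes it possible to hand it to `subprocess.run` unchanged on a machine where the legacy BLAST tools are installed.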
Results
Average of running 1000 queries, 1000 times.
[Chart: Nucleotide-Nucleotide. X axis: database (size): month.htgs (573 MB), drosoph.nt (118.6 MB), igseqnt (67.5 MB), Yeastnt (3.2 MB), mito.nt (3.2 MB), Pdbnt (1.7 MB). Y axis: time (s), 0 to 1.8. Series: 1 node; 3 nodes, query time; 3 nodes, query and communication time.]
Results (cont'd)
Average of running 1000 queries, 1000 times.
[Chart: Amino acid-Amino acid. X axis: database (size): env_nr (1.6 GB), nr (573 MB), SwissProt (160 MB), Pdbaa (20 MB), Yeast.aa (3.2 MB). Y axis: time (s), 0 to 90. Series: 1 node, query time; 3 nodes, query time; 3 nodes, query and communication time.]
Results (cont'd)
Average of running 1000 queries, 1000 times.
[Chart: Amino acid-Nucleotide. X axis: database (size): env_nr (1.6 GB), Swissprot (160 MB), nr (84.7 MB), Pdbaa (20.4 MB), yeast.aa (3.2 MB). Y axis: time (s), 0 to 90. Series: 1 node, query time; 3 nodes, query time only; 3 nodes, query and communication time.]
Conclusion about Parallel BLAST
• Performance: better when using the cluster.
• Scalability: query time decreases as more nodes are added.
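The two conclusions above are usually quantified as speedup (single-node time divided by cluster time) and efficiency (speedup divided by the number of nodes). The sketch below shows the arithmetic; the timings in it are hypothetical placeholders, not values read from the result charts, and the function names are illustrative.

```python
def speedup(single_node_time, cluster_time):
    """How many times faster the cluster run is than the single-node run."""
    if single_node_time <= 0 or cluster_time <= 0:
        raise ValueError("times must be positive")
    return single_node_time / cluster_time


def efficiency(single_node_time, cluster_time, node_count):
    """Speedup per node; 1.0 would mean perfect scaling."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return speedup(single_node_time, cluster_time) / node_count


# Hypothetical timings in seconds (NOT taken from the charts):
one_node = 60.0
three_nodes_with_communication = 25.0
print(f"speedup:    {speedup(one_node, three_nodes_with_communication):.2f}")
print(f"efficiency: {efficiency(one_node, three_nodes_with_communication, 3):.2f}")
```

Using the time that includes communication gives the more honest figure, since on a 10/100 Mbps switch the transfer cost can take up a noticeable share of a short query.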
Learned Skills
• Using the Linux operating system (openSUSE 10.3).
• Programming in the C language.
• Cluster computers and how to build one.
• MPICH2 for message passing between nodes.
• LaTeX.
• Teamwork and helping each other.
• Presentation skills.