Cluster
A cluster is a type of parallel or distributed processing system consisting of a collection of interconnected stand-alone computers that work together as a single, integrated computing resource.
clus·ter n. 1. A group of the same or similar elements gathered or occurring closely together; a bunch: “She held out her hand, a small tight cluster of fingers” (Anne Tyler).
2. Linguistics. Two or more successive consonants in a word, as cl and st in the word cluster.
Why Parallel Processing?
Evolutionary Computation
Features: it simulates the mechanism of heredity and evolution in living creatures, it can be applied to several types of problems, and it incurs a huge computational cost.
Since there are several individuals, the task can be divided into subtasks.
High Performance Computing
Top500 (http://www.top500.org)

Ranking | Name              | Rmax (Gflops) | # Proc
1       | ASCI White        | 4938          | 8192
2       | ASCI Red          | 2379          | 9632
3       | ASCI Blue Pacific | 2144          | 5808
4       | ASCI Blue         | 1608          | 6144
5       | SP Power III      | 1417          | 1336
Parallel Computers
Commodity Hardware
CPU: Pentium, Alpha, Power, etc.
Networking: Internet, LAN, WAN, Gigabit, wireless, etc.

PCs + Networking → PC Clusters
Why PC Clusters?
Hardware: commodity, off-the-shelf
Software: open source, freeware
Peopleware: university students and staff, lab nerds

High ability, low cost, easy to set up, easy to use, possession
Top500 (http://www.top500.org)

Ranking | Name                  | Rmax (Gflops) | # Proc
60      | Los Lobos             | 237           | 512
84      | CPlant Cluster        | 232.6         | 580
126     | CLIC PIII 800 MHz     | 143.3         | 528
215     | Kepler PIII 650 MHz   | 96.2          | 196
396     | SCore II/PIII 800 MHz | 64.7          | 132
Contents of this tutorial
Concept of PC clusters
Small cluster
Advanced cluster (hardware, software)
Books, Web sites, …
Conclusions
What is a cluster computing system?
Beowulf Cluster
A Beowulf is a collection of personal computers (PCs) interconnected by widely available networking, running one of several open-source Unix-like operating systems. Some Linux clusters are built for reliability instead of speed; these are not Beowulfs. The Beowulf Project was started by Donald Becker when he moved to CESDIS in early 1994. CESDIS was located at NASA's Goddard Space Flight Center and was operated for NASA by USRA.
http://beowulf.org/
Avalon
Los Alamos National Laboratory
Alpha (140) + Myrinet. The first Beowulf in the Top500 ranking.
http://cnls.lanl.gov/Frames/avalon-a.html
The Berkeley NOW project
The Berkeley NOW project is building system support for using a network of workstations (NOW) to act as a distributed supercomputer on a building-wide scale. April 30, 1997: NOW makes the LINPACK Top500! June 15, 1998: NOW Retreat Finale.
http://now.cs.berkeley.edu/
Cplant Cluster
Sandia National Laboratory. Alpha (580) + Myrinet.
http://www.cs.sandia.gov/cplant/
RWCP Cluster
A typical Japanese cluster: SCore, OpenMP, Myrinet.
http://pdswww.rwcp.or.jp/
Doshisha Cluster
Pentium III 0.8 GHz (256) + Fast Ethernet; Pentium III 1.0 GHz (2×64) + Myrinet 2000
http://www.is.doshisha.ac.jp/cluster/index.html
Let’s start building a simple cluster system!
Simple Cluster
$10,000
8 nodes + gateway (file server), Fast Ethernet, switching hub
What do we need? Hardware
CPU, memory, motherboard, hard disk, case, network card, cable, hub
Normal PCs
Classification of Parallel Computers
What do we need? Software
OS; tools (editor, compiler); parallel library (message passing)
Message Passing Libraries
MPI (Message Passing Interface)
PVM (Parallel Virtual Machine)
PVM was developed at Oak Ridge National Laboratory and the University of Tennessee.
MPI is an API for message passing. 1992: MPI Forum; 1994: MPI-1; 1997: MPI-2.
http://www.epm.ornl.gov/pvm/pvm_home.html
http://www-unix.mcs.anl.gov/mpi/index.html
Implementations of MPI
Free implementations: MPICH, LAM, WMPI (Windows 95/NT), CHIMP/MPI, MPI Light
Vendor implementations (for parallel computers): MPI/PRO
Procedure of constructing clusters
Prepare several PCs
Connect the PCs
Install OS and tools
Install development tools and a parallel library
Installing MPICH/LAM
# rpm -ivh lam-6.3.3b28-1.i386.rpm
# rpm -ivh mpich-1.2.0-5.i386.rpm

# dpkg -i lam2_6.3.2-3.deb
# dpkg -i mpich_1.1.2-11.deb
# apt-get install lam2
# apt-get install mpich
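Once installed, programs are compiled and launched with the wrapper commands both implementations provide. A minimal sketch (the program name and node count are assumptions, not from the slides):

$ mpicc -O2 -o hello hello.c
$ mpirun -np 8 ./hello

With MPICH the participating nodes are typically listed in a machines file; with LAM the daemons are started beforehand with lamboot.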
Parallel programming (MPI)
[Diagram: the user submits jobs/tasks through a gateway to the PC cluster, just as on a massively parallel computer.]
Initialization
Communicator
Acquiring the number of processes
Acquiring the rank
Termination
Programming style sheet
#include "mpi.h"

int main( int argc, char **argv )
{
    int nprocs, myrank;

    MPI_Init( &argc, &argv ) ;
    MPI_Comm_size( MPI_COMM_WORLD, &nprocs ) ;
    MPI_Comm_rank( MPI_COMM_WORLD, &myrank ) ;

    /* parallel procedure */

    MPI_Finalize( ) ;
    return 0 ;
}
Communications
One-by-one (point-to-point) communication: process A and process B send and receive data directly.
Group communication.
[Sending]
MPI_Send( void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm )
  void *buf : sending buffer starting address (IN)
  int count : number of data elements (IN)
  MPI_Datatype datatype : data type (IN)
  int dest : rank of the receiving point (IN)
  int tag : message tag (IN)
  MPI_Comm comm : communicator (IN)

[Receiving]
MPI_Recv( void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status )
  void *buf : receiving buffer starting address (OUT)
  int source : rank of the sending point (IN)
  int tag : message tag (IN)
  MPI_Status *status : status (OUT)
One by one communication
~Hello.c~

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int myid, src, dest, tag = 1000, count;
    char inmsg[10], outmsg[] = "hello";
    MPI_Status stat;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    count = sizeof(outmsg) / sizeof(char);
    if (myid == 0) {
        src = 1; dest = 1;
        MPI_Send(&outmsg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        MPI_Recv(&inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
        printf("%s from rank %d\n", inmsg, src);
    } else {
        src = 0; dest = 0;
        MPI_Recv(&inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
        MPI_Send(&outmsg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        printf("%s from rank %d\n", inmsg, src);
    }
    MPI_Finalize();
    return 0;
}
One by one communication
MPI_Recv(&inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
MPI_Send(&outmsg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);

can be combined into a single call:

MPI_Sendrecv(&outmsg, count, MPI_CHAR, dest, tag, &inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
Calculation of PI (approximation)

π = ∫₀¹ 4/(1+x²) dx

[Graph: y = 4/(1+x²) plotted for 0 ≤ x ≤ 1]
Parallel conversion: the integral is divided into subsections, each subsection is allotted to a processor, and the partial results are assembled.
Group communication

Broadcast:
MPI_Bcast( void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm )
  buf : data; root : rank of the sending point

Communication and operation (reduce):
MPI_Reduce( void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm )
  op : operation handle (MPI_SUM, MPI_MAX, MPI_MIN, MPI_PROD); root : rank of the receiving point
Approximation of PI: programming flow
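The flow can be made concrete with a midpoint-rule pi program. This is a minimal sketch following the steps above (broadcast n, compute local sums, reduce); the variable names and the choice of n are assumptions, not from the slides:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int myid, procs, i, n = 100000;   /* number of subsections (assumed) */
    double h, x, sum = 0.0, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &procs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    /* rank 0 broadcasts the number of subsections to every process */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* each process integrates its own share of 4/(1+x^2) over [0,1] */
    h = 1.0 / n;
    for (i = myid; i < n; i += procs) {
        x = h * (i + 0.5);            /* midpoint of subsection i */
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    /* the partial results are assembled on rank 0 */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (myid == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}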
More cluster systems!
Hardware: CPU
Intel Pentium III, IV  http://www.intel.com/
AMD Athlon  http://www.amd.com/
Transmeta Crusoe  http://www.transmeta.com/
Hardware: Network
Ethernet, Gigabit Ethernet, Myrinet, QsNet, Giganet, SCI, Atoll, VIA, InfiniBand
Wake-on-LAN
Hardware: Hard disk
SCSI, IDE, RAID, diskless cluster
http://www.linuxdoc.org/HOWTO/Diskless-HOWTO.html
Hardware: Case
Rack or box; the trade-offs are cost (inexpensive), density (compact), and ease of maintenance.
Software
OS: Linux kernels
Open source, freeware, networking support
Features: the /proc file system, loadable kernel modules, virtual consoles, package management (two of these are illustrated below)
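For example, the /proc file system and the loaded kernel modules can be inspected from the shell (standard Linux commands; the output varies by node):

$ cat /proc/cpuinfo     # CPU type and speed of a node
$ lsmod                 # kernel modules currently loaded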
Linux distributions
Red Hat  www.redhat.com
Debian GNU/Linux  www.debian.org
S.u.S.E.  www.suse.com
Slackware  www.slackware.org
Kernels: http://www.kernel.org/
Administration software
NFS (Network File System), NIS (Network Information System), NTP (Network Time Protocol)
One server, many clients; a sample NFS setup is sketched below.
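A minimal sketch of sharing the gateway's /home over NFS (the host name "gateway" and the 192.168.1.0/24 address range are assumptions):

/etc/exports on the gateway:
  /home 192.168.1.0/255.255.255.0(rw)

/etc/fstab entry on each node:
  gateway:/home  /home  nfs  defaults  0  0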
Resource Management and Scheduling
Process distribution, load balancing, job scheduling of multiple tasks

CONDOR  http://www.cs.wisc.edu/condor/
DQS  http://www.scri.fsu.edu/~pasko/dqs.html
LSF  http://www.platform.com/index.html
The Sun Grid Engine  http://www.sun.com/software/gridware/

A sample Condor submission is sketched after this list.
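As an illustration, a job is handed to Condor with a small submit description file; a minimal sketch (the file and job names are assumptions):

# job.submit
universe   = vanilla
executable = my_job
output     = my_job.out
error      = my_job.err
log        = my_job.log
queue

$ condor_submit job.submit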
Tools for Program Development
Editor: Emacs
Languages: C, C++, Fortran, Java
Compilers:
GNU http://www.gnu.org/
NAG http://www.nag.co.uk
PGI http://www.pgroup.com/
VAST http://www.psrv.com/
Absoft http://www.absoft.com/
Fujitsu http://www.fqs.co.jp/fort-c/
Intel http://developer.intel.com/software/products/compilers/index.htm
Tools for Program Development
Make, CVS
Debuggers: gdb, TotalView  http://www.etnus.com
Free MPI Implementations
MPICH
http://www-unix.mcs.anl.gov/mpi/index.html
Easy to use, high portability: for UNIX, NT/Win, Globus

LAM
http://www.lam-mpi.org/
High availability
[Graphs: speedup vs. number of processes (up to 60) for LAM 6.3.2 and MPICH 1.2.1, at problem sizes 1X, 5X, 10X, 50X, 100X, 500X]
MPICH vs. LAM (SMP)
# nodes: 32 (2 processors each)
Processor: Pentium III 700 MHz
Memory: 128 MB
OS: Linux 2.2.16
Network: Fast Ethernet (TCP/IP), switching hub
Benchmark: DGA; gcc 2.95.3, mpicc, -O2 -funroll-loops
[Graphs: speedup vs. number of processes (up to 30) for LAM 6.4-a3 and MPICH 1.2.0, at problem sizes 1X, 5X, 10X, 50X, 100X, 500X]
MPICH vs. LAM (number of processes)
# nodes: 8
Processor: Pentium III 850 MHz
Memory: 256 MB
OS: Linux 2.2.17
Network: Fast Ethernet (TCP/IP), switching hub
Benchmark: DGA; gcc 2.95.3, mpicc, -O2 -funroll-loops
Profiler
MPE (MPICH)
Paradyn  http://www.cs.wisc.edu/paradyn/
Vampir  http://www.pallas.de/pages/vampir.htm
Message passing libraries for Windows
PVM: PVM 3.4, WPVM
MPI: mpich, WMPI (Critical Software), MPICH/NT (Mississippi State Univ.), MPI/Pro (MPI Software Technology)
Cluster Distribution
FAI http://www.informatik.uni-koeln.de/fai/
Alinka http://www.alinka.com/
Mosix http://www.mosix.cs.huji.ac.il/
Bproc http://www.beowulf.org/software/bproc.html
Scyld http://www.scyld.com/
Score  http://pdswww.rwcp.or.jp/dist/score/html/index.html
Math Libraries
PhiPAC from Berkeley
FFTW from MIT  www.fftw.org
ATLAS (Automatically Tuned Linear Algebra Software)  www.netlib.org/atlas/
ATLAS is an adaptive software architecture; it is faster than other portable BLAS implementations and comparable with the machine-specific libraries provided by vendors.
Math Library
PETSc is a large suite of data structures and routines for both uniprocessor and parallel scientific computing.
http://www-fp.mcs.anl.gov/petsc/
Parallel Genetic Algorithms
Models of Parallel GAs
Master-slave (micro-grained)
Cellular (fine-grained)
Distributed GAs (island, coarse-grained)
Master-slave model
The master node performs crossover, mutation, and selection; the slaves (clients) evaluate individuals.
a) The master delivers each individual to a slave.
b) A slave returns the fitness value as soon as it finishes its calculation.
c) The master then sends it a not-yet-evaluated individual.
A sketch of this dispatch loop in MPI follows.
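A minimal sketch of the master's dispatch loop using the MPI calls introduced earlier (the population size, gene count, and fitness function are assumptions, not from the slides; a termination message to the slaves is omitted for brevity):

#include "mpi.h"

#define POP   64                 /* population size (assumed) */
#define GENES 10                 /* genes per individual (assumed) */

void master(int procs, double pop[POP][GENES], double fit[POP])
{
    MPI_Status st;
    int sent = 0, done = 0, slave;

    /* a) deliver one individual to every slave */
    for (slave = 1; slave < procs && sent < POP; slave++, sent++)
        MPI_Send(pop[sent], GENES, MPI_DOUBLE, slave, sent, MPI_COMM_WORLD);

    while (done < POP) {
        double f;
        /* b) a slave returns a fitness value as soon as it finishes */
        MPI_Recv(&f, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &st);
        fit[st.MPI_TAG] = f;     /* the tag identifies the individual */
        done++;
        /* c) send that slave a not-yet-evaluated individual */
        if (sent < POP) {
            MPI_Send(pop[sent], GENES, MPI_DOUBLE, st.MPI_SOURCE, sent,
                     MPI_COMM_WORLD);
            sent++;
        }
    }
}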
Cellular GAs
Distributed Genetic Algorithms (island GAs)
Each subpopulation evolves independently; at intervals, individuals migrate between subpopulations. A migration sketch follows.
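A minimal sketch of ring migration between islands, one island per MPI process, reusing MPI_Sendrecv from earlier (the number of migrants and the gene count are assumptions):

#include "mpi.h"

#define MIG   4                  /* migrants per exchange (assumed) */
#define GENES 10                 /* genes per individual (assumed) */

/* send MIG emigrants to the next island in the ring and receive
   MIG immigrants from the previous one */
void migrate(double emigrants[MIG][GENES], double immigrants[MIG][GENES],
             int myid, int procs)
{
    MPI_Status st;
    int next = (myid + 1) % procs;
    int prev = (myid + procs - 1) % procs;

    MPI_Sendrecv(emigrants,  MIG * GENES, MPI_DOUBLE, next, 0,
                 immigrants, MIG * GENES, MPI_DOUBLE, prev, 0,
                 MPI_COMM_WORLD, &st);
}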
Searching Ability of DGAs
Books and Web sites
Books
“Building Linux Clusters”
“How to Build a Beowulf”
“High Performance Cluster Computing”
Web sites
IEEE Computer Society Task Force on Cluster Computing  http://www.ieeetfcc.org/
White Paper  http://www.dcs.port.ac.uk/~mab/tfcc/WhitePaper/
Cluster Top500  http://clusters.top500.org/
Beowulf Project  http://www.beowulf.org/
Beowulf Underground  http://www.beowulf-underground.org/
In this tutorial…
The concept of cluster systems, how to build them, and parallel genetic algorithms.
SSI (Single System Image)
Entry point, file directory, control point, virtual network, memory space, job manager, user interface, misc.
Global Computing (Grid)
Several types of computers on the Internet combine into a powerful calculation resource.
Examples: SETI@home, Project RC5
From global to space computing
Distributed Computing