MPI (Version 0.99, 2006-11-08)

Contact: ogawa@eedept.kobe-u.ac.jp

Contents

1  MPI (Message Passing Interface)
   1.1  Overview of MPI
   1.2  Collective communication routines
        1.2.1  MPI_GATHER
        1.2.2  MPI_GATHERV
        1.2.3  MPI_ALLGATHER
        1.2.4  MPI_ALLGATHERV
        1.2.5  MPI_REDUCE
   1.3  Parallelizing DO loops
2  Granularity
3  Using MPICH
   3.1  Compilation
4  Execution
A  The C version of the program skeleton
B  Installing MPI on a Linux machine
C  MPI
1  MPI (Message Passing Interface)

1.1  Overview of MPI

MPI (Message Passing Interface) is a library for parallel computation on
distributed-memory machines, in which each CPU (node) has its own local
memory and the processes exchange data by sending messages over a network.
(PVM, the Parallel Virtual Machine, is another message-passing library of
this kind.) Figure 1 shows the hardware model assumed by MPI.

   [Figure 1: The distributed-memory model assumed by MPI: CPU 1, CPU 2,
   CPU 3, ..., CPU N, each with its own memory, connected by Gigabit
   Ethernet (or a faster network).]

An MPI program (FORTRAN) has the following overall structure (the C version
is given in Appendix A):

*     declarations; the MPI header file must be included
      include 'mpif.h'
      integer myrank, nprocs, ierr
*     initialize MPI (necessary)
      call mpi_init(ierr)
*     obtain the number of processes and the rank of this process
      call mpi_comm_size(mpi_comm_world, nprocs, ierr)
      call mpi_comm_rank(mpi_comm_world, myrank, ierr)

      ... (Program Main Body) ...

*     terminate MPI (necessary)
      call mpi_finalize(ierr)
      end

Within the main body every process executes the same code; each process
uses its own value of myrank (an integer between 0 and nprocs-1) to decide
which part of the work it performs.
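As a small usage sketch (not part of the original note; the file name
hello.f is arbitrary), the following complete program follows the skeleton
above and makes every process print its rank and the total number of
processes:

*     hello.f -- minimal example of the MPI program skeleton
      program hello
      include 'mpif.h'
      integer myrank, nprocs, ierr
      call mpi_init(ierr)
      call mpi_comm_size(mpi_comm_world, nprocs, ierr)
      call mpi_comm_rank(mpi_comm_world, myrank, ierr)
      write(*,*) 'process', myrank, 'of', nprocs
      call mpi_finalize(ierr)
      end

Started with, e.g., mpirun -np 4 (see Section 4), it prints one such line
from each of the four processes, in no particular order.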
1.2  Collective communication routines

1.2.1  MPI_GATHER

MPI_GATHER collects data from all processes and stores them, in rank order,
in the receive buffer (recvbuf) of one specified process (the root). The
inverse operation, which distributes data from the root to all processes,
is MPI_SCATTER.

      call mpi_gather(sendbuf, sendcount, sendtype,
     &     recvbuf, recvcount, recvtype, root, mpi_comm_world, ierr)

  sendbuf         send buffer (the data contributed by each process)
  sendcount       number of elements sent by each process
  sendtype        data type of the send buffer (see Table 1)
  recvbuf         receive buffer (significant only at the root)
  recvcount       number of elements received from each process
  recvtype        data type of the receive buffer
  root            rank of the receiving (root) process
  mpi_comm_world  communicator
  ierr            return code (error status)

The data types that may be specified are listed in Table 1.
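As a small usage sketch (not part of the original note; the variable names
and the fixed buffer size are illustrative and assume at most 64 processes),
every process sends its own rank to process 0, which receives the values in
rank order:

      include 'mpif.h'
      integer myrank, nprocs, ierr, i
      integer isend, irecv(64)
      call mpi_init(ierr)
      call mpi_comm_size(mpi_comm_world, nprocs, ierr)
      call mpi_comm_rank(mpi_comm_world, myrank, ierr)
      isend = myrank
*     gather one integer from every process into irecv on rank 0
      call mpi_gather(isend, 1, mpi_integer,
     &     irecv, 1, mpi_integer, 0, mpi_comm_world, ierr)
*     after the call, irecv(i+1) on rank 0 holds the value sent by rank i
      if (myrank .eq. 0) write(*,*) (irecv(i), i=1, nprocs)
      call mpi_finalize(ierr)
      end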
            Table 1: Data types usable in MPI (FORTRAN)

  MPI data type           corresponding FORTRAN data
  ---------------------   --------------------------------------------
  MPI_INTEGER             INTEGER
  MPI_REAL                REAL
  MPI_DOUBLE_PRECISION    DOUBLE PRECISION
  MPI_COMPLEX             COMPLEX
  MPI_2INTEGER            pair {MPI_INTEGER, MPI_INTEGER}
  MPI_2REAL               pair {MPI_REAL, MPI_REAL}
  MPI_2DOUBLE_PRECISION   pair {MPI_DOUBLE_PRECISION, MPI_DOUBLE_PRECISION}

1.2.2  MPI_GATHERV

MPI_GATHERV works like MPI_GATHER, except that the number of elements
contributed by each process may differ (it is related to MPI_GATHER in the
same way that MPI_ALLGATHERV is related to MPI_ALLGATHER). The counts and
the positions in the receive buffer are specified per process.

      call mpi_gatherv(sendbuf, sendcount, sendtype,
     &     recvbuf, recvcount, displs, recvtype, root,
     &     mpi_comm_world, ierr)

  sendbuf         send buffer
  sendcount       number of elements sent by this process
  sendtype        data type of the send buffer
  recvbuf         receive buffer (significant only at the root)
  recvcount       integer array; its (i+1)-th element is the number of
                  elements received from rank i (ranks start at 0)
  displs          integer array; its (i+1)-th element is the displacement
                  in recvbuf at which the data from rank i is placed
  recvtype        data type of the receive buffer
  root            rank of the receiving (root) process
  mpi_comm_world  communicator
  ierr            return code
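As a small usage sketch (not part of the original note; the names and the
fixed array sizes are illustrative and assume at most 8 processes), rank i
contributes i+1 integers, and rank 0 gathers all the pieces back to back:

      include 'mpif.h'
      integer myrank, nprocs, ierr, i, ns, ntot
      integer sbuf(8), rbuf(64), jlen(0:7), idisp(0:7)
      call mpi_init(ierr)
      call mpi_comm_size(mpi_comm_world, nprocs, ierr)
      call mpi_comm_rank(mpi_comm_world, myrank, ierr)
*     rank i sends i+1 copies of its own rank number
      ns = myrank + 1
      do i = 1, ns
         sbuf(i) = myrank
      end do
*     receive counts and displacements, indexed by rank (same on all ranks)
      do i = 0, nprocs-1
         jlen(i)  = i + 1
         idisp(i) = i*(i+1)/2
      end do
      call mpi_gatherv(sbuf, ns, mpi_integer,
     &     rbuf, jlen, idisp, mpi_integer, 0,
     &     mpi_comm_world, ierr)
*     on rank 0, rbuf now holds 0, 1,1, 2,2,2, ...
      ntot = nprocs*(nprocs+1)/2
      if (myrank .eq. 0) write(*,*) (rbuf(i), i=1, ntot)
      call mpi_finalize(ierr)
      end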
1.2.3  MPI_ALLGATHER

MPI_ALLGATHER works like MPI_GATHER, except that the gathered data are
stored in the receive buffer of every process, not only of the root. It is
equivalent to an MPI_GATHER followed by an MPI_BCAST of the result.

      call mpi_allgather(sendbuf, sendcount, sendtype,
     &     recvbuf, recvcount, recvtype, mpi_comm_world, ierr)

  sendbuf         send buffer
  sendcount       number of elements sent by each process
  sendtype        data type of the send buffer
  recvbuf         receive buffer (filled on every process)
  recvcount       number of elements received from each process
  recvtype        data type of the receive buffer
  mpi_comm_world  communicator
  ierr            return code

1.2.4  MPI_ALLGATHERV

MPI_ALLGATHERV is to MPI_ALLGATHER what MPI_GATHERV is to MPI_GATHER: the
number of elements contributed by each process may differ, and the result
is stored on every process.

      call mpi_allgatherv(sendbuf, sendcount, sendtype,
     &     recvbuf, recvcount, displs, recvtype,
     &     mpi_comm_world, ierr)
  sendbuf         send buffer
  sendcount       number of elements sent by this process
  sendtype        data type of the send buffer
  recvbuf         receive buffer (filled on every process)
  recvcount       integer array; its (i+1)-th element is the number of
                  elements received from rank i (ranks start at 0)
  displs          integer array; its (i+1)-th element is the displacement
                  in recvbuf at which the data from rank i is placed
  recvtype        data type of the receive buffer
  mpi_comm_world  communicator
  ierr            return code

A typical use looks like the following: each process fills the columns
ista..iend of a work array, and the complete array ADE is then assembled on
every process (para_range is the range-splitting routine of Section 1.3):

c     receive counts and displacements for every rank
      do irank = 0, nprocs-1
         call para_range(1, iemax, nprocs, irank, jsta, jend)
         jjlen(irank) = ild*(iwm-1)*(jend-jsta+1)
         idisp(irank) = ild*(iwm-1)*(jsta-1)
      end do
c     this process handles columns ista..iend; combine all pieces into ADE
      call para_range(1, iemax, nprocs, myrank, ista, iend)
      call mpi_allgatherv(adedmmy(1,ista), jjlen(myrank),
     &     MPI_DOUBLE_PRECISION, ADE, JJLEN, IDISP,
     &     MPI_DOUBLE_PRECISION, MPI_COMM_WORLD, IERR)

1.2.5  MPI_REDUCE

MPI_REDUCE combines the values supplied by all processes with a reduction
operation (sum, product, maximum, and so on; see Table 2) and stores the
result on the root process. For example, the following call adds up the
local values sumi of all processes and places the total in sumall on
rank 0:

      call mpi_reduce(sumi, sumall, 1, mpi_double_precision,
     &     mpi_sum, 0, mpi_comm_world, ierr)
   Table 2: Reduction operations and data types usable with MPI_REDUCE

  operation                           usable data types
  ---------------------------------   ------------------------------------
  MPI_SUM (sum)                       MPI_INTEGER, MPI_REAL,
  MPI_PROD (product)                  MPI_DOUBLE_PRECISION, MPI_COMPLEX
  MPI_MAX (maximum)                   MPI_INTEGER, MPI_REAL,
  MPI_MIN (minimum)                   MPI_DOUBLE_PRECISION
  MPI_MAXLOC (maximum and location)   MPI_2INTEGER, MPI_2REAL,
  MPI_MINLOC (minimum and location)   MPI_2DOUBLE_PRECISION

The arguments of the example call above are:

  sumi                  send buffer (the local value on each process)
  sumall                receive buffer (the combined result; significant
                        only at the root)
  1                     number of elements
  mpi_double_precision  data type
  mpi_sum               reduction operation (here the sum)
  0                     rank of the root process
  mpi_comm_world        communicator
  ierr                  return code

1.3  Parallelizing DO loops

A DO loop is parallelized by letting each process handle a different part
of the iteration range. The subroutine para_range below splits the range
n1..n2 among nprocs processes and returns the subrange ista..iend assigned
to process irank:

c     split the loop range n1..n2 among nprocs processes and return
c     the subrange ista..iend handled by process irank
      subroutine para_range(n1, n2, nprocs, irank, ista, iend)
      iwork = (n2-n1)/nprocs + 1
      ista = min(irank*iwork+n1, n2+1)
      iend = min(ista+iwork-1, n2)
      end
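As a small usage sketch (not part of the original note; the array size and
variable names are illustrative), the following program combines para_range
with the MPI_REDUCE call of Section 1.2.5: each process sums its own part
of an array, and rank 0 collects the grand total. It must be compiled
together with the para_range subroutine above.

      include 'mpif.h'
      integer myrank, nprocs, ierr, i, ista, iend
      double precision a(1000), sumi, sumall
      call mpi_init(ierr)
      call mpi_comm_size(mpi_comm_world, nprocs, ierr)
      call mpi_comm_rank(mpi_comm_world, myrank, ierr)
      do i = 1, 1000
         a(i) = dble(i)
      end do
*     each process sums only the subrange ista..iend of the loop
      call para_range(1, 1000, nprocs, myrank, ista, iend)
      sumi = 0.0d0
      do i = ista, iend
         sumi = sumi + a(i)
      end do
*     combine the partial sums into sumall on rank 0
      call mpi_reduce(sumi, sumall, 1, mpi_double_precision,
     &     mpi_sum, 0, mpi_comm_world, ierr)
      if (myrank .eq. 0) write(*,*) 'total =', sumall
      call mpi_finalize(ierr)
      end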
2  Granularity

When a program is parallelized, the work should be divided at the highest
(outermost) possible level, i.e. with the coarsest granularity, so that
each process does as much work as possible between communications.
Figure 2 shows a source code for a single CPU; Figure 3 shows the same code
with the outermost DO loop of the main program divided among the processes.

      MAIN PROGRAM
      DO II = 1, 8
         CALL SUB1(II)
         CALL SUB2(SUM)
         A(II) = SUM

      SUBROUTINE SUB1(II)
      CALL SUB3(II)

      SUBROUTINE SUB3(II)
      DO I1 = 1, 100
         A(II) = I1 * II
      DO I2 = 1, 100
         B(I2) = A(I2-1) + A(I2+1)

      SUBROUTINE SUB2(SUM)
      CALL SUB4(SUM)

      SUBROUTINE SUB4(SUM)
      DO I1 = 1, 100
         B(I1) = B(I1) + 1.0
      SUM = 0.0
      DO I2 = 1, 100
         SUM = SUM + B(I2)

               Figure 2: Single CPU source code
      MAIN PROGRAM
      DO II = ISTART, IEND      <-- parallelization at the highest level
         CALL SUB1(II)
         CALL SUB2(SUM)
         A(II) = SUM

      SUBROUTINE SUB1(II)
      CALL SUB3(II)

      SUBROUTINE SUB3(II)
      DO I1 = 1, 100
         A(II) = I1 * II
      DO I2 = 1, 100
         B(I2) = A(I2-1) + A(I2+1)

      SUBROUTINE SUB2(SUM)
      CALL SUB4(SUM)

      SUBROUTINE SUB4(SUM)
      DO I1 = 1, 100
         B(I1) = B(I1) + 1.0
      SUM = 0.0
      DO I2 = 1, 100
         SUM = SUM + B(I2)

      Figure 3: The code of Figure 2 with the outermost DO loop of the
      main program parallelized (each process handles II = ISTART, IEND)

3  Using MPICH

The examples in this note use MPICH, a freely available implementation of
MPI (see Appendix B).

3.1  Compilation

With MPICH, FORTRAN 77/90 programs are compiled with mpif77 and C/C++
programs with mpicc. These are wrapper scripts around the underlying
compilers (here the Intel FORTRAN compiler (ifc) and the Intel C/C++
compiler (icc)), so the usual compiler options can be passed through:

  % mpif77 -o exec_file foo.f [-O3 -tpp7 -xw -static]
      # compile and link foo.f into the executable exec_file
  % mpif77 -c foo.f [-O3 -tpp7 -xw -static]
      # compile only
  % mpif77 foo.o -o exec_file
      # link

4  Execution

The compiled program is started with mpirun; the -np option specifies the
number of processes:

  % mpirun -np 8 exec_file
      # run exec_file with 8 processes (on 8 CPUs)

On a system managed by the LSF batch queueing system, the program is not
started directly but submitted as a batch job with bsub:

  % bsub -o logfile.out -q normal -n 8 "mpijob mpirun exec_file"
      # submit exec_file to the queue "normal" as an 8-CPU LSF batch job;
      # standard output is written to logfile.out
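As an aside (not part of the original note), when an MPICH program is run
interactively on a cluster of workstations, the hosts to be used can be
listed in a machine file passed to mpirun; the file name machines.txt below
is arbitrary:

  % mpirun -np 8 -machinefile machines.txt exec_file
      # machines.txt lists one host name per line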
A  The C version of the program skeleton

In C (and C++) the MPI routines play the same roles as in FORTRAN, but
their names are written in mixed case (MPI_Init, MPI_Comm_rank, MPI_Gather,
and so on) and the error code is returned as the function value rather than
through an ierr argument. The FORTRAN skeleton of Section 1.1 becomes:

#include "mpi.h"                      /* include the MPI header file */

int main(int argc, char **argv)
{
    int myrank, error, buffer;        /* used in the main program body */
    MPI_Status status;

    error = MPI_Init(&argc, &argv);   /* necessary */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    /* (main program body) */

    MPI_Finalize();                   /* necessary */
    return 0;
}
B  Installing MPI on a Linux machine

For installing MPI (MPICH) on a Linux machine, see:

  W. Gropp and E. Lusk, Installation and User's Guide to MPICH, a Portable
  Implementation of MPI Version 1.2.5: The ch_shmem device for Shared
  Memory Processors, Mathematics and Computer Science Division, University
  of Chicago and Argonne National Laboratory.
  http://www.mcs.anl.gov/mpi/mpich/index.html

  PVM (Parallel Virtual Machine): http://www.epm.ornl.gov/pvm/pvm_home.html

C  MPI