
VASP 2703 2006 3

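This report examines how the numerical libraries VASP depends on (LAPACK, BLAS, FFT), the compiler options, and the number of CPUs affect the run time of VASP. The main results are as follows.

1. The run time of a C program that calls LAPACK and BLAS, and of VASP on the benchmark bench.hg, depends on the libraries and on the CPU. On a Northwood CPU, VASP built with LAPACK lmkl_lapack64 and BLAS lmkl_p4 ran 51% faster than VASP built with LAPACK liblapack and BLAS libblas. On a Prescott CPU, VASP built with LAPACK lapack_double and BLAS libgoto ran 40% faster than VASP built with LAPACK lapack_double and BLAS lmkl_em64t. Changing the compiler option from -O0 to -O1 made VASP about 60% faster.
2. Running VASP in parallel on Northwood CPUs with LAPACK lmkl_lapack64 and BLAS lmkl made it 36% faster on 2 CPUs, 95% faster on 3 CPUs, and 149% faster on 4 CPUs.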

Contents

1
2
  2.1 CPU
3 BLAS, LAPACK
  3.1
    3.1.1 Northwood
    3.1.2 Prescott
  3.2
  3.3
4 CPU VASP
  4.1
    4.1.1 Northwood
    4.1.2 Prescott
  4.2
    4.2.1 Northwood
    4.2.2 Prescott
5 VASP
  5.1 VASP
6
A
B
C VASP
D MPICH

1

This report examines how the numerical libraries VASP uses (LAPACK, BLAS, FFT), the compiler options, and the number of CPUs affect the run time of VASP.

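2

The measurements proceed in three steps.

1. Time a C program that calls LAPACK and BLAS while changing the libraries it links against.
2. Time VASP on the VASP benchmark bench.hg while changing the libraries and compiler options VASP is built with.
3. Time VASP on more than one CPU.

2.1 CPU

The following two machines were used.

1. CPU: Intel Pentium 4 Northwood (3.2 GHz, FSB = 800 MHz)
   OS: SuSE Linux 9.3
   Motherboard: ASUS P4C800
   L2 cache: 512 KB
   Memory: 1 GB (Transcend PC3200 512 MB ECC DIMM × 2)
   Hard disk: Seagate ST380817AS (Serial ATA, 80 GB) × 1

2. CPU: Intel Pentium 4 Prescott 650 (3.4 GHz, FSB = 800 MHz)
   OS: SuSE Linux 9.3
   Motherboard: SuperMicro PDSGE
   L2 cache: 2 MB
   Memory: 2 GB (PC2-4200 ECC 1 GB DIMM × 2)
   Hard disk: Seagate ST3800817AS (Serial ATA, 80 GB) × 1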

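3 BLAS, LAPACK

3.1

A C program that solves a system of linear equations (Appendix A) was timed while changing the LAPACK and BLAS libraries it links against.

LAPACK (Linear Algebra PACKage) is a linear algebra library distributed through netlib and written in FORTRAN 77; CLAPACK is its C version. BLAS (Basic Linear Algebra Subprograms) supplies the basic vector and matrix operations that LAPACK calls, so LAPACK performance depends on the BLAS underneath it, and BLAS implementations tuned for a particular CPU are available. The timings below therefore vary both LAPACK and BLAS.

According to the LAPACK guide [1], solving an n × n system of linear equations with LAPACK takes about 0.67 N^3 floating-point operations.

3.1.1 Northwood

The program was run on one Intel Pentium 4 Northwood 3.2 GHz machine with different LAPACK and BLAS libraries. Table 3.1 shows the results. With LAPACK liblapack and BLAS libblas, a 1000 × 1000 system took 0.82 s, which is 817 Mflops. Mflops means millions of floating-point operations per second: flops stands for Floating point number Operations Per Second, and M (mega) stands for one million (10^6). A 2000 × 2000 system took 8.24 s, which is 325 Mflops.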

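Table 3.1: Solving N × N linear equations (Northwood).

LAPACK          BLAS       N = 1000             N = 2000
                           time (s)   Mflops    time (s)   Mflops
liblapack       libblas      0.82       817       8.24       325
liblapack       lmkl_p4      0.23      2913       1.53      1752
liblapack       libgoto      0.42      1595       3.72       720
lmkl_lapack64   libblas      0.22      3045       1.33      2015
lmkl_lapack64   lmkl_p4      0.20      3350       1.32      2030
lmkl_lapack64   libgoto      0.20      3350       1.31      2045

With LAPACK liblapack and BLAS libgoto, a 1000 × 1000 system took 0.42 s (1595 Mflops) and a 2000 × 2000 system took 3.72 s (720 Mflops). With LAPACK liblapack, libgoto is therefore faster than libblas.

With LAPACK liblapack and BLAS lmkl_p4, a 1000 × 1000 system took 0.23 s (2913 Mflops) and a 2000 × 2000 system took 1.53 s (1752 Mflops). With LAPACK liblapack, the BLAS libraries rank lmkl_p4, libgoto, libblas from fastest to slowest.

With LAPACK lmkl_lapack64 and BLAS libblas, lmkl_p4 and libgoto, a 1000 × 1000 system took 0.22, 0.20 and 0.20 s (3045, 3350 and 3350 Mflops), and a 2000 × 2000 system took 1.33, 1.32 and 1.31 s (2015, 2030 and 2045 Mflops). LAPACK lmkl_lapack64 is faster than liblapack, and with lmkl_lapack64 the choice of BLAS makes little difference.

The LAPACK and BLAS libraries in Table 3.1 are:

liblapack: the LAPACK that ships with SuSE Linux.
lmkl_lapack64: the LAPACK of the Intel Math Kernel Library. In lmkl, l stands for lib and mkl for Math Kernel Library; lapack64 is its LAPACK component.
libblas: the BLAS that ships with SuSE Linux.
lmkl_p4: the Intel Math Kernel Library component (BLAS, FFT) for Pentium 4; p4 stands for Pentium 4.
libgoto: a BLAS tuned for the Intel Pentium 4 Northwood, available from http://www.tacc.utexas.edu/resources/software/

3.1.2 Prescott

The same program was run on one Intel Pentium 4 Prescott 3.4 GHz machine with different LAPACK and BLAS libraries.

Table 3.2: Solving N × N linear equations (Prescott).

LAPACK          BLAS         N = 1000             N = 2000
                             time (s)   Mflops    time (s)   Mflops
liblapack       libblas        0.86       779       6.60       406
lmkl_lapack64   lmkl_em64t     0.45      1489       4.27       628
lmkl_lapack64   libgoto        0.16      4188       1.07      2505

Table 3.2 shows the results. With LAPACK liblapack and BLAS libblas, a 1000 × 1000 system took 0.86 s (779 Mflops) and a 2000 × 2000 system took 6.60 s (406 Mflops).

With LAPACK lmkl_lapack64 and BLAS lmkl_em64t, a 1000 × 1000 system took 0.45 s (1489 Mflops) and a 2000 × 2000 system took 4.27 s (628 Mflops). With LAPACK lmkl_lapack64 and BLAS libgoto, a 1000 × 1000 system took 0.16 s (4188 Mflops) and a 2000 × 2000 system took 1.07 s (2505 Mflops). On the Prescott CPU, libgoto is thus a faster BLAS than the one in the Intel Math Kernel Library when both are combined with the Intel Math Kernel Library LAPACK.

The LAPACK and BLAS libraries in Table 3.2 are:

liblapack: the LAPACK of CLAPACK. CLAPACK is the Fortran LAPACK translated into C.
lmkl_lapack64: the LAPACK of the Intel Math Kernel Library for EM64T.
libblas: the BLAS of CLAPACK.
lmkl_em64t: the Intel Math Kernel Library for EM64T.
libgoto: a BLAS tuned for the Intel Pentium 4 Prescott.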

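3.2

On the Intel Pentium 4 Northwood 3.2 GHz machine, a C program that solves an eigenvalue problem (Appendix B) was timed while changing the LAPACK and BLAS libraries. According to the LAPACK guide [1], solving an n × n eigenvalue problem with LAPACK takes about 1.33 N^3 floating-point operations.

Table 3.3: Solving N × N eigenvalue problems (Northwood).

LAPACK          BLAS       N = 1000             N = 2000
                           time (s)   Mflops    time (s)   Mflops
liblapack       libblas     10.41       128      79.39        67
liblapack       lmkl_p4      9.90       134      74.87        71
lmkl_lapack64   lmkl_p4      9.62       138      72.47        73

Table 3.3 shows the results. With LAPACK liblapack and BLAS libblas, a 1000 × 1000 matrix took 10.41 s (128 Mflops) and a 2000 × 2000 matrix took 79.39 s (67 Mflops). Changing BLAS to lmkl_p4 gave 9.90 s (134 Mflops) and 74.87 s (71 Mflops). With LAPACK lmkl_lapack64 and BLAS lmkl_p4, the times were 9.62 s (138 Mflops) and 72.47 s (73 Mflops).

3.3

The fastest BLAS depends on the CPU. On Northwood the Intel Math Kernel Library BLAS was the fastest, while on Prescott libgoto, a BLAS tuned for each CPU, was the fastest.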

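4 CPU VASP

4.1

4.1.1 Northwood

VASP was installed on the Intel Pentium 4 Northwood 3.2 GHz machine (Appendix C) and run on the VASP benchmark bench.hg. VASP was compiled with the Intel compiler. Table 4.1 shows the VASP run time for each combination of libraries.

Table 4.1: VASP run time (Northwood).

BLAS       LAPACK          time
lmkl_p4    lapack_double   203.5
lmkl_p4    lmkl_lapack64   201.2
lmkl_p4    liblapack       202.9
libgoto    lapack_double   294.4
libblas    liblapack       306.5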

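4.1.2 Prescott

VASP was installed on the Intel Pentium 4 Prescott 3.4 GHz machine and run on the VASP benchmark bench.hg. VASP was compiled with the Intel compiler. Table 4.2 shows the VASP run time for each combination of libraries.

Table 4.2: VASP run time (Prescott).

BLAS         LAPACK          time
lmkl_em64t   lapack_double   192.7
lmkl_em64t   lmkl_lapack64   192.1
libgoto      lapack_double   137.4

4.2

The Intel compiler offers the optimization levels -O0, -O1, -O2 and -O3. -O0 turns optimization off. -O1 and -O2 optimize for speed, and -O2 is the default on IA-32 Linux. -O3 adds more aggressive optimizations to those of -O2.

-x{K W N B P} generates code specialized for the given Intel processor: K is Pentium III (Katmai), W is Pentium 4 (Willamette), N is Northwood, B is Pentium M (Banias), and P is Prescott. -xP produces code for Intel processors that support SSE3, whereas -xN and -xW target SSE2. Northwood does not support SSE3, so a program built with -xP stops on Northwood with the message

    This program was not built to run on the processor in your system.

-ax{K W N B P} generates code specialized for the given processor together with a generic code path.

4.2.1 Northwood

VASP was built with different compiler options on the Intel Pentium 4 Northwood 3.2 GHz machine and run on the VASP benchmark bench.hg.

Table 4.3: VASP run time by compiler option (Northwood).

OPTION                         BLAS      LAPACK          time
-O0                            lmkl_p4   lapack_double   323.8
-O1                            lmkl_p4   lapack_double   204.6
-O3 -xW -tpp7                  lmkl_p4   lapack_double   203.5
-O3 -axN -xN -tpp7 -ip -mp1    lmkl_p4   lapack_double   203.0
-O0                            lmkl_p4   lmkl_lapack64   323.9
-O1                            lmkl_p4   lmkl_lapack64   206.8
-O3 -xW -tpp7                  lmkl_p4   lmkl_lapack64   201.2
-O3 -axN -xN -tpp7 -ip -mp1    lmkl_p4   lmkl_lapack64   200.2
-O0                            libgoto   lapack_double   435.5
-O1                            libgoto   lapack_double   316.7
-O3 -xW -tpp7                  libgoto   lapack_double   309.0
-O3 -axN -xN -tpp7 -ip -mp1    libgoto   lapack_double   308.7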

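4.2.2 Prescott

VASP was built with different compiler options on the Intel Pentium 4 Prescott 3.4 GHz machine and run on the VASP benchmark bench.hg.

Table 4.4: VASP run time by compiler option (Prescott).

OPTION                         BLAS         LAPACK          time
-O0                            libgoto      lapack_double   254.0
-O1                            libgoto      lapack_double   140.8
-O3 -xW -tpp7                  libgoto      lapack_double   136.4
-O3 -axP -xP -tpp7 -ip -mp1    libgoto      lapack_double   133.8
-O0                            lmkl_em64t   lapack_double   309.0
-O1                            lmkl_em64t   lapack_double   195.3
-O3 -xW -tpp7                  lmkl_em64t   lapack_double   191.0
-O3 -axP -xP -tpp7 -ip -mp1    lmkl_em64t   lapack_double   188.7
-O0                            lmkl_em64t   lmkl_lapack64   308.1
-O1                            lmkl_em64t   lmkl_lapack64   194.7
-O3 -xW -tpp7                  lmkl_em64t   lmkl_lapack64   190.2
-O3 -axP -xP -tpp7 -ip -mp1    lmkl_em64t   lmkl_lapack64   187.9

Tables 4.3 and 4.4 show that, among -O0, -O1 and -O3, the step from -O0 to -O1 accounts for almost all of the reduction in VASP run time.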

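5 VASP

Two Intel Pentium 4 Prescott 3.4 GHz machines were used to run VASP in parallel on more than one CPU. Communication between the CPUs uses MPI (Message Passing Interface), a message-passing interface that can be called from C and Fortran. MPICH was used as the MPI implementation: VASP was built against MPICH (Appendix C) and started with the MPICH command mpirun.

Table 5.1 shows the results. With BLAS lmkl_em64t and LAPACK lmkl_lapack64, the VASP run time was 192.7 on 1 CPU and 121.9 on 2 CPUs, so 2 CPUs made VASP 58% faster. With BLAS lmkl_em64t and LAPACK lapack_double, the run time was 195.3 on 1 CPU and 123.3 on 2 CPUs, again 58% faster. With BLAS libgoto and LAPACK lapack_double, the run time was 137.4 on 1 CPU and 93.5 on 2 CPUs, which is 47% faster.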

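Table 5.1: VASP run time by number of CPUs (Prescott).

NODE   BLAS         LAPACK          time
1      lmkl_em64t   lmkl_lapack64   192.7
2      lmkl_em64t   lmkl_lapack64   121.9
1      lmkl_em64t   lapack_double   195.3
2      lmkl_em64t   lapack_double   123.3
1      libgoto      lapack_double   137.4
2      libgoto      lapack_double    93.5

5.1 VASP

Intel Pentium 4 Northwood 3.2 GHz machines were then used to measure how the VASP run time changes as the number of CPUs grows from 1. With ideal parallelization the run time would be 1/CPU of the 1 CPU run time, where CPU is the number of CPUs. VASP was run in parallel with MPICH.

Figure 5.1 plots the run time against the number of CPUs for BLAS lmkl and libgoto, LAPACK -lmkl_lapack64 and lapack_double, and compiler options from -O0 to -O3 -mp1 -tpp7. From 2 to 4 CPUs the run time decreases roughly in line with 1/CPU, but with 8 CPUs the run time is longer than with 4 CPUs.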

Table 5.2: VASP run time by number of CPUs (Northwood).

BLAS      LAPACK          OPTION            NODE   time
lmkl      lmkl_lapack64   -O0 -mp1          1      341.7
                                            2      220.9
                                            3      155.3
                                            4      116.6
                                            8      119.2
lmkl      lmkl_lapack64   -O1 -mp1          1      213.0
                                            2      158.2
                                            3      110.1
                                            4       85.8
                                            8      102.8
lmkl      lmkl_lapack64   -O3 -mp1 -tpp7    1      210.3
                                            2      154.7
                                            3      107.6
                                            4       84.5
                                            8      102.1
libgoto   lapack_double   -O3 -mp1 -tpp7    1      244.0
                                            2      175.9
                                            3      123.0
                                            4       95.6
                                            8      108.3

Figure 5.1: VASP run time versus number of CPUs.

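6

1. The fastest BLAS and LAPACK depend on the CPU.
2. On 1 CPU, the VASP run time depends on the libraries and the compiler options. On a Northwood CPU, VASP built with LAPACK lmkl_lapack64 and BLAS lmkl_p4 ran 51% faster than VASP built with LAPACK liblapack and BLAS libblas. On a Prescott CPU, VASP built with LAPACK lapack_double and BLAS libgoto ran 40% faster than VASP built with LAPACK lapack_double and BLAS lmkl_em64t. Changing the compiler option from -O0 to -O1 made VASP about 60% faster.
3. Running VASP in parallel on Northwood CPUs with LAPACK lmkl_lapack64 and BLAS lmkl made it 36% faster on 2 CPUs, 95% faster on 3 CPUs, and 149% faster on 4 CPUs.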

A

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <time.h>
    //#include <veclib/veclib.h>
    #include "/usr/local/include/f2c.h"
    #include "/usr/local/include/clapack.h"

    void printmatrix(double *a, double *b, int n);

    int main(void){
        long n, nrhs=1, lda, ldb, info;
        // double A[LDA*LDA], B[LDA*NRHS];
        clock_t start, end;
        int i,j;
        double *a, *b;
        long *ipiv;

        /* read the matrix size N and echo it */
        scanf("%ld",&n);
        printf("%ld\n",n);
        lda=ldb=n;

        a=(double *)malloc(n*n*sizeof(double));
        b=(double *)malloc(n*sizeof(double));
        ipiv=(long *)malloc(n*sizeof(long));

        /* fill A and b with random numbers in [-1, 1] */
        for(i=0;i<n;i++){
            for(j=0;j<n;j++){
                a[i*n+j]=2*(double)random() / RAND_MAX -1.0;
            }
        }
        for(i=0;i<n;i++){
            b[i]=2*(double)random() / RAND_MAX -1.0;
        }

        // printmatrix(a,b,n);
        /* time only the LAPACK solve of A x = b */
        start=clock();
        dgesv_(&n,&nrhs, a, &lda, ipiv, b, &ldb, &info);
        // MatrixInverse(a,b,n);
        // printmatrix(a,b,n);
        end=clock();
        printf("%10.4f\n",(double)(end-start)/CLOCKS_PER_SEC);

        free(a);
        free(b);
        free(ipiv);
        return 0;
    }

    /* print the matrix a with the vector b as an extra column */
    void printmatrix(double *a, double *b, int n){
        int i,j;
        for(i=0;i<n;i++){
            for(j=0;j<n;j++){
                printf("%10.5f",a[i*n+j]);
            }
            printf("%10.5f",b[i]);
            printf("\n");
        }
        printf("\n");
        return;
    }

B

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <time.h>
    //#include <veclib/veclib.h>
    #include "/usr/local/include/f2c.h"
    #include "/usr/local/include/clapack.h"

    void printmatrix(double *a, double *b, int n);

    int main(void){
        long n, lda, lwork, info;
        // double A[LDA*LDA], B[LDA*NRHS];
        char jobs='V', uplo='U';
        clock_t start, end;
        int i,j;
        double *a, *w, *work;

        /* read the matrix size N and echo it */
        scanf("%ld",&n);
        printf("%ld\n",n);
        lda=n;
        lwork=n*3;

        a=(double *)malloc(n*n*sizeof(double));
        w=(double *)malloc(n*sizeof(double));
        /* the work array must hold lwork elements, not n */
        work=(double *)malloc(lwork*sizeof(double));

        /* fill A with random numbers in [-1, 1];
           dsyev_ reads only the triangle selected by uplo */
        for(i=0;i<n;i++){
            for(j=0;j<n;j++){
                a[i*n+j]=2*(double)random() / RAND_MAX -1.0;
            }
        }

        // printmatrix(a,w,n);
        /* time only the LAPACK eigenvalue solve */
        start=clock();
        //dgesv_(&n,&nrhs, a, &lda, ipiv, b, &ldb, &info);
        dsyev_( &jobs, &uplo, &n, a, &lda, w, work, &lwork, &info);
        // MatrixInverse(a,b,n);
        // printmatrix(a,b,n);
        end=clock();
        printf("%10.4f\n",(double)(end-start)/CLOCKS_PER_SEC);

        free(a);
        free(w);
        free(work);
        return 0;
    }

    /* print the matrix a with the vector b as an extra column */
    void printmatrix(double *a, double *b, int n){
        int i,j;
        for(i=0;i<n;i++){
            for(j=0;j<n;j++){
                printf("%10.5f",a[i*n+j]);
            }
            printf("%10.5f",b[i]);
            printf("\n");
        }
        printf("\n");
        return;
    }

C VASP

1. Install Intel Fortran Compiler Version 9.
   Move the license file (*.lic) into /opt/intel/licenses/.

       $ mv ~/Desktop/commercial_for_l_*.lic /opt/intel/licenses/

   Change to the Intel Fortran Compiler Version 9 directory and run the installer.

       $ cd /media/l_fc_p_9_0
       $ ./install.sh

   Answer the installer prompts as follows.

       Please type a selection: 1
       Please type a selection: 2
       /opt/intel/licenses/commercial_for_l_*.lic
       accept
       x. exit

2. Install Intel C++ Compiler Version 9.
   As with Fortran, put the license file in /opt/intel/licenses/, then run the installer.

       $ cd /media/l_cc_p_9_0
       $ ./install.sh

   The steps are the same as for Fortran. The Linux Application Debugger is already installed with Fortran.

3. Add the Fortran and C++ compiler paths to .cshrc.

       $ emacs ~/.cshrc

       set path=($path /opt/intel/fc/9.0/bin /opt/intel/cc/9.0/bin)
       setenv LD_LIBRARY_PATH /opt/intel/mkl72/lib/32:/opt/intel/fc/9.0/lib

4. With YaST2, install gcc, glibc, fftw3 (fftw, fftw3, fftw3-debuginfo, fftw3-devel, fftw3-threads), lapack and blas.

5. Unpack vasp.4.6.tar and vasp.4.lib.tar.

       $ tar -xvf vasp.4.6.tar
       $ tar -xvf vasp.4.lib.tar

   This creates the directories vasp.4.6/ and vasp.4.lib/.

6. Build vasp.4.lib.

       $ cd vasp.4.lib/
       $ cp makefile.linux_ifc_P4 makefile

   vasp.4.lib/ contains makefiles for several environments; the one for Linux, the Intel Fortran compiler (ifc) and Pentium 4 is copied to makefile. Edit makefile:

       $ emacs makefile

   Change FC=ifc to FC=ifort, because the Intel Fortran compiler command was renamed from ifc to ifort. Then run make.

       $ make

7. Build vasp.4.6.

       $ cd vasp.4.6
       $ cp makefile.linux_ifc_P4 makefile
       $ emacs makefile

   7.1. Change FC=ifc to FC=ifort.

   7.2. BLAS
        In the makefile, BLAS is set to /opt/libs/libgoto_p4_512-r0.6.so. Change it as follows.

        Northwood:
            BLAS=-L/opt/intel/mkl72/lib/32 -lmkl_p4 -lsvml
        Prescott:
            BLAS=-L/opt/intel/mkl72/lib/em64t -lmkl_em64t -lpthread -lsvml

        To use libgoto as BLAS, create a directory for it and place libgoto there.

            $ cd /opt/libs/
            $ mkdir libgoto

        Then set BLAS in the makefile to the libgoto build for the CPU, Intel Northwood (or Prescott):

            BLAS=-L/opt/libs/libgoto/libgoto_northwood(prescott)32p-r1.00.so -lpthread -lsvml
            LINK = -lirc -lguide -lsvml -lcprts -lunwind -lcxa -lifport -Wl,-rpath=/opt/libs/libgoto

   7.3. LAPACK
        The default is

            LAPACK=../vasp.4.lib/lapack_double.o

        To use the LAPACK of the Intel Math Kernel Library:

        Northwood:
            LAPACK=-L/opt/intel/mkl72/lib/32 -lmkl_lapack64
        Prescott:
            LAPACK= -L/opt/intel/mkl72/lib/em64t -lmkl_lapack64 -lguide

   7.4. FFT3D
        Northwood:
            FFT3D = fftw3d.o fft3dlib.o /usr/lib/libfftw3.a
        Prescott:
            FFT3D= fft3dfurth.o fft3dlib.o /usr/lib64/libfftw3.a

   7.5. MPI
        For parallel runs, install MPICH as described in Appendix D and set

            FC=ifort -I/usr/lib/mpich-1.2.5.2/
            FCL=/usr/lib/mpich-1.2.5.2/bin/mpif90

   7.6. Run make.

            $ make

        If the error "cannot open shared object file" appears, check the PATH settings. The build creates the executable vasp.4.6/vasp.

8. Run VASP on the benchmark Hg.tar.

       $ tar -xvf Hg.tar
       $ cd Hg
       $ (directory where VASP resides)/vasp

D MPICH

1. Register the IP address and host name of each machine in /etc/hosts.

       192.168.3.4 bob1
       192.168.3.5 bob2

   List the host names in /etc/hosts.equiv and ./hosts.

       bob1
       bob2

1. Download mpich-1.2.5.2.tar.gz from the MPICH site http://www-unix.mcs.anl.gov/mpi/mpich/ and unpack it.

       $ tar -xvf mpich-1.2.5.2.tar.gz
       $ cd mpich-1.2.5.2
       $ ./configure --prefix=/usr/lib/mpich-1.2.5.2

   --prefix= sets the installation directory. The full configure line used here was

       $ ./configure --with-arch=linux --with-device=ch_p4 -fc=ifort -f90=ifort -prefix=/usr/local/bin mpich-1.2.5.2

   Then build and install.

       $ make
       $ make install

2. Test on 1 CPU.

       $ cd /usr/lib/mpich-1.2.5.2/examples
       $ make cpi
       $ ./mpirun -np 1 cpi
       Process 0 on takeda1
       pi is approximately 3.141600989231254, Error is 0.000000833333333323
       wall clock time =0.000000

3. Set up parallel runs.

   3.1. List the machines in machines.linux.

            $ cd /usr/local/mpich-1.2.5.2/share/

        machines.linux:

            takeda1
            takeda2
            # takeda1 takeda2

   3.2. Add /usr/local/mpich-1.2.5.2/bin to PATH.

        For bash:

            $ export PATH=$PATH:/usr/local/mpich-1.2.5.2/bin

        For csh, add to .cshrc:

            set path=($path /usr/local/mpich-1.2.5.2/bin)

   3.3. Test on 2 CPUs.

            $ ./mpirun -np 2 cpi
            Process 0 on takeda1
            Process 1 on takeda2
            pi is approximately 3.141600989231254, Error is 0.000000833333333323
            wall clock time =0.000000

   3.4. Run VASP in parallel. The serial command

            $ (directory where VASP resides)/vasp

        is prefixed with $ ./mpirun -np 2:

            $ ./mpirun -np 2 (directory where VASP resides)/vasp


[1] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen, LAPACK (1995).
[2] P. MPI (2001).
[3] VASP, http://cms.mpi.univie.ac.at/vasp/