Fundamentals of Parallel Programming That Carry Over to Supercomputers


2018.06.04


Windows, Mac, Unix

Part I: Unix. GUI, CUI: Unix, Windows, Mac OS
Part II

June 4
June 5
June 19: SX-ACE
June 22: (for OCTOPUS, VCC)
June 26: SX-ACE (MPI)
June 29: SX-ACE (HPF)
August 23: Gaussian
http://www.hpc.cmc.osaka-u.ac.jp/lecture_event/lecture/

Part I: UNIX

CUI

GUI and CUI
OS (Windows, MacOS X, Unix)
GUI (Graphical User Interface)
CUI (Character User Interface) / CLI (Command Line Interface)
Unix = CUI/CLI

GUI, CUI

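Tips for understanding the CUI, I
A GUI is a map; a CUI is a photograph.
GUI = map, CUI = photograph.
In a CUI, "where you are right now" is what matters.
Basically, you walk there yourself (= you move with commands such as cd).
furihata@cmc.osaka-u.ac.jp (Cybermedia Center, Osaka University; Fundamentals of Parallel Programming That Carry Over to Supercomputers) 2018.06.04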

Tips for understanding the CUI, II
CUI = (shell)

Tips for understanding the CUI, III
CUI, Unix, ssh

Tips for understanding the CUI, IV
Emacs, vi (emacs, vi)

Unix

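Unix commands: directories
  pwd             show the current directory
  cd ..           move to the parent directory
  cd hoge         move into directory hoge
  mkdir hoge      create directory hoge
  rmdir hoge      remove the empty directory hoge
  mv hoge poko    rename hoge to poko, or move hoge into poko if poko is an existing directory

Unix commands: files
  ls              list files
  touch hoge      create an empty file hoge
  rm hoge         delete file hoge
  mv hoge poko    rename hoge to poko, or move hoge into poko if poko is an existing directory

Unix commands: viewing files
  less hoge       view the contents of hoge (similar: more, cat)
  grep kore       search for lines containing kore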

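Unix editors: Emacs
  emacs hoge      open file hoge in emacs
  (C- means hold Ctrl; M- means press Esc)
  C-x C-f         open a file
  C-x C-s         save
  C-x C-c         quit
  C-g             cancel the current command
  C-s hoge        search for hoge
  M-w, C-w, C-y   copy, cut, paste

Unix editors: vi
  vi hoge         open file hoge in vi
  i               enter insert mode
  Esc             return to command mode
  h, j, k, l      move the cursor left, down, up, right
  :wq             save and quit
  :q!             quit without saving
  x, dd           delete 1 character, delete 1 line
  yy, p           copy 1 line, paste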

Emacs + vi = spacemacs: first release 2014.10. ver.0.200.13.

Unix = CUI
CUI = (shell)
Emacs, vi

Part II:

Part ( ) GO


Vector (SIMD): SX-9 (NEC), 100 GFlops per CPU. * 2007.
K computer (Top500 No. 1 in 2011.06-12, No. 10 in 2017.11): 10.6 PFlops, 705,024 cores.
PrimeHPC FX10: 23.3 PFlops, 1,572,864 cores.
Sunway TaihuLight (Top500 No. 1 in 2016.06-2017.11): 93 PFlops, 10,649,600 cores.
Memory: K computer 1.26 PB, Sunway TaihuLight 1.3 PB. * PrimeHPC FX10 spec: 6 PByte.
Power: K computer 12.7 MW, Sunway TaihuLight 15.3 MW (Tianhe-2: 17.8 MW).

SX-ACE (3 clusters): 423 TFlops = 0.423 PFlops, 6144 cores.
* 2017 TOP500, No. 500: 549 TFlops.
Memory: 96 TB = 0.1 PB.
Power: 700 kW.

I
(Creative Commons Attribution-ShareAlike 3.0, by A.I.Graphic)

II
ILLIAC I, II, III (1952, 1962, SIMD 1966)
Cray-1 (1976, 80-160 MFLOPS)
Earth Simulator (SX-5, 41 TFLOPS, 2002; SX-9, 131 TFLOPS, 2009): TOP500 No. 1 in 2002.06-2004.06. "Japanese Computenik"
Blue Gene (2004): 32,768 cores; 2007: 212,992 cores
K computer (2011): 10.62 PFLOPS
Tianhe-2 (2013): 33.8 PFLOPS
Sunway TaihuLight (2016): 93 PFLOPS

III
Systems in the top 500 (2017.11):
United States (143): Cray, IBM
Japan (35): NEC
China (202)
Germany (21), France (18), United Kingdom (15) (EU: 86)


I
Input Data -> Operation -> Output Data

II
Input -> Operation -> Output
1 + 2 + 3 + ... + 100 = 5050
  for i := 1 to 100 do result += i;
or
  ( 100 + 1 ) * 100 / 2
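The two routes to 5050 on this slide, the loop and the closed-form expression, can be compared directly. The following is a minimal Python sketch; the function names sum_by_loop and sum_by_formula are illustrative and do not come from the slides.

```python
def sum_by_loop(n):
    """Add 1 + 2 + ... + n one term at a time (n additions)."""
    result = 0
    for i in range(1, n + 1):
        result += i
    return result


def sum_by_formula(n):
    """Use the closed form (n + 1) * n / 2 (a fixed number of operations)."""
    return (n + 1) * n // 2


if __name__ == "__main__":
    # Both routes give the same answer; only the amount of work differs.
    print(sum_by_loop(100), sum_by_formula(100))
```

The loop needs work proportional to n, while the formula needs the same small amount of work for any n, so choosing a better operation can matter more than running the same operation faster.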

III-1
CPU, HDD
Input -> Fast Operation -> Output

III-2
[Figure: CPU clock (MHz, logarithmic axis from 0.1 to 10000) by year, 1970 to 2015]

IV-1
SIMD (Single Instruction Multiple Data)
Input -> Vector Operation -> Output

IV-2

V-1
Input -> Parallel Operation -> Output

V-2

VI-1
Input -> Parallel Operation -> Output

VI-2

VII

VIII
NEC SX, PC, GPU, SIMD


I
a: fraction of the work that can be parallelized; n: number of processors; 1 - a: fraction that stays serial; δ: overhead of parallelization.
Speed-up ratio:
\[ \frac{1}{(1+\delta)\,\dfrac{a}{n} + (1-a)} \]
[Figure: speed-up ratio versus number of processors (0 to 100, and 0 to 1000) for a = 50%, 80%, 90%, 95%, 99% with delta = 0%]

II
[Figure: speed-up ratio versus number of processors (0 to 100, and 0 to 1000) for a = 50%, 80%, 90%, 95%, 99% with delta = 50%]
[Figure: speed-up ratio versus number of processors (0 to 100, and 0 to 1000) for a = 50%, 80%, 90%, 95%, 99% with delta = 200%]
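The curves on these slides plot a speed-up ratio against the number of processors for several values of a and delta. The sketch below assumes the model is Amdahl's law with an overhead term, speed-up = 1 / ((1 - a) + (1 + delta) * a / n), where a is the parallelizable fraction, n the number of processors and delta the relative overhead on the parallel part. That reading of the formula, and the function name speedup, are assumptions.

```python
def speedup(a, n, delta=0.0):
    """Speed-up ratio under the assumed model 1 / ((1 - a) + (1 + delta) * a / n).

    a     -- fraction of the work that can be parallelized (0..1)
    n     -- number of processors (>= 1)
    delta -- relative overhead added to the parallel part (0.5 means 50%)
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    serial_part = 1.0 - a                  # never gets faster
    parallel_part = (1.0 + delta) * a / n  # shrinks as n grows
    return 1.0 / (serial_part + parallel_part)


if __name__ == "__main__":
    # Same parameter grid as the plot legends: a in {50, 80, 90, 95, 99}%,
    # delta in {0, 50, 200}%.
    for delta in (0.0, 0.5, 2.0):
        for a in (0.5, 0.8, 0.9, 0.95, 0.99):
            print(f"a={a:.2f} delta={delta:.1f} n=100: {speedup(a, 100, delta):.1f}")
```

Under this model, with delta = 0 the ratio can never exceed 1 / (1 - a), so a = 99% caps the speed-up at 100 however many processors are added.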

III.


MPI (Message Passing Interface), OpenMP (Multi Processing), SIMD, CUDA.

I
SIMD
  for i := 1 to 10000 do a[i] := 2*i;
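The loop on this slide is a natural SIMD candidate because each a[i] depends only on i and on no other element. The sketch below only emulates that property in plain Python by processing fixed-width chunks in either order; it is not real vector code, and the names fill_chunk, fill_in_chunks and the width parameter are illustrative assumptions.

```python
def fill_chunk(start, stop):
    """Compute a[i] = 2*i for start <= i < stop; no iteration reads another's result."""
    return [2 * i for i in range(start, stop)]


def fill_in_chunks(n, width, reverse=False):
    """Fill a[1..n] in chunks of `width` elements, optionally visiting chunks in reverse."""
    a = [0] * (n + 1)  # index 0 is unused, to mirror the 1-based loop on the slide
    starts = list(range(1, n + 1, width))
    if reverse:
        starts.reverse()
    for start in starts:
        stop = min(start + width, n + 1)
        a[start:stop] = fill_chunk(start, stop)
    return a


if __name__ == "__main__":
    sequential = [0] + [2 * i for i in range(1, 10001)]
    # Chunk order does not matter because the iterations are independent.
    print(fill_in_chunks(10000, 8) == sequential)
    print(fill_in_chunks(10000, 8, reverse=True) == sequential)
```

A loop such as s += a[i] would not pass the same check as easily, because every iteration updates the same variable; that case needs a reduction rather than plain element-wise execution.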

II-1
Threads: OpenMP
- Grand Central Dispatch (MacOS X 10.6, FreeBSD)
- Intel TBB, Google Go, Rust

II-2
OpenMP, Fortran:
  program hello
  ...
  !$omp parallel
  !$omp end parallel
  ...
  end

III-1
Message passing (NEC): MPI (Message Passing Interface)

III-2
MPI

SIMD (1)
Julia lang:
  @simd for i=1:length(x)
      @inbounds s += x[i] * y[i]
  end
C/C++ on SX-ACE:
  #pragma vdir nodep
  #pragma vdir...
  /* n is the number of elements in x and y; C arrays are indexed from 0 */
  for(i=0;i<n;i++){
      s += x[i] * y[i];
  }
[Figure: computation speed (GFlops), normal versus SIMD]
SIMD (4 core CPU): 5!!
* PC (vaio, 4core), 1000, 100000.

OpenMP/MPI (1)
Parallel, Julia lang:
  nheads = @parallel (+) for i=1:200000000
      Int(rand(Bool))
  end
C/C++ on SX-ACE: compile option -Pauto
[Figure: computation speed (Gops), normal versus parallel]
Parallel (4 core CPU): 33!!
* PC (vaio, 4core).
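The Julia snippet on this slide splits a large number of coin flips across workers and combines the partial counts with (+). The same split-then-reduce pattern can be sketched in Python with the standard library. The names count_heads and parallel_count_heads, the per-worker seeds and the default of 4 workers are assumptions made for illustration. Threads are used to keep the sketch self-contained; in CPython they share one interpreter lock, so this shows the structure of the computation, not a measured speed-up.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_heads(n_flips, seed):
    """Flip a fair coin n_flips times with a private generator; return the number of heads."""
    rng = random.Random(seed)  # one generator per worker, so workers share no state
    return sum(rng.getrandbits(1) for _ in range(n_flips))


def parallel_count_heads(total_flips, n_workers=4):
    """Split the flips across workers, then reduce the partial counts with +."""
    base, extra = divmod(total_flips, n_workers)
    sizes = [base + (1 if w < extra else 0) for w in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = pool.map(count_heads, sizes, range(n_workers))
        return sum(partial_counts)  # the (+) reduction


if __name__ == "__main__":
    nheads = parallel_count_heads(200_000)
    print(nheads)
```

For CPU-bound work like this, swapping ThreadPoolExecutor for ProcessPoolExecutor from the same module is the usual way to use several cores in Python; the splitting and the reduction stay the same.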

OpenMP/MPI (2)
2017: SX-ACE, SIMD, OpenMP.

OpenMP/MPI (3)
Parallel: Black-Scholes (by Julia language)
using ParallelAccelerator (Intel Labs)
  @acc begin
  end
http://julialang.org/blog/2016/03/parallelaccelerator
(36 core CPU): 130!!

MPI (SX-ACE)
1 (1 cpu, 4 core), PC, 4, MPI

Thank You!