untitled

1 OpenMP

Parallel computer architectures
    [Figure: shared-memory multiprocessor, CPUs sharing MEM over a BUS]
    [Figure: distributed-memory system, CPU+MEM nodes connected by a Network]

Poor man's supercomputer
    Sun IPC cluster, etlwiz, Alpha cluster; 100BASE-TX switch, ATM
    Beowulf class, RWCP PC cluster
    Interconnects: Myrinet, Gigabit Ethernet, Fibre Channel, DEC Memory Channel, IBM SP2 network
    SMP clusters (UCB CLUMPS, RWC COMPaS)
    Ethernet: 10 Mbps, 100 Mbps, Gigabit Ethernet; Myrinet; SAN
    Network I/O

Parallel programming models
    Message passing: PVM, P4, TCGMSG, MPI, MPI-2
    Shared memory: pthread, Solaris thread, NT thread; DSM on MPI, PVM
    OpenMP: annotation on a thread model
    HPF: annotation, distribution hint
    Fancy parallel programming languages

2 Programming with POSIX threads

Example: sum the elements of an array into S.

    for(i=0; i<1000; i++) S += A[i];

Pthread, Solaris thread: create the threads, run thd_main in each, then join them (the thread calls are abbreviated, as on the slide).

    for(t=1; t<n_thd; t++) r = pthread_create(thd_main, t);
    thd_main(0);
    for(t=1; t<n_thd; t++) pthread_join();

PARMACS macros:

    for(t=1; t<n_thd; t++) CREATE(thd_main);
    thd_main(0);
    WAIT_FOR_END(n_thd-1);

Thread body for the sum with POSIX threads:

    int s;      /* global */
    int n_thd;  /* number of threads */
    int thd_main(int id)
    {
        int c,b,e,i,ss;
        c=1000/n_thd; b=c*id; e=b+c; ss=0;
        for(i=b; i<e; i++) ss += a[i];
        pthread_lock(); s += ss; pthread_unlock();
        return s;
    }

With OpenMP, one directive is enough:

    #pragma omp parallel for reduction(+:s)
    for(i=0; i<1000; i++) s += a[i];

OpenMP topics
    Parallel region
    Work sharing: for, sections, single
    Data scope
    Orphan directives: static extent and dynamic extent

What is OpenMP
    Directives for a base language (Fortran/C/C++), specified jointly by vendors and ISVs
    Oct 1997: Fortran ver.1.0 API
    Oct 1998: C/C++ ver.1.0 API
    (1999: F90 API?)
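The reduction version of the array sum can be tried as a complete function. This is a minimal sketch: the name sum_array and the long accumulator are assumptions made for illustration, not part of the slides. Built without OpenMP support the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a[0..n-1]. With OpenMP enabled (e.g. gcc -fopenmp) the iterations
   are divided among the threads of the team; each thread accumulates into
   its own private copy of s, and the reduction clause adds the copies
   into the shared s when the loop ends. */
long sum_array(const int *a, int n)
{
    long s = 0;
    int i;

    assert(a != NULL && n >= 0);
#pragma omp parallel for reduction(+:s)
    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

Calling sum_array on an array holding 1..1000 should give 500500 regardless of the number of threads, because integer addition is associative.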

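3 Background

Shared-memory multiprocessor systems
    SGI Cray Origin (ASCI Blue Mountain System)
    SUN Enterprise
    PC-based SMP
Vendor-specific parallelization directives
    SGI Power Fortran/C
    SUN Impact
    KAI/KAP

Target of OpenMP
    5% versus 95% (?) of users
    small-scale (about 16 processors) to medium-scale (about 64 processors)
    pthread is OS-oriented and general-purpose

OpenMP API
    Not a new language: directives/pragmas added to a base language (Fortran77, f90, C, C++)
        Fortran: !$OMP
        C: #pragma omp
    Adding pragmas one at a time allows incremental parallelization.

Execution model: fork-join
    The program starts on one thread; each parallel region forks a team and joins at its end.

        ... A ...
        #pragma omp parallel
        {
            foo();    /* ..B.. */
        }
        ... C ...
        #pragma omp parallel
        {
            ... D ...
        }
        ... E ...

    [Figure: A runs on one thread; fork; every thread of the team calls foo() (B); join; C; fork; D; join; E]

OpenMP API: directive syntax
    Fortran: a directive line begins with a sentinel (!$OMP, C$OMP, *$OMP)
        !$OMP directive_name [clause, clause, ...]
    C/C++: pragma
        #pragma omp directive_name [clause, clause, ...]
        e.g. #pragma omp parallel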

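4 Parallel Region

A parallel region is executed by a team of threads; every thread in the team runs the code of the region.

    Fortran:
        !$OMP PARALLEL
            ... parallel region ...
        !$OMP END PARALLEL
    C:
        #pragma omp parallel
        {
            ... parallel region ...
        }

Parallel region (contd.)
    Thread ID: omp_get_thread_num() returns the ID of the calling thread within the team; the master thread has ID=0.
    Number of threads: omp_set_num_threads(nthreads) or the environment variable OMP_NUM_THREADS.
    The team joins at the end of the parallel region.
    Synchronization inside the region: critical, atomic, barrier.

Example: the sum S of an array

    for(i=0; i<1000; i++) S += A[i];

OpenMP, written by hand with a parallel region:

    #pragma omp parallel
    {
        int c,b,e,i,ss;
        c=1000/omp_get_num_threads();
        b=c*omp_get_thread_num(); e=b+c; ss=0;
        for(i=b; i<e; i++) ss += a[i];
        #pragma omp atomic
        s += ss;
    }

OpenMP, with a work-sharing directive:

    #pragma omp parallel for reduction(+:s)
    for(i=0; i<1000; i++) s += a[i];

How OpenMP is used
    Parallelization: data-parallel and task-parallel
    Tuning: SPMD style, using the id from omp_get_thread_num()
    SPLASH-2 programs written with PARMACS macros
    Backend: OpenMP as the output of a parallelizing compiler, e.g. the Polaris compiler

From a thread library to OpenMP

Pthread, Solaris thread:

    for(t=1; t<n_thd; t++) r = pthread_create(thd_main, t);
    thd_main(0);
    for(t=1; t<n_thd; t++) pthread_join();

PARMACS:

    for(t=1; t<n_thd; t++) CREATE(thd_main);
    thd_main(0);
    WAIT_FOR_END(n_thd-1);

OpenMP:

    omp_set_num_threads(n_thd);
    #pragma omp parallel
    thd_main(omp_get_thread_num());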
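The hand-partitioned (SPMD) sum divides 1000 by the number of threads, which silently drops elements when the count is not divisible. The sketch below is one way to handle the remainder; the function name spmd_sum and the serial fallback definitions are assumptions for illustration, added only so the code also builds without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption: used only when OpenMP is not enabled). */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* SPMD-style sum: each thread derives its own block [b, e) from its id. */
long spmd_sum(const int *a, int n)
{
    long s = 0;

    assert(n >= 0);
#pragma omp parallel
    {
        int nthd = omp_get_num_threads();
        int id = omp_get_thread_num();
        int c = n / nthd;              /* base block size */
        int r = n % nthd;              /* the first r threads take one extra element */
        int b = id * c + (id < r ? id : r);
        int e = b + c + (id < r ? 1 : 0);
        long ss = 0;                   /* private partial sum */
        int i;

        for (i = b; i < e; i++)
            ss += a[i];
#pragma omp atomic
        s += ss;                       /* one atomic update per thread */
    }
    return s;
}
```

The blocks cover 0..n-1 exactly once for any team size, so the result matches the reduction version.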

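5 Work sharing

Work-sharing constructs divide work among the team executing a parallel region.
    for, sections, single
    Combined with parallel: parallel for, parallel sections

for
    Parallelizes a for loop (a DO loop in Fortran); the loop must have the canonical shape.

        #pragma omp for [clause]
        for(var=lb; var logical-op ub; incr-expr)
            body

    var: the loop variable, treated as private
    incr-expr: ++var, var++, --var, var--, var+=incr, var-=incr
    logical-op: <, <=, >, >=
    The body must not leave the loop with break.
    clause: scheduling and data scope attributes

Scheduling a for loop
    schedule(kind[,chunk_size])
    schedule(static,chunk_size): chunks of chunk_size iterations are assigned round-robin; chunk_size=1 is a cyclic distribution
    schedule(dynamic,chunk_size): chunks of chunk_size iterations are handed out dynamically; the default is chunk_size=1
    schedule(guided,chunk_size): chunk sizes decrease, down to chunk_size
    schedule(runtime): taken from the environment variable OMP_SCHEDULE
    Without a schedule clause the choice is implementation dependent.
    [Figure: an iteration space divided among n threads under schedule(static,n), schedule(static), schedule(dynamic,n), schedule(guided,n)]

Example: sparse matrix-vector product

    Matvec(double a[], int row_start[], int col_idx[],
           double x[], double y[], int n)
    {
        int i,j,start,end; double t;
        #pragma omp parallel for private(j,t,start,end)
        for(i=0; i<n; i++){
            start=row_start[i];
            end=row_start[i+1];
            t = 0.0;
            for(j=start; j<end; j++)
                t += a[j]*x[col_idx[j]];
            y[i]=t;
        }
    }

sections: each section is executed by one thread

    #pragma omp sections
    {
        #pragma omp section
            section1
        #pragma omp section
            section2
    }

single: executed by only one thread of the team

    #pragma omp single
        statements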
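The round-robin assignment of schedule(static,chunk_size) can be written as a small formula. This is a sketch of the mapping described above, not code from any OpenMP runtime; the name static_owner is hypothetical.

```c
#include <assert.h>

/* Thread that executes iteration i (counted from 0) under
   schedule(static, chunk) with nthreads threads:
   chunk number i / chunk is assigned to thread (i / chunk) % nthreads. */
int static_owner(int i, int chunk, int nthreads)
{
    assert(i >= 0 && chunk > 0 && nthreads > 0);
    return (i / chunk) % nthreads;
}
```

With chunk=1 and 4 threads, iterations 0,1,2,3,4,5 go to threads 0,1,2,3,0,1 (cyclic). With chunk=2, iterations 4 and 5 form chunk 2 and both go to thread 2.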

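6 Synchronization

    A work-sharing construct ends with an implicit barrier unless nowait is specified.
    Critical section: critical
    Atomic update: atomic
    Barrier: barrier
    flush

Barrier
    Needed explicitly, for example, after a work-sharing construct with nowait.

        #pragma omp barrier

Atomic
    Makes the update of a single memory location atomic.

        #pragma omp atomic
        statement

    statement is one of: x binop= expr, x++, ++x, x--, --x
    x is a scalar variable, and expr must not refer to x.

Critical section

        #pragma omp critical [(name)]
        statements

    Only one thread at a time executes a critical section; critical sections with the same name exclude each other.
    Conditional wait is not provided.

master, ordered
    master: executed only by the master thread

        #pragma omp master
        block statements

    ordered: executed in the order of the loop iterations

        #pragma omp ordered
        block statements

    It must appear in the dynamic extent of a for construct that has the ordered clause.

Data scope attributes
    Specified as clauses on parallel and work-sharing constructs.
    shared(var_list): shared by the threads
    private(var_list): private to each thread
    firstprivate(var_list): private, initialized with the value held before the construct
    lastprivate(var_list): private, and the value of the last iteration is copied back
    reduction(op:var_list): private during the construct, combined with op at the end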
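The difference between firstprivate and lastprivate is easiest to see on a small loop. The function below is an illustrative sketch; scope_demo and its parameters are assumed names, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Fill out[i] = i + offset and return the value written by the
   sequentially last iteration. */
int scope_demo(int n, int offset, int *out)
{
    int i, last = -1;

    assert(n > 0 && out != NULL);
#pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (i = 0; i < n; i++) {
        out[i] = i + offset;  /* each private offset starts as the caller's value */
        last = out[i];        /* the copy from iteration n-1 is written back */
    }
    return last;
}
```

With a plain private(offset) the private copies would be uninitialized, and with private(last) the value after the loop would be undefined; the two clauses exist to carry values into and out of the construct.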

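7 Threadprivate
    Makes file-scope variables private to each thread.

        #pragma omp threadprivate(var_list)

    The values are persistent across parallel regions.
    copyin(var_list) on parallel copies the master thread's values into each thread's copy.

Data scope clauses allowed on each construct
    parallel: private, firstprivate, shared, reduction, copyin, default(shared|none)
        with default(none), every variable must be given a scope explicitly
    for: private, firstprivate, lastprivate, reduction
    sections: private, firstprivate, lastprivate, reduction
    single: private, firstprivate

Orphan directives and extent
    Static extent: the lexical body of the parallel region
    Dynamic extent: the static extent plus the routines called from it
    Orphan directive: a directive that appears in the dynamic extent but outside the static extent
    Data scope in the dynamic extent: auto variables of called routines are private; other variables are shared.

Example without orphan directives: each loop is its own parallel region.

    main()
    {
        for(it=0; it<niter; it++){
            resid = cgsol();
            printf(..., resid);
        }
    }
    cgsol()
    {
        #pragma omp parallel for
        for(i=0; i<cols; i++) p[i]=r[i]=x[i];
        for(it=0; it<nitcg; it++){
            matvec();
            #pragma omp parallel for
            for(i=0; i<cols; i++) z[i] += alpha*p[i];
        }
    }

Example with orphan directives: one parallel region in main, work-sharing directives in cgsol.

    main()
    {
        #pragma omp parallel
        for(it=0; it<niter; it++){
            resid = cgsol();
            #pragma omp master
            printf(..., resid);
        }
    }
    cgsol()
    {
        #pragma omp for
        for(i=0; i<cols; i++) p[i]=r[i]=x[i];
        for(it=0; it<nitcg; it++){
            matvec();
            #pragma omp for
            for(i=0; i<cols; i++) z[i] += alpha*p[i];
        }
    }

Directive binding
    for, sections, single, master and barrier directives bind to the enclosing parallel region in the dynamic extent.
    Work-sharing constructs must not be nested in one another, nor inside master or critical.

Nested parallelism
    A parallel directive may be nested inside a parallel region.
    If nested parallelism is enabled, the nested parallel creates a new team.
    If it is disabled, the nested region is executed by a single thread.

Nested parallelism in the specifications
    In the FAQ, "What about nested parallelism?":
        "Nested parallelism is permitted by the OpenMP specification. Supporting nested parallelism effectively can be difficult, and we expect most vendors will start out by executing nested parallel constructs on a single thread."
    In "OpenMP Fortran Interpretations Version 1.0":
        "Note that an OpenMP-compliant implementation is permitted to serialize a nested parallel region."
    An implementation may serialize nested parallelism; sections may be serialized as well.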
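The orphan-directive pattern from the cgsol example can be reduced to a compilable sketch. The function names axpy and update are assumptions for illustration; the point is that the "omp for" lives in a routine called from the parallel region, not inside its lexical body.

```c
#include <assert.h>

/* Orphan directive: this "omp for" is outside the static extent of any
   parallel construct. Called from a parallel region it binds to that
   region and splits the loop among the team; called from serial code
   it simply runs the whole loop on one thread. */
void axpy(int n, double alpha, const double *p, double *z)
{
    int i;
#pragma omp for
    for (i = 0; i < n; i++)
        z[i] += alpha * p[i];
}

/* One parallel region around the whole iteration loop. */
void update(int n, int niter, double alpha, const double *p, double *z)
{
    assert(n >= 0 && niter >= 0);
#pragma omp parallel
    {
        int it;   /* declared inside the region, so private to each thread */
        for (it = 0; it < niter; it++)
            axpy(n, alpha, p, z);   /* implicit barrier ends each omp for */
    }
}
```

Every thread runs the outer loop and meets the same sequence of work-sharing constructs, which is what the binding rule requires. The team is forked once instead of once per iteration.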

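8 Memory consistency in OpenMP
    OpenMP assumes weak consistency.
    Inside a parallel region, consistency of shared variables is not guaranteed between synchronization points; use volatile or an explicit flush.
    A work-sharing construct with nowait does not imply a flush.
    flush: makes memory consistent for the listed variables

        #pragma omp flush [(var_list)]

Runtime library
    omp_get_num_threads, omp_set_num_threads: number of threads in the team
    omp_get_thread_num: thread id
    omp_get_max_threads, omp_get_num_procs
    omp_set_dynamic, omp_get_dynamic: dynamic adjustment of the number of threads
    omp_set_nested, omp_get_nested: nesting of parallel regions
    lock routines: omp_lock_t, omp_nest_lock_t

Environment variables
    OMP_NUM_THREADS: number of threads used for a parallel region
    OMP_SCHEDULE: schedule used by schedule(runtime)
    OMP_DYNAMIC: dynamic adjustment of the number of threads (e.g. SGI Origin)
    OMP_NESTED: nested parallelism, i.e. nesting of parallel regions

Notes on OpenMP programming
    Incremental parallelization
    Work sharing, orphan directives
    Data mapping, iteration mapping, locality
    reduction
    pragma

Summary
    OpenMP: a directive-based API (Fortran, C/C++), fork-join, incremental
    Fortran interpretation published; second Fortran version at SC'99
    Locality
    Comparison with MPI, HPF

Commercial products
    KAI Guide compiler (Fortran, C, C++): Digital UNIX/NT Alpha, HP-UX, IBM AIX, Intel Solaris/NT, SGI, SUN Solaris
    PGI
    SGI MIPSpro (Fortran, C)
    Cray UNICOS
    SUN
    Compaq/Digital Fortran
    IBM
    Hitachi SR8000 (?)
    NEC SX-4 (?)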
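The lock routines listed in the runtime library follow an init / set / unset / destroy life cycle. The sketch below shows that cycle on a shared counter; count_equal is a hypothetical name, and the serial stand-ins in the #else branch are assumptions added only so the code builds without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption: used only when OpenMP is not enabled). */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l) { *l = 0; }
static void omp_set_lock(omp_lock_t *l) { *l = 1; }
static void omp_unset_lock(omp_lock_t *l) { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Count the elements of a[0..n-1] equal to key. The shared counter is
   protected by an OpenMP lock. */
int count_equal(const int *a, int n, int key)
{
    int i, count = 0;
    omp_lock_t lock;

    assert(n >= 0);
    omp_init_lock(&lock);
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        if (a[i] == key) {
            omp_set_lock(&lock);     /* blocks until the lock is free */
            count++;
            omp_unset_lock(&lock);
        }
    }
    omp_destroy_lock(&lock);
    return count;
}
```

For a simple counter, reduction(+:count) or atomic would be cheaper; locks are mainly useful when the protected region cannot be expressed as a single structured block.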

9 Performance tuning tools
    KAI Assure / GuideView
    TAU (OGI)
    Polaris
    Omni tlogview

Omni tlogview
    Records events for OpenMP directives (tlog file), e.g. barrier, and displays them.
    [Figure: Omni tlogview]

KAI Guide Tools
    Performance Viewer

KAI Assure Tools
    Program verifier: checks race conditions, etc.

Research related to OpenMP
    OpenMP-NOW (Rice): OpenMP on a workstation cluster using TreadMarks
    OpenMP+MPI (ASCI)
    OpenMP+HPF (Vienna Univ.)
    SPEC HPG: OpenMP (and MPI) benchmarks
    OpenMP (RWCP)

10 Omni OpenMP compiler
    For SMP (Solaris Thread or POSIX Threads)
    Solaris 5.6 (SPARC, x86), Linux (x86 SMP)
    Fortran77 and C
    C-front: C parser; exc-tools-java: Java toolkit
    Available for download from RWC

RWC Omni OpenMP Compiler
    A translator from an OpenMP program to a multithreaded C program with runtime library calls.
    Omni Exc toolkit: a toolkit for compiler research
        C-front: OpenMP C parser that generates Xobject code
        Xobject code: AST (Abstract Syntax Tree) and data type information
        Exc Java toolkit: Java class libraries to analyze and transform Xobject code
        OpenMP transformation and optimization are written in Java using the Exc Java toolkit.
    Omni OpenMP compiler for SMP
        Solaris Thread or POSIX Threads (Stack/Threads at U. of Tokyo)
        Solaris 5.6 (SPARC, x86), Linux (x86 SMP), (O2K pthread)
        C and Fortran77. F90 is under development.

Overview of Omni OpenMP Compiler
    [Figure: C+OpenMP, F77+OpenMP and C++ OpenMP sources pass through their frontends (C-Front, F77 frontend, C++ frontend) to Xobject code; the Exc Java toolkit in the Omni Exc Toolkit produces multithreaded C code with runtime library calls, which is compiled by the native cc and linked with the runtime library into an executable]

OpenMP translation example (1): source

    S1
    #pragma omp parallel for
    for(i=0; i<n; i++) x[i]=...;
    S2

OpenMP translation example (2): generated code

    void ompc_func_1(void **ompc_args)
    {
        auto int *_pp_n;
        _pp_n = (int *)*(ompc_args+0);
        auto int _p_i, _p_i_0, _p_i_1, _p_i_2;
        _p_i_0 = ...;
        _ompc_static_sched(&_p_i_0, ...);
        for(_p_i = ...) ...
    }

    S1
    auto void *ompc_argv[1];
    *(ompc_argv+0) = (void *)&n;
    ompc_do_parallel(ompc_func_1, ompc_argv);
    S2

Performance of the RWCP Omni OpenMP Compiler/C
    NPB1 CG, BT, SP (Class A) in C
    Orphan directives; overhead
    SUN S1000 (8 CPU)
    [Figure: Speedup]
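The translation in example (2) moves the body of the parallel loop into a separate function and passes shared variables by address. The sketch below imitates that idea in plain C. Every name in it (static_bounds, outlined_body, do_parallel, fill, NTHREADS) is hypothetical, and do_parallel calls the body once per thread id in sequence, where a real runtime would run the calls on separate threads.

```c
#include <assert.h>

#define NTHREADS 4   /* hypothetical team size for this sketch */

/* Hypothetical stand-in for a static scheduler: block [lo, hi) of thread id. */
static void static_bounds(int n, int id, int nthd, int *lo, int *hi)
{
    int c = (n + nthd - 1) / nthd;        /* ceiling of n / nthd */
    *lo = id * c < n ? id * c : n;
    *hi = *lo + c < n ? *lo + c : n;
}

/* Outlined body of a loop that would be written as
       #pragma omp parallel for
       for (i = 0; i < n; i++) x[i] = 2 * i;                         */
static void outlined_body(void **args, int id)
{
    int *pn = (int *)args[0];             /* shared n, passed by address */
    int *x = (int *)args[1];              /* shared array */
    int lo, hi, i;

    static_bounds(*pn, id, NTHREADS, &lo, &hi);
    for (i = lo; i < hi; i++)
        x[i] = 2 * i;
}

/* Hypothetical stand-in for the runtime entry point (runs serially here). */
static void do_parallel(void (*f)(void **, int), void **args)
{
    int id;
    for (id = 0; id < NTHREADS; id++)
        f(args, id);
}

/* What the translated caller could look like. */
void fill(int *x, int n)
{
    void *argv[2];

    assert(n >= 0);
    argv[0] = &n;
    argv[1] = x;
    do_parallel(outlined_body, argv);
}
```

Passing an array of addresses keeps the outlined function's signature independent of how many shared variables the region uses, which is presumably why the generated code above takes a single void ** argument.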

11 Performance (RWC Omni OpenMP)
    [Figure: NPB1 CG (Class A) in OpenMP and Multithread (Solaris thread)]

SMP clusters
    SMP, PC clusters and MPP lead to clusters of SMP nodes.
    COMPaS: a PC-based SMP Cluster
    [Figure: SMP nodes, each with CPUs and MEM, connected by a network]
    Middle-scale servers
    ASCI Blue Mountain (O2K), ASCI Blue Pacific (SP2)
    Vector supercomputers: Hitachi SR8000, SX-5
    PC-based SMP clusters: COMPaS, Clumps

Programming an SMP cluster
    Hardware shared memory within an SMP node + DSM between nodes
    MPI everywhere, with MPI over shared memory (shmem) inside a node?
    Hybrid MPI+OpenMP: MPI between nodes, OpenMP within a node

Example: matrix multiplication with Cyclic Shift (abbreviated, as on the slide)

MPI:

    for(iter=0; iter<n_pe; iter++){
        for(i=0; i<blkn; i++){
            t=0;
            for(j=0; j<blkn; j++)
                t += a[k][i]*b[j][k];
            C[j][i]=t;
        }
        r = mpi_sendrecv(..., b, bb, ...);
        update matrix, B <- BB
    }

OpenMP+MPI:

    for(iter=0; iter<n_pe; iter++){
        #pragma omp parallel for private(j,k,t) firstprivate(blkn)
        for(i=0; i<blkn; i++){
            t=0;
            for(j=0; j<blkn; j++)
                t += a[k][i]*b[j][k];
            C[j][i]=t;
        }
        r = mpi_sendrecv(..., b, bb, ...);
        update matrix, B <- BB
    }
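In the hybrid version only the local computation of each step is threaded; the exchange of B stays in MPI and is not shown here. As a sketch of that local step, the function below accumulates one block product with an OpenMP loop. The name block_matmul_acc and the row-major layout are assumptions, and the indexing is the textbook form, not the abbreviated indexing of the slide.

```c
#include <assert.h>

/* c += a * b for n x n blocks stored row-major.
   Rows of c are divided among the threads; j, k and t are private. */
void block_matmul_acc(int n, const double *a, const double *b, double *c)
{
    int i, j, k;

    assert(n >= 0);
#pragma omp parallel for private(j, k)
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            double t = 0.0;           /* declared in the loop body: private */
            for (k = 0; k < n; k++)
                t += a[i * n + k] * b[k * n + j];
            c[i * n + j] += t;
        }
    }
}
```

In a cyclic-shift multiplication each process would call this once per step, then pass its current block of B to a neighbor (for instance with MPI_Sendrecv) outside the parallel loop, so that only one thread per process touches MPI.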

12 MPI and OpenMP
    OpenMP within an SMP node, MPI between nodes: MPI+OpenMP
    Example: Cyclic Shift in OpenMP+MPI
    Calling MPI from OpenMP: use single, master or critical unless the MPI library is thread-safe.
    MPI and OpenMP threadprivate

Summary
    OpenMP: a directive-based API (Fortran, C/C++), fork-join, incremental
    SMP clusters: MPI+OpenMP
    SPEC HPG: OpenMP (and MPI) benchmarks
    OpenMP (RWCP)

OpenMP for SMP Cluster
    OpenMP on SDSM
        OpenMP (SIF) on TreadMarks (at Rice Univ.)
        Omni OpenMP Compiler for SCASH (RWCP and TITECH)
        This approach cannot exploit application-specific data access patterns.
    Compiler-directed SDSM
        The compiler generates memory coherence check code to keep memory consistent (e.g. Shasta SDSM).
        The compiler analyzes the memory access pattern to optimize communication between nodes.
    OpenMP's structured description of parallelism enables higher-level optimization.
        The data-parallel computation in work-sharing directives can be compiled into efficient, explicit communication by compiler analysis.
    OpenMP on SMP, OpenMP+MPI, OpenMP on SMP clusters, HPF

untitled

untitled OpenMP (Message Passing) (shared memory) DSMon MPI,PVM pthread, solaris thread, NT thread OpenMP annotation thread HPF annotation, distribution hint Fancy parallel programming languages for(i=0;i

More information

untitled

untitled OpenMP MPI OpenMPI 1 2 http://www.es.jamstec.go.jp/ 3 4 http://www.top500.org/ CPU 3GHz, 10GHz 90nm 65nm, 45nm VLIW L3 Intel Hyperthreading CPU Pentium 5 6 7 8 Cell 23400 90nm 221mm2 SPU 1.52Moore s Law

More information

untitled

untitled CPU CPU PC 1 3GHz, 10GHz 0.13m VLIW L3 Intel Hyperthreading Intel IA32: Xeon, P4 PC Intel IA64: Itanium2 64 Itanium2 (Deerfield) AMD x86-64: Opteron x8664 x86 Sun SPARC,IBM Power, Alpha, MIPS, PCPDA P

More information

02_C-C++_osx.indd

02_C-C++_osx.indd C/C++ OpenMP* / 2 C/C++ OpenMP* OpenMP* 9.0 1... 2 2... 3 3OpenMP*... 5 3.1... 5 3.2 OpenMP*... 6 3.3 OpenMP*... 8 4OpenMP*... 9 4.1... 9 4.2 OpenMP*... 9 4.3 OpenMP*... 10 4.4... 10 5OpenMP*... 11 5.1

More information

untitled

untitled OpenMP 1 OpenMP MPI Open Advanced Topics SMP Hybrid Programming OpenMP 3.0 (task) 2 CPU 3 3GHz, 10GHz 65nm 45nm, 32nm(20?) VLIW L3 Intel Hyperthreading CPU 4 Pentium CPU 5 (Message Passing) (shared memory)

More information

2. OpenMP OpenMP OpenMP OpenMP #pragma#pragma omp #pragma omp parallel #pragma omp single #pragma omp master #pragma omp for #pragma omp critica

2. OpenMP OpenMP OpenMP OpenMP #pragma#pragma omp #pragma omp parallel #pragma omp single #pragma omp master #pragma omp for #pragma omp critica C OpenMP 1. OpenMP OpenMP Architecture Review BoardARB OpenMP OpenMP OpenMP OpenMP OpenMP Version 2.0 Version 2.0 OpenMP Fortran C/C++ C C++ 1997 10 OpenMP Fortran API 1.0 1998 10 OpenMP C/C++ API 1.0

More information

untitled

untitled OpenMP 1 OpenMP MPI Open Advanced Topics SMP Hybrid Programming OpenMP 3.0 2 CPU 3GHz, 10GHz 65nm 45nm, 32nm VLIW L3 Intel Hyperthreading CPU 3 4 Pentium CPU CPU CPU CPU CPU CPU CPU CPU BUS CPU MEM CPU

More information

(Microsoft PowerPoint \215u\213`4\201i\221\272\210\344\201j.pptx)

(Microsoft PowerPoint \215u\213`4\201i\221\272\210\344\201j.pptx) AICS 村井均 RIKEN AICS HPC Summer School 2012 8/7/2012 1 背景 OpenMP とは OpenMP の基本 OpenMP プログラミングにおける注意点 やや高度な話題 2 共有メモリマルチプロセッサシステムの普及 共有メモリマルチプロセッサシステムのための並列化指示文を共通化する必要性 各社で仕様が異なり 移植性がない そして いまやマルチコア プロセッサが主流となり

More information

AICS 村井均 RIKEN AICS HPC Summer School /6/2013 1

AICS 村井均 RIKEN AICS HPC Summer School /6/2013 1 AICS 村井均 RIKEN AICS HPC Summer School 2013 8/6/2013 1 背景 OpenMP とは OpenMP の基本 OpenMP プログラミングにおける注意点 やや高度な話題 2 共有メモリマルチプロセッサシステムの普及 共有メモリマルチプロセッサシステムのための並列化指示文を共通化する必要性 各社で仕様が異なり 移植性がない そして いまやマルチコア プロセッサが主流となり

More information

Microsoft PowerPoint - HPCseminar2013-msato.pptx

Microsoft PowerPoint - HPCseminar2013-msato.pptx OpenMP 並列プログラミング入門 筑波大学計算科学研究センター担当佐藤 1 もくじ 背景 並列プログラミング超入門 OpenMP Openプログラミングの概要 Advanced Topics SMPクラスタ Hybrid Programming OpenMP 3.0 (task) OpenMP 4.0 まとめ 2 計算の高速化とは コンピュータの高速化 デバイス 計算機アーキテクチャ パイプライン

More information

卒業論文

卒業論文 PC OpenMP SCore PC OpenMP PC PC PC Myrinet PC PC 1 OpenMP 2 1 3 3 PC 8 OpenMP 11 15 15 16 16 18 19 19 19 20 20 21 21 23 26 29 30 31 32 33 4 5 6 7 SCore 9 PC 10 OpenMP 14 16 17 10 17 11 19 12 19 13 20 1421

More information

01_OpenMP_osx.indd

01_OpenMP_osx.indd OpenMP* / 1 1... 2 2... 3 3... 5 4... 7 5... 9 5.1... 9 5.2 OpenMP* API... 13 6... 17 7... 19 / 4 1 2 C/C++ OpenMP* 3 Fortran OpenMP* 4 PC 1 1 9.0 Linux* Windows* Xeon Itanium OS 1 2 2 WEB OS OS OS 1 OS

More information

OpenMP (1) 1, 12 1 UNIX (FUJITSU GP7000F model 900), 13 1 (COMPAQ GS320) FUJITSU VPP5000/64 1 (a) (b) 1: ( 1(a))

OpenMP (1) 1, 12 1 UNIX (FUJITSU GP7000F model 900), 13 1 (COMPAQ GS320) FUJITSU VPP5000/64 1 (a) (b) 1: ( 1(a)) OpenMP (1) 1, 12 1 UNIX (FUJITSU GP7000F model 900), 13 1 (COMPAQ GS320) FUJITSU VPP5000/64 1 (a) (b) 1: ( 1(a)) E-mail: {nanri,amano}@cc.kyushu-u.ac.jp 1 ( ) 1. VPP Fortran[6] HPF[3] VPP Fortran 2. MPI[5]

More information

GNU開発ツール

GNU開発ツール 並列プログラミング環境 プログラミング環境特論 2008 年 1 月 24 日 建部修見 分散メモリ型計算機 CPU CPU CPU とメモリという一つの計算機システムが ネットワークで結合されているシステム MEM CPU Network MEM CPU それぞれの計算機で実行されているプログラムはネットワークを通じて データ ( メッセージ ) を交換し 動作する MEM MEM 超並列 (MPP:Massively

More information

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë 2012 5 24 scalar Open MP Hello World Do (omp do) (omp workshare) (shared, private) π (reduction) PU PU PU 2 16 OpenMP FORTRAN/C/C++ MPI OpenMP 1997 FORTRAN Ver. 1.0 API 1998 C/C++ Ver. 1.0 API 2000 FORTRAN

More information

develop

develop SCore SCore 02/03/20 2 1 HA (High Availability) HPC (High Performance Computing) 02/03/20 3 HA (High Availability) Mail/Web/News/File Server HPC (High Performance Computing) Job Dispatching( ) Parallel

More information

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë 2011 5 26 scalar Open MP Hello World Do (omp do) (omp workshare) (shared, private) π (reduction) scalar magny-cours, 48 scalar scalar 1 % scp. ssh / authorized keys 133. 30. 112. 246 2 48 % ssh 133.30.112.246

More information

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£²¡Ë

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£²¡Ë 2013 5 30 (schedule) (omp sections) (omp single, omp master) (barrier, critical, atomic) program pi i m p l i c i t none integer, parameter : : SP = kind ( 1. 0 ) integer, parameter : : DP = selected real

More information

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N GPU 1 1 2 1, 3 2, 3 (Graphics Unit: GPU) GPU GPU GPU Evaluation of GPU Computing Based on An Automatic Program Generation Technology Makoto Sugawara, 1 Katsuto Sato, 1 Kazuhiko Komatsu, 2 Hiroyuki Takizawa

More information

Microsoft Word - openmp-txt.doc

Microsoft Word - openmp-txt.doc ( 付録 A) OpenMP チュートリアル OepnMP は 共有メモリマルチプロセッサ上のマルチスレッドプログラミングのための API です 本稿では OpenMP の簡単な解説とともにプログラム例をつかって説明します 詳しくは OpenMP の規約を決めている OpenMP ARB の http://www.openmp.org/ にある仕様書を参照してください 日本語訳は http://www.hpcc.jp/omni/spec.ja/

More information

untitled

untitled OS 2007/4/27 1 Uni-processor system revisited Memory disk controller frame buffer network interface various devices bus 2 1 Uni-processor system today Intel i850 chipset block diagram Source: intel web

More information

OpenMP 3.0 C/C++ 構文の概要

OpenMP 3.0 C/C++ 構文の概要 OpenMP 3.0 C/C++ 構文の概要 OpenMP API 仕様については www.openmp.org でダウンロードしてください OpenMP 実行宣言子は 後続の構造化ブロックや OpenMP 構文に適用されます 構造化ブロック () とは 単文または先頭に入口が 1 つ 末尾に出口が 1 つの複合文です parallel 構文はスレッドのチームを形成し 並列実行を開始します #pragma

More information

XcalableMP入門

XcalableMP入門 XcalableMP 1 HPC-Phys@, 2018 8 22 XcalableMP XMP XMP Lattice QCD!2 XMP MPI MPI!3 XMP 1/2 PCXMP MPI Fortran CCoarray C++ MPIMPI XMP OpenMP http://xcalablemp.org!4 XMP 2/2 SPMD (Single Program Multiple Data)

More information

1重谷.PDF

1重谷.PDF RSCC RSCC RSCC BMT 1 6 3 3000 3000 200310 1994 19942 VPP500/32PE 19992 VPP700E/128PE 160PE 20043 2 2 PC Linux 2048 CPU Intel Xeon 3.06GHzDual) 12.5 TFLOPS SX-7 32CPU/256GB 282.5 GFLOPS Linux 3 PC 1999

More information

スパコンに通じる並列プログラミングの基礎

スパコンに通じる並列プログラミングの基礎 2018.06.04 2018.06.04 1 / 62 2018.06.04 2 / 62 Windows, Mac Unix 0444-J 2018.06.04 3 / 62 Part I Unix GUI CUI: Unix, Windows, Mac OS Part II 2018.06.04 4 / 62 0444-J ( : ) 6 4 ( ) 6 5 * 6 19 SX-ACE * 6

More information

untitled

untitled taisuke@cs.tsukuba.ac.jp http://www.hpcs.is.tsukuba.ac.jp/~taisuke/ CP-PACS HPC PC post CP-PACS CP-PACS II 1990 HPC RWCP, HPC かつての世界最高速計算機も 1996年11月のTOP500 第一位 ピーク性能 614 GFLOPS Linpack性能 368 GFLOPS (地球シミュレータの前

More information

スパコンに通じる並列プログラミングの基礎

スパコンに通じる並列プログラミングの基礎 2018.09.10 furihata@cmc.osaka-u.ac.jp ( ) 2018.09.10 1 / 59 furihata@cmc.osaka-u.ac.jp ( ) 2018.09.10 2 / 59 Windows, Mac Unix 0444-J furihata@cmc.osaka-u.ac.jp ( ) 2018.09.10 3 / 59 Part I Unix GUI CUI:

More information

スパコンに通じる並列プログラミングの基礎

スパコンに通じる並列プログラミングの基礎 2016.06.06 2016.06.06 1 / 60 2016.06.06 2 / 60 Windows, Mac Unix 0444-J 2016.06.06 3 / 60 Part I Unix GUI CUI: Unix, Windows, Mac OS Part II 0444-J 2016.06.06 4 / 60 ( : ) 6 6 ( ) 6 10 6 16 SX-ACE 6 17

More information

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation 1 1 1 1 SPEC CPU 2000 EQUAKE 1.6 50 500 A Parallelizing Compiler Cooperative Multicore Architecture Simulator with Changeover Mechanism of Simulation Modes GAKUHO TAGUCHI 1 YOUICHI ABE 1 KEIJI KIMURA 1

More information

Second-semi.PDF

Second-semi.PDF PC 2000 2 18 2 HPC Agenda PC Linux OS UNIX OS Linux Linux OS HPC 1 1CPU CPU Beowulf PC (PC) PC CPU(Pentium ) Beowulf: NASA Tomas Sterling Donald Becker 2 (PC ) Beowulf PC!! Linux Cluster (1) Level 1:

More information

I I / 47

I I / 47 1 2013.07.18 1 I 2013 3 I 2013.07.18 1 / 47 A Flat MPI B 1 2 C: 2 I 2013.07.18 2 / 47 I 2013.07.18 3 / 47 #PJM -L "rscgrp=small" π-computer small: 12 large: 84 school: 24 84 16 = 1344 small school small

More information

The 3 key challenges in programming for MC

The 3 key challenges in programming for MC コンパイラーによる並列化機能 ソフトウェア & ソリューションズ統括部 ソフトウェア製品部 Rev 12/26/2006 コースの内容 並列計算 なぜ使用するのか? OpenMP* 入門 宣言子と使用方法 演習 : Hello world と円周率の計算 並列プログラミング : ヒントとテクニック コード開発で避けるべきこと 2 並列計算なぜ並列処理を使用するのか? 計算をより短い時間で処理 一定の所要時間でより大きな計算を処理

More information

Microsoft PowerPoint - OpenMP入門.pptx

Microsoft PowerPoint - OpenMP入門.pptx OpenMP 入門 須田礼仁 2009/10/30 初版 OpenMP 共有メモリ並列処理の標準化 API http://openmp.org/ 最新版は 30 3.0 バージョンによる違いはあまり大きくない サポートしているバージョンはともかく csp で動きます gcc も対応しています やっぱり SPMD Single Program Multiple Data プログラム #pragma omp

More information

OpenMPプログラミング

OpenMPプログラミング OpenMP 基礎 岩下武史 ( 学術情報メディアセンター ) 1 2013/9/13 並列処理とは 逐次処理 CPU1 並列処理 CPU1 CPU2 CPU3 CPU4 処理 1 処理 1 処理 2 処理 3 処理 4 処理 2 処理 3 処理 4 時間 2 2 種類の並列処理方法 プロセス並列 スレッド並列 並列プログラム 並列プログラム プロセス プロセス 0 プロセス 1 プロセス間通信 スレッド

More information

橡3_2石川.PDF

橡3_2石川.PDF PC RWC 01/10/31 2 1 SCore 1,024 PC SCore III PC 01/10/31 3 SCore SCore Aug. 1995 Feb. 1996 Oct. 1996 1997-1998 Oct. 1999 Oct. 2000 April. 2001 01/10/31 4 2 SCore University of Bonn, Germany University

More information

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2 FFT 1 Fourier fast Fourier transform FFT FFT FFT 1 FFT FFT 2 Fourier 2.1 Fourier FFT Fourier discrete Fourier transform DFT DFT n 1 y k = j=0 x j ω jk n, 0 k n 1 (1) x j y k ω n = e 2πi/n i = 1 (1) n DFT

More information

Copyright 2004 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. U.S. Government Rights - Commer

Copyright 2004 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. U.S. Government Rights - Commer OpenMP API ユーザーズガイド Sun TM Studio 8 Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. 650-960-1300 Part No. 817-5813-10 2004 年 3 月, Revision A Copyright 2004 Sun Microsystems, Inc.,

More information

040312研究会HPC2500.ppt

040312研究会HPC2500.ppt 2004312 e-mail : m-aoki@jp.fujitsu.com 1 2 PRIMEPOWER VX/VPP300 VPP700 GP7000 AP3000 VPP5000 PRIMEPOWER 2000 PRIMEPOWER HPC2500 1998 1999 2000 2001 2002 2003 3 VPP5000 PRIMEPOWER ( 1 VU 9.6 GF 16GB 1 VU

More information

fiš„v8.dvi

fiš„v8.dvi (2001) 49 2 333 343 Java Jasp 1 2 3 4 2001 4 13 2001 9 17 Java Jasp (JAva based Statistical Processor) Jasp Jasp. Java. 1. Jasp CPU 1 106 8569 4 6 7; fuji@ism.ac.jp 2 106 8569 4 6 7; nakanoj@ism.ac.jp

More information

XMPによる並列化実装2

XMPによる並列化実装2 2 3 C Fortran Exercise 1 Exercise 2 Serial init.c init.f90 XMP xmp_init.c xmp_init.f90 Serial laplace.c laplace.f90 XMP xmp_laplace.c xmp_laplace.f90 #include int a[10]; program init integer

More information

A B 1: Ex. MPICH-G2 C.f. NXProxy [Tanaka] 2:

A B 1: Ex. MPICH-G2 C.f. NXProxy [Tanaka] 2: Java Jojo ( ) ( ) A B 1: Ex. MPICH-G2 C.f. NXProxy [Tanaka] 2: Java Jojo Jojo (1) :Globus GRAM ssh rsh GRAM ssh GRAM A rsh B Jojo (2) ( ) Jojo Java VM JavaRMI (Sun) Horb(ETL) ( ) JPVM,mpiJava etc. Send,

More information

1 2 4 5 9 10 12 3 6 11 13 14 0 8 7 15 Iteration 0 Iteration 1 1 Iteration 2 Iteration 3 N N N! N 1 MOPT(Merge Optimization) 3) MOPT 8192 2 16384 5 MOP

1 2 4 5 9 10 12 3 6 11 13 14 0 8 7 15 Iteration 0 Iteration 1 1 Iteration 2 Iteration 3 N N N! N 1 MOPT(Merge Optimization) 3) MOPT 8192 2 16384 5 MOP 10000 SFMOPT / / MOPT(Merge OPTimization) MOPT FMOPT(Fast MOPT) FMOPT SFMOPT(Subgrouping FMOPT) SFMOPT 2 8192 31 The Proposal and Evaluation of SFMOPT, a Task Mapping Method for 10000 Tasks Haruka Asano

More information

03_Fortran_osx.indd

03_Fortran_osx.indd Fortran OpenMP* Fortran OpenMP* OpenMP* 9.0 1...2 2... 3 3 OpenMP*... 4 3.1... 4 3.2 OpenMP*... 5 3.3 OpenMP*... 8 4 OpenMP*... 9 4.1... 9 4.2... 10 4.3 OpenMP*... 10 4.4 OpenMP*... 11 4.5... 12 5 OpenMP*...

More information

smpp_resume.dvi

smpp_resume.dvi 6 mmiki@mail.doshisha.ac.jp Parallel Processing Parallel Pseudo-parallel Concurrent 1) 1/60 1) 1997 5 11 IBM Deep Blue Deep Blue 2) PC 2000 167 Rank Manufacturer Computer Rmax Installation Site Country

More information

NUMAの構成

NUMAの構成 共有メモリを使ったデータ交換と同期 慶應義塾大学理工学部 天野英晴 hunga@am.ics.keio.ac.jp 同期の必要性 あるプロセッサが共有メモリに書いても 別のプロセッサにはそのことが分からない 同時に同じ共有変数に書き込みすると 結果がどうなるか分からない そもそも共有メモリって結構危険な代物 多くのプロセッサが並列に動くには何かの制御機構が要る 不可分命令 同期用メモリ バリア同期機構

More information

Microsoft PowerPoint - sales2.ppt

Microsoft PowerPoint - sales2.ppt 並列化の基礎 ( 言葉の意味 ) 並列実行には 複数のタスク実行主体が必要 共有メモリ型システム (SMP) での並列 プロセスを使用した並列化 スレッドとは? スレッドを使用した並列化 分散メモリ型システムでの並列 メッセージパッシングによる並列化 並列アーキテクチャ関連の言葉を押さえよう 21 プロセスを使用した並列処理 並列処理を行うためには複数のプロセスの生成必要プロセスとは プログラム実行のための能動実態メモリ空間親プロセス子プロセス

More information

untitled

untitled IBM i IBM GUI 2 JAVA JAVA JAVA JAVA-COBOL JAVA JDBC CUI CUI COBOL DB2 3 1 3270 5250 HTML IBM HATS WebFacing 4 2 IBM CS Bridge XML Bridge 5 Eclipse RSE RPG 6 7 WEB/JAVA RPG WEB 8 EBCDIC EBCDIC PC ASCII

More information

はじめに

はじめに IT 1 NPO (IPEC) 55.7 29.5 Web TOEIC Nice to meet you. How are you doing? 1 type (2002 5 )66 15 1 IT Java (IZUMA, Tsuyuki) James Robinson James James James Oh, YOU are Tsuyuki! Finally, huh? What's going

More information

C

C C 1 2 1.1........................... 2 1.2........................ 2 1.3 make................................................ 3 1.4....................................... 5 1.4.1 strip................................................

More information

Cell/B.E. BlockLib

Cell/B.E. BlockLib Cell/B.E. BlockLib 17 17115080 21 2 10 i Cell/B.E. BlockLib SIMD CELL SIMD Cell Cell BlockLib BlockLib NestStep libspe1 Cell SDK 3.1 libspe2 BlockLib Cell SDK 3.1 NestStep libspe2 BlockLib BlockLib libspe1

More information

,,,,., C Java,,.,,.,., ,,.,, i

,,,,., C Java,,.,,.,., ,,.,, i 24 Development of the programming s learning tool for children be derived from maze 1130353 2013 3 1 ,,,,., C Java,,.,,.,., 1 6 1 2.,,.,, i Abstract Development of the programming s learning tool for children

More information

Introduction Purpose This training course demonstrates the use of the High-performance Embedded Workshop (HEW), a key tool for developing software for

Introduction Purpose This training course demonstrates the use of the High-performance Embedded Workshop (HEW), a key tool for developing software for Introduction Purpose This training course demonstrates the use of the High-performance Embedded Workshop (HEW), a key tool for developing software for embedded systems that use microcontrollers (MCUs)

More information

XACCの概要

XACCの概要 2 global void kernel(int a[max], int llimit, int ulimit) {... } : int main(int argc, char *argv[]){ MPI_Int(&argc, &argc); MPI_Comm_rank(MPI_COMM_WORLD, &rank); MPI_Comm_size(MPI_COMM_WORLD, &size); dx

More information

~~~~~~~~~~~~~~~~~~ wait Call CPU time 1, latch: library cache 7, latch: library cache lock 4, job scheduler co

~~~~~~~~~~~~~~~~~~ wait Call CPU time 1, latch: library cache 7, latch: library cache lock 4, job scheduler co 072 DB Magazine 2007 September ~~~~~~~~~~~~~~~~~~ wait Call CPU time 1,055 34.7 latch: library cache 7,278 750 103 24.7 latch: library cache lock 4,194 465 111 15.3 job scheduler coordinator slave wait

More information

JAMSTECR, October MPI Message Passing Interface JAMSTEC NEC SX- IBM RS /SP PC MPI MPI_SENDRECV SX- SP PCC MPI MPI, Performance of MPI on parallel comp

JAMSTECR, October MPI Message Passing Interface JAMSTEC NEC SX- IBM RS /SP PC MPI MPI_SENDRECV SX- SP PCC MPI MPI, Performance of MPI on parallel comp JAMSTECR, October MPI Message Passing Interface JAMSTECNEC SX- IBM RS/SPPC MPI MPI_SENDRECVSX- SP PCC MPI MPI, Performance of MPI on parallel computers in JAMSTEC Hideaki SAITO Kazushi FURUTA Jun NAOI

More information

Cisco 1711/1712セキュリティ アクセス ルータの概要

Cisco 1711/1712セキュリティ アクセス ルータの概要 CHAPTER 1 Cisco 1711/1712 Cisco 1711/1712 Cisco 1711/1712 1-1 1 Cisco 1711/1712 Cisco 1711/1712 LAN Cisco 1711 1 WIC-1-AM WAN Interface Card WIC;WAN 1 Cisco 1712 1 ISDN-BRI S/T WIC-1B-S/T 1 Cisco 1711/1712

More information

untitled

untitled IBM i IBM AS/400 Power Systems 63.8% CPU 19,516 43,690 25,072 2002 POWER4 2000 SOI 2005 2004 POWER5 2007 POWER6 2008 IBM i 2004 eserver i5 2000 eserver iseries e 2006 System i5 Systems Agenda 2008 Power

More information

JIIAセミナー

JIIAセミナー Digital Interface IIDC URL teli.co.jp/ E-Mail http://www.toshiba-teli.co.jp teli.co.jp/ s-itokawa@toshiba-teli.co.jpteli.co.jp EIA,NTSC EIA,NTSC 4-5 JIIA JIIA - / Digital Interface Digital Interface IEEE1394

More information

PowerPoint プレゼンテーション

PowerPoint プレゼンテーション OpenMP 並列解説 1 人が共同作業を行うわけ 田植えの例 重いものを持ち上げる 田おこし 代かき 苗の準備 植付 共同作業する理由 1. 短時間で作業を行うため 2. 一人ではできない作業を行うため 3. 得意分野が異なる人が協力し合うため ポイント 1. 全員が最大限働く 2. タイミングよく 3. 作業順序に注意 4. オーバーヘッドをなくす 2 倍率 効率 並列化率と並列加速率 並列化効率の関係

More information

,,.,,., II,,,.,,.,.,,,.,,,.,, II i

,,.,,., II,,,.,,.,.,,,.,,,.,, II i 12 Load Dispersion Methods in Thin Client Systems 1010405 2001 2 5 ,,.,,., II,,,.,,.,.,,,.,,,.,, II i Abstract Load Dispersion Methods in Thin Client Systems Noritaka TAKEUCHI Server Based Computing by

More information

GPGPU

GPGPU GPGPU 2013 1008 2015 1 23 Abstract In recent years, with the advance of microscope technology, the alive cells have been able to observe. On the other hand, from the standpoint of image processing, the

More information

MPI usage

MPI usage MPI (Version 0.99 2006 11 8 ) 1 1 MPI ( Message Passing Interface ) 1 1.1 MPI................................. 1 1.2............................... 2 1.2.1 MPI GATHER.......................... 2 1.2.2

More information

コードのチューニング
