GRAPE GRAPE-DR V-GRAPE

1 V-GRAPE / CCSR 2007/1/24

2 GRAPE GRAPE-DR V-GRAPE

7 ( ) SDSS

8 GRAPE : (Barnes-Hut tree, FMM, Particle-Mesh Ewald (PPPM) ...): ( )

9 1988

10 GRAPE-1(1989) Mflops

11 GRAPE-2(1990) 8 ( ) 40Mflops

12 GRAPE-3(1991) MHz 7.2Gflops

13 GRAPE-3 1µm MHz 600 Mflops

14 GRAPE-4(1995) Tflops

15 GRAPE-4 [Figure: pipeline block diagram; surviving labels: Xi, Xj, Vi, Vj, sqrt, Pcut, Fcut, Func. eval., r 2, r.v, m/r, m/r 3, m/r 5, Fi, Pi, Ji] 1µm 10 (40 ) 640Mflops

16 GRAPE-6(2002) Tflops

17 Pipeline LSI: 0.25 µm design rule (Toshiba TC-240, 1.8M gates); operates at 90 MHz; 6 pipelines integrated; 31 Gflops per chip

18 2006
               GRAPE-6       Core 2 Extreme
Process        250nm         65nm
Clock          90MHz         2.93GHz
Peak           32.4Gflops    23.44Gflops
Power          10W           75W
Per 1W         3.24Gflops    Gflops

19 GRAPE-4

20 GRAPE-6 MDGRAPE-3 : MDGRAPE-4, 20Pflops@2010 MDGRAPE-3 GRAPE-DR

21 GRAPE-DR GRAPE : 2 Petaflops Tflops GRAPE : GRAPE

22 GRAPE ( ( N )) µm µm nm nm 10

23 1.

24 1. 2.

26 GRAPE-DR (3)

27 1

28 : ( ) 1. GRAPE SIMD

29 SIMD SIMD (Single Instruction Multiple Data): GRAPE

30 SIMD SIMD SSE MMX SIMD GRAPE-DR SIMD

31 SIMD Illiac IV, Goodyear MPP, ICL DAP, TMC CM-2, MasPar MP-1 [Figure: row of processing elements, each consisting of ALU, REG, and MEM] SIMD

32 SIMD Pentium III, [Figure: registers R0 to R7, each divided into words W0 to W3, feeding ALU0 to ALU3] 1 : 4

33 GRAPE-DR SIMD (FPGA ) SING (PE) 1 PE = + ( ) (PE ) PE (BB)

34 [Figure: PE block diagram; surviving labels: (M), PEID, BBID, A, B, T, x, +, 32W, 256W, ALU] (K M )

36 32PE( ) 16 18mm

37 GRAPE-DR 500MHz 100 Gflops ( )

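38 GRAPE-DR, another board: this one is the official project board. The contents are almost the same, but for some reason it is larger. LINPACK has reportedly run on it.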

39 PCI-Express (8 2GB/s) 4 GRAPE-DR ( ) PCI-Express 1

40 : 1 Pflops = PC 512

41 GRAPE

42 : $g_i = \sum_j f(x_i, x_j)$ ( )

43 ( 2006)
/VARI xi, yi, zi, e2;
/VARJ xj, yj, zj, mj;
/VARF fx, fy, fz;
dx = xi - xj;
dy = yi - yj;
dz = zi - zj;
r2 = dx*dx + dy*dy + dz*dz + e2;
r3i= powm32(r2);
ff = mj*r3i;
fx += ff*dx;
fy += ff*dy;
fz += ff*dz;
GRAPE PGR (FPGA PROGRAPE D 2006)

44 /
int SING_send_j_particle(struct grape_j_particle_struct *jp, int index_in_em);
int SING_send_i_particle(struct grape_i_particle_struct *ip, int n);
int SING_get_result(struct grape_result_struct *rp);
void SING_grape_init();
int SING_grape_run(int n);

45 2 ( )

46 V-GRAPE GRAPE-DR = V-GRAPE

47 GRAPE-DR 256Gflops MDGRAPE-3 FPGA FFT CG 2

48 FFT CG :

49 FFT FFT FFT : 10 log n 4GB/s 10 Gflops CPU

50 CG : O(10)

51 GRAPE-DR: 1MB Intel Itanium : 24MB? DRAM 1T-SRAM : 32 MB

52 V-GRAPE [Figure: array of PEs] GRAPE-DR V-GRAPE PE

53 V-GRAPE / : ( ) :

55 (GRAPE ) GRAPE : ( )

56 ( ) SC2002 NICAM

57 SC2002 Shingu et al, A Tflops Global Atmospheric Simulation with the Spectral Transform Method on the Earth Simulator % GRAPE-DR n n 2

58 NICAM wtk/ /iga/pub/ GRAPE-DR

59 2010 : Glevel=14 650TB 1 2P ( ) 1 3 2GB/s GRAPE-DR 6 Gflops ( 1% ) V-GRAPE

60 : 1 V-GRAPE

61 LINPACK V-GRAPE

62 1960 : CDC 6(7)600 (Cray ) 1970 : Cray-1, CDC-Star 1980

63 1990 PC

64 PC

65
1975:  Cray-1    100Mflops  10M$     PDP-11/70  10kflops?  50K$?
1985:  Cray XMP  1Gflops    10M$     PC-AT      30kflops?  5K$
1995:  VPP       Gflops     30M$?    DEC Alpha  300Mflops  30K$
2005:  SX-8      10TF       50M$?    Intel PD   12 Gflops  1K$
Cray-1 50 XMP 20 VPP 3 SX
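
The trailing figures on this slide (Cray-1 50, XMP 20, ...) look like price-performance ratios of each supercomputer relative to the small machine of the same year; that reading is an assumption, since the labels did not survive. The sketch below recomputes the ratio from the surviving numbers. The function names are hypothetical, the entries marked "?" are the slide's own rough estimates, and the 1995 row is left out because the VPP flops figure is missing.

```c
#include <stdio.h>

/* Flops per dollar of machine A divided by flops per dollar of machine B. */
double price_performance_ratio(double flops_a, double usd_a,
                               double flops_b, double usd_b)
{
    return (flops_a / usd_a) / (flops_b / usd_b);
}

/* Print the ratio for the rows whose figures survive on the slide. */
void print_ratios(void)
{
    /* 1975: Cray-1 (100 Mflops, 10 M$) vs PDP-11/70 (10 kflops, 50 K$) */
    printf("1975: %.3g\n", price_performance_ratio(100e6, 10e6, 10e3, 50e3));
    /* 1985: Cray XMP (1 Gflops, 10 M$) vs PC-AT (30 kflops, 5 K$) */
    printf("1985: %.3g\n", price_performance_ratio(1e9, 10e6, 30e3, 5e3));
    /* 2005: SX-8 (10 Tflops, 50 M$) vs Intel PD (12 Gflops, 1 K$) */
    printf("2005: %.3g\n", price_performance_ratio(10e12, 50e6, 12e9, 1e3));
}
```

With these inputs the ratio is 50 for 1975 and about 17 for 1985, consistent with the slide's "50" and "20", and about 1/60 for 2005, meaning the commodity processor delivers roughly 60 times more flops per dollar.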

67 1970 : IC 1980 : Cray : (Cray ) GHz GHz : 1 : Tflops ( ) : 100GB/s

68 Cray-1 : 100GB/ Gflops : 10GB/s 50Gflops

69 GRAPE-DR GPGPU

70 GPGPU NVIDIA 8800: C 768MB 90GB/s (SX-7 3 ) GPU C 400Gflops (8 )

71 : =

72 GRAPE LSI GRAPE-DR SIMD GRAPE V-GRAPE GRAPE-DR V-GRAPE

74 Memory Wall : : : :

75 1990 I/O

77 : 30

78 V-GRAPE BLAS, LAPACK PE PGDL ( FPGA ) SPH ( 150)

79 :

80 (M. Flynn) SISD/SIMD/MISD/MIMD (SI) (MI) (SD) (MD) SIMD SIMD ( ) MIMD

81 SIMD GRAPE ( ) : : ( ) : 1000 ( / )

82 (PE) [Figure: two-dimensional grid of PEs, with j- data supplied along one axis and i- data along the other] (GRAPE-6 ) 2 : 2

83 [Figure: groups of PEs, each group attached to its own broadcast memory, all connected to ( ) Memory controller/host]

84 SING: Sing Is Not GRAPE [Figure: board with four SING chips, each paired with DRAM and an FPGA CP, plus an FPGA host interface on PCI-X/PCIE] PCI

85 GRAPE : SIMD GDR : (FPGA ) =

86 PE PE ( )

87
var vector long xi hlt flt64to72
var vector long yi hlt flt64to72
var vector long zi hlt flt64to72
var vector short idxi hlt fix32to36ru
bvar long xj elt flt64to72
bvar long yj elt flt64to72
bvar long zj elt flt64to72
bvar long vxj xj
bvar short mj elt flt64to36
bvar short eps2 elt flt64to36
bvar short idxj elt fix32to36ru
var short lmj
var short leps2
var short lidxj
var vector long accx rrn flt72to64 fadd
var vector long accy rrn flt72to64 fadd
var vector long accz rrn flt72to64 fadd
var vector long pot rrn flt72to64 fadd
hlt, elt, rrn

88
loop initialization
vlen 4
uxor $t $t $t
upassa $ti $ti $lr40v
upassa $t $t $lr48v
upassa $t $t $lr56v
upassa $t $t pot
loop body
vlen 3
bm vxj $lr0v
vlen 1
bm mj lmj
bm eps2 leps2
bm idxj lidxj
( ) ( ) ( )

89
vlen 4
nop
upassa idxi idxi $t
uxor $ti lidxj $t
moi 2
( )
ulnot $ti $ti $t # mreg 1 indicates i != j
moi 0
nop
fsub $lr0 xi $r6v $t
fsub $lr2 yi $r10v ; fmul $ti $ti $t
fsub $lr4 zi $r14v
fmul $r10v $r10v $r18v ; fadd $t leps2 $t
fmul $r14v $r14v ; fadd $fb $ti $t
fadd $fb $ti $r18v $t
# rsq is now in r18 t, dx, dy,dz are in 2

90 ( )
ulsr $ti il"60" $t $lr22v
ulsr $ti il"1" $t
uadd $ti $lr22v $t
usub hl"9fd" $ti $t # $lr8v 1.5
ulsl $ti il"60" $lr30v
moi 1
uand il"1" $lr22v
moi 0
uand $r18v h"000ffffff" $t
uor $ti h"3ff000000" $t
fmul $ti f"0.57" $t
fsub f"1.57" $ti $t
mi 1
fmul f"1.414" $ti $t
mi 0
nop
fmul $t $lr30v $t $r22v
# Here the result is the initial guess r 3 1

91 ( )
fmul $r18v $r18v $r26v $t
fmul $r18v $ti $r26v $t
fmul $ti f"0.5" $r26v # r26v is a**3/2
fmul $r22v $r22v $t
fmul $ti $r26v $t
fsub f"1.5" $ti $t
fmul $r22v $ti $t $r22v
fmul $ti $ti $t
fmul $ti $r26v $t
( )
fsub f"1.5" $ti $t
fmul $r22v $ti $t $r22v
fmul $ti $ti $t
fmul $ti $r26v $t
fsub f"0.5" $ti $t
fmul $r22v $ti $t
fadd $r22v $ti $t
fmul lmj $ti $t $r22v
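
The repeated pattern on this slide (square the current estimate, multiply by a**3/2, subtract from 1.5, multiply back) matches the Newton-Raphson update for an inverse square root applied to a^3, which yields a^(-3/2). The C sketch below illustrates that update only; it is not a transcription of the listing. The function name inv_r3_newton and the initial guess built with frexp/ldexp are assumptions made for the example, whereas the slides derive their starting value with integer operations on the exponent and a linear fit to the mantissa.

```c
#include <math.h>

/* Hypothetical helper: approximate a^(-3/2) for finite a > 0 with the update
 *     x <- x * (1.5 - (a^3 / 2) * x^2),
 * whose fixed point is (a^3)^(-1/2) = a^(-3/2).
 * Very large or very small a may overflow or underflow when a^3 is formed. */
double inv_r3_newton(double a, int iterations)
{
    double a3 = a * a * a;
    double half_a3 = 0.5 * a3;          /* "a**3/2" in the listing's comment */

    /* Assumed initial guess: write a3 = m * 2^e with e even and
       0.5 <= m < 2, then start from 2^(-e/2). This keeps the starting
       value within a factor sqrt(2) of the true result. */
    int e;
    (void)frexp(a3, &e);                /* a3 = m * 2^e with 0.5 <= m < 1 */
    if (e % 2 != 0) {
        e -= 1;                         /* make e even; m is doubled implicitly */
    }
    double x = ldexp(1.0, -e / 2);

    for (int i = 0; i < iterations; ++i) {
        x = x * (1.5 - half_a3 * x * x);
    }
    return x;
}
```

Because the iteration converges quadratically once the estimate is close, the quality of the starting value decides how many steps are needed. With the crude guess assumed here, eight iterations reach double-precision accuracy; the listing on the slide uses a better starting value and correspondingly fewer steps.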

92 ( )
mi 2
fmul $r6v $ti ; upassa pot pot $lr0v
fmul $r10v $t ; fadd $fb $lr40v $lr40v accx
fmul $r14v $t ; fadd $fb $lr48v $lr48v accy
fmul $r18v $t ; fadd $fb $lr56v $lr56v accz
fadd $fb $lr0v pot

93
int SING_send_j_particle(struct grape_j_particle_struct *jp, int index_in_em);
int SING_send_i_particle(struct grape_i_particle_struct *ip, int n);
int SING_get_result(struct grape_result_struct *rp);
void SING_grape_init();
int SING_grape_run(int n);
GRAPE-3/5

94
struct grape_j_particle_struct{
    double xj;
    double yj;
    double zj;
    double mj;
    double eps2;
    UINT32 idxj;
};
struct grape_i_particle_struct{
    double xi;
    double yi;
    double zi;
    UINT32 idxi;
};
struct grape_result_struct{
    double accx;
    double accy;
    double accz;
    double pot;
};

95 17mm

97 PE

GRAPE GRAPE-DR V-GRAPE

GRAPE GRAPE-DR V-GRAPE GRAPE-DR / 2006/11/20-22 GRAPE GRAPE-DR V-GRAPE http://antwrp.gsfc.nasa.gov/apod/ap950917.html ( ) SDSS Genzel et al 2003 Adaptive Optics SgrA ( ) 12 1 : GRAPE : (Barnes-Hut tree, FMM, Particle- Mesh

More information

HPC / (CfCA) HPC 2007/11/23-25

HPC / (CfCA) HPC 2007/11/23-25 HPC / (CfCA) HPC 2007/11/23-25 CfCA GRAPE GRAPE GRAPE-DR HPC : : 1 1 (II ) Ia 100 1 ( ) 0.1 pc 1 AU 3 : 1 100 Top-down Katz and Gunn 1992 Dark Matter + + DM, : :SPH 10 4 Cray YMP 500-1000 : 10 7 Saitoh

More information

GRAPE-DR /

GRAPE-DR / GRAPE-DR / GRAPE GRAPE-DR GRAPE ( ): (Barnes-Hut tree, FMM, Particle- Mesh Ewald(PPPM)...): ( ) 1988 32 IC 200 0.1m 3 400 GRAPE-1(1989) 16 8 32 48 240Mflops GRAPE-2(1990) 8 ( ) 40Mflops GRAPE-3(1991) 24

More information

: 50 10 10 1. : : 3 : 4 : 2 2. : 1946 1975 1 : load: store: : : ( ) ( ) : 101 x 101 ------------- 101 101 ------------ 11001 2 ( ): 32 32 1 32 : 32 ( ) 32 ( ) : log 2 32 : : ( F) ( D) E W 1 4 : F D E

More information

Agenda GRAPE-MPの紹介と性能評価 GRAPE-MPの概要 OpenCLによる四倍精度演算 (preliminary) 4倍精度演算用SIM 加速ボード 6 processor elem with 128 bit logic Peak: 1.2Gflops

Agenda GRAPE-MPの紹介と性能評価 GRAPE-MPの概要 OpenCLによる四倍精度演算 (preliminary) 4倍精度演算用SIM 加速ボード 6 processor elem with 128 bit logic Peak: 1.2Gflops Agenda GRAPE-MPの紹介と性能評価 GRAPE-MPの概要 OpenCLによる四倍精度演算 (preliminary) 4倍精度演算用SIM 加速ボード 6 processor elem with 128 bit logic Peak: 1.2Gflops ボードの概要 Control processor (FPGA by Altera) GRAPE-MP chip[nextreme

More information

untitled

untitled taisuke@cs.tsukuba.ac.jp http://www.hpcs.is.tsukuba.ac.jp/~taisuke/ CP-PACS HPC PC post CP-PACS CP-PACS II 1990 HPC RWCP, HPC かつての世界最高速計算機も 1996年11月のTOP500 第一位 ピーク性能 614 GFLOPS Linpack性能 368 GFLOPS (地球シミュレータの前

More information

supercomputer2010.ppt

supercomputer2010.ppt nanri@cc.kyushu-u.ac.jp 1 !! : 11 12! : nanri@cc.kyushu-u.ac.jp! : Word 2 ! PC GPU) 1997 7 http://wiredvision.jp/news/200806/2008062322.html 3 !! (Cell, GPU )! 4 ! etc...! 5 !! etc. 6 !! 20km 40 km ) 340km

More information

ストリーミング SIMD 拡張命令2 (SSE2) を使用した SAXPY/DAXPY

ストリーミング SIMD 拡張命令2 (SSE2) を使用した SAXPY/DAXPY SIMD 2(SSE2) SAXPY/DAXPY 2.0 2000 7 : 248600J-001 01/12/06 1 305-8603 115 Fax: 0120-47-8832 * Copyright Intel Corporation 1999, 2000 01/12/06 2 1...5 2 SAXPY DAXPY...5 2.1 SAXPY DAXPY...6 2.1.1 SIMD C++...6

More information

スパコンに通じる並列プログラミングの基礎

スパコンに通じる並列プログラミングの基礎 2016.06.06 2016.06.06 1 / 60 2016.06.06 2 / 60 Windows, Mac Unix 0444-J 2016.06.06 3 / 60 Part I Unix GUI CUI: Unix, Windows, Mac OS Part II 0444-J 2016.06.06 4 / 60 ( : ) 6 6 ( ) 6 10 6 16 SX-ACE 6 17

More information

スパコンに通じる並列プログラミングの基礎

スパコンに通じる並列プログラミングの基礎 2018.06.04 2018.06.04 1 / 62 2018.06.04 2 / 62 Windows, Mac Unix 0444-J 2018.06.04 3 / 62 Part I Unix GUI CUI: Unix, Windows, Mac OS Part II 2018.06.04 4 / 62 0444-J ( : ) 6 4 ( ) 6 5 * 6 19 SX-ACE * 6

More information

( )

( ) 1. 2. 3. 4. 5. ( ) () http://www-astro.physics.ox.ac.uk/~wjs/apm_grey.gif http://antwrp.gsfc.nasa.gov/apod/ap950917.html ( ) SDSS : d 2 r i dt 2 = Gm jr ij j i rij 3 = Newton 3 0.1% 19 20 20 2 ( ) 3 3

More information

並列計算の数理とアルゴリズム サンプルページ この本の定価 判型などは, 以下の URL からご覧いただけます. このサンプルページの内容は, 初版 1 刷発行時のものです.

並列計算の数理とアルゴリズム サンプルページ この本の定価 判型などは, 以下の URL からご覧いただけます.  このサンプルページの内容は, 初版 1 刷発行時のものです. 並列計算の数理とアルゴリズム サンプルページ この本の定価 判型などは, 以下の URL からご覧いただけます. http://www.morikita.co.jp/books/mid/080711 このサンプルページの内容は, 初版 1 刷発行時のものです. Calcul scientifique parallèle by Frédéric Magoulès and François-Xavier

More information

スパコンに通じる並列プログラミングの基礎

スパコンに通じる並列プログラミングの基礎 2018.09.10 furihata@cmc.osaka-u.ac.jp ( ) 2018.09.10 1 / 59 furihata@cmc.osaka-u.ac.jp ( ) 2018.09.10 2 / 59 Windows, Mac Unix 0444-J furihata@cmc.osaka-u.ac.jp ( ) 2018.09.10 3 / 59 Part I Unix GUI CUI:

More information

A 99% MS-Free Presentation

A 99% MS-Free Presentation A 99% MS-Free Presentation 2 Galactic Dynamics (Binney & Tremaine 1987, 2008) Dynamics of Galaxies (Bertin 2000) Dynamical Evolution of Globular Clusters (Spitzer 1987) The Gravitational Million-Body Problem

More information

次世代スーパーコンピュータのシステム構成案について

次世代スーパーコンピュータのシステム構成案について 6 19 4 27 1. 2. 3. 3.1 3.2 A 3.3 B 4. 5. 2007/4/27 4 1 1. 2007/4/27 4 2 NEC NHF2 18 9 19 19 2 28 10PFLOPS2.5PB 30MW 3,200 18 12 12 SimFold, GAMESS, Modylas, RSDFT, NICAM, LatticeQCD, LANS HPL, NPB-FT 19

More information

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2 FFT 1 Fourier fast Fourier transform FFT FFT FFT 1 FFT FFT 2 Fourier 2.1 Fourier FFT Fourier discrete Fourier transform DFT DFT n 1 y k = j=0 x j ω jk n, 0 k n 1 (1) x j y k ω n = e 2πi/n i = 1 (1) n DFT

More information

iphone GPGPU GPU OpenCL Mac OS X Snow LeopardOpenCL iphone OpenCL OpenCL NVIDIA GPU CUDA GPU GPU GPU 15 GPU GPU CPU GPU iii OpenMP MPI CPU OpenCL CUDA OpenCL CPU OpenCL GPU NVIDIA Fermi GPU Fermi GPU GPU

More information

ohpr.dvi

ohpr.dvi 2003-08-04 1984 VP-1001 CPU, 250 MFLOPS, 128 MB 2004ASCI Purple (LLNL)64 CPU 197, 100 TFLOPS, 50 TB, 4.5 MW PC 2 CPU 16, 4 GFLOPS, 32 GB, 3.2 kw 20028 CPU 640, 40 TFLOPS, 10 TB, 10 MW (ASCI: Accelerated

More information

アクセラレータのデモと プログラミング手法

アクセラレータのデモと プログラミング手法 アクセラレータのデモと プログラミング手法 会津大学中里直人 アクセラレータボードを使った高速化スクール 2009/12/07 アクセラレータとは (1) ホスト計算機を補佐して特定の計算を高速化する計算機デバイス ホスト (CPU) で動作するプログラムを補佐 アクセラレータの例 Cell/PowerXCell8iブレード ボード : 計算 GPU ボード (NVIDIA, AMD, S3) :

More information

untitled

untitled A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }

More information

untitled

untitled PC murakami@cc.kyushu-u.ac.jp muscle server blade server PC PC + EHPC/Eric (Embedded HPC with Eric) 1216 Compact PCI Compact PCIPC Compact PCISH-4 Compact PCISH-4 Eric Eric EHPC/Eric EHPC/Eric Gigabit

More information

マルチコアPCクラスタ環境におけるBDD法のハイブリッド並列実装

マルチコアPCクラスタ環境におけるBDD法のハイブリッド並列実装 2010 GPGPU 2010 9 29 MPI/Pthread (DDM) DDM CPU CPU CPU CPU FEM GPU FEM CPU Mult - NUMA Multprocessng Cell GPU Accelerator, GPU CPU Heterogeneous computng L3 cache L3 cache CPU CPU + GPU GPU L3 cache 4

More information

1重谷.PDF

1重谷.PDF RSCC RSCC RSCC BMT 1 6 3 3000 3000 200310 1994 19942 VPP500/32PE 19992 VPP700E/128PE 160PE 20043 2 2 PC Linux 2048 CPU Intel Xeon 3.06GHzDual) 12.5 TFLOPS SX-7 32CPU/256GB 282.5 GFLOPS Linux 3 PC 1999

More information

untitled

untitled A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }

More information

EGunGPU

EGunGPU Super Computing in Accelerator simulations - Electron Gun simulation using GPGPU - K. Ohmi, KEK-Accel Accelerator Physics seminar 2009.11.19 Super computers in KEK HITACHI SR11000 POWER5 16 24GB 16 134GFlops,

More information

2005 1

2005 1 25 SPARCstation 2 CPU central processor unit 25 2 25 3 25 4 DRAM 25 5 25 6 : DRAM 25 7 2 25 8 2 25 9 2 bit: binary digit V 2V 25 2 2 2 2 4 5 2 6 3 7 25 A B C A B C A B C A B C A C A B 3 25 2 25 3 Co Cin

More information

スライド 1

スライド 1 計算科学が拓く世界スーパーコンピュータは何故スーパーか 学術情報メディアセンター中島浩 http://www.para.media.kyoto-u.ac.jp/jp/ username=super password=computer 講義の概要 目的 計算科学に不可欠の道具スーパーコンピュータが どういうものか なぜスーパーなのか どう使うとスーパーなのかについて雰囲気をつかむ 内容 スーパーコンピュータの歴史を概観しつつ

More information

The 3 key challenges in programming for MC

The 3 key challenges in programming for MC Aug 3 06 Software &Solutions group Intel Intel Centrino Intel NetBurst Intel XScale Itanium Pentium Xeon Intel Core VTune Intel Corporation Intel NetBurst Pentium Xeon Pentium M Core 64 2 Intel Software

More information

07-二村幸孝・出口大輔.indd

07-二村幸孝・出口大輔.indd GPU Graphics Processing Units HPC High Performance Computing GPU GPGPU General-Purpose computation on GPU CPU GPU GPU *1 Intel Quad-Core Xeon E5472 3.0 GHz 2 6 MB L2 cache 1600 MHz FSB 80 GFlops 1 nvidia

More information

1 GPU GPGPU GPU CPU 2 GPU 2007 NVIDIA GPGPU CUDA[3] GPGPU CUDA GPGPU CUDA GPGPU GPU GPU GPU Graphics Processing Unit LSI LSI CPU ( ) DRAM GPU LSI GPU

1 GPU GPGPU GPU CPU 2 GPU 2007 NVIDIA GPGPU CUDA[3] GPGPU CUDA GPGPU CUDA GPGPU GPU GPU GPU Graphics Processing Unit LSI LSI CPU ( ) DRAM GPU LSI GPU GPGPU (I) GPU GPGPU 1 GPU(Graphics Processing Unit) GPU GPGPU(General-Purpose computing on GPUs) GPU GPGPU GPU ( PC ) PC PC GPU PC PC GPU GPU 2008 TSUBAME NVIDIA GPU(Tesla S1070) TOP500 29 [1] 2009 AMD

More information

AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted

AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted DEGIMA LINPACK Energy Performance for LINPACK Benchmark on DEGIMA 1 AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK 1.4698 GFlops/Watt 1.9658 GFlops/Watt Abstract GPU Computing has

More information

XACCの概要

XACCの概要 2 global void kernel(int a[max], int llimit, int ulimit) {... } : int main(int argc, char *argv[]){ MPI_Int(&argc, &argc); MPI_Comm_rank(MPI_COMM_WORLD, &rank); MPI_Comm_size(MPI_COMM_WORLD, &size); dx

More information

untitled

untitled Power Wall HPL1 10 B/F EXTREMETECH Supercomputing director bets $2,000 that we won t have exascale computing by 2020 One of the biggest problems standing in our way is power. [] http://www.extremetech.com/computing/155941

More information

23 Fig. 2: hwmodulev2 3. Reconfigurable HPC 3.1 hw/sw hw/sw hw/sw FPGA PC FPGA PC FPGA HPC FPGA FPGA hw/sw hw/sw hw- Module FPGA hwmodule hw/sw FPGA h

23 Fig. 2: hwmodulev2 3. Reconfigurable HPC 3.1 hw/sw hw/sw hw/sw FPGA PC FPGA PC FPGA HPC FPGA FPGA hw/sw hw/sw hw- Module FPGA hwmodule hw/sw FPGA h 23 FPGA CUDA Performance Comparison of FPGA Array with CUDA on Poisson Equation (lijiang@sekine-lab.ei.tuat.ac.jp), (kazuki@sekine-lab.ei.tuat.ac.jp), (takahashi@sekine-lab.ei.tuat.ac.jp), (tamukoh@cc.tuat.ac.jp),

More information

FIT2013( 第 12 回情報科学技術フォーラム ) C-017 SIMD Implementation and evaluation of a morphological pattern spectrum using an highly-parallel SIMD matrix process

FIT2013( 第 12 回情報科学技術フォーラム ) C-017 SIMD Implementation and evaluation of a morphological pattern spectrum using an highly-parallel SIMD matrix process C-017 SIMD Implementation an evaluation of a morphological pattern pectrum uing an highly-parallel SIMD matrix proceor Yauhi Tukaa Tomohiro Takea Tohiya Hona Takehi Kumaki Takehi Ogura Takehi Fujino 1.

More information

smpp_resume.dvi

smpp_resume.dvi 6 mmiki@mail.doshisha.ac.jp Parallel Processing Parallel Pseudo-parallel Concurrent 1) 1/60 1) 1997 5 11 IBM Deep Blue Deep Blue 2) PC 2000 167 Rank Manufacturer Computer Rmax Installation Site Country

More information

26102 (1/2) LSISoC: (1) (*) (*) GPU SIMD MIMD FPGA DES, AES (2/2) (2) FPGA(8bit) (ISS: Instruction Set Simulator) (3) (4) LSI ECU110100ECU1 ECU ECU ECU ECU FPGA ECU main() { int i, j, k for { } 1 GP-GPU

More information

openmp1_Yaguchi_version_170530

openmp1_Yaguchi_version_170530 並列計算とは /OpenMP の初歩 (1) 今 の内容 なぜ並列計算が必要か? スーパーコンピュータの性能動向 1ExaFLOPS 次世代スハ コン 京 1PFLOPS 性能 1TFLOPS 1GFLOPS スカラー機ベクトル機ベクトル並列機並列機 X-MP ncube2 CRAY-1 S-810 SR8000 VPP500 CM-5 ASCI-5 ASCI-4 S3800 T3E-900 SR2201

More information

Itanium2ベンチマーク

Itanium2ベンチマーク HPC CPU mhori@ile.osaka-u.ac.jp Special thanks Timur Esirkepov HPC 2004 2 25 1 1. CPU 2. 3. Itanium 2 HPC 2 1 Itanium2 CPU CPU 3 ( ) Intel Itanium2 NEC SX-6 HP Alpha Server ES40 PRIMEPOWER SR8000 Intel

More information

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments 計算機アーキテクチャ第 11 回 マルチプロセッサ 本資料は授業用です 無断で転載することを禁じます 名古屋大学 大学院情報科学研究科 准教授加藤真平 デスクトップ ジョブレベル並列性 スーパーコンピュータ 並列処理プログラム プログラムの並列化 for (i = 0; i < N; i++) { x[i] = a[i] + b[i]; } プログラムの並列化 x[0] = a[0] + b[0];

More information

main.dvi

main.dvi PC 1 1 [1][2] [3][4] ( ) GPU(Graphics Processing Unit) GPU PC GPU PC ( 2 GPU ) GPU Harris Corner Detector[5] CPU ( ) ( ) CPU GPU 2 3 GPU 4 5 6 7 1 toyohiro@isc.kyutech.ac.jp 45 2 ( ) CPU ( ) ( ) () 2.1

More information

e Ž ¹ vµ q ¹¹¹ ¹¹¹¹¹ vµ j ¹¹¹ ¹¹¹¹ r µ ¹¹¹¹ ¹¹¹¹¹ µ ¹¹¹¹¹ ¹¹¹¹ µ ¹¹¹¹ ¹¹¹ vµ ¹¹¹¹ ¹¹¹¹ vµ Ž ¹¹¹ ¹¹¹¹ vµˆ ¹¹¹ ¹¹¹¹¹ µ ¹¹¹¹ ¹¹¹¹¹¹¹¹ µ ¹¹¹¹¹ ¹¹¹

e Ž ¹ vµ q ¹¹¹ ¹¹¹¹¹ vµ j ¹¹¹ ¹¹¹¹ r µ ¹¹¹¹ ¹¹¹¹¹ µ ¹¹¹¹¹ ¹¹¹¹ µ ¹¹¹¹ ¹¹¹ vµ ¹¹¹¹ ¹¹¹¹ vµ Ž ¹¹¹ ¹¹¹¹ vµˆ ¹¹¹ ¹¹¹¹¹ µ ¹¹¹¹ ¹¹¹¹¹¹¹¹ µ ¹¹¹¹¹ ¹¹¹ e Ž µ ¹¹¹ ¹¹¹ v µ ¹¹¹¹¹ ¹¹¹¹¹¹ rµ ¹¹¹¹ ¹¹¹ j µ r µž ¹¹¹¹¹ ¹¹¹¹ µ ¹¹¹ ¹¹¹¹ µ ¹¹¹¹ ¹¹¹¹ µ ¹¹¹¹¹ µ ¹¹¹¹¹¹ ¹¹¹¹¹ l vµ u ¹¹¹ ¹¹¹¹¹¹ µ ¹¹¹¹ ¹¹¹¹¹ µ µ ¹¹¹ ¹¹¹ µg ¹¹¹¹ ¹¹¹¹¹ r µ Ž ¹¹¹ ¹¹¹ vµ ¹¹¹¹ ¹¹¹¹ µ ¹¹¹¹¹

More information

( : December 27, 2015) CONTENTS I. 1 II. 2 III. 2 IV. 3 V. 5 VI. 6 VII. 7 VIII. 9 I. 1 f(x) f (x) y = f(x) x ϕ(r) (gradient) ϕ(r) (gradϕ(r) ) ( ) ϕ(r)

( : December 27, 2015) CONTENTS I. 1 II. 2 III. 2 IV. 3 V. 5 VI. 6 VII. 7 VIII. 9 I. 1 f(x) f (x) y = f(x) x ϕ(r) (gradient) ϕ(r) (gradϕ(r) ) ( ) ϕ(r) ( : December 27, 215 CONTENTS I. 1 II. 2 III. 2 IV. 3 V. 5 VI. 6 VII. 7 VIII. 9 I. 1 f(x f (x y f(x x ϕ(r (gradient ϕ(r (gradϕ(r ( ϕ(r r ϕ r xi + yj + zk ϕ(r ϕ(r x i + ϕ(r y j + ϕ(r z k (1.1 ϕ(r ϕ(r i

More information

GPUを用いたN体計算

GPUを用いたN体計算 単精度 190Tflops GPU クラスタ ( 長崎大 ) の紹介 長崎大学工学部超高速メニーコアコンピューティングセンターテニュアトラック助教濱田剛 1 概要 GPU (Graphics Processing Unit) について簡単に説明します. GPU クラスタが得意とする応用問題を議論し 長崎大学での GPU クラスタによる 取組方針 N 体計算の高速化に関する研究内容 を紹介します. まとめ

More information

II 2 II

II 2 II II 2 II 2005 yugami@cc.utsunomiya-u.ac.jp 2005 4 1 1 2 5 2.1.................................... 5 2.2................................. 6 2.3............................. 6 2.4.................................

More information

262014 3 1 1 6 3 2 198810 2/ 198810 2 1 3 4 http://www.pref.hiroshima.lg.jp/site/monjokan/ 1... 1... 1... 2... 2... 4... 5... 9... 9... 10... 10... 10... 10... 13 2... 13 3... 15... 15... 15... 16 4...

More information

untitled

untitled 2005 2 1 105-0004 5-34-3 Tel: 03-3431-4002 Fax: 03-3431-4044 1 SRL/ISTEC 1 1 SFQ SFQ SFQ 2004 9 4 SFQ SFQ / LSI 269 230 230 230 269 230 SFQ SFQ 2005 2 ISTEC 2005 All rights reserved. - 1 - 2005 2 1 105-0004

More information

: , 2.0, 3.0, 2.0, (%) ( 2.

: , 2.0, 3.0, 2.0, (%) ( 2. 2017 1 2 1.1...................................... 2 1.2......................................... 4 1.3........................................... 10 1.4................................. 14 1.5..........................................

More information

(Basic Theory of Information Processing) 1

(Basic Theory of Information Processing) 1 (Basic Theory of Information Processing) 1 10 (p.178) Java a[0] = 1; 1 a[4] = 7; i = 2; j = 8; a[i] = j; b[0][0] = 1; 2 b[2][3] = 10; b[i][j] = a[2] * 3; x = a[2]; a[2] = b[i][3] * x; 2 public class Array0

More information

1 4 1.1........................................... 4 1.2.................................. 4 1.3................................... 4 2 5 2.1 GPU.....

1 4 1.1........................................... 4 1.2.................................. 4 1.3................................... 4 2 5 2.1 GPU..... CPU GPU N Q07-065 2011 2 17 1 1 4 1.1........................................... 4 1.2.................................. 4 1.3................................... 4 2 5 2.1 GPU...........................................

More information

スライド 1

スライド 1 swk(at)ic.is.tohoku.ac.jp 2 Outline 3 ? 4 S/N CCD 5 Q Q V 6 CMOS 1 7 1 2 N 1 2 N 8 CCD: CMOS: 9 : / 10 A-D A D C A D C A D C A D C A D C A D C ADC 11 A-D ADC ADC ADC ADC ADC ADC ADC ADC ADC A-D 12 ADC

More information

HPCマシンの変遷と 今後の情報基盤センターの役割

HPCマシンの変遷と 今後の情報基盤センターの役割 筑波大学計算科学センターシンポジウム 計算機アーキテクトが考える 次世代スパコン 2006 年 4 月 5 日 村上和彰 九州大学 murakami@cc.kyushu-u.ac.jp 次世代スパコン ~ 達成目標と制約条件の整理 ~ 達成目標 性能目標 (2011 年 ) LINPACK (HPL):10PFlop/s 実アプリケーション :1PFlop/s 成果目標 ( 私見 ) 科学技術計算能力の国際競争力の向上ならびに維持による我が国の科学技術力

More information

1 All Rights Reserved, Copyright 2004, NEC Corporation 2 All Rights Reserved, Copyright 2004, NEC Corporation

1 All Rights Reserved, Copyright 2004, NEC Corporation 2 All Rights Reserved, Copyright 2004, NEC Corporation 1 2 Linpack TO500 3 SIM BlueGene/L DD2 olumbia BlueGene/L DD3 TIGER 4 ASI Q BlueGene/L DD1 LINAK Blue Gene/L H apacity omputing 4 apability omputing Goals Goals TAT - Not challenging - hallenging - SM

More information

HPC146

HPC146 2 3 4 5 6 int array[16]; #pragma xmp nodes p(4) #pragma xmp template t(0:15) #pragma xmp distribute t(block) on p #pragma xmp align array[i] with t(i) array[16] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Node

More information

Part () () Γ Part ,

Part () () Γ Part , Contents a 6 6 6 6 6 6 6 7 7. 8.. 8.. 8.3. 8 Part. 9. 9.. 9.. 3. 3.. 3.. 3 4. 5 4.. 5 4.. 9 4.3. 3 Part. 6 5. () 6 5.. () 7 5.. 9 5.3. Γ 3 6. 3 6.. 3 6.. 3 6.3. 33 Part 3. 34 7. 34 7.. 34 7.. 34 8. 35

More information

rank ”«‘‚“™z‡Ì GPU ‡É‡æ‡éŁÀŠñ›»

rank ”«‘‚“™z‡Ì GPU ‡É‡æ‡éŁÀŠñ›» rank GPU ERATO 2011 11 1 1 / 26 GPU rank/select wavelet tree balanced parenthesis GPU rank 2 / 26 GPU rank/select wavelet tree balanced parenthesis GPU rank 2 / 26 GPU rank/select wavelet tree balanced

More information

GPU n Graphics Processing Unit CG CAD

GPU n Graphics Processing Unit CG CAD GPU 2016/06/27 第 20 回 GPU コンピューティング講習会 ( 東京工業大学 ) 1 GPU n Graphics Processing Unit CG CAD www.nvidia.co.jp www.autodesk.co.jp www.pixar.com GPU n GPU ü n NVIDIA CUDA ü NVIDIA GPU ü OS Linux, Windows, Mac

More information

develop

develop SCore SCore 02/03/20 2 1 HA (High Availability) HPC (High Performance Computing) 02/03/20 3 HA (High Availability) Mail/Web/News/File Server HPC (High Performance Computing) Job Dispatching( ) Parallel

More information

VLSI工学

VLSI工学 2008//5/ () 2008//5/ () 2 () http://ssc.pe.titech.ac.jp 2008//5/ () 3!! A (WCDMA/GSM) DD DoCoMo 905iP905i 2008//5/ () 4 minisd P900i SemiConsult SDRAM, MPEG4 UIMIrDA LCD/ AF ADC/DAC IC CCD C-CPUA-CPU DSPSRAM

More information

36 th IChO : - 3 ( ) , G O O D L U C K final 1

36 th IChO : - 3 ( ) , G O O D L U C K final 1 36 th ICh - - 5 - - : - 3 ( ) - 169 - -, - - - - - - - G D L U C K final 1 1 1.01 2 e 4.00 3 Li 6.94 4 Be 9.01 5 B 10.81 6 C 12.01 7 N 14.01 8 16.00 9 F 19.00 10 Ne 20.18 11 Na 22.99 12 Mg 24.31 Periodic

More information

NEC All rights reserved 1

NEC All rights reserved 1 NEC All rights reserved 1 NEC All rights reserved 2 NEC All rights reserved 3 (Founder) (Langchao Langchao) NEC All rights reserved 4 2.1 GB/s 64 bits wide 266 MHz 4 MB L3 on board, 96k L2, 32k L1 on -die

More information

chapter4.PDF

chapter4.PDF 4. 4.1. 4.2. 63 4 1 4.3. 4.3.1. 4 a) 1 5 b) 1 c) d) 1 4.3.2. a) b) c) a) 10 18 b) 2 17 2 1 54 2 1 c) 11 4 1 1 (TB) (FB) TB FB 4.3.3. 4.3.4. 1 18 16 4.3.5. a) b) 18 16 a) b) c) 1 18 16 2 1 18 16 3 18 16

More information

..3. Ω, Ω F, P Ω, F, P ). ) F a) A, A,..., A i,... F A i F. b) A F A c F c) Ω F. ) A F A P A),. a) 0 P A) b) P Ω) c) [ ] A, A,..., A i,... F i j A i A

..3. Ω, Ω F, P Ω, F, P ). ) F a) A, A,..., A i,... F A i F. b) A F A c F c) Ω F. ) A F A P A),. a) 0 P A) b) P Ω) c) [ ] A, A,..., A i,... F i j A i A .. Laplace ). A... i),. ω i i ). {ω,..., ω } Ω,. ii) Ω. Ω. A ) r, A P A) P A) r... ).. Ω {,, 3, 4, 5, 6}. i i 6). A {, 4, 6} P A) P A) 3 6. ).. i, j i, j) ) Ω {i, j) i 6, j 6}., 36. A. A {i, j) i j }.

More information

( ) X x, y x y x y X x X x [x] ( ) x X y x y [x] = [y] ( ) x X y y x ( ˆX) X ˆX X x x z x X x ˆX [z x ] X ˆX X ˆX ( ˆX ) (0) X x, y d(x(1), y(1)), d(x

( ) X x, y x y x y X x X x [x] ( ) x X y x y [x] = [y] ( ) x X y y x ( ˆX) X ˆX X x x z x X x ˆX [z x ] X ˆX X ˆX ( ˆX ) (0) X x, y d(x(1), y(1)), d(x Z Z Ẑ 1 1.1 (X, d) X x 1, x 2,, x n, x x n x(n) ( ) X x x ε N N i, j i, j d(x(i), x(j)) < ε ( ) X x x n N N i i d(x(n), x(i)) < 1 n ( ) X x lim n x(n) X x X () X x, y lim n d(x(n), y(n)) = 0 x y x y 1

More information

単位、情報量、デジタルデータ、CPUと高速化 ~ICT用語集~

単位、情報量、デジタルデータ、CPUと高速化  ~ICT用語集~ CPU ICT mizutani@ic.daito.ac.jp 2014 SI: Systèm International d Unités SI SI 10 1 da 10 1 d 10 2 h 10 2 c 10 3 k 10 3 m 10 6 M 10 6 µ 10 9 G 10 9 n 10 12 T 10 12 p 10 15 P 10 15 f 10 18 E 10 18 a 10 21

More information

ÊÂÎó·×»»¤È¤Ï/OpenMP¤Î½éÊâ¡Ê£±¡Ë

ÊÂÎó·×»»¤È¤Ï/OpenMP¤Î½éÊâ¡Ê£±¡Ë 2015 5 21 OpenMP Hello World Do (omp do) Fortran (omp workshare) CPU Richardson s Forecast Factory 64,000 L.F. Richardson, Weather Prediction by Numerical Process, Cambridge, University Press (1922) Drawing

More information

九州大学学術情報リポジトリ Kyushu University Institutional Repository 将来 (2010 年前後を想定 ) のペタフロップス超級スパコンセンターとの連携について 村上, 和彰九州大学大学院システム情報科学研究院 九州大学情報基盤センター

九州大学学術情報リポジトリ Kyushu University Institutional Repository 将来 (2010 年前後を想定 ) のペタフロップス超級スパコンセンターとの連携について 村上, 和彰九州大学大学院システム情報科学研究院 九州大学情報基盤センター 九州大学学術情報リポジトリ Kyushu University Institutional Repository 将来 (2010 年前後を想定 ) のペタフロップス超級スパコンセンターとの連携について 村上, 和彰九州大学大学院システム情報科学研究院 九州大学情報基盤センター http://hdl.handle.net/2324/9112 出版情報 :SLRC プレゼンテーション, 2005-03-08

More information

2/66

2/66 1/66 9 Outline 1. 2. 3. 4. CPU 5. Jun. 13, 2013@A 2/66 3/66 4/66 Network Memory Memory Memory CPU SIMD if Cache CPU Cache CPU Cache CPU 5/66 FPU FPU Floating Processing Unit Register Register Register

More information

4

4 4 r r 43 44 a b c f d e a r b c d e f 45 r r r 46 47 a b g a b r c d e f r g c d e f e 48 mm r r 1 49 a r b c a b 1 1 a 3 a 50 1 a 3 1 mb a 1 mm 3 a a a 51 1 mm 1 mm 1 5 mb 3 4 1 3 4 1 53 1 1 mj r 1 a

More information

倍々精度RgemmのnVidia C2050上への実装と応用

倍々精度RgemmのnVidia C2050上への実装と応用 .. maho@riken.jp http://accc.riken.jp/maho/,,, 2011/2/16 1 - : GPU : SDPA-DD 10 1 - Rgemm : 4 (32 ) nvidia C2050, GPU CPU 150, 24GFlops 25 20 GFLOPS 15 10 QuadAdd Cray, QuadMul Sloppy Kernel QuadAdd Cray,

More information

卒業論文

卒業論文 PC OpenMP SCore PC OpenMP PC PC PC Myrinet PC PC 1 OpenMP 2 1 3 3 PC 8 OpenMP 11 15 15 16 16 18 19 19 19 20 20 21 21 23 26 29 30 31 32 33 4 5 6 7 SCore 9 PC 10 OpenMP 14 16 17 10 17 11 19 12 19 13 20 1421

More information

( ) ( ) HPC SPH FPGA Web http://galaxy.u-aizu.ac.jp/trac/note/ : 1 4 : 2 6 : 3 6 GPU : ~ 100 1000 : ~ 1000-100000 Google : ~ 10000 : ~ 100000000 GPU, Cell, FPGA GRAPE-DR/GRAPE-MP ( ) GPU GPU : Matsumoto,

More information

HPC可視化_小野2.pptx

HPC可視化_小野2.pptx 大 小 二 生 高 方 目 大 方 方 方 Rank Site Processors RMax Processor System Model 1 DOE/NNSA/LANL 122400 1026000 PowerXCell 8i BladeCenter QS22 Cluster 2 DOE/NNSA/LLNL 212992 478200 PowerPC 440 BlueGene/L 3 Argonne

More information

ストリーミング SIMD 拡張命令2 (SSE2) を使用した、倍精度浮動小数点ベクトルの最大/最小要素とそのインデックスの検出

ストリーミング SIMD 拡張命令2 (SSE2) を使用した、倍精度浮動小数点ベクトルの最大/最小要素とそのインデックスの検出 SIMD 2(SSE2) / 2.0 2000 7 : 248602J-001 01/10/30 1 305-8603 115 Fax: 0120-47-8832 * Copyright Intel Corporation 1999-2001 01/10/30 2 1...5 2...5 2.1...5 2.1.1...5 2.1.2...8 3...9 3.1...9 3.2...9 4...9

More information

1. A0 A B A0 A : A1,...,A5 B : B1,...,B
