Agenda GRAPE-MPの紹介と性能評価 GRAPE-MPの概要 OpenCLによる四倍精度演算 (preliminary) 4倍精度演算用SIM 加速ボード 6 processor elem with 128 bit logic Peak: 1.2Gflops
|
|
|
- はすな つつの
- 7 years ago
- Views:
Transcription
1
2 Agenda GRAPE-MPの紹介と性能評価 GRAPE-MPの概要 OpenCLによる四倍精度演算 (preliminary) 4倍精度演算用SIM 加速ボード 6 processor elem with 128 bit logic Peak: 1.2Gflops
3
4
5 ボードの概要 Control processor (FPGA by Altera) GRAPE-MP chip[nextreme NX2500] (structured ASIC by easic)
6 GRAPE-MP チップのブロック図 転送 :800MB/s 128bit, 128 words 2 演算 x 100MHz x 6 PE = 1.2 Gflops 4 論理 pipelines x 6 PE = 24 pipelines /chip
7 inst BM out BM in GRF1 128w GRF2 128w add treg 4w mul tt ss rsq
8 ( 0 ) : nop if 0 ( 1 ) : sub? ( 2-8) : grf_a adr 7 bit ( 9-15) : grf_b adr 7 bit (16-22) : grf_c adr 7 bit (23-29) : grf_d adr 7 bit (30-31) : TREG adr 2 bit (32-34) : ADD 1st arg : a,b,bm,t,ti (35-37) : ADD 2nd arg : a,b,bm,t,ti (38-40) : MUL 1st arg : a,b,bm,t,ti (41-43) : MUL 2nd arg : a,b,bm,t,ti (44 ) : RSQ 1st arg : t,ti (45-46) : grf_c write : add, mul, rsq (47-48) : grf_d write : add, mul, rsq (49-50) : treg write : add, mul, rsq (51 ) : bm out (52-55) : bm mask : 1000 => 0, 1001 => 1, 1010 => 2 etc. (56-62) : bm adr 7 bit (128 words)
9 bit exponents 116bit mantissa 1bit for sign
10
11
12
13
14 GRAPE-MP ボードのブロック図 64bit 16k ワード IO control processor をGRAPE-MP チップから分離 MP チップのPE 数を最大にするため 開発を簡単にするため
15
16 sub bm16v ra0v rb40v sub bm20v ra4v rb44v sub bm24v ra8v rb48v mul rb40v rb40v ra36v mul rb44v rb44v tt add ra36v ts ra32v mul rb48v rb48v tt add ra32v ts tt b f b f a b b f a a a a e e a e c e240c0005e d e e e e240c e e
17 VARI xi, yi, zi, e2; VARJ xj, yj, zj, mj; VARF ax, ay, az, pt; dx = xj - xi; dy = yj - yi; dz = zj - zi; r1i = rsqrt(dx**2 + dy**2 + dz**2 + e2); pf = mj*r1i; pt += pf; af = pf*r1i**2; ax += af*dx; bm_in bm12v ra12v pe0 bm_in bm8v ra8v pe0 bm_in bm4v ra4v pe0 bm_in bm0v ra0v pe0 mov zz ra16v mov zz ra28v mov zz ra24v mov zz ra20v sub bm16v ra0v rb40v sub bm20v ra4v rb44v sub bm24v ra8v rb48v mul rb40v rb40v ra36v mul rb44v rb44v tt add ra36v ts ra32v mul rb48v rb48v tt add ra32v ts tt
18 GRAPE-MPの性能評価 テスト環境 CPU:Intel Core i7 920 (OC 3GHz) MEM: DDR GB (1208MHz動作) MB: Asus P6T6 WS Revolution (6PCIe スロット) 6ボードを搭載して性能評価
19 ファインマンループ積分 1 I = 0 1 x dx 0 1 x y dy 0 dz 1 D 2 D= xys tz 1 x y z x y 2 1 x y z 1 x y m e 2 z 1 x y m f 2 x,yを与える 一番内側のzの和を計算 同時に (x,y) の24 組を計算 積分のポイント数 Nを変えて計算 41 N 3 演算
20
21 i 並列 146 pipelines(6 台 ) 96 pipelines(4 台 ) 48 pipelines(2 台 ) 性能 (N=3900) Gflops (5.30 倍 ) Gflops (3.75 倍 ) Gflops (1.95) Number of particles/points
22 ( i Number of particles 42 %
23 M A RAM-A (RA[1]) M B RAM-B (RB[1]) Multiplier[1] (64 64)... M A RAM-A (RA[p]) M B RAM-B (RB[p]) Multiplier[p] (64 64) Op 1024 bits 2048 bits MPFR Our Speedup MPFR Our Speedup x ± y x y x/y x Sin(x) Cos(x) Exp(x) Ln(x) Accumulator[1] 70bits(high) E A + E B MUX + Sum 64bits(low) RAM-C (RC) Normalization Result Accumulator[p] S A * S B (B) Structure of VP_Mult unit
24 POWER7 FPGA 400 Mop/s e e+08 1e+07 vector length 7 8 FPGA
25
26 100 Performance of C AB + C on CPU-GPU Systems 2000 Performa in Differe 600 Performance [GFlop/s] Performance [GFlop/s] Matrix size [n=m=k] SGEMM on System A (HD 5870 GPU + Core i7 970 CPU) SGEMM on System B (HD 6970 GPU + Core i7 2600k CPU) DGEMM on System C (2 HD 5870 GPUs + Core i7 960 CPU) DGEMM on System A (HD 5870 GPU + Core i7 970 CPU) DGEMM on System B (HD 6970 GPU + Core i7 2600k CPU) 0
27 Blocking factor [b] (n=m=k=10b) Maximum Performance DGEMM SGEMM Variant System A System B Perf. [GFlop/s] Perf. [GFlop/s] C A T B + C C AB + C C A T B T + C C AB T + C C A T B + C C AB + C C A T B T + C C AB T + C
( ) ( ) HPC SPH FPGA Web http://galaxy.u-aizu.ac.jp/trac/note/ : 1 4 : 2 6 : 3 6 GPU : ~ 100 1000 : ~ 1000-100000 Google : ~ 10000 : ~ 100000000 GPU, Cell, FPGA GRAPE-DR/GRAPE-MP ( ) GPU GPU : Matsumoto,
23 Fig. 2: hwmodulev2 3. Reconfigurable HPC 3.1 hw/sw hw/sw hw/sw FPGA PC FPGA PC FPGA HPC FPGA FPGA hw/sw hw/sw hw- Module FPGA hwmodule hw/sw FPGA h
23 FPGA CUDA Performance Comparison of FPGA Array with CUDA on Poisson Equation ([email protected]), ([email protected]), ([email protected]), ([email protected]),
26102 (1/2) LSISoC: (1) (*) (*) GPU SIMD MIMD FPGA DES, AES (2/2) (2) FPGA(8bit) (ISS: Instruction Set Simulator) (3) (4) LSI ECU110100ECU1 ECU ECU ECU ECU FPGA ECU main() { int i, j, k for { } 1 GP-GPU
EGunGPU
Super Computing in Accelerator simulations - Electron Gun simulation using GPGPU - K. Ohmi, KEK-Accel Accelerator Physics seminar 2009.11.19 Super computers in KEK HITACHI SR11000 POWER5 16 24GB 16 134GFlops,
II 2 II
II 2 II 2005 [email protected] 2005 4 1 1 2 5 2.1.................................... 5 2.2................................. 6 2.3............................. 6 2.4.................................
( )
18 10 01 ( ) 1 2018 4 1.1 2018............................... 4 1.2 2018......................... 5 2 2017 7 2.1 2017............................... 7 2.2 2017......................... 8 3 2016 9 3.1 2016...............................
untitled
PC [email protected] muscle server blade server PC PC + EHPC/Eric (Embedded HPC with Eric) 1216 Compact PCI Compact PCIPC Compact PCISH-4 Compact PCISH-4 Eric Eric EHPC/Eric EHPC/Eric Gigabit
4 倍精度基本線形代数ルーチン群 QPBLAS の紹介 [index] 1. Introduction 2. Double-double algorithm 3. QPBLAS 4. QPBLAS-GPU 5. Summary 佐々成正 1, 山田進 1, 町田昌彦 1, 今村俊幸 2, 奥田洋司
4 倍精度基本線形代数ルーチン群 QPBLAS の紹介 [index] 1. Introduction 2. Double-double algorithm 3. QPBLAS 4. QPBLAS-GPU 5. Summary 佐々成正 1, 山田進 1, 町田昌彦 1, 今村俊幸 2, 奥田洋司 3 1 1 日本原子力研究開発機構システム計算科学センター 2 理科学研究所計算科学研究機構 3 東京大学新領域創成科学研究科
120 9 I I 1 I 2 I 1 I 2 ( a) ( b) ( c ) I I 2 I 1 I ( d) ( e) ( f ) 9.1: Ampère (c) (d) (e) S I 1 I 2 B ds = µ 0 ( I 1 I 2 ) I 1 I 2 B ds =0. I 1 I 2
9 E B 9.1 9.1.1 Ampère Ampère Ampère s law B S µ 0 B ds = µ 0 j ds (9.1) S rot B = µ 0 j (9.2) S Ampère Biot-Savart oulomb Gauss Ampère rot B 0 Ampère µ 0 9.1 (a) (b) I B ds = µ 0 I. I 1 I 2 B ds = µ 0
I y = f(x) a I a x I x = a + x 1 f(x) f(a) x a = f(a + x) f(a) x (11.1) x a x 0 f(x) f(a) f(a + x) f(a) lim = lim x a x a x 0 x (11.2) f(x) x
11 11.1 I y = a I a x I x = a + 1 f(a) x a = f(a +) f(a) (11.1) x a 0 f(a) f(a +) f(a) = x a x a 0 (11.) x = a a f (a) d df f(a) (a) I dx dx I I I f (x) d df dx dx (x) [a, b] x a ( 0) x a (a, b) () [a,
strtok-count.eps
IoT FPGA 2016/12/1 IoT FPGA 200MHz 32 ASCII PCI Express FPGA OpenCL (Volvox) Volvox CPU 10 1 IoT (Internet of Things) 2020 208 [1] IoT IoT HTTP JSON ( Python Ruby) IoT IoT IoT (Hadoop [2] ) AI (Artificial
3 SIMPLE ver 3.2: SIMPLE (SIxteen-bit MicroProcessor for Laboratory Experiment) 1 16 SIMPLE SIMPLE 2 SIMPLE 2.1 SIMPLE (main memo
3 SIMPLE ver 3.2: 20190404 1 3 SIMPLE (SIxteen-bit MicroProcessor for Laboratory Experiment) 1 16 SIMPLE SIMPLE 2 SIMPLE 2.1 SIMPLE 1 16 16 (main memory) 16 64KW a (C )*(a) (register) 8 r[0], r[1],...,
y π π O π x 9 s94.5 y dy dx. y = x + 3 y = x logx + 9 s9.6 z z x, z y. z = xy + y 3 z = sinx y 9 s x dx π x cos xdx 9 s93.8 a, fx = e x ax,. a =
[ ] 9 IC. dx = 3x 4y dt dy dt = x y u xt = expλt u yt λ u u t = u u u + u = xt yt 6 3. u = x, y, z = x + y + z u u 9 s9 grad u ux, y, z = c c : grad u = u x i + u y j + u k i, j, k z x, y, z grad u v =
92% TEL ディー クルー テクノロジーズ株式会社
92% TEL.050006409 0006409 http://www.logitec.co.jp/data_recovery/ ディー クルー テクノロジーズ株式会社 http://www.hagisol.co.jp BXPCCARAMX6S BXPCCBYTMN20 40 0 30 65 2022 年まで 産予定 は変更する可能性があります 2020 年まで 産予定 は変更する可能性があります
zz + 3i(z z) + 5 = 0 + i z + i = z 2i z z z y zz + 3i (z z) + 5 = 0 (z 3i) (z + 3i) = 9 5 = 4 z 3i = 2 (3i) zz i (z z) + 1 = a 2 {
04 zz + iz z) + 5 = 0 + i z + i = z i z z z 970 0 y zz + i z z) + 5 = 0 z i) z + i) = 9 5 = 4 z i = i) zz i z z) + = a {zz + i z z) + 4} a ) zz + a + ) z z) + 4a = 0 4a a = 5 a = x i) i) : c Darumafactory
GPU GPU CPU CPU CPU GPU GPU N N CPU ( ) 1 GPU CPU GPU 2D 3D CPU GPU GPU GPGPU GPGPU 2 nvidia GPU CUDA 3 GPU 3.1 GPU Core 1
GPU 4 2010 8 28 1 GPU CPU CPU CPU GPU GPU N N CPU ( ) 1 GPU CPU GPU 2D 3D CPU GPU GPU GPGPU GPGPU 2 nvidia GPU CUDA 3 GPU 3.1 GPU Core 1 Register & Shared Memory ( ) CPU CPU(Intel Core i7 965) GPU(Tesla
( )/2 hara/lectures/lectures-j.html 2, {H} {T } S = {H, T } {(H, H), (H, T )} {(H, T ), (T, T )} {(H, H), (T, T )} {1
( )/2 http://www2.math.kyushu-u.ac.jp/ hara/lectures/lectures-j.html 1 2011 ( )/2 2 2011 4 1 2 1.1 1 2 1 2 3 4 5 1.1.1 sample space S S = {H, T } H T T H S = {(H, H), (H, T ), (T, H), (T, T )} (T, H) S
倍々精度RgemmのnVidia C2050上への実装と応用
.. [email protected] http://accc.riken.jp/maho/,,, 2011/2/16 1 - : GPU : SDPA-DD 10 1 - Rgemm : 4 (32 ) nvidia C2050, GPU CPU 150, 24GFlops 25 20 GFLOPS 15 10 QuadAdd Cray, QuadMul Sloppy Kernel QuadAdd Cray,
(ii) (iii) z a = z a =2 z a =6 sin z z a dz. cosh z z a dz. e z dz. (, a b > 6.) (z a)(z b) 52.. (a) dz, ( a = /6.), (b) z =6 az (c) z a =2 53. f n (z
B 4 24 7 9 ( ) :,..,,.,. 4 4. f(z): D C: D a C, 2πi C f(z) dz = f(a). z a a C, ( ). (ii), a D, a U a,r D f. f(z) = A n (z a) n, z U a,r, n= A n := 2πi C f(ζ) dζ, n =,,..., (ζ a) n+, C a D. (iii) U a,r
29
9 .,,, 3 () C k k C k C + C + C + + C 8 + C 9 + C k C + C + C + C 3 + C 4 + C 5 + + 45 + + + 5 + + 9 + 4 + 4 + 5 4 C k k k ( + ) 4 C k k ( k) 3 n( ) n n n ( ) n ( ) n 3 ( ) 3 3 3 n 4 ( ) 4 4 4 ( ) n n
数学の基礎訓練I
I 9 6 13 1 1 1.1............... 1 1................ 1 1.3.................... 1.4............... 1.4.1.............. 1.4................. 3 1.4.3........... 3 1.4.4.. 3 1.5.......... 3 1.5.1..............
1 1 1 1 1 1 2 f z 2 C 1, C 2 f 2 C 1, C 2 f(c 2 ) C 2 f(c 1 ) z C 1 f f(z) xy uv ( u v ) = ( a b c d ) ( x y ) + ( p q ) (p + b, q + d) 1 (p + a, q + c) 1 (p, q) 1 1 (b, d) (a, c) 2 3 2 3 a = d, c = b
AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted
DEGIMA LINPACK Energy Performance for LINPACK Benchmark on DEGIMA 1 AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK 1.4698 GFlops/Watt 1.9658 GFlops/Watt Abstract GPU Computing has
68 A mm 1/10 A. (a) (b) A.: (a) A.3 A.4 1 1
67 A Section A.1 0 1 0 1 Balmer 7 9 1 0.1 0.01 1 9 3 10:09 6 A.1: A.1 1 10 9 68 A 10 9 10 9 1 10 9 10 1 mm 1/10 A. (a) (b) A.: (a) A.3 A.4 1 1 A.1. 69 5 1 10 15 3 40 0 0 ¾ ¾ É f Á ½ j 30 A.3: A.4: 1/10
A Responsive Processor for Parallel/Distributed Real-time Processing
E-mail: yamasaki@{ics.keio.ac.jp, etl.go.jp} http://www.ny.ics.keio.ac.jp etc. CPU) I/O I/O or Home Automation, Factory Automation, (SPARC) (SDRAM I/F, DMAC, PCI, USB, Timers/Counters, SIO, PIO, )
HP Blade Workstation HP RCS Remote Client Solution HP Blade Workstation CO2 2
HP Blade Workstation HP RCS Remote Client Solution HP Blade Workstation CO2 2 3D CAD HP Remote Graphics 3 HP Blade Workstation OS IT HP Blade Workstation IT TCO 4 IT HP Blade Workstation HP Blade Workstation
システムソリューションのご紹介
HP 2 C 製品 :VXPRO/VXSMP サーバ 製品アップデート 製品アップデート VXPRO と VXSMP での製品オプションの追加 8 ポート InfiniBand スイッチ Netlist HyperCloud メモリ VXPRO R2284 GPU サーバ 製品アップデート 8 ポート InfiniBand スイッチ IS5022 8 ポート 40G InfiniBand スイッチ
IPSJ SIG Technical Report Vol.2013-ARC-203 No /2/1 SMYLE OpenCL (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 1
SMYLE OpenCL 128 1 1 1 1 1 2 2 3 3 3 (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 128 SMYLEref SMYLE OpenCL SMYLE OpenCL Implementation and Evaluations on 128 Cores Takuji Hieda 1 Noriko Etani
13,825,228 3,707,995 26.8 4.9 25 3 8 9 1 50,000 0.29 1.59 70,000 0.29 1.74 12,500 0.39 1.69 12,500 0.55 10,000 20,000 0.13 1.58 30,000 0.00 1.26 5,000 0.13 1.58 25,000 40,000 0.13 1.58 50,000 0.00 1.26
1 8, : 8.1 1, 2 z = ax + by + c ax by + z c = a b +1 x y z c = 0, (0, 0, c), n = ( a, b, 1). f = n i=1 a ii x 2 i + i<j 2a ij x i x j = ( x, A x), f =
1 8, : 8.1 1, z = ax + by + c ax by + z c = a b +1 x y z c = 0, (0, 0, c), n = ( a, b, 1). f = a ii x i + i
( )
1. 2. 3. 4. 5. ( ) () http://www-astro.physics.ox.ac.uk/~wjs/apm_grey.gif http://antwrp.gsfc.nasa.gov/apod/ap950917.html ( ) SDSS : d 2 r i dt 2 = Gm jr ij j i rij 3 = Newton 3 0.1% 19 20 20 2 ( ) 3 3
FIT2013( 第 12 回情報科学技術フォーラム ) I-032 Acceleration of Adaptive Bilateral Filter base on Spatial Decomposition and Symmetry of Weights 1. Taiki Makishi Ch
I-032 Acceleration of Adaptive Bilateral Filter base on Spatial Decomposition and Symmetry of Weights 1. Taiki Makishi Chikatoshi Yamada Shuichi Ichikawa Gaussian Filter GF GF Bilateral Filter BF CG [1]
PROSTAGE[プロステージ]
PROSTAGE & L 2 3200 650 2078 Storage system Panel system 3 esk system 2 250 22 01 125 1 2013-2014 esk System 2 L4OA V 01 2 L V L V OA 4 3240 32 2 7 4 OA P202 MG55 MG57 MG56 MJ58 MG45 MG55 MB95 Z712 MG57
IoTを加速するエッジコンピューティング HPE Edgeline Converged IoT Systems
IoT を加速するエッジコンピューティング HPE Edgeline Converged IoT Systems エッジコンピューティングが IoT にまったく新しい価値を創造する IoT Internet of Things IoT 域コスト情報漏えい設備重複データ破損コンプライアンス HPEIoT 64CPUHPE Edgeline Converged IoT Systems IoT データ転送に伴うネットワークコストの増加
.5 z = a + b + c n.6 = a sin t y = b cos t dy d a e e b e + e c e e e + e 3 s36 3 a + y = a, b > b 3 s363.7 y = + 3 y = + 3 s364.8 cos a 3 s365.9 y =,
[ ] IC. r, θ r, θ π, y y = 3 3 = r cos θ r sin θ D D = {, y ; y }, y D r, θ ep y yddy D D 9 s96. d y dt + 3dy + y = cos t dt t = y = e π + e π +. t = π y =.9 s6.3 d y d + dy d + y = y =, dy d = 3 a, b
7. y fx, z gy z gfx dz dx dz dy dy dx. g f a g bf a b fa 7., chain ule Ω, D R n, R m a Ω, f : Ω R m, g : D R l, fω D, b fa, f a g b g f a g f a g bf a
9 203 6 7 WWW http://www.math.meiji.ac.jp/~mk/lectue/tahensuu-203/ 2 8 8 7. 7 7. y fx, z gy z gfx dz dx dz dy dy dx. g f a g bf a b fa 7., chain ule Ω, D R n, R m a Ω, f : Ω R m, g : D R l, fω D, b fa,
BIT -2-
2004.3.31 10 11 12-1- BIT -2- -3-256 258 932 524 585 -4- -5- A B A B AB A B A B C AB A B AB AB AB AB -6- -7- A B -8- -9- -10- mm -11- fax -12- -13- -14- -15- s58.10.1 1255 4.2 30.10-16- -17- -18- -19-6.12.10
1 (bit ) ( ) PC WS CPU IEEE754 standard ( 24bit) ( 53bit)
GNU MP BNCpack [email protected] 2002 9 20 ( ) Linux Conference 2002 1 1 (bit ) ( ) PC WS CPU IEEE754 standard ( 24bit) ( 53bit) 10 2 2 3 4 5768:9:; = %? @BADCEGFH-I:JLKNMNOQP R )TSVU!" # %$ & " #
untitled
A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }
スライド 1
Zabbix のデータベース ベンチマークレポート PostgreSQL vs MySQL Yoshiharu Mori SRA OSS Inc. Japan Agenda はじめに Simple test 大量のアイテムを設定 Partitioning test パーティションイングを利用して計測 Copyright 2013 SRA OSS, Inc. Japan All rights reserved.
ATLAS 2011/3/25-26
ATLAS 2011/3/25-26 2 LHC (Large Hadron Collider)/ATLAS LHC - CERN - s=7 TeV ATLAS - LHC 1 Higgs 44 m 44m 22m 7000t 22 m 3 SCT( ) SCT(SemiConductor Tracker) - - 100 fb -1 SCT 3 SCT( ) R eta=1.0 eta=1.5
02_Matrox Frame Grabbers_1612
Matrox - - Frame Grabbers MatroxRadient ev-cxp Equalizer Equalizer Equalizer Equalizer 6.25 Gbps 20 Mbps Stream channel Control channel Stream channel Control channel Stream channel Control channel Stream
211 [email protected] 1 R *1 n n R n *2 R n = {(x 1,..., x n ) x 1,..., x n R}. R R 2 R 3 R n R n R n D D R n *3 ) (x 1,..., x n ) f(x 1,..., x n ) f D *4 n 2 n = 1 ( ) 1 f D R n f : D R 1.1. (x,
26 FPGA 11 05340 1 FPGA (Field Programmable Gate Array) ASIC (Application Specific Integrated Circuit) FPGA FPGA FPGA FPGA Linux FreeDOS skewed way L1
FPGA 272 11 05340 26 FPGA 11 05340 1 FPGA (Field Programmable Gate Array) ASIC (Application Specific Integrated Circuit) FPGA FPGA FPGA FPGA Linux FreeDOS skewed way L1 FPGA skewed L2 FPGA skewed Linux
ProLiant BL25p Generation 2システム構成図
HP ProLiant BL p-class Server BL25p Generation 2 2007 11 15 1 OVERVIEW ProLiant BL25p Generation 2 HP BladeSystem p-class Hardware Component BladeSystem p-class BladeSystem p-class BladeSystem p-class
<4D F736F F F696E74202D2091E63489F15F436F6D C982E682E992B48D8291AC92B489B F090CD2888F38DFC E B8CDD8
Web キャンパス資料 超音波シミュレーションの基礎 ~ 第 4 回 ComWAVEによる超高速超音波解析 ~ 科学システム開発部 Copyright (c)2006 ITOCHU Techno-Solutions Corporation 本日の説明内容 ComWAVEの概要および特徴 GPGPUとは GPGPUによる解析事例 CAE POWER 超音波研究会開催 (10 月 3 日 ) のご紹介
Ver. 3.8 Ver NOTE E v3 2.4GHz, 20M cache, 8.00GT/s QPI,, HT, 8C/16T 85W E v3 1.6GHz, 15M cache, 6.40GT/s QPI,
PowerEdge T630 Contents RAID /RAID & PCIe GPU OS v3.8 Apr. 2017 P3-5 P6 P7 P8-9 P10-11 P12-16 P17-79 P80-85 P86-87 P88-90 P90 P91-92 P93-96 P97-100 P101-107 P107-108 P109-110 2017 4 28 2016 4 22 Ver. 3.8
DRAM SRAM SDRAM (Synchronous DRAM) DDR SDRAM (Double Data Rate SDRAM) DRAM 4 C Wikipedia 1.8 SRAM DRAM DRAM SRAM DRAM SRAM (256M 1G bit) (32 64M bit)
2016.4.1 II ( ) 1 1.1 DRAM RAM DRAM DRAM SRAM RAM SRAM SRAM SRAM SRAM DRAM SRAM SRAM DRAM SRAM 1.2 (DRAM, Dynamic RAM) (SRAM, Static RAM) (RAM Random Access Memory ) DRAM 1 1 1 1 SRAM 4 1 2 DRAM 4 DRAM
WJ-HD SHIFT /0 PULL Digital Disk Recorder WJ-HD 316
WJ-HD36 SHIFT 3 4 5 6 7 8 9 0/0 PULL 3 4 5 6 Digital Disk Recorder WJ-HD 36 q w e r t y 3 4 5 6 7 8 9 0 3 4 5 q w 6 q w e r t y 7 SHIFT 3 4 5 6 7 8 9 0/0 HDD HDD 3 4 5 6 8 9 PULL Digital Disk Recorder
1 28 6 12 7 1 7.1...................................... 2 7.1.1............................... 2 7.1.2........................... 2 7.2...................................... 3 7.3...................................
スライド 1
GPU クラスタによる格子 QCD 計算 広大理尾崎裕介 石川健一 1.1 Introduction Graphic Processing Units 1 チップに数百個の演算器 多数の演算器による並列計算 ~TFLOPS ( 単精度 ) CPU 数十 GFLOPS バンド幅 ~100GB/s コストパフォーマンス ~$400 GPU の開発環境 NVIDIA CUDA http://www.nvidia.co.jp/object/cuda_home_new_jp.html
「FPGAを用いたプロセッサ検証システムの製作」
FPGA 2210010149-5 2005 2 21 RISC Verilog-HDL FPGA (celoxica RC100 ) LSI LSI HDL CAD HDL 3 HDL FPGA MPU i 1. 1 2. 3 2.1 HDL FPGA 3 2.2 5 2.3 6 2.3.1 FPGA 6 2.3.2 Flash Memory 6 2.3.3 Flash Memory 7 2.3.4
fx-3650P_fx-3950P_J
SA1109-E J fx-3650p fx-3950p http://edu.casio.jp RCA500002-001V04 AB2 Mode
HP Workstation 総合カタログ
HP Workstation E5 v2 Z Z SFF E5 v2 2 HP Windows Z 3 Performance Innovation Reliability 3 HPZ HP HP Z820 Workstation P.11 HP Z620 Workstation & CPU P.12 HP Z420 Workstation P.13 17.3in WIDE HP ZBook 17
No2 4 y =sinx (5) y = p sin(2x +3) (6) y = 1 tan(3x 2) (7) y =cos 2 (4x +5) (8) y = cos x 1+sinx 5 (1) y =sinx cos x 6 f(x) = sin(sin x) f 0 (π) (2) y
No1 1 (1) 2 f(x) =1+x + x 2 + + x n, g(x) = 1 (n +1)xn + nx n+1 (1 x) 2 x 6= 1 f 0 (x) =g(x) y = f(x)g(x) y 0 = f 0 (x)g(x)+f(x)g 0 (x) 3 (1) y = x2 x +1 x (2) y = 1 g(x) y0 = g0 (x) {g(x)} 2 (2) y = µ
< 1 > (1) f 0 (a) =6a ; g 0 (a) =6a 2 (2) y = f(x) x = 1 f( 1) = 3 ( 1) 2 =3 ; f 0 ( 1) = 6 ( 1) = 6 ; ( 1; 3) 6 x =1 f(1) = 3 ; f 0 (1) = 6 ; (1; 3)
< 1 > (1) f 0 (a) =6a ; g 0 (a) =6a 2 (2) y = f(x) x = 1 f( 1) = 3 ( 1) 2 =3 ; f 0 ( 1) = 6 ( 1) = 6 ; ( 1; 3) 6 x =1 f(1) = 3 ; f 0 (1) = 6 ; (1; 3) 6 y = g(x) x = 1 g( 1) = 2 ( 1) 3 = 2 ; g 0 ( 1) =
RW1097-0A-001_V0.1_170106
INTRODUCTION RW1097 is a dot matrix LCD driver & controller LSI which is fabricated by low power CMOS technology. It can display 1line/2line/3line/4line/5line/6lines x 12 (16 x 16 dot format) with the
(1.2) T D = 0 T = D = 30 kn 1.2 (1.4) 2F W = 0 F = W/2 = 300 kn/2 = 150 kn 1.3 (1.9) R = W 1 + W 2 = = 1100 N. (1.9) W 2 b W 1 a = 0
1 1 1.1 1.) T D = T = D = kn 1. 1.4) F W = F = W/ = kn/ = 15 kn 1. 1.9) R = W 1 + W = 6 + 5 = 11 N. 1.9) W b W 1 a = a = W /W 1 )b = 5/6) = 5 cm 1.4 AB AC P 1, P x, y x, y y x 1.4.) P sin 6 + P 1 sin 45
lll
lll HA8000/30W アーキテクチャー HA8000/30W A8,B8,C8 Intel Intel845 Pentium 4(2.60GHz/2.40GHz) celeron (2.0GHz) Intel Intel845 1way 2GB Pentium 4 Celeron CPU Host Bus 64bit Bus:400MHz:MAX 3.2GB/s PCI AGP (Intel845)
P33W・P28X カタログ
P33WP28X Windows 10 24 FC-PM IoT 24 Windows 10Windows 7 2 FC98-NXP33WP28X PC FC-PM P33WP28X PC ACC 1 1HDD1 1 2HDD2 1 AC 1 2 USB 3 USB3.0 USB 4 USB3.0 USB 5 USB3.0 USB 6 USB3.0 USB 7 USB3.0 USB 8 USB3.0
main.dvi
2 f(z) 0 f(z 0 ) lim z!z 0 z 0 z 0 z z 1z = z 0 z 0 1z z z 0 1z 21 22 2 2 (1642-1727) (1646-1716) (1777-1855) (1789-1857) (1826-1866) 18 19 2.1 2.1.1 12 z w w = f (z) (2.1) f(z) z w = f(z) z z 0 w w 0
sec13.dvi
13 13.1 O r F R = m d 2 r dt 2 m r m = F = m r M M d2 R dt 2 = m d 2 r dt 2 = F = F (13.1) F O L = r p = m r ṙ dl dt = m ṙ ṙ + m r r = r (m r ) = r F N. (13.2) N N = R F 13.2 O ˆn ω L O r u u = ω r 1 1:
II (10 4 ) 1. p (x, y) (a, b) ε(x, y; a, b) 0 f (x, y) f (a, b) A, B (6.5) y = b f (x, b) f (a, b) x a = A + ε(x, b; a, b) x a 2 x a 0 A = f x (
II (1 4 ) 1. p.13 1 (x, y) (a, b) ε(x, y; a, b) f (x, y) f (a, b) A, B (6.5) y = b f (x, b) f (a, b) x a = A + ε(x, b; a, b) x a x a A = f x (a, b) y x 3 3y 3 (x, y) (, ) f (x, y) = x + y (x, y) = (, )
