¹âÀºÅÙÀþ·ÁÂå¿ô±é»»¥é¥¤¥Ö¥é¥êMPACK¤Î³«È¯

Size: px
Start display at page:

Download "¹âÀºÅÙÀþ·ÁÂå¿ô±é»»¥é¥¤¥Ö¥é¥êMPACK¤Î³«È¯"

Transcription

1 /11/27@

2 . MPACK 0.6.7: Multiple precision version of BLAS and LAPACK MPACK (MBLAS and MLAPACK) BLAS/LAPACK (API) (2010/8/20); MLAPACK, MBLAS : Linux/Windows/MacOSX/FreeBSD : MPFR/MPC/GMP/DD/QD/double C++; 2- BSD ;

3 . Top500 Linpack 1... BLAS/LAPACK

4 . BLAS LAPACK? BLAS: -, -, - API : GotoBLAS, Intel MKL, ATLAS LAPACK: BLAS ( ) Netlib 8400 (2010/11/27) BLAS/LAPACK

5 . BLAS LAPACK? BLAS: -, -, - API : GotoBLAS, Intel MKL, ATLAS LAPACK: BLAS ( ) Netlib 8400 (2010/11/27) BLAS/LAPACK

6 . MPACK 0.6.7: Multiple precision version of BLAS and LAPACK MPACK (MBLAS and MLAPACK) BLAS/LAPACK (API) (2010/8/20); MLAPACK, MBLAS : Linux/Windows/MacOSX/FreeBSD : MPFR/MPC/GMP/DD/QD/double C++; 2- BSD ;

7 . MPACK 4,8 : : QD, DD : : GMP : ( ): MPFR/MPC : double

8 . ; ; Prefix float, double R eal, complex, double complex C omplex. daxpy, zaxpy Raxpy, Caxpy dgemm, zgemm Rgemm, Cgemm dsterf, dsyev Rsterf, Rsyev dzabs1, dzasum RCabs1, RCasum

9 . MBLAS LEVEL1 MBLAS Crotg Cscal Rrotg Rrot Rrotm CRrot Cswap Rswap CRscal Rscal Ccopy Rcopy Caxpy Raxpy Rdot Cdotc Cdotu RCnrm2 Rnrm2 Rasum icasum iramax RCabs1 Mlsame Mxerbla LEVEL2 MBLAS Cgemv Rgemv Cgbmv Rgbmv Chemv Chbmv Chpmv Rsymv Rsbmv Ctrmv Cgemv Rgemv Cgbmv Rgemv Chemv Chbmv Chpmv Rsymv Rsbmv Rspmv Ctrmv Rtrmv Ctbmv Ctpmv Rtpmv Ctrsv Rtrsv Ctbsv Rtbsv Ctpsv Rger Cgeru Cgerc Cher Chpr Cher2 Chpr2 Rsyr Rspr Rsyr2 Rspr2 LEVEL3 MBLAS Cgemm Rgemm Csymm Rsymm Chemm Csyrk Rsyrk Cherk Csyr2k Rsyr2k Cher2k Ctrmm Rtrmm Ctrsm Rtrsm

10 . MLAPACK Mutils Rlamch Rlae2 Rlaev2 Claev2 Rlassq Classq Rlanst Clanht Rlansy Clansy Clanhe Rlapy2 Rlarfg Rlapy3 Rladiv Cladiv Clarfg Rlartg Clartg Rlaset Claset Rlasr Clasr Rpotf2 Clacgv Cpotf2 Rlascl Clascl Rlasrt Rsytd2 Chetd2 Rsteqr Csteqr Rsterf Rlarf Clarf Rorg2l Cung2l Rorg2r Cung2r Rlarft Clarft Rlarfb Clarfb Rorgqr Cungqr Rorgql Cungql Rlatrd Clatrd Rsytrd Chetrd Rorgtr Cungtr Rsyev Cheev Rpotrf Cpotrf Clacrm Rtrti2 Ctrti2 Rtrtri Ctrtri Rgetf2 Cgetf2 Rlaswp Claswp Rgetrf Cgetrf Rgetri Cgetri Rgetrs Cgetrs Rgesv Cgesv Rtrtrs Ctrtrs Rlasyf Clasyf Clahef Clacrt Claesy Crot Cspmv Cspr Csymv Csyr icmax1 RCsum1 Rpotrs Rposv Rgeequ Rlatrs Rlange Rgecon Rlauu2 Rlauum Rpotri Rpocon

11 . ; ; call by value or call by reference MBLAS/MLAPACK: Rgemm("n", "n", n, n, n, alpha, A, n, B, n, beta, C, n); Rgetrf(n, n, A, n, ipiv, &info); Rgetri(n, A, n, ipiv, work, lwork, &info); Rsyev("V", "U", n, A, n, w, work, &lwork, &info); BLAS/LAPACK: dgemm_f77("n", "N", &n, &n, &n, &One, A, &n, A, &n, &Zero, C, & dgetri_f77(&n, A, &n, ipiv, work, &lwork, &info); (C++/C BLAS/LAPACK lapack.h, blas.h!)

12 . MBLAS : Caxpy; axpy void Caxpy(INTEGER n, COMPLEX ca, COMPLEX * cx, INTEGER incx, COMPLEX { REAL Zero = 0.0; if (n <= 0) return; if (RCabs1(ca) == Zero) return; INTEGER ix = 0; INTEGER iy = 0; if (incx < 0) ix = (-n + 1) * incx; if (incy < 0) iy = (-n + 1) * incy; for (INTEGER i = 0; i < n; i++) { cy[iy] = cy[iy] + ca * cx[ix]; ix = ix + incx;

13 . MLAPACK Rsyev; Rlascl(uplo, 0, 0, One, sigma, n, n, A, lda, info); } //Call DSYTRD to reduce symmetric matrix to tridiagonal form. inde = 1; indtau = inde + n; indwrk = indtau + n; llwork = *lwork - indwrk + 1; Rsytrd(uplo, n, &A[0], lda, &w[0], &work[inde - 1], &work[indtau - &work[indwrk - 1], llwork, &iinfo); //For eigenvalues only, call DSTERF. For eigenvectors, first call //DORGTR to generate the orthogonal matrix, then call DSTEQR. if (!wantz) { Rsterf(n, &w[0], &work[inde - 1], info); } else { Rorgtr(uplo, n, A, lda, &work[indtau - 1], &work[indwrk - 1], &iinfo); Rsteqr(jobz, n, w, &work[inde - 1], A, lda, &work[indtau - 1], } //If matrix was scaled, then rescale eigenvalues appropriately. if (iscale == 1) { if (*info == 0) {

14 . MPACK (MBLAS/MLAPACK) : :SDPA-GMP, -QD, -DD ( ; ) Google Multiple precision BLAS MPACK Project web hits , download (2010/11/24)

15 . MPACK (MBLAS/MLAPACK) Core i7 920/Ubuntu 10.04/9.04 amd64 DD(double-double) Raxpy 600MFlops (mpack 0.6.6; gotoblas 1.5GFlops) Rgemv 140MFlops (gotoblas 3.8GFlops) Rgemm 140MFlops (gotoblas 42.5GFlops; 2010: 30GFlops on RADEON HD GFlops) Rsyev 112s (1000 ; gotoblas 1.83s)

16 . : SDPA-GMP, SDPA-QD, SDPA-DD: Powered by MPACK

17 . MPACK 0.6.7: Multiple precision version of BLAS and LAPACK MPACK (MBLAS and MLAPACK) BLAS/LAPACK (API) (2010/8/20); MLAPACK, MBLAS : Linux/Windows/MacOSX/FreeBSD : MPFR/MPC/GMP/DD/QD/double C++; 2- BSD ;

MPACK 0.6.0: 多倍長精度版のBLAS/LAPACKの作成

MPACK 0.6.0: 多倍長精度版のBLAS/LAPACKの作成 MPACK 0.6.0: BLAS/LAPACK maho@riken.jp 2009/12/11 MPACK(MBLAS/MLAPACK) ( ) (2007-2010) ( ) http://accc.riken.jp/maho/ MPACK: BLAS/LAPACK http://mplapack.sourceforge.net/ MBLAS: BLAS(Basic Linear Algebra

More information

MBLAS¤ÈMLAPACK; ¿ÇÜĹÀºÅÙÈǤÎBLAS/LAPACK¤ÎºîÀ®

MBLAS¤ÈMLAPACK; ¿ÇÜĹÀºÅÙÈǤÎBLAS/LAPACK¤ÎºîÀ® MBLAS MLAPACK; BLAS/LAPACK maho@riken.jp February 23, 2009 MPACK(MBLAS/MLAPACK) ( ) (2007 ) ( ) http://accc.riken.jp/maho/ BLAS/LAPACK http://mplapack.sourceforge.net/ BLAS (Basic Linear Algebra Subprograms)

More information

倍々精度RgemmのnVidia C2050上への実装と応用

倍々精度RgemmのnVidia C2050上への実装と応用 .. maho@riken.jp http://accc.riken.jp/maho/,,, 2011/2/16 1 - : GPU : SDPA-DD 10 1 - Rgemm : 4 (32 ) nvidia C2050, GPU CPU 150, 24GFlops 25 20 GFLOPS 15 10 QuadAdd Cray, QuadMul Sloppy Kernel QuadAdd Cray,

More information

211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS /1/18 a a 1 a 2 a 3 a a GPU Graphics Processing Unit GPU CPU GPU GPGPU G

211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS /1/18 a a 1 a 2 a 3 a a GPU Graphics Processing Unit GPU CPU GPU GPGPU G 211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS211 211/1/18 GPU 4 8 BLAS 4 8 BLAS Basic Linear Algebra Subprograms GPU Graphics Processing Unit 4 8 double 2 4 double-double DD 4 4 8 quad-double

More information

11020070-0_Vol16No2.indd

11020070-0_Vol16No2.indd 2552 チュートリアル BLAS, LAPACK 2 1 BLAS, LAPACKチュートリアル パート1 ( 簡 単 な 使 い 方 とプログラミング) 中 田 真 秀 1 読 者 の 想 定 BLAS [1], LAPACK [2] 2 線 形 代 数 の 重 要 性 について Google Page Rank 3D CPU 筆 者 紹 介 BLAS LAPACK http://accc.riken.jp/maho/

More information

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1 .. BLAS LAPACK 1, 2013/05/23 CMSI A 1 / 43 BLAS LAPACK (I) BLAS, LAPACK BLAS : - LAPACK : 2 / 43 : 3 / 43 (wikipedia) V : f : V u, v u + v u + v V α K u V αu V V x, y f (x + y) = f (x) + f (y) V x K α

More information

ストリーミング SIMD 拡張命令2 (SSE2) を使用した SAXPY/DAXPY

ストリーミング SIMD 拡張命令2 (SSE2) を使用した SAXPY/DAXPY SIMD 2(SSE2) SAXPY/DAXPY 2.0 2000 7 : 248600J-001 01/12/06 1 305-8603 115 Fax: 0120-47-8832 * Copyright Intel Corporation 1999, 2000 01/12/06 2 1...5 2 SAXPY DAXPY...5 2.1 SAXPY DAXPY...6 2.1.1 SIMD C++...6

More information

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1 1 / 50 BLAS LAPACK 1, 2015/05/21 CMSI A 2 / 50 BLAS LAPACK (I) BLAS, LAPACK BLAS : - LAPACK : 3 / 50 ( ) 1000 ( ; 1 2 ) :... 3 / 50 ( ) 1000 ( ; 1 2 ) :... 3 / 50 ( ) 1000 ( ; 1 2 ) :... 3 / 50 ( ) 1000

More information

KBLAS[7] *1., CUBLAS.,,, Byte/flop., [13] 1 2. (AT). GPU AT,, GPU SYMV., SYMV CUDABLAS., (double, float) (cu- FloatComplex, cudoublecomplex).,, DD(dou

KBLAS[7] *1., CUBLAS.,,, Byte/flop., [13] 1 2. (AT). GPU AT,, GPU SYMV., SYMV CUDABLAS., (double, float) (cu- FloatComplex, cudoublecomplex).,, DD(dou Vol.214-HPC-146 No.14 214/1/3 CUDA-xSYMV 1,3,a) 1 2,3 2,3 (SYMV)., (GEMV) 2.,, mutex., CUBLAS., 1 2,. (AT). 2, SYMV GPU., SSYMV( SYMV), GeForce GTXTitan Black 211GFLOPS( 62.8%)., ( ) (, ) DD(double-double),

More information

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1 1 / 56 BLAS LAPACK 1, 2017/05/25 CMSI A 2 / 56 BLAS LAPACK (I) BLAS, LAPACK BLAS : - LAPACK : 3 / 56 ( ) 1000 ( ; 1 2 ) :... 3 / 56 ( ) 1000 ( ; 1 2 ) :... 3 / 56 ( ) 1000 ( ; 1 2 ) :... 3 / 56 ( ) 1000

More information

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2 FFT 1 Fourier fast Fourier transform FFT FFT FFT 1 FFT FFT 2 Fourier 2.1 Fourier FFT Fourier discrete Fourier transform DFT DFT n 1 y k = j=0 x j ω jk n, 0 k n 1 (1) x j y k ω n = e 2πi/n i = 1 (1) n DFT

More information

untitled

untitled 1 4 4 6 8 10 30 13 14 16 16 17 18 19 19 96 21 23 24 3 27 27 4 27 128 24 4 1 50 by ( 30 30 200 30 30 24 4 TOP 10 2012 8 22 3 1 7 1,000 100 30 26 3 140 21 60 98 88,000 96 3 5 29 300 21 21 11 21

More information

(Basic Theory of Information Processing) Fortran Fortan Fortan Fortan 1

(Basic Theory of Information Processing) Fortran Fortan Fortan Fortan 1 (Basic Theory of Information Processing) Fortran Fortan Fortan Fortan 1 17 Fortran Formular Tranlator Lapack Fortran FORTRAN, FORTRAN66, FORTRAN77, FORTRAN90, FORTRAN95 17.1 A Z ( ) 0 9, _, =, +, -, *,

More information

Untitled

Untitled VASP 2703 2006 3 VASP 100 PC 3,4 VASP VASP VASP FFT. (LAPACK,BLAS,FFT), CPU VASP. 1 C LAPACK,BLAS VASP VASP VASP VASP bench.hg VASP CPU CPU CPU northwood LAPACK lmkl lapack64, BLAS lmkl p4 LA- PACK liblapack,

More information

PowerPoint プレゼンテーション

PowerPoint プレゼンテーション 応用数理概論 準備 端末上で cd ~/ mkdir cppwork cd cppwork wget http://271.jp/gairon/main.cpp wget http://271.jp/gairon/matrix.hpp とコマンドを記入. ls とコマンドをうち,main.cppとmatrix.hppがダウンロードされていることを確認. 1 準備 コンパイル c++ -I. -std=c++0x

More information

2 2.1 Mac OS CPU Mac OS tar zxf zpares_0.9.6.tar.gz cd zpares_0.9.6 Mac Makefile Mekefile.inc cp Makefile.inc/make.inc.gfortran.seq.macosx make

2 2.1 Mac OS CPU Mac OS tar zxf zpares_0.9.6.tar.gz cd zpares_0.9.6 Mac Makefile Mekefile.inc cp Makefile.inc/make.inc.gfortran.seq.macosx make Sakurai-Sugiura z-pares 26 9 5 1 1 2 2 2.1 Mac OS CPU......................................... 2 2.2 Linux MPI............................................ 2 3 3 4 6 4.1 MUMPS....................................

More information

Second-semi.PDF

Second-semi.PDF PC 2000 2 18 2 HPC Agenda PC Linux OS UNIX OS Linux Linux OS HPC 1 1CPU CPU Beowulf PC (PC) PC CPU(Pentium ) Beowulf: NASA Tomas Sterling Donald Becker 2 (PC ) Beowulf PC!! Linux Cluster (1) Level 1:

More information

2... Numerical Recipes [1] Matrix Computation [2].,.. 2.1, ( ) A. A,.,.. A [ ] [ ] a x T 0 A =, P = I β [0 u T ], P = I βuu T, β = 2/ u 2 x B u P ( ),

2... Numerical Recipes [1] Matrix Computation [2].,.. 2.1, ( ) A. A,.,.. A [ ] [ ] a x T 0 A =, P = I β [0 u T ], P = I βuu T, β = 2/ u 2 x B u P ( ), T2K JST/CREST 1,.,, AX = XΛ AX = BXΛ. A, B (B ), Λ, X.,,., 1,.,.,,.., T2K.,, 1. T2K (HA8000),. eingen_s,, 64 (1024 ). T2K TIPS, T2K.. 1 2... Numerical Recipes [1] Matrix Computation [2].,.. 2.1, ( ) A.

More information

4 倍精度基本線形代数ルーチン群 QPBLAS の紹介 [index] 1. Introduction 2. Double-double algorithm 3. QPBLAS 4. QPBLAS-GPU 5. Summary 佐々成正 1, 山田進 1, 町田昌彦 1, 今村俊幸 2, 奥田洋司

4 倍精度基本線形代数ルーチン群 QPBLAS の紹介 [index] 1. Introduction 2. Double-double algorithm 3. QPBLAS 4. QPBLAS-GPU 5. Summary 佐々成正 1, 山田進 1, 町田昌彦 1, 今村俊幸 2, 奥田洋司 4 倍精度基本線形代数ルーチン群 QPBLAS の紹介 [index] 1. Introduction 2. Double-double algorithm 3. QPBLAS 4. QPBLAS-GPU 5. Summary 佐々成正 1, 山田進 1, 町田昌彦 1, 今村俊幸 2, 奥田洋司 3 1 1 日本原子力研究開発機構システム計算科学センター 2 理科学研究所計算科学研究機構 3 東京大学新領域創成科学研究科

More information

11050427-0_Vol16No3.indd

11050427-0_Vol16No3.indd 2599 チュートリアル BLAS, LAPACK 2 2 GPU BLAS, LAPACKチュートリアル パート2 (GPU 編 ) 中 田 真 秀 1 はじめに GPU Graphics Processing Unit BLAS, LAPACK GPU GPU NVIDIA AMD AMD RADEON HD NVIDIA NVIDIA GPU NVIDIA C2050 BLAS, LAPACK

More information

BLAS の概要

BLAS の概要 GotoBLAS チュートリアル 後藤和茂 ( テキサス州立大学 ) 26/12/9 Kazushige Goto (TACC) 1 自己紹介 お題目 数値計算と最適化の基本事項の確認 BLAS とは? GotoBLAS の特徴 Level 1 ~Level 3 ルーチンの構造と特徴 BLAS による最適化の限界 26/12/9 Kazushige Goto (TACC) 2 自己紹介 早稲田大学電気工学修士課程卒

More information

AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted

AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted DEGIMA LINPACK Energy Performance for LINPACK Benchmark on DEGIMA 1 AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK 1.4698 GFlops/Watt 1.9658 GFlops/Watt Abstract GPU Computing has

More information

子ども・子育て支援新制度 全国総合システム(仮称)に関するインターフェース仕様書 市町村・都道府県編(初版)

子ども・子育て支援新制度 全国総合システム(仮称)に関するインターフェース仕様書 市町村・都道府県編(初版) 1...1 1.1... 1 1.1.1... 1 1.2... 3 1.2.1... 3 1.2.2... 4 1.3... 5 1.4... 6 1.4.1... 6 (1) B11:...6 (2) B11:...8 1.4.2... 11 (1) B31:... 11 1.4.3... 12 (1) B21, B41:... 12 2... 14 2.1... 14 2.1.1... 14

More information

QD library! Feature! Easy to use high precision! Easy to understand the structure of arithmetic! 2 type high precision arithmetic! Double-Double precision (pseudo quadruple precision)! Quad-Double precision

More information

untitled

untitled A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }

More information

aisatu.pdf

aisatu.pdf 1 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71

More information

supercomputer2010.ppt

supercomputer2010.ppt nanri@cc.kyushu-u.ac.jp 1 !! : 11 12! : nanri@cc.kyushu-u.ac.jp! : Word 2 ! PC GPU) 1997 7 http://wiredvision.jp/news/200806/2008062322.html 3 !! (Cell, GPU )! 4 ! etc...! 5 !! etc. 6 !! 20km 40 km ) 340km

More information

ex01.dvi

ex01.dvi ,. 0. 0.0. C () /******************************* * $Id: ex_0_0.c,v.2 2006-04-0 3:37:00+09 naito Exp $ * * 0. 0.0 *******************************/ #include int main(int argc, char **argv) double

More information

一般演題(ポスター)

一般演題(ポスター) 6 5 13 : 00 14 : 00 A μ 13 : 00 14 : 00 A β β β 13 : 00 14 : 00 A 13 : 00 14 : 00 A 13 : 00 14 : 00 A β 13 : 00 14 : 00 A β 13 : 00 14 : 00 A 13 : 00 14 : 00 A β 13 : 00 14 : 00 A 13 : 00 14 : 00 A

More information

......1201-P1.`5

......1201-P1.`5 2009. No356 2/ 5 6 a b b 7 d d 6 ca b dd a c b b d d c c a c b - a b G A bb - c a - d b c b c c d b F F G & 7 B C C E D B C F C B E B a ca b b c c d d c c d b c c d b c c d b d d d d - d d d b b c c b

More information

AutoTuned-RB

AutoTuned-RB ABCLib Working Notes No.10 AutoTuned-RB Version 1.00 AutoTuned-RB AutoTuned-RB RB_DGEMM RB_DGEMM ( TransA, TransB, M, N, K, a, A, lda, B, ldb, b, C, ldc ) L3BLAS DGEMM (C a Trans(A) Trans(B) b C) (1) TransA:

More information

応用数学特論.dvi

応用数学特論.dvi 1 1 1.1.1 ( ). P,Q,R,.... 2+3=5 2 1.1.2 ( ). P T (true) F (false) T F P P T P. T 2 F 1.1.3 ( ). 2 P Q P Q P Q P Q P or Q P Q P Q P Q T T T T F T F T T F F F. P = 5 4 Q = 3 2 P Q = 5 4 3 2 P F Q T P Q T

More information

XcalableMP入門

XcalableMP入門 XcalableMP 1 HPC-Phys@, 2018 8 22 XcalableMP XMP XMP Lattice QCD!2 XMP MPI MPI!3 XMP 1/2 PCXMP MPI Fortran CCoarray C++ MPIMPI XMP OpenMP http://xcalablemp.org!4 XMP 2/2 SPMD (Single Program Multiple Data)

More information

Microsoft Word - .....J.^...O.|Word.i10...j.doc

Microsoft Word - .....J.^...O.|Word.i10...j.doc P 1. 2. R H C H, etc. R' n R' R C R'' R R H R R' R C C R R C R' R C R' R C C R 1-1 1-2 3. 1-3 1-4 4. 5. 1-5 5. 1-6 6. 10 1-7 7. 1-8 8. 2-1 2-2 2-3 9. 2-4 2-5 2-6 2-7 10. 2-8 10. 2-9 10. 2-10 10. 11. C

More information

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N GPU 1 1 2 1, 3 2, 3 (Graphics Unit: GPU) GPU GPU GPU Evaluation of GPU Computing Based on An Automatic Program Generation Technology Makoto Sugawara, 1 Katsuto Sato, 1 Kazuhiko Komatsu, 2 Hiroyuki Takizawa

More information

64bit SSE2 SSE2 FPU Visual C++ 64bit Inline Assembler 4 FPU SSE2 4.1 FPU Control Word FPU 16bit R R R IC RC(2) PC(2) R R PM UM OM ZM DM IM R: reserved

64bit SSE2 SSE2 FPU Visual C++ 64bit Inline Assembler 4 FPU SSE2 4.1 FPU Control Word FPU 16bit R R R IC RC(2) PC(2) R R PM UM OM ZM DM IM R: reserved (Version: 2013/5/16) Intel CPU (kashi@waseda.jp) 1 Intel CPU( AMD CPU) 64bit SIMD Inline Assemler Windows Visual C++ Linux gcc 2 FPU SSE2 Intel CPU double 8087 FPU (floating point number processing unit)

More information

P1-1 P1-2 P1-3 P1-4 P1-5 P1-6 P3-1 P3-2 P3-3 P3-4 P3-5 P3-6 P5-1 P5-2 P5-3 P5-4 P5-5 P5-6 P7-1 P7-2 P7-3 P7-4 P7-5 P7-6 P9-1 P9-2 P9-3 P9-4 P9-5 P9-6 P11-1 P11-2 P11-3 P11-4 P13-1 P13-2 P13-3 P13-4 P13-5

More information

橡Taro9-生徒の活動.PDF

橡Taro9-生徒の活動.PDF 3 1 4 1 20 30 2 2 3-1- 1 2-2- -3- 18 1200 1 4-4- -5- 15 5 25 5-6- 1 4 2 1 10 20 2 3-7- 1 2 3 150 431 338-8- 2 3 100 4 5 6 7 1-9- 1291-10 - -11 - 10 1 35 2 3 1866 68 4 1871 1873 5 6-12 - 1 2 3 4 1 4-13

More information

1 913 10301200 A B C D E F G H J K L M 1A1030 10 : 45 1A1045 11 : 00 1A1100 11 : 15 1A1115 11 : 30 1A1130 11 : 45 1A1145 12 : 00 1B1030 1B1045 1C1030

1 913 10301200 A B C D E F G H J K L M 1A1030 10 : 45 1A1045 11 : 00 1A1100 11 : 15 1A1115 11 : 30 1A1130 11 : 45 1A1145 12 : 00 1B1030 1B1045 1C1030 1 913 9001030 A B C D E F G H J K L M 9:00 1A0900 9:15 1A0915 9:30 1A0930 9:45 1A0945 10 : 00 1A1000 10 : 15 1B0900 1B0915 1B0930 1B0945 1B1000 1C0900 1C0915 1D0915 1C0930 1C0945 1C1000 1D0930 1D0945 1D1000

More information

日本糖尿病学会誌第58巻第1号

日本糖尿病学会誌第58巻第1号 α β β β β β β α α β α β α l l α l μ l β l α β β Wfs1 β β l l l l μ l l μ μ l μ l Δ l μ μ l μ l l ll l l l l l l l l μ l l l l μ μ l l l l μ l l l l l l l l l l μ l l l μ l μ l l l l l l l l l μ l l l l

More information

P-12 P-13 3 4 28 16 00 17 30 P-14 P-15 P-16 4 14 29 17 00 18 30 P-17 P-18 P-19 P-20 P-21 P-22

P-12 P-13 3 4 28 16 00 17 30 P-14 P-15 P-16 4 14 29 17 00 18 30 P-17 P-18 P-19 P-20 P-21 P-22 1 14 28 16 00 17 30 P-1 P-2 P-3 P-4 P-5 2 24 29 17 00 18 30 P-6 P-7 P-8 P-9 P-10 P-11 P-12 P-13 3 4 28 16 00 17 30 P-14 P-15 P-16 4 14 29 17 00 18 30 P-17 P-18 P-19 P-20 P-21 P-22 5 24 28 16 00 17 30 P-23

More information

main.dvi

main.dvi y () 5 C Fortran () Fortran 32bit 64bit 2 0 1 2 1 1bit bit 3 0 0 2 1 3 0 1 2 1 bit bit byte 8bit 1byte 3 0 10010011 2 1 3 0 01001011 2 1 byte Fortran A A 8byte double presicion y ( REAL*8) A 64bit 4byte

More information

Version C 1 2 3 4 5 1 2 3 4 5 6 7 8 9 0 A 1 2 1 3 4 5 1 1 2 1 1 1 2 4 5 6 7 8 3 1 2 C a b c d e f g A A B C B a b c d e f g 3 4 4 5 6 7 8 1 2 a b 1 2 a b 1 2 1 2 5 4 1 23 5 6 6 a b 1 2 e c d 3

More information

: : : TSTank 2

: : : TSTank 2 Java (8) 2008-05-20 Lesson6 Lesson5 Java 1 Lesson 6: TSTank1, TSTank2, TSTank3 java 2 car1 car2 Car car1 = new Car(); Car car2 = new Car(); car1.setcolor(red); car2.setcolor(blue); car2.changeengine(jet);

More information

(Version: 2017/4/18) Intel CPU 1 Intel CPU( AMD CPU) 64bit SIMD Inline Assemler Windows Visual C++ Linux gcc 2 FPU SSE2 Intel CPU do

(Version: 2017/4/18) Intel CPU 1 Intel CPU( AMD CPU) 64bit SIMD Inline Assemler Windows Visual C++ Linux gcc 2 FPU SSE2 Intel CPU do (Version: 2017/4/18) Intel CPU (kashi@waseda.jp) 1 Intel CPU( AMD CPU) 64bit SIMD Inline Assemler Windows Visual C++ Linux gcc 2 FPU SSE2 Intel CPU double 8087 FPU (floating point number processing unit)

More information

FAX780CL_chap-first.fm

FAX780CL_chap-first.fm FAX-780CL ABCDEFGHIα 01041115:10 :01 FAX-780CL α 1 1 2 3 1 2 f k b a FAX-780CL α n p q 09,. v m t w FAX-780CL A BC B C D E F G H I c i c s s i 0 9 V X Q ( < N > O P Z R Q: W Y M S T U V i c i k

More information

07-二村幸孝・出口大輔.indd

07-二村幸孝・出口大輔.indd GPU Graphics Processing Units HPC High Performance Computing GPU GPGPU General-Purpose computation on GPU CPU GPU GPU *1 Intel Quad-Core Xeon E5472 3.0 GHz 2 6 MB L2 cache 1600 MHz FSB 80 GFlops 1 nvidia

More information

ex01.dvi

ex01.dvi ,. 0. 0.0. C () /******************************* * $Id: ex_0_0.c,v.2 2006-04-0 3:37:00+09 naito Exp $ * * 0. 0.0 *******************************/ #include int main(int argc, char **argv) { double

More information

FAX780TA_chap-first.fm

FAX780TA_chap-first.fm FAX-780TA ABCDEFGHIα 01041115:10 :01 FAX-780CL α 1 1 2 3 1 2 f k b a FAX-780TA α n p q 09,. v m t w FAX-780TA A BC B C D E F G H I c i c s s i 0 9 i c i k o o o t c 0 9 - = C t C B t - = 1 2 3

More information