Minsky の電力消費量の調査 (1) ディープラーニングデモプログラム (mnist) の実行デモプログラム 添付 1 CPU 実行 シングル GPU 実行 2GPU 実行での計算時間と電力消費量を比較した 〇計算時間 CPU 実行 シングル GPU 実行複数 GPU 実行 今回は 2GPU 計

Size: px
Start display at page:

Download "Minsky の電力消費量の調査 (1) ディープラーニングデモプログラム (mnist) の実行デモプログラム 添付 1 CPU 実行 シングル GPU 実行 2GPU 実行での計算時間と電力消費量を比較した 〇計算時間 CPU 実行 シングル GPU 実行複数 GPU 実行 今回は 2GPU 計"

Transcription

1 IBM Minsky における 電力性能比検証報告書 (Deep Learning および HPC アプリケーション ) 2017 年 1 月 ビジュアルテクノロジー株式会社

2 Minsky の電力消費量の調査 (1) ディープラーニングデモプログラム (mnist) の実行デモプログラム 添付 1 CPU 実行 シングル GPU 実行 2GPU 実行での計算時間と電力消費量を比較した 〇計算時間 CPU 実行 シングル GPU 実行複数 GPU 実行 今回は 2GPU 計算時間 ( 秒 ) 倍以上のスピード 〇電力消費量 約 10 秒ごとに消費電力を表示 (Total Watt: 総消費電力 GPU Watt: 総電力の内 GPU の消費電力 ) Time Step Chainer CPU Chainer 1GPU Chainer 2 GPU Total Watt ( うち GPU Watt) Total Watt ( うち GPU Watt) Total Watt ( うち 2GPU 合計 Watt) 10Sec 708W (55W) 744W (70W) 816W (75W) 20Sec 636W (55W) 744W (70W) 768W (85W) 30Sec 708W (50W) 744W (65W) 792W (75W) 40Sec 636W (50W) 744W (70W) 612W (55W) 50Sec 624W (55W) 756W (70W) Sec 720W (55W) 744W (65W) Sec 600W (55W) 720W (55W) Sec 636W (50W) 720W (55W) Sec 600W (55W) Sec 600W (55W) Sec 612W (50W) Sec 612W (55W) GPU 計算では GPU 消費電力以上に総電力が増加している

3 (2) 行列ベクトル積の計算行列 (7,000 x 7,000) x ベクトル (1 列 ) の計算 CPU(20 スレッド ) 実行とシングル GPU 実行 CPU プログラム 添付 2 GPU プログラム 添付 3 Time 行列ベクトル積 CPU(20 スレッド ) 行列ベクトル積 (1GPU) Step Total Watt ( うち GPU Watt) Total Watt ( うち GPU Watt) 10Sec 984W (50W) 744W (65W) 20Sec 972W (50W) 732W (65W) 30Sec 936W (50W) 744W (65W) 40Sec 912W (50W) 744W (65W) 50Sec 888W (50W) 744W (65W) 60Sec 864W (50W) 744W (65W) 70Sec 924W (50W) 744W (65W) 80Sec 792W (50W) 744W (65W) 90Sec 948W (50W) 744W (65W) 100Sec 972W (50W) 744W (65W) 110Sec 768W (55W) 736W (65W) 120Sec 802W (50W) 744W (65W) CPU 版のコンパイラ pgi fortran バージョン :16.10 オプション :pgfortran -O3 -mp GPU コンパイラ nvcc バージョン :7.5 オプション : なし ライブラリ cublas 使用

4 〇計算性能と消費電力 CPU 実行 GPU 実行 消費電力 (12Ts の平均値 ) 約 897Watt 約 743Watt 計算性能 (GFlops) 約 220GFlops 約 1500GFlops Watt 当たりの計算性能 0.25 GFlops/Watt 2.0 GFlops/Watt ( ご参考 ) 京コンピュータ電力性能比 0.83GFlops/Watt ( ご参考 )JC-AHPC スパコン電力性能比 5.0GFlops/Watt

5 添付 1 サンプルプログラム (mnist) 55 行目で args.gpu 指定することで GPU を使うことになる 1 #!/usr/bin/env python 2 from future import print_function 3 import argparse 4 5 import chainer 6 import chainer.functions as F 7 import chainer.links as L 8 from chainer import training 9 from chainer.training import extensions # Network definition 13 class MLP(chainer.Chain): def init (self, n_units, n_out): 16 super(mlp, self). init ( 17 # the size of the inputs to each layer will be inferred 18 l1=l.linear(none, n_units), # n_in -> n_units 19 l2=l.linear(none, n_units), # n_units -> n_units 20 l3=l.linear(none, n_out), # n_units -> n_out 21 ) def call (self, x): 24 h1 = F.relu(self.l1(x)) 25 h2 = F.relu(self.l2(h1)) 26 return self.l3(h2) def main(): 30 parser = argparse.argumentparser(description='chainer example: MNIST') 31 parser.add_argument('--batchsize', '-b', type=int, default=100, 32 help='number of images in each mini-batch') 33 parser.add_argument('--epoch', '-e', type=int, default=20,

6 34 help='number of sweeps over the dataset to train') 35 parser.add_argument('--gpu', '-g', type=int, default=-1, 36 help='gpu ID (negative value indicates CPU)') 37 parser.add_argument('--out', '-o', default='result', 38 help='directory to output the result') 39 parser.add_argument('--resume', '-r', default='', 40 help='resume the training from snapshot') 41 parser.add_argument('--unit', '-u', type=int, default=1000, 42 help='number of units') 43 args = parser.parse_args() print('gpu: {}'.format(args.gpu)) 46 print('# unit: {}'.format(args.unit)) 47 print('# Minibatch-size: {}'.format(args.batchsize)) 48 print('# epoch: {}'.format(args.epoch)) 49 print('') # Set up a neural network to train 52 # Classifier reports softmax cross entropy loss and accuracy at every 53 # iteration, which will be used by the PrintReport extension below. 54 model = L.Classifier(MLP(args.unit, 10)) 55 if args.gpu >= 0: 56 chainer.cuda.get_device(args.gpu).use() # Make a specified GPU current 57 model.to_gpu() # Copy the model to the GPU # Setup an optimizer 60 optimizer = chainer.optimizers.adam() 61 optimizer.setup(model) # Load the MNIST dataset 64 train, test = chainer.datasets.get_mnist() 65

7 66 train_iter = chainer.iterators.serialiterator(train, args.batchsize) 67 test_iter = chainer.iterators.serialiterator(test, args.batchsize, 68 repeat=false, shuffle=false) # Set up a trainer 71 updater = training.standardupdater(train_iter, optimizer, device=args.gpu) 72 trainer = training.trainer(updater, (args.epoch, 'epoch'), out=args.out) # Evaluate the model with the test dataset for each epoch 75 trainer.extend(extensions.evaluator(test_iter, model, device=args.gpu)) # Dump a computational graph from 'loss' variable at the first iteration 78 # The "main" refers to the target link of the "main" optimizer. 79 trainer.extend(extensions.dump_graph('main/loss')) # Take a snapshot at each epoch 82 trainer.extend(extensions.snapshot(), trigger=(args.epoch, 'epoch')) # Write a log of evaluation statistics for each epoch 85 trainer.extend(extensions.logreport()) # Save two plot images to the result dir 88 trainer.extend( 89 extensions.plotreport(['main/loss', 'validation/main/loss'], 'epoch', 90 file_name='loss.png')) 91 trainer.extend( 92 extensions.plotreport(['main/accuracy', 'validation/main/accuracy'], 93 'epoch', file_name='accuracy.png'))

8 94 95 # Print selected entries of the log to stdout 96 # Here "main" refers to the target link of the "main" optimizer again, and 97 # "validation" refers to the default name of the Evaluator extension. 98 # Entries other than 'epoch' are reported by the Classifier link, called by 99 # either the updater or the evaluator. 100 trainer.extend(extensions.printreport( 101 ['epoch', 'main/loss', 'validation/main/loss', 102 'main/accuracy', 'validation/main/accuracy', 'elapsed_time'])) # Print a progress bar to stdout 105 trainer.extend(extensions.progressbar()) if args.resume: 108 # Resume from a snapshot 109 chainer.serializers.load_npz(args.resume, trainer) # Run the training 112 trainer.run() if name == ' main ': 115 main()

9 添付 2 行列ベクトル積 (CPU マルチスレッド版 ) 27 行目から 33 行目がカーネル部 1 use omp_lib 2 implicit double precision(a-h,o-z) 3 allocatable a(:,:),b(:),c(:) 4 dimension toms(10000),tome(10000) 5 character*32 buff 6!$OMP parallel 7 nth=omp_get_num_threads() 8!$OMP end parallel 9 call getarg(1,buff) 10 read(buff,*) n 11 allocate(a(n,n),b(n),c(n)) 12 do i=1,n 13 do j=1,n 14 a(i,j)=1.d0/dble(i+j-1) 15 enddo 16 b(i)=1. 17 enddo 18 it=0 19 t0=elaptime() continue 21!$OMP critical 22 it=it+1 23!$OMP end critical if(it.gt.10000) goto toms(it)=elaptime() 27!$OMP parallel do reduction(+:c) 28 do j=1,n 29 do i=1,n 30 c(i)=c(i)+a(i,j)*b(j) 31 enddo 32 enddo 33!$OMP end parallel do 34 tome(it)=elaptime()

10 35!$OMP parallel do reduction(+:s) 36 do i=1,n 37 s=s+c(i)*c(i) 38 enddo 39!$OMP end parallel do 40 s=dsqrt(s) 41 if(mod(it,1000).eq.0)then 42 write(6,*) it,s 43 c write(6,*) c 44 endif 45 b=c/s 46 c=0.0d0 47 goto continue 49 t1=elaptime() 50 write(6,60) n,nth,t1-t0,1.d4*dble(2*n*n+4*n)/(t1-t0)*1.d format("qaz",2i6,2f12.6) 52 write(60,61) tome-toms 53 write(6,*) "s =",s format(1pd12.6) 55 stop 56 end

11 添付 3: 行列ベクトル積 (GPU 版 ) 1 // dgemm CUDA test public domain 2 #include <stdio.h> 3 #include <stdlib.h> 4 #include <math.h> 5 #include "cublas.h" 6 //Matlab/Octave format 7 void printmat(int N, int M, double *A, int LDA) { 8 double mtmp; 9 printf("[ "); 10 for (int i = 0; i < N; i++) { 11 printf("[ "); 12 for (int j = 0; j < M; j++) { 13 mtmp = A[i + j * LDA]; 14 printf("%5.2e", mtmp); 15 if (j < M - 1) printf(", "); 16 } if (i < N - 1) printf("]; "); 17 else printf("] "); 18 } printf("]"); 19 } 20 double extern elaptime(void); 21 int main( int argc,char *argv[] ) 22 { 23 int n ; double alpha, beta; 24 int i,j,l,it; 25 int nt,nl; cublasstatus stata, statb, statc; 28 scanf("%d %d ",&n,&nl); 29 nt=atoi(argv[1]); 30 double *deva[nl], *devb[nl], *devc[nl]; 31 double **A; 32 double **B ; 33 double **C ; 34 double s1,s2,t1,t2,flop; 35 cudasetdevice(2);

12 36 cublasinit(); 37 A=(double**) malloc(sizeof(double**)*nl); 38 B=(double**) malloc(sizeof(double**)*nl); 39 C=(double**) malloc(sizeof(double**)*nl); 40 for(l=0 ; l<nl;l++){ 41 A[l] = (double*) malloc(sizeof(double)*n*n); 42 B[l] = (double*) malloc(sizeof(double)*n*n); 43 C[l] = (double*) malloc(sizeof(double)*n*n); stata = cublasalloc (n*n, n*n, (void**)&deva[l]); 46 statb = cublasalloc (2*n, n*n, (void**)&devb[l]); 47 statc = cublasalloc (2*n, n*n, (void**)&devc[l]); 48 for(i=0 ; i<n ; i++){ 49 for(j=0 ; j<n; j++){ 50 A[l][i*n+j]=1.e0/(double)(i+j+1+l); 51 } 52 B[l][i]=(double)(2*i+2); 53 B[l][i+n]=(double)(2*i+3); 54 } 55 } 56 printf("# start. n"); 57 alpha = 1.0; beta = 1.0; 58 t1=elaptime(); 59 float elapsed_time_ms=0.0f; 60 cudaevent_t start, stop; 61 cudaeventcreate( &start ); 62 cudaeventcreate( &stop ); 63 cudaeventrecord( start, 0 ); 64 for(l=0; l< nl ; l++){ 65 stata = cublassetmatrix (n, n, n*n, A[l], n, deva[l], n); 66 statb = cublassetmatrix (n, 2, 2*n, B[l], n, devb[l], n); 67 statc = cublassetmatrix (n, 2, 2*n, C[l], n, devc[l], n); 68 } 69 for(it=0; it<nt ; it++){ 70 for(l=0; l< nl ; l++){ 71 cublasdgemm('n', 'n', n, 2, n, alpha, deva[l], n, devb[l], n, beta, devc[l],

13 n); 72 s1=cublasdnrm2(n, devc[l], 1); 73 statc= cublasgetmatrix (n, 2, n*n,devc[l], n, C[l], n); 74 for(i=0;i<n;i++) { 75 B[l][i]=C[l][i]/s1; 76 B[l][i+n]=C[l][i+n]/s2; 77 C[l][i]=C[l][i+n]=0.0; 78 } 79 } 80 } 81 t2=elaptime(); 82 cudaeventrecord( stop, 0 ); 83 cudaeventsynchronize( stop ); 84 cudaeventelapsedtime( &elapsed_time_ms, start, stop ); printf("alpha = %5.3e n", alpha); 87 printf("beta = %5.3e n", beta); 88 printf(" sec= %lf n",t2-t1); 89 flop=2.*(double)(2*n*n+4*n)*(double)(nt*nl); 90 printf(" sec2= %lf n",elapsed_time_ms); 91 printf("s1 s2= %lf %lf n", s1,s2); 92 printf(" n %d nl %d nt %d n",n,nl,nt); 93 printf("#flop %lf n GFlops %lf n",flop,flop*1.e-9/(t2-t1)); 94 cublasfree (deva); 95 cublasfree (devb); 96 cublasfree (devc); 97 cublasshutdown(); 98 delete[]c; delete[]b; delete[]a; 99 }

11050427-0_Vol16No3.indd

11050427-0_Vol16No3.indd 2599 チュートリアル BLAS, LAPACK 2 2 GPU BLAS, LAPACKチュートリアル パート2 (GPU 編 ) 中 田 真 秀 1 はじめに GPU Graphics Processing Unit BLAS, LAPACK GPU GPU NVIDIA AMD AMD RADEON HD NVIDIA NVIDIA GPU NVIDIA C2050 BLAS, LAPACK

More information

XMPによる並列化実装2

XMPによる並列化実装2 2 3 C Fortran Exercise 1 Exercise 2 Serial init.c init.f90 XMP xmp_init.c xmp_init.f90 Serial laplace.c laplace.f90 XMP xmp_laplace.c xmp_laplace.f90 #include int a[10]; program init integer

More information

2017 (413812)

2017 (413812) 2017 (413812) Deep Learning ( NN) 2012 Google ASIC(Application Specific Integrated Circuit: IC) 10 ASIC Deep Learning TPU(Tensor Processing Unit) NN 12 20 30 Abstract Multi-layered neural network(nn) has

More information

ex01.dvi

ex01.dvi ,. 0. 0.0. C () /******************************* * $Id: ex_0_0.c,v.2 2006-04-0 3:37:00+09 naito Exp $ * * 0. 0.0 *******************************/ #include int main(int argc, char **argv) double

More information

ex01.dvi

ex01.dvi ,. 0. 0.0. C () /******************************* * $Id: ex_0_0.c,v.2 2006-04-0 3:37:00+09 naito Exp $ * * 0. 0.0 *******************************/ #include int main(int argc, char **argv) { double

More information

,,,,., C Java,,.,,.,., ,,.,, i

,,,,., C Java,,.,,.,., ,,.,, i 24 Development of the programming s learning tool for children be derived from maze 1130353 2013 3 1 ,,,,., C Java,,.,,.,., 1 6 1 2.,,.,, i Abstract Development of the programming s learning tool for children

More information

¥Ñ¥Ã¥±¡¼¥¸ Rhpc ¤Î¾õ¶·

¥Ñ¥Ã¥±¡¼¥¸ Rhpc ¤Î¾õ¶· Rhpc COM-ONE 2015 R 27 12 5 1 / 29 1 2 Rhpc 3 forign MPI 4 Windows 5 2 / 29 1 2 Rhpc 3 forign MPI 4 Windows 5 3 / 29 Rhpc, R HPC Rhpc, ( ), snow..., Rhpc worker call Rhpc lapply 4 / 29 1 2 Rhpc 3 forign

More information

CUDA 連携とライブラリの活用 2

CUDA 連携とライブラリの活用 2 1 09:30-10:00 受付 10:00-12:00 Reedbush-H ログイン GPU 入門 13:30-15:00 OpenACC 入門 15:15-16:45 OpenACC 最適化入門と演習 17:00-18:00 OpenACC の活用 (CUDA 連携とライブラリの活用 ) CUDA 連携とライブラリの活用 2 3 OpenACC 簡単にGPUプログラムが作成できる それなりの性能が得られる

More information

[1] #include<stdio.h> main() { printf("hello, world."); return 0; } (G1) int long int float ± ±

[1] #include<stdio.h> main() { printf(hello, world.); return 0; } (G1) int long int float ± ± [1] #include printf("hello, world."); (G1) int -32768 32767 long int -2147483648 2147483647 float ±3.4 10 38 ±3.4 10 38 double ±1.7 10 308 ±1.7 10 308 char [2] #include int a, b, c, d,

More information

Deep Learning Deep Learning GPU GPU FPGA %

Deep Learning Deep Learning GPU GPU FPGA % 2016 (412825) Deep Learning Deep Learning GPU GPU FPGA 16 1 16 69% Abstract Recognition by DeepLearning attracts attention, because of its high recognition accuracy. Lots of learning is necessary for Deep

More information

RL_tutorial

RL_tutorial )! " = $ % & ' "(& &*+ = ' " + %' "(- + %. ' "(. + γ γ=0! " = $ " γ=0.9! " = $ " + 0.9$ " + 0.81$ "+, + ! " #, % #! " #, % # + (( + #,- +. max 2 3! " #,-, % 4! " #, % # ) α ! " #, % ' ( )(#, %)!

More information

Introduction Purpose This training course demonstrates the use of the High-performance Embedded Workshop (HEW), a key tool for developing software for

Introduction Purpose This training course demonstrates the use of the High-performance Embedded Workshop (HEW), a key tool for developing software for Introduction Purpose This training course demonstrates the use of the High-performance Embedded Workshop (HEW), a key tool for developing software for embedded systems that use microcontrollers (MCUs)

More information

5 11 3 1....1 2. 5...4 (1)...5...6...7...17...22 (2)...70...71...72...77...82 (3)...85...86...87...92...97 (4)...101...102...103...112...117 (5)...121...122...123...125...128 1. 10 Web Web WG 5 4 5 ²

More information

01_OpenMP_osx.indd

01_OpenMP_osx.indd OpenMP* / 1 1... 2 2... 3 3... 5 4... 7 5... 9 5.1... 9 5.2 OpenMP* API... 13 6... 17 7... 19 / 4 1 2 C/C++ OpenMP* 3 Fortran OpenMP* 4 PC 1 1 9.0 Linux* Windows* Xeon Itanium OS 1 2 2 WEB OS OS OS 1 OS

More information

openmp1_Yaguchi_version_170530

openmp1_Yaguchi_version_170530 並列計算とは /OpenMP の初歩 (1) 今 の内容 なぜ並列計算が必要か? スーパーコンピュータの性能動向 1ExaFLOPS 次世代スハ コン 京 1PFLOPS 性能 1TFLOPS 1GFLOPS スカラー機ベクトル機ベクトル並列機並列機 X-MP ncube2 CRAY-1 S-810 SR8000 VPP500 CM-5 ASCI-5 ASCI-4 S3800 T3E-900 SR2201

More information

Microsoft Word - C.....u.K...doc

Microsoft Word - C.....u.K...doc C uwêííôöðöõ Ð C ÔÖÐÖÕ ÐÊÉÌÊ C ÔÖÐÖÕÊ C ÔÖÐÖÕÊ Ç Ê Æ ~ if eíè ~ for ÒÑÒ ÌÆÊÉÉÊ ~ switch ÉeÍÈ ~ while ÒÑÒ ÊÍÍÔÖÐÖÕÊ ~ 1 C ÔÖÐÖÕ ÐÊÉÌÊ uê~ ÏÒÏÑ Ð ÓÏÖ CUI Ô ÑÊ ÏÒÏÑ ÔÖÐÖÕÎ d ÈÍÉÇÊ ÆÒ Ö ÒÐÑÒ ÊÔÎÏÖÎ d ÉÇÍÊ

More information

£Ã¥×¥í¥°¥é¥ß¥ó¥°ÆþÌç (2018) - Â裵²ó ¨¡ À©¸æ¹½Â¤¡§¾ò·ïʬ´ô ¨¡

£Ã¥×¥í¥°¥é¥ß¥ó¥°ÆþÌç (2018) - Â裵²ó  ¨¡ À©¸æ¹½Â¤¡§¾ò·ïʬ´ô ¨¡ (2018) 2018 5 17 0 0 if switch if if ( ) if ( 0) if ( ) if ( 0) if ( ) (0) if ( 0) if ( ) (0) ( ) ; if else if ( ) 1 else 2 if else ( 0) 1 if ( ) 1 else 2 if else ( 0) 1 if ( ) 1 else 2 (0) 2 if else

More information

AutoTuned-RB

AutoTuned-RB ABCLib Working Notes No.10 AutoTuned-RB Version 1.00 AutoTuned-RB AutoTuned-RB RB_DGEMM RB_DGEMM ( TransA, TransB, M, N, K, a, A, lda, B, ldb, b, C, ldc ) L3BLAS DGEMM (C a Trans(A) Trans(B) b C) (1) TransA:

More information

C

C C 1 2 1.1........................... 2 1.2........................ 2 1.3 make................................................ 3 1.4....................................... 5 1.4.1 strip................................................

More information

RHEA key

RHEA key 2 P (k, )= k e k! 3 4 Probability 0.4 0.35 0.3 0.25 Poisson ( λ = 1) Poisson (λ = 3) Poisson ( λ = 10) Poisson (λ = 20) Poisson ( λ = 30) Gaussian (µ = 1, s = 1) Gaussian ( µ = 3, s = 3) Gaussian (µ =

More information

ohp03.dvi

ohp03.dvi 19 3 ( ) 2019.4.20 CS 1 (comand line arguments) Unix./a.out aa bbb ccc ( ) C main void int main(int argc, char *argv[]) {... 2 (2) argc argv argc ( ) argv (C char ) ( 1) argc 4 argv NULL. / a. o u t \0

More information

untitled

untitled Fortran90 ( ) 17 12 29 1 Fortran90 Fortran90 FORTRAN77 Fortran90 1 Fortran90 module 1.1 Windows Windows UNIX Cygwin (http://www.cygwin.com) C\: Install Cygwin f77 emacs latex ps2eps dvips Fortran90 Intel

More information

11042 計算機言語7回目 サポートページ:

11042 計算機言語7回目  サポートページ: 11042 7 :https://goo.gl/678wgm November 27, 2017 10/2 1(print, ) 10/16 2(2, ) 10/23 (3 ) 10/31( ),11/6 (4 ) 11/13,, 1 (5 6 ) 11/20,, 2 (5 6 ) 11/27 (7 12/4 (9 ) 12/11 1 (10 ) 12/18 2 (10 ) 12/25 3 (11

More information

DA100データアクイジションユニット通信インタフェースユーザーズマニュアル

DA100データアクイジションユニット通信インタフェースユーザーズマニュアル Instruction Manual Disk No. RE01 6th Edition: November 1999 (YK) All Rights Reserved, Copyright 1996 Yokogawa Electric Corporation 801234567 9 ABCDEF 1 2 3 4 1 2 3 4 1 2 3 4 1 2

More information

OpenMP (1) 1, 12 1 UNIX (FUJITSU GP7000F model 900), 13 1 (COMPAQ GS320) FUJITSU VPP5000/64 1 (a) (b) 1: ( 1(a))

OpenMP (1) 1, 12 1 UNIX (FUJITSU GP7000F model 900), 13 1 (COMPAQ GS320) FUJITSU VPP5000/64 1 (a) (b) 1: ( 1(a)) OpenMP (1) 1, 12 1 UNIX (FUJITSU GP7000F model 900), 13 1 (COMPAQ GS320) FUJITSU VPP5000/64 1 (a) (b) 1: ( 1(a)) E-mail: {nanri,amano}@cc.kyushu-u.ac.jp 1 ( ) 1. VPP Fortran[6] HPF[3] VPP Fortran 2. MPI[5]

More information

I I / 47

I I / 47 1 2013.07.18 1 I 2013 3 I 2013.07.18 1 / 47 A Flat MPI B 1 2 C: 2 I 2013.07.18 2 / 47 I 2013.07.18 3 / 47 #PJM -L "rscgrp=small" π-computer small: 12 large: 84 school: 24 84 16 = 1344 small school small

More information

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2 FFT 1 Fourier fast Fourier transform FFT FFT FFT 1 FFT FFT 2 Fourier 2.1 Fourier FFT Fourier discrete Fourier transform DFT DFT n 1 y k = j=0 x j ω jk n, 0 k n 1 (1) x j y k ω n = e 2πi/n i = 1 (1) n DFT

More information

/ SCHEDULE /06/07(Tue) / Basic of Programming /06/09(Thu) / Fundamental structures /06/14(Tue) / Memory Management /06/1

/ SCHEDULE /06/07(Tue) / Basic of Programming /06/09(Thu) / Fundamental structures /06/14(Tue) / Memory Management /06/1 I117 II I117 PROGRAMMING PRACTICE II 2 MEMORY MANAGEMENT 2 Research Center for Advanced Computing Infrastructure (RCACI) / Yasuhiro Ohara yasu@jaist.ac.jp / SCHEDULE 1. 2011/06/07(Tue) / Basic of Programming

More information

第 1 回ディープラーニング分散学習ハッカソン <ChainerMN 紹介 + スパコンでの実 法 > チューター福 圭祐 (PFN) 鈴 脩司 (PFN)

第 1 回ディープラーニング分散学習ハッカソン <ChainerMN 紹介 + スパコンでの実 法 > チューター福 圭祐 (PFN) 鈴 脩司 (PFN) 第 1 回ディープラーニング分散学習ハッカソン チューター福 圭祐 (PFN) 鈴 脩司 (PFN) https://chainer.org/ 2 Chainer: A Flexible Deep Learning Framework Define-and-Run Define-by-Run Define Define by Run Model

More information

r03.dvi

r03.dvi 19 ( ) 019.4.0 CS 1 (comand line arguments) Unix./a.out aa bbb ccc ( ) C main void... argc argv argc ( ) argv (C char ) ( 1) argc 4 argv NULL. / a. o u t \0 a a \0 b b b \0 c c c \0 1: // argdemo1.c ---

More information

joho09.ppt

joho09.ppt s M B e E s: (+ or -) M: B: (=2) e: E: ax 2 + bx + c = 0 y = ax 2 + bx + c x a, b y +/- [a, b] a, b y (a+b) / 2 1-2 1-3 x 1 A a, b y 1. 2. a, b 3. for Loop (b-a)/ 4. y=a*x*x + b*x + c 5. y==0.0 y (y2)

More information

1 return main() { main main C 1 戻り値の型 関数名 引数 関数ブロックをあらわす中括弧 main() 関数の定義 int main(void){ printf("hello World!!\n"); return 0; 戻り値 1: main() 2.2 C main

1 return main() { main main C 1 戻り値の型 関数名 引数 関数ブロックをあらわす中括弧 main() 関数の定義 int main(void){ printf(hello World!!\n); return 0; 戻り値 1: main() 2.2 C main C 2007 5 29 C 1 11 2 2.1 main() 1 FORTRAN C main() main main() main() 1 return 1 1 return main() { main main C 1 戻り値の型 関数名 引数 関数ブロックをあらわす中括弧 main() 関数の定義 int main(void){ printf("hello World!!\n"); return

More information

GTC Japan, 2018/09/14 得居誠也, Preferred Networks Chainer における 深層学習の高速化 Optimizing Deep Learning with Chainer

GTC Japan, 2018/09/14 得居誠也, Preferred Networks Chainer における 深層学習の高速化 Optimizing Deep Learning with Chainer GTC Japan, 2018/09/14 得居誠也, Preferred Networks Chainer における 深層学習の高速化 Optimizing Deep Learning with Chainer Chainer のミッション Deep Learning とその応用の研究開発を加速させる 環境セットアップが速い すぐ習熟 素早いコーディング 実験の高速化 結果をさっと公開 論文化

More information

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1 .. BLAS LAPACK 1, 2013/05/23 CMSI A 1 / 43 BLAS LAPACK (I) BLAS, LAPACK BLAS : - LAPACK : 2 / 43 : 3 / 43 (wikipedia) V : f : V u, v u + v u + v V α K u V αu V V x, y f (x + y) = f (x) + f (y) V x K α

More information

sim98-8.dvi

sim98-8.dvi 8 12 12.1 12.2 @u @t = @2 u (1) @x 2 u(x; 0) = (x) u(0;t)=u(1;t)=0fort 0 1x, 1t N1x =1 x j = j1x, t n = n1t u(x j ;t n ) Uj n U n+1 j 1t 0 U n j =1t=(1x) 2 = U n j+1 0 2U n j + U n j01 (1x) 2 (2) U n+1

More information

Intel® Compilers Professional Editions

Intel® Compilers Professional Editions 2007 6 10.0 * 10.0 6 5 Software &Solutions group 10.0 (SV) C++ Fortran OpenMP* OpenMP API / : 200 C/C++ Fortran : OpenMP : : : $ cat -n main.cpp 1 #include 2 int foo(const char *); 3 int main()

More information

1 1.1 C 2 1 double a[ ][ ]; 1 3x x3 ( ) malloc() 2 double *a[ ]; double 1 malloc() dou

1 1.1 C 2 1 double a[ ][ ]; 1 3x x3 ( ) malloc() 2 double *a[ ]; double 1 malloc() dou 1 1.1 C 2 1 double a[ ][ ]; 1 3x3 0 1 3x3 ( ) 0.240 0.143 0.339 0.191 0.341 0.477 0.412 0.003 0.921 1.2 malloc() 2 double *a[ ]; double 1 malloc() double 1 malloc() free() 3 #include #include

More information

file:///D|/C言語の擬似クラス.txt

file:///D|/C言語の擬似クラス.txt 愛知障害者職業能力開発校 システム設計科 修了研究発表会報告書 題名 : C 言語の擬似クラス あらまし : C 言語でクラスを作れるという噂の真偽を確かめるために思考錯誤した まえがき : VC++ や Java その他オブジェクト指向の言語にはクラスが存在して クラスはオブジェクトの設計図である 手法 : C++ のクラスを解析して C++ のクラスを作成して C 言語に翻訳する class struct

More information

18 C ( ) hello world.c 1 #include <stdio.h> 2 3 main() 4 { 5 printf("hello World\n"); 6 } [ ] [ ] #include <stdio.h> % cc hello_world.c %./a.o

18 C ( ) hello world.c 1 #include <stdio.h> 2 3 main() 4 { 5 printf(hello World\n); 6 } [ ] [ ] #include <stdio.h> % cc hello_world.c %./a.o 18 C ( ) 1 1 1.1 hello world.c 5 printf("hello World\n"); 6 } [ ] [ ] #include % cc hello_world.c %./a.out Hello World [a.out ] % cc hello_world.c -o hello_world [ ( ) ] (K&R 4.1.1) #include

More information

r07.dvi

r07.dvi 19 7 ( ) 2019.4.20 1 1.1 (data structure ( (dynamic data structure 1 malloc C free C (garbage collection GC C GC(conservative GC 2 1.2 data next p 3 5 7 9 p 3 5 7 9 p 3 5 7 9 1 1: (single linked list 1

More information

ohp07.dvi

ohp07.dvi 19 7 ( ) 2019.4.20 1 (data structure) ( ) (dynamic data structure) 1 malloc C free 1 (static data structure) 2 (2) C (garbage collection GC) C GC(conservative GC) 2 2 conservative GC 3 data next p 3 5

More information

1.ppt

1.ppt /* * Program name: hello.c */ #include int main() { printf( hello, world\n ); return 0; /* * Program name: Hello.java */ import java.io.*; class Hello { public static void main(string[] arg)

More information

1 1.1 C 2 1 double a[ ][ ]; 1 3x x3 ( ) malloc() malloc 2 #include <stdio.h> #include

1 1.1 C 2 1 double a[ ][ ]; 1 3x x3 ( ) malloc() malloc 2 #include <stdio.h> #include 1 1.1 C 2 1 double a[ ][ ]; 1 3x3 0 1 3x3 ( ) 0.240 0.143 0.339 0.191 0.341 0.477 0.412 0.003 0.921 1.2 malloc() malloc 2 #include #include #include enum LENGTH = 10 ; int

More information

Introduction Purpose This training course describes the configuration and session features of the High-performance Embedded Workshop (HEW), a key tool

Introduction Purpose This training course describes the configuration and session features of the High-performance Embedded Workshop (HEW), a key tool Introduction Purpose This training course describes the configuration and session features of the High-performance Embedded Workshop (HEW), a key tool for developing software for embedded systems that

More information

program.dvi

program.dvi 2001.06.19 1 programming semi ver.1.0 2001.06.19 1 GA SA 2 A 2.1 valuename = value value name = valuename # ; Fig. 1 #-----GA parameter popsize = 200 mutation rate = 0.01 crossover rate = 1.0 generation

More information

joho07-1.ppt

joho07-1.ppt 0xbffffc5c 0xbffffc60 xxxxxxxx xxxxxxxx 00001010 00000000 00000000 00000000 01100011 00000000 00000000 00000000 xxxxxxxx x y 2 func1 func2 double func1(double y) { y = y + 5.0; return y; } double func2(double*

More information

I117 II I117 PROGRAMMING PRACTICE II SOFTWARE DEVELOPMENT ENV. 1 Research Center for Advanced Computing Infrastructure (RCACI) / Yasuhiro Ohara

I117 II I117 PROGRAMMING PRACTICE II SOFTWARE DEVELOPMENT ENV. 1 Research Center for Advanced Computing Infrastructure (RCACI) / Yasuhiro Ohara I117 II I117 PROGRAMMING PRACTICE II SOFTWARE DEVELOPMENT ENV. 1 Research Center for Advanced Computing Infrastructure (RCACI) / Yasuhiro Ohara yasu@jaist.ac.jp / SCHEDULE 1. 2011/06/07(Tue) / Basic of

More information

120802_MPI.ppt

120802_MPI.ppt CPU CPU CPU CPU CPU SMP Symmetric MultiProcessing CPU CPU CPU CPU CPU CPU CPU CPU CPU CPU CPU CPU CP OpenMP MPI MPI CPU CPU CPU CPU CPU CPU CPU CPU CPU CPU MPI MPI+OpenMP CPU CPU CPU CPU CPU CPU CPU CP

More information

DOPRI5.dvi

DOPRI5.dvi ODE DOPRI5 ( ) 16 3 31 Runge Kutta Dormand Prince 5(4) [1, pp. 178 179] DOPRI5 http://www.unige.ch/math/folks/hairer/software.html Fortran C C++ [3, pp.51 56] DOPRI5 C cprog.tar % tar xvf cprog.tar cprog/

More information

main

main 14 1. 12 5 main 1.23 3 1.230000 3 1.860867 1 2. 1988 1925 1911 1867 void JPcalendar(int x) 1987 1 64 1 1 1 while(1) Ctrl C void JPcalendar(int x){ if (x > 1988) printf(" %d %d \n", x, x-1988); else if(x

More information

#include #include #include int gcd( int a, int b) { if ( b == 0 ) return a; else { int c = a % b; return gcd( b, c); } /* if */ } int main() { int a = 60; int b = 45; int

More information

(Basic Theory of Information Processing) Fortran Fortan Fortan Fortan 1

(Basic Theory of Information Processing) Fortran Fortan Fortan Fortan 1 (Basic Theory of Information Processing) Fortran Fortan Fortan Fortan 1 17 Fortran Formular Tranlator Lapack Fortran FORTRAN, FORTRAN66, FORTRAN77, FORTRAN90, FORTRAN95 17.1 A Z ( ) 0 9, _, =, +, -, *,

More information

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë 2012 5 24 scalar Open MP Hello World Do (omp do) (omp workshare) (shared, private) π (reduction) PU PU PU 2 16 OpenMP FORTRAN/C/C++ MPI OpenMP 1997 FORTRAN Ver. 1.0 API 1998 C/C++ Ver. 1.0 API 2000 FORTRAN

More information

( CUDA CUDA CUDA CUDA ( NVIDIA CUDA I

(    CUDA CUDA CUDA CUDA (  NVIDIA CUDA I GPGPU (II) GPGPU CUDA 1 GPGPU CUDA(CUDA Unified Device Architecture) CUDA NVIDIA GPU *1 C/C++ (nvcc) CUDA NVIDIA GPU GPU CUDA CUDA 1 CUDA CUDA 2 CUDA NVIDIA GPU PC Windows Linux MaxOSX CUDA GPU CUDA NVIDIA

More information

演習1: 演習準備

演習1: 演習準備 演習 1: 演習準備 2013 年 8 月 6 日神戸大学大学院システム情報学研究科森下浩二 1 演習 1 の内容 神戸大 X10(π-omputer) について システム概要 ログイン方法 コンパイルとジョブ実行方法 OpenMP の演習 ( 入門編 ) 1. parallel 構文 実行時ライブラリ関数 2. ループ構文 3. shared 節 private 節 4. reduction 節

More information

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments 計算機アーキテクチャ第 11 回 マルチプロセッサ 本資料は授業用です 無断で転載することを禁じます 名古屋大学 大学院情報科学研究科 准教授加藤真平 デスクトップ ジョブレベル並列性 スーパーコンピュータ 並列処理プログラム プログラムの並列化 for (i = 0; i < N; i++) { x[i] = a[i] + b[i]; } プログラムの並列化 x[0] = a[0] + b[0];

More information

/* do-while */ #include <stdio.h> #include <math.h> int main(void) double val1, val2, arith_mean, geo_mean; printf( \n ); do printf( ); scanf( %lf, &v

/* do-while */ #include <stdio.h> #include <math.h> int main(void) double val1, val2, arith_mean, geo_mean; printf( \n ); do printf( ); scanf( %lf, &v 1 http://www7.bpe.es.osaka-u.ac.jp/~kota/classes/jse.html kota@fbs.osaka-u.ac.jp /* do-while */ #include #include int main(void) double val1, val2, arith_mean, geo_mean; printf( \n );

More information

Krylov (b) x k+1 := x k + α k p k (c) r k+1 := r k α k Ap k ( := b Ax k+1 ) (d) β k := r k r k 2 2 (e) : r k 2 / r 0 2 < ε R (f) p k+1 :=

Krylov (b) x k+1 := x k + α k p k (c) r k+1 := r k α k Ap k ( := b Ax k+1 ) (d) β k := r k r k 2 2 (e) : r k 2 / r 0 2 < ε R (f) p k+1 := 127 10 Krylov Krylov (Conjugate-Gradient (CG ), Krylov ) MPIBNCpack 10.1 CG (Conjugate-Gradient CG ) A R n n a 11 a 12 a 1n a 21 a 22 a 2n A T = =... a n1 a n2 a nn n a 11 a 21 a n1 a 12 a 22 a n2 = A...

More information

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë

OpenMP¤òÍѤ¤¤¿ÊÂÎó·×»»¡Ê£±¡Ë 2011 5 26 scalar Open MP Hello World Do (omp do) (omp workshare) (shared, private) π (reduction) scalar magny-cours, 48 scalar scalar 1 % scp. ssh / authorized keys 133. 30. 112. 246 2 48 % ssh 133.30.112.246

More information

新版明解C言語 実践編

新版明解C言語 実践編 2 List - "max.h" a, b max List - max "max.h" #define max(a, b) ((a) > (b)? (a) : (b)) max List -2 List -2 max #include "max.h" int x, y; printf("x"); printf("y"); scanf("%d", &x); scanf("%d", &y); printf("max(x,

More information

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N GPU 1 1 2 1, 3 2, 3 (Graphics Unit: GPU) GPU GPU GPU Evaluation of GPU Computing Based on An Automatic Program Generation Technology Makoto Sugawara, 1 Katsuto Sato, 1 Kazuhiko Komatsu, 2 Hiroyuki Takizawa

More information

25 II :30 16:00 (1),. Do not open this problem booklet until the start of the examination is announced. (2) 3.. Answer the following 3 proble

25 II :30 16:00 (1),. Do not open this problem booklet until the start of the examination is announced. (2) 3.. Answer the following 3 proble 25 II 25 2 6 13:30 16:00 (1),. Do not open this problem boolet until the start of the examination is announced. (2) 3.. Answer the following 3 problems. Use the designated answer sheet for each problem.

More information

<4D F736F F F696E74202D D F95C097F D834F E F93FC96E5284D F96E291E85F8DE391E52E >

<4D F736F F F696E74202D D F95C097F D834F E F93FC96E5284D F96E291E85F8DE391E52E > SX-ACE 並列プログラミング入門 (MPI) ( 演習補足資料 ) 大阪大学サイバーメディアセンター日本電気株式会社 演習問題の構成 ディレクトリ構成 MPI/ -- practice_1 演習問題 1 -- practice_2 演習問題 2 -- practice_3 演習問題 3 -- practice_4 演習問題 4 -- practice_5 演習問題 5 -- practice_6

More information

第5回お試しアカウント付き並列プログラミング講習会

第5回お試しアカウント付き並列プログラミング講習会 qstat -l ID (qstat -f) qscript ID BATCH REQUEST: 253443.batch1 Name: test.sh Owner: uid=32637, gid=30123 Priority: 63 State: 1(RUNNING) Created at: Tue Jun 30 05:36:24 2009 Started at: Tue Jun 30 05:36:27

More information

A/B (2010/10/08) Ver kurino/2010/soft/soft.html A/B

A/B (2010/10/08) Ver kurino/2010/soft/soft.html A/B A/B (2010/10/08) Ver. 1.0 kurino@math.cst.nihon-u.ac.jp http://edu-gw2.math.cst.nihon-u.ac.jp/ kurino/2010/soft/soft.html 2010 10 8 A/B 1 2010 10 8 2 1 1 1.1 OHP.................................... 1 1.2.......................................

More information

Agenda Motivation How it works Performance Limitation Conclusion

Agenda Motivation How it works Performance Limitation Conclusion py2llvm: Python to LLVM translator Syoyo Fujita Agenda Motivation How it works Performance Limitation Conclusion Agenda Motivation How it works Performance Limitation Conclusion py2llvm Python LLVM Python,

More information

r08.dvi

r08.dvi 19 8 ( ) 019.4.0 1 1.1 (linked list) ( ) next ( 1) (head) (tail) ( ) top head tail head data next 1: NULL nil ( ) NULL ( NULL ) ( 1 ) (double linked list ) ( ) 1 next 1 prev 1 head cur tail head cur prev

More information

K227 Java 2

K227 Java 2 1 K227 Java 2 3 4 5 6 Java 7 class Sample1 { public static void main (String args[]) { System.out.println( Java! ); } } 8 > javac Sample1.java 9 10 > java Sample1 Java 11 12 13 http://java.sun.com/j2se/1.5.0/ja/download.html

More information

RHEA key

RHEA key 2 $ cd RHEA $ git pull zsh $ for i in {009..049}; do curl -O https://raw.githubusercontent.com/akiraokumura/rhea-slides/master/photons/lat_photon_weekly_w${i} _p302_v001_extracted.root; done bash $ for

More information

untitled

untitled II yacc 005 : 1, 1 1 1 %{ int lineno=0; 3 int wordno=0; 4 int charno=0; 5 6 %} 7 8 %% 9 [ \t]+ { charno+=strlen(yytext); } 10 "\n" { lineno++; charno++; } 11 [^ \t\n]+ { wordno++; charno+=strlen(yytext);}

More information

C¥×¥í¥°¥é¥ß¥ó¥° ÆþÌç

C¥×¥í¥°¥é¥ß¥ó¥° ÆþÌç C (3) if else switch AND && OR (NOT)! 1 BMI BMI BMI = 10 4 [kg]) ( [cm]) 2 bmi1.c Input your height[cm]: 173.2 Enter Input your weight[kg]: 60.3 Enter Your BMI is 20.1. 10 4 = 10000.0 1 BMI BMI BMI = 10

More information

2

2 WJ-HD150 Digital Disk Recorder WJ-HD150 2 3 q w e r t y u 4 5 6 7 8 9 10 11 12 13 14 15 16 q w SIGNAL GND AC IN 17 SUNDAY MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY DAILY Program 1 Event No.1 Event

More information

PC Windows 95, Windows 98, Windows NT, Windows 2000, MS-DOS, UNIX CPU

PC Windows 95, Windows 98, Windows NT, Windows 2000, MS-DOS, UNIX CPU 1. 1.1. 1.2. 1 PC Windows 95, Windows 98, Windows NT, Windows 2000, MS-DOS, UNIX CPU 2. 2.1. 2 1 2 C a b N: PC BC c 3C ac b 3 4 a F7 b Y c 6 5 a ctrl+f5) 4 2.2. main 2.3. main 2.4. 3 4 5 6 7 printf printf

More information

インテル(R) Visual Fortran Composer XE 2013 Windows版 入門ガイド

インテル(R) Visual Fortran Composer XE 2013 Windows版 入門ガイド Visual Fortran Composer XE 2013 Windows* エクセルソフト株式会社 www.xlsoft.com Rev. 1.1 (2012/12/10) Copyright 1998-2013 XLsoft Corporation. All Rights Reserved. 1 / 53 ... 3... 4... 4... 5 Visual Studio... 9...

More information

1 4 2 EP) (EP) (EP)

1 4 2 EP) (EP) (EP) 2003 2004 2 27 1 1 4 2 EP) 5 3 6 3.1.............................. 6 3.2.............................. 6 3.3 (EP)............... 7 4 8 4.1 (EP).................... 8 4.1.1.................... 18 5 (EP)

More information

tuat1.dvi

tuat1.dvi ( 1 ) http://ist.ksc.kwansei.ac.jp/ tutimura/ 2012 6 23 ( 1 ) 1 / 58 C ( 1 ) 2 / 58 2008 9 2002 2005 T E X ptetex3, ptexlive pt E X UTF-8 xdvi-jp 3 ( 1 ) 3 / 58 ( 1 ) 4 / 58 C,... ( 1 ) 5 / 58 6/23( )

More information

:30 12:00 I. I VI II. III. IV. a d V. VI

:30 12:00 I. I VI II. III. IV. a d V. VI 2018 2018 08 02 10:30 12:00 I. I VI II. III. IV. a d V. VI. 80 100 60 1 I. Backus-Naur BNF N N y N x N xy yx : yxxyxy N N x, y N (parse tree) (1) yxyyx (2) xyxyxy (3) yxxyxyy (4) yxxxyxxy N y N x N yx

More information

programmingII2019-v01

programmingII2019-v01 II 2019 2Q A 6/11 6/18 6/25 7/2 7/9 7/16 7/23 B 6/12 6/19 6/24 7/3 7/10 7/17 7/24 x = 0 dv(t) dt = g Z t2 t 1 dv(t) dt dt = Z t2 t 1 gdt g v(t 2 ) = v(t 1 ) + g(t 2 t 1 ) v v(t) x g(t 2 t 1 ) t 1 t 2

More information

‚æ4›ñ

‚æ4›ñ ( ) ( ) ( ) A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 (OUS) 9 26 1 / 28 ( ) ( ) ( ) A B C D Z a b c d z 0 1 2 9 (OUS) 9

More information

02_C-C++_osx.indd

02_C-C++_osx.indd C/C++ OpenMP* / 2 C/C++ OpenMP* OpenMP* 9.0 1... 2 2... 3 3OpenMP*... 5 3.1... 5 3.2 OpenMP*... 6 3.3 OpenMP*... 8 4OpenMP*... 9 4.1... 9 4.2 OpenMP*... 9 4.3 OpenMP*... 10 4.4... 10 5OpenMP*... 11 5.1

More information

memo

memo 計数工学プログラミング演習 ( 第 3 回 ) 2017/04/25 DEPARTMENT OF MATHEMATICAL INFORMATICS 1 内容 ポインタの続き 引数の値渡しと参照渡し 構造体 2 ポインタで指されるメモリへのアクセス double **R; 型 R[i] と *(R+i) は同じ意味 意味 R double ** ポインタの配列 ( の先頭 ) へのポインタ R[i]

More information

Microsoft PowerPoint - KHPCSS pptx

Microsoft PowerPoint - KHPCSS pptx KOBE HPC サマースクール 2018( 初級 ) 9. 1 対 1 通信関数, 集団通信関数 2018/8/8 KOBE HPC サマースクール 2018 1 2018/8/8 KOBE HPC サマースクール 2018 2 MPI プログラム (M-2):1 対 1 通信関数 問題 1 から 100 までの整数の和を 2 並列で求めなさい. プログラムの方針 プロセス0: 1から50までの和を求める.

More information

fx-9860G Manager PLUS_J

fx-9860G Manager PLUS_J fx-9860g J fx-9860g Manager PLUS http://edu.casio.jp k 1 k III 2 3 1. 2. 4 3. 4. 5 1. 2. 3. 4. 5. 1. 6 7 k 8 k 9 k 10 k 11 k k k 12 k k k 1 2 3 4 5 6 1 2 3 4 5 6 13 k 1 2 3 1 2 3 1 2 3 1 2 3 14 k a j.+-(),m1

More information

1st-session key

1st-session key 1 2013/11/29 Project based Learning: Soccer Agent Program 1 2012/12/9 Project based Learning: Soccer Agent Program PBL Learning by doing Schedule 1,2 2013 11/29 Make 2013 12/6 2013 12/13 2013 12/20 2014

More information

2008 ( 13 ) C LAPACK 2008 ( 13 )C LAPACK p. 1

2008 ( 13 ) C LAPACK 2008 ( 13 )C LAPACK p. 1 2008 ( 13 ) C LAPACK LAPACK p. 1 Q & A Euler http://phase.hpcc.jp/phase/mppack/long.pdf KNOPPIX MT (Mersenne Twister) SFMT., ( ) ( ) ( ) ( ). LAPACK p. 2 C C, main Asir ( Asir ) ( ) (,,...), LAPACK p.

More information

9 8 7 (x-1.0)*(x-1.0) *(x-1.0) (a) f(a) (b) f(a) Figure 1: f(a) a =1.0 (1) a 1.0 f(1.0)

9 8 7 (x-1.0)*(x-1.0) *(x-1.0) (a) f(a) (b) f(a) Figure 1: f(a) a =1.0 (1) a 1.0 f(1.0) E-mail: takio-kurita@aist.go.jp 1 ( ) CPU ( ) 2 1. a f(a) =(a 1.0) 2 (1) a ( ) 1(a) f(a) a (1) a f(a) a =2(a 1.0) (2) 2 0 a f(a) a =2(a 1.0) = 0 (3) 1 9 8 7 (x-1.0)*(x-1.0) 6 4 2.0*(x-1.0) 6 2 5 4 0 3-2

More information

3.1 stdio.h iostream List.2 using namespace std C printf ( ) %d %f %s %d C++ cout cout List.2 Hello World! cout << float a = 1.2f; int b = 3; cout <<

3.1 stdio.h iostream List.2 using namespace std C printf ( ) %d %f %s %d C++ cout cout List.2 Hello World! cout << float a = 1.2f; int b = 3; cout << C++ C C++ 1 C++ C++ C C++ C C++? C C++ C *.c *.cpp C cpp VC C++ 2 C++ C++ C++ [1], C++,,1999 [2],,,2001 [3], ( )( ),,2001 [4] B.W. /D.M.,, C,,1989 C Web [5], http://kumei.ne.jp/c_lang/ 3 Hello World Hello

More information

nakao

nakao Fortran+Python 4 Fortran, 2018 12 12 !2 Python!3 Python 2018 IEEE spectrum https://spectrum.ieee.org/static/interactive-the-top-programming-languages-2018!4 Python print("hello World!") if x == 10: print

More information

ECCS. ECCS,. ( 2. Mac Do-file Editor. Mac Do-file Editor Windows Do-file Editor Top Do-file e

ECCS. ECCS,. (  2. Mac Do-file Editor. Mac Do-file Editor Windows Do-file Editor Top Do-file e 1 1 2015 4 6 1. ECCS. ECCS,. (https://ras.ecc.u-tokyo.ac.jp/guacamole/) 2. Mac Do-file Editor. Mac Do-file Editor Windows Do-file Editor Top Do-file editor, Do View Do-file Editor Execute(do). 3. Mac System

More information

ohp08.dvi

ohp08.dvi 19 8 ( ) 2019.4.20 1 (linked list) ( ) next ( 1) (head) (tail) ( ) top head tail head data next 1: 2 (2) NULL nil ( ) NULL ( NULL ) ( 1 ) (double linked list ) ( 2) 3 (3) head cur tail head cur prev data

More information

{:from => Java, :to => Ruby } Java Ruby KAKUTANI Shintaro; Eiwa System Management, Inc.; a strong Ruby proponent http://kakutani.com http://www.amazon.co.jp/o/asin/4873113202/kakutani-22 http://www.amazon.co.jp/o/asin/477413256x/kakutani-22

More information

C ( ) C ( ) C C C C C 1 Fortran Character*72 name Integer age Real income 3 1 C mandata mandata ( ) name age income mandata ( ) mandat

C ( ) C ( ) C C C C C 1 Fortran Character*72 name Integer age Real income 3 1 C mandata mandata ( ) name age income mandata ( ) mandat C () 14 5 23 C () C C C C C 1 Fortran Character*72 name Integer age Real income 3 1 C 1.1 3 7 mandata mandata () name age income mandata () mandata1 1 #include struct mandata char name[51];

More information

For_Beginners_CAPL.indd

For_Beginners_CAPL.indd CAPL Vector Japan Co., Ltd. 目次 1 CAPL 03 2 CAPL 03 3 CAPL 03 4 CAPL 04 4.1 CAPL 4.2 CAPL 4.3 07 5 CAPL 08 5.1 CANoe 5.2 CANalyzer 6 CAPL 10 7 CAPL 11 7.1 CAPL 7.2 CAPL 7.3 CAPL 7.4 CAPL 16 7.5 18 8 CAPL

More information

L C -6D Z3 L C -0D Z3 3 4 5 6 7 8 9 10 11 1 13 14 15 16 17 OIL CLINIC BAR 18 19 POWER TIMER SENSOR 0 3 1 3 1 POWER TIMER SENSOR 3 4 1 POWER TIMER SENSOR 5 11 00 6 7 1 3 4 5 8 9 30 1 3 31 1 3 1 011 1

More information

WinHPC ppt

WinHPC ppt MPI.NET C# 2 2009 1 20 MPI.NET MPI.NET C# MPI.NET C# MPI MPI.NET 1 1 MPI.NET C# Hello World MPI.NET.NET Framework.NET C# API C# Microsoft.NET java.net (Visual Basic.NET Visual C++) C# class Helloworld

More information

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1

線形代数演算ライブラリBLASとLAPACKの 基礎と実践1 1 / 50 BLAS LAPACK 1, 2015/05/21 CMSI A 2 / 50 BLAS LAPACK (I) BLAS, LAPACK BLAS : - LAPACK : 3 / 50 ( ) 1000 ( ; 1 2 ) :... 3 / 50 ( ) 1000 ( ; 1 2 ) :... 3 / 50 ( ) 1000 ( ; 1 2 ) :... 3 / 50 ( ) 1000

More information

XcalableMP入門

XcalableMP入門 XcalableMP 1 HPC-Phys@, 2018 8 22 XcalableMP XMP XMP Lattice QCD!2 XMP MPI MPI!3 XMP 1/2 PCXMP MPI Fortran CCoarray C++ MPIMPI XMP OpenMP http://xcalablemp.org!4 XMP 2/2 SPMD (Single Program Multiple Data)

More information

Page 1 of 6 B (The World of Mathematics) November 20, 2006 Final Exam 2006 Division: ID#: Name: 1. p, q, r (Let p, q, r are propositions. ) (10pts) (a

Page 1 of 6 B (The World of Mathematics) November 20, 2006 Final Exam 2006 Division: ID#: Name: 1. p, q, r (Let p, q, r are propositions. ) (10pts) (a Page 1 of 6 B (The World of Mathematics) November 0, 006 Final Exam 006 Division: ID#: Name: 1. p, q, r (Let p, q, r are propositions. ) (a) (Decide whether the following holds by completing the truth

More information

Java演習(4) -- 変数と型 --

Java演習(4)   -- 変数と型 -- 50 20 20 5 (20, 20) O 50 100 150 200 250 300 350 x (reserved 50 100 y 50 20 20 5 (20, 20) (1)(Blocks1.java) import javax.swing.japplet; import java.awt.graphics; (reserved public class Blocks1 extends

More information