GPU Computing on Business


1 GPU Computing on Business 2010 Numerical Technologies Incorporated

10 GPU Computing [Chart: $$$ against Quantity, with Revenue and Total Cost lines; annotation: low BEP (break-even point)]


16 GPU Computing [Chart: $$$ against Quantity, with Revenue and Total Cost lines; annotation: high BEP (break-even point)]

18 CUDA C/C++ Perl PHP Python Java C#

20 GPU

25 (NtParallel DLL) NtParallel DLL NVIDIA Tesla driver

26 CUDA (NtParallel DLL)

27 CUDA (NtParallel DLL) NtParallel DLL NVIDIA Tesla driver

28 GPU nt_parallel_for

bool nt_parallel_for(
    for-loop body (string),
    for-loop iteration count (int),
    for-loop argument 1 (void*),
    for-loop argument 2 (void*),
    ...
)

29

// Black Scholes Option Formula Batch Processing Demo.
void batch_black_scholes_pricer(
    int array_size,
    double* o_data, double* r_data, double* sigma_data,
    double* s_data, double* k_data, double* t_data
) {
    // for-loop program.
    for (int i = 0; i < array_size; ++i) {
        double R = r_data[i];
        double Sigma = sigma_data[i];
        double S = s_data[i];
        double K = k_data[i];
        double T = t_data[i];
        double rt = R * T;
        double sigmasqrtt = Sigma * sqrt(T);
        double d = (log(S / K) + rt) / sigmasqrtt + sigmasqrtt * 0.5;
        o_data[i] = S * NormDist(d) - K * NormDist(d - sigmasqrtt) * exp(- rt);
    }
    // now you have output in o_data.
}
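The loop body on this slide calls NormDist, which the slides never define. Below is a minimal C++ sketch of one way to make the formula self-contained. It assumes NormDist is the standard normal cumulative distribution function and builds it on std::erfc. The helper names black_scholes_call and price_batch are hypothetical and do not come from the slides.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal cumulative distribution function.
// Assumption: the slides do not show NormDist; defining it through
// std::erfc is one common choice, not necessarily the author's.
double NormDist(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Price of one European call option, using the same arithmetic
// as the for-loop body on the slide (hypothetical helper name).
double black_scholes_call(double R, double Sigma, double S, double K, double T) {
    assert(Sigma > 0.0 && T > 0.0 && S > 0.0 && K > 0.0);
    const double rt = R * T;
    const double sigmaSqrtT = Sigma * std::sqrt(T);
    const double d = (std::log(S / K) + rt) / sigmaSqrtT + sigmaSqrtT * 0.5;
    return S * NormDist(d) - K * NormDist(d - sigmaSqrtT) * std::exp(-rt);
}

// Serial batch pricer over equally sized input arrays (hypothetical helper).
std::vector<double> price_batch(const std::vector<double>& r,
                                const std::vector<double>& sigma,
                                const std::vector<double>& s,
                                const std::vector<double>& k,
                                const std::vector<double>& t) {
    assert(r.size() == sigma.size() && r.size() == s.size() &&
           r.size() == k.size() && r.size() == t.size());
    std::vector<double> out(r.size());
    for (std::size_t i = 0; i < r.size(); ++i) {
        out[i] = black_scholes_call(r[i], sigma[i], s[i], k[i], t[i]);
    }
    return out;
}
```

As a sanity check, an at-the-money call with R = 0.05, Sigma = 0.2, S = K = 100 and T = 1 should price at roughly 10.45.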

30 for (the slide repeats the batch_black_scholes_pricer listing from slide 29, with attention on its for-loop)

31 for-loop body rewritten as a code string:

...
for (int i = 0; i < array_size; ++i) {
    double R = r_data[i];
    double Sigma = sigma_data[i];
    double S = s_data[i];
    double K = k_data[i];
    double T = t_data[i];
    double rt = R * T;
    double sigmasqrtt = Sigma * sqrt(T);
    double d = (log(S / K) + rt) / sigmasqrtt + sigmasqrtt * 0.5;
    o_data[i] = S * NormDist(d) - K * NormDist(d - sigmasqrtt) * exp(- rt);
}

string code = [](int i, double R, double Sigma, double S, double K, double T) => double {
    double rt = R * T;
    double sigmasqrtt = Sigma * sqrt(T);
    double d = (log(S / K) + rt) / sigmasqrtt + sigmasqrtt * 0.5;
    return S * NormDist(d) - K * NormDist(d - sigmasqrtt) * exp(- rt);
} ;
...

32

// Black Scholes Option Formula Batch Processing Demo.
void batch_black_scholes_pricer(
    int array_size,
    double* o_data, double* r_data, double* sigma_data,
    double* s_data, double* k_data, double* t_data
) {
    // for-loop program.
    string code = [](int i, double R, double Sigma, double S, double K, double T) => double {
        double rt = R * T;
        double sigmasqrtt = Sigma * sqrt(T);
        double d = (log(S / K) + rt) / sigmasqrtt + sigmasqrtt * 0.5;
        return S * NormDist(d) - K * NormDist(d - sigmasqrtt) * exp(- rt);
    } ;
    // call the GPU.
    nt_parallel_for(code, array_size, o_data, r_data, sigma_data, s_data, k_data, t_data);
    // now you have output in o_data.
}
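The call on this slide suggests a simple mapping: for each index i, the i-th element of every input array is passed to the code as a scalar, and the result is stored in the i-th output element. The sketch below emulates that reading serially on the CPU in plain C++. It is not the NtParallel API, and the name parallel_for_reference is hypothetical. It only illustrates the semantics the slides appear to describe, under the assumption that every array holds doubles.

```cpp
// Serial CPU stand-in for the calling pattern shown on the slide.
// Hypothetical helper, NOT the NtParallel DLL: it applies body(i, a[i], b[i], ...)
// for every i in [0, n) and writes the result to out[i].
template <typename Body, typename... Args>
bool parallel_for_reference(Body body, int n, double* out, const Args*... args) {
    if (n < 0 || out == nullptr) {
        return false;  // reject invalid input instead of touching memory
    }
    for (int i = 0; i < n; ++i) {
        // Each iteration reads only its own inputs and writes only out[i],
        // so the iterations are independent of each other.
        out[i] = body(i, args[i]...);
    }
    return true;
}

// Example use: element-wise sum of two arrays plus the index (hypothetical demo).
inline bool demo_sum(int n, double* out, const double* a, const double* b) {
    return parallel_for_reference(
        [](int i, double x, double y) { return x + y + i; }, n, out, a, b);
}
```

Because no iteration depends on another, the loop order is irrelevant. That independence is the property that lets such a loop be handed to a parallel device.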

33 Excel GPU code is here!

34 10

35 GPU CPU GPU

36 NVIDIA NVIDIA Tesla driver NtParallel DLL Other company's driver SSE, AVX

37 Scientific / Wall St. / Boring Accounting Stuff
1. Mandelbrot
2. Kirkwood Gaps
3. Wavelet Analysis
4. Binomial Tree Option Model
5. Black Scholes Option Model
6. Housing Loan Calculation

38 : Mandelbrot

39 x98: Kirkwood Gaps

40 x29: Wavelet Analysis

41 x7: Binomial Tree Option Model

42 x5: Black Scholes Option Model

43 x40: 35

46 GPU Computing

47 GPU API nvcc

48 nt_parallel_for shared nothing CUDA

52-55 [Chart, built up over four slides: Relative Performance vs. for-loop Iterations. Axes: Acceleration Ratio, Iterations. Labels on slide 55: C/C++; C#, Java, Perl, Python; GPU]

57 for-loop Iterations Tesla GPU

58 Complexity Ops/Bytes

59 2 Complexity for-loop Iterations

60 [Chart: Complexity against for-loop Iterations. Labels: GPU; x5, x10, x50 faster]

61 [Same chart. Labels: x5, x10, x50 faster; GPU has advantage; CPU has advantage; other values: 2, 8, 10]

62 [Same chart. Labels: x5, x10, x50 faster; Housing Loan]

63 [Same chart. Labels: x10; x5, x10, x50 faster; GPU has advantage; Break Even Point; Housing Loan; CPU has advantage]

64 [Same chart. Labels: x5, x10, x50 faster; Binomial Tree]

65 [Same chart. Labels: x5, x10, x50 faster; GPU resource exhausted; Binomial Tree]

66 [Same chart. Labels: x5, x10, x50 faster; Mandelbrot, Kirkwood, Housing Loan, Wavelet, Binomial Tree, Black Scholes]

67 GPU x5: Black Scholes; x40: Housing Loan

68 Complexity

x40: Housing Loan

...
string code = [] (int i, double annualizedrate, int term, double monthlypayment) => double {
    const int t = term;
    const double m = monthlypayment;
    const double monthlydf = 1.0 / (1.0 + annualizedrate / 12.0);
    int elapsedmonth;
    double val = 0.0;
    double df = 1.0;
    for (elapsedmonth = 0; elapsedmonth < t; elapsedmonth = elapsedmonth + 1) {
        df = df * monthlydf;
        val = val + m * df;
    }
    return val;
} ;
...

x5: Black Scholes

...
string code = [](int i, double R, double Sigma, double S, double K, double T) => double {
    double rt = R * T;
    double sigmasqrtt = Sigma * sqrt(T);
    double d = (log(S / K) + rt) / sigmasqrtt + sigmasqrtt * 0.5;
    return S * NormDist(d) - K * NormDist(d - sigmasqrtt) * exp(- rt);
} ;
...
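The housing loan code does more arithmetic per input byte than the Black Scholes code because it runs an inner loop of term iterations for every element. A minimal C++ sketch of the same calculation follows, assuming the code computes the present value of level monthly payments. The function names are hypothetical. The second function is not on the slides: it uses the closed-form geometric series sum only as a cross-check of the loop.

```cpp
#include <cmath>

// Present value of `term` monthly payments, accumulated month by month
// exactly as in the housing loan code string (hypothetical function name).
double loan_value_loop(double annualizedRate, int term, double monthlyPayment) {
    const double monthlyDf = 1.0 / (1.0 + annualizedRate / 12.0);
    double val = 0.0;
    double df = 1.0;
    for (int elapsedMonth = 0; elapsedMonth < term; ++elapsedMonth) {
        df = df * monthlyDf;            // discount factor for this month
        val = val + monthlyPayment * df;
    }
    return val;
}

// Cross-check, not from the slides: the loop sums a geometric series
// m * (q + q^2 + ... + q^t) with q = monthlyDf, which equals
// m * q * (1 - q^t) / (1 - q) whenever q != 1.
double loan_value_closed_form(double annualizedRate, int term, double monthlyPayment) {
    if (annualizedRate == 0.0) {
        return monthlyPayment * term;   // q == 1: every payment counts in full
    }
    const double q = 1.0 / (1.0 + annualizedRate / 12.0);
    return monthlyPayment * q * (1.0 - std::pow(q, term)) / (1.0 - q);
}
```

The loop form is the one that matters for the complexity argument: its cost grows with term, while the amount of data read per element stays fixed at three values.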

69 Q. GPU Complexity Iterations

70 A. IFRS ECF Complexity Housing Loan

71 [Chart: Complexity against for-loop Iterations. Labels: x5, x10, x50 faster; Mandelbrot, Kirkwood, Housing Loan, Wavelet, Binomial Tree, Black Scholes; Annuities, IFRS ECF depletion models (estimated)]

72 [Same chart. Additional labels: Iterations x10; GPU; GPU has advantage; Break Even Point; CPU has advantage]

73 Rule of Thumb...

74 nt_parallel_for
