IPSJ SIG Technical Report Vol.2014-ARC-213 No.24 Vol.2014-HPC-147 No /12/10


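GPU 1,a) 1,b) 1,c) 1,d)

1 Tokyo Institute of Technology, Meguro, Tokyo 152-8550, Japan
a) watanabe@sim.gsic.titech.ac.jp
b) taoki@gsic.titech.ac.jp
c) tsuzuki@sim.gsic.titech.ac.jp
d) shimokawabe@sim.gsic.titech.ac.jp

© 2014 Information Processing Society of Japan

GPU 1 GPU Structure Of Array Array Of Structure

1.

MPS (Moving Particle Semi-Implicit) [1], SPH (Smoothed Particle Hydrodynamics) [2], DEM (Distinct Element Method) [3] [4]. CPU 1core, GPU (Graphics Processing Unit) [5] [6] [7] [8]. GPU [9] [10]. GPU SPH DEM GPU 1 GPU GPU DEM GPU [11] [12]. Array Of Structure, Structure Of Array.

TSUBAME2.5, GPU: NVIDIA Tesla K20X, CUDA Version 6.

2. DEM (Distinct Element Method)

In DEM, the contact between particles $i$ and $j$ is modeled with a spring, a dashpot, and a friction slider (Fig. 1). The normal force $F^{N}_{ij}$ and the tangential force $F^{T}_{ij}$ are

$$ F^{N}_{ij} = k^{N} \Delta L^{N}_{ij} + c^{N} \frac{\Delta L^{N}_{ij}}{\Delta t} \tag{1} $$

$$ F^{T}_{ij} = k^{T} \Delta L^{T}_{ij} + c^{T} \frac{\Delta L^{T}_{ij}}{\Delta t} \tag{2} $$

where $\Delta L$ is the displacement, $\Delta t$ the time step, $k$ the spring constant, and $c$ the damping coefficient; the superscripts $N$ and $T$ denote the normal and tangential directions. With $R_i$ the radius of particle $i$, the torque $M_{ij}$ is

$$ M_{ij} = \frac{R_i}{|x_j - x_i|} (x_j - x_i) \times F^{T}_{ij} \tag{3} $$

The translational and rotational motion of particle $i$ follows

$$ m_i \frac{d^2 x_i}{dt^2} = \sum_{i \neq j} \left[ F_{ij} \right] + m_i g \tag{4} $$

$$ I_i \frac{d^2 \theta_i}{dt^2} = \sum_{i \neq j} \left[ M_{ij} \right] \tag{5} $$

where $x_i$, $\theta_i$, $m_i$, and $I_i$ are the position, rotation angle, mass, and moment of inertia of particle $i$.

Fig. 1 DEM interaction model. (Labels: Spring, Dashpot, Friction slider; Normal Direction, Tangential Direction.)

Fig. 2 Collisions among contacting particles.

DEM N N-1 N DEM DEM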
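The spring-dashpot contact model and the equation of motion above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the repulsive sign convention, the use of the particle overlap as the spring displacement, and the semi-implicit Euler update are all assumptions made here, and only the normal force is shown.

```python
import math

def normal_contact_force(xi, xj, vi, vj, radius_i, radius_j, k_n, c_n):
    """Normal force on particle i from particle j (zero vector if not touching).

    Assumed convention: the spring acts on the overlap and pushes i away
    from j; the dashpot acts on the relative normal velocity.
    """
    dx = [b - a for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(d * d for d in dx))
    overlap = radius_i + radius_j - dist
    if overlap <= 0.0 or dist == 0.0:
        return [0.0, 0.0, 0.0]
    n = [d / dist for d in dx]  # unit vector pointing from i to j
    # Relative velocity of j with respect to i along n (negative when approaching).
    v_rel_n = sum((b - a) * c for a, b, c in zip(vi, vj, n))
    magnitude = k_n * overlap - c_n * v_rel_n
    return [-magnitude * c for c in n]

def step(x, v, mass, force, gravity, dt):
    """Advance one particle by dt: update velocity first, then position."""
    v_new = [vc + (fc / mass + gc) * dt for vc, fc, gc in zip(v, force, gravity)]
    x_new = [xc + vc * dt for xc, vc in zip(x, v_new)]
    return x_new, v_new
```

In a full simulation the force on each particle would be summed over all of its contacts before `step` is called, which is why finding the contacting neighbors efficiently matters.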

Fig. 3 Neighboring cell list.

Fig. 4 Neighboring cell list using linked-list method.

GPU CUDA atomic atomicExch

Fig. 5 Neighboring cell list using hash method. (Arrays shown: hash, index, sort, start.)

hash index GPU Thrust sort by key start

For particle $i$, the radii $r_c$ and $R_c$ (Fig. 6) are related by

$$ R_c = r_c + \alpha D_{\min} \tag{6} $$

with $D_{\min}$ and the parameter $\alpha$.
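The two neighboring-cell-list layouts named in the figure captions can be illustrated with a small serial sketch. It assumes a 2D uniform grid and hypothetical helper names. On a GPU the head-pointer swap in the linked-list build would be a single atomic exchange (`atomicExch`), and the sort would be a Thrust sort by key; here both are replaced by plain Python so the data structures are easy to inspect.

```python
def cell_index(pos, cell_size, grid_dim):
    """Flattened cell id of a 2D position (used as the hash value)."""
    cx = int(pos[0] // cell_size)
    cy = int(pos[1] // cell_size)
    return cy * grid_dim[0] + cx

def build_linked_list(positions, cell_size, grid_dim):
    """Linked-list method: head[c] is the last particle put in cell c,
    nxt[i] is the particle stored before i in the same cell (-1 = end)."""
    head = [-1] * (grid_dim[0] * grid_dim[1])
    nxt = [-1] * len(positions)
    for i, p in enumerate(positions):
        c = cell_index(p, cell_size, grid_dim)
        nxt[i] = head[c]  # a GPU version would do this swap atomically
        head[c] = i
    return head, nxt

def build_hash_sorted(positions, cell_size, grid_dim):
    """Hash method: sort particle indices by cell hash, then record where
    each cell's run begins in the sorted order (-1 = empty cell)."""
    hashes = [cell_index(p, cell_size, grid_dim) for p in positions]
    order = sorted(range(len(positions)), key=lambda i: hashes[i])
    sorted_hash = [hashes[i] for i in order]
    start = [-1] * (grid_dim[0] * grid_dim[1])
    for slot, h in enumerate(sorted_hash):
        if slot == 0 or h != sorted_hash[slot - 1]:
            start[h] = slot
    return sorted_hash, order, start

def particles_in_cell_linked(head, nxt, c):
    """Walk the chain of cell c."""
    out, i = [], head[c]
    while i != -1:
        out.append(i)
        i = nxt[i]
    return out

def particles_in_cell_sorted(sorted_hash, order, start, c):
    """Read the contiguous run of cell c from the sorted arrays."""
    out, slot = [], start[c]
    while slot != -1 and slot < len(order) and sorted_hash[slot] == c:
        out.append(order[slot])
        slot += 1
    return out
```

The linked-list version follows a chain of scattered indices, while the hash version reads one contiguous run per cell, at the price of a sort whenever the list is rebuilt.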

Fig. 6 Book-keeping method. (Radii shown: $r_c$, $R_c$.)

At every time step the accumulated displacement $\Delta x_{book}$ is advanced by

$$ \Delta x_{book} \mathrel{+}= v_{\max} \Delta t \tag{7} $$

When $\Delta x_{book}$ exceeds $(R_c - r_c)/2$, the list is updated and $\Delta x_{book}$ is reset.

Fig. 7 Combination method of neighboring cell list with book-keeping method.

6. GPU

Array Of Structure (AOS), Structure Of Array (SOA). AOS SOA DEM 1 AOS SOA 8 SOA AOS 91% 6% 1% 7% AOS SOA SOA AOS 7 SOA
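The book-keeping idea (keep a candidate list built with the enlarged radius of Eq. (6), and rebuild it only when the bound of Eq. (7) says a particle could have crossed the margin) can be sketched as follows. The class and function names are hypothetical, the rebuild threshold of half the margin is the usual Verlet-list criterion assumed here, and the brute-force pair search stands in for the cell-list search that the paper combines with this method.

```python
import math

def build_pair_list(positions, big_radius):
    """All pairs closer than big_radius (brute force, for illustration only)."""
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < big_radius:
                pairs.append((i, j))
    return pairs

class BookKeeping:
    def __init__(self, r_c, alpha, d_min):
        self.r_c = r_c
        self.big_r = r_c + alpha * d_min  # Eq. (6)
        self.travelled = 0.0              # accumulated displacement bound
        self.pairs = []
        self.rebuilds = 0

    def rebuild(self, positions):
        """Rebuild the candidate list and reset the displacement bound."""
        self.pairs = build_pair_list(positions, self.big_r)
        self.travelled = 0.0
        self.rebuilds += 1

    def advance(self, positions, v_max, dt):
        """Call once per time step, after the particles have moved."""
        self.travelled += v_max * dt      # Eq. (7)
        if self.travelled > 0.5 * (self.big_r - self.r_c):
            self.rebuild(positions)
```

Between rebuilds, the interaction step only loops over `pairs` and tests each candidate against the true interaction radius, so the neighbor search cost is paid once every several steps instead of every step.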

Table 1 Physical condition.
Number of particles: ,
Size of calculating area: . 1.. [m^3]
Particle diameter: 1. 1 [m]
Particle mass: . [kg]
Spring constant (Normal): . 1 [N/m]
Spring constant (Tangential): . 1 [N/m]
Damping coefficient (Normal): . 1 1 [Ns/m]
Damping coefficient (Tangential): . [Ns/m]
Coefficient of friction: .
Discrete time: 1. 1 [sec]

Fig. 8 Comparison between Array of Structure and Structure of Array. (Vertical axis: Time [sec]; cases: AOS hash + book-keeping, AOS linked-list + book-keeping, SOA hash + book-keeping, SOA linked-list + book-keeping; components: interaction, book-keeping, update.)

Fig. 9 Initial condition of breaking dam problem for performance test.

Table 2 Physical condition.
Number of particles: 1,,
Size of calculating area: ... [m^3]
Particle diameter: 1. 1 [m]
Particle mass: 1. 1 [kg]
Spring constant (Normal): . 1 [N/m]
Spring constant (Tangential): 1.6 1 [N/m]
Damping coefficient (Normal): . [Ns/m]
Damping coefficient (Tangential): . [Ns/m]
Coefficient of friction: .
Discrete time: . 1 6 [sec]

7.

7.1 DEM 1 9 9 6, 9,. sec 1.89 sec 1.7 1.%.% 7. 7.
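The difference between the two memory layouts compared in Fig. 8 comes down to index arithmetic. The sketch below uses hypothetical field names (x, y, z, r) and Python's `array` module; it shows only how the same field is addressed in each layout and says nothing about speed. On a GPU, the point of SOA is that neighboring threads reading the same field touch neighboring addresses.

```python
from array import array

FIELDS = ('x', 'y', 'z', 'r')  # assumed per-particle fields

def make_aos(n):
    """Array Of Structure: one buffer, fields interleaved per particle:
    [x0, y0, z0, r0, x1, y1, z1, r1, ...]."""
    return array('f', [0.0] * (len(FIELDS) * n))

def make_soa(n):
    """Structure Of Array: one contiguous buffer per field."""
    return {name: array('f', [0.0] * n) for name in FIELDS}

def shift_x_aos(aos, dx):
    """Update every x: consecutive particles are len(FIELDS) elements apart."""
    stride = len(FIELDS)
    for i in range(len(aos) // stride):
        aos[stride * i] += dx

def shift_x_soa(soa, dx):
    """Update every x: consecutive particles are adjacent in memory."""
    xs = soa['x']
    for i in range(len(xs)):
        xs[i] += dx
```

Calling `shift_x_aos(make_aos(3), 1.5)` changes elements 0, 4, and 8 of the interleaved buffer, whereas `shift_x_soa` changes elements 0, 1, and 2 of the `x` buffer.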

Table 3 Calculation time of linked-list method and hash method (unit: sec).
Columns: interaction, linked-list, hash, update, total
linked-list: 19.1. 19. 197.7
hash: 1998.1 1.89 18.77 8.9

α α (6) α.. 1 α α N α =. 9. sec 1% 1 7. α α =. 1 1 1

Fig. 10 How the update frequency affects calculation time in book-keeping method. (Horizontal axis: Parameter α; vertical axis: Time [sec]; components: interaction, book-keeping.)

α =. 11 1 α α 1 α =.1 α.1 α.1 α.1 7%
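A worked relation may help when reading the sweep over α. Assume the book-keeping list is rebuilt once the accumulated displacement of Eq. (7) exceeds $(R_c - r_c)/2$, and that $v_{\max}$ stays roughly constant between rebuilds. Both are assumptions made here, and $N_{update}$ is a symbol introduced only for this sketch. The number of time steps between rebuilds then follows from Eq. (6):

$$ N_{update} \approx \frac{R_c - r_c}{2\, v_{\max} \Delta t} = \frac{\alpha D_{\min}}{2\, v_{\max} \Delta t} $$

A larger α therefore means fewer rebuilds, but also a larger $R_c$, so each particle keeps more candidates to test in the interaction step. This is the trade-off between the interaction and book-keeping costs that a sweep over α explores.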

Table 4 Calculation time of book-keeping + linked-list method and book-keeping + hash method (unit: sec).
Columns: interaction, book-keeping, linked-list, hash, update, total
book-keeping + linked-list: 6.8 6.97 1. 18.9 61.
book-keeping + hash: . 81..8 19. 68.9

Fig. 11 How the update frequency affects calculation time in book-keeping + hash method. (Horizontal axis: Parameter α; vertical axis: Time [sec]; components: interaction, book-keeping.)

1% 77%7. 18.6 sec

8.

DEM α α = 1. α =.1 1 1 1 1 1 1.1 18 18 18 66% 1. MB 1.6 11.1 MB. 876. MB Tesla K20X 6 GB 1 1 DEM Tesla K20X DEM SPH MPS DEM

Table 5 Size of memory usage in each neighboring list method (unit: MB).
Columns: particle, book-keeping, linked-list, hash
book-keeping: 7..
linked-list: 7. 8.
hash: 7. 6.
book-keeping + linked-list: 7. 6. 68.
book-keeping + hash: 7. 6. 8.1

Fig. 12 Breaking dam simulation using million particles.

SPH MPS DEM 1 DEM SPH MPS

9.

DEM GPU SOA AOS 1 DEM N GPU 1.6. GPU 6 HPC CREST

[1] S. Koshizuka and Y. Oka. Moving-particle semi-implicit method for fragmentation of incompressible fluid. Nuclear Science and Engineering, Vol. 123, No. 3, pp. 421-434, 1996.
[2] Joseph J. Monaghan. An introduction to SPH. Computer Physics Communications, Vol. 48, No. 1, pp. 89-96, 1988.
[3] Peter A. Cundall and Otto D. L. Strack. A discrete numerical model for granular assemblies. Geotechnique, Vol. 29, No. 1, pp. 47-65, 1979.
[4] ,,. GPU., Vol. 76, No. 7, pp. 6 67, 1.
[5] Nicolin Govender, Daniel N. Wilke, Schalk Kok, and Rosanne Els. Development of a convex polyhedral discrete element simulation framework for NVIDIA Kepler based GPUs. Journal of Computational and Applied Mathematics, Vol. 270, pp. 386-400, 2014.
[6] ,. GPU DEM., Vol. 1, No. 17, 1.
[7] Takahiro Harada, Seiichi Koshizuka, and Yoichiro Kawaguchi. Smoothed particle hydrodynamics on GPUs. In Computer Graphics International, pp. 63-70. SBC Petropolis, 2007.
[8] ,,. GPU 1.., Vol. 6, No., pp. 8 9, sep 1.
[9] José M. Domínguez, Alejandro J. C. Crespo, Daniel Valdez-Balderas, Benedict D. Rogers, and Moncho Gómez-Gesteira. New multi-GPU implementation for smoothed particle hydrodynamics on heterogeneous clusters. Computer Physics Communications, Vol. 184, No. 8, pp. 1848-1860, 2013.
[10] ,. GPU. Proceedings of the Conference on Computational Engineering and Science, 19, p.., 1.
[11] ,.., Vol. 1, No. 11, 1.
[12] Simon Green. Particle simulation using CUDA. NVIDIA Whitepaper, December 1, 1.