IPSJ SIG Technical Report Vol.2017-OS-141 No /7/27 GPU Victream I/O 1,2,a) CPU(Central Processing Unit) GPU(Graphics Processing Unit) G

Size: px
Start display at page:

Download "IPSJ SIG Technical Report Vol.2017-OS-141 No /7/27 GPU Victream I/O 1,2,a) CPU(Central Processing Unit) GPU(Graphics Processing Unit) G"

Transcription

1 GPU Victream I/O 1,2,a) CPU(Central Processing Unit) GPU(Graphics Processing Unit) GPU I/O(Input/Output) [1] GPU GPU Out-of-Core GPU I/O Victream GPU I/O Victream State-of-the-Art 117% Performance Evaluation of Cooperative Scheduling of Processing and I/O of Victream GPU Middleware Jun Suzuki 1,2,a) Yuki Hayashi 1 Takuya Araki 1 Takashi Takenaka 1 Masaru Kitsuregawa 2 1. GPU CPU NVIDIA CUDA GPU GPU GPU GPU CPU GPU I/O 1 2 CPU GPU GPU CPU 10 GPU I/O PCIe 3.0 x 16 1 NEC System Platform Research Laboratories, NEC 2 Institute of Industrial Science, the University of Tokyo a) I/O 16 GB/s GPU 10 1 CPU GPU I/O GPU GPU I/O I/O CPU 2 GPU 10GB 2 GPU GPU Out-of-Core Out-of-Core GPU I/O GPU GPU I/O [1] GPU Out-of-Core GPU I/O Victream 1

2 1 Intel Xeon E v2 [5]. Single-precision Floating Point Performance 1.8 Tflops Memory Bandwidth 85 GB/s 2 NVIDIA Tesla P100 GPU [6]. Single-precision Floating Point Performance 9.3 Tflops Memory Size 16 GB Memory Bandwidth 732 GB/s I/O Bus PCIe 3.0 x 16 1 Victream. Victream API(Application Programming Interface) API DAG(Directed Acyclic Graph) DAG Victream GPU DAG GPU I/O GPU I/O DAG DAG Dryad[2] Spark[3] GPU PTask[4] Victream GPU Out-of-Core Victream I/O GPU I/O GPU I/O GPU I/O [1] Victream GPU I/O State-of-the-Art Victream 117% 2 Victream 3 Victream 4 Victream 5 2. Victream Victream Spark API GPU Victream CPU GPU C++ Victeam DAG Victream 1 Victream Victream Victream API API DAG RPC(Remote Procedure Call) DAG DAG I/O GPU DAG UDF(User-Defined Function) GRDD(GPU Resilient Distributed Dataset) GRDD GPU UDF GRDD UDF GRDD DAG GRDD GPU Victream GPU GPU I/O GPU I/O GPU Out-of-Core I/O Victream GRDD GRDD / Key-Value 4 GPU I/O DAG Victream GPU GRDD 1 GRDD NVM(Nonvolatile Memory) Express 2

3 Card Victream [1] 3. Victream 3.1 Victream DAG Victream I/O GPU DAG Victream DAG DAG Victream DAG Victream GPU Out-of-Core GPU GPU I/O GPU DAG I/O DAG Victream GPU I/O GPU GPU Sundaram [7] GPU Out-of-Core I/O GPU I/O GPU GPU I/O GPU I/O GPU I/O I/O NP-hard Pseudo-Boolean (PB) Optimization GPU PB Optimization Victream GPU I/O DAG ( GPU ) API API I/O 2 (1)GPU I/O 2 GPU. 3 DAG. I/O (2)GPU I/O 2 DAG Victream GPU Out-of-Core (1) (2) I/O DAG DAG [4] Out-of-Core I/O 2 3 I/O DAG 1 GPU GPU I/O GPU 4 GPU 3 DAG GPU 1 GPU I/O GPU

4 1 1 GPU GPU GPU GPU GPU I/O 1 5 Victream GPU Victream (1) GPU I/O (2) GPU I/O Victream 2 GPU GPU 2 GPU Victream 2 GPU GPU GPU GPU GPU GPU A 5 5 GPU A 5 GPU GPU I/O GPU A I/O GPU A GPU 9 GPU A ( 5) 5 GPU 9 5 GPU 9 GPU DAG 9 5 GPU A 5 GPU A I/O GPU A I/O 9 GPU I/O 9 GPU I/O I/O Victream GPU I/O GPU GPU GPU FIFO(First-In First-Out) GPU FIFO GPU FIFO I/O I/O GPU I/O I/O GPU FIFO FIFO DAG GPU I/O GPU GPU I/O FIFO GPU I/O I/O, I/O GPU DAG I/O 4

5 GPU Victream 2 9 Victream 9 DAG 5 5 I/O Victream GPU I/O GPU I/O Victream GPU Victream GPU I/O GPU GPU GPU Out-of-Core GPU I/O ( ) Victream GPU Out-of-Core I/O I/O GPU I/O I/O GPU I/O I/O Victream 2 DAG DAG I/O Victream GPU I/O 4 I/O 4. subtask get_next_subtask(gpu) { glob_min = iomin_subtask(global_list); local_min = iomin_subtask(local_list[gpu]); if(glob_min < local_min) { remove(global_list, glob_min); return glob_min; } else { remove(local_list[gpu], local_min); return local_min; }} void schedule() { foreach(g in available_gpu) { if(size(global_list) > 0 size(local_list[g]) > 0) { if(memory_use[g] < load_threashold) { st = get_next_subtask(g); pipeline_dispatch(st,g); } }}} 5. 2 DAG GPU GPU 2 GPU GPU DAG GPU 1 GPU GPU DAG GPU GPU GPU GPU GPU 5

6 Victream GPU GPU I/O GPU 4 I/O I/O GPU I/O GPU DAG GPU GPU I/O I/O GPU I/O I/O I/O GPU I/O I/O Victream I/O GPU I/O GPU I/O I/O I/O Victream I/O DAG GPU DAG GPU Victream I/O GPU Victream LRU(Least Recently Used) GPU I/O I/O GPU Victream GPU I/O GPU 4 FIFO I/O I/O GPU I/O I/O GPU GPU I/O Victream GPU I/O GPU GPU I/O GPU I/O GPU GPU 5 GPU Out-of-Core GPU I/O 3.3 I/O I/O GPU GPU GPU FIFO GPU NVIDIA Tesla K20 GPU I/O GPU GPU 5GB 3.52 Tflops E Xeon CPU 2 6

7 OS Ubuntu DAG RAMdisk Victream C++ CUDA K Victream FIFO PTask[4] Data-Aware FIFO GPU GPU GPU I/O PTask FIFO GPU GPU GPU GPU Out-of-Core FIFO GPU Data-Aware GPU I/O FIFO PTask I/O 4.2 (Blur ) 4 Victream API Victream Out-of-Core 4 2 GPU GPU GPU Out-of-Core N GPU 1 GPU N 2 256MB GPU 70% 50% GPU 6 Victream FIFO PTask PTask 92%-117% GPU Victream GPU FIFO PTask GPU GPU 6 4 GPU I/O DAG I/O Out-of-Core 9%-38% Blur GPU RAMdisk I/O Victream I/O 4 7 Out-of-Core Out-of-Core Victream PTask Out-of-Core Victream Out-of-Core 7

8 (a) (b) (c) Blur (d) 6. (a) (b) (c) Blur (d) 7. Ptask FIFO PTask GPU I/O FIFO 2 Victream Out-of-Core GPU I/O 5. [1] GPU Out-of-Core I/O Victream Stateof-the-Art Victream Victream DAG GPU I/O GPU I/O GPU I/O State-of-the-Art 117% 38% [1] Victream 2016 / / (SWoPP2016) (2016). [2] Isard, M., Budiu, M., Yu, Y., Birrell, A. and Fetterly, D.: Dryad: distributed data-parallel programs from sequential building blocks, ACM SIGOPS Operating Systems Review, Vol. 41, No. 3, ACM, pp (2007). [3] Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M. J., Shenker, S. and Stoica, I.: Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing, Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, USENIX Association, pp. 2 2 (2012). [4] Rossbach, C. J., Currey, J., Silberstein, M., Ray, B. and Witchel, E.: PTask: operating system abstractions to manage GPUs as compute devices, Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, ACM, pp (2011). [5] : Xeon E v4, Processor-E v4-60M-Cache-2 40 GHz [6] NVIDIA: NVIDIA TESLA P100 GPU ACCELERATOR, [7] Sundaram, N., Raghunathan, A. and Chakradhar, S. T.: A framework for efficient and scalable execution of domain-specific templates on GPUs, Parallel & Distributed Processing, IPDPS IEEE International Symposium on, IEEE, pp (2009). 8

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N GPU 1 1 2 1, 3 2, 3 (Graphics Unit: GPU) GPU GPU GPU Evaluation of GPU Computing Based on An Automatic Program Generation Technology Makoto Sugawara, 1 Katsuto Sato, 1 Kazuhiko Komatsu, 2 Hiroyuki Takizawa

More information

1 GPU GPGPU GPU CPU 2 GPU 2007 NVIDIA GPGPU CUDA[3] GPGPU CUDA GPGPU CUDA GPGPU GPU GPU GPU Graphics Processing Unit LSI LSI CPU ( ) DRAM GPU LSI GPU

1 GPU GPGPU GPU CPU 2 GPU 2007 NVIDIA GPGPU CUDA[3] GPGPU CUDA GPGPU CUDA GPGPU GPU GPU GPU Graphics Processing Unit LSI LSI CPU ( ) DRAM GPU LSI GPU GPGPU (I) GPU GPGPU 1 GPU(Graphics Processing Unit) GPU GPGPU(General-Purpose computing on GPUs) GPU GPGPU GPU ( PC ) PC PC GPU PC PC GPU GPU 2008 TSUBAME NVIDIA GPU(Tesla S1070) TOP500 29 [1] 2009 AMD

More information

23 Fig. 2: hwmodulev2 3. Reconfigurable HPC 3.1 hw/sw hw/sw hw/sw FPGA PC FPGA PC FPGA HPC FPGA FPGA hw/sw hw/sw hw- Module FPGA hwmodule hw/sw FPGA h

23 Fig. 2: hwmodulev2 3. Reconfigurable HPC 3.1 hw/sw hw/sw hw/sw FPGA PC FPGA PC FPGA HPC FPGA FPGA hw/sw hw/sw hw- Module FPGA hwmodule hw/sw FPGA h 23 FPGA CUDA Performance Comparison of FPGA Array with CUDA on Poisson Equation (lijiang@sekine-lab.ei.tuat.ac.jp), (kazuki@sekine-lab.ei.tuat.ac.jp), (takahashi@sekine-lab.ei.tuat.ac.jp), (tamukoh@cc.tuat.ac.jp),

More information

main.dvi

main.dvi PC 1 1 [1][2] [3][4] ( ) GPU(Graphics Processing Unit) GPU PC GPU PC ( 2 GPU ) GPU Harris Corner Detector[5] CPU ( ) ( ) CPU GPU 2 3 GPU 4 5 6 7 1 toyohiro@isc.kyutech.ac.jp 45 2 ( ) CPU ( ) ( ) () 2.1

More information

GPU n Graphics Processing Unit CG CAD

GPU n Graphics Processing Unit CG CAD GPU 2016/06/27 第 20 回 GPU コンピューティング講習会 ( 東京工業大学 ) 1 GPU n Graphics Processing Unit CG CAD www.nvidia.co.jp www.autodesk.co.jp www.pixar.com GPU n GPU ü n NVIDIA CUDA ü NVIDIA GPU ü OS Linux, Windows, Mac

More information

07-二村幸孝・出口大輔.indd

07-二村幸孝・出口大輔.indd GPU Graphics Processing Units HPC High Performance Computing GPU GPGPU General-Purpose computation on GPU CPU GPU GPU *1 Intel Quad-Core Xeon E5472 3.0 GHz 2 6 MB L2 cache 1600 MHz FSB 80 GFlops 1 nvidia

More information

MapTask 678 Map 関数 バッファ管理モジュール リングバッファ 45#$% *+,-./ 0123!"#$% &'() 外部記憶装置 1 MapReduce IFIle IFIle MapReduce 25% MapReduce 2 MapReduce OS

MapTask 678 Map 関数 バッファ管理モジュール リングバッファ 45#$% *+,-./ 0123!#$% &'() 外部記憶装置 1 MapReduce IFIle IFIle MapReduce 25% MapReduce 2 MapReduce OS DEIM Forum 2014 D1-3 MapReduce 180 8585 3 9 11 E-mail: {ozawa.tsuyoshi,oikawa.kazuki,onizuka.makoto,honjo.toshimori}@lab.ntt.co.jp MapReduce 1 Google, Facebook, Yahoo! MapReduce MapReduce MapReduce MapReduce

More information

10D16.dvi

10D16.dvi D IEEJ Transactions on Industry Applications Vol.136 No.10 pp.686 691 DOI: 10.1541/ieejias.136.686 NW Accelerating Techniques for Sequence Alignment based on an Extended NW Algorithm Jin Okaze, Non-member,

More information

PC Development of Distributed PC Grid System,,,, Junji Umemoto, Hiroyuki Ebara, Katsumi Onishi, Hiroaki Morikawa, and Bunryu U PC WAN PC PC WAN PC 1 P

PC Development of Distributed PC Grid System,,,, Junji Umemoto, Hiroyuki Ebara, Katsumi Onishi, Hiroaki Morikawa, and Bunryu U PC WAN PC PC WAN PC 1 P PC Development of Distributed PC Grid System,,,, Junji Umemoto, Hiroyuki Ebara, Katsumi Onishi, Hiroaki Morikawa, and Bunryu U PC WAN PC PC WAN PC 1 PC PC PC PC PC Key Words:Grid, PC Cluster, Distributed

More information

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation 1 1 1 1 SPEC CPU 2000 EQUAKE 1.6 50 500 A Parallelizing Compiler Cooperative Multicore Architecture Simulator with Changeover Mechanism of Simulation Modes GAKUHO TAGUCHI 1 YOUICHI ABE 1 KEIJI KIMURA 1

More information

HBase Phoenix API Mars GPU MapReduce GPU Hadoop Hadoop Hadoop MapReduce : (1) MapReduce (2)JobTracker 1 Hadoop CPU GPU Fig. 1 The overview of CPU-GPU

HBase Phoenix API Mars GPU MapReduce GPU Hadoop Hadoop Hadoop MapReduce : (1) MapReduce (2)JobTracker 1 Hadoop CPU GPU Fig. 1 The overview of CPU-GPU GPU MapReduce 1 1 1, 2, 3 MapReduce GPGPU GPU GPU MapReduce CPU GPU GPU CPU GPU CPU GPU Map K-Means CPU 2GPU CPU 1.02-1.93 Improving MapReduce Task Scheduling for CPU-GPU Heterogeneous Environments Koichi

More information

IPSJ SIG Technical Report Vol.2013-ARC-203 No /2/1 SMYLE OpenCL (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 1

IPSJ SIG Technical Report Vol.2013-ARC-203 No /2/1 SMYLE OpenCL (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 1 SMYLE OpenCL 128 1 1 1 1 1 2 2 3 3 3 (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 128 SMYLEref SMYLE OpenCL SMYLE OpenCL Implementation and Evaluations on 128 Cores Takuji Hieda 1 Noriko Etani

More information

GPGPU

GPGPU GPGPU 2013 1008 2015 1 23 Abstract In recent years, with the advance of microscope technology, the alive cells have been able to observe. On the other hand, from the standpoint of image processing, the

More information

DEIM Forum 2012 C2-6 Hadoop Web Hadoop Distributed File System Hadoop I/O I/O Hadoo

DEIM Forum 2012 C2-6 Hadoop Web Hadoop Distributed File System Hadoop I/O I/O Hadoo DEIM Forum 12 C2-6 Hadoop 112-86 2-1-1 E-mail: momo@ogl.is.ocha.ac.jp, oguchi@computer.org Web Hadoop Distributed File System Hadoop I/O I/O Hadoop A Study about the Remote Data Access Control for Hadoop

More information

2 JSON., 2. JSON,, JSON Jaql [9] Spark Streaming [8], Spark [7].,, 2, 3 4, JSON [3], Jaql [9], Spark [7] Spark Streaming [8] JSON JSON [

2 JSON., 2. JSON,, JSON Jaql [9] Spark Streaming [8], Spark [7].,, 2, 3 4, JSON [3], Jaql [9], Spark [7] Spark Streaming [8] JSON JSON [ DEIM Forum 2016 G1-4,, 305 8573 1-1-1 305 8573 1-1-1 305 8573 1-1-1 E-mail: denam96@kde.cs.tsukuba.ac.jp, {shiokawa,kitagawa}@cs.tsukuba.ac.jp,,.,,,.,, (1), (2),.,, 1.,.,,.,,,,, Storm [2] STREAM [5], S4

More information

HPC可視化_小野2.pptx

HPC可視化_小野2.pptx 大 小 二 生 高 方 目 大 方 方 方 Rank Site Processors RMax Processor System Model 1 DOE/NNSA/LANL 122400 1026000 PowerXCell 8i BladeCenter QS22 Cluster 2 DOE/NNSA/LLNL 212992 478200 PowerPC 440 BlueGene/L 3 Argonne

More information

rank ”«‘‚“™z‡Ì GPU ‡É‡æ‡éŁÀŠñ›»

rank ”«‘‚“™z‡Ì GPU ‡É‡æ‡éŁÀŠñ›» rank GPU ERATO 2011 11 1 1 / 26 GPU rank/select wavelet tree balanced parenthesis GPU rank 2 / 26 GPU rank/select wavelet tree balanced parenthesis GPU rank 2 / 26 GPU rank/select wavelet tree balanced

More information

211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS /1/18 a a 1 a 2 a 3 a a GPU Graphics Processing Unit GPU CPU GPU GPGPU G

211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS /1/18 a a 1 a 2 a 3 a a GPU Graphics Processing Unit GPU CPU GPU GPGPU G 211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS211 211/1/18 GPU 4 8 BLAS 4 8 BLAS Basic Linear Algebra Subprograms GPU Graphics Processing Unit 4 8 double 2 4 double-double DD 4 4 8 quad-double

More information

untitled

untitled A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }

More information

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments 計算機アーキテクチャ第 11 回 マルチプロセッサ 本資料は授業用です 無断で転載することを禁じます 名古屋大学 大学院情報科学研究科 准教授加藤真平 デスクトップ ジョブレベル並列性 スーパーコンピュータ 並列処理プログラム プログラムの並列化 for (i = 0; i < N; i++) { x[i] = a[i] + b[i]; } プログラムの並列化 x[0] = a[0] + b[0];

More information

MATLAB® における並列・分散コンピューティング ~ Parallel Computing Toolbox™ & MATLAB Distributed Computing Server™ ~

MATLAB® における並列・分散コンピューティング ~ Parallel Computing Toolbox™ & MATLAB Distributed Computing Server™ ~ MATLAB における並列 分散コンピューティング ~ Parallel Computing Toolbox & MATLAB Distributed Computing Server ~ MathWorks Japan Application Engineering Group Takashi Yoshida 2016 The MathWorks, Inc. 1 System Configuration

More information

FINAL PROGRAM 25th Annual Workshop SWoPP / / 2012 Tottori Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2012

FINAL PROGRAM 25th Annual Workshop SWoPP / / 2012 Tottori Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2012 FINAL PROGRAM 25th Annual Workshop SWoPP 2012 2012 / / 2012 Tottori Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2012 8 1 ( ) 8 3 ( ) 680-0017 101-5 http://www.torikenmin.jp/kenbun/

More information

IPSJ SIG Technical Report Vol.2014-DBS-159 No.6 Vol.2014-IFAT-115 No /8/1 1,a) 1 1 1,, 1. ([1]) ([2], [3]) A B 1 ([4]) 1 Graduate School of Info

IPSJ SIG Technical Report Vol.2014-DBS-159 No.6 Vol.2014-IFAT-115 No /8/1 1,a) 1 1 1,, 1. ([1]) ([2], [3]) A B 1 ([4]) 1 Graduate School of Info 1,a) 1 1 1,, 1. ([1]) ([2], [3]) A B 1 ([4]) 1 Graduate School of Information Science and Technology, Osaka University a) kawasumi.ryo@ist.osaka-u.ac.jp 1 1 Bucket R*-tree[5] [4] 2 3 4 5 6 2. 2.1 2.2 2.3

More information

情報処理学会研究報告 IPSJ SIG Technical Report Vol.2013-HPC-139 No /5/29 Gfarm/Pwrake NICT NICT 10TB 100TB CPU I/O HPC I/O NICT Gf

情報処理学会研究報告 IPSJ SIG Technical Report Vol.2013-HPC-139 No /5/29 Gfarm/Pwrake NICT NICT 10TB 100TB CPU I/O HPC I/O NICT Gf Gfarm/Pwrake NICT 1 1 1 1 2 2 3 4 5 5 5 6 NICT 10TB 100TB CPU I/O HPC I/O NICT Gfarm Gfarm Pwrake A Parallel Processing Technique on the NICT Science Cloud via Gfarm/Pwrake KEN T. MURATA 1 HIDENOBU WATANABE

More information

Chip Size and Performance Evaluations of Shared Cache for On-chip Multiprocessor Takahiro SASAKI, Tomohiro INOUE, Nobuhiko OMORI, Tetsuo HIRONAKA, Han

Chip Size and Performance Evaluations of Shared Cache for On-chip Multiprocessor Takahiro SASAKI, Tomohiro INOUE, Nobuhiko OMORI, Tetsuo HIRONAKA, Han Chip Size and Performance Evaluations of Shared Cache for On-chip Multiprocessor Takahiro SASAKI, Tomohiro INOUE, Nobuhiko OMORI, Tetsuo HIRONAKA, Hans J. MATTAUSCH, and Tetsushi KOIDE 1 1 2 0.5 µm CMOS

More information

AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted

AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK GFlops/Watt GFlops/Watt Abstract GPU Computing has lately attracted DEGIMA LINPACK Energy Performance for LINPACK Benchmark on DEGIMA 1 AMD/ATI Radeon HD 5870 GPU DEGIMA LINPACK HD 5870 GPU DEGIMA LINPACK 1.4698 GFlops/Watt 1.9658 GFlops/Watt Abstract GPU Computing has

More information

HP High Performance Computing(HPC)

HP High Performance Computing(HPC) ACCELERATE HP High Performance Computing HPC HPC HPC HPC HPC 1000 HPHPC HPC HP HPC HPC HPC HP HPCHP HP HPC 1 HPC HP 2 HPC HPC HP ITIDC HP HPC 1HPC HPC No.1 HPC TOP500 2010 11 HP 159 32% HP HPCHP 2010 Q1-Q4

More information

untitled

untitled A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }

More information

Microsoft PowerPoint - GPU_computing_2013_01.pptx

Microsoft PowerPoint - GPU_computing_2013_01.pptx GPU コンピューティン No.1 導入 東京工業大学 学術国際情報センター 青木尊之 1 GPU とは 2 GPGPU (General-purpose computing on graphics processing units) GPU を画像処理以外の一般的計算に使う GPU の魅力 高性能 : ハイエンド GPU はピーク 4 TFLOPS 超 手軽さ : 普通の PC にも装着できる 低価格

More information

IPSJ SIG Technical Report Vol.2017-ARC-225 No.12 Vol.2017-SLDM-179 No.12 Vol.2017-EMB-44 No /3/9 1 1 RTOS DefensiveZone DefensiveZone MPU RTOS

IPSJ SIG Technical Report Vol.2017-ARC-225 No.12 Vol.2017-SLDM-179 No.12 Vol.2017-EMB-44 No /3/9 1 1 RTOS DefensiveZone DefensiveZone MPU RTOS 1 1 RTOS DefensiveZone DefensiveZone MPU RTOS RTOS OS Lightweight partitioning architecture for automotive systems Suzuki Takehito 1 Honda Shinya 1 Abstract: Partitioning using protection RTOS has high

More information

FINAL PROGRAM 22th Annual Workshop SWoPP / / 2009 Sendai Summer United Workshops on Parallel, Distributed, and Cooperative Processing

FINAL PROGRAM 22th Annual Workshop SWoPP / / 2009 Sendai Summer United Workshops on Parallel, Distributed, and Cooperative Processing FINAL PROGRAM 22th Annual Workshop SWoPP 2009 2009 / / 2009 Sendai Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2009 8 4 ( ) 8 6 ( ) 981-0933 1-2-45 http://www.forestsendai.jp

More information

IPSJ SIG Technical Report Vol.2014-ARC-213 No.24 Vol.2014-HPC-147 No /12/10 GPU 1,a) 1,b) 1,c) 1,d) GPU GPU Structure Of Array Array Of

IPSJ SIG Technical Report Vol.2014-ARC-213 No.24 Vol.2014-HPC-147 No /12/10 GPU 1,a) 1,b) 1,c) 1,d) GPU GPU Structure Of Array Array Of GPU 1,a) 1,b) 1,c) 1,d) GPU 1 GPU Structure Of Array Array Of Structure 1. MPS(Moving Particle Semi-Implicit) [1] SPH(Smoothed Particle Hydrodynamics) [] DEM(Distinct Element Method)[] [] 1 Tokyo Institute

More information

B

B B 27 1153021 28 2 10 1 1 5 1.1 CPU................. 5 1.2.... 5 1.3.... 6 1.4.. 7 1.5................................ 8 2 9 2.1.................................. 9 2.2............................ 10 2.3............................

More information

DEIM Forum 2019 H2-2 SuperSQL SuperSQL SQL SuperSQL Web SuperSQL DBMS Pi

DEIM Forum 2019 H2-2 SuperSQL SuperSQL SQL SuperSQL Web SuperSQL DBMS Pi DEIM Forum 2019 H2-2 SuperSQL 223 8522 3 14 1 E-mail: {terui,goto}@db.ics.keio.ac.jp, toyama@ics.keio.ac.jp SuperSQL SQL SuperSQL Web SuperSQL DBMS PipelineDB SuperSQL Web Web 1 SQL SuperSQL HTML SuperSQL

More information

iphone GPGPU GPU OpenCL Mac OS X Snow LeopardOpenCL iphone OpenCL OpenCL NVIDIA GPU CUDA GPU GPU GPU 15 GPU GPU CPU GPU iii OpenMP MPI CPU OpenCL CUDA OpenCL CPU OpenCL GPU NVIDIA Fermi GPU Fermi GPU GPU

More information

21 20 20413525 22 2 4 i 1 1 2 4 2.1.................................. 4 2.1.1 LinuxOS....................... 7 2.1.2....................... 10 2.2........................ 15 3 17 3.1.................................

More information

! 行行 CPUDSP PPESPECell/B.E. CPUGPU 行行 SIMD [SSE, AltiVec] 用 HPC CPUDSP PPESPE (Cell/B.E.) SPE CPUGPU GPU CPU DSP DSP PPE SPE SPE CPU DSP SPE 2

! 行行 CPUDSP PPESPECell/B.E. CPUGPU 行行 SIMD [SSE, AltiVec] 用 HPC CPUDSP PPESPE (Cell/B.E.) SPE CPUGPU GPU CPU DSP DSP PPE SPE SPE CPU DSP SPE 2 ! OpenCL [Open Computing Language] 言 [OpenCL C 言 ] CPU, GPU, Cell/B.E.,DSP 言 行行 [OpenCL Runtime] OpenCL C 言 API Khronos OpenCL Working Group AMD Broadcom Blizzard Apple ARM Codeplay Electronic Arts Freescale

More information

GPUコンピューティング講習会パート1

GPUコンピューティング講習会パート1 GPU コンピューティング (CUDA) 講習会 GPU と GPU を用いた計算の概要 丸山直也 スケジュール 13:20-13:50 GPU を用いた計算の概要 担当丸山 13:50-14:30 GPU コンピューティングによる HPC アプリケーションの高速化の事例紹介 担当青木 14:30-14:40 休憩 14:40-17:00 CUDA プログラミングの基礎 担当丸山 TSUBAME の

More information

AMD AMD AMD Opteron x86 OS 2P 8P x GHz 75W ACP OEM Q4 2.3GHz HE (55W) 2.8GHz SE (105W) AMD PC 2009 All rights reserved. AMD Japan, L

AMD AMD AMD Opteron x86 OS 2P 8P x GHz 75W ACP OEM Q4 2.3GHz HE (55W) 2.8GHz SE (105W) AMD PC 2009 All rights reserved. AMD Japan, L AMD AMD AMD Opteron x86 OS 2P 8P x86 2.3 2.7GHz 75W ACP OEM Q4 2.3GHz HE (55W) 2.8GHz SE (105W) 2009 1 2 AMD PC 2009 All rights reserved. AMD Japan, Ltd. IT 3 AMD PC 2009 All rights reserved. AMD Japan,

More information

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2 FFT 1 Fourier fast Fourier transform FFT FFT FFT 1 FFT FFT 2 Fourier 2.1 Fourier FFT Fourier discrete Fourier transform DFT DFT n 1 y k = j=0 x j ω jk n, 0 k n 1 (1) x j y k ω n = e 2πi/n i = 1 (1) n DFT

More information

GPU GPU CPU CPU CPU GPU GPU N N CPU ( ) 1 GPU CPU GPU 2D 3D CPU GPU GPU GPGPU GPGPU 2 nvidia GPU CUDA 3 GPU 3.1 GPU Core 1

GPU GPU CPU CPU CPU GPU GPU N N CPU ( ) 1 GPU CPU GPU 2D 3D CPU GPU GPU GPGPU GPGPU 2 nvidia GPU CUDA 3 GPU 3.1 GPU Core 1 GPU 4 2010 8 28 1 GPU CPU CPU CPU GPU GPU N N CPU ( ) 1 GPU CPU GPU 2D 3D CPU GPU GPU GPGPU GPGPU 2 nvidia GPU CUDA 3 GPU 3.1 GPU Core 1 Register & Shared Memory ( ) CPU CPU(Intel Core i7 965) GPU(Tesla

More information

09中西

09中西 PC NEC Linux (1) (2) (1) (2) 1 Linux Linux 2002.11.22) LLNL Linux Intel Xeon 2300 ASCIWhite1/7 / HPC (IDC) 2002 800 2005 2004 HPC 80%Linux) Linux ASCI Purple (ASCI 100TFlops Blue Gene/L 1PFlops (2005)

More information

HPC pdf

HPC pdf GPU 1 1 2 2 1 1024 3 GPUGraphics Unit1024 3 GPU GPU GPU GPU 1024 3 Tesla S1070-400 1 GPU 2.6 Accelerating Out-of-core Cone Beam Reconstruction Using GPU Yusuke Okitsu, 1 Fumihiko Ino, 1 Taketo Kishi, 2

More information

mobicom.dvi

mobicom.dvi 13Dynamic Voltage Scaling on a Low-Power Microprocessor Johan Pouwelse 5 Koen Langendoen Henk Sips Faculty of Information Technology and Systems Delft University of Technology, The Netherlands 1 78724

More information

AV 1000 BASE-T LAN 90 IEEE ac USB (3 ) LAN (IEEE 802.1X ) LAN AWS (Amazon Web Services) AP 3 USB wget iperf3 wget 40 MBytes 2 wget 40 MByt

AV 1000 BASE-T LAN 90 IEEE ac USB (3 ) LAN (IEEE 802.1X ) LAN AWS (Amazon Web Services) AP 3 USB wget iperf3 wget 40 MBytes 2 wget 40 MByt 1 BYOD LAN 1 2 3 4 1 BYOD 1 Gb/s LAN BYOD LAN LAN Access Point (AP) IEEE 802.11n BYOD LAN AP wget iperf3 1 AP [2] 2 IEEE 802.11ac [3] AP 4 AV (207 m 2 ) ( 1 2 )[4, 5] AP Wave2 Aruba AP-335 Aruba LAN 7210

More information

1, 4,a) 1, 4 1, 4 1, , 4 3, 4 HPC HPC HPC Slurm 1. HPC Tianhe MW MW [1] MW CREST a)

1, 4,a) 1, 4 1, 4 1, , 4 3, 4 HPC HPC HPC Slurm 1. HPC Tianhe MW MW [1] MW CREST a) Title 電力制約を考慮した資源管理を行うリソースマネージャの実装と評価 Author(s) 坂本, 龍一 ; タン, カオ ; 和, 遠 ; 近藤, 正章 ; 深沢, 圭田, 将嗣 ; 稲富, 雄一 ; 井上, 弘士 Citation 情報処理学会研究報告 = IPSJ SIG Technical Rep 2015-HPC-151(1): 1-8 Issue Date 2015-09-23 URL

More information

HPEハイパフォーマンスコンピューティング ソリューション

HPEハイパフォーマンスコンピューティング ソリューション HPE HPC / AI Page 2 No.1 * 24.8% No.1 * HPE HPC / AI HPC AI SGIHPE HPC / AI GPU TOP500 50th edition Nov. 2017 HPE No.1 124 www.top500.org HPE HPC / AI TSUBAME 3.0 2017 7 AI TSUBAME 3.0 HPE SGI 8600 System

More information

& Vol.5 No (Oct. 2015) TV 1,2,a) , Augmented TV TV AR Augmented Reality 3DCG TV Estimation of TV Screen Position and Ro

& Vol.5 No (Oct. 2015) TV 1,2,a) , Augmented TV TV AR Augmented Reality 3DCG TV Estimation of TV Screen Position and Ro TV 1,2,a) 1 2 2015 1 26, 2015 5 21 Augmented TV TV AR Augmented Reality 3DCG TV Estimation of TV Screen Position and Rotation Using Mobile Device Hiroyuki Kawakita 1,2,a) Toshio Nakagawa 1 Makoto Sato

More information

HP Workstation 総合カタログ

HP Workstation 総合カタログ HP Workstation Z HP 6 Z HP HP Z840 Workstation P.9 HP Z640 Workstation & CPU P.10 HP Z440 Workstation P.11 17.3in WIDE HP ZBook 17 G2 Mobile Workstation P.15 15.6in WIDE HP ZBook 15 G2 Mobile Workstation

More information

分散ストレージシステム (4) (5) (6) 書き込み 書き込み 読み出し 読み出し (2) コーディネータ 1 Fig. 1 Image of distributed storage system. 2 Fig. 2 Process flow of ( 1 ) ( 2 ) ( 3 )

分散ストレージシステム (4) (5) (6) 書き込み 書き込み 読み出し 読み出し (2) コーディネータ 1 Fig. 1 Image of distributed storage system. 2 Fig. 2 Process flow of ( 1 ) ( 2 ) ( 3 ) 1 1 1 1 1 key-value store Application of Load Balancing Mechanism with Considering Data Access Frequency to Daisuke Kawakami, 1 Toshihiro Matsui, 1 Shoichi Saito, 1 Tomoaki Tsumura 1 and Hiroshi Matsuo

More information

IPSJ-HPC

IPSJ-HPC can effectively exploit the I/O performance of clusters with Gbit/sec-class flash memories. In this paper, we first outline our prototype MapReduce system which utilizes distributed key-value store. And

More information

1 3DCG [2] 3DCG CG 3DCG [3] 3DCG 3 3 API 2 3DCG 3 (1) Saito [4] (a) 1920x1080 (b) 1280x720 (c) 640x360 (d) 320x G-Buffer Decaudin[5] G-Buffer D

1 3DCG [2] 3DCG CG 3DCG [3] 3DCG 3 3 API 2 3DCG 3 (1) Saito [4] (a) 1920x1080 (b) 1280x720 (c) 640x360 (d) 320x G-Buffer Decaudin[5] G-Buffer D 3DCG 1) ( ) 2) 2) 1) 2) Real-Time Line Drawing Using Image Processing and Deforming Process Together in 3DCG Takeshi Okuya 1) Katsuaki Tanaka 2) Shigekazu Sakai 2) 1) Department of Intermedia Art and Science,

More information

2). 3) 4) 1.2 NICTNICT DCRA Dihedral Corner Reflector micro-arraysdcra DCRA DCRA DCRA 3D DCRA PC USB PC PC ON / OFF Velleman K8055 K8055 K8055

2). 3) 4) 1.2 NICTNICT DCRA Dihedral Corner Reflector micro-arraysdcra DCRA DCRA DCRA 3D DCRA PC USB PC PC ON / OFF Velleman K8055 K8055 K8055 1 1 1 2 DCRA 1. 1.1 1) 1 Tactile Interface with Air Jets for Floating Images Aya Higuchi, 1 Nomin, 1 Sandor Markon 1 and Satoshi Maekawa 2 The new optical device DCRA can display floating images in free

More information

スライド 1

スライド 1 GPU クラスタによる格子 QCD 計算 広大理尾崎裕介 石川健一 1.1 Introduction Graphic Processing Units 1 チップに数百個の演算器 多数の演算器による並列計算 ~TFLOPS ( 単精度 ) CPU 数十 GFLOPS バンド幅 ~100GB/s コストパフォーマンス ~$400 GPU の開発環境 NVIDIA CUDA http://www.nvidia.co.jp/object/cuda_home_new_jp.html

More information

Run-Based Trieから構成される 決定木の枝刈り法

Run-Based Trieから構成される  決定木の枝刈り法 Run-Based Trie 2 2 25 6 Run-Based Trie Simple Search Run-Based Trie Network A Network B Packet Router Packet Filtering Policy Rule Network A, K Network B Network C, D Action Permit Deny Permit Network

More information

EGunGPU

EGunGPU Super Computing in Accelerator simulations - Electron Gun simulation using GPGPU - K. Ohmi, KEK-Accel Accelerator Physics seminar 2009.11.19 Super computers in KEK HITACHI SR11000 POWER5 16 24GB 16 134GFlops,

More information

,., ping - RTT,., [2],RTT TCP [3] [4] Android.Android,.,,. LAN ACK. [5].. 3., 1.,. 3 AI.,,Amazon, (NN),, 1..NN,, (RNN) RNN

,., ping - RTT,., [2],RTT TCP [3] [4] Android.Android,.,,. LAN ACK. [5].. 3., 1.,. 3 AI.,,Amazon, (NN),, 1..NN,, (RNN) RNN DEIM Forum 2018 F1-1 LAN LSTM 112 8610 2-1-1 163-8677 1-24-2 E-mail: aoi@ogl.is.ocha.ac.jp, oguchi@is.ocha.ac.jp, sane@cc.kogakuin.ac.jp,,.,,., LAN,. Android LAN,. LSTM LAN., LSTM, Analysis of Packet of

More information

1 1 CodeDrummer CodeMusician CodeDrummer Fig. 1 Overview of proposal system c

1 1 CodeDrummer CodeMusician CodeDrummer Fig. 1 Overview of proposal system c CodeDrummer: 1 2 3 1 CodeDrummer: Sonification Methods of Function Calls in Program Execution Kazuya Sato, 1 Shigeyuki Hirai, 2 Kazutaka Maruyama 3 and Minoru Terada 1 We propose a program sonification

More information

1 DHT Fig. 1 Example of DHT 2 Successor Fig. 2 Example of Successor 2.1 Distributed Hash Table key key value O(1) DHT DHT 1 DHT 1 ID key ID IP value D

1 DHT Fig. 1 Example of DHT 2 Successor Fig. 2 Example of Successor 2.1 Distributed Hash Table key key value O(1) DHT DHT 1 DHT 1 ID key ID IP value D P2P 1,a) 1 1 Peer-to-Peer P2P P2P P2P Chord P2P Chord Consideration for Efficient Construction of Distributed Hash Trees on P2P Systems Taihei Higuchi 1,a) Masakazu Soshi 1 Tomoyuki Asaeda 1 Abstract:

More information

[4] ACP (Advanced Communication Primitives) [1] ACP ACP [2] ACP Tofu UDP [3] HPC InfiniBand InfiniBand ACP 2 ACP, 3 InfiniBand ACP 4 5 ACP 2. ACP ACP

[4] ACP (Advanced Communication Primitives) [1] ACP ACP [2] ACP Tofu UDP [3] HPC InfiniBand InfiniBand ACP 2 ACP, 3 InfiniBand ACP 4 5 ACP 2. ACP ACP InfiniBand ACP 1,5,a) 1,5,b) 2,5 1,5 4,5 3,5 2,5 ACE (Advanced Communication for Exa) ACP (Advanced Communication Primitives) HPC InfiniBand ACP InfiniBand ACP ACP InfiniBand Open MPI 20% InfiniBand Implementation

More information

56 OS OS OS OS 1 OS HDD OS 1 OS HDD HDD OS OS OSOS HDD 図 1 二重キャッシュ環境 3. 負の参照の時間的局所性 3.1 参照の局所性 Locality of Reference Temporal locality Spatial localit

56 OS OS OS OS 1 OS HDD OS 1 OS HDD HDD OS OS OSOS HDD 図 1 二重キャッシュ環境 3. 負の参照の時間的局所性 3.1 参照の局所性 Locality of Reference Temporal locality Spatial localit 116 26 4 1 2 2 1 3 An Analysis of Locality of Reference in Virtualized Environment Hiroki SUGIMOTO 1, Kousuke TAKEUCHI 2, Kouya HINAGAWA 2 and Saneyasu YAMAGUCHI 1 3 Abstract As cloud computing has spread

More information

GPUコンピューティング講習会パート1

GPUコンピューティング講習会パート1 GPU コンピューティング (CUDA) 講習会 GPU と GPU を用いた計算の概要 丸山直也 スケジュール 13:20-13:50 GPU を用いた計算の概要 担当丸山 13:50-14:30 GPU コンピューティングによる HPC アプリケーションの高速化の事例紹介 担当青木 14:30-14:40 休憩 14:40-17:00 CUDA プログラミングの基礎 担当丸山 TSUBAME の

More information

HP ProLiant 500シリーズ

HP ProLiant 500シリーズ HPProLiant5 DL58/585 HPProLiant5 4 HPProLiant5 HPProLiant5 64 HPProLiant5 TPC-H@1GB 4, 34,99 SAP SD Benchmark Users QphH@1GB 3, 2, 1, 4, 3, 2, 1, DL58 G5, Xeon X735 DL585 G5, AMD Opteron 836SE 17,12 DL58

More information

DEIM Forum 2017 H ,

DEIM Forum 2017 H , DEIM Forum 217 H5-4 113 8656 7 3 1 153 855 4 6 1 3 2 1 2 E-mail: {satoyuki,haya,kgoda,kitsure}@tkl.iis.u-tokyo.ac.jp,.,,.,,.,, 1.. 1956., IBM IBM RAMAC 35 IBM 35 24 5, 5MB. 1961 IBM 131,,, IBM 35 13.,

More information

26102 (1/2) LSISoC: (1) (*) (*) GPU SIMD MIMD FPGA DES, AES (2/2) (2) FPGA(8bit) (ISS: Instruction Set Simulator) (3) (4) LSI ECU110100ECU1 ECU ECU ECU ECU FPGA ECU main() { int i, j, k for { } 1 GP-GPU

More information

MAC root Linux 1 OS Linux 2.6 Linux Security Modules LSM [1] Security-Enhanced Linux SELinux [2] AppArmor[3] OS OS OS LSM LSM Performance Monitor LSMP

MAC root Linux 1 OS Linux 2.6 Linux Security Modules LSM [1] Security-Enhanced Linux SELinux [2] AppArmor[3] OS OS OS LSM LSM Performance Monitor LSMP LSM OS 700-8530 3 1 1 matsuda@swlab.it.okayama-u.ac.jp tabata@cs.okayama-u.ac.jp 242-8502 1623 14 munetoh@jp.ibm.com OS Linux 2.6 Linux Security Modules LSM LSM Linux 4 OS OS LSM An Evaluation of Performance

More information

28 Docker Design and Implementation of Program Evaluation System Using Docker Virtualized Environment

28 Docker Design and Implementation of Program Evaluation System Using Docker Virtualized Environment 28 Docker Design and Implementation of Program Evaluation System Using Docker Virtualized Environment 1170288 2017 2 28 Docker,.,,.,,.,,.,. Docker.,..,., Web, Web.,.,.,, CPU,,. i ., OS..,, OS, VirtualBox,.,

More information

DELL PRECISION T7400 T5400 T3400 M6400 M4400 M2400 R5400 FX100 February /

DELL PRECISION T7400 T5400 T3400 M6400 M4400 M2400 R5400 FX100 February / DELL PRECISION T7400 T5400 T3400 M6400 M4400 M2400 R5400 FX100 February / 2009 www.dell.com/jp Dell Precision Workstation PC9No.1 CADCG PC 9No.1 Dell Precision IDC WW Quarterly Workstation Tracker 2007Q4

More information

スライド 1

スライド 1 WWW Request Client Data Server Request Data Client WWW Request Data Client Server Request Data Client WWW CPU Request Data Client Server Request Data Client Request Client Data Server Request Data Client

More information

hpc141_shirahata.pdf

hpc141_shirahata.pdf GPU アクセラレータと不揮発性メモリ を考慮した I/O 性能の予備評価 白幡晃一 1,2 佐藤仁 1,2 松岡聡 1 1: 東京工業大学 2: JST CREST 1 GPU と不揮発性メモリを用いた 大規模データ処理 大規模データ処理 センサーネットワーク 遺伝子情報 SNS など ペタ ヨッタバイト級 高速処理が必要 スーパーコンピュータ上での大規模データ処理 GPU 高性能 高バンド幅 例

More information

DEIM Forum 2017 H2-2 Android LAN Android 1 Android LAN

DEIM Forum 2017 H2-2 Android LAN Android 1 Android LAN DEIM Forum 2017 H2-2 Android LAN 112-8610 2-1-1 163-8677 1-24-2 E-mail: {ayano,oguchi}@ogl.is.ocha.ac.jp, sane@cc.kogakuin.ac.jp Android 1 Android LAN Ayano KOYANAGI, Saneyasu YAMAGUCHI, and Masato OGUCHI

More information

卒業論文

卒業論文 PC OpenMP SCore PC OpenMP PC PC PC Myrinet PC PC 1 OpenMP 2 1 3 3 PC 8 OpenMP 11 15 15 16 16 18 19 19 19 20 20 21 21 23 26 29 30 31 32 33 4 5 6 7 SCore 9 PC 10 OpenMP 14 16 17 10 17 11 19 12 19 13 20 1421

More information

2) TA Hercules CAA 5 [6], [7] CAA BOSS [8] 2. C II C. ( 1 ) C. ( 2 ). ( 3 ) 100. ( 4 ) () HTML NFS Hercules ( )

2) TA Hercules CAA 5 [6], [7] CAA BOSS [8] 2. C II C. ( 1 ) C. ( 2 ). ( 3 ) 100. ( 4 ) () HTML NFS Hercules ( ) 1,a) 2 4 WC C WC C Grading Student programs for visualizing progress in classroom Naito Hiroshi 1,a) Saito Takashi 2 Abstract: To grade student programs in Computer-Aided Assessment system, we propose

More information

6 2. AUTOSAR 2.1 AUTOSAR AUTOSAR ECU OSEK/VDX 3) OSEK/VDX OS AUTOSAR AUTOSAR ECU AUTOSAR 1 AUTOSAR BSW (Basic Software) (Runtime Environment) Applicat

6 2. AUTOSAR 2.1 AUTOSAR AUTOSAR ECU OSEK/VDX 3) OSEK/VDX OS AUTOSAR AUTOSAR ECU AUTOSAR 1 AUTOSAR BSW (Basic Software) (Runtime Environment) Applicat AUTOSAR 1 1, 2 2 2 AUTOSAR AUTOSAR 3 2 2 41% 29% An Extension of AUTOSAR Communication Layers for Multicore Systems Toshiyuki Ichiba, 1 Hiroaki Takada, 1, 2 Shinya Honda 2 and Ryo Kurachi 2 AUTOSAR, a

More information

3.1 Thalmic Lab Myo * Bluetooth PC Myo 8 RMS RMS t RMS(t) i (i = 1, 2,, 8) 8 SVM libsvm *2 ν-svm 1 Myo 2 8 RMS 3.2 Myo (Root

3.1 Thalmic Lab Myo * Bluetooth PC Myo 8 RMS RMS t RMS(t) i (i = 1, 2,, 8) 8 SVM libsvm *2 ν-svm 1 Myo 2 8 RMS 3.2 Myo (Root 1,a) 2 2 1. 1 College of Information Science, School of Informatics, University of Tsukuba 2 Faculty of Engineering, Information and Systems, University of Tsukuba a) oharada@iplab.cs.tsukuba.ac.jp 2.

More information

倍々精度RgemmのnVidia C2050上への実装と応用

倍々精度RgemmのnVidia C2050上への実装と応用 .. maho@riken.jp http://accc.riken.jp/maho/,,, 2011/2/16 1 - : GPU : SDPA-DD 10 1 - Rgemm : 4 (32 ) nvidia C2050, GPU CPU 150, 24GFlops 25 20 GFLOPS 15 10 QuadAdd Cray, QuadMul Sloppy Kernel QuadAdd Cray,

More information

先進的計算基盤システムシンポジウム SACSIS2012 Symposium on Advanced Computing Systems and Infrastructures SACSIS /5/18 CPU, CPU., Memory-bound CPU,., Memory-bo

先進的計算基盤システムシンポジウム SACSIS2012 Symposium on Advanced Computing Systems and Infrastructures SACSIS /5/18 CPU, CPU., Memory-bound CPU,., Memory-bo CPU, CPU, Memory-bound CPU,, Memory-bound ( ) Performance Monitoring Counter(PMC), PMC (nmi watchdog), PMC CPU., PMC, CPU, Memory-bound, CPU-bound,, CPU,, PMC,,,, CPU, NPB 8, 5% CPU, CPU, 3%, 5% CPU, IS

More information

IPSJ SIG Technical Report Vol.2013-HPC-138 No /2/21 GPU CRS 1,a) 2,b) SpMV GPU CRS SpMV GPU NVIDIA Kepler CUDA5.0 Fermi GPU Kepler Kepler Tesla

IPSJ SIG Technical Report Vol.2013-HPC-138 No /2/21 GPU CRS 1,a) 2,b) SpMV GPU CRS SpMV GPU NVIDIA Kepler CUDA5.0 Fermi GPU Kepler Kepler Tesla GPU CRS 1,a),b) SpMV GPU CRS SpMV GPU NVIDIA Kepler CUDA5.0 Fermi GPU Kepler Kepler Tesla K0 CUDA5.0 cusparse CRS SpMV 00 1.86 177 1. SpMV SpMV CRS Compressed Row Storage *1 SpMV GPU GPU NVIDIA Kepler

More information

27 YouTube YouTube UGC User Generated Content CDN Content Delivery Networks LRU Least Recently Used UGC YouTube CGM Consumer Generated Media CGM CGM U

27 YouTube YouTube UGC User Generated Content CDN Content Delivery Networks LRU Least Recently Used UGC YouTube CGM Consumer Generated Media CGM CGM U YouTube 2016 2 16 27 YouTube YouTube UGC User Generated Content CDN Content Delivery Networks LRU Least Recently Used UGC YouTube CGM Consumer Generated Media CGM CGM UGC UGC YouTube k-means YouTube YouTube

More information

SWoPP BOF BOF-1 8/3 19:10 BoF SWoPP : BOF-2 8/5 17:00 19:00 HW/SW 15 x5 SimMips/MieruPC M-Core/SimMc FPGA S

SWoPP BOF BOF-1 8/3 19:10 BoF SWoPP :   BOF-2 8/5 17:00 19:00 HW/SW 15 x5 SimMips/MieruPC M-Core/SimMc FPGA S FINAL PROGRAM 23rd Annual Workshop SWoPP 2010 2010 / / 2010 Kanazawa Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2010 8 3 ( ) 8 5 ( ) 920-0864 15 1 http://www.bunka-h.gr.jp/

More information

IPSJ SIG Technical Report iphone iphone,,., OpenGl ES 2.0 GLSL(OpenGL Shading Language), iphone GPGPU(General-Purpose Computing on Graphics Proc

IPSJ SIG Technical Report iphone iphone,,., OpenGl ES 2.0 GLSL(OpenGL Shading Language), iphone GPGPU(General-Purpose Computing on Graphics Proc iphone 1 1 1 iphone,,., OpenGl ES 2.0 GLSL(OpenGL Shading Language), iphone GPGPU(General-Purpose Computing on Graphics Processing Unit)., AR Realtime Natural Feature Tracking Library for iphone Makoto

More information

B 2 Thin Q=3 0 0 P= N ( )P Q = 2 3 ( )6 N N TSUB- Hub PCI-Express (PCIe) Gen 2 x8 AME1 5) 3 GPU Socket 0 High-performance Linpack 1

B 2 Thin Q=3 0 0 P= N ( )P Q = 2 3 ( )6 N N TSUB- Hub PCI-Express (PCIe) Gen 2 x8 AME1 5) 3 GPU Socket 0 High-performance Linpack 1 TSUBAME 2.0 Linpack 1,,,, Intel NVIDIA GPU 2010 11 TSUBAME 2.0 Linpack 2CPU 3GPU 1400 Dual-Rail QDR InfiniBand TSUBAME 1.0 30 2.4PFlops TSUBAME 1.0 Linpack GPU 1.192PFlops PFlops Top500 4 Achievement of

More information

IPSJ SIG Technical Report Vol.2009-DPS-141 No.20 Vol.2009-GN-73 No.20 Vol.2009-EIP-46 No /11/27 1. MIERUKEN 1 2 MIERUKEN MIERUKEN MIERUKEN: Spe

IPSJ SIG Technical Report Vol.2009-DPS-141 No.20 Vol.2009-GN-73 No.20 Vol.2009-EIP-46 No /11/27 1. MIERUKEN 1 2 MIERUKEN MIERUKEN MIERUKEN: Spe 1. MIERUKEN 1 2 MIERUKEN MIERUKEN MIERUKEN: Speech Visualization System Based on Augmented Reality Yuichiro Nagano 1 and Takashi Yoshino 2 As the spread of the Augmented Reality(AR) technology and service,

More information

HPE Moonshot System ~ビッグデータ分析&モバイルワークプレイスを新たなステージへ~

HPE Moonshot System ~ビッグデータ分析&モバイルワークプレイスを新たなステージへ~ Brochure HPE Moonshot System HPE Moonshot System 4.3U 45 HPE Moonshot System Xeon & HPE Moonshot System HPE Moonshot System HPE HPE Moonshot System &IoT & SoC Xeon D-1500 Broadwell-DE HPE ProLiant m510

More information

<4D F736F F F696E74202D2091E63489F15F436F6D C982E682E992B48D8291AC92B489B F090CD2888F38DFC E B8CDD8

<4D F736F F F696E74202D2091E63489F15F436F6D C982E682E992B48D8291AC92B489B F090CD2888F38DFC E B8CDD8 Web キャンパス資料 超音波シミュレーションの基礎 ~ 第 4 回 ComWAVEによる超高速超音波解析 ~ 科学システム開発部 Copyright (c)2006 ITOCHU Techno-Solutions Corporation 本日の説明内容 ComWAVEの概要および特徴 GPGPUとは GPGPUによる解析事例 CAE POWER 超音波研究会開催 (10 月 3 日 ) のご紹介

More information

IPSJ SIG Technical Report IaaS VM 1 1 1, 2 IaaS VM VM VM VM VM VM IaaS VM VM VM FBCrypt-V FBCrypt-V VM VMM FBCrypt-V Xen TightVNC VM Preventing Inform

IPSJ SIG Technical Report IaaS VM 1 1 1, 2 IaaS VM VM VM VM VM VM IaaS VM VM VM FBCrypt-V FBCrypt-V VM VMM FBCrypt-V Xen TightVNC VM Preventing Inform IaaS VM 1 1 1, 2 IaaS VM VM VM VM VM VM IaaS VM VM VM FBCrypt-V FBCrypt-V VM VMM FBCrypt-V Xen TightVNC VM Preventing Information Leakage from Screens via Management VMs in IaaS Naoki Nishimura, 1 Tomohisa

More information

HP xw9400 Workstation

HP xw9400 Workstation HP xw9400 Workstation HP xw9400 Workstation AMD Opteron TM PCI Express x16 64 PCI Express x16 2 USB2.0 8 IEEE1394 2 8DIMM HP HP xw9400 Workstation HP CPU HP CPU 240W CPU HP xw9400 HP CPU CPU CPU CPU Sound

More information

Publish/Subscribe KiZUNA P2P 2 Publish/Subscribe KiZUNA 2. KiZUNA 1 Skip Graph BF Skip Graph BF Skip Graph Skip Graph Skip Graph DDLL 2.1 Skip Graph S

Publish/Subscribe KiZUNA P2P 2 Publish/Subscribe KiZUNA 2. KiZUNA 1 Skip Graph BF Skip Graph BF Skip Graph Skip Graph Skip Graph DDLL 2.1 Skip Graph S KiZUNA: P2P 1,a) 1 1 1 P2P KiZUNA KiZUNA Pure P2P P2P 1 Skip Graph ALM(Application Level Multicast) Pub/Sub, P2P Skip Graph, Bloom Filter KiZUNA: An Implementation of Distributed Microblogging Service

More information

IPSJ SIG Technical Report Vol.2012-ARC-202 No.13 Vol.2012-HPC-137 No /12/13 Tightly Coupled Accelerators 1,a) 1,b) 1,c) 1,d) GPU HA-PACS

IPSJ SIG Technical Report Vol.2012-ARC-202 No.13 Vol.2012-HPC-137 No /12/13 Tightly Coupled Accelerators 1,a) 1,b) 1,c) 1,d) GPU HA-PACS Tightly Coupled Accelerators 1,a) 1,b) 1,c) 1,d) HA-PACS 2012 2 HA-PACS TCA (Tightly Coupled Accelerators) TCA PEACH2 1. (Graphics Processing Unit) HPC GP(General Purpose ) TOP500 [1] CPU PCI Express (PCIe)

More information

untitled

untitled Power Wall HPL1 10 B/F EXTREMETECH Supercomputing director bets $2,000 that we won t have exascale computing by 2020 One of the biggest problems standing in our way is power. [] http://www.extremetech.com/computing/155941

More information

Fuzzy Multiple Discrimminant Analysis (FMDA) 5) (SOM) 6) SOM 3 6) SOM SOM SOM SOM SOM SOM 7) 8) SOM SOM SOM GPU 2. n k f(x) m g(x) (1) 12) { min(max)

Fuzzy Multiple Discrimminant Analysis (FMDA) 5) (SOM) 6) SOM 3 6) SOM SOM SOM SOM SOM SOM 7) 8) SOM SOM SOM GPU 2. n k f(x) m g(x) (1) 12) { min(max) SOM 1 2 2 3 1 (SOM: Self-Organizing Maps) 3 SOM SOM SOM SOM GPU A Study on Visualization of Pareto Solutions by Spherical Self-Organizing Maps MASATO YOSHIMI, 1 KANAME NISHIMOTO, 2 LUYI WANG, 2 TOMOYUKI

More information

IPSJ SIG Technical Report Vol.2009-DPS-141 No.23 Vol.2009-GN-73 No.23 Vol.2009-EIP-46 No /11/27 t-room t-room 2 Development of

IPSJ SIG Technical Report Vol.2009-DPS-141 No.23 Vol.2009-GN-73 No.23 Vol.2009-EIP-46 No /11/27 t-room t-room 2 Development of t-room 1 2 2 2 2 1 1 2 t-room 2 Development of Assistant System for Ensemble in t-room Yosuke Irie, 1 Shigemi Aoyagi, 2 Toshihiro Takada, 2 Keiji Hirata, 2 Katsuhiko Kaji, 2 Shigeru Katagiri 1 and Miho

More information

マルチコアPCクラスタ環境におけるBDD法のハイブリッド並列実装

マルチコアPCクラスタ環境におけるBDD法のハイブリッド並列実装 2010 GPGPU 2010 9 29 MPI/Pthread (DDM) DDM CPU CPU CPU CPU FEM GPU FEM CPU Mult - NUMA Multprocessng Cell GPU Accelerator, GPU CPU Heterogeneous computng L3 cache L3 cache CPU CPU + GPU GPU L3 cache 4

More information

258 5) GPS 1 GPS 6) GPS DP 7) 8) 10) GPS GPS 2 3 4 5 2. 2.1 3 1) GPS Global Positioning System

258 5) GPS 1 GPS 6) GPS DP 7) 8) 10) GPS GPS 2 3 4 5 2. 2.1 3 1) GPS Global Positioning System Vol. 52 No. 1 257 268 (Jan. 2011) 1 2, 1 1 measurement. In this paper, a dynamic road map making system is proposed. The proposition system uses probe-cars which has an in-vehicle camera and a GPS receiver.

More information

IPSJ SIG Technical Report Vol.2013-ARC-206 No /8/1 Android Dominic Hillenbrand ODROID-X2 GPIO Android OSCAR WFI 500[us] GPIO GP

IPSJ SIG Technical Report Vol.2013-ARC-206 No /8/1 Android Dominic Hillenbrand ODROID-X2 GPIO Android OSCAR WFI 500[us] GPIO GP Android 1 1 1 1 1 Dominic Hillenbrand 1 1 1 ODROID-X2 GPIO Android OSCAR WFI 500[us] GPIO GPIO API GPIO API GPIO MPEG2 Optical Flow MPEG2 1PE 0.97[W] 0.63[W] 2PE 1.88[w] 0.46[W] 3PE 2.79[W] 0.37[W] Optical

More information

book.dvi

book.dvi P2P Web Proxy 1120180 24 3 16 1 3 2 5 2.1 Web........................ 5 2.2 Web Proxy.................................... 10 2.2.1 P2P Web Proxy.............................. 11 3 P2P Web Proxy 13 3.1...................................

More information

Vol.214-HPC-145 No /7/3 C #pragma acc directive-name [clause [[,] clause] ] new-line structured block Fortran!$acc directive-name [clause [[,] c

Vol.214-HPC-145 No /7/3 C #pragma acc directive-name [clause [[,] clause] ] new-line structured block Fortran!$acc directive-name [clause [[,] c Vol.214-HPC-145 No.45 214/7/3 OpenACC 1 3,1,2 1,2 GPU CUDA OpenCL OpenACC OpenACC High-level OpenACC CPU Intex Xeon Phi K2X GPU Intel Xeon Phi 27% K2X GPU 24% 1. TSUBAME2.5 CPU GPU CUDA OpenCL CPU OpenMP

More information

untitled

untitled c NUMA 1. 18 (Moore s law) 1Hz CPU 2. 1 (Register) (RAM) Level 1 (L1) L2 L3 L4 TLB (translation look-aside buffer) (OS) TLB TLB 3. NUMA NUMA (Non-uniform memory access) 819 0395 744 1 2014 10 Copyright

More information

Cloud[2] (48 ) Xeon Phi (50+ ) IBM Cyclops[9] (64 ) Cavium Octeon II (32 ) Tilera Tile-GX (100 ) PE [11][7] 2 Nsim[10] 8080[1] SH-2[5] SH [8

Cloud[2] (48 ) Xeon Phi (50+ ) IBM Cyclops[9] (64 ) Cavium Octeon II (32 ) Tilera Tile-GX (100 ) PE [11][7] 2 Nsim[10] 8080[1] SH-2[5] SH [8 1600 1,a) 1,b) 8080 SH-2 8080 SH-2 Simulation of a Many-Core Architecture with 16 Million Processing Cores Hisanobu Tomari 1,a) Kei Hiraki 1,b) Abstract: 8080 and SH-2 processors are evaluated as building

More information

PowerPoint プレゼンテーション

PowerPoint プレゼンテーション 総務省 ICTスキル総合習得教材 概要版 eラーニング用 [ コース2] データ蓄積 2-5: 多様化が進展するクラウドサービス [ コース1] データ収集 [ コース2] データ蓄積 [ コース3] データ分析 [ コース4] データ利活用 1 2 3 4 5 座学本講座の学習内容 (2-5: 多様化が進展するクラウドサービス ) 講座概要 近年 注目されているクラウドの関連技術を紹介します PCやサーバを構成するパーツを紹介後

More information